Electrical and Electronics Engineering Department Collection (Elektrik Elektronik Mühendisliği Bölümü Koleksiyonu)

Permanent URI for this collection: https://hdl.handle.net/20.500.11779/1941


Citation - Scopus: 2
A Decade of Discriminative Language Modeling for Automatic Speech Recognition
(2015) Arısoy, Ebru; Saraçlar, Murat; Dikici, Erinc
This paper summarizes the research on discriminative language modeling focusing on its application to automatic speech recognition (ASR). A discriminative language model (DLM) is typically a linear or log-linear model consisting of a weight vector associated with a feature vector representation of a sentence. This flexible representation can include linguistically and statistically motivated features that incorporate morphological and syntactic information. At test time, DLMs are used to rerank the output of an ASR system, represented as an N-best list or lattice. During training, both negative and positive examples are used with the aim of directly optimizing the error rate. Various machine learning methods, including the structured perceptron, large margin methods and maximum regularized conditional log-likelihood, have been used for estimating the parameters of DLMs. Typically positive examples for DLM training come from the manual transcriptions of acoustic data while the negative examples are obtained by processing the same acoustic data with an ASR system. Recent research generalizes DLM training by either using automatic transcriptions for the positive examples or simulating the negative examples.
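As a rough illustration of the reranking step described in this abstract, the sketch below scores each hypothesis in an N-best list with a linear model (a dot product of a weight vector and a feature vector) and applies one structured perceptron update. The feature names, the example hypotheses, and the learning rate are assumptions made for illustration and do not come from the paper.

```python
from typing import Dict, List, Tuple

Features = Dict[str, float]
Hypothesis = Tuple[str, Features]


def score(weights: Features, features: Features) -> float:
    """Linear model score: dot product of the weight and feature vectors."""
    return sum(weights.get(name, 0.0) * value for name, value in features.items())


def rerank(weights: Features, nbest: List[Hypothesis]) -> List[Hypothesis]:
    """Sort the N-best list so the highest-scoring hypothesis comes first."""
    return sorted(nbest, key=lambda hyp: score(weights, hyp[1]), reverse=True)


def perceptron_update(weights: Features, positive: Features,
                      negative: Features, lr: float = 1.0) -> None:
    """Move the weights toward the positive example and away from the negative one."""
    for name, value in positive.items():
        weights[name] = weights.get(name, 0.0) + lr * value
    for name, value in negative.items():
        weights[name] = weights.get(name, 0.0) - lr * value


# Hypothetical 2-best list: (text, features). "asr_score" stands in for the
# recognizer's own score; the bigram features are made-up indicator features.
nbest = [
    ("recognize speech", {"asr_score": -1.2, "bigram:recognize_speech": 1.0}),
    ("wreck a nice beach", {"asr_score": -1.0, "bigram:nice_beach": 1.0}),
]
weights = {"asr_score": 1.0}

top_text, top_features = rerank(weights, nbest)[0]
if top_text != "recognize speech":  # the reference transcription is the positive example
    perceptron_update(weights, nbest[0][1], top_features)

print(rerank(weights, nbest)[0][0])
```

The update only fires when the current top hypothesis differs from the reference, which is the usual perceptron convention; large margin or log-likelihood training would replace `perceptron_update` with a different objective.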
Citation - Scopus: 6
A Framework for Automatic Generation of Spoken Question-Answering Data
(Association for Computational Linguistics (ACL), 2022) Manav, Y.; Menevşe, M.Ü.; Özgür, A.; Arısoy, Ebru
This paper describes a framework to automatically generate a spoken question answering (QA) dataset. The framework consists of a question generation (QG) module to generate questions automatically from given text documents, a text-to-speech (TTS) module to convert the text documents into spoken form and an automatic speech recognition (ASR) module to transcribe the spoken content. The final dataset contains question-answer pairs for both the reference text and ASR transcriptions as well as the audio files corresponding to each reference text. For QG and ASR systems we used pre-trained multilingual encoder-decoder transformer models and fine-tuned these models using a limited amount of manually generated QA data and TTS-based speech data, respectively. As a proof of concept, we investigated the proposed framework for Turkish and generated the Turkish Question Answering (TurQuAse) dataset using Wikipedia articles. Manual evaluation of the automatically generated question-answer pairs and QA performance evaluation with state-of-the-art models on TurQuAse show that the proposed framework is efficient for automatically generating spoken QA datasets. To the best of our knowledge, TurQuAse is the first publicly available spoken question answering dataset for Turkish. The proposed framework can be easily extended to other languages where a limited amount of QA data is available. © 2022 Association for Computational Linguistics.
Citation - Scopus: 1
A Microwave Imaging Scheme for Detection of Pulmonary Edema and Hemorrhage
(IEEE, 2022) Ertek, Didem; Kucuk, Gokhan; Bilgin, Egemen
Microwave imaging systems have the potential to offer a cost-effective and less hazardous alternative to conventional medical imaging techniques. In this paper, a microwave imaging scheme based on the Contrast Source Inversion method is proposed and tested for the detection of pulmonary edema and hemorrhage. To this end, a realistic human torso phantom is used, and the electromagnetic parameters of the human tissues are determined via the Cole-Cole model. The scattered field is simulated via the Method of Moments at an operating frequency of 350 MHz, and 50 dB white Gaussian noise is added to model a realistic measurement setup. The numerical tests performed with the proposed technique suggest that the method can be used to locate pulmonary edema and hemorrhage, and that it is capable of distinguishing these two medical conditions successfully.
Citation - Scopus: 1
A RAN/SDN Controller Based Connectivity Management Platform for Mobile Service Providers
(Institute of Electrical and Electronics Engineers Inc., 2017) Ayhan, Gökhan; Koca, Melih; Zeydan, Engin; Tan, A. Serdar
In this demo, we demonstrate the integration of a radio access network (RAN)/Software-Defined Networking (SDN) controller with a connectivity management platform designed for mobile wireless networks. The architecture was designed within the EU Celtic-Plus project SIGMONA. The OpenDaylight-based RAN/SDN controller and the application server are capable of collecting infrastructure- and client-related parameters from OpenFlow-enabled switches and Android-based phones, respectively. The best access network is selected at the application server using a Multiple Attribute Decision Making (MADM) algorithm, and the decision is sent back to the Android-based mobile client, which executes the access network selection. © 2017 IFIP.
A Resonator Design for Mutual Coupling Reduction Between Microstrip Antennas in MIMO Applications at 28 GHz
(Institute of Electrical and Electronics Engineers Inc., 2024) Gollu, A.A.; Polat, B.; Semerci, D.; Bilgin, E.
A simple resonator structure is proposed to reduce the mutual coupling between rectangular microstrip patch antennas positioned close to each other for MIMO applications at a center frequency of 28 GHz. The frequency of 28 GHz is chosen because it is one of the middle bands for 5G communication in the USA. Two microstrip patch antennas with gaps, sharing a common dielectric substrate and ground plane, are employed, and the patches are placed close together with an edge-to-edge distance of 0.6 mm (approximately λ/18). To reduce the mutual coupling between these radiating elements and increase the isolation, a resonator is positioned between them and its parameters are optimized. In the simulations, the proposed resonator is observed to reduce the coupling by approximately 10 dB. From this result, it can be concluded that the proposed structure may be suitable for tightly packed MIMO systems. © 2024 IEEE.
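The quoted spacing of 0.6 mm ≈ λ/18 can be checked with a short calculation. The sketch below assumes that λ refers to the free-space wavelength at 28 GHz; the abstract does not say whether a free-space or guided wavelength is meant, so this is only a plausibility check.

```python
SPEED_OF_LIGHT = 299_792_458.0  # m/s, in vacuum


def free_space_wavelength_mm(frequency_hz: float) -> float:
    """Free-space wavelength in millimetres for a given frequency in hertz."""
    return SPEED_OF_LIGHT / frequency_hz * 1e3


wavelength_mm = free_space_wavelength_mm(28e9)  # about 10.7 mm
gap_mm = 0.6                                    # edge-to-edge distance from the abstract
ratio = wavelength_mm / gap_mm                  # gap is roughly wavelength / ratio

print(f"wavelength = {wavelength_mm:.2f} mm, gap = wavelength / {ratio:.1f}")
```

Under this assumption the ratio comes out slightly below 18, which is consistent with the "approximately λ/18" figure.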
Citation - WoS: 3
Citation - Scopus: 10
An FPGA Implementation of a RISC-V Based SoC System for Image Processing Applications
(IEEE, 2021) Gholizadehazari, Erfan; Ayhan, Tuba; Ors, Berna
The Laplacian filter is one of the fundamental operations in image processing. In our work, the Laplacian filter is applied to an image, and both hardware and software implementations of the filter are studied. Our system consists of an OV7670 camera module, a Nexys 4 DDR FPGA board, and a VGA monitor to display the processed video stream. The processing chain is as follows: the camera module captures raw RGB data and writes it to RAM, the Laplacian filter IP processes the raw image, and the results are written back to memory. The VGA module then shows the output images on the monitor. The hardware and software implementations of the Laplacian filter are compared in terms of time and area.
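For readers unfamiliar with the filter, a software version can be sketched in a few lines. The example below assumes the common 3×3 four-neighbour Laplacian kernel, a single-channel (grayscale) image stored as a list of rows, and border pixels left at zero; the abstract does not state the kernel or the border handling used in the paper, so these are illustrative choices.

```python
from typing import List

Image = List[List[int]]

# Assumed kernel: the standard 4-neighbour discrete Laplacian.
KERNEL = [
    [0, 1, 0],
    [1, -4, 1],
    [0, 1, 0],
]


def laplacian(image: Image) -> Image:
    """Apply the 3x3 Laplacian kernel to every interior pixel.

    Border pixels are left at 0 because the 3x3 window does not fit there.
    """
    height, width = len(image), len(image[0])
    out = [[0] * width for _ in range(height)]
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            acc = 0
            for ky in range(3):
                for kx in range(3):
                    acc += KERNEL[ky][kx] * image[y + ky - 1][x + kx - 1]
            out[y][x] = acc
    return out


# A single bright pixel on a dark background produces a strong response.
spot = [[0] * 5 for _ in range(5)]
spot[2][2] = 10
for row in laplacian(spot):
    print(row)
```

A hardware IP would typically stream pixels through line buffers instead of indexing a full frame, but the arithmetic per output pixel is the same multiply-accumulate shown here.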
Citation - Scopus: 1
An FPGA Implementation of Givens Rotation Based Digital Architecture for Computing Eigenvalues of Asymmetric Matrix
(IEEE, 2022) Köseoğlu, İlayda; Yalçın, Mustak Erhan; Öztürk, Elif; Ayhan, Tuba
This paper proposes a digital circuit design that computes the eigenvalues of asymmetric matrices with real-valued elements. Eigenvalues are computed iteratively through the QR algorithm. In the QR algorithm, the input matrix is factorized into an orthogonal matrix Q and an upper triangular matrix R, and then the product RQ is calculated to obtain the iterated matrix. For a time-efficient QR decomposition, the Givens Rotation (GR) principle is used because it can be parallelized. Parallelization is managed by a Systolic Array (SA) architecture created by placing Givens Generation (GG) and Row Update (RU) blocks in a triangular array. In this paper, a 4×4 input matrix is used to create a TSA architecture with n−1 diagonal (GG) modules and n(n−1)/2 off-diagonal (RU) modules. In the results section, Givens Rotation is compared with the Gram-Schmidt algorithm used in our previous study [1] in terms of error and area usage.
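The iteration described in this abstract (factor A = QR with Givens rotations, then form RQ) can be sketched in software as follows. This is a plain, unshifted QR iteration written for illustration, not the paper's hardware design; it only converges to the eigenvalues on the diagonal when they are real with distinct magnitudes, and the example matrix and iteration count are assumptions. The block-count helper simply restates the n−1 and n(n−1)/2 figures from the abstract.

```python
import math
from typing import List, Tuple

Matrix = List[List[float]]


def givens_qr(a: Matrix) -> Tuple[Matrix, Matrix]:
    """Return (Q^T, R) such that Q^T A = R, using Givens rotations."""
    n = len(a)
    r = [list(map(float, row)) for row in a]
    qt = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for j in range(n - 1):
        for i in range(j + 1, n):
            if r[i][j] == 0.0:
                continue  # entry is already zero, no rotation needed
            norm = math.hypot(r[j][j], r[i][j])
            c, s = r[j][j] / norm, r[i][j] / norm
            # Rotate rows j and i of both R and the accumulated Q^T.
            for m in (r, qt):
                for k in range(n):
                    top, bottom = m[j][k], m[i][k]
                    m[j][k] = c * top + s * bottom
                    m[i][k] = -s * top + c * bottom
    return qt, r


def qr_eigenvalues(a: Matrix, iterations: int = 200) -> List[float]:
    """Unshifted QR iteration: A <- R Q, then read the diagonal."""
    n = len(a)
    for _ in range(iterations):
        qt, r = givens_qr(a)
        # (R Q)[i][k] = sum_m R[i][m] * Q[m][k], and Q[m][k] = Q^T[k][m].
        a = [[sum(r[i][m] * qt[k][m] for m in range(n)) for k in range(n)]
             for i in range(n)]
    return [a[i][i] for i in range(n)]


def systolic_block_counts(n: int) -> Tuple[int, int]:
    """(diagonal GG blocks, off-diagonal RU blocks) as stated in the abstract."""
    return n - 1, n * (n - 1) // 2


# Hypothetical asymmetric 4x4 input; being triangular, its eigenvalues are
# its diagonal entries 4, 3, 2, 1.
example = [
    [4.0, 0.0, 0.0, 0.0],
    [1.0, 3.0, 0.0, 0.0],
    [2.0, 1.0, 2.0, 0.0],
    [1.0, 2.0, 1.0, 1.0],
]
print(qr_eigenvalues(example))
print(systolic_block_counts(4))
```

Matrices with complex eigenvalue pairs leave 2×2 blocks on the diagonal under this iteration, so a practical implementation would need extra handling for that case.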
Citation - WoS: 60
Bidirectional Recurrent Neural Network Language Models for Automatic Speech Recognition
(2015) Chen, Stanley; Sethy, Abhinav; Ramabhadran, Bhuvana; Arısoy, Ebru
Recurrent neural network language models have enjoyed great success in speech recognition, partially due to their ability to model longer-distance context than word n-gram models. In recurrent neural networks (RNNs), contextual information from past inputs is modeled with the help of recurrent connections at the hidden layer, while Long Short-Term Memory (LSTM) neural networks are RNNs that contain units that can store values for arbitrary amounts of time. While conventional unidirectional networks predict outputs from only past inputs, one can build bidirectional networks that also condition on future inputs. In this paper, we propose applying bidirectional RNNs and LSTM neural networks to language modeling for speech recognition. We discuss issues that arise when utilizing bidirectional models for speech, and compare unidirectional and bidirectional models on an English Broadcast News transcription task. We find that bidirectional RNNs significantly outperform unidirectional RNNs, but bidirectional LSTMs do not provide any further gain over their unidirectional counterparts.
Real-Time System Design on FPGA for Online Signature Verification (Çevrimde İmza Doğrulama için FPGA Üzerinde Gerçek Zamanlı Sistem Tasarımı)
(2020) Ayhan, Tuba; Orak, Remzi
In this project, an online signature verification system was implemented. The system takes a signature (initials or a handwritten character) from a touchscreen and, by comparing it with the signature features stored in its memory, indicates whether the signature belongs to the claimed person. Because the original signature images are not kept in memory, the system is somewhat resistant to signature theft. The system consists of a touchscreen, a Zynq-7000 development board, and a touchscreen stylus. The verification result is shown on the screen 0.13 s after the signature is written. For ease of use, the image of the written signature is also displayed on the screen. In the test environment, the classification performance of the system remains around 60% for skilled forgers but reaches 100% for random forgers. The dataset created within the project and opened to researchers contains 500 labeled signatures. All source code of the project has been released on GitHub. Information about the project, the code, the dataset, and a short video are also available on the project page (https://sites.google.com/mef.edu.tr/imza).
Citation - Scopus: 1
CNN-Based Emotion Recognition Using Data Augmentation and Preprocessing Methods
(Institute of Electrical and Electronics Engineers Inc., 2023) Toktaş, Tolga; Kırbız, Serap; Kayaoğlu, Bora
In this paper, a system that recognizes emotion from human faces is designed using Convolutional Neural Networks (CNNs). CNNs are known to perform well when trained on a large database. The lack of large and balanced publicly available databases that can be used by deep learning methods for emotion recognition is still a challenge. To overcome this problem, the amount of data is increased by merging the FER+, CK+ and KDEF databases, and preprocessing is applied to the face images to reduce the variations in the database. Data augmentation methods are used to reduce the imbalance in the data distribution that remains despite the larger amount of data in the merged database. The CNN-based method developed using database merging, image preprocessing and data augmentation achieved emotion recognition with 80% accuracy.
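Hardware Implementations of Ultra-Low-Energy Artificial Neural Networks Suitable for Portable Use (Çok Düşük Enerji Tüketen Taşınabilir Kullanıma Uygun Yapay Sinir Ağlarının Donanım Gerçeklemeleri)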
(2023) Kumbasar, Tufan; Altun, Mustafa; Ayhan, Tuba
Research on artificial neural networks (ANNs) and their industrial applications has been growing rapidly in recent years. The main motivation is that problems that are difficult to solve with high accuracy using conventional methods can be solved with ANNs. On the other hand, compared with conventional methods, ANNs require far more hardware resources, energy above all. As an example, suppose that each pixel of a rather small 16×16 image containing 256 pixels, and each ANN weight, is represented with 8 bits. In this case, a single artificial neuron requires 256 8-bit multiplications, 255 additions of at least 16 bits to sum the products, and an activation function to normalize the sum. Considering that even a relatively small ANN contains hundreds of such neurons, it is clear that storing this many weights in memory and carrying out the arithmetic operations is quite costly, especially in terms of energy consumption. This severely limits the use of ANNs in portable devices and is one of the main motivations of this work. In the proposed work, ultra-low-energy ANNs were designed using a newly proposed hybrid number representation, hardware optimizations were carried out, and the designs were used in object tracking applications. The work can be evaluated under the following three main headings, which appear as three work packages in TÜBİTAK project no. 119E507, which supported this work.
• Presenting new number representations for ANN energy savings and designing the circuit blocks.
• Carrying out energy-oriented ANN hardware designs and their optimization.
• Implementing object-tracking ANN designs on application-specific integrated circuit (ASIC) and field programmable gate array (FPGA) design platforms.
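The per-neuron operation count given in this abstract for a 16×16 image can be reproduced with a small calculation. The sketch below restates that arithmetic; the layer size of 100 neurons used at the end is a hypothetical figure chosen only to illustrate the "hundreds of neurons" remark.

```python
from typing import Dict


def neuron_cost(num_inputs: int, input_bits: int = 8) -> Dict[str, int]:
    """Count the arithmetic and storage needed by one fully connected neuron."""
    return {
        "multiplications": num_inputs,            # one product per input/weight pair
        "additions": num_inputs - 1,              # summing N products takes N-1 adds
        "product_bits": 2 * input_bits,           # an 8-bit x 8-bit product needs 16 bits
        "weight_storage_bits": num_inputs * input_bits,
    }


pixels = 16 * 16                     # 256 inputs, as in the abstract
single = neuron_cost(pixels)
print(single)

neurons = 100                        # hypothetical layer size
print("layer multiplications:", neurons * single["multiplications"])
print("layer weight storage (bits):", neurons * single["weight_storage_bits"])
```

The activation function is left out of the count because its cost depends on how it is implemented in hardware.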
Citation - WoS: 2
Citation - Scopus: 5
Compositional Neural Network Language Models for Agglutinative Languages
(2016) Saraçlar, Murat; Arısoy, Ebru
Continuous space language models (CSLMs) have been proven to be successful in speech recognition. With proper training of the word embeddings, words that are semantically or syntactically related are expected to be mapped to nearby locations in the continuous space. In agglutinative languages, words are formed by concatenating stems and suffixes and, as a result, compositional modeling is important. However, when trained on word tokens, CSLMs do not explicitly consider this structure. In this paper, we explore compositional modeling of stems and suffixes in a long short-term memory neural network language model. Our proposed models jointly learn distributed representations for stems and endings (concatenations of suffixes) and predict the probability of stem and ending sequences. Experiments on the Turkish Broadcast News transcription task show that further gains on top of a state-of-the-art stem-ending-based n-gram language model can be obtained with the proposed models.
Differential Microwave Imaging of Cerebral Hemorrhage via DORT Method
(IEEE, 2023) Dilman, İsmail; Bilgin, Egemen; Doğu, Semih
Bleeding in the brain tissues may cause fatal health conditions, and continuous monitoring of changes in the blood accumulation is important in the first few hours after the incident. Continuous post-event monitoring aims to detect variations in the size and shape of the hemorrhage regions. To this end, the human head is illuminated by non-ionizing electromagnetic radiation, and the scattered field is measured at different time instants. The decomposition of the time-reversal operator (DORT) method is then used as the microwave imaging algorithm to produce an indicator function. The performance of the proposed technique is assessed via numerical simulations involving a realistic human head phantom. The results suggest that the DORT method is capable of successfully detecting changes in multiple simultaneous cerebral hemorrhage regions.
Citation - WoS: 1
Citation - Scopus: 1
Domain Adaptation Approaches for Acoustic Modeling
(IEEE, 2020) Arısoy, Ebru; Fakhan, Enver
In recent years, with the development of neural network based models, ASR systems have achieved a tremendous performance increase. However, this performance increase mostly depends on the amount of training data and the computational power. In a low-resource data scenario, publicly available datasets can be utilized to overcome data scarcity. Furthermore, using a pre-trained model and adapting it to the in-domain data can help with computational constraints. In this paper, we leverage two different publicly available datasets and investigate various acoustic model adaptation approaches. We show that a 4% word error rate can be achieved using very limited in-domain data.
Citation - Scopus: 2
Feasibility of Distorted Born Iterative Method for Detecting Early Stage of Heart Failure
(IEEE, 2020) Akıncı, Mehmet Nuri; Bilgin, Egemen; Joof, Sulayman; Doğu, Semih
In this paper, we analyze the feasibility of using microwaves to detect the early stage of congestive heart failure, which causes water accumulation in the lungs. To this end, a slice from a realistic human torso phantom, which includes all human tissues and organs, is considered. The constitutive parameters of the phantom are calculated with a multiple-order Cole-Cole model at the operating frequency. Then, the scattered field is calculated via the Method of Moments, and 30 dB additive white Gaussian noise is added to create a more realistic scenario. In the inverse scattering phase, the distorted Born iterative method is used. The presented results show the feasibility of the proposed method.
Highlighting of Lecture Video Closed Captions
(IEEE, 2020) Yıldırım, Göktuğ; Öztufan, Huseyin Efe; Arısoy, Ebru
The main purpose of this study is to automatically highlight important regions of lecture video subtitles. Even though watching videos is an effective way of learning, the main disadvantage of video-based education is the limited interaction between the learner and the video. With the developed system, important regions that are automatically determined in lecture subtitles are highlighted with the aim of increasing the learner's attention to these regions. In this paper, the lecture videos are first converted into text using an automatic speech recognition system. Then, continuous space representations for sentences or word sequences in the transcriptions are generated using Bidirectional Encoder Representations from Transformers (BERT). Important regions of the subtitles are selected using a clustering method based on the similarity of these representations. The developed system is applied to the lecture videos, and it is found that using word sequence representations to determine the important regions of subtitles gives higher performance than using sentence representations. This result is encouraging for automatic highlighting of speech recognition outputs, where sentence boundaries are not defined explicitly.
Impact of Hardware Sources on Feature Selection for Online Signature Verification
(IEEE, 2020) Ayhan, Tuba; Orak, Remzi
This work analyzes time series features gathered from a touchpad that is part of an online signature verification system. A DTW processing unit is implemented on an FPGA for use in time series analysis. To support different feature groups, this unit can be reconfigured without altering the memory structure. Using this reconfigurable unit, features are evaluated according to the area cost that they introduce. Moreover, a method to predict the value of features for classification is introduced. In this way, the minimum requirements for implementing an online signature verification system on an FPGA are partially obtained.
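For context, DTW (dynamic time warping) aligns two time series that may be stretched or compressed in time and returns the cost of the best alignment. The sketch below is the textbook dynamic-programming formulation for one-dimensional sequences with an absolute-difference local cost; the feature choice, the cost function, and the sample sequences are assumptions for illustration and are not taken from the paper's FPGA unit.

```python
from typing import Sequence


def dtw_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Classic O(len(a) * len(b)) dynamic time warping distance."""
    inf = float("inf")
    n, m = len(a), len(b)
    # d[i][j] = cost of the best alignment of a[:i] with b[:j].
    d = [[inf] * (m + 1) for _ in range(n + 1)]
    d[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = abs(a[i - 1] - b[j - 1])
            d[i][j] = cost + min(d[i - 1][j],      # a advances, b repeats
                                 d[i][j - 1],      # b advances, a repeats
                                 d[i - 1][j - 1])  # both advance
    return d[n][m]


# Hypothetical pen x-coordinate traces: the second is a slower copy of the first.
reference = [1.0, 2.0, 3.0]
slower = [1.0, 2.0, 2.0, 3.0]
print(dtw_distance(reference, slower))
```

In a verification setting, the distance between an enrolled signature's feature series and a newly captured one would be compared against a threshold; how that threshold is chosen is outside the scope of this sketch.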
Citation - WoS: 1
Citation - Scopus: 2
Improving the Usage of Subword-Based Units for Turkish Speech Recognition
(IEEE, 2020) Çetinkaya, Gözde; Saraçlar, Murat; Arısoy, Ebru
Subword units are often utilized to achieve better performance in speech recognition because of the high number of observed words in agglutinative languages. In this study, the proper use of subword units in recognition is explored by reconsidering details such as silence modeling and position-dependent phones. A modified lexicon is implemented with finite-state transducers to represent the subword units correctly. We also experiment with different types of word boundary markers and achieve the best performance by adding a marker to both the left and right side of a subword unit. In our experiments on a Turkish broadcast news dataset, the subword models outperform word-based models and naive subword implementations. Results show that using proper subword units leads to a relative word error rate (WER) reduction of 2.4% compared with the word-level automatic speech recognition (ASR) system for Turkish.
Machine Learning Based Admittance Control for Human-Robot Haptic Interaction (İnsan-robot Dokunsal (haptik) Etkileşimi için Makine Öğrenme Tabanlı Admitans Kontrolü)
(2021) Başdoğan, Çağatay; Patoğlu, Volkan; Niaz, Pouya Pourakbarian; Aydın, Yusuf; Necipoğlu, Serkan; Şirintuna, Doğanay; Çaldıran, Ozan
In the near future, humans and robots are expected to work together in different environments such as factories, homes, and hospitals, jointly carrying out tasks that require physical interaction. One of the important research topics in physical human-robot interaction is establishing natural communication between the partners. Although a number of studies on human-robot interaction already exist, studies that examine the physical interaction between the partners, especially haptic communication, are limited, and the interaction in such systems still feels artificial compared with natural human-human interaction. In this project, a fractional-order adaptive admittance controller was developed for a collaborative robot that can perform joint tasks with a human. To our knowledge, a fractional-order admittance controller has not previously been used for physical human-robot interaction. The most important property of fractional-order controllers is that non-integer derivatives and integrals can be used, which gave us more flexibility than an integer-order controller in modeling and controlling the dynamics of the coupled (human-robot) system. Moreover, there is no example in the literature of a fractional-order admittance controller being used adaptively by means of machine learning algorithms. Machine learning algorithms enabled us to infer the human's intention during the task and to select the controller parameters accordingly so as to optimize task performance. To test the effectiveness of the methods developed in the project, controlled experiments requiring physical interaction between human and robot were conducted in a laboratory setting with 12 participants. In these experiments, the participants drilled holes in a vertical, flat wooden surface using a drill attached to the robot arm.
Using machine learning techniques, the subtask being performed by the user (Idle, Free Motion, or Contact) was detected in real time, and the controller parameters were adapted accordingly. In this way, the robot presented low resistance (transparency) to the human while being guided toward the drilling point (Free Motion), and high resistance during drilling (Contact), reducing the resulting vibrations and making the system more stable and safer. The results of these experiments showed that, for human-robot interaction, an adaptive fractional-order controller is far more effective in terms of task performance than an integer-order controller with fixed parameters. Finally, to test the validity of the developed system in an industrial setting, 3 workers from our industrial partner, As-Metal, were invited to our laboratory, and drilling experiments were carried out on a curved wooden surface. The workers were asked to drill holes at 3 different points and at 3 different angles on the surface. While performing this task, the workers received support both from our collaborative robot and from an augmented reality interface. After the experiments, the workers were asked to fill out a questionnaire in which they could give their opinions about the developed system. Through this questionnaire and one-on-one interviews with the workers, the robot's reliability, ease of use, and contribution to carrying out the task were measured. The questionnaire results showed that the developed human-robot interaction system is suitable, easy to use, and effective for industrial applications.
Integration and Management of Wi-Fi Offloading in Service Provider Infrastructures
(2016) Zeydan, Engin; Tan, A. Serdar
Integrating offloading technologies into the infrastructures of mobile network operators that provide heterogeneous access services is a challenging task. A connectivity management platform is a key element for heterogeneous mobile network operators in order to enable optimal offloading. In this study, the development and integration of a connectivity management platform that uses a novel multiple attribute decision making algorithm for efficient Wi-Fi offloading in heterogeneous wireless networks is presented. The proposed platform collects several terminal- and network-level attributes via infrastructure and client Application Programming Interfaces (APIs) and selects the best network access technology for the requesting users. Through experimentation, we provide details on the platform's integration with the service provider's network and a sensitivity analysis of the multiple attribute decision making algorithm.
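To give a feel for how a multiple attribute decision making step can pick an access network, the sketch below uses simple additive weighting (SAW), one of the most basic MADM techniques. The abstract does not say which MADM algorithm, attributes, or weights the platform uses, so the attribute names, values, and weights here are hypothetical, and SAW stands in for whatever method the authors actually propose.

```python
from typing import Dict, List, Set, Tuple

Attributes = Dict[str, float]


def saw_rank(candidates: Dict[str, Attributes],
             weights: Dict[str, float],
             cost_attributes: Set[str]) -> List[Tuple[str, float]]:
    """Rank candidates by simple additive weighting.

    Benefit attributes are normalized as value / max, cost attributes as
    min / value, so every normalized value lies in (0, 1]. All attribute
    values are assumed to be strictly positive.
    """
    scores = {}
    for name, attrs in candidates.items():
        total = 0.0
        for attr, weight in weights.items():
            column = [c[attr] for c in candidates.values()]
            if attr in cost_attributes:
                normalized = min(column) / attrs[attr]
            else:
                normalized = attrs[attr] / max(column)
            total += weight * normalized
        scores[name] = total
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Hypothetical measurements for two access networks.
networks = {
    "wifi": {"throughput_mbps": 40.0, "latency_ms": 30.0, "load": 0.8},
    "lte": {"throughput_mbps": 20.0, "latency_ms": 50.0, "load": 0.4},
}
weights = {"throughput_mbps": 0.5, "latency_ms": 0.3, "load": 0.2}

ranking = saw_rank(networks, weights, cost_attributes={"latency_ms", "load"})
print(ranking[0][0], ranking)
```

Changing the weights and re-running the ranking is a simple way to explore the kind of sensitivity analysis the abstract mentions.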
