Elektrik Elektronik Mühendisliği Bölümü Koleksiyonu (Electrical and Electronics Engineering Department Collection)
Permanent URI for this collection: https://hdl.handle.net/20.500.11779/1941
Article
A Bayesian Allocation Model Based Approach To Mixed Membership Stochastic Blockmodels (Taylor and Francis Ltd., 2022)
Kırbız, Serap; Hızlı, Çağlar
Although detecting communities in networks has attracted considerable recent attention, estimating the number of communities is still an open problem. In this paper, we propose a model, BAM-MMSB, which replicates the generative process of the mixed-membership stochastic block model (MMSB) within the generic allocation framework of the Bayesian allocation model (BAM). In contrast to traditional blockmodels, BAM-MMSB considers the observations as Poisson counts generated by a base Poisson process and marked according to the generative process of MMSB. Moreover, the optimal number of communities for BAM-MMSB is estimated by computing the variational approximations of the marginal likelihood for each model order. Experiments on synthetic and real data sets show that the proposed approach promises a generalized model selection solution that can choose not only the model size but also the most appropriate decomposition.

Conference Object | Citation - Scopus: 2
A Decade of Discriminative Language Modeling for Automatic Speech Recognition (2015)
Arısoy, Ebru; Saraçlar, Murat; Dikici, Erinc
This paper summarizes the research on discriminative language modeling focusing on its application to automatic speech recognition (ASR). A discriminative language model (DLM) is typically a linear or log-linear model consisting of a weight vector associated with a feature vector representation of a sentence. This flexible representation can include linguistically and statistically motivated features that incorporate morphological and syntactic information. At test time, DLMs are used to rerank the output of an ASR system, represented as an N-best list or lattice. During training, both negative and positive examples are used with the aim of directly optimizing the error rate.
Various machine learning methods, including the structured perceptron, large margin methods and maximum regularized conditional log-likelihood, have been used for estimating the parameters of DLMs. Typically positive examples for DLM training come from the manual transcriptions of acoustic data while the negative examples are obtained by processing the same acoustic data with an ASR system. Recent research generalizes DLM training by either using automatic transcriptions for the positive examples or simulating the negative examples.

Conference Object | Citation - Scopus: 4
A Framework for Automatic Generation of Spoken Question-Answering Data (Association for Computational Linguistics (ACL), 2022)
Manav, Y.; Menevşe, M.Ü.; Özgür, A.; Arısoy, Ebru
This paper describes a framework to automatically generate a spoken question answering (QA) dataset. The framework consists of a question generation (QG) module to generate questions automatically from given text documents, a text-to-speech (TTS) module to convert the text documents into spoken form and an automatic speech recognition (ASR) module to transcribe the spoken content. The final dataset contains question-answer pairs for both the reference text and ASR transcriptions as well as the audio files corresponding to each reference text. For QG and ASR systems we used pre-trained multilingual encoder-decoder transformer models and fine-tuned these models using a limited amount of manually generated QA data and TTS-based speech data, respectively. As a proof of concept, we investigated the proposed framework for Turkish and generated the Turkish Question Answering (TurQuAse) dataset using Wikipedia articles. Manual evaluation of the automatically generated question-answer pairs and QA performance evaluation with state-of-the-art models on TurQuAse show that the proposed framework is efficient for automatically generating spoken QA datasets.
To the best of our knowledge, TurQuAse is the first publicly available spoken question answering dataset for Turkish. The proposed framework can be easily extended to other languages where a limited amount of QA data is available. © 2022 Association for Computational Linguistics.

Conference Object | Citation - Scopus: 1
A Microwave Imaging Scheme for Detection of Pulmonary Edema and Hemorrhage (IEEE, 2022)
Ertek, Didem; Kucuk, Gokhan; Bilgin, Egemen
Microwave imaging systems have the potential to present a cost-effective and less hazardous alternative to conventional medical imaging techniques. In this paper, a microwave imaging scheme based on the Contrast Source Inversion method is proposed and tested for the detection of pulmonary edema and hemorrhage. To this end, a realistic human torso phantom is used, and the electromagnetic parameters of the human tissues are determined via the Cole-Cole model. The scattered field is simulated via the Method of Moments at the operating frequency of 350 MHz, and 50 dB white Gaussian noise is added to model a realistic measurement setup. The numerical tests performed with the proposed technique suggest that the method can be used to locate pulmonary edema and hemorrhage, and that it is capable of distinguishing these two medical conditions successfully.

Conference Object | Citation - WoS: 1
A Modified Newton Method Formulation for Microwave Imaging (IEEE, 2020)
Coşğun, Sema; Çayören, Mehmet; Bilgin, Egemen; Doğu, Semih
A new variant of Newton type methods has been developed for quantitative microwave imaging. To deal with the ill-posedness of the inverse problems, standard Newton type methods involve a linearization of the so-called data equation using the Fréchet derivative with respect to the contrast function. Here, the formulation is expanded to include the object equation; the formulation therefore seeks to reduce the errors in both the data and the object equations.
While this modification does not remove the need to solve the forward problem at each step, it nevertheless significantly improves the convergence rate and the performance. To assess the efficiency of the proposed technique, numerical simulations with synthetic and experimental data have been carried out. The results demonstrate that the proposed variant outperforms the standard Newton method, and shows comparable performance to the contrast source inversion (CSI) algorithm with fewer iterations.

Conference Object | Citation - Scopus: 1
A RAN/SDN Controller Based Connectivity Management Platform for Mobile Service Providers (Institute of Electrical and Electronics Engineers Inc., 2017)
Ayhan, Gökhan; Koca, Melih; Zeydan, Engin; Tan, A. Serdar
In this demo, we demonstrate the integration of a radio access network (RAN)/Software-Defined Networking (SDN) controller with a connectivity management platform designed for mobile wireless networks. This is an architecture designed throughout the EU Celtic-Plus project SIGMONA. The OpenDaylight-based RAN/SDN controller and the application server are capable of collecting infrastructure and client related parameters from OpenFlow-enabled switches and Android-based phones, respectively. The decision on the best access network selection is computed at the application server using a Multiple Attribute Decision Making (MADM) algorithm and instructed back to the Android-based mobile client for execution of access network selection. © 2017 IFIP.

Conference Object
A Resonator Design For Mutual Coupling Reduction Between Microstrip Antennas In MIMO Applications At 28 GHz (Institute of Electrical and Electronics Engineers Inc., 2024)
Gollu, A.A.; Polat, B.; Semerci, D.; Bilgin, E.
A simple resonator structure is proposed to reduce the mutual coupling between rectangular microstrip patch antennas positioned close to each other for MIMO applications at 28 GHz center frequency.
Here, the frequency of 28 GHz is chosen because it is one of the middle bands for 5G communication in the USA. Two microstrip patch antennas with gaps using a common dielectric substrate and a ground plane are employed as antennas, and the patches are closely placed with an edge-to-edge distance of 0.6 mm (approximately λ/18). In order to reduce the mutual coupling between these radiating elements and increase the isolation, a resonator is positioned between them and its parameters are optimized. In the simulations, it is observed that the proposed resonator reduces the coupling by approximately 10 dB. From this result, it can be concluded that the proposed structure may be suitable for tightly packed MIMO systems. © 2024 IEEE.

Article | Citation - WoS: 37 | Citation - Scopus: 44
Adaptive Human Force Scaling Via Admittance Control for Physical Human-Robot Interaction (IEEE, 2021)
Başdoğan, Çağatay; Aydın, Yusuf; Hamad, Yahya M.
The goal of this article is to design an admittance controller for a robot to adaptively change its contribution to a collaborative manipulation task executed with a human partner to improve the task performance. This has been achieved by adaptive scaling of human force based on her/his movement intention while paying attention to the requirements of different task phases. In our approach, the movement intentions of the human are estimated from the measured human force and the velocity of the manipulated object, and converted to a quantitative value using a fuzzy logic scheme. This value is then utilized as a variable gain in an admittance controller to adaptively adjust the contribution of the robot to the task without changing the admittance time constant. We demonstrate the benefits of the proposed approach by a pHRI experiment utilizing Fitts’ reaching movement task.
The results of the experiment show that there is a) an optimum admittance time constant maximizing the human force amplification and b) a desirable admittance gain profile which leads to a more effective co-manipulation in terms of overall task performance.

Article | Citation - WoS: 19 | Citation - Scopus: 21
An Adaptive Admittance Controller for Collaborative Drilling With a Robot Based on Subtask Classification Via Deep Learning (Elsevier, 2022)
Başdoğan, Çağatay; Niaz, P. Pouya; Aydın, Yusuf; Güler, Berk; Madani, Alireza
In this paper, we propose a supervised learning approach based on an Artificial Neural Network (ANN) model for real-time classification of subtasks in a physical human–robot interaction (pHRI) task involving contact with a stiff environment. In this regard, we consider three subtasks for a given pHRI task: Idle, Driving, and Contact. Based on this classification, the parameters of an admittance controller that regulates the interaction between human and robot are adjusted adaptively in real time to make the robot more transparent to the operator (i.e. less resistant) during the Driving phase and more stable during the Contact phase. The Idle phase is primarily used to detect the initiation of the task. Experimental results have shown that the ANN model can learn to detect the subtasks under different admittance controller conditions with an accuracy of 98% for 12 participants. Finally, we show that the admittance adaptation based on the proposed subtask classifier leads to 20% lower human effort (i.e. higher transparency) in the Driving phase and 25% lower oscillation amplitude (i.e.
higher stability) during drilling in the Contact phase compared to an admittance controller with fixed parameters.

Conference Object | Citation - Scopus: 5
An Antipodal Vivaldi Antenna Design for Torso Imaging in a Coupling Medium (IEEE, 2021)
Çayören, Mehmet; Bilgin, Egemen; Joof, Sulayman; Doğu, Semih
An antipodal Vivaldi antenna designed to operate in a coupling medium with a relative dielectric constant of ε_r = 25 for microwave imaging of the torso is presented in this paper. The proposed antenna is similar to the conventional antipodal Vivaldi antenna but with optimized parameters to radiate in the desired coupling medium. The antenna has a size of 120 × 70 mm² and operates over the 230-1000 MHz frequency band with a peak gain of 5.42 dBi and a peak front-to-back ratio of 143 dB. The designed antenna shows a better performance compared to other antennas used for microwave torso imaging. To assess the actual performance, a realistic human torso phantom is implemented to detect the water accumulation in the lungs, and the linear sampling method is used as the inversion method. The 3-D reconstruction results show that the proposed antenna can be a candidate for microwave torso imaging applications.

Conference Object | Citation - WoS: 2 | Citation - Scopus: 10
An FPGA Implementation of a RISC-V Based SoC System for Image Processing Applications (IEEE, 2021)
Gholizadehazari, Erfan; Ayhan, Tuba; Ors, Berna
The Laplacian filter is one of the fundamental applications in image processing. In our work, the Laplacian filter has been applied to an image, and both hardware and software implementations of the filter have been studied. Our system consists of an OV7670 camera module, a Nexys 4 DDR FPGA board and a VGA monitor to display the processed video stream. The process has the following tasks: the camera module captures raw RGB data and writes it to RAM, the Laplacian filter IP processes the raw image, and the results are written back to memory. The VGA module shows the output images on the monitor.
The hardware and software implementations of the Laplacian filter are compared in terms of time and area.

Conference Object | Citation - Scopus: 1
An FPGA Implementation of Givens Rotation Based Digital Architecture for Computing Eigenvalues of Asymmetric Matrix (IEEE, 2022)
Köseoğlu, İlayda; Yalçın, Mustak Erhan; Öztürk, Elif; Ayhan, Tuba
This paper proposes a digital circuit design that performs the eigenvalue calculation of asymmetric matrices with real-valued elements. Eigenvalues are computed iteratively through the QR algorithm. In the QR algorithm, the input matrix is factorized into an orthogonal matrix Q and an upper triangular matrix R, then the RQ product is calculated to obtain an iterated matrix. For a time-efficient QR decomposition process, the Givens Rotation (GR) principle is utilized to benefit from its parallelization feature. Parallelization is managed by the Systolic Array (SA) architecture that is created by placing Givens Generation (GG) and Row Update (RU) blocks in a triangular array. In this paper, a 4×4 input matrix is used to create a TSA architecture including n − 1 diagonal (GG) and n(n − 1)/2 off-diagonal (RU) modules. In the results section, Givens Rotation is compared with the Gram-Schmidt algorithm used in our previous study [1] in terms of error and area usage.

Article | Citation - WoS: 21
Audio Source Separation Using Variational Autoencoders and Weak Class Supervision (Institute of Electrical and Electronics Engineers (IEEE), 2019)
Kırbız, Serap; Karamatlı, Ertuğ; Cemgil, Ali Taylan
In this letter, we propose a source separation method that is trained by observing the mixtures and the class labels of the sources present in the mixture without any access to isolated sources. Since our method does not require source class labels for every time-frequency bin but only a single label for each source constituting the mixture signal, we call this scenario weak class supervision.
We associate a variational autoencoder (VAE) with each source class within a non-negative (compositional) model. Each VAE provides a prior model to identify the signal from its associated class in a sound mixture. After training the model on mixtures, we obtain a generative model for each source class and demonstrate our method on one-second mixtures of utterances of digits from 0 to 9. We show that the separation performance obtained by source class supervision is as good as the performance obtained by source signal supervision.

Conference Object | Citation - WoS: 58
Bidirectional Recurrent Neural Network Language Models for Automatic Speech Recognition (2015)
Chen, Stanley; Sethy, Abhinav; Ramabhadran, Bhuvana; Arısoy, Ebru
Recurrent neural network language models have enjoyed great success in speech recognition, partially due to their ability to model longer-distance context than word n-gram models. In recurrent neural networks (RNNs), contextual information from past inputs is modeled with the help of recurrent connections at the hidden layer, while Long Short-Term Memory (LSTM) neural networks are RNNs that contain units that can store values for arbitrary amounts of time. While conventional unidirectional networks predict outputs from only past inputs, one can build bidirectional networks that also condition on future inputs. In this paper, we propose applying bidirectional RNNs and LSTM neural networks to language modeling for speech recognition. We discuss issues that arise when utilizing bidirectional models for speech, and compare unidirectional and bidirectional models on an English Broadcast News transcription task.
We find that bidirectional RNNs significantly outperform unidirectional RNNs, but bidirectional LSTMs do not provide any further gain over their unidirectional counterparts.

Research Project
Çevrimde İmza Doğrulama için FPGA Üzerinde Gerçek Zamanlı Sistem Tasarımı [Real-Time System Design on FPGA for Online Signature Verification] (2020)
Ayhan, Tuba; Orak, Remzi
Within the scope of this project, an online signature verification system has been implemented. The system takes a signature (initials or a handwritten character) from a touch screen, compares it with the signature features stored in its memory, and indicates whether the signature belongs to the claimed person. Since the original signature images are not kept in memory, the system is somewhat resistant to signature theft. The system consists of a touch screen, a Zynq-7000 development board and a touch-screen stylus. The verification result is displayed on the screen 0.13 s after the signature is written. For ease of use, the image of the written signature is also shown on the screen. In the test environment, the classification performance of the system remains around 60% for skilled forgers but reaches 100% for casual forgers. The dataset created within the project and opened to researchers contains 500 labeled signatures. All source code of the project has been released on GitHub. Information about the project, the code, the dataset and a short video are also available on the project page (https://sites.google.com/mef.edu.tr/imza).

Conference Object | Citation - Scopus: 1
CNN-Based Emotion Recognition Using Data Augmentation and Preprocessing Methods (Institute of Electrical and Electronics Engineers Inc., 2023)
Toktaş, Tolga; Kırbız, Serap; Kayaoğlu, Bora
In this paper, a system that recognizes emotion from human faces is designed using Convolutional Neural Networks (CNN). CNN is known to perform well when trained with a large database. The lack of large and balanced publicly available databases that can be used by deep learning methods for emotion recognition is still a challenge.
To overcome this problem, the amount of data is increased by merging the FER+, CK+ and KDEF databases, and preprocessing is applied to the face images in order to reduce the variations in the database. Data augmentation methods are used to reduce the imbalance in the data distribution that still remains despite the increased amount of data in the merged database. The CNN-based method developed using database merging, image preprocessing and data augmentation achieved emotion recognition with 80% accuracy.

Research Project
Çok Düşük Enerji Tüketen Taşınabilir Kullanıma Uygun Yapay Sinir Ağlarının Donanım Gerçeklemeleri [Hardware Implementations of Ultra-Low-Energy Artificial Neural Networks Suitable for Portable Use] (2023)
Kumbasar, Tufan; Altun, Mustafa; Ayhan, Tuba
Research on artificial neural networks (ANNs) in the literature and their industrial applications have been increasing rapidly in recent years. The main motivation is that problems that are difficult to solve with high accuracy using conventional methods can be solved with ANNs. On the other hand, compared with conventional methods, using ANNs requires far more hardware resources, energy above all. As an example, assume that each pixel of a fairly small 16×16 image containing 256 pixels, as well as the ANN weights, is represented with 8-bit inputs. In this case, a single artificial neuron requires 256 8-bit multiplications, 255 additions of at least 16 bits to sum the products, and an activation function to normalize the sum. Considering that even a relatively small ANN contains hundreds of such neurons, it is clear that storing this many weights in memory and performing the arithmetic operations will be quite costly, especially in terms of energy consumption. This severely limits the use of ANNs in portable devices and is one of the main motivations of this work.
In the proposed work, ultra-low-energy ANNs were designed using the proposed new hybrid number representation, hardware optimizations were carried out, and the designs were used in object tracking applications. The work can be evaluated under the following three main headings. These three headings appear as three work packages in TÜBİTAK project No. 119E507, which this work supports.
- Presenting new number representations for ANN energy savings and designing the circuit blocks.
- Carrying out energy-oriented ANN hardware designs and their optimization.
- Implementing object-tracking ANN designs on application-specific integrated circuit (ASIC) and field-programmable gate array (FPGA) design platforms.

Conference Object | Citation - WoS: 2 | Citation - Scopus: 5
Compositional Neural Network Language Models for Agglutinative Languages (2016)
Saraçlar, Murat; Arısoy, Ebru
Continuous space language models (CSLMs) have been proven to be successful in speech recognition. With proper training of the word embeddings, words that are semantically or syntactically related are expected to be mapped to nearby locations in the continuous space. In agglutinative languages, words are made up of concatenation of stems and suffixes and, as a result, compositional modeling is important. However, when trained on word tokens, CSLMs do not explicitly consider this structure. In this paper, we explore compositional modeling of stems and suffixes in a long short-term memory neural network language model. Our proposed models jointly learn distributed representations for stems and endings (concatenation of suffixes) and predict the probability for stem and ending sequences.
Experiments on the Turkish Broadcast News transcription task show that further gains on top of a state-of-the-art stem-ending-based n-gram language model can be obtained with the proposed models.

Conference Object | Citation - Scopus: 1
Design and FPGA Implementation of UAV Simulator for Fast Prototyping (IEEE, 2023)
Aydın, Yusuf; Ayhan, Tuba; Akyavaş, İrfan
As production and advances in motor and battery cell technology progress, unmanned aerial vehicles (UAVs) are gaining more and more acceptance and popularity. Unfortunately, the design and prototyping of UAVs is an expensive and long process. This paper proposes a fast, component-based simulation environment for UAVs so that they can be roughly tested without a damage risk. Moreover, the combined effect of individual component choices can be observed with the simulator to reduce design time. The simulator is flexible in the sense that detailed aerodynamic effects and selected component models can be included. In this work, the simulator is proposed, model parameters are extracted for a particular UAV for testing the simulator, and it is implemented on a field-programmable gate array (FPGA) to increase simulation speed. The simulator calculates the battery state of charge (SOC), position, velocity and acceleration of the UAV, taking into account gravity, drag and propeller air inflow velocity. The simulator runs on the FPGA fabric of AMD-XCKU13P with simulation steps of 1 ms.

Conference Object | Citation - WoS: 3 | Citation - Scopus: 5
Developing an Automatic Transcription and Retrieval System for Spoken Lectures in Turkish (2017)
Arısoy, Ebru
With the increase of online video lectures, using speech and language processing technologies for education has become quite important. This paper presents an automatic transcription and retrieval system developed for processing spoken lectures in Turkish.
The main steps in the system are automatic transcription of Turkish video lectures using a large vocabulary continuous speech recognition (LVCSR) system and finding keywords on the lattices obtained from the LVCSR system using a speech retrieval system based on keyword search. While developing this system, first a state-of-the-art LVCSR system was developed for Turkish using advanced acoustic modeling methods, then keywords were extracted automatically from word sequences in the reference transcriptions of video lectures, and a speech retrieval system was developed for searching these keywords in the lattice output of the LVCSR system. The spoken lecture processing system yields 14.2% word error rate and 0.86 maximum term weighted value on the test data.

