Electrical and Electronics Engineering Department Collection (Elektrik Elektronik Mühendisliği Bölümü Koleksiyonu)

Permanent URI for this collection: https://hdl.handle.net/20.500.11779/1941

Now showing 1 - 20 of 44

A Bayesian Allocation Model Based Approach To Mixed Membership Stochastic Blockmodels
(Taylor and Francis Ltd., 2022) Kırbız, Serap; Hızlı, Çağlar
Although detecting communities in networks has attracted considerable recent attention, estimating the number of communities is still an open problem. In this paper, we propose BAM-MMSB, a model that replicates the generative process of the mixed-membership stochastic block model (MMSB) within the generic allocation framework of the Bayesian allocation model (BAM). In contrast to traditional blockmodels, BAM-MMSB considers the observations as Poisson counts generated by a base Poisson process and marks them according to the generative process of MMSB. Moreover, the optimal number of communities for BAM-MMSB is estimated by computing the variational approximations of the marginal likelihood for each model order. Experiments on synthetic and real data sets show that the proposed approach promises a generalized model selection solution that can choose not only the model size but also the most appropriate decomposition.
Citation - Scopus: 2
A Decade of Discriminative Language Modeling for Automatic Speech Recognition
(Springer-Verlag Berlin, 2015) Arısoy, Ebru; Saraçlar, Murat; Dikici, Erinc
This paper summarizes the research on discriminative language modeling focusing on its application to automatic speech recognition (ASR). A discriminative language model (DLM) is typically a linear or log-linear model consisting of a weight vector associated with a feature vector representation of a sentence. This flexible representation can include linguistically and statistically motivated features that incorporate morphological and syntactic information. At test time, DLMs are used to rerank the output of an ASR system, represented as an N-best list or lattice. During training, both negative and positive examples are used with the aim of directly optimizing the error rate. Various machine learning methods, including the structured perceptron, large margin methods and maximum regularized conditional log-likelihood, have been used for estimating the parameters of DLMs. Typically positive examples for DLM training come from the manual transcriptions of acoustic data while the negative examples are obtained by processing the same acoustic data with an ASR system. Recent research generalizes DLM training by either using automatic transcriptions for the positive examples or simulating the negative examples.
Citation - Scopus: 1
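The reranking step described in the abstract above can be sketched as follows. This is only a minimal illustration, assuming a toy feature set (unigram and bigram counts), hand-picked weights, and made-up ASR scores; the function names `features` and `rerank` are hypothetical and do not come from the paper.

```python
from collections import Counter

def features(sentence):
    """Map a sentence to sparse unigram and bigram counts (a toy feature set)."""
    words = sentence.split()
    feats = Counter(words)
    feats.update(" ".join(pair) for pair in zip(words, words[1:]))
    return feats

def rerank(nbest, weights, asr_weight=1.0):
    """Sort hypotheses by ASR score plus the linear DLM score w . phi(sentence)."""
    def total(item):
        hyp, asr_score = item
        dlm = sum(weights.get(f, 0.0) * c for f, c in features(hyp).items())
        return asr_weight * asr_score + dlm
    return sorted(nbest, key=total, reverse=True)

# Hypothetical N-best list: (hypothesis, ASR log-score). Values are illustrative.
nbest = [("recognize speech", -4.0), ("wreck a nice beach", -3.5)]
# Hypothetical weights, as if learned by a structured perceptron or similar method.
weights = {"recognize speech": 1.0, "wreck": -0.5}
best = rerank(nbest, weights)[0][0]
```

In this toy setup the DLM score overturns the ASR ranking, which is the effect reranking aims for when the weights have been trained to reduce the error rate.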
A Microwave Imaging Scheme for Detection of Pulmonary Edema and Hemorrhage
(IEEE, 2022) Ertek, Didem; Kucuk, Gokhan; Bilgin, Egemen
Microwave imaging systems have the potential to offer a cost-effective and less hazardous alternative to conventional medical imaging techniques. In this paper, a microwave imaging scheme based on the Contrast Source Inversion method is proposed and tested for the detection of pulmonary edema and hemorrhage. To this end, a realistic human torso phantom is used, and the electromagnetic parameters of the human tissues are determined via the Cole-Cole model. The scattered field is simulated via the Method of Moments at an operating frequency of 350 MHz, and 50 dB white Gaussian noise is added to model a realistic measurement setup. The numerical tests performed with the proposed technique suggest that the method can be used to locate pulmonary edema and hemorrhage, and that it is capable of distinguishing these two medical conditions successfully.
Citation - WoS: 1
A Modified Newton Method Formulation for Microwave Imaging
(IEEE, 2020) Coşğun, Sema; Çayören, Mehmet; Bilgin, Egemen; Doğu, Semih
A new variant of Newton-type methods has been developed for quantitative microwave imaging. To deal with the ill-posedness of the inverse problem, standard Newton-type methods involve a linearization of the so-called data equation using the Fréchet derivative with respect to the contrast function. Here, the formulation is expanded to include the object equation; the formulation therefore seeks to reduce the errors in both the data and the object equations. While this modification does not remove the need to solve the forward problem at each step, it nevertheless significantly improves the convergence rate and the performance. To assess the efficiency of the proposed technique, numerical simulations with synthetic and experimental data have been carried out. The results demonstrate that the proposed variant outperforms the standard Newton method and shows comparable performance to the contrast source inversion (CSI) algorithm with fewer iterations.
Citation - WoS: 42
Citation - Scopus: 50
Adaptive Human Force Scaling Via Admittance Control for Physical Human-Robot Interaction
(IEEE, 2021) Başdoğan, Çağatay; Aydın, Yusuf; Hamad, Yahya M.
The goal of this article is to design an admittance controller for a robot to adaptively change its contribution to a collaborative manipulation task executed with a human partner to improve the task performance. This has been achieved by adaptive scaling of the human force based on her/his movement intention while paying attention to the requirements of different task phases. In our approach, the movement intentions of the human are estimated from the measured human force and the velocity of the manipulated object, and converted to a quantitative value using a fuzzy logic scheme. This value is then utilized as a variable gain in an admittance controller to adaptively adjust the contribution of the robot to the task without changing the admittance time constant. We demonstrate the benefits of the proposed approach by a pHRI experiment utilizing Fitts’ reaching movement task. The results of the experiment show that there is a) an optimum admittance time constant maximizing the human force amplification and b) a desirable admittance gain profile which leads to a more effective co-manipulation in terms of overall task performance.
Citation - WoS: 22
Citation - Scopus: 24
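The idea of scaling the human force with a variable gain while leaving the admittance time constant untouched can be illustrated with a small simulation. The sketch below assumes a standard first-order admittance law, mass * dv/dt + damping * v = gain * force, integrated with forward Euler; this form, the parameter values, and the names `admittance_step` and `simulate` are assumptions for illustration, not the controller from the article. In the article the gain comes from a fuzzy logic scheme, which is replaced here by a fixed number.

```python
def admittance_step(v, force, gain, mass=10.0, damping=20.0, dt=0.001):
    """One forward-Euler step of mass * dv/dt + damping * v = gain * force."""
    accel = (gain * force - damping * v) / mass
    return v + dt * accel

def simulate(force, gain, steps=5000):
    """Apply a constant human force for `steps` time steps and return the velocity."""
    v = 0.0
    for _ in range(steps):
        v = admittance_step(v, force, gain)
    return v

# The time constant mass / damping = 0.5 s does not depend on the gain;
# only the steady-state velocity gain * force / damping changes.
v_final = simulate(force=10.0, gain=2.0)
```

With these illustrative values the velocity settles near gain * force / damping, so doubling the gain doubles the robot's response to the same human force without altering how quickly it settles.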
An Adaptive Admittance Controller for Collaborative Drilling With a Robot Based on Subtask Classification Via Deep Learning
(Elsevier, 2022) Başdoğan, Çağatay; Niaz, P. Pouya; Aydın, Yusuf; Güler, Berk; Madani, Alireza
In this paper, we propose a supervised learning approach based on an Artificial Neural Network (ANN) model for real-time classification of subtasks in a physical human–robot interaction (pHRI) task involving contact with a stiff environment. In this regard, we consider three subtasks for a given pHRI task: Idle, Driving, and Contact. Based on this classification, the parameters of an admittance controller that regulates the interaction between human and robot are adjusted adaptively in real time to make the robot more transparent to the operator (i.e. less resistant) during the Driving phase and more stable during the Contact phase. The Idle phase is primarily used to detect the initiation of the task. Experimental results have shown that the ANN model can learn to detect the subtasks under different admittance controller conditions with an accuracy of 98% for 12 participants. Finally, we show that the admittance adaptation based on the proposed subtask classifier leads to 20% lower human effort (i.e. higher transparency) in the Driving phase and 25% lower oscillation amplitude (i.e. higher stability) during drilling in the Contact phase compared to an admittance controller with fixed parameters.
Citation - Scopus: 5
An Antipodal Vivaldi Antenna Design for Torso Imaging in a Coupling Medium
(IEEE, 2021) Çayören, Mehmet; Bilgin, Egemen; Joof, Sulayman; Doğu, Semih
An antipodal Vivaldi antenna designed to operate in a coupling medium with a relative dielectric constant of ε_r = 25 for microwave imaging of the torso is presented in this paper. The proposed antenna is similar to the conventional antipodal Vivaldi antenna but with parameters optimized to radiate in the desired coupling medium. The antenna has a size of 120 × 70 mm² and operates over a 230-1000 MHz frequency bandwidth with a peak gain of 5.42 dBi and a peak front-to-back ratio of 143 dB. The designed antenna shows better performance than other antennas used for microwave torso imaging. To assess the actual performance, a realistic human torso phantom is implemented to detect water accumulation in the lungs, and the linear sampling method is used as the inversion method. The 3-D reconstruction results show that the proposed antenna can be a candidate for microwave torso imaging applications.
Citation - WoS: 3
Citation - Scopus: 10
An FPGA Implementation of a RISC-V Based SoC System for Image Processing Applications
(IEEE, 2021) Gholizadehazari, Erfan; Ayhan, Tuba; Ors, Berna
The Laplacian filter is one of the fundamental applications in image processing. In our work, the Laplacian filter has been applied to an image, and both hardware and software implementations of the filter have been studied. Our system consists of an OV7670 camera module, a Nexys 4 DDR FPGA board, and a VGA monitor to display the processed video stream. The process has the following tasks: the camera module captures raw RGB data and writes it to RAM, the Laplacian filter IP processes the raw image, and the results are written back to memory. The VGA module shows the output images on the monitor. The hardware and software implementations of the Laplacian filter are compared in terms of time and area.
Citation - WoS: 22
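For readers unfamiliar with the filter discussed in the abstract above, a software Laplacian filter can be sketched in a few lines. This is a minimal sketch, assuming a grayscale image stored as a list of lists and the common 4-neighbour 3x3 kernel; the kernel choice, the zeroed border, and the name `laplacian` are assumptions and not details taken from the paper, which works on a camera stream.

```python
# 4-neighbour Laplacian kernel (one common choice; the paper's kernel is not stated here).
KERNEL = [[0, 1, 0],
          [1, -4, 1],
          [0, 1, 0]]

def laplacian(image):
    """Apply the 3x3 kernel to a 2-D list of gray levels; border pixels are left at 0."""
    h, w = len(image), len(image[0])
    out = [[0] * w for _ in range(h)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            out[y][x] = sum(
                KERNEL[j][i] * image[y + j - 1][x + i - 1]
                for j in range(3)
                for i in range(3)
            )
    return out

flat = [[7] * 4 for _ in range(4)]           # uniform region: no edges
spot = [[0, 0, 0], [0, 9, 0], [0, 0, 0]]     # single bright pixel
flat_out = laplacian(flat)
spot_out = laplacian(spot)
```

A uniform region gives a zero response, while an isolated bright pixel gives a strong negative response at its centre, which is why the filter is used for edge detection.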
Audio Source Separation Using Variational Autoencoders and Weak Class Supervision
(Institute of Electrical and Electronics Engineers (IEEE), 2019) Kırbız, Serap; Karamatlı, Ertuğ; Cemgil, Ali Taylan
In this letter, we propose a source separation method that is trained by observing the mixtures and the class labels of the sources present in the mixture without any access to isolated sources. Since our method does not require source class labels for every time-frequency bin but only a single label for each source constituting the mixture signal, we call this scenario weak class supervision. We associate a variational autoencoder (VAE) with each source class within a non-negative (compositional) model. Each VAE provides a prior model to identify the signal from its associated class in a sound mixture. After training the model on mixtures, we obtain a generative model for each source class and demonstrate our method on one-second mixtures of utterances of digits from 0 to 9. We show that the separation performance obtained by source class supervision is as good as the performance obtained by source signal supervision.
Citation - WoS: 60
Bidirectional Recurrent Neural Network Language Models for Automatic Speech Recognition
(IEEE, 2015) Chen, Stanley; Sethy, Abhinav; Ramabhadran, Bhuvana; Arısoy, Ebru
Recurrent neural network language models have enjoyed great success in speech recognition, partially due to their ability to model longer-distance context than word n-gram models. In recurrent neural networks (RNNs), contextual information from past inputs is modeled with the help of recurrent connections at the hidden layer, while Long Short-Term Memory (LSTM) neural networks are RNNs that contain units that can store values for arbitrary amounts of time. While conventional unidirectional networks predict outputs from only past inputs, one can build bidirectional networks that also condition on future inputs. In this paper, we propose applying bidirectional RNNs and LSTM neural networks to language modeling for speech recognition. We discuss issues that arise when utilizing bidirectional models for speech, and compare unidirectional and bidirectional models on an English Broadcast News transcription task. We find that bidirectional RNNs significantly outperform unidirectional RNNs, but bidirectional LSTMs do not provide any further gain over their unidirectional counterparts.
Citation - WoS: 2
Citation - Scopus: 5
Compositional Neural Network Language Models for Agglutinative Languages
(Isca-INT Speech Communication Assoc, 2016) Saraçlar, Murat; Arısoy, Ebru
Continuous space language models (CSLMs) have been proven to be successful in speech recognition. With proper training of the word embeddings, words that are semantically or syntactically related are expected to be mapped to nearby locations in the continuous space. In agglutinative languages, words are made up of a concatenation of stems and suffixes and, as a result, compositional modeling is important. However, when trained on word tokens, CSLMs do not explicitly consider this structure. In this paper, we explore compositional modeling of stems and suffixes in a long short-term memory neural network language model. Our proposed models jointly learn distributed representations for stems and endings (concatenation of suffixes) and predict the probability for stem and ending sequences. Experiments on the Turkish broadcast news transcription task show that further gains on top of a state-of-the-art stem-ending-based n-gram language model can be obtained with the proposed models.
Citation - WoS: 3
Citation - Scopus: 5
Developing an Automatic Transcription and Retrieval System for Spoken Lectures in Turkish
(IEEE, 2017) Arısoy, Ebru
With the increase of online video lectures, using speech and language processing technologies for education has become quite important. This paper presents an automatic transcription and retrieval system developed for processing spoken lectures in Turkish. The main steps in the system are automatic transcription of Turkish video lectures using a large vocabulary continuous speech recognition (LVCSR) system and finding keywords on the lattices obtained from the LVCSR system using a speech retrieval system based on keyword search. While developing this system, first a state-of-the-art LVCSR system was developed for Turkish using advanced acoustic modeling methods, then keywords were extracted automatically from word sequences in the reference transcriptions of video lectures, and a speech retrieval system was developed for searching these keywords in the lattice output of the LVCSR system. The spoken lecture processing system yields 14.2% word error rate and 0.86 maximum term weighted value on the test data.
Citation - WoS: 1
Citation - Scopus: 1
Domain Adaptation Approaches for Acoustic Modeling
(IEEE, 2020) Arısoy, Ebru; Fakhan, Enver
In recent years, with the development of neural network based models, ASR systems have achieved a tremendous performance increase. However, this performance increase mostly depends on the amount of training data and the computational power. In a low-resource data scenario, publicly available datasets can be utilized to overcome data scarcity. Furthermore, using a pre-trained model and adapting it to the in-domain data can help with computational constraints. In this paper, we leverage two different publicly available datasets and investigate various acoustic model adaptation approaches. We show that a 4% word error rate can be achieved using very limited in-domain data.
Citation - WoS: 6
Citation - Scopus: 6
Evaluation of Diaphragm Conditions in AAC Floor Structures with RC Beams
(Springer, 2018) İlki, Alper; Uğurlu, Koray; Demir, Cem; Comert, Mustafa; Halıcı, Ömer Faruk
Diaphragm action in floor structures is an important aspect that affects both the local behavior of individual members and, consequently, the global response of a structure. The diaphragm action of a built structure therefore needs to be compatible with the diaphragm condition assumed in the design phase to prevent unpredicted overloading of load-bearing members in a seismic action. Autoclaved aerated concrete (AAC) is a cost-effective, lightweight and energy-efficient material, and its usage as a construction material has rapidly increased in recent decades. However, there is limited experience regarding the in-plane behavior of floor structures made of AAC panels in terms of diaphragm action. In this paper, the in-plane response of AAC floors is experimentally investigated and the floor performance of a typical building is analytically investigated according to ASCE 7-16 (ASCE/SEI in Minimum design loads for buildings and other structures, The American Society of Civil Engineers, Reston, 2016). Full-scale experiments, carried out by loading AAC floors in lateral directions either parallel or perpendicular to the panels, provided important information about the damage progress and overall performance of such floors. A number of finite element modeling techniques that are generally used for modeling of AAC floors were examined and then validated through comparisons with test results. Finally, the diaphragm condition of a three-story building made of AAC walls and floor panels was assessed. The results indicated that the AAC floors in the examined building can be idealized as rigid diaphragms according to ASCE 7-16.
Citation - WoS: 23
Citation - Scopus: 23
Experimental Observation of Temperature and Pressure Induced Frequency Fluctuations in Silicon MEMS Resonators
(IEEE, 2021) Zhao, Chun; Mustafazade, Arif; Pandit, Milind; Sobreviela, Guillermo; Zou, Xudong; Seshia, Ashwin A.
Silicon MEMS resonators are increasingly being adopted for applications in timing and frequency control, as well as precision sensing. It is well established that a key limitation to performance is associated with sensitivity to environmental variables such as temperature and pressure. As a result, technical approaches to address these factors, such as vacuum sealing and ovenization of the resonators in a temperature-controlled system, have been introduced. However, residual sensitivity to such effects can still serve as a significant source of frequency fluctuations and drift in precision devices. This is experimentally demonstrated in this paper for precision oven-controlled and vacuum-sealed silicon resonators. The frequency fluctuations of oscillators constructed using two separate nearly-identical co-located resonators on the same chip are analysed and differential frequency fluctuations are examined as a means of reducing the impact of common-mode effects such as temperature and pressure. For this configuration, our results show that the mismatch of temperature and pressure coefficients between the resonators ultimately limits the frequency stability.
Citation - Scopus: 2
Feasibility of Distorted Born Iterative Method for Detecting Early Stage of Heart Failure
(IEEE, 2020) Akıncı, Mehmet Nuri; Bilgin, Egemen; Joof, Sulayman; Doğu, Semih
In this paper, we analyze the feasibility of using microwaves to detect the early stage of congestive heart failure, which causes water accumulation in the lungs. To this aim, a slice from a realistic human torso phantom, which consists of all human tissues and organs, is considered. Constitutive parameters of the phantom are calculated by a multiple-order Cole-Cole model at the operating frequency. Then, the scattered field is calculated via the method of moments and 30 dB additive white Gaussian noise is added to create a more realistic scenario. In the solution of the inverse scattering phase, the distorted Born iterative method is utilized. The presented results show the feasibility of the proposed method.
Highlighting of Lecture Video Closed Captions
(IEEE, 2020) Yıldırım, Göktuğ; Öztufan, Huseyin Efe; Arısoy, Ebru
The main purpose of this study is to automatically highlight important regions of lecture video subtitles. Even though watching videos is an effective way of learning, the main disadvantage of video-based education is limited interaction between the learner and the video. With the developed system, important regions that are automatically determined in lecture subtitles will be highlighted with the aim of increasing the learner's attention to these regions. In this paper, the lecture videos are first converted into text by using an automatic speech recognition system. Then continuous space representations for sentences or word sequences in the transcriptions are generated using Bidirectional Encoder Representations from Transformers (BERT). Important regions of the subtitles are selected using a clustering method based on the similarity of these representations. The developed system is applied to the lecture videos and it is found that using word sequence representations in determining the important regions of subtitles gives higher performance than using sentence representations. This result is encouraging in terms of automatic highlighting of speech recognition outputs where sentence boundaries are not defined explicitly.
Citation - WoS: 1
Citation - Scopus: 1
İlişkisel Veri Ayrıştırılmasında Model Seçimi (Model Selection in Relational Data Factorization)
(IEEE, 2019) Kırbız, Serap; Cemgil, Taylan; Hızlı, Çağlar
As a fundamental problem in relational data analysis, model selection for relational data factorization is still an open problem. In our work, we propose to estimate model order for mixed membership blockmodels (MMSB) within the generic allocation framework of Bayesian allocation model (BAM). We describe how relational data is represented as Poisson counts of the allocation model, and demonstrate our results both on synthetic and real-world data sets. We believe that the generic allocation perspective promises a generalized model selection solution where we do not only select the model order, but also choose the most appropriate factorization.
Impact of Hardware Sources on Feature Selection for Online Signature Verification
(IEEE, 2020) Ayhan, Tuba; Orak, Remzi
This work analyzes time series features gathered from a touchpad that is part of an online signature verification system. A DTW processing unit is implemented on an FPGA to be used in time series analysis. To support different feature groups, this unit can be reconfigured without altering the memory structure. By using this reconfigurable unit, features are evaluated according to the area cost that they introduce. Moreover, a method to predict the value of features for classification is introduced. This way, minimum requirements to implement an online signature verification system on an FPGA are partially obtained.
Citation - WoS: 1
Citation - Scopus: 2
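The DTW (dynamic time warping) computation mentioned in the abstract above compares two time series that may differ in speed. The following is a minimal software sketch of the classic dynamic-programming recurrence, assuming one-dimensional sequences and an absolute-difference local cost; the function name `dtw_distance` and the sample sequences are illustrative and unrelated to the FPGA unit described in the paper.

```python
def dtw_distance(a, b):
    """Classic DTW cost between two 1-D sequences with |x - y| as the local cost."""
    inf = float("inf")
    n, m = len(a), len(b)
    # cost[i][j] holds the best alignment cost of a[:i] against b[:j].
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            cost[i][j] = d + min(cost[i - 1][j],      # step in a only
                                 cost[i][j - 1],      # step in b only
                                 cost[i - 1][j - 1])  # step in both
    return cost[n][m]

# A signature feature sampled at two different speeds (made-up values).
same_shape = dtw_distance([1, 2, 3], [1, 2, 2, 3])
shifted = dtw_distance([0, 0], [1, 1])
```

Because the alignment may stretch either sequence, the first pair has zero distance even though the lengths differ, which is the property that makes DTW attractive for comparing signatures written at different speeds.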
Improving the Usage of Subword-Based Units for Turkish Speech Recognition
(IEEE, 2020) Çetinkaya, Gözde; Saraçlar, Murat; Arısoy, Ebru
Subword units are often utilized to achieve better performance in speech recognition because of the high number of observed words in agglutinative languages. In this study, the proper use of subword units is explored in recognition by a reconsideration of details such as silence modeling and position-dependent phones. A modified lexicon by finite-state transducers is implemented to represent the subword units correctly. Also, we experiment with different types of word boundary markers and achieve the best performance by adding a marker both to the left and right side of a subword unit. In our experiments on a Turkish broadcast news dataset, the subword models outperform word-based models and naive subword implementations. Results show that using proper subword units leads to a relative word error rate (WER) reduction of 2.4% compared with the word-level automatic speech recognition (ASR) system for Turkish.
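Several abstracts in this listing report word error rate (WER) figures and relative WER reductions. As a reminder of how these quantities are usually computed, here is a minimal sketch: WER is the word-level edit distance divided by the reference length, and the relative reduction is the WER difference divided by the baseline WER. The function names and the example numbers are hypothetical and are not results from any of the listed papers.

```python
def wer(reference, hypothesis):
    """Word error rate: (substitutions + deletions + insertions) / reference length."""
    n, m = len(reference), len(hypothesis)
    # dist[i][j] is the edit distance between reference[:i] and hypothesis[:j].
    dist = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(n + 1):
        dist[i][0] = i
    for j in range(m + 1):
        dist[0][j] = j
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = 0 if reference[i - 1] == hypothesis[j - 1] else 1
            dist[i][j] = min(dist[i - 1][j] + 1,        # deletion
                             dist[i][j - 1] + 1,        # insertion
                             dist[i - 1][j - 1] + sub)  # match or substitution
    return dist[n][m] / n

def relative_reduction(baseline_wer, new_wer):
    """Relative WER reduction of a new system with respect to a baseline."""
    return (baseline_wer - new_wer) / baseline_wer

example_wer = wer("a b c d".split(), "a x c".split())   # one substitution, one deletion
example_gain = relative_reduction(20.0, 19.52)          # made-up WER values in percent
```

Note that a relative reduction is measured against the baseline WER, so a 2.4% relative reduction corresponds to a much smaller absolute change in WER.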
