Arısoy Saraçlar, Ebru

Name Variants: Arısoy, Ebru
Email Address: saraclare@mef.edu.tr
Main Affiliation: 02.05. Department of Electrical and Electronics Engineering
Status: Current Staff

Sustainable Development Goals

SDG 4 (Quality Education): 1 research product

Research Products
Documents: 42
Citations: 1376
h-index: 14
Documents: 28
Citations: 623
Scholarly Output: 19
Articles: 0
Views / Downloads: 3734 / 2056
Supervised MSc Theses: 3
Supervised PhD Theses: 0
WoS Citation Count: 79
Scopus Citation Count: 46
WoS h-index: 3
Scopus h-index: 5
Patents: 0
Projects: 3
WoS Citations per Publication: 4.16
Scopus Citations per Publication: 2.42
Open Access Sources: 4
Supervised Theses: 3

Journal Counts

2020 28th Signal Processing and Communications Applications Conference (SIU): 3
Turkish Natural Language Processing: 2
LREC-COLING 2024, Joint International Conference on Computational Linguistics, Language Resources and Evaluation (20-25 May 2024, hybrid, Torino): 1
INTERSPEECH 2015, 16th Annual Conference of the International Speech Communication Association (Dresden, Germany, 6-10 September 2015): 1
INTERSPEECH 2016, 17th Annual Conference of the International Speech Communication Association (San Francisco, CA, 8-12 September 2016): 1


Scholarly Output Search Results

Now showing 1 - 10 of 19
  • Conference Object
    Citation - WoS: 2
    Citation - Scopus: 5
    Compositional Neural Network Language Models for Agglutinative Languages
    (2016) Saraçlar, Murat; Arısoy, Ebru
    Continuous space language models (CSLMs) have been proven to be successful in speech recognition. With proper training of the word embeddings, words that are semantically or syntactically related are expected to be mapped to nearby locations in the continuous space. In agglutinative languages, words are made up of concatenation of stems and suffixes and, as a result, compositional modeling is important. However, when trained on word tokens, CSLMs do not explicitly consider this structure. In this paper, we explore compositional modeling of stems and suffixes in a long short-term memory neural network language model. Our proposed models jointly learn distributed representations for stems and endings (concatenation of suffixes) and predict the probability for stem and ending sequences. Experiments on the Turkish broadcast news transcription task show that further gains on top of a state-of-the-art stem-ending-based n-gram language model can be obtained with the proposed models.
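    A minimal sketch (PyTorch) of the compositional idea in the abstract above: stems and endings get separate embedding tables, and the network jointly predicts the next stem and the next ending with two output heads. All layer sizes and the two-head factorization are illustrative assumptions, not the authors' exact architecture.

        import torch
        import torch.nn as nn

        class StemEndingLSTMLM(nn.Module):
            def __init__(self, n_stems, n_endings, emb_dim=128, hidden_dim=256):
                super().__init__()
                self.stem_emb = nn.Embedding(n_stems, emb_dim)
                self.ending_emb = nn.Embedding(n_endings, emb_dim)
                # Input at each step: concatenated stem + ending embeddings.
                self.lstm = nn.LSTM(2 * emb_dim, hidden_dim, batch_first=True)
                # Two heads: a distribution over stems and one over endings.
                self.stem_head = nn.Linear(hidden_dim, n_stems)
                self.ending_head = nn.Linear(hidden_dim, n_endings)

            def forward(self, stems, endings):
                # stems, endings: (batch, seq_len) index tensors
                x = torch.cat([self.stem_emb(stems), self.ending_emb(endings)], dim=-1)
                h, _ = self.lstm(x)
                return self.stem_head(h), self.ending_head(h)

        model = StemEndingLSTMLM(n_stems=10000, n_endings=2000)
        stems = torch.randint(0, 10000, (4, 12))   # toy batch
        endings = torch.randint(0, 2000, (4, 12))
        stem_logits, ending_logits = model(stems, endings)
        print(stem_logits.shape, ending_logits.shape)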
  • Book Part
    Language Modeling for Turkish Text and Speech Processing
    (Springer, 2018) Arısoy, Ebru; Saraçlar, Murat
    This chapter presents an overview of language modeling followed by a discussion of the challenges in Turkish language modeling. Sub-lexical units are commonly used to reduce the high out-of-vocabulary (OOV) rates of morphologically rich languages. These units are either obtained by morphological analysis or by unsupervised statistical techniques. For Turkish, the morphological analysis yields word segmentations both at the lexical and surface forms which can be used as sub-lexical language modeling units. Discriminative language models, which outperform generative models for various tasks, allow for easy integration of morphological and syntactic features into language modeling. The chapter provides a review of both generative and discriminative approaches for Turkish language modeling.
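    As an illustration of the lexical- and surface-form segmentations mentioned above, a short sketch for the Turkish word "evlerimizden" ("from our houses"); the morpheme boundaries are a standard textbook analysis, used here as an assumed example rather than output from the tools the chapter reviews.

        # Surface morphs concatenate back to the word; lexical morphemes use
        # meta-phonemes (A, H, D) that vowel harmony later resolves.
        word = "evlerimizden"
        surface_units = ["ev", "+ler", "+imiz", "+den"]
        lexical_units = ["ev", "+lAr", "+HmHz", "+DAn"]
        assert "".join(u.lstrip("+") for u in surface_units) == word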
  • Conference Object
    Citation - Scopus: 2
    Dealing With Data Scarcity in Spoken Question Answering
    (European Language Resources Association (ELRA), 2024) Arısoy, Ebru; Özgür, Arzucan; Ünlü Menevşe, Merve; Manav, Yusufcan
    This paper focuses on dealing with data scarcity in spoken question answering (QA) using automatic question-answer generation and a carefully selected fine-tuning strategy that leverages limited annotated data (paragraphs and question-answer pairs). Spoken QA is a challenging task due to its reliance on spoken documents, i.e., erroneous automatic speech recognition (ASR) transcriptions, and the scarcity of spoken QA data. We propose a framework for utilizing limited annotated data effectively to improve spoken QA performance. To deal with data scarcity, we train a question-answer generation model with annotated data and then produce large amounts of question-answer pairs from unannotated data (paragraphs). Our experiments demonstrate that incorporating limited annotated data and the automatically generated data through a carefully selected fine-tuning strategy leads to a 5.5% relative F1 gain over the model trained only with annotated data. Moreover, the proposed framework remains effective under high ASR error rates. © 2024 ELRA Language Resource Association: CC BY-NC 4.0.
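    A minimal sketch of the two-stage recipe described above: a fine-tuned text-to-text model generates synthetic question-answer pairs from unannotated paragraphs, and the QA model is then fine-tuned on the synthetic data before the small annotated set. The checkpoint (google/flan-t5-small as a stand-in for a QG-fine-tuned model), the prompt, and the "question <sep> answer" output convention are assumptions for illustration.

        from transformers import pipeline

        # Stand-in checkpoint; the paper fine-tunes its own QG model.
        qg = pipeline("text2text-generation", model="google/flan-t5-small")

        def generate_qa_pairs(paragraphs):
            """Produce synthetic (question, answer, context) triples."""
            pairs = []
            for context in paragraphs:
                out = qg(f"generate question and answer: {context}", max_length=64)
                # Assumed output convention: "question <sep> answer"
                question, _, answer = out[0]["generated_text"].partition(" <sep> ")
                pairs.append({"question": question, "answer": answer, "context": context})
            return pairs

        synthetic = generate_qa_pairs(["Ankara is the capital of Turkey."])
        print(synthetic[0])

    Fine-tuning order is part of the paper's "carefully selected" strategy; one plausible schedule trains on the synthetic pairs first and the annotated pairs last, so the final gradient steps come from human-quality labels.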
  • Conference Object
    Citation - WoS: 1
    Citation - Scopus: 2
    Improving the Usage of Subword-Based Units for Turkish Speech Recognition
    (IEEE, 2020) Çetinkaya, Gözde; Saraçlar, Murat; Arısoy, Ebru
    Subword units are often utilized to achieve better performance in speech recognition of agglutinative languages because of the high number of observed word forms. In this study, the proper use of subword units in recognition is explored by reconsidering details such as silence modeling and position-dependent phones. A modified lexicon is implemented with finite-state transducers to represent the subword units correctly. We also experiment with different types of word boundary markers and achieve the best performance by adding a marker to both the left and right side of a subword unit. In our experiments on a Turkish broadcast news dataset, the subword models outperform word-based models and naive subword implementations. Results show that using proper subword units leads to a 2.4% relative word error rate (WER) reduction compared with the word-level automatic speech recognition (ASR) system for Turkish.
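    A toy sketch of the boundary-marker scheme the abstract reports working best, with a marker added on both sides of word-internal subword units; the segmentation and the "+" symbol are assumptions for illustration.

        def mark_both_sides(segmented_word):
            """['ev', 'ler', 'imiz'] -> ['ev+', '+ler+', '+imiz']"""
            units = []
            for i, unit in enumerate(segmented_word):
                left = "+" if i > 0 else ""
                right = "+" if i < len(segmented_word) - 1 else ""
                units.append(f"{left}{unit}{right}")
            return units

        def join_words(units):
            """Recover words: glue units wherever matching markers meet."""
            return " ".join(units).replace("+ +", "").split()

        marked = mark_both_sides(["ev", "ler", "imiz"])
        print(marked)              # ['ev+', '+ler+', '+imiz']
        print(join_words(marked))  # ['evlerimiz']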
  • Conference Object
    Citation - WoS: 1
    Citation - Scopus: 1
    Domain Adaptation Approaches for Acoustic Modeling
    (IEEE, 2020) Arısoy, Ebru; Fakhan, Enver
    In recent years, with the development of neural network based models, ASR systems have achieved a tremendous performance increase. However, this performance increase mostly depends on the amount of training data and the available computational power. In a low-resource data scenario, publicly available datasets can be utilized to overcome data scarcity. Furthermore, using a pre-trained model and adapting it to the in-domain data can help with computational constraints. In this paper we leverage two different publicly available datasets and investigate various acoustic model adaptation approaches. We show that a 4% word error rate can be achieved using very limited in-domain data.
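    A generic sketch of the adaptation idea above: start from a pre-trained acoustic model, freeze the lower layers, and continue training only the upper layers on limited in-domain data. The toy network stands in for a real checkpoint; the paper compares several adaptation approaches rather than prescribing this one.

        import torch
        import torch.nn as nn

        # Toy stand-in for a pre-trained acoustic model; in practice a real
        # checkpoint would be loaded here.
        model = nn.Sequential(
            nn.Linear(40, 256), nn.ReLU(),    # "lower" feature layers
            nn.Linear(256, 256), nn.ReLU(),   # "upper" layers to adapt
            nn.Linear(256, 100),              # phone-state posteriors
        )
        for p in model[0].parameters():       # freeze the lowest layer
            p.requires_grad = False

        optimizer = torch.optim.Adam(
            (p for p in model.parameters() if p.requires_grad), lr=1e-5)
        criterion = nn.CrossEntropyLoss()

        feats = torch.randn(32, 40)               # toy in-domain batch
        labels = torch.randint(0, 100, (32,))
        for _ in range(3):                        # a few adaptation steps
            loss = criterion(model(feats), labels)
            optimizer.zero_grad(); loss.backward(); optimizer.step()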
  • Conference Object
    Citation - WoS: 1
    Citation - Scopus: 5
    Uncertainty-Aware Representations for Spoken Question Answering
    (Institute of Electrical and Electronics Engineers Inc., 2021) Ünlü, Merve; Arısoy, Ebru
    This paper describes a spoken question answering system that utilizes the uncertainty in automatic speech recognition (ASR) to mitigate the effect of ASR errors on question answering. Spoken question answering is typically performed by transcribing spoken content with an ASR system and then applying text-based question answering methods to the ASR transcriptions. Question answering on spoken documents is more challenging than question answering on text documents since ASR transcriptions can be erroneous and this degrades the system performance. In this paper, we propose integrating confusion networks with word confidence scores into an end-to-end neural network-based question answering system that works on ASR transcriptions. Integration is performed by generating uncertainty-aware embedding representations from confusion networks. The proposed approach improves F1 score in a question answering task developed for spoken lectures by providing tighter integration of ASR and question answering.
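    A minimal sketch of the uncertainty-aware representation described above: each confusion-network slot holds competing word hypotheses with confidence scores, and the slot's vector is the confidence-weighted sum of the word embeddings. The vocabulary, scores, and dimensions are toy assumptions.

        import torch
        import torch.nn as nn

        vocab = {"<eps>": 0, "speech": 1, "speed": 2, "recognition": 3}
        emb = nn.Embedding(len(vocab), 8)

        # One confusion network: a list of slots, each a list of
        # (word, confidence) hypotheses whose confidences sum to ~1.
        confusion_net = [
            [("speech", 0.7), ("speed", 0.3)],
            [("recognition", 0.9), ("<eps>", 0.1)],
        ]

        def slot_embedding(slot):
            ids = torch.tensor([vocab[w] for w, _ in slot])
            conf = torch.tensor([[c] for _, c in slot])
            return (conf * emb(ids)).sum(dim=0)  # weighted mixture

        seq = torch.stack([slot_embedding(s) for s in confusion_net])
        print(seq.shape)  # (2, 8): one uncertainty-aware vector per slot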
  • Book Part
    Turkish Speech Recognition
    (2018) Arısoy, Ebru; Saraçlar, Murat
    Automatic speech recognition (ASR) is one of the most important applications of speech and language processing, as it forms the bridge between spoken and written language processing. This chapter presents an overview of the foundations of ASR, followed by a summary of Turkish language resources for ASR and a review of various Turkish ASR systems. Language resources include acoustic and text corpora as well as linguistic tools such as morphological parsers, morphological disambiguators, and dependency parsers, discussed in more detail in other chapters. Turkish ASR systems vary in the type and amount of data used for building the models. The focus of most of the research for Turkish ASR is the language modeling component covered in Chap. 4.
  • Conference Object
    Citation - WoS: 4
    Citation - Scopus: 4
    Multi-Stream Long Short-Term Memory Neural Network Language Model
    (2015) Saraçlar, Murat; Arısoy, Ebru
    Long Short-Term Memory (LSTM) neural networks are recurrent neural networks that contain memory units that can store contextual information from past inputs for arbitrary amounts of time. A typical LSTM neural network language model is trained by feeding an input sequence, i.e., a stream of words, to the input layer of the network, and the output layer predicts the probability of the next word given the past inputs in the sequence. In this paper we introduce a multi-stream LSTM neural network language model where multiple asynchronous input sequences are fed to the network as parallel streams while predicting the output word sequence. For our experiments, we use a sub-word sequence in addition to a word sequence as the input streams, which allows joint training of the LSTM neural network language model using both information sources.
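    A minimal sketch (PyTorch) of the multi-stream setup above: a word stream and a parallel sub-word stream are embedded separately, concatenated, and fed to a single LSTM that predicts the next word. Aligning the streams one-to-one per position is a simplifying assumption; the paper handles asynchronous streams.

        import torch
        import torch.nn as nn

        class MultiStreamLSTMLM(nn.Module):
            def __init__(self, n_words, n_subwords, emb=128, hidden=256):
                super().__init__()
                self.word_emb = nn.Embedding(n_words, emb)
                self.sub_emb = nn.Embedding(n_subwords, emb)
                self.lstm = nn.LSTM(2 * emb, hidden, batch_first=True)
                self.out = nn.Linear(hidden, n_words)  # predict next word

            def forward(self, words, subwords):
                x = torch.cat([self.word_emb(words), self.sub_emb(subwords)], dim=-1)
                h, _ = self.lstm(x)
                return self.out(h)

        lm = MultiStreamLSTMLM(n_words=5000, n_subwords=800)
        logits = lm(torch.randint(0, 5000, (2, 10)),
                    torch.randint(0, 800, (2, 10)))
        print(logits.shape)  # (2, 10, 5000)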
  • Conference Object
    Citation - Scopus: 5
    A Framework for Automatic Generation of Spoken Question-Answering Data
    (Association for Computational Linguistics (ACL), 2022) Manav, Y.; Menevşe, M.Ü.; Özgür, A.; Arısoy, Ebru
    This paper describes a framework to automatically generate a spoken question answering (QA) dataset. The framework consists of a question generation (QG) module to generate questions automatically from given text documents, a text-to-speech (TTS) module to convert the text documents into spoken form and an automatic speech recognition (ASR) module to transcribe the spoken content. The final dataset contains question-answer pairs for both the reference text and ASR transcriptions as well as the audio files corresponding to each reference text. For QG and ASR systems we used pre-trained multilingual encoder-decoder transformer models and fine-tuned these models using a limited amount of manually generated QA data and TTS-based speech data, respectively. As a proof of concept, we investigated the proposed framework for Turkish and generated the Turkish Question Answering (TurQuAse) dataset using Wikipedia articles. Manual evaluation of the automatically generated question-answer pairs and QA performance evaluation with state-of-the-art models on TurQuAse show that the proposed framework is efficient for automatically generating spoken QA datasets. To the best of our knowledge, TurQuAse is the first publicly available spoken question answering dataset for Turkish. The proposed framework can be easily extended to other languages where a limited amount of QA data is available. © 2022 Association for Computational Linguistics.
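    A high-level sketch of the three-module pipeline described above, with toy stand-ins for the QG, TTS, and ASR components so the data flow is visible end to end; the real framework plugs in fine-tuned encoder-decoder models at each stage.

        def build_spoken_qa_dataset(documents, qg, tts, asr):
            """Assemble records pairing reference text, audio, ASR output,
            and automatically generated question-answer pairs."""
            dataset = []
            for text in documents:
                qa_pairs = qg(text)      # questions/answers from reference text
                audio = tts(text)        # spoken form of the document
                transcript = asr(audio)  # (erroneous) ASR transcription
                dataset.append({
                    "reference_text": text,
                    "audio": audio,
                    "asr_transcript": transcript,
                    "qa_pairs": qa_pairs,
                })
            return dataset

        # Toy stand-ins so the sketch runs end to end:
        toy = build_spoken_qa_dataset(
            ["Ankara is the capital of Turkey."],
            qg=lambda t: [("What is the capital of Turkey?", "Ankara")],
            tts=lambda t: b"<wav bytes>",
            asr=lambda a: "ankara is the capital of turkey",
        )
        print(toy[0]["qa_pairs"])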
  • Master Thesis
    Clustering of News in Publications
    (MEF Üniversitesi, Fen Bilimleri Enstitüsü, 2018) Sülün, Erhan; Arısoy Saraçlar, Ebru
    In today's world, large volumes of text are produced and stored continuously with the help of computer systems and the Internet, and this text data is accessible to everyone. Given its size, however, it is very hard for people to analyze it all and discover the meaningful information it contains. Machine learning techniques and computing power come into play at this point, analyzing the data and extracting meaningful information so that people can access it in summarized form. The first step in analyzing text data is to represent it in a numerical format, since machine learning techniques can only use numerical inputs. There are several methods for data representation, such as TF-IDF (Term Frequency - Inverse Document Frequency), Bag of Words, Word2Vec and Doc2Vec. The second step is to apply machine learning algorithms to this numerical representation. Supervised or unsupervised techniques are chosen according to the structure of the problem and the data. In this study, news documents published in United States outlets such as the New York Times, Reuters and the Washington Post are clustered into topics in order to categorize them and ease their investigation. Three data representation methods are examined in detail and used: Bag of Words, TF-IDF and Doc2Vec. Finally, since the news data is an unlabeled set of documents, the K-Means clustering algorithm, an unsupervised learning technique, is used with both Euclidean distance and cosine similarity metrics. Categorization is performed multiple times with different category counts (different K values), and the most meaningful category count is determined by examining the clustering results.
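    A minimal sketch of the thesis's TF-IDF + K-Means recipe in scikit-learn; the toy documents and K values are illustrative, and L2-normalizing the vectors makes Euclidean K-Means behave like clustering by cosine similarity, covering the thesis's second metric.

        from sklearn.cluster import KMeans
        from sklearn.feature_extraction.text import TfidfVectorizer
        from sklearn.preprocessing import normalize

        docs = [
            "Stocks rallied as markets opened higher today.",
            "The central bank left interest rates unchanged.",
            "The home team won the championship game last night.",
            "The striker scored twice in the final match.",
        ]

        X = TfidfVectorizer(stop_words="english").fit_transform(docs)
        X_cos = normalize(X)  # unit length -> Euclidean ~ cosine

        for k in (2, 3):      # try several K values, as in the thesis
            labels = KMeans(n_clusters=k, n_init=10,
                            random_state=0).fit_predict(X_cos)
            print(k, labels)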
    In today’s world, high volume of text is produced and stored continuously by the help of computer systems and Internet. And again by the help of Internet, those huge amount of text data is accessible to everyone. But when considering the size of the produced text, it is really hard for people to analyze the huge amounts of text data and discover the meaningful information in that data. Machine learning techniques and computer power emerges at this point, in order to analyze data and discover meaningful information to help people to access the summarized information. First step to analyze text data is to represent data in a numerical format, as machine learning techniques can only use numerical inputs. There are several methods for data representation; such as TF-IDF (Term Frequency - Inverse Document Frequency), Bag of Words, Word2Vec and Doc2Vec. Second step to analyze text data is to use machine learning algorithms by using the numerical representation of text data as input. There are supervised and unsupervised machine learning techniques to be decided to be used according to the structure of the problem and the data. In this study, news documents published in some publications in United States, such as New York Times, Reuters and Washington Post will be clustered into topics in order to categorize them and ease the investigation of them. Three types of data representation methods will be examined in detail and will be used, which are Bag of Words, TF-IDF and Doc2Vec representations. And finally, as the news data is an unlabeled set of documents, K-Means clustering algorithm will be used which is an unsupervised learning technique, by using both Euclidean Distance and Cosine Similarity metrics. Categorization will be performed multiple times with different category counts, meaning with different K values, and most meaningful category count will be determined after examining the clustering results.