Demir, Şeniz
Name Variants
Demi̇r, Şeni̇z
Demir, Seniz
Demir, S
Email Address
demirse@mef.edu.tr
Main Affiliation
02.02. Department of Computer Engineering
Status
Current Staff
Sustainable Development Goals
SDG data is not available

Documents
34
Citations
508
h-index
13

Documents
0
Citations
0

Scholarly Output
14
Articles
7
Views / Downloads
8/0
Supervised MSc Theses
4
Supervised PhD Theses
0
WoS Citation Count
30
Scopus Citation Count
50
WoS h-index
3
Scopus h-index
3
Patents
0
Projects
3
WoS Citations per Publication
2.14
Scopus Citations per Publication
3.57
Open Access Source
8
Supervised Theses
4
| Journal | Count |
|---|---|
| 27th Signal Processing and Communications Applications Conference, SIU 2019 | 1 |
| 33rd IEEE Conference on Signal Processing and Communications Applications, SIU 2025 (Istanbul, Isik University Sile Campus) | 1 |
| ACM Transactions on Asian and Low-Resource Language Information Processing | 1 |
| Computer Speech & Language | 1 |
| Engineering Science and Technology, an International Journal | 1 |

14 results
Scholarly Output Search Results
Now showing 1 - 10 of 14
Article
Graph-Based Turkish Text Normalization and Its Impact on Noisy Text Processing (Elsevier, 2022)
Topçu, Berkay; Demir, Şeniz
Citation - WoS: 7, Citation - Scopus: 12
User-generated texts on the web are freely available and lucrative sources of data for language technology researchers. Unfortunately, these texts are often dominated by informal writing styles, and the language used in user-generated content poses processing difficulties for natural language tools. The resulting performance drops and processing issues can be addressed either by adapting language tools to user-generated content or by normalizing noisy texts before they are processed. In this article, we propose a Turkish text normalizer that maps non-standard words to their appropriate standard forms using a graph-based methodology and a context-tailoring approach. Our normalizer benefits from both contextual and lexical similarities between normalization pairs as identified by a graph-based subnormalizer and a transformation-based subnormalizer. The performance of our normalizer is demonstrated on a tweet dataset in the most comprehensive intrinsic and extrinsic evaluations reported so far for Turkish. We present the first graph-based solution to Turkish text normalization with a novel context-tailoring approach, which advances the state-of-the-art results by outperforming other publicly available normalizers. For the first time in the literature, we measure the extent to which the accuracy of a Turkish language processing tool is affected by normalizing noisy texts before they are processed. An analysis of these extrinsic evaluations, which cover more than one Turkish NLP task (i.e., part-of-speech tagging and dependency parsing), reveals that Turkish language tools are not robust to noisy texts and that a normalizer leads to remarkable performance improvements once used as a preprocessing tool in this morphologically rich language.

Conference Object
An XML Parser for Turkish Wikipedia (IEEE, 2019)
Demir, Şeniz; Vardar, Uluç Furkan; Devran, İlkay Tevfik
Citation - Scopus: 3
Visual and written data that can be easily accessed over the internet has enabled the development of research in many different fields. However, the availability of data is not sufficient by itself. It is of great importance that these data can be effectively utilized and interpreted in accordance with the requirements. Written content in the Wikipedia encyclopedia, which is becoming increasingly common in Turkish natural language processing, can be accessed via XML dumps. In this study, our aim is to develop an XML parser for processing Turkish Wikipedia dumps and to demonstrate its applicability. The use of the open-source parser, which allows information extraction at different levels of granularity, is reported on pages containing biography infoboxes and textual content.

Conference Object
Does Prompt Engineering Help Turkish Named Entity Recognition? (Institute of Electrical and Electronics Engineers Inc., 2024)
Pektezol, A.S.; Ulugergerli, A.B.; Öztoklu, V.; Demir, Şeniz
The extraction of entity mentions in a text (named entity recognition) has traditionally been formulated as a sequence labeling problem. In recent years, this approach has evolved from recognizing entities to answering formulated questions related to entity types. The questions, constructed as prompts, are used to elicit desired entity mentions and their types from large language models. In this work, we investigated prompt engineering in Turkish named entity recognition and studied two prompting strategies to guide pretrained language models toward correctly identifying mentions. In particular, we examined the impact of zero-shot and few-shot prompting on the recognition of Turkish named entities by conducting experiments on two large language models. Our evaluations using different prompt templates revealed promising results and demonstrated that carefully constructed prompts can achieve high accuracy on entity recognition, even in languages with complex morphology. © 2024 IEEE.

Article
Neural Coreference Resolution for Turkish (2023)
Demir, Şeniz
Coreference resolution deals with resolving mentions of the same underlying entity in a given text. This challenging task is an indispensable aspect of text understanding and has important applications in various language processing systems such as question answering and machine translation. Although a significant number of studies are devoted to coreference resolution, the research on Turkish is scarce and mostly limited to pronoun resolution. To the best of our knowledge, this article presents the first neural Turkish coreference resolution study, in which two learning-based models are explored. Both models follow the mention-ranking approach while forming clusters of mentions. The first model uses a set of hand-crafted features, whereas the second model relies on embeddings learned from large-scale pretrained language models to capture similarities between a mention and its candidate antecedents. Several language models trained specifically for Turkish are used to obtain mention representations, and their effectiveness is compared experimentally using automatic metrics. We argue that the results of this study shed light on the possible contributions of neural architectures to Turkish coreference resolution.

Master Thesis
The use of pretrained language models in sentiment analysis (MEF Üniversitesi, 2022)
Yürütücü, Ömer Yiğit; Demir, Şeniz
Natural language processing is a subfield of linguistics and artificial intelligence. Sentiment analysis classifies a text on any topic according to its subjective content. It is one of the methods generally used to examine, analyze, and interpret data such as individuals' thoughts, feelings, or attitudes about a topic on various platforms. The increase in social media posts has also increased the number of sentiment analysis studies conducted on these platforms. Different methods are used in sentiment analysis. Classification is performed through sentiment detection with machine learning and natural language processing algorithms. In recent years, pretrained language models have begun to be used either together with machine learning methods or on their own. The aim of this thesis is to test the hypothesized advantages of sentiment analysis on social media comments with pretrained language models. For this purpose, sentiment analysis was performed on Covid-19-related tweets on Twitter. Sentiment intensities were determined using pretrained language models and the results were compared. BERT, RoBERTa, and BERTweet were used in the analyses. The results show that NLP techniques are as successful as other techniques for sentiment analysis.

Article
Ön Eğitimli Dil Modelleriyle Duygu Analizi [Sentiment Analysis with Pretrained Language Models] (İstanbul Sabahattin Zaim Üniversitesi Fen Bilimleri Enstitüsü, 2023)
Yürütücü, Ömer Yiğit; Demir, Şeniz
Sentiment analysis is one of the methods used to examine, analyze, and interpret thoughts, feelings, or attitudes about a topic on various platforms. Machine learning and deep learning models are frequently used in sentiment analysis, where texts on different topics can be classified according to their subjective content. In this study, sentiment analysis was performed on Covid-19 tweet texts using pretrained language models. In addition to a Naive Bayes classifier, different classifiers were trained using the BERT, RoBERTa, and BERTweet language models, and the results obtained on the tweet dataset were compared. The work reported in this paper is expected to lay the groundwork for future research in this area.

Conference Object
Dil Modelleri ile Akademik Özet Üretimi [Academic Abstract Generation with Language Models] (Institute of Electrical and Electronics Engineers Inc., 2025)
Bektas, Busra; Gultekin, Ali Ozgun; Ozdemiroglu, Emre; Yilmaz, Zeynep; Dikici, Buse; Demir, Seniz
In recent years, large language models have demonstrated extraordinary capabilities in natural language processing tasks. The integration of these models into text summarization has highlighted the need to evaluate varying model performances under a standardized benchmarking framework. In this study, the performance of different large language models in generating abstracts of scientific papers, which have a common structure and a distinctive language, is compared through an extensive experimental analysis. The abstracts automatically generated by these models using prompt engineering were evaluated with various metrics based on content overlap and semantic similarity. The results demonstrated the effectiveness of large language models in abstract generation.

Article
Mention Detection in Turkish Coreference Resolution (Tubitak Scientific & Technological Research Council Turkey, 2024)
Demir, Seniz; Akdag, Hanifi Ibrahim
A crucial step in understanding natural language is detecting mentions that refer to real-world entities in a text and correctly identifying their boundaries. Mention detection is commonly considered a preprocessing step in coreference resolution, which has been shown to be helpful in several language processing applications such as machine translation and text summarization. Despite recent efforts on Turkish coreference resolution, no standalone neural solution to mention detection has been proposed yet. In this article, we present two models designed for detecting Turkish mentions by using feed-forward neural networks. Both models extract all spans up to a fixed length from the input text as candidates and classify them as mentions or non-mentions. The models differ in how candidate text spans are represented. The first model represents a span by focusing on its first and last words, whereas in the second model the representation also covers the words preceding and following the span. Mention span representations are formed by using contextual embeddings, part-of-speech embeddings, and named-entity embeddings of the words of interest, where contextual embeddings are obtained from pretrained Turkish language models. In our evaluation studies, we not only assess the impact of mention representation strategies on system performance but also demonstrate the usability of different pretrained language models in the resolution task. We argue that our work provides useful insights to the existing literature and is a first step in understanding the effectiveness of neural architectures in Turkish mention detection.

Article
Turkish Data-to-Text Generation Using Sequence-to-Sequence Neural Networks (Assoc Computing Machinery, 2023)
Demir, Şeniz
Citation - WoS: 1, Citation - Scopus: 4
End-to-end data-driven approaches have led to the rapid development of language generation and dialogue systems. Despite the need for large amounts of well-organized data, these approaches jointly learn multiple components of the traditional generation pipeline without requiring costly human intervention. End-to-end approaches also enable the use of loosely aligned parallel datasets in system development by relaxing the degree of semantic correspondence between training data representations and text spans. However, their potential in Turkish language generation has not yet been fully exploited. In this work, we apply sequence-to-sequence (Seq2Seq) neural models to Turkish data-to-text generation, where the input data given in the form of a meaning representation is verbalized. We explore encoder-decoder architectures with an attention mechanism in unidirectional, bidirectional, and stacked recurrent neural network (RNN) models. Our models generate one-sentence biographies and dining venue descriptions using a crowdsourced dataset where all field-value pairs that appear in meaning representations are fully captured in reference sentences. To support this work, we also explore the performance of our models on a more challenging dataset, where the content of a meaning representation is too large to fit into a single sentence, and hence content selection and surface realization need to be learned jointly. This dataset is retrieved by coupling introductory sentences of person-related Turkish Wikipedia articles with their infobox tables. Our empirical experiments on both datasets demonstrate that Seq2Seq models are capable of generating coherent and fluent biographies and venue descriptions from field-value pairs. We argue that the wealth of knowledge residing in our datasets and the insights obtained from this study hold the potential to give rise to new end-to-end generation approaches for Turkish and other morphologically rich languages.

Master Thesis
Interviewster: A chatbot evaluating competency based interviews using transformer models (MEF Üniversitesi, 2022)
Atıcı, Onur Emre; Demir, Şeniz
Recruitment is one of the most verbal and communicative areas of human resources. Because the human factor is dominant in this department, it has many aspects that are as open to bias as they are to innovation. Together with the progress in artificial intelligence technologies, this brings with it many needs (and opportunities) for innovation. This study focuses on building a chatbot named "Interviewster" that greets candidates, collects information (such as name and surname, employment status, computer skills, education, and hobbies), presents competency-based questions about their past experiences, and helps them answer these questions correctly. The chatbot greets the candidate and starts the conversation, records the data collected from the candidate, conducts a competency-based interview, and decides whether the candidate has the required competency by means of natural language processing techniques that use neural network architectures and transformer-based technologies. The chatbot, which runs on the web, is coded in Python and published on the web with Flask, and runs on a Python core that uses a MySQL database. This thesis first introduces interview practices and describes the methods and practices of competency-based interviews. Then, the architecture of the chatbot named Interviewster and the technologies, libraries, and machine learning techniques used are explained in detail. Finally, the results of an evaluation study in which the transformer-based models BERT, DeBERTa, and ELECTRA were applied to the competency-based interview results of real candidates are discussed in detail.

