Please use this identifier to cite or link to this item: https://hdl.handle.net/20.500.11779/1985
Full metadata record
DC FieldValueLanguage
dc.contributor.authorDemir, Şeniz-
dc.date.accessioned2023-10-18T12:06:13Z-
dc.date.available2023-10-18T12:06:13Z-
dc.date.issued2023-
dc.identifier.citationDemir, S. (2023). Turkish Data-to-Text Generation Using Sequence-to-Sequence Neural Networks. ACM Transactions on Asian and Low-Resource Language Information Processing, 22(2), 1-27.en_US
dc.identifier.issn2375-4699-
dc.identifier.issn2375-4702-
dc.identifier.urihttps://hdl.handle.net/20.500.11779/1985-
dc.identifier.urihttps://doi.org/10.1145/3543826-
dc.descriptionTUBITAK-ARDEB [117E977]en_US
dc.descriptionThis work is supported by TUBITAK-ARDEB under the grant number 117E977.en_US
dc.description.abstractEnd-to-end data-driven approaches lead to rapid development of language generation and dialogue systems. Despite the need for large amounts of well-organized data, these approaches jointly learn multiple components of the traditional generation pipeline without requiring costly human intervention. End-to-end approaches also enable the use of loosely aligned parallel datasets in system development by relaxing the degree of semantic correspondences between training data representations and text spans. However, their potential in Turkish language generation has not yet been fully exploited. In this work, we apply sequence-to-sequence (Seq2Seq) neural models to Turkish data-to-text generation where the input data given in the form of a meaning representation is verbalized. We explore encoder-decoder architectures with attention mechanism in unidirectional, bidirectional, and stacked recurrent neural network (RNN) models. Our models generate one-sentence biographies and dining venue descriptions using a crowdsourced dataset where all field value pairs that appear in meaning representations are fully captured in reference sentences. To support this work, we also explore the performances of our models on a more challenging dataset, where the content of a meaning representation is too large to fit into a single sentence, and hence content selection and surface realization need to be learned jointly. This dataset is retrieved by coupling introductory sentences of person-related Turkish Wikipedia articles with their contained infobox tables. Our empirical experiments on both datasets demonstrate that Seq2Seq models are capable of generating coherent and fluent biographies and venue descriptions from field value pairs.
We argue that the wealth of knowledge residing in our datasets and the insights obtained from this study hold the potential to give rise to the development of new end-to-end generation approaches for Turkish and other morphologically rich languages.en_US
dc.language.isoenen_US
dc.publisherAssoc Computing Machineryen_US
dc.rightsinfo:eu-repo/semantics/closedAccessen_US
dc.subjectState-of-the-arten_US
dc.subjectSequence-to-sequence modelen_US
dc.subjectTurkishen_US
dc.subjectWikipediaen_US
dc.subjectNatural-language generationen_US
dc.subjectData-to-text generationen_US
dc.titleTurkish Data-to-Text Generation Using Sequence-to-Sequence Neural Networksen_US
dc.typeArticleen_US
dc.identifier.doi10.1145/3543826-
dc.identifier.scopus2-s2.0-85152906599en_US
dc.description.PublishedMonthDecemberen_US
dc.description.woscitationindexScience Citation Index Expanded-
dc.identifier.wosqualityQ4-
dc.description.WoSDocumentTypearticle-
dc.description.WoSInternationalCollaborationWithout international collaboration - NOen_US
dc.description.WoSPublishedMonthAprilen_US
dc.description.WoSIndexDate2023en_US
dc.description.WoSYOKperiodYÖK - 2022-23en_US
dc.identifier.scopusqualityQ2-
dc.relation.publicationcategoryArticle - International Refereed Journal - Institutional Faculty Memberen_US
dc.identifier.issue2en_US
dc.identifier.volume22en_US
dc.departmentFaculty of Engineering, Department of Electrical and Electronics Engineeringen_US
dc.relation.journalACM Transactions on Asian and Low-Resource Language Information Processingen_US
dc.identifier.wosWOS:000963394900006en_US
dc.institutionauthorDemir, Şeniz-
item.grantfulltextnone-
item.fulltextNo Fulltext-
item.languageiso639-1en-
item.openairetypeArticle-
item.openairecristypehttp://purl.org/coar/resource_type/c_18cf-
item.cerifentitytypePublications-
crisitem.author.dept02.02. Department of Computer Engineering-
Appears in Collections:Department of Electrical and Electronics Engineering Collection
Scopus İndeksli Yayınlar Koleksiyonu / Scopus Indexed Publications Collection
WoS İndeksli Yayınlar Koleksiyonu / WoS Indexed Publications Collection

Scopus citations: 2 (checked on Nov 16, 2024)
Web of Science citations: 1 (checked on Nov 16, 2024)
Page views: 66 (checked on Nov 18, 2024)

Items in GCRIS Repository are protected by copyright, with all rights reserved, unless otherwise indicated.