Please use this identifier to cite or link to this item: https://hdl.handle.net/20.500.11779/1807
Full metadata record
DC Field | Value | Language
dc.contributor.author | Demir, Şeniz | -
dc.contributor.author | Öktem, Seza | -
dc.date.accessioned | 2022-07-22T11:32:03Z | -
dc.date.available | 2022-07-22T11:32:03Z | -
dc.date.issued | 2022 | -
dc.identifier.citation | Demir, S., & Oktem, S. (16 July 2022). A benchmark dataset for Turkish data-to-text generation. Computer Speech & Language. pp.1-45. https://doi.org/10.1016/j.csl.2022.101433 | en_US
dc.identifier.issn | 0885-2308 | -
dc.identifier.uri | https://doi.org/10.1016/j.csl.2022.101433 | -
dc.identifier.uri | https://hdl.handle.net/20.500.11779/1807 | -
dc.description.abstract | In the last decades, data-to-text (D2T) systems that directly learn from data have gained a lot of attention in natural language generation. These systems need data with high quality and large volume, but unfortunately some natural languages suffer from the lack of readily available generation datasets. This article describes our efforts to create a new Turkish dataset (Tr-D2T) that consists of meaning representation and reference sentence pairs without fine-grained word alignments. We utilize Turkish web resources and existing datasets in other languages for producing meaning representations and collect reference sentences by crowdsourcing native speakers. We particularly focus on the generation of single-sentence biographies and dining venue descriptions. In order to motivate future Turkish D2T studies, we present detailed benchmarking results of different sequence-to-sequence neural models trained on this dataset. To the best of our knowledge, this work is the first of its kind that provides preliminary findings and lessons learned from the creation of a new Turkish D2T dataset. Moreover, our work is the first extensive study that presents generation performances of transformer and recurrent neural network models from meaning representations in this morphologically-rich language. | en_US
dc.language.iso | en | en_US
dc.publisher | Elsevier | en_US
dc.rights | info:eu-repo/semantics/closedAccess | en_US
dc.subject | Turkish | en_US
dc.subject | Neural models | en_US
dc.subject | Dining venue domain | en_US
dc.subject | Biography domain | en_US
dc.subject | Data-to-text generation | en_US
dc.subject | Crowdsourcing | en_US
dc.title | A Benchmark Dataset for Turkish Data-To-Text Generation | en_US
dc.type | Article | en_US
dc.identifier.doi | 10.1016/j.csl.2022.101433 | -
dc.identifier.scopus | 2-s2.0-85134849907 | en_US
dc.authorid | Şeniz Demir / 0000-0003-4897-4616 | -
dc.authorid | Seza Öktem / 0000-0003-2885-7359 | -
dc.description.PublishedMonth | July | en_US
dc.description.woscitationindex | Science Citation Index Expanded | -
dc.identifier.wosquality | Q2 | -
dc.description.WoSDocumentType | Article |
dc.description.WoSInternationalCollaboration | Not carried out with international collaboration - NO | en_US
dc.description.WoSPublishedMonth | August | en_US
dc.description.WoSIndexDate | 2022 | en_US
dc.description.WoSYOKperiod | YÖK - 2021-22 | en_US
dc.identifier.scopusquality | Q1 | -
dc.relation.publicationcategory | Article - International Refereed Journal - Institutional Faculty Member | en_US
dc.identifier.endpage | 45 | en_US
dc.identifier.startpage | 1 | en_US
dc.department | Faculty of Engineering, Department of Computer Engineering | en_US
dc.relation.journal | Computer Speech & Language | en_US
dc.identifier.wos | WOS:000834597200001 | en_US
dc.institutionauthor | Demir, Şeniz | -
dc.institutionauthor | Öktem, Seza | -
item.fulltext | With Fulltext | -
item.openairecristype | http://purl.org/coar/resource_type/c_18cf | -
item.languageiso639-1 | en | -
item.openairetype | Article | -
item.grantfulltext | embargo_20400101 | -
item.cerifentitytype | Publications | -
crisitem.author.dept | 02.02. Department of Computer Engineering | -
Appears in Collections: Bilgisayar Mühendisliği Bölümü Koleksiyonu / Department of Computer Engineering Collection
Scopus İndeksli Yayınlar Koleksiyonu / Scopus Indexed Publications Collection
WoS İndeksli Yayınlar Koleksiyonu / WoS Indexed Publications Collection
Files in This Item:
File | Description | Size | Format
1-s2.0-S0885230822000614-main.pdf (embargoed until 2040-01-01) | Full Text - Article | 1.49 MB | Adobe PDF



Scopus citations: 1 (checked on Nov 9, 2024)
Web of Science citations: 1 (checked on Nov 9, 2024)
Page view(s): 24 (checked on Nov 4, 2024)

Items in GCRIS Repository are protected by copyright, with all rights reserved, unless otherwise indicated.