Please use this identifier to cite or link to this item:
https://hdl.handle.net/20.500.11779/1807
Title: | A Benchmark Dataset for Turkish Data-to-Text Generation |
Authors: | Demir, Şeniz; Öktem, Seza |
Keywords: | Turkish; Neural models; Dining venue domain; Biography domain; Data-to-text generation; Crowdsourcing |
Publisher: | Elsevier |
Source: | Demir, S., & Oktem, S. (16 July 2022). A benchmark dataset for Turkish data-to-text generation. Computer Speech & Language. pp.1-45. https://doi.org/10.1016/j.csl.2022.101433 |
Abstract: | In the last decades, data-to-text (D2T) systems that directly learn from data have gained a lot of attention in natural language generation. These systems need data with high quality and large volume, but unfortunately some natural languages suffer from the lack of readily available generation datasets. This article describes our efforts to create a new Turkish dataset (Tr-D2T) that consists of meaning representation and reference sentence pairs without fine-grained word alignments. We utilize Turkish web resources and existing datasets in other languages for producing meaning representations and collect reference sentences by crowdsourcing native speakers. We particularly focus on the generation of single-sentence biographies and dining venue descriptions. In order to motivate future Turkish D2T studies, we present detailed benchmarking results of different sequence-to-sequence neural models trained on this dataset. To the best of our knowledge, this work is the first of its kind that provides preliminary findings and lessons learned from the creation of a new Turkish D2T dataset. Moreover, our work is the first extensive study that presents generation performances of transformer and recurrent neural network models from meaning representations in this morphologically-rich language. |
URI: | https://doi.org/10.1016/j.csl.2022.101433; https://hdl.handle.net/20.500.11779/1807 |
ISSN: | 0885-2308 |
Appears in Collections: | Computer Engineering Department Collection; Scopus Indexed Publications Collection; WoS Indexed Publications Collection |
Files in This Item:
File | Description | Size | Format |
---|---|---|---|
1-s2.0-S0885230822000614-main.pdf (Until 2040-01-01) | Full Text - Article | 1.49 MB | Adobe PDF |
Scopus citations: 1 (checked on Jan 18, 2025)
Web of Science citations: 1 (checked on Jan 18, 2025)
Page views: 54 (checked on Jan 13, 2025)
Items in GCRIS Repository are protected by copyright, with all rights reserved, unless otherwise indicated.