Please use this identifier to cite or link to this item: https://hdl.handle.net/20.500.11779/2303
Full metadata record
DC Field | Value | Language
dc.contributor.author | Arısoy, Ebru | -
dc.contributor.author | Özgür, Arzucan | -
dc.contributor.author | Ünlü Menevşe, Merve | -
dc.contributor.author | Manav, Yusufcan | -
dc.date.accessioned | 2024-06-21T17:28:17Z | -
dc.date.available | 2024-06-21T17:28:17Z | -
dc.date.issued | 2024 | -
dc.identifier.isbn | 9782493814104 | -
dc.identifier.uri | https://hdl.handle.net/20.500.11779/2303 | -
dc.description | Aequa-Tech; Baidu; Bloomberg; Dataforce (Transperfect); et al.; Intesa San Paolo Bank | en_US
dc.description.abstract | This paper focuses on dealing with data scarcity in spoken question answering (QA) using automatic question-answer generation and a carefully selected fine-tuning strategy that leverages limited annotated data (paragraphs and question-answer pairs). Spoken QA is a challenging task due to using spoken documents, i.e., erroneous automatic speech recognition (ASR) transcriptions, and the scarcity of spoken QA data. We propose a framework for utilizing limited annotated data effectively to improve spoken QA performance. To deal with data scarcity, we train a question-answer generation model with annotated data and then produce large amounts of question-answer pairs from unannotated data (paragraphs). Our experiments demonstrate that incorporating limited annotated data and the automatically generated data through a carefully selected fine-tuning strategy leads to 5.5% relative F1 gain over the model trained only with annotated data. Moreover, the proposed framework is also effective in high ASR errors. © 2024 ELRA Language Resource Association: CC BY-NC 4.0. | en_US
dc.language.iso | en | en_US
dc.publisher | European Language Resources Association (ELRA) | en_US
dc.relation.ispartof | 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation, LREC-COLING 2024 - Main Conference Proceedings -- Joint 30th International Conference on Computational Linguistics and 14th International Conference on Language Resources and Evaluation, LREC-COLING 2024 -- 20 May 2024 through 25 May 2024 -- Hybrid, Torino -- 199620 | en_US
dc.subject | Spoken question answering | en_US
dc.subject | Question generation | en_US
dc.title | Dealing With Data Scarcity in Spoken Question Answering | en_US
dc.type | Conference Object | en_US
dc.identifier.scopus | 2-s2.0-85195947153 | en_US
dc.authorscopusid | 58137783500 | -
dc.authorscopusid | 57219551922 | -
dc.authorscopusid | 14030977200 | -
dc.authorscopusid | 56230487200 | -
dc.description.PublishedMonth | May | en_US
dc.relation.publicationcategory | Conference Item - International - Institutional Faculty Member | en_US
dc.identifier.endpage | 4455 | en_US
dc.identifier.startpage | 4449 | en_US
dc.department | Faculty of Engineering, Department of Electrical and Electronics Engineering | en_US
dc.institutionauthor | Arısoy, Ebru | -
dc.identifier.citationcount | 0 | -
item.grantfulltext | restricted | -
item.fulltext | With Fulltext | -
item.languageiso639-1 | en | -
item.openairetype | Conference Object | -
item.openairecristype | http://purl.org/coar/resource_type/c_18cf | -
item.cerifentitytype | Publications | -
crisitem.author.dept | 02.05. Department of Electrical and Electronics Engineering | -
Appears in Collections: Scopus İndeksli Yayınlar Koleksiyonu / Scopus Indexed Publications Collection
Files in This Item:
File | Size | Format
Full Text - Article.pdf (Restricted Access) | 991.32 kB | Adobe PDF

Items in GCRIS Repository are protected by copyright, with all rights reserved, unless otherwise indicated.