Dealing With Data Scarcity in Spoken Question Answering

dc.contributor.author Menevse, Merve Unlu
dc.contributor.author Manavi, Yusufcan
dc.contributor.author Arisoy, Ebru
dc.contributor.author Ozgur, Arzucan
dc.date.accessioned 2026-03-05T15:02:45Z
dc.date.available 2026-03-05T15:02:45Z
dc.date.issued 2024
dc.description.abstract This paper focuses on dealing with data scarcity in spoken question answering (QA) using automatic question-answer generation and a carefully selected fine-tuning strategy that leverages limited annotated data (paragraphs and question-answer pairs). Spoken QA is challenging both because it operates on spoken documents, i.e., erroneous automatic speech recognition (ASR) transcriptions, and because spoken QA data is scarce. We propose a framework for utilizing limited annotated data effectively to improve spoken QA performance. To deal with data scarcity, we train a question-answer generation model on the annotated data and then use it to produce large amounts of question-answer pairs from unannotated data (paragraphs). Our experiments demonstrate that combining the limited annotated data with the automatically generated data through a carefully selected fine-tuning strategy leads to a 5.5% relative F1 gain over the model trained only with annotated data. Moreover, the proposed framework remains effective when ASR error rates are high. en_US
dc.identifier.isbn 9782493814104
dc.identifier.issn 2951-2093
dc.identifier.uri https://hdl.handle.net/20.500.11779/3238
dc.language.iso en en_US
dc.publisher Assoc Computational Linguistics-acl en_US
dc.relation.ispartof 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation-LREC-COLING -- May 20-25, 2024 -- Torino, ITALY en_US
dc.relation.ispartofseries International Conference on Computational Linguistics Language Resources and Evaluation
dc.rights info:eu-repo/semantics/closedAccess en_US
dc.subject Spoken Question Answering en_US
dc.subject Question Generation en_US
dc.title Dealing With Data Scarcity in Spoken Question Answering en_US
dc.type Conference Object en_US
dspace.entity.type Publication
gdc.description.department Mef University en_US
gdc.description.departmenttemp [Menevse, Merve Unlu; Manavi, Yusufcan; Ozgur, Arzucan] Bogazici Univ, Istanbul, Turkiye; [Arisoy, Ebru] MEF Univ, Istanbul, Turkiye en_US
gdc.description.endpage 4455 en_US
gdc.description.publicationcategory Conference Item - International - Institutional Faculty Member en_US
gdc.description.scopusquality Q2
gdc.description.startpage 4449 en_US
gdc.description.woscitationindex Conference Proceedings Citation Index - Science - Conference Proceedings Citation Index - Social Science & Humanities
gdc.description.wosquality N/A
gdc.identifier.wos WOS:001592980400397
gdc.index.type WoS
gdc.wos.citedcount 1
relation.isOrgUnitOfPublication a6e60d5c-b0c7-474a-b49b-284dc710c078
relation.isOrgUnitOfPublication.latestForDiscovery a6e60d5c-b0c7-474a-b49b-284dc710c078
