Dealing With Data Scarcity in Spoken Question Answering
| dc.contributor.author | Menevse, Merve Unlu | |
| dc.contributor.author | Manavi, Yusufcan | |
| dc.contributor.author | Arisoy, Ebru | |
| dc.contributor.author | Ozgur, Arzucan | |
| dc.date.accessioned | 2026-03-05T15:02:45Z | |
| dc.date.available | 2026-03-05T15:02:45Z | |
| dc.date.issued | 2024 | |
| dc.description.abstract | This paper addresses data scarcity in spoken question answering (QA) using automatic question-answer generation and a carefully selected fine-tuning strategy that leverages limited annotated data (paragraphs and question-answer pairs). Spoken QA is challenging because it operates on spoken documents, i.e., error-prone automatic speech recognition (ASR) transcriptions, and because spoken QA data is scarce. We propose a framework that uses limited annotated data effectively to improve spoken QA performance. To deal with data scarcity, we train a question-answer generation model on the annotated data and then use it to produce large amounts of question-answer pairs from unannotated data (paragraphs). Our experiments demonstrate that combining the limited annotated data with the automatically generated data through a carefully selected fine-tuning strategy yields a 5.5% relative F1 gain over the model trained only on annotated data. Moreover, the proposed framework remains effective under high ASR error rates. | en_US |
| dc.identifier.isbn | 9782493814104 | |
| dc.identifier.issn | 2951-2093 | |
| dc.identifier.uri | https://hdl.handle.net/20.500.11779/3238 | |
| dc.language.iso | en | en_US |
| dc.publisher | Association for Computational Linguistics (ACL) | en_US |
| dc.relation.ispartof | 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING), May 20-25, 2024, Torino, Italy | en_US |
| dc.relation.ispartofseries | International Conference on Computational Linguistics Language Resources and Evaluation | |
| dc.rights | info:eu-repo/semantics/closedAccess | en_US |
| dc.subject | Spoken Question Answering | en_US |
| dc.subject | Question Generation | en_US |
| dc.title | Dealing With Data Scarcity in Spoken Question Answering | en_US |
| dc.type | Conference Object | en_US |
| dspace.entity.type | Publication | |
| gdc.description.department | Mef University | en_US |
| gdc.description.departmenttemp | [Menevse, Merve Unlu; Manavi, Yusufcan; Ozgur, Arzucan] Bogazici Univ, Istanbul, Turkiye; [Arisoy, Ebru] MEF Univ, Istanbul, Turkiye | en_US |
| gdc.description.endpage | 4455 | en_US |
| gdc.description.publicationcategory | Conference Item - International - Institutional Faculty Member | en_US |
| gdc.description.scopusquality | Q2 | |
| gdc.description.startpage | 4449 | en_US |
| gdc.description.woscitationindex | Conference Proceedings Citation Index - Science; Conference Proceedings Citation Index - Social Science & Humanities | |
| gdc.description.wosquality | N/A | |
| gdc.identifier.wos | WOS:001592980400397 | |
| gdc.index.type | WoS | |
| gdc.wos.citedcount | 1 | |
| relation.isOrgUnitOfPublication | a6e60d5c-b0c7-474a-b49b-284dc710c078 | |
| relation.isOrgUnitOfPublication.latestForDiscovery | a6e60d5c-b0c7-474a-b49b-284dc710c078 |
