Dealing With Data Scarcity in Spoken Question Answering

Arısoy, Ebru; Özgür, Arzucan; Ünlü Menevşe, Merve; Manav, Yusufcan

Files

Full Text - Article.pdf (959.36 KB)

Date

2024

Publisher

European Language Resources Association (ELRA)

Abstract

This paper addresses data scarcity in spoken question answering (QA) through automatic question-answer generation and a carefully selected fine-tuning strategy that leverages limited annotated data (paragraphs and question-answer pairs). Spoken QA is challenging for two reasons: the documents are spoken, so systems must work from error-prone automatic speech recognition (ASR) transcriptions, and spoken QA data is scarce. We propose a framework that uses limited annotated data effectively to improve spoken QA performance. To deal with data scarcity, we train a question-answer generation model on the annotated data and then use it to produce large numbers of question-answer pairs from unannotated paragraphs. Our experiments demonstrate that combining the limited annotated data with the automatically generated data through a carefully selected fine-tuning strategy yields a 5.5% relative F1 gain over a model trained only on annotated data. Moreover, the proposed framework remains effective under high ASR error rates. © 2024 ELRA Language Resource Association: CC BY-NC 4.0.
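
To make the reported figure concrete, the sketch below shows a token-overlap F1 of the kind commonly used for extractive QA, together with how a relative gain is computed from two F1 scores. It is an illustration under stated assumptions, not the paper's evaluation code: the whitespace tokenization, the function names `token_f1` and `relative_gain`, and the example scores (60.0 and 63.3) are all hypothetical and do not come from the paper.

```python
from collections import Counter


def token_f1(prediction: str, reference: str) -> float:
    """Token-overlap F1 between a predicted and a reference answer.

    Assumption: lowercased whitespace tokenization; the paper's exact
    answer normalization is not given in this record.
    """
    pred_tokens = prediction.lower().split()
    ref_tokens = reference.lower().split()
    if not pred_tokens or not ref_tokens:
        # Both empty counts as a match; exactly one empty counts as a miss.
        return float(pred_tokens == ref_tokens)
    # Multiset intersection counts each shared token at most min(count) times.
    overlap = sum((Counter(pred_tokens) & Counter(ref_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(ref_tokens)
    return 2 * precision * recall / (precision + recall)


def relative_gain(baseline: float, improved: float) -> float:
    """Relative improvement of `improved` over `baseline`, in percent."""
    return (improved - baseline) / baseline * 100.0


if __name__ == "__main__":
    print(token_f1("the red car", "red car"))
    # Hypothetical scores chosen only to illustrate the arithmetic.
    print(relative_gain(60.0, 63.3))
```

Note that a relative gain is measured against the baseline score rather than in absolute F1 points, so the same relative figure corresponds to different absolute improvements depending on where the baseline sits.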

Description

Aequa-Tech; Baidu; Bloomberg; Dataforce (Transperfect); et al.; Intesa San Paolo Bank

Keywords

Spoken question answering, Question generation

WoS Q

N/A

Scopus Q

N/A

Source

2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation, LREC-COLING 2024 - Main Conference Proceedings -- Joint 30th International Conference on Computational Linguistics and 14th International Conference on Language Resources and Evaluation, LREC-COLING 2024 -- 20 May 2024 through 25 May 2024 -- Hybrid, Torino -- 199620

Start Page

4449

End Page

4455

URI

https://hdl.handle.net/20.500.11779/2303

Collections

Scopus İndeksli Yayınlar Koleksiyonu / Scopus Indexed Publications Collection
WoS İndeksli Yayınlar Koleksiyonu / WoS Indexed Publications Collection

SCOPUS™ Citations

2

checked on Jun 11, 2026

Web of Science™ Citations

1

checked on Jun 11, 2026

Page Views

71

checked on Jun 11, 2026
