Please use this identifier to cite or link to this item: https://hdl.handle.net/20.500.11779/2303
Full metadata record
DC Field | Value | Language
dc.contributor.author | Menevşe, M.Ü. | -
dc.contributor.author | Manav, Y. | -
dc.contributor.author | Arisoy, E. | -
dc.contributor.author | Özgür, A. | -
dc.date.accessioned | 2024-06-21T17:28:17Z | -
dc.date.available | 2024-06-21T17:28:17Z | -
dc.date.issued | 2024 | -
dc.identifier.isbn | 978-249381410-4 | -
dc.identifier.uri | https://hdl.handle.net/20.500.11779/2303 | -
dc.description | Aequa-Tech; Baidu; Bloomberg; Dataforce (Transperfect); et al.; Intesa San Paolo Bank | en_US
dc.description.abstract | This paper focuses on dealing with data scarcity in spoken question answering (QA) using automatic question-answer generation and a carefully selected fine-tuning strategy that leverages limited annotated data (paragraphs and question-answer pairs). Spoken QA is a challenging task due to using spoken documents, i.e., erroneous automatic speech recognition (ASR) transcriptions, and the scarcity of spoken QA data. We propose a framework for utilizing limited annotated data effectively to improve spoken QA performance. To deal with data scarcity, we train a question-answer generation model with annotated data and then produce large amounts of question-answer pairs from unannotated data (paragraphs). Our experiments demonstrate that incorporating limited annotated data and the automatically generated data through a carefully selected fine-tuning strategy leads to 5.5% relative F1 gain over the model trained only with annotated data. Moreover, the proposed framework is also effective in high ASR errors. © 2024 ELRA Language Resource Association: CC BY-NC 4.0. | en_US
dc.language.iso | en | en_US
dc.publisher | European Language Resources Association (ELRA) | en_US
dc.relation.ispartof | 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation, LREC-COLING 2024 - Main Conference Proceedings -- Joint 30th International Conference on Computational Linguistics and 14th International Conference on Language Resources and Evaluation, LREC-COLING 2024 -- 20 May 2024 through 25 May 2024 -- Hybrid, Torino -- 199620 | en_US
dc.rights | info:eu-repo/semantics/closedAccess | en_US
dc.subject | question generation | en_US
dc.subject | spoken question answering | en_US
dc.title | Dealing with Data Scarcity in Spoken Question Answering | en_US
dc.type | Conference Object | en_US
dc.identifier.scopus | 2-s2.0-85195947153 | en_US
dc.authorscopusid | 58137783500 | -
dc.authorscopusid | 57219551922 | -
dc.authorscopusid | 14030977200 | -
dc.authorscopusid | 56230487200 | -
dc.identifier.wosquality | N/A | -
dc.identifier.scopusquality | N/A | -
dc.relation.publicationcategory | Conference Item - International - Institutional Faculty Member | en_US
dc.identifier.endpage | 4455 | en_US
dc.identifier.startpage | 4449 | en_US
dc.department | Mef University | en_US
dc.identifier.citationcount | 0 | -
item.openairecristype | http://purl.org/coar/resource_type/c_18cf | -
item.grantfulltext | none | -
item.languageiso639-1 | en | -
item.cerifentitytype | Publications | -
item.fulltext | No Fulltext | -
item.openairetype | Conference Object | -
Appears in Collections: Scopus İndeksli Yayınlar Koleksiyonu / Scopus Indexed Publications Collection
Items in GCRIS Repository are protected by copyright, with all rights reserved, unless otherwise indicated.