Improving the Usage of Subword-Based Units for Turkish Speech Recognition
| dc.contributor.author | Çetinkaya, Gözde | |
| dc.contributor.author | Saraçlar, Murat | |
| dc.contributor.author | Arısoy, Ebru | |
| dc.date.accessioned | 2021-10-09T07:26:12Z | |
| dc.date.available | 2021-10-09T07:26:12Z | |
| dc.date.issued | 2020 | |
| dc.description.abstract | Subword units are often used to improve speech recognition performance in agglutinative languages, where the number of observed word forms is very high. This study explores the proper use of subword units in recognition by reconsidering details such as silence modeling and position-dependent phones. A modified lexicon is implemented with finite-state transducers to represent the subword units correctly. We also experiment with different types of word boundary markers and achieve the best performance by adding a marker to both the left and right side of a subword unit. In our experiments on a Turkish broadcast news dataset, the subword models outperform word-based models and naive subword implementations. Results show that using proper subword units yields a 2.4% relative word error rate (WER) reduction compared with the word-level automatic speech recognition (ASR) system for Turkish. | |
| dc.description.sponsorship | Istanbul Medipol Univ | |
| dc.identifier.citation | G. Çetinkaya, E. Arısoy and M. Saraçlar, (5-7 Oct. 2020). Improving the Usage of Subword-Based Units for Turkish Speech Recognition, 2020 28th Signal Processing and Communications Applications Conference (SIU), pp. 1-4, doi: 10.1109/SIU49456.2020.9302043. | |
| dc.identifier.doi | 10.1109/SIU49456.2020.9302043 | |
| dc.identifier.isbn | 9781728172064 | |
| dc.identifier.issn | 2165-0608 | |
| dc.identifier.scopus | 2-s2.0-85100307964 | |
| dc.identifier.uri | https://doi.org/10.1109/SIU49456.2020.9302043 | |
| dc.identifier.uri | https://hdl.handle.net/20.500.11779/1572 | |
| dc.language.iso | tr | |
| dc.publisher | IEEE | |
| dc.relation.ispartof | 2020 28th Signal Processing and Communications Applications Conference (SIU) | |
| dc.rights | info:eu-repo/semantics/closedAccess | |
| dc.subject | Konuşma tanıma | |
| dc.subject | Language modelling | |
| dc.subject | Acoustic modelling | |
| dc.subject | Speech recognition | |
| dc.subject | Akustik modelleme | |
| dc.subject | Dil modelleme | |
| dc.title | Improving the Usage of Subword-Based Units for Turkish Speech Recognition | |
| dc.title.alternative | Türkçe konuşma tanıma için sözcük altı birimlerin kullanımının iyileştirilmesi | |
| dc.type | Conference Object | |
| dspace.entity.type | Publication | |
| gdc.author.id | Ebru Arısoy / 0000-0002-8311-3611 | |
| gdc.author.institutional | Arısoy, Ebru | |
| gdc.bip.impulseclass | C5 | |
| gdc.bip.influenceclass | C5 | |
| gdc.bip.popularityclass | C5 | |
| gdc.coar.access | metadata only access | |
| gdc.coar.type | text::conference output | |
| gdc.description.department | Faculty of Engineering, Department of Electrical and Electronics Engineering | |
| gdc.description.endpage | 4 | |
| gdc.description.publicationcategory | Conference Item - International - Institutional Faculty Member | |
| gdc.description.scopusquality | N/A | |
| gdc.description.startpage | 1 | |
| gdc.description.woscitationindex | Conference Proceedings Citation Index - Science | |
| gdc.description.wosquality | N/A | |
| gdc.identifier.openalex | W3119042470 | |
| gdc.identifier.wos | WOS:000653136100017 | |
| gdc.index.type | WoS | |
| gdc.index.type | Scopus | |
| gdc.oaire.diamondjournal | false | |
| gdc.oaire.impulse | 0.0 | |
| gdc.oaire.influence | 2.5942106E-9 | |
| gdc.oaire.isgreen | false | |
| gdc.oaire.popularity | 1.652743E-9 | |
| gdc.oaire.publicfunded | false | |
| gdc.oaire.sciencefields | 03 medical and health sciences | |
| gdc.oaire.sciencefields | 0305 other medical science | |
| gdc.openalex.fwci | 0.2937191 | |
| gdc.openalex.normalizedpercentile | 0.65 | |
| gdc.opencitations.count | 1 | |
| gdc.plumx.crossrefcites | 1 | |
| gdc.plumx.mendeley | 3 | |
| gdc.plumx.scopuscites | 2 | |
| gdc.publishedmonth | October | |
| gdc.relation.journal | 2020 28th Signal Processing and Communications Applications Conference (SIU) | |
| gdc.scopus.citedcount | 2 | |
| gdc.virtual.author | Arısoy Saraçlar, Ebru | |
| gdc.wos.citedcount | 1 | |
| gdc.wos.collaboration | Not carried out with international collaboration - NO | |
| gdc.wos.documenttype | Proceedings Paper | |
| gdc.wos.indexdate | 2020 | |
| gdc.wos.publishedmonth | October | |
| gdc.yokperiod | YÖK - 2020-21 | |
| relation.isAuthorOfPublication | 0b895153-5793-4e46-bc2f-06a28b30f531 | |
| relation.isAuthorOfPublication.latestForDiscovery | 0b895153-5793-4e46-bc2f-06a28b30f531 | |
| relation.isOrgUnitOfPublication | de19334f-6a5b-4f7b-9410-9433c48d1e5a | |
| relation.isOrgUnitOfPublication | 0d54cd31-4133-46d5-b5cc-280b2c077ac3 | |
| relation.isOrgUnitOfPublication | a6e60d5c-b0c7-474a-b49b-284dc710c078 | |
| relation.isOrgUnitOfPublication.latestForDiscovery | de19334f-6a5b-4f7b-9410-9433c48d1e5a |
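
The abstract mentions word boundary markers placed on the left, the right, or both sides of a subword unit. The sketch below only illustrates that idea and is not the authors' implementation. The `+` marker symbol, the function names `mark_subwords` and `join_subwords`, and the example segmentation `ev | ler | de` are all assumptions introduced here.

```python
def mark_subwords(pieces, scheme="both", marker="+"):
    """Attach a marker at word-internal boundaries of one segmented word.

    scheme: "left"  -> marker on the left of non-initial pieces
            "right" -> marker on the right of non-final pieces
            "both"  -> both of the above
    The marker symbol and scheme names are illustrative assumptions.
    """
    last = len(pieces) - 1
    marked = []
    for i, piece in enumerate(pieces):
        left = marker if scheme in ("left", "both") and i > 0 else ""
        right = marker if scheme in ("right", "both") and i < last else ""
        marked.append(left + piece + right)
    return marked


def join_subwords(tokens, marker="+"):
    """Rebuild words from a marked subword sequence (any of the schemes)."""
    words = []
    glue_next = False  # True when the previous token ended with the marker
    for tok in tokens:
        attach = tok.startswith(marker) or glue_next
        core = tok.strip(marker)
        if words and attach:
            words[-1] += core
        else:
            words.append(core)
        glue_next = tok.endswith(marker)
    return words


if __name__ == "__main__":
    pieces = ["ev", "ler", "de"]  # hypothetical segmentation of "evlerde"
    for scheme in ("left", "right", "both"):
        print(scheme, mark_subwords(pieces, scheme))
    print(join_subwords(mark_subwords(pieces, "both") + ["güzel"]))
```

With markers on both sides, every word-internal boundary is signalled twice, so the recognizer output can be joined back into words without ambiguity.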
Files
Original bundle
- Name: Improving_the_Usage_of_Subword-Based_Units_for_Turkish_Speech_Recognition.pdf
- Size: 224.35 KB
- Format: Adobe Portable Document Format
- Description: Proceedings Paper