Audio Source Separation Using Variational Autoencoders and Weak Class Supervision
| dc.contributor.author | Karamatli, Ertug | |
| dc.contributor.author | Kirbiz, Serap | |
| dc.contributor.author | Cemgil, Ali Taylan | |
| dc.date.accessioned | 2026-04-03T15:00:37Z | |
| dc.date.available | 2026-04-03T15:00:37Z | |
| dc.date.issued | 2019 | |
| dc.description.abstract | In this letter, we propose a source separation method that is trained by observing the mixtures and the class labels of the sources present in the mixture without any access to isolated sources. Since our method does not require source class labels for every time-frequency bin but only a single label for each source constituting the mixture signal, we call this scenario weak class supervision. We associate a variational autoencoder (VAE) with each source class within a nonnegative (compositional) model. Each VAE provides a prior model to identify the signal from its associated class in a sound mixture. After training the model on mixtures, we obtain a generative model for each source class and demonstrate our method on one-second mixtures of utterances of digits from 0 to 9. We show that the separation performance obtained by source class supervision is as good as the performance obtained by source signal supervision. | |
| dc.description.sponsorship | This work was supported by the Scientific and Technological Research Council of Turkey (TUBITAK) under Grant 215E076. | |
| dc.description.sponsorship | Scientific and Technological Research Council of Turkey (TUBITAK) [215E076] | |
| dc.identifier.doi | 10.1109/LSP.2019.2929440 | |
| dc.identifier.issn | 1070-9908 | |
| dc.identifier.issn | 1558-2361 | |
| dc.identifier.uri | https://hdl.handle.net/20.500.11779/3268 | |
| dc.identifier.uri | https://doi.org/10.1109/LSP.2019.2929440 | |
| dc.language.iso | en | |
| dc.publisher | IEEE-Inst Electrical Electronics Engineers Inc | |
| dc.rights | info:eu-repo/semantics/openAccess | |
| dc.subject | Variational Autoencoders | |
| dc.subject | Weak Supervision | |
| dc.subject | Source Separation | |
| dc.title | Audio Source Separation Using Variational Autoencoders and Weak Class Supervision | |
| dc.type | Article | |
| dspace.entity.type | Publication | |
| gdc.author.id | Karamatlı, Ertuğ/0000-0001-8839-0821 | |
| gdc.author.id | Kırbız, Serap/0000-0001-7718-3683 | |
| gdc.author.wosid | Kırbız, Serap/LPP-8018-2024 | |
| gdc.author.wosid | Cemgil, Ali/A-3068-2016 | |
| gdc.description.department | MEF University | |
| gdc.description.departmenttemp | [Karamatli, Ertug; Cemgil, Ali Taylan] Bogazici Univ, Dept Comp Engn, TR-34342 Istanbul, Turkey; [Karamatli, Ertug] Sahibinden Com, TR-34752 Istanbul, Turkey; [Kirbiz, Serap] MEF Univ, Dept Elect & Elect Engn, TR-34396 Istanbul, Turkey | |
| gdc.description.endpage | 1353 | |
| gdc.description.issue | 9 | |
| gdc.description.publicationcategory | Article - International Refereed Journal - Institutional Faculty Member | |
| gdc.description.startpage | 1349 | |
| gdc.description.volume | 26 | |
| gdc.description.woscitationindex | Science Citation Index Expanded | |
| gdc.identifier.wos | WOS:000480311900003 | |
| gdc.index.type | WoS | |
| gdc.virtual.author | Kırbız, Serap | |
| gdc.virtual.author | Cemgil, Ali Taylan | |
| relation.isAuthorOfPublication | 552e4b0c-955f-4b93-925b-08cb2e6c5cc0 | |
| relation.isAuthorOfPublication | 6943b45e-b359-4195-b278-21ac0fc5d439 | |
| relation.isAuthorOfPublication.latestForDiscovery | 552e4b0c-955f-4b93-925b-08cb2e6c5cc0 | |
| relation.isOrgUnitOfPublication | de19334f-6a5b-4f7b-9410-9433c48d1e5a | |
| relation.isOrgUnitOfPublication | 05ffa8cd-2a88-4676-8d3b-fc30eba0b7f3 | |
| relation.isOrgUnitOfPublication | 0d54cd31-4133-46d5-b5cc-280b2c077ac3 | |
| relation.isOrgUnitOfPublication | a6e60d5c-b0c7-474a-b49b-284dc710c078 | |
| relation.isOrgUnitOfPublication.latestForDiscovery | de19334f-6a5b-4f7b-9410-9433c48d1e5a |
