Browsing by Author "Karamatli, Ertug"
Article
Audio Source Separation Using Variational Autoencoders and Weak Class Supervision (IEEE-Inst Electrical Electronics Engineers Inc, 2019)
Karamatli, Ertug; Kirbiz, Serap; Cemgil, Ali Taylan
In this letter, we propose a source separation method that is trained by observing only the mixtures and the class labels of the sources present in each mixture, without any access to isolated sources. Because the method does not require a class label for every time-frequency bin, only a single label for each source in the mixture signal, we call this scenario weak class supervision. We associate a variational autoencoder (VAE) with each source class within a non-negative (compositional) model. Each VAE provides a prior model for identifying the signal of its associated class in a sound mixture. After training the model on mixtures, we obtain a generative model for each source class, and we demonstrate the method on one-second mixtures of utterances of the digits 0 to 9. We show that the separation performance obtained with source class supervision is as good as the performance obtained with source signal supervision.

Conference Object
Değişimli Oto-Kodlayıcılar Kullanılarak Birleşik Kaynak Ayırtstırma ve Sınıflandırma [English translation: Joint Source Separation and Classification Using Variational Autoencoders] (Institute of Electrical and Electronics Engineers Inc., 2020)
Karamatli, Ertug; Kirbiz, Serap; Cemgil, Ali Taylan; Hizli, Caglar

Conference Object
Weak Label Supervision for Monaural Source Separation Using Non-Negative Denoising Variational Autoencoders (IEEE, 2019)
Karamatli, Ertug; Kirbiz, Serap; Cemgil, Ali Taylan
Deep learning models are very effective in source separation when large amounts of labeled data are available. However, carefully labeled datasets are not always available. In this paper, we propose a weak supervision method that uses only class information, rather than source signals, to learn to separate mixtures of short utterances. We associate a variational autoencoder (VAE) with each class within a non-negative model. We demonstrate that deep convolutional VAEs provide a prior model for identifying complex signals in a sound mixture without access to any source signal. We show that the separation results are on par with those obtained with source signal supervision.

