Please use this identifier to cite or link to this item: https://hdl.handle.net/20.500.11779/1989
Title: MixCycle: Unsupervised Speech Separation via Cyclic Mixture Permutation Invariant Training
Authors: Karamatlı, Ertuğ
Kırbız, Serap
Keywords: Training
Recording
Source separation
Time-domain analysis
Task analysis
Optimized production technology
Unsupervised learning
Blind source separation
Deep learning
Self-supervised learning
Publisher: IEEE
Source: Karamatlı, E., & Kırbız, S. (2022). MixCycle: Unsupervised Speech Separation via Cyclic Mixture Permutation Invariant Training. IEEE Signal Processing Letters, 29, 2637-2641.
Abstract: We introduce two unsupervised source separation methods, which involve self-supervised training from single-channel two-source speech mixtures. Our first method, mixture permutation invariant training (MixPIT), enables learning a neural network model which separates the underlying sources via a challenging proxy task without supervision from the reference sources. Our second method, cyclic mixture permutation invariant training (MixCycle), uses MixPIT as a building block in a cyclic fashion for continuous learning. MixCycle gradually converts the problem from separating mixtures of mixtures into separating single mixtures. We compare our methods to common supervised and unsupervised baselines: permutation invariant training with dynamic mixing (PIT-DM) and mixture invariant training (MixIT). We show that MixCycle outperforms MixIT and reaches a performance level very close to the supervised baseline (PIT-DM) while circumventing the over-separation issue of MixIT. Also, we propose a self-evaluation technique inspired by MixCycle that estimates model performance without utilizing any reference sources. We show that it yields results consistent with an evaluation on reference sources (LibriMix) and also with an informal listening test conducted on a real-life mixtures dataset (REAL-M).
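Note: As a rough illustration of the training signal summarized in the abstract, the sketch below shows a mixture-of-mixtures proxy loss in the spirit of MixPIT: two training mixtures are summed, the separator is asked to recover the two constituent mixtures, and the better of the two output-to-mixture assignments is kept (permutation invariance). The `model` interface, the negative SI-SNR objective, and the two-output setup are illustrative assumptions, not the authors' exact formulation; MixCycle then reuses a loss of this kind cyclically so that training gradually shifts from mixtures of mixtures toward single mixtures.

```python
# Minimal sketch of a mixture-of-mixtures proxy loss in the spirit of MixPIT.
# The separator interface and the SI-SNR objective are assumptions made for
# illustration; see the full article for the authors' exact method.
import torch

def si_snr(est, ref, eps=1e-8):
    """Scale-invariant SNR (dB) between an estimate and a reference signal."""
    ref = ref - ref.mean(dim=-1, keepdim=True)
    est = est - est.mean(dim=-1, keepdim=True)
    proj = (est * ref).sum(-1, keepdim=True) * ref / (ref.pow(2).sum(-1, keepdim=True) + eps)
    noise = est - proj
    return 10 * torch.log10(proj.pow(2).sum(-1) / (noise.pow(2).sum(-1) + eps) + eps)

def mixpit_loss(model, mix1, mix2):
    """Proxy task: separate a sum of two mixtures back into those mixtures,
    with a permutation-invariant assignment and no reference sources."""
    mom = mix1 + mix2                # mixture of mixtures (single channel)
    est1, est2 = model(mom)          # hypothetical separator returning two waveforms
    # Score both assignments of estimates to mixtures; keep the better one per example.
    loss_a = -(si_snr(est1, mix1) + si_snr(est2, mix2))
    loss_b = -(si_snr(est1, mix2) + si_snr(est2, mix1))
    return torch.minimum(loss_a, loss_b).mean()
```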
URI: https://hdl.handle.net/20.500.11779/1989
https://doi.org/10.1109/LSP.2022.3232276
ISSN: 1070-9908
1558-2361
Appears in Collections: Endüstri Mühendisliği Bölümü koleksiyonu / Industrial Engineering Department Collection
Scopus İndeksli Yayınlar Koleksiyonu / Scopus Indexed Publications Collection
WoS İndeksli Yayınlar Koleksiyonu / WoS Indexed Publications Collection

Files in This Item:
File: MixCycle_Unsupervised_Speech_Separation_via_Cyclic_Mixture_Permutation_Invariant_Training.pdf
Description: Full Text - Article
Size: 604.92 kB
Format: Adobe PDF

Items in GCRIS Repository are protected by copyright, with all rights reserved, unless otherwise indicated.