MixCycle: Unsupervised Speech Separation via Cyclic Mixture Permutation Invariant Training

dc.contributor.author Karamatlı, Ertuğ
dc.contributor.author Kırbız, Serap
dc.date.accessioned 2023-10-18T12:06:14Z
dc.date.available 2023-10-18T12:06:14Z
dc.date.issued 2022
dc.description.abstract We introduce two unsupervised source separation methods, which involve self-supervised training from single-channel two-source speech mixtures. Our first method, mixture permutation invariant training (MixPIT), enables learning a neural network model which separates the underlying sources via a challenging proxy task without supervision from the reference sources. Our second method, cyclic mixture permutation invariant training (MixCycle), uses MixPIT as a building block in a cyclic fashion for continuous learning. MixCycle gradually converts the problem from separating mixtures of mixtures into separating single mixtures. We compare our methods to common supervised and unsupervised baselines: permutation invariant training with dynamic mixing (PIT-DM) and mixture invariant training (MixIT). We show that MixCycle outperforms MixIT and reaches a performance level very close to the supervised baseline (PIT-DM) while circumventing the over-separation issue of MixIT. Also, we propose a self-evaluation technique inspired by MixCycle that estimates model performance without utilizing any reference sources. We show that it yields results consistent with an evaluation on reference sources (LibriMix) and also with an informal listening test conducted on a real-life mixtures dataset (REAL-M).
dc.identifier.citation Karamatlı, E., & Kırbız, S. (2022). MixCycle: Unsupervised Speech Separation via Cyclic Mixture Permutation Invariant Training. IEEE Signal Processing Letters, 29, 2637-2641.
dc.identifier.doi 10.1109/LSP.2022.3232276
dc.identifier.issn 1070-9908
dc.identifier.issn 1558-2361
dc.identifier.scopus 2-s2.0-85146250664
dc.identifier.uri https://hdl.handle.net/20.500.11779/1989
dc.identifier.uri https://doi.org/10.1109/LSP.2022.3232276
dc.language.iso en
dc.publisher IEEE
dc.relation.ispartof IEEE Signal Processing Letters
dc.rights info:eu-repo/semantics/openAccess
dc.subject Self-supervised learning
dc.subject Time-domain analysis
dc.subject Unsupervised learning
dc.subject Training
dc.subject Source separation
dc.subject Optimized production technology
dc.subject Recording
dc.subject Blind source separation
dc.subject Deep learning
dc.subject Task analysis
dc.title MixCycle: Unsupervised Speech Separation via Cyclic Mixture Permutation Invariant Training
dc.type Article
dspace.entity.type Publication
gdc.author.institutional Kırbız, Serap
gdc.bip.impulseclass C5
gdc.bip.influenceclass C5
gdc.bip.popularityclass C5
gdc.coar.access open access
gdc.coar.type text::journal::journal article
gdc.collaboration.industrial false
gdc.description.department Faculty of Engineering, Department of Industrial Engineering
gdc.description.endpage 2641
gdc.description.publicationcategory Article - International Refereed Journal - Institutional Faculty Member
gdc.description.scopusquality Q1
gdc.description.startpage 2637
gdc.description.volume 29
gdc.description.woscitationindex Science Citation Index Expanded
gdc.description.wosquality Q2
gdc.identifier.openalex W4312540993
gdc.identifier.wos WOS:000910559500004
gdc.index.type WoS
gdc.index.type Scopus
gdc.oaire.diamondjournal false
gdc.oaire.impulse 2.0
gdc.oaire.influence 2.6886524E-9
gdc.oaire.isgreen true
gdc.oaire.keywords Signal Processing (eess.SP)
gdc.oaire.keywords FOS: Computer and information sciences
gdc.oaire.keywords Computer Science - Machine Learning
gdc.oaire.keywords Sound (cs.SD)
gdc.oaire.keywords deep learning
gdc.oaire.keywords unsupervised learning
gdc.oaire.keywords Unsupervised learning
gdc.oaire.keywords Computer Science - Sound
gdc.oaire.keywords Machine Learning (cs.LG)
gdc.oaire.keywords Time-domain analysis
gdc.oaire.keywords Audio and Speech Processing (eess.AS)
gdc.oaire.keywords Recording
gdc.oaire.keywords Task analysis
gdc.oaire.keywords self-supervised learning
gdc.oaire.keywords FOS: Electrical engineering, electronic engineering, information engineering
gdc.oaire.keywords Training
gdc.oaire.keywords Blind source separation
gdc.oaire.keywords Source separation
gdc.oaire.keywords Electrical Engineering and Systems Science - Signal Processing
gdc.oaire.keywords Optimized production technology
gdc.oaire.keywords Electrical Engineering and Systems Science - Audio and Speech Processing
gdc.oaire.popularity 4.058928E-9
gdc.oaire.publicfunded false
gdc.oaire.sciencefields 02 engineering and technology
gdc.oaire.sciencefields 0202 electrical engineering, electronic engineering, information engineering
gdc.openalex.collaboration National
gdc.openalex.fwci 1.16974372
gdc.openalex.normalizedpercentile 0.73
gdc.opencitations.count 2
gdc.plumx.crossrefcites 1
gdc.plumx.mendeley 7
gdc.plumx.scopuscites 14
gdc.publishedmonth December
gdc.relation.journal IEEE Signal Processing Letters
gdc.scopus.citedcount 14
gdc.virtual.author Kırbız, Serap
gdc.wos.citedcount 9
gdc.wos.collaboration Not conducted with international collaboration - NO
gdc.wos.documenttype article
gdc.wos.indexdate 2022
gdc.wos.publishedmonth December
gdc.yokperiod YÖK - 2022-23
relation.isAuthorOfPublication 552e4b0c-955f-4b93-925b-08cb2e6c5cc0
relation.isAuthorOfPublication.latestForDiscovery 552e4b0c-955f-4b93-925b-08cb2e6c5cc0
relation.isOrgUnitOfPublication de19334f-6a5b-4f7b-9410-9433c48d1e5a
relation.isOrgUnitOfPublication 0d54cd31-4133-46d5-b5cc-280b2c077ac3
relation.isOrgUnitOfPublication a6e60d5c-b0c7-474a-b49b-284dc710c078
relation.isOrgUnitOfPublication.latestForDiscovery de19334f-6a5b-4f7b-9410-9433c48d1e5a

Files

Original bundle

Name:
MixCycle_Unsupervised_Speech_Separation_via_Cyclic_Mixture_Permutation_Invariant_Training.pdf
Size:
604.92 KB
Format:
Adobe Portable Document Format
Description:
Full Text- Article

License bundle

Name:
license.txt
Size:
0 B
Format:
Item-specific license agreed upon to submission
Description: