Industrial Engineering Department Collection

Permanent URI for this collection: https://hdl.handle.net/20.500.11779/1942

Now showing 1 - 2 of 2
  • Conference Object
    Dialogue Enhancement Using Kernel Additive Modelling
    (Institute of Electrical and Electronics Engineers Inc., 2015-05-01) Liutkus, A.; Kırbız, Serap; Cemgil, A. Taylan
    Finding the right balance between dialogue and ambient sources is a major problem for sound engineers, and a poor balance is one of the main causes of audience complaints. Listeners want to adjust the sound balance according to their personal preferences, listening environment, and hearing. This work proposes a method for enhancing the dialogue in stereo recordings that contain more than one source. Kernel additive modelling (KAM), which has been used successfully in sound source separation, is used to extract the dialogue and the ambient sources from movie soundtracks. The separated dialogue and ambient sources can later be upmixed by the user to create a personal mix. Separation performance is evaluated on signals generated by mixing sources taken from dialogue-only and music-only segments of movies. The results show that the KAM-based method can be used successfully for dialogue enhancement. © 2015 IEEE.
  • Article
    Citation - WoS: 9
    Citation - Scopus: 15
    Mixcycle: Unsupervised Speech Separation Via Cyclic Mixture Permutation Invariant Training
    (IEEE, 2022) Karamatlı, Ertuğ; Kırbız, Serap
    We introduce two unsupervised source separation methods, which involve self-supervised training from single-channel two-source speech mixtures. Our first method, mixture permutation invariant training (MixPIT), enables learning a neural network model that separates the underlying sources via a challenging proxy task, without supervision from the reference sources. Our second method, cyclic mixture permutation invariant training (MixCycle), uses MixPIT as a building block in a cyclic fashion for continuous learning. MixCycle gradually converts the problem from separating mixtures of mixtures into separating single mixtures. We compare our methods to common supervised and unsupervised baselines: permutation invariant training with dynamic mixing (PIT-DM) and mixture invariant training (MixIT). We show that MixCycle outperforms MixIT and reaches a performance level very close to that of the supervised baseline (PIT-DM) while circumventing the over-separation issue of MixIT. We also propose a self-evaluation technique inspired by MixCycle that estimates model performance without using any reference sources. We show that it yields results consistent with an evaluation on reference sources (LibriMix) and with an informal listening test conducted on a dataset of real-life mixtures (REAL-M).
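
The MixCycle abstract builds on permutation invariant training (PIT). As a rough illustration of that general idea only, the sketch below scores a set of estimated signals against a set of target signals under every possible ordering and keeps the best one. It is not the authors' MixPIT or MixCycle implementation: the abstract does not specify the loss, so mean squared error is an assumption, and the names `mse` and `pit_loss` are hypothetical.

```python
from itertools import permutations


def mse(a, b):
    """Mean squared error between two equal-length signals (assumed loss)."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


def pit_loss(estimates, targets):
    """Return (lowest mean loss, best permutation) over all target orderings.

    best_perm[i] is the index of the target matched to estimates[i].
    """
    best_loss, best_perm = float("inf"), None
    for perm in permutations(range(len(targets))):
        # Average the per-signal loss for this particular assignment.
        loss = sum(
            mse(estimates[i], targets[p]) for i, p in enumerate(perm)
        ) / len(estimates)
        if loss < best_loss:
            best_loss, best_perm = loss, perm
    return best_loss, best_perm


# Toy example: the estimates match the targets, but in swapped order.
estimates = [[0.0, 1.0], [1.0, 0.0]]
targets = [[1.0, 0.0], [0.0, 1.0]]
loss, perm = pit_loss(estimates, targets)
print(loss, perm)  # 0.0 (1, 0)
```

Because the loss is taken over the best assignment, the model is not penalized for emitting the sources in a different order than the targets. The exhaustive search grows factorially with the number of signals, which is acceptable for the two-source case the abstract describes.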