Industrial Engineering Department Collection

Permanent URI for this collection: https://hdl.handle.net/20.500.11779/1942

  • Article
    Citation - WoS: 1
    Facial Emotion Recognition Using Residual Neural Networks
    (Aves, 2024-11-08) Kırbız, Serap
    Facial emotion recognition (FER) has been an emerging research topic in recent years. Recent automatic FER systems generally apply deep learning methods and focus on two important issues: the lack of sufficient labeled training data and variations in images, such as illumination, pose, or expression-related differences among cultures. Although Convolutional Neural Networks (CNNs) are widely used in automatic FER, they become difficult to train when the number of layers is large. Therefore, a residual technique is applied to CNNs, and the resulting architecture is called a residual neural network. In this paper, an automatic FER method using residual networks with random data augmentation is proposed and evaluated on a merged FER dataset consisting of 41,598 facial images of size 48 × 48 pixels from seven basic emotion classes. Experimental results show that ResNet34 with data augmentation outperforms a CNN, reaching a classification accuracy of 81%.
  • Conference Object
    Citation - WoS: 4
    Citation - Scopus: 4
    Perceptual Coding-Based Informed Source Separation
    (IEEE, 2014) Girin, Laurent; Kırbız, Serap; Ozerov, Alexey; Liutkus, Antoine
    Informed Source Separation (ISS) techniques enable manipulation of the source signals that compose an audio mixture, based on a coder-decoder configuration. Provided the source signals are known at the encoder, low-bitrate side information is sent to the decoder, enabling efficient source separation. Recent research has focused on a Coding-based ISS framework, which has the advantage of encoding the desired audio objects while exploiting their mixture in an information-theoretic framework. Here, we show how the perceptual quality of the separated sources can be improved by inserting perceptual source coding techniques into this framework, achieving a continuum of optimal bitrate-perceptual distortion trade-offs.
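
The first abstract in this listing mentions a residual technique and random data augmentation. As a rough illustration only, the sketch below shows the general idea of a skip connection (the block outputs its input plus a learned correction) and of a random horizontal flip. It is not the paper's ResNet34 or its augmentation pipeline: the dense layers stand in for convolutions, and the names `residual_block`, `dense`, and `random_horizontal_flip` are hypothetical.

```python
import random


def relu(values):
    """Element-wise rectified linear unit."""
    return [max(0.0, v) for v in values]


def dense(values, weights):
    """Fully connected layer; each row of `weights` produces one output."""
    return [sum(w * v for w, v in zip(row, values)) for row in weights]


def residual_block(x, w1, w2):
    """Return relu(x + F(x)), where F is two dense layers with a ReLU between.

    Assumption: dense layers replace the convolutions of a real ResNet block.
    """
    correction = dense(relu(dense(x, w1)), w2)
    return relu([a + b for a, b in zip(x, correction)])


def random_horizontal_flip(image, rng, p=0.5):
    """Mirror every row of `image` with probability `p`; otherwise copy it."""
    if rng.random() < p:
        return [row[::-1] for row in image]
    return [row[:] for row in image]


if __name__ == "__main__":
    x = [1.0, 2.0, 3.0]
    zeros = [[0.0] * 3 for _ in range(3)]
    # With all-zero weights the correction vanishes and the input passes through.
    print(residual_block(x, zeros, zeros))

    rng = random.Random(0)
    image = [[float(col) for col in range(48)] for _ in range(48)]  # 48 x 48 toy image
    augmented = random_horizontal_flip(image, rng)
    print(len(augmented), len(augmented[0]))
```

With zero weights the block reduces to the identity on non-negative inputs, which is the property that lets additional residual layers be stacked without forcing each one to relearn its input.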