EmotionNet-X: An Optimized CNN Architecture for Robust Facial Emotion Analysis
Main Authors: | , , |
---|---|
Format: | Article |
Language: | English |
Published: | IEEE, 2025-01-01 |
Series: | IEEE Access |
Subjects: | |
Online Access: | https://ieeexplore.ieee.org/document/11037439/ |
Summary: | Facial emotions express people’s inner feelings. Emotion recognition (ER) is a computer’s ability to recognize those emotions by extracting characteristic features or expressions from a person’s face, so that the computer can interact with that person emotionally. Computer vision has grown rapidly in recent years, and facial emotion recognition (FER) has drawn the attention of the research community because of its potential utility; it also draws on several disciplines, including cognition, medicine, physiology, and psychology. Despite several publications on the topic, FER remains challenging because of variations in facial expressions, demographics (age, ethnicity), and imaging conditions (lighting, occlusion). Existing pretrained models also carry high computational costs, which limits real-time IoT deployment. Deep Neural Networks (DNNs), particularly Convolutional Neural Networks (CNNs), are widely used for FER, primarily because they extract features from images automatically and have demonstrated remarkable performance on image-based prediction tasks. We propose EmotionNet-X, a lightweight CNN architecture with 19.9M parameters and an inference time of 18 ms/image. Key innovations include a streamlined design (four convolutional layers, seven dropout layers) and batch normalization for robust feature learning. EmotionNet-X bridges accuracy and deployability, enabling cost-effective FER in IoT systems such as access control, authentication, real-time health monitoring, security systems, and live assistance. We compare the proposed model with various pre-trained models (VGG19, ResNet50V2, MobileNetV2, and EfficientNetB7) and with recently proposed state-of-the-art models. The model was evaluated on the public Cohn-Kanade (CK+) and FER2013 datasets and achieved 99.86% accuracy on CK+. |
---|---|
ISSN: | 2169-3536 |
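
The summary reports a parameter count and a per-image latency but does not list the layer widths. As a rough illustration of how such figures are derived, the sketch below counts trainable parameters for a small CNN with four convolutional layers and batch normalization, and converts a latency into throughput. The channel widths, 3x3 kernels, pooling schedule, and dense head are assumptions made for illustration only; they are not taken from the article, and the sketch makes no attempt to reproduce the reported 19.9M total.

```python
def conv2d_params(in_ch, out_ch, kernel=3, bias=True):
    """Trainable weights in a 2D convolution with a square kernel."""
    return kernel * kernel * in_ch * out_ch + (out_ch if bias else 0)


def batchnorm_params(channels):
    """Batch normalization learns one scale and one shift per channel."""
    return 2 * channels


def dense_params(in_features, out_features):
    """Weights plus biases of a fully connected layer."""
    return in_features * out_features + out_features


def images_per_second(latency_ms):
    """Throughput implied by a per-image latency in milliseconds."""
    return 1000.0 / latency_ms


# Hypothetical widths (not from the article): grayscale input, four conv layers.
channels = [1, 64, 128, 256, 512]
conv_total = sum(
    conv2d_params(i, o) + batchnorm_params(o)
    for i, o in zip(channels, channels[1:])
)

# Assumed 48x48 input halved by pooling after each conv layer: 48 -> 24 -> 12 -> 6 -> 3.
# Dropout layers add no trainable parameters, so they do not appear in the count.
flat_features = 3 * 3 * channels[-1]
head_total = dense_params(flat_features, 7)  # assumed seven emotion classes

print("conv + batch-norm parameters:", conv_total)
print("dense head parameters:", head_total)
print("throughput at 18 ms/image:", round(images_per_second(18), 1), "images/s")
```

In a count like this, the fully connected layers usually dominate the total once the flattened feature map is wide, which is one reason a network with few convolutional layers can still have many parameters.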