U-Net-based VGG19 model for improved facial expression recognition

Bibliographic Details
Main Authors: Xiaohu ZHAO, Jingyi ZHANG, Mingzhi JIAO, Lixun XIE, Lanfei WANG, Weiqing SUN, Di ZHANG
Format: Article
Language: Chinese
Published: Science Press 2025-06-01
Series: 工程科学学报 (Chinese Journal of Engineering)
Subjects: facial expression recognition; deep learning; convolutional neural network; emotion classification; VGG19
Online Access:http://cje.ustb.edu.cn/article/doi/10.13374/j.issn2095-9389.2024.07.24.002
author Xiaohu ZHAO
Jingyi ZHANG
Mingzhi JIAO
Lixun XIE
Lanfei WANG
Weiqing SUN
Di ZHANG
collection DOAJ
description In response to the challenges faced by traditional facial expression recognition techniques, such as insufficient focus on key channel features, large parameter counts, and low recognition accuracy, this study proposes an improved VGG19 model that incorporates concepts from the U-Net architecture. While maintaining the deep feature extraction capability for which VGG19 is well regarded, the model employs specially designed convolutional layers and skip connections. Feature cropping and stitching allow the model to efficiently integrate multi-scale features, enhancing the robustness and effectiveness of facial expression recognition. This design ensures the seamless integration of features from different layers, which is crucial for accurate recognition because it maximizes the information retained from each layer.

Additionally, the paper introduces an improved SEAttention module designed specifically for facial expression recognition. Its key change is replacing the module's original activation function with the Mish activation function, helping the module dynamically reweight channels so that important features are emphasized while redundant ones are suppressed. This selective focus significantly speeds up network convergence and improves the model's ability to detect subtle changes in facial expressions, which is especially valuable in nuanced emotional contexts.

Furthermore, the fully connected layers are modified by substituting the first two with convolutional layers while retaining the final fully connected layer. This reduces the node counts in these layers from [4096, 4096, 1000] to just [7], directly addressing the large parameter size of the VGG19 network. The modification also improves the model's resistance to overfitting, making it more robust on new data.

Extensive experiments on the FER2013 and CK+ datasets show that the improved VGG19 model raises recognition accuracy by 1.58% and 4.04%, respectively, compared with the original network. An evaluation of parameter efficiency indicates a substantial reduction in the overall parameter count without compromising performance. This balance between model complexity and accuracy highlights the method's practical applicability in real-world facial expression recognition scenarios and allows deployment in environments with limited computational resources.

In conclusion, integrating the U-Net architecture and the enhanced SEAttention module into the VGG19 network yields significant advances in facial expression recognition. The improved model not only boosts feature extraction and fusion but is also adept at addressing the pressing problems of parameter size and computational efficiency. These innovations contribute to state-of-the-art performance in facial expression recognition, making the proposed method a meaningful contribution to computer vision and deep learning. Its robustness and efficiency highlight its potential for applications requiring accurate real-time facial expression analysis, such as human-computer interaction, security systems, and emotion-driven computing. Future work will explore the model's adaptability to other datasets and additional optimization techniques, aiming to further enhance performance and broaden applicability.
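Since this record only summarizes the architecture, here is a minimal PyTorch sketch of the "feature cropping and stitching" step described above. The center-crop-then-concatenate pattern follows the standard U-Net skip connection; its exact placement inside VGG19 is an assumption, not taken from the paper:

```python
import torch

def crop_and_concat(enc: torch.Tensor, dec: torch.Tensor) -> torch.Tensor:
    """Center-crop the (larger) encoder feature map to the decoder map's
    spatial size, then concatenate along channels (U-Net-style skip)."""
    dh = enc.size(2) - dec.size(2)   # height difference
    dw = enc.size(3) - dec.size(3)   # width difference
    top, left = dh // 2, dw // 2
    enc = enc[:, :, top:top + dec.size(2), left:left + dec.size(3)]
    return torch.cat([enc, dec], dim=1)  # fuse multi-scale features
```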
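The improved SEAttention module is described only at a high level, so the following is a hedged reconstruction: a standard squeeze-and-excitation block with `nn.Mish` substituted for the usual ReLU in the bottleneck. The reduction ratio of 16 and the exact position of Mish are assumptions:

```python
import torch
import torch.nn as nn

class SEAttentionMish(nn.Module):
    """Squeeze-and-excitation channel attention with Mish in place of
    the original activation (hypothetical reconstruction)."""

    def __init__(self, channels: int, reduction: int = 16):
        super().__init__()
        self.pool = nn.AdaptiveAvgPool2d(1)  # squeeze: global average per channel
        self.fc = nn.Sequential(
            nn.Linear(channels, channels // reduction, bias=False),
            nn.Mish(),                        # Mish replaces the usual ReLU here
            nn.Linear(channels // reduction, channels, bias=False),
            nn.Sigmoid(),                     # per-channel weights in (0, 1)
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, c, _, _ = x.shape
        w = self.fc(self.pool(x).view(b, c)).view(b, c, 1, 1)
        return x * w                          # emphasize informative channels
```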
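Finally, the reduction of the classifier from [4096, 4096, 1000] nodes to a single 7-node output could look like the sketch below, where convolutions stand in for the first two fully connected layers. The channel counts and kernel sizes here are illustrative guesses; only the 7-class output follows directly from the abstract:

```python
import torch.nn as nn

NUM_CLASSES = 7  # the seven basic expressions in FER2013/CK+

# Original VGG19 head: Linear(512*7*7, 4096) -> Linear(4096, 4096) -> Linear(4096, 1000).
# Sketch of the modified head: convolutions replace the first two fully
# connected layers, and only a small final fully connected layer remains.
head = nn.Sequential(
    nn.Conv2d(512, 512, kernel_size=7),  # 7x7 conv collapses the 7x7 feature map
    nn.Mish(),
    nn.Conv2d(512, 256, kernel_size=1),  # 1x1 conv in place of the second FC layer
    nn.Mish(),
    nn.Flatten(),
    nn.Linear(256, NUM_CLASSES),         # retained final fully connected layer
)
```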
format Article
id doaj-art-f3cf70e60d764e48a1afc754ec7ffb3d
institution Matheson Library
issn 2095-9389
language zho
publishDate 2025-06-01
publisher Science Press
record_format Article
series 工程科学学报
citation 工程科学学报 (Chinese Journal of Engineering), Vol. 47, No. 6, pp. 1272–1284, published 2025-06-01 by Science Press. DOI: 10.13374/j.issn2095-9389.2024.07.24.002 (article no. 240724-0002)
affiliation All seven authors: School of Information and Control Engineering, China University of Mining and Technology, Xuzhou 221008, China
title U-Net-based VGG19 model for improved facial expression recognition
topic facial expression recognition
deep learning
convolutional neural network
emotion classification
vgg19
url http://cje.ustb.edu.cn/article/doi/10.13374/j.issn2095-9389.2024.07.24.002