U-Net-based VGG19 model for improved facial expression recognition

Bibliographic Details
Main Authors: Xiaohu ZHAO, Jingyi ZHANG, Mingzhi JIAO, Lixun XIE, Lanfei WANG, Weiqing SUN, Di ZHANG
Format: Article
Language: Chinese
Published: Science Press 2025-06-01
Series: 工程科学学报 (Chinese Journal of Engineering)
Subjects: facial expression recognition; deep learning; convolutional neural network; emotion classification; VGG19
Online Access:http://cje.ustb.edu.cn/article/doi/10.13374/j.issn2095-9389.2024.07.24.002
author Xiaohu ZHAO
Jingyi ZHANG
Mingzhi JIAO
Lixun XIE
Lanfei WANG
Weiqing SUN
Di ZHANG
collection DOAJ
description In response to the challenges faced by traditional facial expression recognition techniques, such as insufficient focus on key channel features, large parameter counts, and low recognition accuracy, this study proposes an improved VGG19 model that incorporates concepts from the U-Net architecture. While maintaining the deep feature extraction capability for which VGG19 is well regarded, the model employs specially designed convolutional layers and skip connections. Feature cropping and stitching allow the model to efficiently integrate multi-scale features, enhancing the robustness and effectiveness of facial expression recognition. This design ensures the seamless integration of features from different layers, which is crucial for accurate recognition because it maximizes the information retained from each layer.

Additionally, the paper introduces an improved SEAttention module designed specifically for facial expression recognition. Its key change is replacing the module's original activation function with the Mish activation function, helping the module dynamically reweight channels so that important features are emphasized while redundant ones are suppressed. This selective focus significantly speeds up network convergence and improves the model's ability to detect subtle changes in facial expressions, which is especially valuable in nuanced emotional contexts.

Furthermore, the fully connected layers are modified by substituting the first two with convolutional layers while retaining the final fully connected layer. This reduces the node counts in these layers from [4096, 4096, 1000] to just [7], directly addressing the large parameter size of the VGG19 network. The modification also improves the model's resistance to overfitting, making it more robust on new data.

Extensive experiments on the FER2013 and CK+ datasets show that the improved VGG19 model raises recognition accuracy by 1.58% and 4.04%, respectively, compared with the original network. An evaluation of parameter efficiency indicates a substantial reduction in the overall parameter count without compromising performance. This balance between model complexity and accuracy highlights the method's practical applicability in real-world facial expression recognition scenarios and allows deployment in environments with limited computational resources.

In conclusion, integrating the U-Net architecture and the enhanced SEAttention module into the VGG19 network yields significant advances in facial expression recognition. The improved model not only boosts feature extraction and fusion but is also adept at addressing the pressing problems of parameter size and computational efficiency. These innovations contribute to state-of-the-art performance in facial expression recognition, making the proposed method a meaningful contribution to computer vision and deep learning. Its robustness and efficiency highlight its potential for applications requiring accurate real-time facial expression analysis, such as human-computer interaction, security systems, and emotion-driven computing. Future work will explore the model's adaptability to other datasets and additional optimization techniques, aiming to further enhance performance and broaden applicability.
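Since this record only summarizes the architecture, here is a minimal PyTorch sketch of the "feature cropping and stitching" step described above. The center-crop-then-concatenate pattern follows the standard U-Net skip connection; its exact placement inside VGG19 is an assumption, not taken from the paper:

```python
import torch

def crop_and_concat(enc: torch.Tensor, dec: torch.Tensor) -> torch.Tensor:
    """Center-crop the (larger) encoder feature map to the decoder map's
    spatial size, then concatenate along channels (U-Net-style skip)."""
    dh = enc.size(2) - dec.size(2)   # height difference
    dw = enc.size(3) - dec.size(3)   # width difference
    top, left = dh // 2, dw // 2
    enc = enc[:, :, top:top + dec.size(2), left:left + dec.size(3)]
    return torch.cat([enc, dec], dim=1)  # fuse multi-scale features
```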
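The improved SEAttention module is described only at a high level, so the following is a hedged reconstruction: a standard squeeze-and-excitation block with `nn.Mish` substituted for the usual ReLU in the bottleneck. The reduction ratio of 16 and the exact position of Mish are assumptions:

```python
import torch
import torch.nn as nn

class SEAttentionMish(nn.Module):
    """Squeeze-and-excitation channel attention with Mish in place of
    the original activation (hypothetical reconstruction)."""

    def __init__(self, channels: int, reduction: int = 16):
        super().__init__()
        self.pool = nn.AdaptiveAvgPool2d(1)  # squeeze: global average per channel
        self.fc = nn.Sequential(
            nn.Linear(channels, channels // reduction, bias=False),
            nn.Mish(),                        # Mish replaces the usual ReLU here
            nn.Linear(channels // reduction, channels, bias=False),
            nn.Sigmoid(),                     # per-channel weights in (0, 1)
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, c, _, _ = x.shape
        w = self.fc(self.pool(x).view(b, c)).view(b, c, 1, 1)
        return x * w                          # emphasize informative channels
```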
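Finally, the reduction of the classifier from [4096, 4096, 1000] nodes to a single 7-node output could look like the sketch below, where convolutions stand in for the first two fully connected layers. The channel counts and kernel sizes here are illustrative guesses; only the 7-class output follows directly from the abstract:

```python
import torch.nn as nn

NUM_CLASSES = 7  # the seven basic expressions in FER2013/CK+

# Original VGG19 head: Linear(512*7*7, 4096) -> Linear(4096, 4096) -> Linear(4096, 1000).
# Sketch of the modified head: convolutions replace the first two fully
# connected layers, and only a small final fully connected layer remains.
head = nn.Sequential(
    nn.Conv2d(512, 512, kernel_size=7),  # 7x7 conv collapses the 7x7 feature map
    nn.Mish(),
    nn.Conv2d(512, 256, kernel_size=1),  # 1x1 conv in place of the second FC layer
    nn.Mish(),
    nn.Flatten(),
    nn.Linear(256, NUM_CLASSES),         # retained final fully connected layer
)
```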
format Article
id doaj-art-f3cf70e60d764e48a1afc754ec7ffb3d
institution Matheson Library
issn 2095-9389
language zho
publishDate 2025-06-01
publisher Science Press
record_format Article
series 工程科学学报
citation 工程科学学报 (Chinese Journal of Engineering), Vol. 47, No. 6, pp. 1272–1284, published 2025-06-01 by Science Press. DOI: 10.13374/j.issn2095-9389.2024.07.24.002 (article no. 240724-0002)
affiliation All seven authors: School of Information and Control Engineering, China University of Mining and Technology, Xuzhou 221008, China
title U-Net-based VGG19 model for improved facial expression recognition
topic facial expression recognition
deep learning
convolutional neural network
emotion classification
vgg19
url http://cje.ustb.edu.cn/article/doi/10.13374/j.issn2095-9389.2024.07.24.002