Dual-Branch Multi-Dimensional Attention Mechanism for Joint Facial Expression Detection and Classification

This paper addresses the central issue arising from the simultaneous detection and classification (SDAC) of facial expressions, namely, balancing the competing demands of good global features for detection and fine features for facial expression classification, by replacing the feature extraction part of the “neck” network in the fea...

Bibliographic Details
Main Authors: Cheng Peng, Bohao Li, Kun Zou, Bowen Zhang, Genan Dai, Ah Chung Tsoi
Format: Article
Language: English
Published: MDPI AG 2025-06-01
Series: Sensors
Subjects: deep learning; facial expression recognition; batch attention; attention fusion
Online Access: https://www.mdpi.com/1424-8220/25/12/3815
collection DOAJ
description This paper addresses the central issue arising in the simultaneous detection and classification (SDAC) of facial expressions: balancing the competing demands of good global features for detection and fine features for classification. It does so by replacing the feature extraction part of the “neck” network in the feature pyramid network of the You Only Look Once X (YOLOX) framework with a novel architecture involving three attention mechanisms (batch, channel, and neighborhood), which respectively explore the three input dimensions (batch, channel, and spatial). First, correlations across a batch of images in each of the two incoming paths are extracted by a self-attention mechanism in the batch dimension. The two paths are then fused to consolidate their information and split again into two separate paths. Next, the information along the channel dimension is extracted using a generalized form of channel attention, an adaptive graph channel attention, which gives each element of the incoming signal a weight adapted to that signal. The combination of these two paths, together with two skip connections from the input of the batch attention to the output of the adaptive channel attention, then passes into a residual network with neighborhood attention to extract fine features in the spatial dimension. Experiments show that this dual-path architecture achieves a better balance between the competing demands of an SDAC problem than competing approaches. Ablation studies determine the relative importance of the three attention mechanisms. Competitive results are obtained on two non-aligned facial expression recognition datasets, RAF-DB and SFEW, when compared with other state-of-the-art methods.
format Article
id doaj-art-cabb8acd3dbf45f686acf99d14016ca2
institution Matheson Library
issn 1424-8220
spelling doaj-art-cabb8acd3dbf45f686acf99d14016ca2; 2025-06-25T14:25:50Z; eng; MDPI AG; Sensors; ISSN 1424-8220; 2025-06-01; vol. 25, no. 12, article 3815; DOI 10.3390/s25123815
Dual-Branch Multi-Dimensional Attention Mechanism for Joint Facial Expression Detection and Classification
Cheng Peng (School of Computing, Zhongshan Institute, University of Electronic Science and Technology of China, Zhongshan 528402, China)
Bohao Li (School of Computer Science and Engineering, University of Electronic Science and Technology of China, Chengdu 610000, China)
Kun Zou (School of Computing, Zhongshan Institute, University of Electronic Science and Technology of China, Zhongshan 528402, China)
Bowen Zhang (College of Big Data and Internet, Shenzhen Technology University, Shenzhen 518118, China)
Genan Dai (College of Big Data and Internet, Shenzhen Technology University, Shenzhen 518118, China)
Ah Chung Tsoi (School of Computing and Information Technology, University of Wollongong, Wollongong, NSW 2522, Australia)
Keywords: deep learning; facial expression recognition; batch attention; attention fusion
https://www.mdpi.com/1424-8220/25/12/3815
topic deep learning
facial expression recognition
batch attention
attention fusion
url https://www.mdpi.com/1424-8220/25/12/3815
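Illustrative note: the description above says that correlations across a batch of images are extracted by self-attention in the batch dimension. The sketch below shows that general idea only and is not the authors' implementation. It assumes each image has already been reduced to a flat feature vector, uses the inputs directly as queries, keys, and values (no learned projections), and adds a residual connection; the function names `softmax` and `batch_self_attention` are hypothetical.

```python
import math


def softmax(row):
    """Numerically stable softmax over a list of floats."""
    peak = max(row)
    exps = [math.exp(value - peak) for value in row]
    total = sum(exps)
    return [e / total for e in exps]


def batch_self_attention(batch):
    """Mix information across the images of one batch.

    batch: list of B feature vectors (lists of D floats), one per image.
    Assumption: queries, keys, and values are the raw inputs; a real
    model would normally apply learned projections first.
    Returns (output, weights), where weights[i][j] is how strongly
    image i attends to image j.
    """
    dim = len(batch[0])
    scale = math.sqrt(dim)

    # Scaled dot-product similarity between every pair of images.
    weights = []
    for query in batch:
        scores = [sum(q * k for q, k in zip(query, key)) / scale for key in batch]
        weights.append(softmax(scores))

    # Each output is the attention-weighted mean of all images in the batch.
    mixed = [
        [sum(w * vec[d] for w, vec in zip(row, batch)) for d in range(dim)]
        for row in weights
    ]

    # Residual connection: keep the original features alongside the mix.
    output = [[x + m for x, m in zip(orig, mix)] for orig, mix in zip(batch, mixed)]
    return output, weights


if __name__ == "__main__":
    features = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
    out, attn = batch_self_attention(features)
    for row in attn:
        print([round(w, 3) for w in row])
```

Because the attention weights are computed between images rather than between pixels or channels, the output for one image depends on which other images share its batch; this is the property that distinguishes the batch dimension from the channel and spatial dimensions named in the description.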