Multimodal emotion recognition method in complex dynamic scenes
Main Authors: , , , ,
Format: Article
Language: English
Published: KeAi Communications Co., Ltd., 2025-05-01
Series: Journal of Information and Intelligence
Subjects:
Online Access: http://www.sciencedirect.com/science/article/pii/S2949715925000046
Summary: Multimodal emotion recognition technology leverages the power of deep learning to address advanced visual and emotional tasks. While generic deep networks can handle simple emotion recognition tasks, their generalization capability in complex and noisy environments, such as multi-scene outdoor settings, remains limited. To overcome these challenges, this paper proposes a novel multimodal emotion recognition framework. First, we develop a robust network architecture based on the T5-small model, designed for dynamic-static fusion in complex scenarios, effectively mitigating the impact of noise. Second, we introduce a dynamic-static cross fusion network (D-SCFN) to enhance the integration and extraction of dynamic and static information, embedding it seamlessly within the T5 framework. Finally, we design and evaluate three distinct multi-task analysis frameworks to explore dependencies among tasks. The experimental results demonstrate that our model significantly outperforms other existing models, showcasing exceptional stability and remarkable adaptability to complex and dynamic scenarios.
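The summary describes cross fusion of dynamic (time-varying) and static features. As an illustration only, not the paper's actual D-SCFN architecture, the following minimal NumPy sketch shows one direction of such a cross-fusion step: a static feature vector acts as a query that attends over a dynamic feature sequence, and the attended context is concatenated back onto the static stream. All names, shapes, and weight matrices here are assumptions for illustration.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def cross_fuse(dynamic, static, W_q, W_k, W_v):
    """Hypothetical one-directional dynamic-static cross fusion:
    the static vector queries the dynamic sequence (scaled dot-product
    attention), returning a dynamic context vector for the static stream."""
    q = static @ W_q               # (d,)   query from static features
    k = dynamic @ W_k              # (T, d) keys from dynamic features
    v = dynamic @ W_v              # (T, d) values from dynamic features
    attn = softmax(k @ q / np.sqrt(q.shape[-1]))  # (T,) attention weights
    return attn @ v                # (d,)   attended dynamic context

rng = np.random.default_rng(0)
d, T = 8, 5
dynamic = rng.normal(size=(T, d))  # e.g. per-frame video features (assumed)
static = rng.normal(size=(d,))     # e.g. scene-level image features (assumed)
W = [rng.normal(size=(d, d)) for _ in range(3)]

# Fused representation: static features plus their dynamic context
fused = np.concatenate([static, cross_fuse(dynamic, static, *W)])
print(fused.shape)  # (16,)
```

In the paper's framework this kind of fused representation would then be embedded within the T5 pipeline; the exact embedding and the full bidirectional design of the D-SCFN are detailed in the article linked above.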
ISSN: 2949-7159