Multimodal emotion recognition method in complex dynamic scenes
Main Authors: , , , ,
Format: Article
Language: English
Published: KeAi Communications Co., Ltd., 2025-05-01
Series: Journal of Information and Intelligence
Subjects:
Online Access: http://www.sciencedirect.com/science/article/pii/S2949715925000046
Summary: Multimodal emotion recognition technology leverages the power of deep learning to address advanced visual and emotional tasks. While generic deep networks can handle simple emotion recognition tasks, their generalization capability in complex and noisy environments, such as multi-scene outdoor settings, remains limited. To overcome these challenges, this paper proposes a novel multimodal emotion recognition framework. First, we develop a robust network architecture based on the T5-small model, designed for dynamic-static fusion in complex scenarios, effectively mitigating the impact of noise. Second, we introduce a dynamic-static cross fusion network (D-SCFN) to enhance the integration and extraction of dynamic and static information, embedding it seamlessly within the T5 framework. Finally, we design and evaluate three distinct multi-task analysis frameworks to explore dependencies among tasks. The experimental results demonstrate that our model significantly outperforms other existing models, showcasing exceptional stability and remarkable adaptability to complex and dynamic scenarios.
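The summary describes cross fusion of dynamic (time-varying) and static features. As an illustration only, not the paper's actual D-SCFN architecture, the following minimal NumPy sketch shows one direction of such a cross-fusion step: a static feature vector acts as a query that attends over a dynamic feature sequence, and the attended context is concatenated back onto the static stream. All names, shapes, and weight matrices here are assumptions for illustration.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def cross_fuse(dynamic, static, W_q, W_k, W_v):
    """Hypothetical one-directional dynamic-static cross fusion:
    the static vector queries the dynamic sequence (scaled dot-product
    attention), returning a dynamic context vector for the static stream."""
    q = static @ W_q               # (d,)   query from static features
    k = dynamic @ W_k              # (T, d) keys from dynamic features
    v = dynamic @ W_v              # (T, d) values from dynamic features
    attn = softmax(k @ q / np.sqrt(q.shape[-1]))  # (T,) attention weights
    return attn @ v                # (d,)   attended dynamic context

rng = np.random.default_rng(0)
d, T = 8, 5
dynamic = rng.normal(size=(T, d))  # e.g. per-frame video features (assumed)
static = rng.normal(size=(d,))     # e.g. scene-level image features (assumed)
W = [rng.normal(size=(d, d)) for _ in range(3)]

# Fused representation: static features plus their dynamic context
fused = np.concatenate([static, cross_fuse(dynamic, static, *W)])
print(fused.shape)  # (16,)
```

In the paper's framework this kind of fused representation would then be embedded within the T5 pipeline; the exact embedding and the full bidirectional design of the D-SCFN are detailed in the article linked above.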
ISSN: 2949-7159