DepressionMIGNN: A Multiple-Instance Learning-Based Depression Detection Model with Graph Neural Networks
The global prevalence of depression necessitates the application of technological solutions, particularly sensor-based systems, to augment scarce resources for early diagnostic purposes. In this study, we use benchmark datasets that contain multimodal data including video, audio, and transcribed tex...
Saved in:
Main Authors: | , , , , , , |
---|---|
Format: | Article |
Language: | English |
Published: |
MDPI AG
2025-07-01
|
Series: | Sensors |
Subjects: | |
Online Access: | https://www.mdpi.com/1424-8220/25/14/4520 |
Tags: |
Add Tag
No Tags, Be the first to tag this record!
|
Summary: | The global prevalence of depression necessitates the application of technological solutions, particularly sensor-based systems, to augment scarce resources for early diagnostic purposes. In this study, we use benchmark datasets that contain multimodal data including video, audio, and transcribed text. To address depression detection as a chronic long-term disorder reflected by temporal behavioral patterns, we propose a novel framework that segments videos into utterance-level instances using GRU for contextual representation, and then constructs graphs where utterance embeddings serve as nodes connected through dual relationships capturing both chronological development and intermittent relevant information. Graph neural networks are employed to learn multi-dimensional edge relationships and align multimodal representations across different temporal dependencies. Our approach achieves superior performance with an MAE of 5.25 and RMSE of 6.75 on AVEC2014, and CCC of 0.554 and RMSE of 4.61 on AVEC2019, demonstrating significant improvements over existing methods that focus primarily on momentary expressions. |
---|---|
ISSN: | 1424-8220 |