MT-CMVAD: A Multi-Modal Transformer Framework for Cross-Modal Video Anomaly Detection

Video anomaly detection (VAD) faces significant challenges in multimodal semantic alignment and long-term temporal modeling within open surveillance scenarios. Existing methods are often plagued by modality discrepancies and fragmented temporal reasoning. To address these issues, we introduce MT-CMV...

Bibliographic Details
Main Authors:	Hantao Ding, Shengfeng Lou, Hairong Ye, Yanbing Chen
Format:	Article
Language:	English
Published:	MDPI AG 2025-06-01
Series:	Applied Sciences
Subjects:	multi-modal transformer; LoRA; video anomaly detection; self-attention mechanism; cross-modal learning
Online Access:	https://www.mdpi.com/2076-3417/15/12/6773

Similar Items