Synergy-CLIP: Extending CLIP With Multi-Modal Integration for Robust Representation Learning

Multi-modal representation learning has become a pivotal area in artificial intelligence, enabling the integration of diverse modalities such as vision, text, and audio to solve complex problems. However, existing approaches predominantly focus on bimodal interactions, such as image-text pairs, whic...

Bibliographic Details
Main Authors:	Sangyeon Cho, Jangyeong Jeon, Mingi Kim, Junyeong Kim
Format:	Article
Language:	English
Published:	IEEE 2025-01-01
Series:	IEEE Access
Subjects:	Multi-modal; multi-modal representation learning; missing modality; missing modality reconstruction; speech and multi-modality; vision and language
Online Access:	https://ieeexplore.ieee.org/document/10962132/

Similar Items

Image First or Text First? Optimising the Sequencing of Modalities in Large Language Model Prompting and Reasoning Tasks
by: Grant Wardle, et al.
Published: (2025-06-01)

DCLMA: Deep correlation learning with multi-modal attention for visual-audio retrieval
by: Jiwei Zhang, et al.
Published: (2025-09-01)

Review of Structural Modal Tracking in Operational Modal Analysis: Methods and Applications
by: Shenghui Fu, et al.
Published: (2025-06-01)

GSR-Fusion: A Deep Multimodal Fusion Architecture for Robust Sign Language Recognition Using RGB, Skeleton, and Graph-Based Modalities
by: Wuttichai Vijitkunsawat, et al.
Published: (2025-01-01)

RESEARCH OF MULTIMODAL PERCEPTION PROFILE AMONG STUDENTS
by: T. N. Bandurka
Published: (2015-07-01)

Video anomaly detection via cross-modal fusion and hyperbolic graph attention mechanism
by: JIANG Di, et al.
Published: (2025-06-01)

MT-CMVAD: A Multi-Modal Transformer Framework for Cross-Modal Video Anomaly Detection
by: Hantao Ding, et al.
Published: (2025-06-01)

The Notion of Subjective and Objective Modality in Language
by: T. A. Selezneva
Published: (2013-06-01)

Lexeme 'valjda' as an exponent of epistemic modality
by: Stojanović Milena S., et al.
Published: (2025-01-01)

ISSUES OF MODAL VERBS TRANSLATION FROM ENGLISH INTO UKRAINIAN
by: Natalia I. Talan
Published: (2020-12-01)

Modal Passport Concept for Enhanced Non-Destructive Monitoring and Diagnostics of Wind Turbine Blades
by: Aleksey Mironov, et al.
Published: (2025-04-01)

Cross-Modal Fake News Detection Method Based on Multi-Level Fusion Without Evidence
by: Ping He, et al.
Published: (2025-07-01)

In Situ and Real‐Time Multi‐Modality Imaging Guided Orderly Triple‐Therapy of Tumors with a Multifunctional Nanodrug
by: Chaoyi Yang, et al.
Published: (2025-07-01)

FORMATION OF INTERMODAL REPRESENTATIONS ON THE STAGES OF SCHOOL DEVELOPMENT
by: V. P. Peskov
Published: (2015-07-01)

Distribution-Free Normal Modal Logics
by: Chrysafis Hartonas
Published: (2025-04-01)

Toward unsupervised building extraction from very high-resolution remote sensing images using SAM and CLIP
by: Chenxiao Zhang, et al.
Published: (2025-12-01)

ChiralCat: Molecular chirality classification with enhanced spatial representation using learnable queries
by: Yichuan Peng, et al.
Published: (2025-12-01)

Functioning of Epistemic Modal Modifiers in Internet Discourse (by Material of English-Language Internet-Forums)
by: A. V. Troshina
Published: (2017-02-01)

Modalization of Speech Actions as the Basis of the Metaphoric Transfer (on the Example of German Economic Discourse)
by: I. A. Shipova, et al.
Published: (2021-09-01)

Hedging modal adverbs in Slovenian academic discourse
by: Jakob Lenardič, et al.
Published: (2021-07-01)

Investigating Algerian Use of English Modals: The Case of Second Year Master Students of English at the University “Frères Mentouri”, Constantine 1
by: Salima SELMEN
Published: (2018-06-01)

The Role of the Visual Versus Verbal Modality in Learning Novel Verbs
by: Maria Luisa Lorusso, et al.
Published: (2025-05-01)

MPVT: An Efficient Multi-Modal Prompt Vision Tracker for Visual Target Tracking
by: Jianyu Xie, et al.
Published: (2025-07-01)

Prediction of Alzheimer’s Disease Based on Multi-Modal Domain Adaptation
by: Binbin Fu, et al.
Published: (2025-06-01)

MODAL EPISTEMOLOGY, REALISM ABOUT MODALITY, AND THE IMAGINATION
by: Mihai RUSU
Published: (2018-12-01)

The Problem of Logico-Philosophical Origins of the Category of Modality
by: T. A. Selezneva
Published: (2013-04-01)

PR-CLIP: Cross-Modal Positional Reconstruction for Remote Sensing Image–Text Retrieval
by: Jihong Guan, et al.
Published: (2025-06-01)

The Modal Particle "-ā", a New Member of Modal Elements in Persian Language
by: Morteza Dastlan
Published: (2025-02-01)

Tracing truth: dynamic temporal networks for multi-modal fake news detection
by: Jiaen Hu, et al.
Published: (2025-07-01)

Text-Guided Visual Representation Learning via Cross-Modal Fusion for Person Re-Identification
by: Ge Cao, et al.
Published: (2025-01-01)

GaitCSF: Multi-Modal Gait Recognition Network Based on Channel Shuffle Regulation and Spatial-Frequency Joint Learning
by: Siwei Wei, et al.
Published: (2025-06-01)

Physically crosslinked gelatin bio‐inks with enhanced printability, degradation and mechanical robustness for multi‐modal bioprinting
by: Wei Long Ng, et al.
Published: (2025-07-01)

Modal Shapes Selection Criterion for Modal Reconstruction Aimed at Structural Health Monitoring
by: Gabriele Liuzzo, et al.
Published: (2025-03-01)

Multistage Training and Fusion Method for Imbalanced Multimodal UAV Remote Sensing Classification
by: Shihao Wang, et al.
Published: (2025-01-01)

Analysis of Asymmetric Structures with Triple Modal Reservation
by: S. R. Morozov, et al.
Published: (2025-05-01)

Linguistic Means of Expressing Objective Epistemic Modality in Scientific Discourse
by: A. V. Sakharova
Published: (2020-04-01)

Deep multi-modal imaging temperature measurement method for detecting temperature rise in electrical equipment faults
by: Jinxuan Wen, et al.
Published: (2025-07-01)

SMM-POD: Panoramic 3D Object Detection via Spherical Multi-Stage Multi-Modal Fusion
by: Jinghan Zhang, et al.
Published: (2025-06-01)

Swin Transformer With Late-Fusion Feature Aggregation for Multi-Modal Vehicle Reidentification
by: Reza Fuad Rachmadi, et al.
Published: (2025-01-01)

A progressive attention-based cross-modal fusion network for cardiovascular disease detection using synchronized electrocardiogram and phonocardiogram signals
by: Wei Peng Li, et al.
Published: (2025-07-01)