Synergy-CLIP: Extending CLIP With Multi-Modal Integration for Robust Representation Learning
Multi-modal representation learning has become a pivotal area in artificial intelligence, enabling the integration of diverse modalities such as vision, text, and audio to solve complex problems. However, existing approaches predominantly focus on bimodal interactions, such as image-text pairs, whic...
Main Authors: Sangyeon Cho, Jangyeong Jeon, Mingi Kim, Junyeong Kim
Format: Article
Language: English
Published: IEEE, 2025-01-01
Series: IEEE Access
Online Access: https://ieeexplore.ieee.org/document/10962132/
Similar Items
- Image First or Text First? Optimising the Sequencing of Modalities in Large Language Model Prompting and Reasoning Tasks
  by: Grant Wardle, et al. Published: (2025-06-01)
- DCLMA: Deep correlation learning with multi-modal attention for visual-audio retrieval
  by: Jiwei Zhang, et al. Published: (2025-09-01)
- Review of Structural Modal Tracking in Operational Modal Analysis: Methods and Applications
  by: Shenghui Fu, et al. Published: (2025-06-01)
- GSR-Fusion: A Deep Multimodal Fusion Architecture for Robust Sign Language Recognition Using RGB, Skeleton, and Graph-Based Modalities
  by: Wuttichai Vijitkunsawat, et al. Published: (2025-01-01)
- Research of Multimodal Perception Profile Among Students
  by: T. N. Bandurka Published: (2015-07-01)