Strong and Weak Prompt Engineering for Remote Sensing Image-Text Cross-Modal Retrieval
Main Authors: | , , , , , , |
---|---|
Format: | Article |
Language: | English |
Published: | IEEE, 2025-01-01 |
Series: | IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing |
Subjects: | |
Online Access: | https://ieeexplore.ieee.org/document/10855571/ |
Summary: | Cross-modal retrieval is vital at the intersection of vision and language. Specifically, remote sensing image–text retrieval enhances our understanding of complex remote sensing content by combining multiperspective visual information with concise textual descriptions, and it has increasingly become a research hotspot. Existing prompts typically emphasize either global or local information, so they fail to extract or fully exploit the useful information in cross-modal data, which results in subpar retrieval performance. To address these limitations, we propose a novel method called Strong and Weak Prompt Engineering (SWPE) for remote sensing image–text retrieval. First, SWPE employs the Strong and Weak Prompt Generation module to generate fine-grained and global category semantic prompts via an attention mechanism and a pretrained classification model. The prompt-guided feature fine-tuning module then refines the prompt information using a Transformer architecture, integrating the refined prompts with high-level image and text features to enhance both fine-grained details and global semantics. Finally, the adaptive hard sample elimination module optimizes the triplet loss function by training the model with negative sample pairs of varying difficulty, assigning higher weights to simpler pairs. Extensive quantitative and qualitative experiments on four remote sensing benchmarks validate the effectiveness of SWPE. |
---|---|
ISSN: | 1939-1404 2151-1535 |
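
The summary's last step, a triplet loss in which negative pairs of varying difficulty are weighted so that simpler pairs count more, can be illustrated with a minimal sketch. This is not the SWPE implementation: the record does not give the weighting formula, so the softmax over negated difficulty, the function name `weighted_triplet_loss`, and the `margin` and `temperature` parameters are all assumptions made for illustration.

```python
import math


def weighted_triplet_loss(sim, margin=0.2, temperature=0.1):
    """Bidirectional triplet loss over an n x n image-text similarity matrix.

    sim[i][j] is the similarity of image i and text j; matched pairs lie on
    the diagonal. The weighting scheme is an illustrative assumption, not
    the formula from the article.
    """
    n = len(sim)
    total = 0.0
    for i in range(n):
        pos = sim[i][i]
        # Two retrieval directions: image i against texts, text i against images.
        for negatives in (
            [sim[i][j] for j in range(n) if j != i],
            [sim[j][i] for j in range(n) if j != i],
        ):
            if not negatives:
                continue
            # Difficulty: how close a negative scores to the positive pair.
            difficulty = [neg - pos for neg in negatives]
            # Assumed weighting: softmax over negated difficulty, so that
            # simpler (lower-scoring) negatives receive higher weights.
            logits = [-d / temperature for d in difficulty]
            peak = max(logits)
            exps = [math.exp(x - peak) for x in logits]
            norm = sum(exps)
            weights = [e / norm for e in exps]
            # Hinge term: nonzero only when the margin is violated.
            total += sum(
                w * max(0.0, margin + d) for w, d in zip(weights, difficulty)
            )
    return total / n
```

Calling `weighted_triplet_loss(sim)` on a batch similarity matrix returns a single scalar. Raising the hypothetical `temperature` flattens the weights toward a plain average over negatives, while lowering it concentrates the loss on the simplest negatives.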