PIFRNet: Position Information Guided Feature Reconstruction Network for Salient Object Detection in Remote Sensing Images


Bibliographic Details
Main Authors: Zhen Wang, Ruixiang Li, Xiaotian Wang, Nan Xu, Zhuhong You
Format: Article
Language: English
Published: IEEE 2025-01-01
Series: IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing
Online Access: https://ieeexplore.ieee.org/document/11049872/
Description
Summary: Benefiting from the success of deep learning, salient object detection in natural scene images has advanced rapidly. However, salient object detection for remote sensing images (RSI-SOD) faces unique challenges, including high resolution, diverse object scales, and cluttered backgrounds, which limit the effectiveness of existing methods. To overcome these issues, we propose the Position Information Guided Feature Reconstruction Network (PIFRNet), in which each module is designed to address a core RSI-SOD challenge. First, a hybrid dual-branch encoder integrates a convolutional neural network for robust local feature extraction with a Transformer for capturing global contextual information, enabling simultaneous modeling of fine details and large-scale object relationships. Next, the Spatial Coordinate Attention Mechanism leverages positional correlations between the spatial and channel dimensions to accurately highlight salient regions and suppress background noise. The Position-Sensitive Self-Attention Mechanism further refines feature representations by modeling pixel-level spatial relationships, enhancing the network's ability to delineate complex object boundaries. To address multiscale object variation, the Multiscale Attention Mechanism adaptively aggregates information across scales, improving detection robustness for objects of all sizes. Finally, the Feature Reconstruction Module restores fine-grained details and sharp boundaries in the predicted saliency maps by leveraging spatial position information. Extensive experiments on three public RSI-SOD datasets demonstrate that our method outperforms 36 state-of-the-art approaches, validating the effectiveness of each proposed module.
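The record gives no implementation details for these modules, so the following is only a rough illustration: a minimal PyTorch sketch of a coordinate-style attention block that couples positional information along each spatial axis with channel attention, in the spirit of the Spatial Coordinate Attention Mechanism described in the abstract. The class name SpatialCoordinateAttention, the reduction ratio, and all layer choices are hypothetical, modeled on published coordinate-attention designs rather than the authors' code.

```python
import torch
import torch.nn as nn


class SpatialCoordinateAttention(nn.Module):
    """Coordinate-style attention block (hypothetical sketch).

    Pools the feature map along each spatial axis separately, so the
    resulting channel attention retains positional information, then
    re-weights the input with direction-aware attention maps.
    """

    def __init__(self, channels: int, reduction: int = 16):
        super().__init__()
        mid = max(channels // reduction, 8)
        self.pool_h = nn.AdaptiveAvgPool2d((None, 1))  # collapse width  -> (B, C, H, 1)
        self.pool_w = nn.AdaptiveAvgPool2d((1, None))  # collapse height -> (B, C, 1, W)
        self.conv1 = nn.Conv2d(channels, mid, kernel_size=1)
        self.bn = nn.BatchNorm2d(mid)
        self.act = nn.ReLU(inplace=True)
        self.conv_h = nn.Conv2d(mid, channels, kernel_size=1)
        self.conv_w = nn.Conv2d(mid, channels, kernel_size=1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        _, _, h, w = x.shape
        # Encode position along each axis independently.
        x_h = self.pool_h(x)                       # (B, C, H, 1)
        x_w = self.pool_w(x).permute(0, 1, 3, 2)   # (B, C, W, 1)
        y = torch.cat([x_h, x_w], dim=2)           # (B, C, H + W, 1)
        y = self.act(self.bn(self.conv1(y)))
        y_h, y_w = torch.split(y, [h, w], dim=2)
        a_h = torch.sigmoid(self.conv_h(y_h))                      # (B, C, H, 1)
        a_w = torch.sigmoid(self.conv_w(y_w.permute(0, 1, 3, 2)))  # (B, C, 1, W)
        # Broadcasted, position-aware re-weighting of the input features.
        return x * a_h * a_w


if __name__ == "__main__":
    attn = SpatialCoordinateAttention(channels=64)
    feat = torch.randn(2, 64, 56, 56)  # stand-in for an encoder feature map
    print(attn(feat).shape)            # torch.Size([2, 64, 56, 56])
```

Pooling along height and width separately, rather than with a single global average pool, lets the attention weights retain where a response occurred; that positional sensitivity is the property the abstract credits for highlighting salient regions while suppressing cluttered backgrounds.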
ISSN: 1939-1404, 2151-1535