MLFF-SFNet: A Visual Perception-Guided Network for High-Accuracy Road Extraction in Remote Sensing Imagery via Multilevel Feature Fusion and Similarity Filtering
Main Authors: | , , |
---|---|
Format: | Article |
Language: | English |
Published: | IEEE 2025-01-01 |
Series: | IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing |
Subjects: | |
Online Access: | https://ieeexplore.ieee.org/document/11075544/ |
Summary: | Accurate road extraction from remote sensing images plays a pivotal role in smart city planning and geospatial analysis. Despite recent advances, precise road extraction remains challenging because of irregular road topologies and complex environmental interference in high-resolution remote sensing images. Inspired by the human visual system’s “local focusing-global verification” mechanism, a novel road extraction network based on multilevel feature fusion (FF) and similarity filtering is proposed. The framework addresses two critical challenges. First, to suppress environmental noise, a dual-branch subsampling and feature similarity filtering architecture is designed, in which one branch downsamples a local area and the other downsamples the whole image. The auxiliary branch extracts local road features to construct a feature affinity matrix that guides global feature refinement in the main branch. Second, to mitigate the impact of texture-similar objects and occlusions, a multilevel FF method is proposed that integrates a bidirectional deformable attention module (BDAM) and a dilated skip network (DSN). The BDAM captures long-range dependencies of irregular roads through deformable convolutional kernels and bidirectional attention, while the DSN establishes skip connections with dilated convolutions to preserve edge details during multiscale FF. Comprehensive experiments on publicly available datasets demonstrate that the proposed model outperforms state-of-the-art methods for road extraction. |
---|---|
ISSN: | 1939-1404 2151-1535 |
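
The summary describes an auxiliary branch whose local road features form a feature affinity matrix that guides refinement of the main branch's global features. The record does not give the formulation, so the following is only a minimal NumPy sketch of one plausible reading: cosine similarity as the affinity and a max-over-local-features gate. The function name `similarity_filter`, the gating rule, and the toy feature vectors are all assumptions for illustration, not the authors' method.

```python
import numpy as np

def similarity_filter(global_feats, local_feats, eps=1e-8):
    """Hypothetical similarity filtering (not the paper's exact formulation).

    global_feats: (Ng, C) features from the whole-image (main) branch.
    local_feats:  (Nl, C) road features from the local (auxiliary) branch.
    Returns the gated global features and the (Ng, Nl) affinity matrix.
    """
    # L2-normalise rows so the dot product is a cosine similarity.
    g = global_feats / (np.linalg.norm(global_feats, axis=1, keepdims=True) + eps)
    l = local_feats / (np.linalg.norm(local_feats, axis=1, keepdims=True) + eps)
    affinity = g @ l.T  # (Ng, Nl), values in [-1, 1]

    # Assumed gating rule: keep a global feature in proportion to its best
    # match among the local road features; negative matches are zeroed.
    gate = np.clip(affinity.max(axis=1), 0.0, 1.0)  # (Ng,)
    refined = global_feats * gate[:, None]
    return refined, affinity

# Toy data: one road-like and one background-like global feature.
rng = np.random.default_rng(0)
road = np.array([1.0, 0.0, 0.0, 0.0])
local_feats = road + 0.01 * rng.normal(size=(5, 4))
global_feats = np.stack([2.0 * road, np.array([0.0, 3.0, 0.0, 0.0])])
refined, affinity = similarity_filter(global_feats, local_feats)
```

In this toy case the road-like global feature passes through almost unchanged, while the background-like feature, which matches none of the local road features, is strongly attenuated.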