Global-Frequency-Domain Network: A Semantic Segmentation Method for High-Resolution Remote Sensing Images Based on Fine-Grained Feature Extraction and Global Context Integration
The accurate semantic segmentation of high-resolution remote sensing images is essential for urban planning and management applications. The inherent complex spatial structure and abundant contextual information in these images make segmentation challenges, such as feature recognition difficulties a...
Saved in:
Main Authors: | , , |
---|---|
Format: | Article |
Language: | English |
Published: |
IEEE
2025-01-01
|
Series: | IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing |
Subjects: | |
Online Access: | https://ieeexplore.ieee.org/document/10975104/ |
Tags: |
Add Tag
No Tags, Be the first to tag this record!
|
Summary: | The accurate semantic segmentation of high-resolution remote sensing images is essential for urban planning and management applications. The inherent complex spatial structure and abundant contextual information in these images make segmentation challenges, such as feature recognition difficulties and segmentation discontinuities. Hence, we propose a novel global-frequency-domain network (GFDNet) designed for the semantic segmentation of high-resolution remote sensing images. The GFDNet framework incorporates a global-frequency-domain feature module that employs a learnable global filter to extract contextual information, leveraging the Fourier transform to capture both the spatial- and frequency-domain features. In addition, a hybrid attention mechanism combining channel sparse and spatial attention is introduced to enhance fine-grained feature extraction when objects are continuous. Then, a spatial feature-aware decoder module enhances the ability of the network to perceive surrounding structural targets, thereby improving the spatial feature representation. A series of experimental results demonstrate that the GFDNet network outperforms traditional convolutional neural network and transformer-based methods, achieving the highest mean intersection over union scores of 77.12% and 81.08% on the Vaihingen and Potsdam datasets, respectively. Comparative analyses further indicate GFDNet’s superior accuracy and generalization capabilities in delivering precise semantic segmentation outcomes. |
---|---|
ISSN: | 1939-1404 2151-1535 |