PASeg: positional-guided segmenter with multimodal semantic alignment for enhancing urban scene 3D semantic segmentation

Bibliographic Details
Main Authors:	Yang Luo, Ting Han, Xiaorong Zhang, Yujun Liu, Duxin Zhu, Jinyuan Li, Yiping Chen, Yundong Wu, Guorong Cai, Yingchao Piao, Jinhe Su
Format:	Article
Language:	English
Published:	Taylor & Francis Group 2025-12-01
Series:	International Journal of Digital Earth
Subjects:	Point cloud semantic segmentation; urban scene; multimodal; positional guided; semantic alignment
Online Access:	https://www.tandfonline.com/doi/10.1080/17538947.2025.2528811

Description
Summary:	The application of LiDAR point clouds to urban environment analysis has become a critical approach in urban scene understanding. Concurrently, substantial progress has been made in 3D point cloud semantic segmentation, advancing the precision and effectiveness of urban scene interpretation. However, existing methods face challenges when handling long-range LiDAR point clouds, where reduced point density and increased noise at greater distances result in segmentation errors and diminished accuracy. To this end, we propose PASeg, which incorporates two key components: the Positional-Guided Classifier (PGC) and the Multimodal Semantic Alignment (MSA) module. The PGC uses positional embeddings to dynamically adjust normalization parameters, thereby improving segmentation accuracy across varying distances. The MSA module aligns semantic features from text, image, and point cloud data, facilitating better category differentiation. Together, PGC and MSA reinforce each other to strengthen large-scale 3D semantic segmentation. Extensive experiments on the SemanticKITTI and nuScenes datasets demonstrate that PASeg’s overall segmentation performance is competitive with state-of-the-art methods. Notably, compared to the baseline segmenter on the SemanticKITTI dataset, our method improves long-range LiDAR point cloud segmentation by over 2.3% at 30–40 m and over 1.7% at 40–50 m. These gains in urban scene segmentation support smart, sustainable city development.
ISSN:	1753-8947 1753-8955
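
The summary says the PGC uses positional embeddings to adjust normalization parameters dynamically. The sketch below illustrates one generic way such position-conditioned normalization could work; it is not the authors' implementation. The sinusoidal range embedding, the 50 m maximum range, the function names, and the weight matrices `w_gamma` and `w_beta` are all assumptions made for illustration.

```python
import math


def range_embedding(distance_m, num_freqs=4, max_range_m=50.0):
    """Sinusoidal embedding of a point's distance from the sensor.

    Assumption: the record does not say how PASeg encodes position; this
    encoding and the 50 m normalisation constant are illustrative only.
    """
    x = distance_m / max_range_m
    emb = []
    for k in range(num_freqs):
        angle = (2 ** k) * math.pi * x
        emb.extend([math.sin(angle), math.cos(angle)])
    return emb


def layer_norm(features, eps=1e-5):
    """Normalise one feature vector to zero mean and unit variance."""
    mean = sum(features) / len(features)
    var = sum((f - mean) ** 2 for f in features) / len(features)
    return [(f - mean) / math.sqrt(var + eps) for f in features]


def position_conditioned_norm(features, distance_m, w_gamma, w_beta):
    """Scale and shift normalised features using the range embedding.

    w_gamma and w_beta are hypothetical weights with one row per feature
    channel and one column per embedding entry. In a trained model they
    would be learned; here the caller supplies them.
    """
    emb = range_embedding(distance_m, num_freqs=len(w_gamma[0]) // 2)
    normed = layer_norm(features)
    out = []
    for i, x in enumerate(normed):
        gamma = 1.0 + sum(w * e for w, e in zip(w_gamma[i], emb))
        beta = sum(w * e for w, e in zip(w_beta[i], emb))
        out.append(gamma * x + beta)
    return out


if __name__ == "__main__":
    feats = [1.0, 2.0, 3.0, 4.0]
    w_g = [[0.1] * 8 for _ in feats]
    w_b = [[0.0] * 8 for _ in feats]
    # The same feature vector is rescaled differently at 5 m and at 45 m.
    print(position_conditioned_norm(feats, 5.0, w_g, w_b))
    print(position_conditioned_norm(feats, 45.0, w_g, w_b))
```

With all-zero weights the function reduces to plain layer normalization, so the distance dependence comes entirely from the weights applied to the embedding.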

Similar Items