Attention-Based LiDAR–Camera Fusion for 3D Object Detection in Autonomous Driving

Bibliographic Details
Main Authors: Zhibo Wang, Xiaoci Huang, Zhihao Hu
Format: Article
Language: English
Published: MDPI AG 2025-05-01
Series:World Electric Vehicle Journal
Online Access: https://www.mdpi.com/2032-6653/16/6/306
Description
Summary: In multi-vehicle traffic scenarios, accurate environmental perception and motion trajectory tracking through LiDAR–camera fusion are critical for downstream vehicle planning and control tasks. To address the challenges of cross-modal feature interaction in LiDAR–image fusion, as well as the low recognition efficiency and positioning accuracy for traffic participants in dense traffic flows, this study proposes an attention-based 3D object detection network that integrates point cloud and image features. The algorithm adaptively fuses LiDAR geometric features and camera semantic features through channel-wise attention weighting, enhancing multi-modal feature representation by dynamically prioritizing informative channels. A center point detection architecture is further employed to regress 3D bounding boxes in bird's-eye-view space, effectively resolving orientation ambiguities caused by sparse point distributions. Experimental validation on the nuScenes dataset demonstrates the model's robustness in complex scenarios, achieving a mean Average Precision (mAP) of 64.5%, a 12.2% improvement over baseline methods. Real-vehicle deployment further confirms the fusion module's effectiveness in enhancing detection stability under dynamic traffic conditions.
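The channel-wise attention fusion described in the summary can be illustrated with a minimal NumPy sketch. This is not the authors' implementation; the function name, the squeeze-and-reweight structure (global average pooling followed by a small two-layer bottleneck with a sigmoid), and all weight shapes are assumptions chosen to show the general idea of reweighting concatenated LiDAR and camera feature channels:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def channel_attention_fuse(lidar_feat, cam_feat, w1, w2):
    """Fuse two BEV feature maps of shape (C, H, W) by channel attention.

    1. Concatenate along the channel axis -> (2C, H, W).
    2. Squeeze spatially via global average pooling -> (2C,).
    3. Pass through a two-layer bottleneck (ReLU, then sigmoid)
       to produce one weight in (0, 1) per channel.
    4. Rescale every channel by its weight, so informative
       channels from either modality are emphasized.
    """
    fused = np.concatenate([lidar_feat, cam_feat], axis=0)   # (2C, H, W)
    squeeze = fused.mean(axis=(1, 2))                        # (2C,)
    weights = sigmoid(w2 @ np.maximum(w1 @ squeeze, 0.0))    # (2C,)
    return fused * weights[:, None, None]                    # (2C, H, W)

# Hypothetical usage with C=4 channels per modality on an 8x8 BEV grid:
rng = np.random.default_rng(0)
lidar = rng.normal(size=(4, 8, 8))
cam = rng.normal(size=(4, 8, 8))
w1 = rng.normal(size=(2, 8))   # bottleneck: 2C -> 2
w2 = rng.normal(size=(8, 2))   # expansion:  2 -> 2C
out = channel_attention_fuse(lidar, cam, w1, w2)
print(out.shape)  # (8, 8, 8)
```

Because each weight lies in (0, 1), the fusion can only attenuate channels, never amplify them; a learned variant would train `w1` and `w2` end to end with the detection head.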
ISSN: 2032-6653