BiDFNet: A Bidirectional Feature Fusion Network for 3D Object Detection Based on Pseudo-LiDAR
Main Authors:
Format: Article
Language: English
Published: MDPI AG, 2025-05-01
Series: Information
Subjects:
Online Access: https://www.mdpi.com/2078-2489/16/6/437
Summary: This paper presents a bidirectional feature fusion network (BiDFNet) for 3D object detection, leveraging pseudo-point clouds to achieve bidirectional fusion of point cloud and image features. The proposed model addresses key challenges in multimodal 3D detection by introducing three novel components: (1) the SAF-Conv module, which extends the receptive field through improved submanifold sparse convolution, enhancing feature extraction from pseudo-point clouds while effectively reducing edge noise; (2) the bidirectional cross-modal attention feature interaction module (BiCSAFIM), which employs a multi-head cross-attention mechanism to enable global information interaction between point cloud and image features; and (3) the attention-based feature fusion module (ADFM), which adaptively fuses dual-stream features to improve robustness. Extensive experiments on the KITTI dataset demonstrate that BiDFNet achieves state-of-the-art performance, with a 3D AP (R40) of 88.79% on the validation set and 85.27% on the test set for the Car category, significantly outperforming existing methods. These results highlight the effectiveness of BiDFNet in complex scenarios, showcasing its potential for real-world applications such as autonomous driving.
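The summary describes BiCSAFIM and ADFM only at a high level: cross-modal multi-head attention between point-cloud and image features, followed by adaptive gated fusion of the two streams. As a rough, hedged illustration of that general idea (not the paper's implementation; all class names, dimensions, and the pooling/gating choices below are assumptions), a minimal PyTorch sketch might look like this:

```python
import torch
import torch.nn as nn


class BiCrossAttentionFusion(nn.Module):
    """Illustrative sketch: point-cloud tokens attend to image tokens and vice
    versa, then the refined streams are fused with a learned gate. Names and
    shapes are hypothetical, not taken from BiDFNet."""

    def __init__(self, dim: int = 128, num_heads: int = 4):
        super().__init__()
        # Point features query image features, and image features query point features.
        self.pc_to_img = nn.MultiheadAttention(dim, num_heads, batch_first=True)
        self.img_to_pc = nn.MultiheadAttention(dim, num_heads, batch_first=True)
        # Simple sigmoid gate standing in for an adaptive dual-stream fusion step.
        self.gate = nn.Sequential(nn.Linear(2 * dim, dim), nn.Sigmoid())

    def forward(self, pc_feats: torch.Tensor, img_feats: torch.Tensor) -> torch.Tensor:
        # pc_feats: (B, Np, C) pseudo-point tokens; img_feats: (B, Ni, C) image tokens.
        pc_refined, _ = self.pc_to_img(pc_feats, img_feats, img_feats)   # points query image
        img_refined, _ = self.img_to_pc(img_feats, pc_feats, pc_feats)   # image queries points
        # Pool the image stream to a global context before fusing (an illustrative choice).
        img_context = img_refined.mean(dim=1, keepdim=True).expand_as(pc_refined)
        g = self.gate(torch.cat([pc_refined, img_context], dim=-1))
        return g * pc_refined + (1.0 - g) * img_context


if __name__ == "__main__":
    fusion = BiCrossAttentionFusion(dim=128, num_heads=4)
    pc = torch.randn(2, 1024, 128)    # e.g. 1024 pseudo-point tokens
    img = torch.randn(2, 256, 128)    # e.g. 16x16 image feature tokens
    print(fusion(pc, img).shape)      # torch.Size([2, 1024, 128])
```

The bidirectional aspect is captured by the two attention directions; how BiDFNet actually projects, pools, and gates the streams would need to be taken from the paper itself.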
ISSN: 2078-2489