BiDFNet: A Bidirectional Feature Fusion Network for 3D Object Detection Based on Pseudo-LiDAR
This paper presents a bidirectional feature fusion network (BiDFNet) for 3D object detection, leveraging pseudo-point clouds to achieve bidirectional fusion of point cloud and image features. The proposed model addresses key challenges in multimodal 3D detection by introducing three novel components...
Main Authors: | Qiang Zhu, Yaping Wan |
---|---|
Format: | Article |
Language: | English |
Published: | MDPI AG 2025-05-01 |
Series: | Information |
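Subjects: | pseudo-point cloud; 3D object detection; substream sparse convolution; multi-head attention mechanism |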
Online Access: | https://www.mdpi.com/2078-2489/16/6/437 |
_version_ | 1839653789892608000 |
---|---|
author | Qiang Zhu; Yaping Wan |
author_facet | Qiang Zhu; Yaping Wan |
author_sort | Qiang Zhu |
collection | DOAJ |
description | This paper presents a bidirectional feature fusion network (BiDFNet) for 3D object detection, leveraging pseudo-point clouds to achieve bidirectional fusion of point cloud and image features. The proposed model addresses key challenges in multimodal 3D detection by introducing three novel components: (1) the SAF-Conv module, which extends the receptive field through improved submanifold sparse convolution, enhancing feature extraction from pseudo-point clouds while effectively reducing edge noise; (2) the bidirectional cross-modal attention feature interaction module (BiCSAFIM), which employs a multi-head cross-attention mechanism to enable global information interaction between point cloud and image features; and (3) the attention-based feature fusion module (ADFM), which adaptively fuses dual-stream features to improve robustness. Extensive experiments on the KITTI dataset demonstrate that BiDFNet achieves state-of-the-art performance, with a 3D AP (R40) of 88.79% on the validation set and 85.27% on the test set for the Car category, significantly outperforming existing methods. These results highlight the effectiveness of BiDFNet in complex scenarios, showcasing its potential for real-world applications such as autonomous driving. |
format | Article |
id | doaj-art-f5bfd64c1e2b4edaa42e2e69dbd2466a |
institution | Matheson Library |
issn | 2078-2489 |
language | English |
publishDate | 2025-05-01 |
publisher | MDPI AG |
record_format | Article |
series | Information |
spelling | doaj-art-f5bfd64c1e2b4edaa42e2e69dbd2466a; 2025-06-25T13:57:29Z; eng; MDPI AG; Information; 2078-2489; 2025-05-01; 16; 6; 437; 10.3390/info16060437; BiDFNet: A Bidirectional Feature Fusion Network for 3D Object Detection Based on Pseudo-LiDAR; Qiang Zhu (School of Computer Science, University of South China, Hengyang 421001, China); Yaping Wan (School of Computer Science, University of South China, Hengyang 421001, China); https://www.mdpi.com/2078-2489/16/6/437; pseudo-point cloud; 3D object detection; substream sparse convolution; multi-head attention mechanism |
title | BiDFNet: A Bidirectional Feature Fusion Network for 3D Object Detection Based on Pseudo-LiDAR |
topic | pseudo-point cloud; 3D object detection; substream sparse convolution; multi-head attention mechanism |
url | https://www.mdpi.com/2078-2489/16/6/437 |
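
The description field says the interaction module uses multi-head cross-attention so that point cloud and image features exchange information in both directions. The record gives no implementation details, so the following is only a minimal sketch of that general idea, not the authors' module. The function names `cross_attention` and `bidirectional_interaction`, the absence of learned projection weights, and the residual addition are all assumptions made for illustration.

```python
import math
import random


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attention(queries, keys_values, num_heads):
    """Multi-head scaled dot-product cross-attention (hypothetical helper).

    queries:     list of Nq feature vectors of length d
    keys_values: list of Nk feature vectors of length d (used as keys and values)
    Assumption: no learned Q/K/V projections; the d channels are simply split
    into num_heads equal slices and each head attends independently.
    """
    dim = len(queries[0])
    assert dim % num_heads == 0, "feature size must be divisible by num_heads"
    head_dim = dim // num_heads
    out = [[0.0] * dim for _ in queries]
    for head in range(num_heads):
        lo, hi = head * head_dim, (head + 1) * head_dim
        for i, q in enumerate(queries):
            # Similarity of this query to every key, within the head's slice.
            scores = [
                sum(a * b for a, b in zip(q[lo:hi], k[lo:hi])) / math.sqrt(head_dim)
                for k in keys_values
            ]
            weights = softmax(scores)
            # Weighted sum of the values, written back into the head's slice.
            for c in range(lo, hi):
                out[i][c] = sum(w * k[c] for w, k in zip(weights, keys_values))
    return out


def bidirectional_interaction(point_feats, image_feats, num_heads=2):
    """Update each modality with information attended from the other one.

    Assumption: a plain residual add stands in for whatever combination
    the paper actually uses.
    """
    from_image = cross_attention(point_feats, image_feats, num_heads)
    from_points = cross_attention(image_feats, point_feats, num_heads)
    points_out = [[p + u for p, u in zip(row, upd)]
                  for row, upd in zip(point_feats, from_image)]
    image_out = [[p + u for p, u in zip(row, upd)]
                 for row, upd in zip(image_feats, from_points)]
    return points_out, image_out


if __name__ == "__main__":
    random.seed(0)
    points = [[random.uniform(-1, 1) for _ in range(4)] for _ in range(3)]  # 3 point tokens
    pixels = [[random.uniform(-1, 1) for _ in range(4)] for _ in range(5)]  # 5 image tokens
    new_points, new_pixels = bidirectional_interaction(points, pixels, num_heads=2)
    print(len(new_points), len(new_pixels))
```

Each modality keeps its own token count and feature size, so the two updated streams can be passed on to a later fusion stage. A real implementation would add learned projections and run on tensors rather than Python lists.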