BiDFNet: A Bidirectional Feature Fusion Network for 3D Object Detection Based on Pseudo-LiDAR

This paper presents a bidirectional feature fusion network (BiDFNet) for 3D object detection, leveraging pseudo-point clouds to achieve bidirectional fusion of point cloud and image features. The proposed model addresses key challenges in multimodal 3D detection by introducing three novel components...

Bibliographic Details
Main Authors: Qiang Zhu, Yaping Wan
Format: Article
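Language: English
Published: MDPI AG 2025-05-01
Series: Information
Subjects: pseudo-point cloud; 3D object detection; substream sparse convolution; multi-head attention mechanism
Online Access: https://www.mdpi.com/2078-2489/16/6/437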
author Qiang Zhu
Yaping Wan
collection DOAJ
description This paper presents a bidirectional feature fusion network (BiDFNet) for 3D object detection, leveraging pseudo-point clouds to achieve bidirectional fusion of point cloud and image features. The proposed model addresses key challenges in multimodal 3D detection by introducing three novel components: (1) the SAF-Conv module, which extends the receptive field through improved submanifold sparse convolution, enhancing feature extraction from pseudo-point clouds while effectively reducing edge noise; (2) the bidirectional cross-modal attention feature interaction module (BiCSAFIM), which employs a multi-head cross-attention mechanism to enable global information interaction between point cloud and image features; and (3) the attention-based feature fusion module (ADFM), which adaptively fuses dual-stream features to improve robustness. Extensive experiments on the KITTI dataset demonstrate that BiDFNet achieves state-of-the-art performance, with a 3D AP (R40) of 88.79% on the validation set and 85.27% on the test set for the Car category, significantly outperforming existing methods. These results highlight the effectiveness of BiDFNet in complex scenarios, showcasing its potential for real-world applications such as autonomous driving.
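The description names two mechanisms, multi-head cross-attention between point cloud and image features and an attention-based adaptive fusion of two feature streams. The NumPy sketch below only illustrates those general ideas and is not the authors' BiCSAFIM or ADFM implementation. The function names, the random projection matrices standing in for learned weights, the residual connection, the sigmoid gate, and all feature sizes are assumptions made for this example.

```python
import numpy as np


def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def cross_attention(query_feats, context_feats, num_heads, rng):
    """Multi-head cross-attention: queries come from one modality,
    keys and values from the other. Weights are random placeholders."""
    n_q, dim = query_feats.shape
    head_dim = dim // num_heads
    # Random projections stand in for learned weights (assumption).
    w_q, w_k, w_v = (rng.standard_normal((dim, dim)) / np.sqrt(dim)
                     for _ in range(3))

    def split_heads(x):
        # (tokens, dim) -> (heads, tokens, head_dim)
        return x.reshape(x.shape[0], num_heads, head_dim).transpose(1, 0, 2)

    q = split_heads(query_feats @ w_q)
    k = split_heads(context_feats @ w_k)
    v = split_heads(context_feats @ w_v)
    # Scaled dot-product scores: (heads, n_q, n_ctx)
    scores = q @ k.transpose(0, 2, 1) / np.sqrt(head_dim)
    attended = softmax(scores, axis=-1) @ v  # (heads, n_q, head_dim)
    merged = attended.transpose(1, 0, 2).reshape(n_q, dim)
    # Residual connection keeps the original query features (assumption).
    return query_feats + merged


def bidirectional_interaction(point_feats, image_feats, num_heads, rng):
    """Run cross-attention in both directions."""
    point_enhanced = cross_attention(point_feats, image_feats, num_heads, rng)
    image_enhanced = cross_attention(image_feats, point_feats, num_heads, rng)
    return point_enhanced, image_enhanced


def gated_fusion(stream_a, stream_b, rng):
    """Blend two equally shaped streams with a per-element sigmoid gate."""
    dim = stream_a.shape[1]
    # Random gate weights stand in for a learned layer (assumption).
    w_gate = rng.standard_normal((2 * dim, dim)) / np.sqrt(2 * dim)
    logits = np.concatenate([stream_a, stream_b], axis=1) @ w_gate
    gate = 1.0 / (1.0 + np.exp(-logits))
    return gate * stream_a + (1.0 - gate) * stream_b


rng = np.random.default_rng(0)
point_feats = rng.standard_normal((6, 8))    # 6 hypothetical point tokens
image_feats = rng.standard_normal((10, 8))   # 10 hypothetical image tokens
point_enh, image_enh = bidirectional_interaction(point_feats, image_feats, 2, rng)
fused = gated_fusion(point_feats, point_enh, rng)
print(point_enh.shape, image_enh.shape, fused.shape)
```

Because the gate lies strictly between 0 and 1, each fused value is a convex combination of the two input streams, which is one simple way to read "adaptively fuses dual-stream features".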
format Article
id doaj-art-f5bfd64c1e2b4edaa42e2e69dbd2466a
institution Matheson Library
issn 2078-2489
language English
publishDate 2025-05-01
publisher MDPI AG
record_format Article
series Information
spelling doaj-art-f5bfd64c1e2b4edaa42e2e69dbd2466a
2025-06-25T13:57:29Z
eng
MDPI AG
Information
2078-2489
2025-05-01
Volume 16, Issue 6, Article 437
10.3390/info16060437
BiDFNet: A Bidirectional Feature Fusion Network for 3D Object Detection Based on Pseudo-LiDAR
Qiang Zhu (0), School of Computer Science, University of South China, Hengyang 421001, China
Yaping Wan (1), School of Computer Science, University of South China, Hengyang 421001, China
This paper presents a bidirectional feature fusion network (BiDFNet) for 3D object detection, leveraging pseudo-point clouds to achieve bidirectional fusion of point cloud and image features. The proposed model addresses key challenges in multimodal 3D detection by introducing three novel components: (1) the SAF-Conv module, which extends the receptive field through improved submanifold sparse convolution, enhancing feature extraction from pseudo-point clouds while effectively reducing edge noise; (2) the bidirectional cross-modal attention feature interaction module (BiCSAFIM), which employs a multi-head cross-attention mechanism to enable global information interaction between point cloud and image features; and (3) the attention-based feature fusion module (ADFM), which adaptively fuses dual-stream features to improve robustness. Extensive experiments on the KITTI dataset demonstrate that BiDFNet achieves state-of-the-art performance, with a 3D AP (R40) of 88.79% on the validation set and 85.27% on the test set for the Car category, significantly outperforming existing methods. These results highlight the effectiveness of BiDFNet in complex scenarios, showcasing its potential for real-world applications such as autonomous driving.
https://www.mdpi.com/2078-2489/16/6/437
pseudo-point cloud
3D object detection
substream sparse convolution
multi-head attention mechanism
title BiDFNet: A Bidirectional Feature Fusion Network for 3D Object Detection Based on Pseudo-LiDAR
topic pseudo-point cloud
3D object detection
substream sparse convolution
multi-head attention mechanism
url https://www.mdpi.com/2078-2489/16/6/437