InfraredStereo3D: Breaking Night Vision Limits with Perspective Projection Positional Encoding and Groundbreaking Infrared Dataset
In fields such as military reconnaissance, forest fire prevention, and autonomous driving at night, there is an urgent need for high-precision three-dimensional reconstruction in low-light or night environments. The acquisition of remote sensing data by RGB cameras relies on external light, resultin...
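Main Authors: | Yuandong Niu, Limin Liu, Fuyu Huang, Juntao Ma, Chaowen Zheng, Yunfeng Jiang, Ting An, Zhongchen Zhao, Shuangyou Chen |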
---|---|
Format: | Article |
Language: | English |
Published: | MDPI AG, 2025-06-01 |
Series: | Remote Sensing |
Subjects: | infrared binocular; dataset; perspective projection |
Online Access: | https://www.mdpi.com/2072-4292/17/12/2035 |
_version_ | 1839652809328295936 |
---|---|
author | Yuandong Niu, Limin Liu, Fuyu Huang, Juntao Ma, Chaowen Zheng, Yunfeng Jiang, Ting An, Zhongchen Zhao, Shuangyou Chen |
author_sort | Yuandong Niu |
collection | DOAJ |
description | In fields such as military reconnaissance, forest fire prevention, and autonomous driving at night, there is an urgent need for high-precision three-dimensional reconstruction in low-light or night environments. The acquisition of remote sensing data by RGB cameras relies on external light, resulting in a significant decline in image quality and making it difficult to meet the task requirements. The method based on lidar has poor imaging effects in rainy and foggy weather, close-range scenes, and scenarios requiring thermal imaging data. In contrast, infrared cameras can effectively overcome this challenge because their imaging mechanisms are different from those of RGB cameras and lidar. However, the research on three-dimensional scene reconstruction of infrared images is relatively immature, especially in the field of infrared binocular stereo matching. There are two main challenges given this situation: first, there is a lack of a dataset specifically for infrared binocular stereo matching; second, the lack of texture information in infrared images causes a limit in the extension of the RGB method to the infrared reconstruction problem. To solve these problems, this study begins with the construction of an infrared binocular stereo matching dataset and then proposes an innovative perspective projection positional encoding-based transformer method to complete the infrared binocular stereo matching task. In this paper, a stereo matching network combined with transformer and cost volume is constructed. The existing work in the positional encoding of the transformer usually uses a parallel projection model to simplify the calculation. Our method is based on the actual perspective projection model so that each pixel is associated with a different projection ray. It effectively solves the problem of feature extraction and matching caused by insufficient texture information in infrared images and significantly improves matching accuracy. We conducted experiments based on the infrared binocular stereo matching dataset proposed in this paper. Experiments demonstrated the effectiveness of the proposed method. |
format | Article |
id | doaj-art-ff6040e27b7a4724bccb2129a51a6c1d |
institution | Matheson Library |
issn | 2072-4292 |
language | English |
publishDate | 2025-06-01 |
publisher | MDPI AG |
record_format | Article |
series | Remote Sensing |
spelling | doaj-art-ff6040e27b7a4724bccb2129a51a6c1d; 2025-06-25T14:23:39Z; eng; MDPI AG; Remote Sensing; 2072-4292; 2025-06-01; 17; 12; 2035; 10.3390/rs17122035; InfraredStereo3D: Breaking Night Vision Limits with Perspective Projection Positional Encoding and Groundbreaking Infrared Dataset; Yuandong Niu (0), Limin Liu (1), Fuyu Huang (2), Juntao Ma (3), Chaowen Zheng (4), Yunfeng Jiang (5), Ting An (6), Zhongchen Zhao (7), Shuangyou Chen (8); Shijiazhuang Campus, Army Engineering University of PLA, Shijiazhuang 050003, China (listed eight times); 77123 Units of PLA, Mianyang 621000, China; https://www.mdpi.com/2072-4292/17/12/2035; infrared binocular; dataset; perspective projection |
title | InfraredStereo3D: Breaking Night Vision Limits with Perspective Projection Positional Encoding and Groundbreaking Infrared Dataset |
topic | infrared binocular; dataset; perspective projection |
url | https://www.mdpi.com/2072-4292/17/12/2035 |
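
The description contrasts positional encodings built on a parallel projection model with one built on the perspective projection model, in which each pixel is associated with a different projection ray. The sketch below illustrates only that geometric distinction and is not the authors' implementation: the pinhole intrinsics (`fx`, `fy`, `cx`, `cy`), the function names, and the sinusoidal encoding with `num_freqs` octaves are all assumptions introduced for illustration.

```python
import math


def pixel_ray(u, v, fx, fy, cx, cy):
    """Unit direction of the ray through pixel (u, v) for a pinhole camera.

    fx, fy, cx, cy are hypothetical intrinsics (focal lengths and principal
    point, in pixels); they are not taken from the record.
    """
    x = (u - cx) / fx
    y = (v - cy) / fy
    norm = math.sqrt(x * x + y * y + 1.0)
    return (x / norm, y / norm, 1.0 / norm)


def parallel_ray(u, v):
    """Under parallel projection every pixel shares the same ray direction."""
    return (0.0, 0.0, 1.0)


def sinusoidal_encoding(values, num_freqs=4):
    """Encode each component with sin/cos at octave-spaced frequencies.

    This is a generic sinusoidal encoding chosen for illustration only.
    """
    encoded = []
    for value in values:
        for k in range(num_freqs):
            freq = (2.0 ** k) * math.pi
            encoded.append(math.sin(freq * value))
            encoded.append(math.cos(freq * value))
    return encoded


def ray_positional_encoding(u, v, fx, fy, cx, cy, num_freqs=4):
    """Positional encoding of a pixel derived from its perspective ray."""
    return sinusoidal_encoding(pixel_ray(u, v, fx, fy, cx, cy), num_freqs)
```

With the perspective model, two pixels in the same image receive different ray directions and therefore different encodings, whereas `parallel_ray` returns the same direction for every pixel. How the published network actually turns rays into encodings is not stated in this record.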