InfraredStereo3D: Breaking Night Vision Limits with Perspective Projection Positional Encoding and Groundbreaking Infrared Dataset

Bibliographic Details
Main Authors: Yuandong Niu, Limin Liu, Fuyu Huang, Juntao Ma, Chaowen Zheng, Yunfeng Jiang, Ting An, Zhongchen Zhao, Shuangyou Chen
Format: Article
Language: English
Published: MDPI AG 2025-06-01
Series: Remote Sensing
Subjects: infrared binocular, dataset, perspective projection
Online Access: https://www.mdpi.com/2072-4292/17/12/2035
_version_ 1839652809328295936
author Yuandong Niu
Limin Liu
Fuyu Huang
Juntao Ma
Chaowen Zheng
Yunfeng Jiang
Ting An
Zhongchen Zhao
Shuangyou Chen
author_sort Yuandong Niu
collection DOAJ
description In fields such as military reconnaissance, forest fire prevention, and nighttime autonomous driving, there is an urgent need for high-precision three-dimensional reconstruction in low-light and night environments. RGB cameras depend on external light to acquire remote sensing data, so image quality declines significantly in the dark and falls short of task requirements. Lidar-based methods perform poorly in rainy and foggy weather, in close-range scenes, and in scenarios that require thermal imaging data. Infrared cameras can overcome these limitations because their imaging mechanism differs from that of RGB cameras and lidar. However, research on three-dimensional scene reconstruction from infrared images is relatively immature, especially for infrared binocular stereo matching. Two main challenges follow from this situation: first, there is a lack of datasets built specifically for infrared binocular stereo matching; second, the lack of texture information in infrared images limits how well RGB methods extend to the infrared reconstruction problem. To address these problems, this study first constructs an infrared binocular stereo matching dataset and then proposes a transformer method based on perspective projection positional encoding to perform infrared binocular stereo matching. The stereo matching network combines a transformer with a cost volume. Existing transformer positional encodings usually assume a parallel projection model to simplify computation. Our method instead follows the actual perspective projection model, so that each pixel is associated with a different projection ray. It effectively addresses the feature extraction and matching difficulties caused by insufficient texture information in infrared images and significantly improves matching accuracy. Experiments on the infrared binocular stereo matching dataset proposed in this paper demonstrate the effectiveness of the method.
format Article
id doaj-art-ff6040e27b7a4724bccb2129a51a6c1d
institution Matheson Library
issn 2072-4292
language English
publishDate 2025-06-01
publisher MDPI AG
record_format Article
series Remote Sensing
spelling doaj-art-ff6040e27b7a4724bccb2129a51a6c1d
2025-06-25T14:23:39Z
eng
MDPI AG
Remote Sensing
2072-4292
2025-06-01
Volume 17, Issue 12, Article 2035
10.3390/rs17122035
InfraredStereo3D: Breaking Night Vision Limits with Perspective Projection Positional Encoding and Groundbreaking Infrared Dataset
Yuandong Niu (0), Limin Liu (1), Fuyu Huang (2), Juntao Ma (3), Chaowen Zheng (4), Yunfeng Jiang (5), Ting An (6), Zhongchen Zhao (7), Shuangyou Chen (8)
(0-7) Shijiazhuang Campus, Army Engineering University of PLA, Shijiazhuang 050003, China
(8) 77123 Units of PLA, Mianyang 621000, China
https://www.mdpi.com/2072-4292/17/12/2035
infrared binocular
dataset
perspective projection
title InfraredStereo3D: Breaking Night Vision Limits with Perspective Projection Positional Encoding and Groundbreaking Infrared Dataset
topic infrared binocular
dataset
perspective projection
url https://www.mdpi.com/2072-4292/17/12/2035