3D reconstruction for unconstrained image collections using Gaussian Splatting with foundation model


Bibliographic Details
Main Authors: Shuowen Huang, Qingwu Hu, Pengcheng Zhao, Mingyao Ai, Xujie Zhang, Wenwu Ou, Linze Li
Format: Article
Language:English
Published: Taylor & Francis Group 2025-07-01
Series:Geo-spatial Information Science
Subjects:
Online Access:https://www.tandfonline.com/doi/10.1080/10095020.2025.2532586
Description
Summary:Achieving high-quality rendering with Gaussian Splatting for unconstrained image collections is critical for advancing 3D reconstruction. Current methods that use a single multi-layer perceptron and convolutional neural network to predict distractors in input images often suffer from errors and are insufficient in rendering detail for appearance modeling. In this study, we propose the U3GS framework for Gaussian Splatting under unconstrained image collections. Specifically, we develop a multi-scale tri-plane feature encoding module that integrates scene features across multiple scales and predicts Gaussian attributes. Furthermore, we model image appearance by combining global and local appearance embeddings, allowing the framework to adapt to illumination variations within the images. Finally, we design an uncertainty estimation module based on a visual foundation model to predict distractors in input images, and apply an uncertainty-guided loss to ensure reliable target rendering within the scene. We evaluate our framework on the NeRF On-the-go and Phototourism datasets, demonstrating its effectiveness in distractor removal within highly occluded environments and achieving high-quality rendering across images with diverse appearances.
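The abstract mentions an uncertainty-guided loss that down-weights pixels flagged as distractors so they do not corrupt scene optimization. The record does not give the paper's formulation, so the following is only a hypothetical NumPy sketch of the general idea: a per-pixel photometric error weighted by (1 − uncertainty), plus a small regularizer that discourages marking every pixel as a distractor. The function name, weighting, and regularizer coefficient are all illustrative assumptions, not the authors' actual loss.

```python
import numpy as np

def uncertainty_guided_loss(rendered, target, uncertainty, eps=1e-6):
    """Hypothetical uncertainty-guided photometric loss (NOT the U3GS loss).

    rendered, target: arrays of the same shape (rendered vs. ground-truth pixels).
    uncertainty: per-pixel distractor probability in [0, 1]; high values mark
    pixels believed to belong to transient objects.
    """
    residual = np.abs(rendered - target)      # per-pixel L1 photometric error
    weights = 1.0 - uncertainty               # trust likely-static pixels more
    data_term = (weights * residual).mean()   # distractor pixels contribute less
    # Regularizer: penalize declaring everything a distractor (weights -> 0),
    # otherwise the trivial solution would ignore all pixels.
    reg_term = 0.01 * (-np.log(weights + eps)).mean()
    return data_term + reg_term
```

With zero uncertainty the loss reduces to plain mean L1 error; raising the uncertainty on mismatched pixels shrinks the data term, which is the intended behavior when those mismatches come from transient occluders rather than the static scene.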
ISSN:1009-5020
1993-5153