
SEAR: Simple and Efficient Adaptation of Visual Geometric Transformers for RGB+Thermal 3D Reconstruction

This project aims to jointly estimate the camera poses of RGB and thermal images.

Hugging Face | arXiv

Install

Clone this repository, then clone VGGT inside it:

git clone https://github.com/Schindler-EPFL-Lab/SEAR.git
cd SEAR
git clone https://github.com/facebookresearch/vggt.git

Install with uv:

uv sync --all-extras

Train the model

Download the VGGT-1B checkpoint.

To train our model, run this script:

python sear/scripts/train_sear.py --thermal-vggt.vggt-path /path/to/vggt/weights.pth
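A missing or mistyped checkpoint path is easy to catch before launching training. The sketch below is a hypothetical helper, not part of this repository: `build_train_command` and `run_training` are illustrative names, and the only things taken from this README are the script path and the `--thermal-vggt.vggt-path` flag. It assumes you run it from the root of the SEAR checkout.

```python
import subprocess
import sys
from pathlib import Path

# Script path and flag are copied from the command in this README.
TRAIN_SCRIPT = "sear/scripts/train_sear.py"
VGGT_PATH_FLAG = "--thermal-vggt.vggt-path"


def build_train_command(vggt_weights: str) -> list[str]:
    """Return the training command, failing early if the checkpoint is missing.

    Hypothetical helper: it only wraps the documented command line.
    """
    weights = Path(vggt_weights).expanduser()
    if not weights.is_file():
        raise FileNotFoundError(f"VGGT checkpoint not found: {weights}")
    # Use the current interpreter so the active environment is reused.
    return [sys.executable, TRAIN_SCRIPT, VGGT_PATH_FLAG, str(weights)]


def run_training(vggt_weights: str) -> None:
    """Launch training as a subprocess and raise if it exits with an error."""
    subprocess.run(build_train_command(vggt_weights), check=True)
```

Calling `run_training("/path/to/vggt/weights.pth")` is then equivalent to the command above, with an explicit error if the file does not exist.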

Ablation studies can be run by using the other aggregator types found in sear/ablation_models/possible_aggregators.py.

Models can be evaluated after training with sear/scripts/eval/ablation_vggt.py.

To run the evaluation, see the tutorials for camera pose and point cloud, relative camera pose from two views, and dependence on thermal ratio.

Training Data

Our training dataset is a combination of the following datasets:

We provide a compilation of all the training datasets, including our own.

See the Dataset documentation for details of the data processing.

Cite us

@misc{skorokhodov2026searsimpleefficientadaptation,
      title={SEAR: Simple and Efficient Adaptation of Visual Geometric Transformers for RGB+Thermal 3D Reconstruction},
      author={Vsevolod Skorokhodov and Chenghao Xu and Shuo Sun and Olga Fink and Malcolm Mielle},
      year={2026},
      eprint={2603.18774},
      archivePrefix={arXiv},
      primaryClass={cs.CV},
      url={https://arxiv.org/abs/2603.18774},
}