Tao Xie1,2,
Peishan Yang1,
Yudong Jin1,
Yingfeng Cai2,
Wei Yin2,
Weiqiang Ren2,
Qian Zhang2,
Wei Hua3,
Sida Peng1,
Xiaoyang Guo2†,
Xiaowei Zhou1†
News
- [2026-04-17] Inference acceleration is enabled.
- [2026-04-10] The inference code is released.
- [2026-04-10] Scal3R has been selected as a highlight paper for CVPR 2026.
Use the automated installation script:
bash scripts/install.sh
The script creates or reuses a conda environment named scal3r, installs the core dependencies from requirements.txt, and installs Scal3R in editable mode. By default it uses uv pip inside that conda environment, with a plain pip fallback available.
This release currently includes inference only; evaluation and benchmark code are not part of the public package yet.
For detailed installation instructions and PyTorch/CUDA guidance, see docs/install.md.
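After the script finishes, a quick import check can confirm that the environment is usable. The following is a minimal sketch, not part of the Scal3R package: the helper name missing_modules is hypothetical, and it assumes the editable install exposes a top-level module named scal3r (as the python -m scal3r.run command suggests) and that PyTorch is importable as torch. Run it inside the scal3r conda environment.

```python
import importlib.util


def missing_modules(names=("torch", "scal3r")):
    """Return the subset of top-level module names that cannot be found.

    The default names are assumptions about what the install script provides.
    """
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules()
    if missing:
        print("Missing modules:", ", ".join(missing))
    else:
        print("All expected modules are importable.")
```

If anything is reported missing, docs/install.md is the place to look for PyTorch/CUDA guidance.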
Download the required checkpoints to data/checkpoints/:
mkdir -p data/checkpoints
hf download xbillowy/Scal3R scal3r.pt --repo-type model --local-dir data/checkpoints
curl -L https://github.com/serizba/salad/releases/download/v1.0.0/dino_salad.ckpt -o data/checkpoints/dino_salad.ckpt
Run inference on a folder of images:
python -m scal3r.run --input_dir /path/to/images
You can also set an explicit tag or output directory:
python -m scal3r.run \
--input_dir /path/to/images \
--tag demo \
--output_dir data/result/custom/demo
Important arguments:
- --config: model config path. Defaults to configs/models/scal3r.yaml.
- --tag: controls the default output directory name when --output_dir is not set.
- --block_size and --overlap_size: control chunking for long-sequence inference.
- --save_dpt and --save_xyz: control whether depth maps and point clouds are exported.
- --offload_batches and --offload_outputs: control whether to offload batches and outputs to disk.
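When inference is launched from another Python script, the command line can be assembled programmatically. The sketch below is only an illustration: build_scal3r_command and run_scal3r are hypothetical helpers, not part of Scal3R, and the code assumes that --block_size and --overlap_size each take a single value. The export and offload flags are left out because their value format is not described here.

```python
import subprocess
import sys


def build_scal3r_command(input_dir, tag=None, output_dir=None, config=None,
                         block_size=None, overlap_size=None):
    """Assemble the argument list for `python -m scal3r.run`.

    Only options that are explicitly set are added, so the CLI defaults
    apply to everything else.
    """
    cmd = [sys.executable, "-m", "scal3r.run", "--input_dir", str(input_dir)]
    optional = {
        "--tag": tag,
        "--output_dir": output_dir,
        "--config": config,
        "--block_size": block_size,      # assumed to take one value
        "--overlap_size": overlap_size,  # assumed to take one value
    }
    for flag, value in optional.items():
        if value is not None:
            cmd += [flag, str(value)]
    return cmd


def run_scal3r(**kwargs):
    """Run inference in a subprocess and raise if it exits with an error."""
    return subprocess.run(build_scal3r_command(**kwargs), check=True)


if __name__ == "__main__":
    # Print the command instead of running it, so the sketch is safe to try.
    print(" ".join(build_scal3r_command("/path/to/images", tag="demo")))
```

Passing the arguments as a list rather than a single shell string avoids quoting problems when the image folder path contains spaces.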
By default, inference results are written to data/result/custom/<tag>/, and runtime artifacts are written to data/result/custom/<tag>/runtime/. The result directory typically contains:
- mat.txt for the predicted camera poses (camera-to-world transform matrices); each row is a raveled 4x4 matrix
- intri.yml and extri.yml for EasyVolcap-format camera parameters
- depths/ when depth export is enabled
- points/ when point-cloud export is enabled
- runtime/ for runtime artifacts
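The pose file can be read back with a few lines of Python. This is a minimal sketch under stated assumptions: the 16 values on each row of mat.txt are taken to be whitespace-separated and stored in row-major order, and load_poses and camera_centers are hypothetical helper names rather than Scal3R APIs. It uses only the standard library.

```python
from pathlib import Path


def load_poses(path):
    """Read mat.txt into a list of 4x4 camera-to-world matrices (nested lists)."""
    poses = []
    for line in Path(path).read_text().splitlines():
        values = [float(v) for v in line.split()]
        if not values:
            continue  # skip blank lines
        if len(values) != 16:
            raise ValueError(f"expected 16 values per row, got {len(values)}")
        # Assumption: row-major order, so values[4*i:4*i+4] is matrix row i.
        poses.append([values[4 * i:4 * i + 4] for i in range(4)])
    return poses


def camera_centers(poses):
    """Return each camera position in world coordinates.

    For a camera-to-world matrix, the position is the first three
    entries of the last column.
    """
    return [[row[3] for row in pose[:3]] for pose in poses]


if __name__ == "__main__":
    poses = load_poses("data/result/custom/demo/mat.txt")
    print(f"loaded {len(poses)} poses")
    print("first camera center:", camera_centers(poses)[0])
```

With NumPy available, np.loadtxt(path).reshape(-1, 4, 4) gives the same matrices as an array under the same layout assumptions.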
- DONE: Release inference code.
- TODO: Release evaluation code along with dataset preparation scripts.
- TODO: Provide a simple viser viewer for the inference results.
This project builds on and benefits from several excellent open-source works, especially VGGT, VGGT-Long, and LaCT. We thank the authors for making their code and ideas publicly available.
@misc{xie2026scal3rscalabletesttimetraining,
title={Scal3R: Scalable Test-Time Training for Large-Scale 3D Reconstruction},
author={Tao Xie and Peishan Yang and Yudong Jin and Yingfeng Cai and Wei Yin and Weiqiang Ren and Qian Zhang and Wei Hua and Sida Peng and Xiaoyang Guo and Xiaowei Zhou},
year={2026},
eprint={2604.08542},
archivePrefix={arXiv},
primaryClass={cs.CV},
url={https://arxiv.org/abs/2604.08542},
}