🎊 Project Page | 🦉Paper | 🎄 Demo
Rui Qian, Haozhi Cao, Tianchen Deng, Shenghai Yuan, Lihua Xie
SplatSSC formulates a monocular 3D occupancy prediction task and proposes a Gaussian-based framework to accomplish it.

- 2025.12.13: Checkpoints and training information for the depth branch and SplatSSC on the Occ-ScanNet dataset are released!
- 2025.11.08: 🎉 Accepted to AAAI 2026 as oral presentation! Stay tuned for more updates!
- 2025.08.04: SplatSSC is available on arXiv.
Monocular 3D Semantic Scene Completion (SSC) is a challenging yet promising task that aims to infer dense geometric and semantic descriptions of a scene from a single image. While recent object-centric paradigms significantly improve efficiency by leveraging flexible 3D Gaussian primitives, they still rely heavily on a large number of randomly initialized primitives, which inevitably leads to 1) inefficient primitive initialization and 2) outlier primitives that introduce erroneous artifacts. In this paper, we propose SplatSSC, a novel framework that resolves these limitations with a depth-guided initialization strategy and a principled Gaussian aggregator. Instead of random initialization, SplatSSC utilizes a dedicated depth branch composed of a Group-wise Multi-scale Fusion (GMF) module, which integrates multi-scale image and depth features to generate a sparse yet representative set of initial Gaussian primitives. To mitigate noise from outlier primitives, we develop the Decoupled Gaussian Aggregator (DGA), which enhances robustness by decomposing geometric and semantic predictions during the Gaussian-to-voxel splatting process. Complemented with a specialized Probability Scale Loss, our method achieves state-of-the-art performance on the Occ-ScanNet dataset, outperforming prior approaches by a clear margin.
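The decoupled Gaussian-to-voxel splatting idea can be sketched in toy form. Everything below (function name, the isotropic Gaussian weighting, the opacity accumulation) is an illustrative simplification, not the paper's actual DGA implementation: geometry is accumulated from primitive opacities alone, while semantics are aggregated from per-primitive class logits with a separate per-voxel normalization.

```python
import numpy as np

def splat_gaussians_to_voxels(means, scales, opacities, sem_logits, grid_size=16):
    """Toy decoupled splatting: `means` (K, 3) in [0, 1]^3, `scales` (K,),
    `opacities` (K,), `sem_logits` (K, C). Returns occupancy (G, G, G)
    and per-voxel semantic labels (G, G, G)."""
    # Voxel centers on a regular grid in [0, 1]^3
    lin = (np.arange(grid_size) + 0.5) / grid_size
    gx, gy, gz = np.meshgrid(lin, lin, lin, indexing="ij")
    centers = np.stack([gx, gy, gz], axis=-1).reshape(-1, 3)          # (V, 3)

    # Isotropic Gaussian response of every primitive at every voxel center
    d2 = ((centers[:, None, :] - means[None, :, :]) ** 2).sum(-1)     # (V, K)
    w = opacities[None, :] * np.exp(-0.5 * d2 / scales[None, :] ** 2)

    # Geometry head: alpha-style opacity accumulation, independent of semantics
    occ = 1.0 - np.prod(1.0 - np.clip(w, 0.0, 0.999), axis=1)         # (V,)

    # Semantics head: responses weight the class logits, normalized per voxel
    # so that a single outlier primitive cannot dominate a voxel's label
    sem = (w[:, :, None] * sem_logits[None, :, :]).sum(axis=1)
    sem = sem / (w.sum(axis=1, keepdims=True) + 1e-8)                 # (V, C)

    shape = (grid_size,) * 3
    return occ.reshape(shape), sem.argmax(-1).reshape(shape)
```

The point of the decoupling is visible in the two heads: a noisy primitive with low opacity barely moves the occupancy product, and the per-voxel normalization keeps it from skewing the semantic vote.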
- Local Environment: Follow instructions HERE to prepare the local environment.
- Docker Environment: Follow instructions HERE to prepare the docker environment.
Prepare posed_images and gathered_data following the Occ-ScanNet dataset and move them to data/occscannet. The folder structure should look like this:
```
SplatSSC
├── results/
│   ├── occscannet/
│   │   ├── ...
├── vis/
│   ├── occscannet/
│   │   ├── ...
├── data/
│   ├── occscannet/
│   │   ├── gathered_data/
│   │   ├── posed_images/
│   │   ├── train_final.txt
│   │   ├── train_mini_final.txt
│   │   ├── test_final.txt
│   │   ├── test_mini_final.txt
```

| Dataset | Depth Model | Stage | Epoch | IoU | mIoU | Config | Log | Weight |
|---|---|---|---|---|---|---|---|---|
| Occ-ScanNet | DaV2 | 1 | 10 | - | - | - | - | weight |
| Occ-ScanNet | FT-DaV2 | 1 | 10 | - | - | - | - | weight |
| Occ-ScanNet | FT-DaV2 | 2 | 10 | 62.83 | 51.83 | config | log | weight |
| Occ-ScanNet-mini | FT-DaV2 | 2 | 20 | 61.47 | 48.87 | config | log | weight |
- Pre-train our depth branch using 2 GPUs on Occ-ScanNet:

```shell
cd SplatSSC
bash scripts/fine_tune_mono.sh
```

- Train SplatSSC using 4 GPUs on Occ-ScanNet and Occ-ScanNet-mini:

```shell
cd SplatSSC
# mini
bash scripts/train_mono_mini.sh
# base
bash scripts/train_mono.sh
```
- Test our depth branch using 1 GPU on Occ-ScanNet:

```shell
cd SplatSSC
bash scripts/test_fine_tuned_mono.sh
```

- Test SplatSSC using 1 GPU on Occ-ScanNet and Occ-ScanNet-mini:

```shell
cd SplatSSC
# mini
bash scripts/test_mono_mini.sh
# base
bash scripts/test_mono.sh
```
We provide a comprehensive toolkit for further visualization: visualization repository.
Features:
- 🥪 3D voxel rendering with rotating animations
- 🍲 Training loss/mIoU curves and efficiency analysis
- 🌮 Gaussian splatting visualization (Mitsuba & Matplotlib)
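As a taste of the voxel-rendering feature, here is a minimal standalone sketch using matplotlib's `ax.voxels`; the function name and styling are our own illustration, not the toolkit's API. Sweeping `azim` over a range of angles yields frames for a rotating animation:

```python
import matplotlib
matplotlib.use("Agg")  # headless backend, render straight to file
import matplotlib.pyplot as plt
import numpy as np

def render_voxels(occ, path="voxels.png", azim=30):
    """Render a boolean occupancy grid `occ` (X, Y, Z) to an image file."""
    fig = plt.figure(figsize=(4, 4))
    ax = fig.add_subplot(projection="3d")
    ax.voxels(occ, facecolors="#6699cc", edgecolor="k", linewidth=0.2)
    ax.view_init(elev=25, azim=azim)
    ax.set_axis_off()
    fig.savefig(path, dpi=120)
    plt.close(fig)
```

For real renders of SplatSSC outputs, prefer the visualization repository above; this sketch only shows the underlying matplotlib call.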
- Our work is inspired by these excellent open-sourced repos: EmbodiedOcc, GaussianFormer, ISO.
- Our code is based on EmbodiedOcc and GaussianFormer.
All our original source code is licensed under the CC BY-NC-SA 4.0 license. This permits any non-commercial use, distribution, and reproduction in any medium, provided the original work is properly cited and any derivative works are shared under the same license.
If you find this project helpful, please consider citing the following paper:
```bibtex
@article{qian2025splatssc,
  title={SplatSSC: Decoupled Depth-Guided Gaussian Splatting for Semantic Scene Completion},
  author={Qian, Rui and Cao, Haozhi and Deng, Tianchen and Yuan, Shenghai and Xie, Lihua},
  journal={arXiv preprint arXiv:2508.02261},
  year={2025}
}
```
