This repository is forked from Kai-46/SatelliteSfM. Special thanks to Kai Zhang for the original work!
It provides tools for preparing datasets for Skyfall-GS from two sources:
- COLMAP reconstructions — Convert existing COLMAP sparse reconstructions
- Satellite imagery — Process satellite images with RPC camera models using SatelliteSfM
If you already have reconstruction results from COLMAP, convert them with:
```bash
python convert_colmap_datasets.py -s /path/to/colmap_output [--skip_calibration]
```

This script reads the COLMAP sparse model at `<dataset>/sparse/0` and generates `transforms_train.json` and `transforms_test.json`, rotating and centering cameras so that the look-at point is at the scene origin with up vector (0, 0, 1).
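The recentering idea can be sketched as follows — a minimal illustration with a hypothetical `recenter` helper (the script's actual implementation may differ):

```python
import numpy as np

def recenter(c2w_list, lookat, up):
    """Rotate/translate the world so `lookat` sits at the origin and `up` maps to +Z."""
    z = np.asarray(up, dtype=float)
    z = z / np.linalg.norm(z)
    # build an orthonormal basis whose third row is the scene up vector
    x = np.cross([0.0, 1.0, 0.0], z)
    if np.linalg.norm(x) < 1e-6:          # up was (anti)parallel to +Y
        x = np.cross([1.0, 0.0, 0.0], z)
    x = x / np.linalg.norm(x)
    y = np.cross(z, x)
    R = np.stack([x, y, z])               # world -> recentered-world rotation
    T = np.eye(4)
    T[:3, :3] = R
    T[:3, 3] = -R @ np.asarray(lookat, dtype=float)  # move look-at point to origin
    return [T @ c2w for c2w in c2w_list]
```

Applying `T` to every camera-to-world matrix leaves relative poses untouched while placing the scene in the canonical frame Skyfall-GS expects.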
Dataset ready! After conversion, your dataset can be used directly with Skyfall-GS.
The RPC (Rational Polynomial Camera) model is a standard camera model for satellite imagery. Unlike pinhole cameras, it handles orbital motion, Earth curvature, and atmospheric distortion using rational polynomials to map between (Lat, Lon, Alt) world coordinates and (Row, Col) pixel coordinates:
```
Row = P1(X, Y, Z) / P2(X, Y, Z)
Col = P3(X, Y, Z) / P4(X, Y, Z)
```
See the RPC specification for details.
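The rational form above can be illustrated with a toy evaluation. This is a sketch only — the dict keys, coefficient layout, and monomial ordering here are illustrative assumptions, not the on-disk format of any real RPC file:

```python
import numpy as np

def monomials(X, Y, Z):
    # the 20 cubic monomial terms of a standard RPC (orderings vary between formats)
    return np.array([
        1, X, Y, Z, X*Y, X*Z, Y*Z, X*X, Y*Y, Z*Z,
        X*Y*Z, X**3, X*Y*Y, X*Z*Z, X*X*Y, Y**3, Y*Z*Z, X*X*Z, Y*Y*Z, Z**3,
    ])

def rpc_project(lat, lon, alt, coeffs, off, scale):
    """Rational-polynomial projection (lat, lon, alt) -> (row, col)."""
    # RPC polynomials operate on normalized coordinates
    X = (lat - off["lat"]) / scale["lat"]
    Y = (lon - off["lon"]) / scale["lon"]
    Z = (alt - off["alt"]) / scale["alt"]
    m = monomials(X, Y, Z)
    row = (coeffs["P1"] @ m) / (coeffs["P2"] @ m)
    col = (coeffs["P3"] @ m) / (coeffs["P4"] @ m)
    # denormalize back to pixel units
    return row * scale["row"] + off["row"], col * scale["col"] + off["col"]
```

Each of `P1`…`P4` is just a vector of 20 coefficients dotted with the cubic monomials, which is what lets RPCs absorb orbital and atmospheric effects that a pinhole model cannot.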
Handles all dependencies (ColmapForVisSat compiled from source, CUDA, GDAL, Python packages) without requiring a specific OS or compiler.
Requirements: Docker with NVIDIA Container Toolkit
Pull the pre-built image from GHCR (fastest — skips ~30–45 min compilation):
```bash
docker compose pull
```

Or build the image locally (required if you modify the Dockerfile):
Pre-download the Miniconda installer first (network is unavailable during Docker build):
```bash
wget https://repo.anaconda.com/miniconda/Miniconda3-py38_4.12.0-Linux-x86_64.sh -O Miniconda3-installer.sh
docker compose build  # ~30–45 min, compiles ColmapForVisSat from source
```

Run interactively:

```bash
docker compose run satellitesfm
```

Requires Ubuntu 18.04 (for GCC-7), at least one GPU, and conda installed.

```bash
. ./env.sh
```

The pipeline uses the DFC2019 Track 3: Multi-View Semantic Stereo benchmark — multi-view satellite imagery of Jacksonville (JAX) and Omaha (OMA).
Download these three archives and extract into data/:
| Archive | Size |
|---|---|
| Track 3 / Training data / RGB images 1/2 | 7.6 GB |
| Track 3 / Training data / RGB images 2/2 | 7.6 GB |
| Track 3 / Training data / Reference | 37.5 MB |
Expected structure:
```
data/
├── Track3-RGB-1/*.tif
├── Track3-RGB-2/*.tif
└── Track3-Truth/[*.tif, *.txt]
```
A preprocessed version is also available on Google Drive.
Convert .tif images to PNG and extract RPC metadata:
```bash
# Jacksonville (JAX)
python preprocess_track3/preprocess_track3.py \
    --base_view_dir data/Track3-RGB-1 \
    --base_dsm_dir data/Track3-Truth \
    --out_dir data/DFC2019_JAX_preprocessed

# Omaha (OMA)
python preprocess_track3/preprocess_track3.py \
    --base_view_dir data/Track3-RGB-2 \
    --base_dsm_dir data/Track3-Truth \
    --out_dir data/DFC2019_OMA_preprocessed
```

Output:
```
data/DFC2019_[JAX|OMA]_preprocessed/
├── cameras/          # Camera parameters
├── images/           # Converted PNG images
├── metas/            # RPC coefficients and metadata (JSON)
├── enu_bbx/          # ENU coordinate bounding boxes
├── enu_observers/    # Observer positions in ENU coordinates
├── latlonalt_bbx/    # Lat/Lon/Alt bounding boxes
└── groundtruth_u/    # Ground truth data
```
Organize images by scene ID:
```bash
# Basic usage
python prepare_input.py --scene_id 004              # JAX scene 004
python prepare_input.py --scene_id 068 --city OMA   # OMA scene 068

# Use symlinks to save disk space
python prepare_input.py --scene_id 004 --symlink
```

Output:
```
data/DFC2019_processed/{CITY}_{SCENE_ID}/inputs/
├── images/
│   └── {CITY}_{SCENE_ID}_*_RGB.tif
└── latlonalt_bbx.json
```
Perform structure-from-motion reconstruction on the satellite imagery:
```bash
# With Docker
docker compose run satellitesfm python satellite_sfm.py \
    --input_folder data/DFC2019_processed/JAX_004/inputs \
    --output_folder data/DFC2019_processed/JAX_004/outputs_srtm \
    --run_sfm \
    --use_srtm4 \
    --enable_debug

# Without Docker
python satellite_sfm.py \
    --input_folder data/DFC2019_processed/JAX_004/inputs \
    --output_folder data/DFC2019_processed/JAX_004/outputs_srtm \
    --run_sfm \
    --use_srtm4 \
    --enable_debug
```

| Flag | Description |
|---|---|
| `--run_sfm` | Enable SfM reconstruction |
| `--use_srtm4` | Fetch SRTM4 elevation data for altitude initialization |
| `--enable_debug` | Output visualizations to `outputs_srtm/debug_sfm/` |
Output:
```
data/DFC2019_processed/{CITY}_{SCENE_ID}/outputs_srtm/
├── colmap_triangulate_postba/
│   ├── cameras.bin
│   ├── images.bin
│   └── points3D.txt
├── cameras_adjusted/       # Bundle-adjusted pinhole cameras (K, W2C matrices)
├── images/                 # PNG images
└── enu_bbx_adjusted.json
```
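The adjusted cameras are plain pinhole models, so a world point maps to pixels via the usual x ~ K[R|t]X. A minimal sketch (the on-disk file layout of `cameras_adjusted/` is not assumed here):

```python
import numpy as np

def project(pt_world, K, W2C):
    """Project a 3D world point with intrinsics K and a 4x4 world-to-camera matrix."""
    p_cam = W2C[:3, :3] @ pt_world + W2C[:3, 3]   # world -> camera frame
    uv = K @ p_cam                                 # camera -> homogeneous pixel
    return uv[:2] / uv[2]                          # perspective divide

K = np.array([[1000.0, 0.0, 512.0],
              [0.0, 1000.0, 512.0],
              [0.0, 0.0, 1.0]])
W2C = np.eye(4)
print(project(np.array([0.0, 0.0, 10.0]), K, W2C))  # optical-axis point -> principal point
```

This pinhole approximation of the RPC model is exactly what lets downstream tools like COLMAP and 3DGS consume satellite imagery.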
Applies skew correction, converts to Skyfall-GS format, and generates masks:
```bash
chmod +x postprocess_scenes.sh

# Single scene
./postprocess_scenes.sh JAX_004

# Multiple scenes
./postprocess_scenes.sh JAX_004 OMA_068 JAX_214
```

Steps performed for each scene:
- Skew correction — corrects geometric distortion using SRTM elevation data
- Format conversion — converts to Skyfall-GS format (`transforms_train.json`, `transforms_test.json`)
- Copy 3D points — copies `points3D.txt` from SfM output
- Mask generation — creates binary masks for valid (non-black) pixels
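The mask step amounts to flagging non-black pixels (black borders appear after warping). A sketch of the idea, not the script's exact implementation:

```python
import numpy as np

def valid_pixel_mask(img: np.ndarray) -> np.ndarray:
    """True wherever an (H, W, C) image is not pure black."""
    return img.any(axis=-1)

# toy image: left half black (invalid), right half gray (valid)
img = np.zeros((4, 8, 3), dtype=np.uint8)
img[:, 4:] = 128
mask = valid_pixel_mask(img)
# the output tree stores masks as both .npy and .png;
# a PNG-ready copy would be (mask * 255).astype(np.uint8)
```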
Advanced options:
```bash
./postprocess_scenes.sh --skip-skew JAX_004      # Skip skew correction
./postprocess_scenes.sh --skip-convert JAX_004   # Skip format conversion
./postprocess_scenes.sh --skip-copy JAX_004      # Skip points3D.txt copy
./postprocess_scenes.sh --skip-mask JAX_004      # Skip mask generation
./postprocess_scenes.sh --dry-run JAX_004        # Preview without executing
./postprocess_scenes.sh --base-dir /custom JAX_004
```

Final output:
```
data/DFC2019_processed/{CITY}_{SCENE_ID}/outputs_skew/
├── cameras/
├── images/*.png
├── masks/
│   ├── *.npy
│   └── *.png
├── transforms_train.json
├── transforms_test.json
└── points3D.txt
```
Dataset ready! This output can be used directly with Skyfall-GS.
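Once converted, the transforms files can be loaded like any NeRF-style dataset. A reader sketch — the `frames` / `file_path` / `transform_matrix` field names are assumed from the common NeRF transforms convention, so check them against your actual output:

```python
import json
import numpy as np

def load_transforms(path):
    """Load camera-to-world matrices from a NeRF-style transforms JSON."""
    with open(path) as f:
        meta = json.load(f)
    return {fr["file_path"]: np.array(fr["transform_matrix"])
            for fr in meta["frames"]}
```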
End-to-end processing of JAX scene 004 (single scene):
```bash
# 0. Pull Docker image (one-time)
docker compose pull

# 1. Download and extract DFC2019 (manual step)
#    → data/Track3-RGB-1/, data/Track3-RGB-2/, data/Track3-Truth/

# 2. Initial preprocessing (runs outside Docker; no GPU needed)
python preprocess_track3/preprocess_track3.py \
    --base_view_dir data/Track3-RGB-1 \
    --base_dsm_dir data/Track3-Truth \
    --out_dir data/DFC2019_JAX_preprocessed

# 3. Prepare scene inputs
python prepare_input.py --scene_id 004 --city JAX
# Output: data/DFC2019_processed/JAX_004/inputs/

# [Optional] Merge multiple scenes into a shared coordinate system
# python merge_scenes.py --scenes 004 005 --city JAX --output_name JAX_merged

# 4. Run SatelliteSfM (requires GPU — use Docker)
docker compose run satellitesfm python satellite_sfm.py \
    --input_folder data/DFC2019_processed/JAX_004/inputs \
    --output_folder data/DFC2019_processed/JAX_004/outputs_srtm \
    --run_sfm --use_srtm4 --enable_debug
# Output: data/DFC2019_processed/JAX_004/outputs_srtm/cameras_adjusted/

# 5. Post-process (skew correction → format conversion → masks)
chmod +x postprocess_scenes.sh
./postprocess_scenes.sh JAX_004
# Output: data/DFC2019_processed/JAX_004/outputs_skew/

# 6. Dataset is ready for Skyfall-GS
ls data/DFC2019_processed/JAX_004/outputs_skew/
# cameras/ images/ masks/ transforms_train.json transforms_test.json points3D.txt
```

Batch processing of multiple scenes:

```bash
# 0. Pull Docker image (one-time)
docker compose pull

scenes=(JAX_004 JAX_068 JAX_214 OMA_001 OMA_175)

# 1. Preprocess JAX images (Track3-RGB-1) and OMA images (Track3-RGB-2)
python preprocess_track3/preprocess_track3.py \
    --base_view_dir data/Track3-RGB-1 \
    --base_dsm_dir data/Track3-Truth \
    --out_dir data/DFC2019_JAX_preprocessed
python preprocess_track3/preprocess_track3.py \
    --base_view_dir data/Track3-RGB-2 \
    --base_dsm_dir data/Track3-Truth \
    --out_dir data/DFC2019_OMA_preprocessed

# 2. Prepare all scene inputs (use --symlink to save disk space)
for scene in "${scenes[@]}"; do
    city=$(echo "$scene" | cut -d'_' -f1)
    scene_id=$(echo "$scene" | cut -d'_' -f2)
    python prepare_input.py --scene_id "$scene_id" --city "$city" --symlink
done

# 3. Run SatelliteSfM for all scenes
for scene in "${scenes[@]}"; do
    docker compose run satellitesfm python satellite_sfm.py \
        --input_folder data/DFC2019_processed/${scene}/inputs \
        --output_folder data/DFC2019_processed/${scene}/outputs_srtm \
        --run_sfm --use_srtm4
done

# 4. Post-process all at once
chmod +x postprocess_scenes.sh
./postprocess_scenes.sh "${scenes[@]}"
```

**"No images found matching pattern"**
- Scene ID must be zero-padded (e.g., `"004"`, not `"4"`)
- Verify that `.tif` files exist in the Track3-RGB directory

**"Input folder does not exist"**

- Run `prepare_input.py` before `satellite_sfm.py`
- Check that the scene ID and city code are correct
**SatelliteSfM fails with SRTM errors**
- SRTM data is downloaded at runtime — ensure internet access inside the container
- Fall back to manual altitude: edit `latlonalt_bbx.json` and remove `--use_srtm4`
**COLMAP binary fails to load (`libXXX.so` not found)**
- Should not happen with the Docker image; if it does, run the following and add the missing package to the runtime apt stage in `Dockerfile`:

```bash
docker run --rm -e LD_LIBRARY_PATH=/app/preprocess_sfm/ColmapForVisSat/build/__install__/lib \
    ghcr.io/jayin92/satellitesfm:latest \
    ldd /app/preprocess_sfm/ColmapForVisSat/build/__install__/bin/colmap | grep "not found"
```
**No module named '_gdal_array'**

- This means GDAL was compiled before numpy. Rebuild the image — the current `Dockerfile` installs numpy first.
**GPU not detected in container**
- Ensure NVIDIA Container Toolkit is installed and configured
- Test with: `docker run --rm --gpus all nvidia/cuda:11.8.0-base-ubuntu18.04 nvidia-smi`
This repo is built on top of Kai-46/SatelliteSfM, which bridges the gap between computer vision SfM pipelines and remote sensing satellite imagery. The key insight is that satellite images can be processed through standard SfM pipelines if the RPC camera model is first converted to a pinhole model — enabling downstream use of tools like COLMAP, NeRF, and 3DGS.
- SatelliteSfM
- SatelliteNeRF
- Camera visualization scripts
- ColmapForVisSatPatched (patches for latest COLMAP)
- SatelliteNeuS
- SatellitePlaneSweep / SatelliteNeRF / SatelliteNeuS documentation
- Deep Satellite Stereo
- SatelliteNeRF — neural radiance fields for satellite imagery
- SatelliteNeuS — mesh reconstruction from multi-date satellite images
- ColmapForVisSatPatched — updated COLMAP fork
Satellite cameras are far from the scene, producing huge depth values. When using float32 GPU computing (NeRF, MVS, deep stereo), precision loss can cause artifacts. Use normalize_sfm_reconstruction.py to center and scale the scene, then adjust pixel2ray to use float64 internally:
```python
import torch

def pixel2ray(col, row, K, W2C):
    """Cast per-pixel rays, doing the matrix math in float64 to avoid
    precision loss at satellite-scale distances."""
    C2W = torch.inverse(W2C.double())
    # homogeneous pixel coordinates, shape (N, 3, 1)
    px = torch.stack((col, row, torch.ones_like(col)), dim=-1).unsqueeze(-1).double()
    ray_d = torch.matmul(C2W[:3, :3], torch.matmul(torch.inverse(K[:3, :3].double()), px))
    ray_d = (ray_d / ray_d.norm(dim=1, keepdim=True)).squeeze(-1)
    ray_o = C2W[:3, 3].unsqueeze(0).expand(ray_d.shape[0], -1)
    # slide the origins along the rays so they sit near the normalized scene
    shift = torch.norm(ray_o, dim=-1) - 5.
    ray_o = ray_o + ray_d * shift.unsqueeze(-1)
    return ray_o.float(), ray_d.float()
```

Input images
Sparse point cloud from SfM
Camera visualization (python visualize_satellite_cameras.py)
Red/Green/Blue axes = East/North/Up. Each camera is a line from origin to camera center.
If you use this pipeline, please cite:
```bibtex
@inproceedings{VisSat-2019,
  title     = {Leveraging Vision Reconstruction Pipelines for Satellite Imagery},
  author    = {Zhang, Kai and Sun, Jin and Snavely, Noah},
  booktitle = {IEEE International Conference on Computer Vision Workshops},
  year      = {2019}
}

@inproceedings{schoenberger2016sfm,
  author    = {Sch\"{o}nberger, Johannes Lutz and Frahm, Jan-Michael},
  title     = {Structure-from-Motion Revisited},
  booktitle = {Conference on Computer Vision and Pattern Recognition (CVPR)},
  year      = {2016},
}

@data{c6tm-vw12-19,
  doi       = {10.21227/c6tm-vw12},
  url       = {https://dx.doi.org/10.21227/c6tm-vw12},
  author    = {Le Saux, Bertrand and Yokoya, Naoto and Hänsch, Ronny and Brown, Myron},
  publisher = {IEEE Dataport},
  title     = {Data Fusion Contest 2019 ({DFC2019})},
  year      = {2019},
}

@article{lee2025SkyfallGS,
  title         = {{Skyfall-GS}: Synthesizing Immersive {3D} Urban Scenes from Satellite Imagery},
  author        = {Jie-Ying Lee and Yi-Ruei Liu and Shr-Ruei Tsai and Wei-Cheng Chang and Chung-Ho Wu and Jiewen Chan and Zhenjun Zhao and Chieh Hubert Lin and Yu-Lun Liu},
  journal       = {arXiv preprint},
  year          = {2025},
  eprint        = {2510.15869},
  archivePrefix = {arXiv}
}
```

