FaceSplat: 3D Face Reconstruction via Gaussian Splatting
FaceSplat is a tool designed to reconstruct 3D heads from ordinary phone or action cam video using Gaussian splatting. By combining classical computer vision (COLMAP, Haar cascades) with modern differentiable rendering (gsplat), we produce head-only 3D models from just a few seconds of footage, all running locally on a single GPU.
Our Goal: Turn a short video of someone's face into a clean, isolated 3D Gaussian splat of their head, without expensive scanners, photogrammetry rigs, or cloud services, for medical and cultural applications.
## The Problem
Most Gaussian splatting pipelines reconstruct entire scenes. If you want just the head, you're left manually cropping point clouds or training on pre-segmented images. Neither works well. Scene-level splats include floors, walls, and shoulders that are hard to remove after the fact, and pre-cropping frames throws away the background features that COLMAP needs for reliable camera pose estimation.
FaceSplat solves this with a train-then-isolate approach:
- Full-frame SfM: Run COLMAP on complete frames so it has plenty of features to work with
- Masked training: Only supervise the head region during Gaussian splat optimization
- Multi-view isolation: After training, keep only the Gaussians that consistently project into the head mask across all camera views

## Features

- Video to 3D: Drop in an mp4 and get a PLY file out the other end
- Head-only output: Automatic face detection, mask generation, and Gaussian isolation strip away everything that isn't the head
- Masked loss training: L1 + SSIM loss applied only to head pixels, with a background alpha penalty to suppress stray Gaussians
- Adaptive densification: gsplat's DefaultStrategy grows and prunes Gaussians during training based on gradient magnitude
- Robust face detection: Dual Haar cascades (frontal + profile) with interpolation across missed frames
- Soft elliptical masks: Gaussian-blurred ellipses sized to cover head and neck, with configurable expansion
- Browser viewer: React + Three.js + gaussian-splats-3d for interactive 3D viewing, no install required

## How it works

The system runs through six stages:
1. Frame extraction: Sample frames from the video, resize to target height, force even dimensions for COLMAP compatibility
2. Head mask generation:
   - Haar cascade detection (frontal + profile) finds the largest face per frame
   - Interpolate position and size across frames where detection fails
   - Generate soft elliptical masks with Gaussian blur for smooth edges
3. COLMAP SfM: Feature extraction, exhaustive matching, and incremental mapping via pycolmap. Outputs camera poses, intrinsics, and a sparse 3D point cloud
4. Gaussian splat training:
   - Initialize Gaussians from COLMAP's sparse points (positions, colors, neighbor-based scales)
   - Train with masked L1 + SSIM loss (0.8/0.2 weighting) plus background alpha penalty
   - Adaptive densification refines every 100 iterations, resets opacity every 500
   - SH degree 1 for view-dependent color
5. Head isolation: Re-render all views and check what fraction of each Gaussian's visible appearances falls inside the head mask. Keep those above 30% with opacity above 0.05
6. PLY export: Standard 3DGS format compatible with existing viewers

## Steps to use
Run the pipeline:
```bash
python face_splat.py --video input.mp4 --output head.ply
```

Optional flags:
```
--num-frames 40     Number of frames to sample (default: 40)
--height 540        Resize height in pixels (default: 540)
--num-iters 3000    Training iterations (default: 3000)
--device cuda       Device for training (default: cuda)
```

View the result:
```bash
npm install
npm start
```
Open http://localhost:3000 and drag in the PLY file, or place it at models/model.ply for auto-loading.
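Under the hood, the masked supervision described in the training stage comes down to averaging reconstruction error over head pixels only. Here is a minimal NumPy stand-in (array names and shapes are illustrative, not the actual implementation; the real pipeline uses PyTorch tensors and blends this L1 term with SSIM at 0.8/0.2):

```python
import numpy as np

def masked_l1(render, target, mask):
    """L1 loss averaged over head pixels only (mask values in [0, 1])."""
    diff = np.abs(render - target) * mask[..., None]  # zero out background pixels
    return diff.sum() / (mask.sum() * render.shape[-1] + 1e-8)

# Toy example: a 4x4 RGB "image" where only the left half is head.
render = np.zeros((4, 4, 3))
target = np.ones((4, 4, 3))
mask = np.zeros((4, 4))
mask[:, :2] = 1.0  # left half counts as head

print(masked_l1(render, target, mask))  # ≈ 1.0: only masked pixels contribute
```

Because the denominator is the masked pixel count rather than the full image area, errors in the background contribute nothing, so the optimizer has no incentive to spend Gaussians outside the head.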
## Troubleshooting
- COLMAP fails with few registered cameras: Try increasing `--num-frames` or using a video with more head movement and varied background
- No faces detected: The Haar cascade needs reasonably lit, reasonably sized faces. Check that the face isn't too small after resizing to `--height`
- Splat has artifacts at mask edges: The mask ellipse may be too tight. Increase `expand_mult` in `generate_head_masks`
- Background Gaussians survive isolation: Lower the threshold in `isolate_head_gaussians` (default 0.3) or increase `--num-iters` so training converges more fully
- Out of GPU memory: Reduce `--height` or `--num-frames`
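The isolation threshold mentioned above boils down to a simple keep/drop rule per Gaussian: what fraction of its visible projections land inside the head mask, and is it opaque enough to matter. A minimal NumPy sketch (hypothetical array names; not the actual `isolate_head_gaussians` code):

```python
import numpy as np

def keep_gaussians(in_mask_counts, visible_counts, opacities,
                   frac_thresh=0.3, opacity_thresh=0.05):
    """Keep Gaussians whose visible projections fall inside the head mask
    often enough and whose opacity is non-negligible."""
    visible = np.maximum(visible_counts, 1)        # guard against divide-by-zero
    head_frac = in_mask_counts / visible
    return (head_frac >= frac_thresh) & (opacities > opacity_thresh)

# Three toy Gaussians: solidly in-mask, mostly background, nearly transparent.
in_mask = np.array([9, 2, 10])
visible = np.array([10, 10, 10])
opac    = np.array([0.8, 0.8, 0.01])

print(keep_gaussians(in_mask, visible, opac))  # [ True False False]
```

Lowering `frac_thresh` keeps more borderline Gaussians (useful if hair or ears get clipped); raising it discards more stray background splats.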
## Technical details
The pipeline uses:
- pycolmap: Python bindings for COLMAP's SfM pipeline (SIMPLE_RADIAL camera model, exhaustive matching)
- gsplat: Differentiable Gaussian splatting rasterizer with packed mode and absgrad support
- PyTorch: Training loop with per-parameter Adam optimizers (separate learning rates for means, quaternions, scales, opacities, SH coefficients)
- OpenCV: Frame extraction, Haar cascade face detection, elliptical mask generation
- scikit-learn: NearestNeighbors for computing initial Gaussian scales from point cloud density
- plyfile: PLY format export
- React 18 + React Three Fiber + Three.js: Browser-based 3D viewer
- gaussian-splats-3d: Gaussian splat rendering in the browser
- Flask: Lightweight backend for serving models
- Tailwind CSS + Radix UI: Viewer interface styling

## What's next for FaceSplat

- Replace Haar cascades with a face mesh model for tighter masks around ears and hair
- Expression transfer using blendshapes derived from the source video
- Real-time Gaussian splatting rendering in the browser instead of mesh conversion
- Shareable links so people can send their face models to others
## Built With

Python, PyTorch, gsplat, pycolmap, OpenCV, scikit-learn, plyfile, React, React Three Fiber, Three.js, gaussian-splats-3d, Flask, Tailwind CSS, Radix UI