Inception3D/Motion324

Motion 3-to-4 reconstructs 3D motion from videos for 4D synthesis in a feedforward manner within seconds.


Quick Start

For users who want to quickly try the inference:

git clone https://github.com/Inception3D/Motion324.git
cd Motion324

# 1. Setup environment
conda create -n Motion324 python=3.11
conda activate Motion324
pip install -r requirements.txt
# Install Hunyuan3D-2.0 components (optional)
cd scripts/hy3dgen/texgen/custom_rasterizer
python3 setup.py install
cd ../../../..
cd scripts/hy3dgen/texgen/differentiable_renderer
python3 setup.py install
cd ../../../..

# 2. Download pre-trained checkpoints and place in experiments/checkpoints/

# 3. Run inference
chmod +x ./scripts/4D_from_existing.sh
./scripts/4D_from_existing.sh ./examples/chili.glb ./examples/chili.mp4 ./examples/output

# Requires the Hunyuan3D-2.0 components from step 1
chmod +x ./scripts/4D_from_video.sh
./scripts/4D_from_video.sh ./examples/tiger.mp4

1. Preparation

Checkpoints

Download: Please download the pre-trained checkpoint from here and place it in experiments/checkpoints/.

Environment Details

Set up the base environment

conda create -n Motion324 python=3.11
conda activate Motion324
pip install -r requirements.txt

The code has been tested with Python 3.11 + PyTorch 2.4.1 + CUDA 12.4.

Setup Hunyuan3D-2.0 Components

# Install custom rasterizer
cd scripts/hy3dgen/texgen/custom_rasterizer
python3 setup.py install
cd ../../../..

# Install differentiable renderer
cd scripts/hy3dgen/texgen/differentiable_renderer
python3 setup.py install
cd ../../../..

Setup Blender

Download and install Blender for 4D asset rendering.

Our results are rendered with blender-4.0.0-linux-x64, using scripts modified from bpy-renderer.

scripts/render_results.py also provides basic visualization of results, which you can use to check the output animations.

Installation steps:

# Download Blender
wget https://download.blender.org/release/Blender4.0/blender-4.0.0-linux-x64.tar.xz
tar -xf blender-4.0.0-linux-x64.tar.xz

# Add Blender to PATH (optional, or use full path in scripts)
export PATH=$PATH:$(pwd)/blender-4.0.0-linux-x64

Note: Because we use xformers memory_efficient_attention with flash_attn, the GPU must have a compute capability of 8.0 or higher; on older GPUs the code fails with an error. You can look up your GPU's compute capability on the CUDA GPUs page.
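The requirement above can also be checked locally instead of looking it up. The following is a minimal sketch, not part of the repository: the helper names are hypothetical, and the threshold is assumed to include 8.0 (Ampere). It relies only on `torch.cuda.get_device_capability`, which returns a `(major, minor)` tuple, and returns an empty report when PyTorch or CUDA is unavailable.

```python
MIN_CAPABILITY = (8, 0)  # assumption: 8.0 (Ampere) itself is sufficient


def supports_flash_attention(capability, minimum=MIN_CAPABILITY):
    """Return True if a (major, minor) compute capability meets the minimum."""
    return tuple(capability) >= tuple(minimum)


def local_gpu_report():
    """Return (name, capability, ok) per visible CUDA device, or [] if none."""
    try:
        import torch  # imported lazily so the pure check works without PyTorch
    except ImportError:
        return []
    if not torch.cuda.is_available():
        return []
    report = []
    for index in range(torch.cuda.device_count()):
        capability = torch.cuda.get_device_capability(index)
        report.append(
            (
                torch.cuda.get_device_name(index),
                capability,
                supports_flash_attention(capability),
            )
        )
    return report


if __name__ == "__main__":
    for name, capability, ok in local_gpu_report():
        status = "ok" if ok else "too old for flash_attn"
        print(f"{name}: compute capability {capability[0]}.{capability[1]} ({status})")
```

Running the script inside the Motion324 conda environment prints one line per visible GPU; no output means no CUDA device was detected.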

Dataset

The Motion80 benchmark and the training dataset are available here.

Update the dataset path in configs/dyscene.yaml:

training:
  dataset_path: /path/to/your/dataset
  train_lst: /path/to/name_list
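A wrong path in this file typically surfaces only after training has started. As a quick pre-flight check, the sketch below reports which of the two entries do not exist on disk. It is a hypothetical helper, not part of the repository, and it assumes the YAML has already been parsed into a nested dict (for example with PyYAML's `yaml.safe_load`).

```python
import os


def missing_dataset_paths(config):
    """Return the keys under `training` whose paths are unset or do not exist."""
    training = config.get("training", {})
    missing = []
    for key in ("dataset_path", "train_lst"):
        value = training.get(key)
        if not value or not os.path.exists(value):
            missing.append(key)
    return missing


# Example using the placeholder values from the config snippet above.
example_config = {
    "training": {
        "dataset_path": "/path/to/your/dataset",
        "train_lst": "/path/to/name_list",
    }
}
for key in missing_dataset_paths(example_config):
    print(f"training.{key} does not point to an existing path")
```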

2. Training

Before training, you need to follow the instructions here to generate the Wandb key file for logging and save it in the configs folder as api_keys.yaml.

Training Command

The default training uses configs/dyscene.yaml:

torchrun --nproc_per_node 8 --nnodes 1 --master_port 12344 train.py --config configs/dyscene.yaml

Training Configuration

Key training parameters are defined in configs/dyscene.yaml. You can override any config parameter from the command line using dotted key=value arguments:

torchrun --nproc_per_node 8 --nnodes 1 --master_port 12346 train.py --config configs/dyscene.yaml \
    training.batch_size_per_gpu=32
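To illustrate what such an override does, here is a minimal sketch of merging dotted `key=value` arguments into a nested config. It uses plain dicts and is not the parser used by train.py (which is not shown in this README); the function names and the base value of 8 are hypothetical.

```python
import ast
import copy


def parse_value(text):
    """Parse numbers and other Python literals; fall back to the raw string."""
    try:
        return ast.literal_eval(text)
    except (ValueError, SyntaxError):
        return text


def apply_overrides(config, overrides):
    """Return a copy of `config` with dotted key=value overrides applied."""
    result = copy.deepcopy(config)
    for item in overrides:
        dotted_key, separator, raw_value = item.partition("=")
        if not separator:
            raise ValueError(f"expected key=value, got {item!r}")
        *parents, leaf = dotted_key.split(".")
        node = result
        for part in parents:
            node = node.setdefault(part, {})  # create missing sections on demand
        node[leaf] = parse_value(raw_value)
    return result


base = {"training": {"batch_size_per_gpu": 8}}  # hypothetical default value
merged = apply_overrides(base, ["training.batch_size_per_gpu=32"])
print(merged["training"]["batch_size_per_gpu"])
```

With 8 processes per node, as in the command above, a per-GPU batch size of 32 gives a global batch of 8 × 32 = 256 samples per step.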

3. Inference

We use rembg for simple background removal from videos.
However, we strongly recommend SAM2 for the best video background removal results.

Generate 4D animation from a single video input

Input: Video file (.mp4/.avi/.mov) or image directory (use ./scripts/images2video.py to convert images to video first)

Output:

  • Processed frames and mesh files in {video_name}_processed/
  • Animation output in {video_name}_processed/animation/ (FBX format)

Example:

chmod +x ./scripts/4D_from_video.sh
./scripts/4D_from_video.sh ./examples/tiger.mp4
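The input and output conventions described above can be sketched in a few lines. This is an illustrative helper, not code from the repository: the function names are hypothetical, and it only derives the `{video_name}_processed` directory name, since the README does not state where that directory is created.

```python
from pathlib import Path

VIDEO_EXTENSIONS = {".mp4", ".avi", ".mov"}  # formats listed under "Input"


def classify_input(path):
    """Return 'video' for a supported video file, 'image_dir' for a directory."""
    path = Path(path)
    if path.suffix.lower() in VIDEO_EXTENSIONS:
        return "video"
    if path.is_dir():
        # Image directories need ./scripts/images2video.py first.
        return "image_dir"
    raise ValueError(f"unsupported input: {path}")


def processed_dir_name(video_path):
    """Return the '{video_name}_processed' directory name for a video."""
    return f"{Path(video_path).stem}_processed"


print(classify_input("./examples/tiger.mp4"))
print(processed_dir_name("./examples/tiger.mp4"))
```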

Reconstruct 4D from an existing mesh and video

Inputs:

  • data_dir: Mesh file (.glb or .fbx) - FBX files will be automatically converted to GLB
  • video_path: Video file (.mp4/.avi/.mov) or image directory
  • output_dir: Output directory for results

Output:

  • Animated mesh files (GLB format) in the specified output directory
  • Segmented videos if segmentation is enabled

Example:

chmod +x ./scripts/4D_from_existing.sh
./scripts/4D_from_existing.sh ./examples/chili.glb ./examples/chili.mp4 ./examples/output
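When several mesh and video pairs need processing, the same call can be generated programmatically. The sketch below is a hypothetical wrapper, not part of the repository. It assumes only the argument order shown in the example above (mesh, video, output directory) and prints the commands as a dry run rather than executing them.

```python
import shlex
from pathlib import Path

SCRIPT = "./scripts/4D_from_existing.sh"
MESH_EXTENSIONS = {".glb", ".fbx"}  # mesh formats listed under "Inputs"


def build_command(mesh_path, video_path, output_dir):
    """Build the argument list for one run, validating the mesh extension."""
    if Path(mesh_path).suffix.lower() not in MESH_EXTENSIONS:
        raise ValueError(f"mesh must be .glb or .fbx, got {mesh_path}")
    return [SCRIPT, str(mesh_path), str(video_path), str(output_dir)]


jobs = [
    ("./examples/chili.glb", "./examples/chili.mp4", "./examples/output"),
]
for mesh, video, output in jobs:
    # Dry run: print the command. To execute it instead, pass the list to
    # subprocess.run(command, check=True).
    print(shlex.join(build_command(mesh, video, output)))
```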

4. Evaluation

You can evaluate video results with the provided evaluation.py script.

Example:

python ./evaluation/evaluation.py \
--gt_paths /paths/to/gt_videos.mp4 \
--result_paths /paths/to/results_videos.mp4

This compares the generated result video(s) to the ground truth and outputs metrics such as FVD, LPIPS, DreamSim, and CLIP Loss.

You can evaluate mesh geometry with the provided evaluation_pcd.py script.

Example:

python ./evaluation/evaluation_pcd.py \
--gt_path /paths/to/name_pointclouds \
--result_path /paths/to/mesh.fbx

This compares your mesh result with the ground-truth point cloud and evaluates the geometric error between them. It outputs metrics such as Chamfer Distance and F-score.
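For readers unfamiliar with these two metrics, the following is a minimal brute-force sketch of one common definition of each. Conventions vary (squared or unsquared distances, sum or average of the two directions, choice of threshold), so evaluation_pcd.py may compute them differently; the function names here are hypothetical. Points are plain coordinate tuples, and the O(N·M) search is only suitable for small point sets.

```python
import math


def nearest_distances(source, target):
    """For each point in `source`, the distance to its nearest point in `target`."""
    return [min(math.dist(p, q) for q in target) for p in source]


def chamfer_distance(pred, gt):
    """Symmetric Chamfer distance: sum of the two mean nearest-neighbour distances."""
    pred_to_gt = nearest_distances(pred, gt)
    gt_to_pred = nearest_distances(gt, pred)
    return sum(pred_to_gt) / len(pred_to_gt) + sum(gt_to_pred) / len(gt_to_pred)


def f_score(pred, gt, threshold):
    """Harmonic mean of precision and recall at a distance threshold."""
    pred_to_gt = nearest_distances(pred, gt)
    gt_to_pred = nearest_distances(gt, pred)
    precision = sum(d <= threshold for d in pred_to_gt) / len(pred_to_gt)
    recall = sum(d <= threshold for d in gt_to_pred) / len(gt_to_pred)
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


gt_points = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
pred_points = [(0.0, 0.0, 0.0), (1.0, 0.0, 1.0)]
print(chamfer_distance(pred_points, gt_points))
print(f_score(pred_points, gt_points, threshold=0.5))
```

For real point clouds, a spatial index such as a KD-tree would replace the brute-force nearest-neighbour search.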

5. Citation

If you find this work useful in your research, please consider citing:

@article{chen2026motion3to4,
    title={Motion 3-to-4: 3D Motion Reconstruction for 4D Synthesis},
    author={Chen, Hongyuan and Chen, Xingyu and Zhang, Youjia and Xu, Zexiang and Chen, Anpei},
    journal={arXiv preprint arXiv:2601.14253},
    year={2026}
}

6. Acknowledgments

7. License

This project is licensed under the CC BY-NC-SA 4.0 License - see the LICENSE.md file for details.

About

[CVPR 2026] 3D Motion Reconstruction for 4D Synthesis
