Shicheng Yin, Kaixuan Yin, Weixing Chen, Yang Liu, Guanbin Li and Liang Lin, Sun Yat-sen University
This repository is the official implementation of DDP-WM (Disentangled Dynamics Prediction World Model), a novel framework designed to tackle the efficiency-performance bottleneck in existing world models. We observe that in most physical interaction scenarios, scene evolution can be decomposed into sparse primary dynamics driven by physical interactions and context-driven background updates.
Unlike dense models such as DINO-WM, DDP-WM employs a four-stage decoupled process for efficient modeling: it identifies foreground changing regions via dynamic localization, focuses computational resources on them using a primary predictor for high-precision forecasting, and leverages an innovative Low-Rank Correction Module (LRM) to update the background at a minimal cost. This design significantly improves computational efficiency while also providing a smoother optimization landscape for the planner, leading to superior planning success rates across various tasks.
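To make the low-rank background update concrete, here is a minimal numpy sketch. It is illustrative only, not the paper's implementation: the function name, the token shapes (N patch tokens of dimension D), and the rank r are all assumptions; the point is that a rank-r correction touches O(N·r + r·D) values instead of a dense N×D update.

```python
import numpy as np

def low_rank_background_update(tokens, U, V):
    """Apply a rank-r correction to background tokens.

    tokens: (N, D) background patch features
    U: (N, r), V: (r, D) low-rank factors, with r << min(N, D)
    Returns the corrected (N, D) tokens.
    """
    # U @ V is a rank-r matrix; adding it updates every token cheaply.
    return tokens + U @ V

rng = np.random.default_rng(0)
N, D, r = 196, 384, 8  # e.g. a 14x14 patch grid of 384-dim features
tokens = rng.standard_normal((N, D))
U = 0.01 * rng.standard_normal((N, r))
V = 0.01 * rng.standard_normal((r, D))
updated = low_rank_background_update(tokens, U, V)
print(updated.shape)  # (196, 384)
```

A dense background predictor would produce all N·D values directly; the factorized form keeps the background pathway cheap so compute can be concentrated on the sparse foreground regions.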
Our codebase is refactored and developed upon the excellent DINO-WM project. We sincerely thank the authors of DINO-WM for their great work and for open-sourcing their code.
- Core Model Architecture (`DDP_Predictor`, `DDPVWorldModel`)
- Staged Training Script (`train.py`)
- Hydra Configuration Files (`conf/`)
- Planning and Evaluation Scripts (`plan.py`, `planning/`)
- Pre-trained Model Checkpoints
Our codebase is an extension of the DINO-WM project. For environment setup, including Conda, Mujoco, and the optional PyFlex installation, please follow the comprehensive installation guide in the official DINO-WM repository.
After setting up the environment, clone this project:
```bash
git clone https://github.com/HCPLab-SYSU/DDP-WM.git
cd DDP-WM
conda activate dino_wm  # Activate the environment you created following the DINO-WM guide
```

The datasets used in this project are the same as those for DINO-WM and can be downloaded from here.
After downloading and unzipping, set an environment variable pointing to your dataset root directory:
```bash
# Replace /path/to/data with the actual path to your dataset folder
export DATASET_DIR=/path/to/data
```

The expected directory structure is as follows:
```
data
├── deformable
│   ├── granular
│   └── rope
├── point_maze
├── pusht_noise
└── wall_single
```
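Before training, you can sanity-check that the dataset was unpacked correctly. The small helper below is a hypothetical convenience script (not part of the repo); the expected sub-directory names are taken from the tree above.

```python
import os

# Sub-directories expected under $DATASET_DIR, per the layout above.
EXPECTED = [
    "deformable/granular",
    "deformable/rope",
    "point_maze",
    "pusht_noise",
    "wall_single",
]

def missing_subdirs(root):
    """Return the expected sub-directories that are absent under root."""
    return [d for d in EXPECTED if not os.path.isdir(os.path.join(root, d))]

if __name__ == "__main__":
    root = os.environ.get("DATASET_DIR", "data")
    missing = missing_subdirs(root)
    print(f"Missing under {root}: {missing}" if missing else "Dataset layout looks good.")
```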
The training of DDP-WM is conducted in a staged, decoupled manner to ensure stability and reproducibility. You need to train the different components of the model sequentially.
This stage trains the Historical Information Fusion Module and the Dynamic Localization Network.
```bash
python train.py model.training_stage=localization env=pusht frameskip=5 num_hist=3
```

Checkpoints are saved under the `outputs/` directory, which can be configured via `ckpt_base_path` in `conf/train.yaml`.
This stage trains the Primary Dynamics Predictor. We freeze the weights from the previous stage and use the generated sparse masks to guide the predictor. You need to specify the checkpoint from Stage 1 via ckpt_path.
```bash
python train.py model.training_stage=primary_predictor env=pusht frameskip=5 num_hist=3 ckpt_path=<path_to_stage1_checkpoint.pth>
```

Finally, we train the LRM to update the background. In this stage, all modules from the previous two stages are frozen.
```bash
python train.py model.training_stage=lrm env=pusht frameskip=5 num_hist=3 ckpt_path=<path_to_stage2_checkpoint.pth>
```

After these three stages, you will have a fully trained DDP-WM model.
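The three stages differ only in `model.training_stage` and in whether a checkpoint from the previous stage is passed via `ckpt_path`. If you want to script the pipeline, a small helper like the one below can build each Hydra command line; it is an illustrative sketch (not a script shipped with the repo), and the checkpoint paths remain placeholders that you must fill in with your own Stage 1 and Stage 2 outputs.

```python
def stage_cmd(stage, env="pusht", frameskip=5, num_hist=3, ckpt=None):
    """Build the argv list for one DDP-WM training stage.

    stage: 'localization', 'primary_predictor', or 'lrm'
    ckpt: path to the previous stage's checkpoint (required for
          'primary_predictor' and 'lrm', per the staged training above).
    """
    cmd = [
        "python", "train.py",
        f"model.training_stage={stage}",
        f"env={env}",
        f"frameskip={frameskip}",
        f"num_hist={num_hist}",
    ]
    if ckpt is not None:
        cmd.append(f"ckpt_path={ckpt}")
    return cmd

# The three stages from the walkthrough above, in order:
print(" ".join(stage_cmd("localization")))
print(" ".join(stage_cmd("primary_predictor", ckpt="<path_to_stage1_checkpoint.pth>")))
print(" ".join(stage_cmd("lrm", ckpt="<path_to_stage2_checkpoint.pth>")))
```

Each command could then be run with `subprocess.run(cmd, check=True)`, feeding the checkpoint produced by one stage into the next.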
We are currently refactoring the planning and evaluation scripts to be fully compatible with the new model architecture. This feature will be available in a future update.
If you find our work useful, please consider citing our paper:
```bibtex
@misc{yin2026ddpwmdisentangleddynamicsprediction,
      title={DDP-WM: Disentangled Dynamics Prediction for Efficient World Models},
      author={Shicheng Yin and Kaixuan Yin and Weixing Chen and Yang Liu and Guanbin Li and Liang Lin},
      year={2026},
      eprint={2602.01780},
      archivePrefix={arXiv},
      primaryClass={cs.CV},
      url={https://arxiv.org/abs/2602.01780},
}
```