Generated Reality: Human-centric World Simulation using Interactive Video Generation with Hand and Camera Control
Published on arXiv, 2026
We introduce a human-centric video world model conditioned on both tracked head pose and joint-level hand poses. To this end, we evaluate existing diffusion model conditioning strategies and propose an effective mechanism for 3D head and hand control, enabling dexterous hand-object interactions. We train a bidirectional video diffusion teacher model using this strategy and distill it into a causal, interactive system that generates egocentric virtual environments. We evaluate this generated-reality system with human subjects and demonstrate improved task performance as well as a significantly higher perceived level of control over the performed actions compared with relevant baselines.


