UniVTAC: A Unified Simulation Platform for Visuo-Tactile Manipulation Data Generation, Learning, and Benchmarking
arXiv | PDF | Website | HuggingFace Dataset | Modelscope Dataset
UniVTAC is a tactile-aware simulation benchmark for robotic manipulation built on top of NVIDIA Isaac Lab and TacEx (UIPC-based tactile simulation). It provides a unified framework for collecting expert demonstrations, training visuotactile policies, and evaluating them across a diverse suite of contact-rich manipulation tasks — all with high-fidelity tactile feedback from simulated GelSight Mini, ViTai GF225, or XenseWS sensors.
See the Installation Guide for detailed setup instructions, including installing the environment, installing TacEx from the modified local source and setting up cuRobo for motion planning.
UniVTAC currently includes the following manipulation tasks, all featuring tactile sensing:
| Task | Module | Description |
|---|---|---|
| Collect | collect |
Collect contact-rich tactile data for pretraining |
| Lift Bottle | lift_bottle |
Grasp and lift a bottle off a surface near a wall |
| Lift Can | lift_can |
Grasp and lift a cylindrical can |
| Insert HDMI | insert_HDMI |
Insert an HDMI connector into a port |
| Insert Hole | insert_hole |
Precision peg-in-hole insertion |
| Insert Tube | insert_tube |
Insert a tube into a fixture |
| Pull Out Key | pull_out_key |
Extract a key from a lock |
| Put Bottle in Shelf | put_bottle_in_shelf |
Place a bottle onto a shelf |
| Grasp & Classify | grasp_classify |
Grasp an object and classify it by tactile feedback |
See the Data Collection Guide for instructions on how to run the automated data collection pipeline, configure task-specific parameters, and understand the output data structure.
Dataset containing 100 episodes per task can be downloaded from HuggingFace, Modelscope or by running the script in data/download.sh.
UniVTAC includes several baseline policies implemented under the policy/ directory:
- ACT: Action Chunking with Transformers with/without tactile inputs
- Abation: ACT ablation variants for modality comparison
- ViTAL: ACT with CLIP-pretrained tactile-vision encoders in ViTAL
Each policy is a self-contained module under policy/ with its own data processing, training, and deployment scripts. All policies share a unified evaluation entry point at the project root:
bash eval_policy.sh ${task_name} ${task_config} ${policy_config} ${gpu_id}For parallel evaluation over many seeds:
bash parallel_eval.sh ${task_name} ${task_config} ${policy_config} ${gpu_id} [num_processes] [total_num]The evaluation results, including videos and success rate logs, will be saved in the eval_result/ directory under the project root.
To deploy your own policy, refer to the Deploy Your Policy.
- Data collection and evaluation are now only supported on the GelSight Mini sensor. We will add support for ViTai GF225 and XenseWS in the near future.
If you find our work useful, please consider citing:
@article{chen2026univtac,
title={UniVTAC: A Unified Simulation Platform for Visuo-Tactile Manipulation Data Generation, Learning, and Benchmarking},
author={Chen, Baijun and Wan, Weijie and Chen, Tianxing and Guo, Xianda and Xu, Congsheng and Qi, Yuanyang and Zhang, Haojie and Wu, Longyan and Xu, Tianling and Li, Zixuan and others},
journal={arXiv preprint arXiv:2602.10093},
year={2026}
}
This repository is released under the MIT license. See LICENSE for additional details.