This repository contains the official code for the research project "Masked Visual-Tactile Pre-training for Robot Manipulation," presented at ICRA 2024. The project enhances robotic manipulation through a pre-training approach that integrates visual and tactile information.
- Project Webpage: https://lqts.github.io/M2VTP/
- Research Paper: IEEE | ResearchGate
- Demo Video: Bilibili
This repository provides the following functionalities:
- Environment Setup: Instructions for configuring the necessary environment.
- Pre-trained Model Integration: Code for importing and utilizing pre-trained models.
- Downstream Task Training: Scripts for training models on specific manipulation tasks.
- Model Evaluation: Tools for testing trained models and visualizing training strategies.
To set up the environment, please refer to the detailed instructions in the Dependencies section.
You can access the pre-trained model code here, and the pre-trained models themselves can be downloaded from this link. Place the downloaded model and configuration files in the model/vitac/model_and_config directory. If you store them elsewhere, update the paths in model/backbones/pre_model.py as shown below:
```python
MODEL_REGISTRY = {
    "vt20t-reall-tmr05-bin-ft+dataset-BottleCap": {
        "config": "model/vitac/model_and_config/vt20t-reall-tmr05-bin-ft+dataset-BottleCap.json",
        "checkpoint": "model/vitac/model_and_config/vt20t-reall-tmr05-bin-ft+dataset-BottleCap.pt",
        "cls": VTT_ReAll,
    }
}
```
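The snippet below is a minimal sketch of how such a registry entry might be consumed. The `load_pretrained` helper is hypothetical (not part of the repository's API), and it assumes the JSON config maps directly to constructor keyword arguments and that the checkpoint is a plain state dict; see model/backbones/pre_model.py for the actual loading logic.

```python
# Minimal loading sketch (assumptions: JSON config = constructor kwargs,
# checkpoint = plain state dict; load_pretrained is a hypothetical helper).
import json
import torch

def load_pretrained(name, registry=MODEL_REGISTRY, device="cuda:0"):
    entry = registry[name]
    with open(entry["config"]) as f:
        config = json.load(f)                       # model hyperparameters
    model = entry["cls"](**config)                  # e.g. VTT_ReAll(**config)
    state_dict = torch.load(entry["checkpoint"], map_location=device)
    model.load_state_dict(state_dict)
    return model.to(device).eval()                  # inference mode for downstream use

# encoder = load_pretrained("vt20t-reall-tmr05-bin-ft+dataset-BottleCap")
```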
The repository includes a downstream task for bottle cap manipulation. To train the ShadowHand policy for this task, execute the following command:

```bash
python train_agent.py --task bottle_cap_vt --seed 123
```

Note: The training process requires at least two NVIDIA RTX 3090 GPUs: one for model training and the other for image rendering. Set the following environment variables accordingly:
```python
import os

os.environ["MUJOCO_GL"] = "egl"
os.environ["MUJOCO_EGL_DEVICE_ID"] = "1"  # '1' is for image rendering; '0' is used for training
```
We provide a pre-trained policy that can be downloaded from this link. You can also train your own model using the instructions above.

To perform testing, use the following command:
```bash
python eval_agent.py --task bottle_cap_vt --seed 123 --resume_model path/to/your/model.pt --test
```

The test results will be printed to the console and saved to a specified file.
To save the manipulation process as a video, run:
```bash
python eval_agent.py --task bottle_cap_vt --seed 123 --resume_model path/to/your/model.pt --test --env_vis
```

The video will be saved in the runs/videos directory.
This project is licensed under the MIT License - see the LICENSE file for details.
If you have any questions or need support, please contact Qingtao Liu or Qi Ye.
If you find our work useful, please consider citing:

```bibtex
@inproceedings{liu2024m2vtp,
  title={Masked Visual-Tactile Pre-training for Robot Manipulation},
  author={Liu, Qingtao and Ye, Qi and Sun, Zhengnan and Cui, Yu and Li, Gaofeng and Chen, Jiming},
  booktitle={2024 IEEE International Conference on Robotics and Automation (ICRA)},
  year={2024},
  organization={IEEE}
}
```