The post-processing pipeline for PaXini's Super EID Factory involves the following stages:
| Stage | File | Description |
|---|---|---|
| DF-1 | HDF5 | The overall input: raw data after preprocessing and quality inspection |
| DF-2 | HDF5 | 1st output: DF-1 with encoder and tactile data parsed; adds bimanual and object poses; includes both action and observation |
| DF-2R | HDF5 | 2nd output: DF-2 retargeted to a dexterous hand model |
| DF-3 | LeRobot Dataset | 3rd output: converts DF-2R to the LeRobot dataset format; can be used for model training |
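Since DF-1, DF-2, and DF-2R are all HDF5 files, their internal layout can be inspected with standard HDF5 tooling. A minimal sketch, assuming `hdf5-tools` is installed (`sudo apt-get install hdf5-tools`); the file path is illustrative, not a path the pipeline guarantees:

```shell
# Print only the group/dataset structure (-H = header), not the raw values
h5dump -H /ws/output_1/df2/episode_0.h5

# Or list just the top-level groups
h5ls /ws/output_1/df2/episode_0.h5
```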
PX Pipeline prepares raw data (DF-1) and transforms it into various formats for downstream applications (e.g., large-scale pre-training and simulation). It consists of the following modules, which can be run sequentially:
| Part | Input | Output |
|---|---|---|
| PX Pose | DF-1 | Bimanual poses and (optional) object poses |
| PX Post-Process | (a) DF-1; (b) Output of PX Pose | DF-2, DF-2R, DF-3 |
The overall outputs are DF-2, DF-2R, and DF-3 data. We release PX Omnisharing, samples of our DF-2 data, on Hugging Face.
- OS: Ubuntu 20.04 or 22.04
- Architecture: x86_64
- System Resources: at least 32 GB of RAM and 64 GB of storage; an NVIDIA GPU with more than 16 GB of VRAM
- NVIDIA Driver: >= 555.42.06 and < 580
- Software: Docker
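The driver requirement above can be checked before installing anything. A minimal sketch using `nvidia-smi` and GNU `sort -V`; the helper name is ours, not part of the toolkit:

```shell
# Hypothetical helper: succeeds when a driver version is in [555.42.06, 580)
driver_in_range() {
  local v="$1" min="555.42.06" max="580"
  # v >= min: in version order, min must sort first
  [ "$(printf '%s\n' "$min" "$v" | sort -V | head -n1)" = "$min" ] || return 1
  # v < max: v must sort first and must not equal max
  [ "$(printf '%s\n' "$v" "$max" | sort -V | head -n1)" = "$v" ] && [ "$v" != "$max" ]
}

# On a machine with the driver installed:
# driver_in_range "$(nvidia-smi --query-gpu=driver_version --format=csv,noheader | head -n1)"
```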
```shell
# [HOST]
curl -fsSL https://nvidia.github.io/libnvidia-container/gpgkey | \
  sudo gpg --dearmor -o /usr/share/keyrings/nvidia-container-toolkit-keyring.gpg
curl -fsSL https://nvidia.github.io/libnvidia-container/stable/deb/nvidia-container-toolkit.list | \
  sed 's#deb https://#deb [signed-by=/usr/share/keyrings/nvidia-container-toolkit-keyring.gpg] https://#' | \
  sudo tee /etc/apt/sources.list.d/nvidia-container-toolkit.list
sudo apt-get update && sudo apt-get install -y nvidia-container-toolkit
sudo nvidia-ctk runtime configure --runtime=docker
sudo systemctl restart docker
```

```shell
# [HOST]
sudo systemctl enable --now docker
sudo usermod -aG docker "${SUDO_USER:-$USER}"
# Re-login to pick up the new group:
# - Desktop: log out of your session and log back in
# - SSH session: exit and reconnect
```

The Docker image `px_pipeline` includes the whole pipeline.
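Before pulling the pipeline image, a common smoke test can confirm that Docker can reach the GPU through the toolkit. The CUDA image tag below is illustrative; any CUDA base image works:

```shell
# [HOST] Optional check: a throwaway container should be able to run nvidia-smi
docker run --rm --gpus all nvidia/cuda:12.4.1-base-ubuntu22.04 nvidia-smi
```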
```shell
# [HOST]
docker pull <url>
# Enter the container and mount a workspace
# Example: bash first_run.sh /home/alice/data
cd entry
bash first_run.sh <workspace_to_be_mounted>
```

```shell
# [CONTAINER]
cd /home/app/pipeline
bash build_all.sh
```

Inside the container, the mounted workspace is located at `/ws`, and all scripts required for running the pipeline are under `/home/app/pipeline`. Note that `first_run.sh` and `build_all.sh` only need to be executed once.
To re-enter the container, run:

```shell
# [HOST]
docker start px_pipeline
docker exec -it px_pipeline bash
```

You can run a single script that demonstrates how the complete pipeline works. The outputs (DF-2, DF-2R, DF-3) will be written to the default directories.
```shell
# [CONTAINER]
cd /home/app/pipeline
bash run_demo.sh <input_dir> <output_dir> <hand_model>
# Examples:
# bash run_demo.sh /ws/sample/group_1 /ws/output_1 dexh13
# bash run_demo.sh /ws/sample/group_2 /ws/output_2 dexh13
```

We provide two groups of sample inputs: Group 1 demonstrates post-processing without object poses, while Group 2 is a "complete" input including both HDF5 data and masks. Please refer to the pose estimation tutorial for their meaning.
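To process both sample groups in one pass, a small wrapper loop (paths and hand model taken from the examples above) is enough:

```shell
# [CONTAINER] Run the demo on both sample groups
for g in 1 2; do
  bash run_demo.sh "/ws/sample/group_${g}" "/ws/output_${g}" dexh13
done
```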
To run on a large-scale dataset instead of the small samples, we strongly recommend following the instructions in this section and processing the data in two stages. Each stage (Pose or Post-Process) requires different resources:
| Part | Critical Resource | Recommendations |
|---|---|---|
| PX Pose | GPU | Multiple GPUs |
| PX Post-Process | CPU and RAM | Capable CPU with abundant RAM; no GPU required |
Parallel processing: to increase efficiency, we recommend running the container with CPU pinning and GPU binding, so that you can run multiple tasks simultaneously on one server.
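CPU pinning and GPU binding can be done with Docker's standard `--cpuset-cpus` and `--gpus` flags. A sketch under our own assumptions: the helper, container names, and the per-worker command are illustrative, not part of the toolkit:

```shell
# Hypothetical helper: give task i its own block of consecutive CPU cores
cpuset_for_task() {
  local i="$1" cores_per_task="$2"
  local start=$(( i * cores_per_task ))
  echo "${start}-$(( start + cores_per_task - 1 ))"
}

# e.g. one worker per GPU, 16 pinned cores each (image/command illustrative):
# for i in 0 1 2 3; do
#   docker run -d --name "px_worker_$i" \
#     --cpuset-cpus "$(cpuset_for_task "$i" 16)" \
#     --gpus "device=$i" \
#     px_pipeline bash <your_task_script> "/ws/shard_$i"
# done
```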
Please first run Part 1, and then run Part 2. Both parts utilize the same container that you have loaded and initialized.
A series of outputs will be generated by the pipeline, including the hand poses, object poses, and three types of data files: DF-2, DF-2R, and DF-3.
Part 1 of the pipeline estimates the 6D poses of both hands and, optionally, of objects. Details can be found in the pose estimation tutorial.
After following the instructions in Part 2, the original DF-1 data and the estimated poses will be transformed into these three formats. Please check the data specification for the content of each stage.
| Component | Type | License | Commercial Use |
|---|---|---|---|
| Python Package (.whl) | Binary | Proprietary | Requires License |
| Python Package (.py) | Source | MIT | Yes |
| Documentation | Source | MIT | Yes |
| Bracelet Detection Model | Model weights (.pt) | Proprietary | Requires License |
| Datasets | Source | CC-BY-NC-SA 4.0 | No |
| Other Runtime Components (executables, object code, etc.) | Binary | Proprietary | Requires License |
Research/education use: the datasets are released under CC-BY-NC-SA 4.0 and may therefore be used for non-commercial research and education.
We would like to thank the authors of IINet, FoundationPose, YOLO, manotorch, and PyTorch Robot Kinematics for releasing their code and/or models, which we built upon in our open toolkit. This project would not be possible without the standardized and scalable system of LeRobot. Thanks also to the authors of FoundationPose++ for their very helpful Kalman filter.
