Download datasets from Open X-Embodiment and extract single episodes as .npz files:
python datasets/oxe_data_converter.py --dataset_name {dataset name, e.g. bridge} --input_path {path to downloaded OXE} --output_path {path to stored npz}
To replicate our pre-training on OXE, extract all datasets listed as OXE_SELECT in ivideogpt/data/dataset_mixes.py.
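When many datasets have to be extracted, the converter command above can be generated once per dataset name. The following is a minimal sketch, assuming the three flags shown above are all the converter needs; `build_convert_commands` and `run_all` are hypothetical helpers, not part of the repository, and both paths are placeholders.

```python
import subprocess
import sys


def build_convert_commands(dataset_names, input_path, output_path):
    """Return one oxe_data_converter.py command (as an argv list) per dataset."""
    return [
        [
            sys.executable, "datasets/oxe_data_converter.py",
            "--dataset_name", name,
            "--input_path", input_path,
            "--output_path", output_path,
        ]
        for name in dataset_names
    ]


def run_all(commands, dry_run=True):
    """Print each command; execute it only when dry_run is False."""
    for cmd in commands:
        print(" ".join(cmd))
        if not dry_run:
            subprocess.run(cmd, check=True)


if __name__ == "__main__":
    # "bridge" is the example name from the command above; extend the list
    # with the names found in OXE_SELECT. Both paths are placeholders.
    cmds = build_convert_commands(["bridge"], "/path/to/oxe", "/path/to/npz")
    run_all(cmds, dry_run=True)
```

Keeping `dry_run=True` prints the commands for review first; set it to `False` to run them from the repository root.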
Follow ContextWM to prepare the Something-Something-V2 dataset.
You should include train_video_folder.txt and val_video_folder.txt in the directory datasets/somethingv2.
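Before training, it can help to confirm that both list files are where the loader expects them. A small sketch, assuming only the directory and file names stated above; `missing_split_lists` is a hypothetical helper.

```python
from pathlib import Path

# File names taken from the instructions above.
REQUIRED_LISTS = ("train_video_folder.txt", "val_video_folder.txt")


def missing_split_lists(root="datasets/somethingv2"):
    """Return the required list files that are absent from `root`."""
    root = Path(root)
    return [name for name in REQUIRED_LISTS if not (root / name).is_file()]


if __name__ == "__main__":
    missing = missing_split_lists()
    if missing:
        print("Missing:", ", ".join(missing))
    else:
        print("Both split lists found.")
```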
Download the BAIR robot pushing dataset and preprocess it with the following commands:
wget http://rail.eecs.berkeley.edu/datasets/bair_robot_pushing_dataset_v0.tar -P .
tar -xvf ./bair_robot_pushing_dataset_v0.tar -C .
python datasets/preprocess_bair.py --input_path bair_robot_pushing_dataset_v0/softmotion30_44k --save_path bair_preprocessed
Then modify the saved paths (e.g. bair_preprocessed/train and bair_preprocessed/test) in DATASET.yaml.
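A quick sanity check before editing DATASET.yaml is to count the files written to each split. This sketch assumes only the `train` and `test` subdirectories named above and makes no assumption about the file format inside them; `count_split_files` is a hypothetical helper.

```python
from pathlib import Path


def count_split_files(root, splits=("train", "test")):
    """Map each split name to the number of regular files below root/split.

    A split whose directory does not exist is reported as None.
    """
    counts = {}
    for split in splits:
        split_dir = Path(root) / split
        if not split_dir.is_dir():
            counts[split] = None
            continue
        # Count files recursively, in case the split uses nested folders.
        counts[split] = sum(1 for p in split_dir.rglob("*") if p.is_file())
    return counts


if __name__ == "__main__":
    print(count_split_files("bair_preprocessed"))
```

A `None` or `0` entry suggests the preprocessing step did not finish or wrote to a different `--save_path`.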
Follow the RoboNet Wiki to download the dataset:
pip install gdown
gdown "https://drive.google.com/a/andrew.cmu.edu/uc?id=1BkqHzfRkfzgzCfc73NbNnPMK_rg3i1n9&export=download"
tar -xzvf robonet_v3.tar.gz
Preprocess the data:
python datasets/preprocess_robonet.py --hdf5_path robonet_data/all_hdf5_data/hdf5 --save_path robonet_preprocessed
Then modify the saved paths (e.g. robonet_preprocessed/train and robonet_preprocessed/test) in DATASET.yaml.
Download the VP2 datasets (180 GB in total).
Note that we use only 5k trajectories for robosuite, so download ONLY 5k_slice_rendered_256.hdf5 and do NOT download robosuite_demo_1 through robosuite_demo_5. If you want those trajectories, please follow VP2.
Convert them to .npz files using the following script (requires h5py==3.6.0):
python datasets/preprocess_vp2.py --dir_path robodesk --save_path robodesk_preprocessed
python datasets/preprocess_vp2.py --dir_path robosuite --save_path robosuite_preprocessed
Then modify the saved paths (e.g. robodesk_preprocessed and robosuite_preprocessed) in DATASET.yaml.
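To check what a converted episode contains, note that an .npz file is a zip archive with one .npy entry per array, so the array names can be listed with the standard library alone. This is a minimal sketch; `npz_array_names` is a hypothetical helper, and the names the conversion script actually stores are not assumed here.

```python
import zipfile


def npz_array_names(npz_path):
    """List the array names stored in an .npz file without loading the data."""
    with zipfile.ZipFile(npz_path) as archive:
        return [
            name[: -len(".npy")]
            for name in archive.namelist()
            if name.endswith(".npy")
        ]


if __name__ == "__main__":
    import sys

    # Pass one or more .npz paths on the command line.
    for path in sys.argv[1:]:
        print(path, npz_array_names(path))
```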