In this post, I provide a kickstart guide to getting started with the TrajNet++ framework for human trajectory forecasting, which should prove useful in helping you approach Milestone 1.
At a high level, TrajNet++ consists of four primary components:
Trajnetplusplustools: This repository provides helper functions for trajectory prediction. For instance: trajectory categorization, evaluation metrics, prediction visualization.
Trajnetplusplusdataset: This repository provides scripts to generate train, val and test dataset splits from raw data as well as simulators.
Trajnetplusplusbaselines: This repository provides baseline models (handcrafted as well as data-driven) for human motion prediction. This repository also provides scripts to extensively evaluate the trained models.
Trajnetplusplusdata: This repository provides the already processed real world data as well as synthetic datasets conforming to human motion.
I describe how to get started with TrajNet++ with the help of a running example. We will download an already-created synthetic dataset and train an LSTM-based model to perform trajectory forecasting.
The first step is to set up the Trajnetplusplusbaselines repository for model training. Next, we set up the virtual environment and install the requirements. The virtual environment can also be set up using Conda on local machines.
## 1. On LOCAL MACHINE
## Make virtual environment using either A. virtualenv or B. conda
## A. Using virtualenv
## Works with Python3.6 and Python3.8
virtualenv -p /usr/bin/python3.6 trajnetv
source trajnetv/bin/activate
## B. Using conda
## Works with Python3.6 and Python3.8
conda create --name trajnetv python=3.8
conda activate trajnetv
## 2. On SCITAS
module load gcc
module load python/3.7.3
virtualenv --system-site-packages venvs/trajnetv
source venvs/trajnetv/bin/activate
Set up TrajNet++ on SCITAS after verifying the setup on your local machine. For SCITAS, there is no need to fork the repository again; clone the already-created forked repository.
## Create directory to setup Trajnet++
mkdir trajnet++
cd trajnet++
## Clone Repositories
# git clone https://github.com/vita-epfl/trajnetplusplusbaselines.git (Old)
git clone <forked_repository>
## Download Requirements
cd trajnetplusplusbaselines/
pip install -e .
## SCITAS-Specific (!)
## If previous command gives an error: "requires deeptoolsintervals>=0.1.7, requires plotly>=2.0.0", then:
pip install deeptoolsintervals
pip install plotly
pip install -e .
Follow the next steps for SCITAS as well.
## Additional Requirements (ORCA)
wget https://github.com/sybrenstuvel/Python-RVO2/archive/master.zip
unzip master.zip
rm master.zip
## Setting up ORCA (steps provided in the Python-RVO2 repo)
cd Python-RVO2-master/
pip install cmake
pip install cython
python setup.py build
python setup.py install
cd ../
## Additional Requirements (Social Force)
wget https://github.com/svenkreiss/socialforce/archive/refs/heads/main.zip
unzip main.zip
rm main.zip
## Setting up Social Force
cd socialforce-main/
pip install -e .
cd ../
Our repository is now set up!
Now, we will download and prepare data for training our models. In this example, we download a synthetic dataset generated using ORCA policy.
cd DATA_BLOCK/
wget https://github.com/vita-epfl/trajnetplusplusdata/releases/download/v3.1/five_parallel_synth.zip
unzip five_parallel_synth.zip
rm five_parallel_synth.zip
ls five_parallel_synth/
You will notice that the current folder contains train data, test data and test_private data. The test data contains the test examples only until the end of the observation period, while test_private, as the name suggests, contains the ground-truth futures that will be used as reference to evaluate the performance of the forecasting model. You will also notice that a validation set is not present. Preparing a validation set is important for hyperparameter tuning. You can use the following helper file to split the current training samples into a training split (80%) and a validation split (20%).
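Conceptually, the helper just partitions the training scenes into an 80/20 train/validation split. Here is a minimal sketch of that idea (the function name and shuffling details are mine, not necessarily what create_validation.py does internally):

```python
import random

def split_scenes(scenes, val_ratio=0.2, seed=42):
    """Shuffle reproducibly, then carve off val_ratio of the scenes as validation."""
    scenes = list(scenes)
    random.Random(seed).shuffle(scenes)
    n_val = int(len(scenes) * val_ratio)
    return scenes[n_val:], scenes[:n_val]  # (train, val)

train, val = split_scenes(range(100), val_ratio=0.2)
print(len(train), len(val))  # 80 20
```

Splitting at the scene level (rather than the frame level) keeps every observation/prediction pair intact in exactly one split.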
cd ../
python create_validation.py --path five_parallel_synth --val_ratio 0.2
The above command will create a new folder five_parallel_synth_split in the DATA_BLOCK folder.
We can additionally transfer the goal information of the dataset into the goal_files folder.
## Preparing Goals folder (additional attributes)
mkdir -p goal_files/train
mkdir goal_files/val
mkdir goal_files/test_private
## For other datasets, the goal files can be different for corresponding dataset split
cp DATA_BLOCK/five_parallel_synth/orca_five_nontraj_synth.pkl goal_files/train/
cp DATA_BLOCK/five_parallel_synth/orca_five_nontraj_synth.pkl goal_files/val/
cp DATA_BLOCK/five_parallel_synth/orca_five_nontraj_synth.pkl goal_files/test_private/
Now that the dataset is ready, it's time to train the model! :)
Training models is easier than setting up TrajNet++!
For SCITAS, training takes place using bash scripts; please refer to the tutorial for more details. The training procedure below is only for your local machine.
All you have to do is ….
python -m trajnetbaselines.lstm.trainer --path five_parallel_synth_split --augment
…. and your LSTM model starts training. Your model will be saved in the five_parallel_synth_split folder within OUTPUT_BLOCK. Currently, models are named according to the type of interaction module being used.
In order to train using interaction modules (e.g., directional-grid) utilizing additional attributes (goal information), run
python -m trajnetbaselines.lstm.trainer --path five_parallel_synth_split --type 'directional' --goals --augment
## To know more options about trainer
python -m trajnetbaselines.lstm.trainer --help
Models trained on SCITAS can be evaluated and visualized on your local machine. To do so, ‘scp’ the output files and log files from SCITAS into the repository on your local machine. Note: maintain the same file structure.
One strength of TrajNet++ is its extensive evaluation system. You can read more about it in the metrics section here.
To perform an extensive evaluation of your trained model, run the command below. The results are saved in Results.png.
# python -m evaluator.trajnet_evaluator --path <test_dataset> --output <path_to_model_pkl_file>
python -m evaluator.trajnet_evaluator --path five_parallel_synth --output OUTPUT_BLOCK/five_parallel_synth_split/lstm_vanilla_None.pkl OUTPUT_BLOCK/five_parallel_synth_split/lstm_goals_directional_None.pkl
## To know more options about evaluator
python -m evaluator.trajnet_evaluator --help
To know more about how the evaluation procedure works, please refer to this README.
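Among the metrics the evaluator reports are the two classic displacement errors, ADE and FDE. A minimal sketch of how they are computed (this is the standard definition, not code lifted from the evaluator):

```python
import math

def ade_fde(pred, gt):
    """Average and Final Displacement Error between two equal-length 2D tracks."""
    dists = [math.dist(p, g) for p, g in zip(pred, gt)]
    return sum(dists) / len(dists), dists[-1]

pred = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]
gt   = [(0.0, 0.0), (1.0, 1.0), (2.0, 2.0)]
ade, fde = ade_fde(pred, gt)
print(ade, fde)  # 1.0 2.0
```

ADE averages the per-step Euclidean error over the whole prediction horizon, while FDE looks only at the final predicted position; TrajNet++ additionally reports interaction-aware metrics such as collision rates.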
Visualize learning curves of two different models
python -m trajnetbaselines.lstm.plot_log OUTPUT_BLOCK/five_parallel_synth_split/lstm_vanilla_None.pkl.log OUTPUT_BLOCK/five_parallel_synth_split/lstm_goals_directional_None.pkl.log
## To view the different log files generated, run the command below:
ls OUTPUT_BLOCK/five_parallel_synth_split/lstm_goals_directional_None.pkl*
## You will notice various log files in the form of *.png
Visualize predictions of models
# python -m evaluator.visualize_predictions <ground_truth_file> <prediction_files>
python -m evaluator.visualize_predictions DATA_BLOCK/five_parallel_synth/test_private/orca_five_nontraj_synth.ndjson DATA_BLOCK/five_parallel_synth/test_pred/lstm_vanilla_None_modes1/orca_five_nontraj_synth.ndjson DATA_BLOCK/five_parallel_synth/test_pred/lstm_goals_directional_None_modes1/orca_five_nontraj_synth.ndjson --labels Vanilla D-Grid --n 10 --random -o visualize
## Note the addition of output argument above. The 10 random predictions are saved in the trajnetplusplusbaselines directory. Run:
ls visualize*.png
## You will see 10 '.png' files with the prefix 'visualize', as it was the provided output name.
Add visualizations (obtained using previous command) of 3 test scenes qualitatively comparing outputs of the vanilla model and D-Grid model (that uses goal information), as well as the quantitative evaluation (Results.png) in the README file of your forked repository.
DATA_BLOCK
├── real_data
│   ├── train
│   ├── val (self-generated)
│   └── test
│       ├── crowds_zara02.ndjson
│       ├── biwi_eth.ndjson
│       └── crowds_uni_examples.ndjson
└── synth_data
    ├── train
    ├── val (self-generated)
    └── test
        ├── orca_synth.ndjson
        └── collision_test.ndjson
## Remove the CFF files from the training split
rm <path_to_real_data>/train/cff*
You are encouraged to play with other interaction encoders and maybe design your own! You can validate your designs on the synthetic data before trying them on real data.
Let's assume you have two models, ‘synth_model_name’ trained on TrajNet++ synthetic data and ‘real_model_name’ trained on TrajNet++ real data. Also, by default, you have the data directory structure mentioned above.
## Generate for Real data
python -m evaluator.trajnet_evaluator --path real_data --output OUTPUT_BLOCK/real_data/<real_model_name>.pkl --write_only
## Generate for Synthetic data
python -m evaluator.trajnet_evaluator --path synth_data --output OUTPUT_BLOCK/synth_data/<synth_model_name>.pkl --write_only
The above operations will save your model predictions in the test_pred folder within data directory as shown below:
DATA_BLOCK
├── real_data
│   ├── train
│   ├── val (self-generated)
│   ├── test_pred
│   │   └── <real_model_name>_modes1
│   │       ├── crowds_zara02.ndjson
│   │       ├── biwi_eth.ndjson
│   │       └── crowds_uni_examples.ndjson
│   └── test
└── synth_data
    ├── train
    ├── val (self-generated)
    ├── test_pred
    │   └── <synth_model_name>_modes1
    │       ├── orca_synth.ndjson
    │       └── collision_test.ndjson
    └── test
These test predictions need to be uploaded to AICrowd.
## KEEP THE SAME FOLDER NAMES and STRUCTURE given below !!
mkdir test
mkdir test/real_data
mkdir test/synth_data
cp DATA_BLOCK/real_data/test_pred/<real_model_name>_modes1/* test/real_data
cp DATA_BLOCK/synth_data/test_pred/<synth_model_name>_modes1/* test/synth_data
zip -r <my_model_name>.zip test/
## Upload the <my_model_name>.zip to AICrowd.
Done Done! :)
To help you get started with Git, here are some useful resources:
Git Handbook (10 min. read): https://guides.github.com/introduction/git-handbook/
Git Cheatsheet: https://training.github.com/downloads/github-git-cheat-sheet/
Please activate your virtual environment!
Do not forget to push your code on GitHub. It saves your progress! :) Refer to the GitHub resources if you haven’t yet.
The goal files contain the ‘final destination’ (goal) of each pedestrian in the ORCA simulator. This does not refer to the location at the end of the prediction period, but to the location at the end of the simulation. We have access to these goals only for synthetic data, so only use the ‘--goals’ flag for synthetic data and not for real data. Remember to move the goal .pkl file to the goal_files folder as shown in the tutorial above.
You can transfer the .png files using ‘scp’ to your Desktop (the boring but simple way). Or you can use a text editor that allows you to open .png files from the terminal. I use Sublime Text.
In this blog post, I provide a quick tutorial on converting external datasets into the desired .ndjson format using the TrajNet++ framework. This post will focus on utilizing the TrajNet++ dataset code for easily converting new datasets.
In this tutorial, I will convert the ETH dataset utilized by the Social GAN paper.
Before proceeding, please set up the base repositories. See ‘Setting Up Repositories’ here.
## Checkout 'eth' branch of Trajnetplusplusdataset
git checkout -b eth origin/eth
## Download external data
sh download_data.sh
cp -r datasets/eth/ data/
from trajnetplusplustools import TrackRow

def standard(line):
    ## Parse one tab-separated row: frame, pedestrian id, x, y
    line = [e for e in line.split('\t') if e != '']
    return TrackRow(int(float(line[0])),
                    int(float(line[1])),
                    float(line[2]),
                    float(line[3]))
Code snippet already provided in readers.py
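To see what this reader produces, here is a quick standalone check. A namedtuple stands in for TrackRow (whose real definition lives in trajnetplusplustools), and the sample line is made up for illustration:

```python
from collections import namedtuple

# Stand-in for trajnetplusplustools.TrackRow: frame, pedestrian id, x, y
TrackRow = namedtuple('TrackRow', ['frame', 'pedestrian', 'x', 'y'])

def standard(line):
    """Parse one tab-separated row of the raw ETH-style data."""
    line = [e for e in line.split('\t') if e != '']
    return TrackRow(int(float(line[0])),
                    int(float(line[1])),
                    float(line[2]),
                    float(line[3]))

row = standard('840.0\t1.0\t8.46\t3.59\n')
print(row)  # TrackRow(frame=840, pedestrian=1, x=8.46, y=3.59)
```

Frame and pedestrian ids arrive as floats in the raw file, hence the `int(float(...))` round-trip.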
For dataset conversion, we call the ‘raw dataset conversion’ code shown above in convert.py
def standard(sc, input_file):
    print('processing ' + input_file)
    return (sc
            .textFile(input_file)
            .map(readers.standard)
            .cache())
Code snippet already provided in convert.py
Finally, we call the appropriate data files for conversion and categorization (See convert.py).
python -m trajnetdataset.convert --obs_len 8 --pred_len 12
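The --obs_len and --pred_len flags control how each scene is split into an observed history and a ground-truth future. A minimal sketch of that split (the function name is mine, not part of the codebase):

```python
def split_track(track, obs_len=8, pred_len=12):
    """Split one track into the observed part and the ground-truth future."""
    assert len(track) >= obs_len + pred_len
    return track[:obs_len], track[obs_len:obs_len + pred_len]

obs, fut = split_track(list(range(20)))  # a 20-frame track
print(len(obs), len(fut))  # 8 12
```

With the values above, every sample spans 20 frames: the model sees the first 8 and is scored on the remaining 12.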
Now that the dataset is ready (in the output folder), you can train the model! :)
So, converting any external dataset boils down to four simple steps: (1) download the raw data, (2) write a reader for the raw format in readers.py, (3) register the reader in convert.py, and (4) run trajnetdataset.convert with the desired observation and prediction lengths.
We recently released TrajNet++ Challenge for agent-agent based trajectory forecasting. Details regarding the challenge can be found here.
In this blog post, I provide a kickstart guide to our recently released TrajNet++ framework for human trajectory forecasting. We recently released the TrajNet++ Challenge for agent-agent trajectory forecasting; details regarding the challenge can be found here. This post will focus on utilizing the TrajNet++ framework for easily creating datasets and learning human motion forecasting models.
At a high level, TrajNet++ consists of four primary components:
Trajnetplusplustools: This repository provides helper functions for trajectory prediction. For instance: trajectory categorization, evaluation metrics, prediction visualization.
Trajnetplusplusdataset: This repository provides scripts to generate train, val and test dataset splits from raw data as well as simulators.
Trajnetplusplusbaselines: This repository provides baseline models (handcrafted as well as data-driven) for human motion prediction. This repository also provides scripts to extensively evaluate the trained models.
Trajnetplusplusdata: This repository provides the already processed real world data as well as synthetic datasets conforming to human motion.
I describe how to get started using TrajNet++ with the help of a running example. We will create a synthetic dataset using ORCA simulator and train an LSTM-based model to perform trajectory prediction.
The first step is to set up the repositories, namely Trajnetplusplusdataset for dataset generation and Trajnetplusplusbaselines for model training. Next, we set up the virtual environment and install the requirements.
## Create directory to setup Trajnet++
mkdir trajnet++
cd trajnet++
## Clone Repositories
git clone https://github.com/vita-epfl/trajnetplusplusdataset.git
git clone https://github.com/vita-epfl/trajnetplusplusbaselines.git
## Make virtual environment
virtualenv -p /usr/bin/python3.6 trajnetv
source trajnetv/bin/activate
## Download Requirements
cd trajnetplusplusbaselines/
pip install -e .
cd ../trajnetplusplusdataset/
pip install -e .
pip install -e '.[test, plot]'
Alright, our repositories are now set up!
Trajnetplusplusdataset helps in creating the dataset splits to train and test our prediction models. In this example, we will be using the ORCA simulator for generating our synthetic data. Therefore, we will set up the simulator with the help of this wonderful repo.
## Download Repository
wget https://github.com/sybrenstuvel/Python-RVO2/archive/master.zip
unzip master.zip
rm master.zip
## Setting up ORCA (steps provided in the Python-RVO2 repo)
cd Python-RVO2-master/
pip install cmake
pip install cython
python setup.py build
python setup.py install
cd ../
We also download the Social Force simulator available at this repository.
## Download Repository
wget https://github.com/svenkreiss/socialforce/archive/refs/heads/main.zip
unzip main.zip
rm main.zip
## Setting up Social Force
cd socialforce-main/
pip install -e .
cd ../
Now, we will generate controlled data using the ORCA simulator. We will generate 1000 scenarios of 5 pedestrians moving in an interactive setting.
## Destination to store generated trajectories
mkdir -p data/raw/controlled
python -m trajnetdataset.controlled_data --simulator 'orca' --num_ped 5 --num_scenes 1000
## To know more options of generating controlled data
python -m trajnetdataset.controlled_data --help
By default, the generated trajectories will be stored in ‘orca_circle_crossing_5ped_1000scenes_.txt’. Procedure for extracting publicly available datasets can be found here. Also, the goals of the generated trajectories are stored in the ‘goal_files’ folder under the same name as the .txt file.
We will now convert the generated ‘.txt’ file into the TrajNet++ data structure format. Moreover, we will choose to select only interacting scenes (Type III) from our generated trajectories. More details regarding our data format and trajectory categorization can be found on our challenge overview page.
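For reference, a TrajNet++ .ndjson file is simply newline-delimited JSON: track rows holding (frame, pedestrian id, x, y) plus scene rows marking which primary pedestrian and frame range form one sample. A minimal illustration (the field names here reflect my reading of the trajnetplusplustools format, so treat them as approximate; see the challenge overview page for the authoritative spec):

```python
import json

# Hypothetical excerpt of a TrajNet++ .ndjson file
lines = [
    '{"scene": {"id": 0, "p": 1, "s": 0, "e": 190}}',
    '{"track": {"f": 0, "p": 1, "x": 8.46, "y": 3.59}}',
    '{"track": {"f": 10, "p": 1, "x": 8.22, "y": 3.61}}',
]

rows = [json.loads(l) for l in lines]
scenes = [r['scene'] for r in rows if 'scene' in r]
tracks = [r['track'] for r in rows if 'track' in r]
print(len(scenes), len(tracks))  # 1 2
```

One JSON object per line keeps the files streamable: readers never need to load a whole dataset to iterate over scenes.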
For conversion, open trajnetdataset/convert.py, comment out the real-dataset conversion part in main(), and uncomment the snippet given below.
## Run the conversion
python -m trajnetdataset.convert --linear_threshold 0.3 --acceptance 0 0 1.0 0 --synthetic
## To know more options of converting data
python -m trajnetdataset.convert --help
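The --linear_threshold flag governs which scenes count as "linear" during categorization: roughly, scenes whose primary pedestrian barely deviates from a straight-line extrapolation of the observed motion. The real categorization in trajnetplusplustools is more involved; this sketch only illustrates the idea behind the threshold:

```python
def is_linear(track, obs_len=8, threshold=0.3):
    """Mean deviation of the future from a constant-velocity extrapolation."""
    (x0, y0), (xn, yn) = track[0], track[obs_len - 1]
    vx, vy = (xn - x0) / (obs_len - 1), (yn - y0) / (obs_len - 1)
    errs = [((x - (x0 + vx * t)) ** 2 + (y - (y0 + vy * t)) ** 2) ** 0.5
            for t, (x, y) in enumerate(track[obs_len:], start=obs_len)]
    return sum(errs) / len(errs) < threshold

straight = [(0.4 * t, 0.0) for t in range(20)]           # constant velocity
turning = straight[:8] + [(3.2 + 0.1 * t, 0.4 * t) for t in range(12)]
print(is_linear(straight), is_linear(turning))  # True False
```

The --acceptance flags then set, per trajectory category, what fraction of scenes to keep; `0 0 1.0 0` keeps only the interacting (Type III) scenes.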
Once the conversion process completes, your converted datasets will be available in the output folder. Trajnetplusplustools provides the following utilities to understand your dataset better. To visualize trajectories in the terminal on MacOS, I use itermplot.
## obtain new dataset statistics
python -m trajnetplusplustools.dataset_stats output/train/*.ndjson
## visualize sample scenes
python -m trajnetplusplustools.trajectories output/train/*.ndjson --random
## visualize interactions (Default: Collision Avoidance)
mkdir interactions
python -m trajnetplusplustools.visualize_type output/train/*.ndjson
Finally, move the converted data and goal files (if necessary) to the trajnetbaselines folder.
mv output ../trajnetplusplusbaselines/DATA_BLOCK/synth_data
mv goal_files/ ../trajnetplusplusbaselines/
cd ../trajnetplusplusbaselines/
Now that the dataset is ready, it's time to train the model! :)
Training models is easier than generating datasets in TrajNet++! All you have to do is ….
python -m trajnetbaselines.lstm.trainer --path synth_data
…. and your LSTM model starts training. Your model will be saved in the synth_data folder within OUTPUT_BLOCK. Currently, models are named according to the type of interaction module being used.
In order to train using interaction modules (e.g., nearest-neighbour encoding) utilizing goal information, run
python -m trajnetbaselines.lstm.trainer --path synth_data --type 'nn' --goals
## To know more options about trainer
python -m trajnetbaselines.lstm.trainer --help
One strength of TrajNet++ is its extensive evaluation system. You can read more about it in the metrics section here.
To perform an extensive evaluation of your trained model, run the command below. The results are saved in Results.png.
python -m evaluator.trajnet_evaluator --path synth_data --output OUTPUT_BLOCK/synth_data/lstm_vanilla_None.pkl
## To know more options about evaluator
python -m evaluator.trajnet_evaluator --help
To know more about how the evaluation procedure works, please refer to this README.
Visualize learning curves of two different models
python -m trajnetbaselines.lstm.plot_log OUTPUT_BLOCK/synth_data/lstm_vanilla_None.pkl.log OUTPUT_BLOCK/synth_data/lstm_goals_nn_None.pkl.log
Visualize predictions of models
# python -m evaluator.visualize_predictions <ground_truth_file> <prediction_file>
python -m evaluator.visualize_predictions DATA_BLOCK/synth_data/test_private/orca_five_synth.ndjson DATA_BLOCK/synth_data/test_pred/lstm_vanilla_None_modes1/orca_five_synth.ndjson --n 10 --random
I hope this blog post provides you with the necessary kickstart for using TrajNet++. If you have any questions, feel free to post issues on GitHub. If you liked using TrajNet++, a token of appreciation to [email protected] would really go a long way! :)