PyTorch Implementation of [Video Transformers for Autonomous Driving]

Report

Video Transformers for Autonomous Driving

Data Preparation

Video data

We leveraged the recently released, large-scale Waymo Open Dataset. To analyze the potential of our model, we used only the front-camera images from 13 training tar archives (32.5 GB) and 3 validation tar archives (7.5 GB).

Transformation of velocity from global to vehicle coordinates

The velocity of the AV is provided in a global coordinate system. To calculate acceleration in the vehicle coordinate system, we first transform the velocity into that system. The Waymo dataset provides a vehicle pose that transforms variables from vehicle to global coordinates. Inverting this pose matrix gives the transform from global to vehicle coordinates.
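The transformation described above can be sketched with NumPy. This is a minimal illustration, not the code used in this repository: it assumes the pose is available as a 4x4 homogeneous matrix, and the function names, the example pose, and the finite-difference acceleration step are all hypothetical. Because velocity is a direction rather than a position, the sketch applies only the rotation block of the inverted pose.

```python
import numpy as np


def global_to_vehicle_velocity(pose_vehicle_to_global, v_global):
    """Rotate a global-frame velocity into the vehicle frame.

    pose_vehicle_to_global: 4x4 homogeneous transform (vehicle -> global),
        assumed layout; adapt if your pose is stored differently.
    v_global: length-3 velocity vector in the global frame.
    """
    pose = np.asarray(pose_vehicle_to_global, dtype=float).reshape(4, 4)
    # Inverting the vehicle -> global pose gives global -> vehicle.
    global_to_vehicle = np.linalg.inv(pose)
    # Velocity is a direction, not a position, so only the 3x3 rotation
    # block applies; the translation column must not be added.
    return global_to_vehicle[:3, :3] @ np.asarray(v_global, dtype=float)


def acceleration(v_vehicle, dt):
    """Finite-difference acceleration from a (T, 3) velocity sequence.

    This differencing scheme is an assumption for illustration only.
    """
    return np.diff(np.asarray(v_vehicle, dtype=float), axis=0) / dt


# Hypothetical example: the vehicle is yawed 90 degrees, so its x-axis
# points along global +y, and it sits at (10, 5, 0) in the global frame.
pose = np.array([
    [0.0, -1.0, 0.0, 10.0],
    [1.0,  0.0, 0.0,  5.0],
    [0.0,  0.0, 1.0,  0.0],
    [0.0,  0.0, 0.0,  1.0],
])
v_vehicle = global_to_vehicle_velocity(pose, [0.0, 2.0, 0.0])
print(v_vehicle)  # approximately [2, 0, 0]: 2 m/s straight ahead
```

In the example, a global velocity of 2 m/s along +y becomes 2 m/s along the vehicle's forward axis, and the vehicle's position has no effect on the result.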

Training and Testing

# Testing at a tiny scale
python3 train.py --cuda 3 --batch_size 20 --epochs 2 --lr 0.00007 --gamma 0.7 --seed 42 --num_frames 10 --num_dims 20 --num_layers 2 --num_heads 2 --dim_head 10 --mlp_dim 10 --drop_prob 0.4 --emb_drop_prob 0.4 --cls_dim 10

# Training
python3 train.py --cuda 3 --batch_size 64 --epochs 100 --lr 0.00007 --gamma 0.7 --seed 42 --num_frames 10 --num_dims 128 --num_layers 6 --num_heads 8 --dim_head 128 --mlp_dim 128 --drop_prob 0.4 --emb_drop_prob 0.4 --cls_dim 64
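This README does not say what `--gamma` controls. If it is a multiplicative learning-rate decay applied once per epoch (an assumption, not confirmed here), the short sketch below shows how the learning rate would evolve from the `--lr` value over the run. The function `lr_schedule` is hypothetical and is not part of `train.py`.

```python
def lr_schedule(base_lr, gamma, epochs):
    """Learning rate per epoch under an ASSUMED once-per-epoch decay.

    Epoch 0 uses base_lr; each later epoch multiplies the rate by gamma.
    """
    return [base_lr * gamma ** epoch for epoch in range(epochs)]


# Values taken from the training command: --lr 0.00007 --gamma 0.7 --epochs 100
rates = lr_schedule(base_lr=0.00007, gamma=0.7, epochs=100)
for epoch in (0, 10, 99):
    print(f"epoch {epoch}: lr = {rates[epoch]:.3e}")
```

Under this assumption the rate shrinks geometrically, so check how the decay is actually applied in `train.py` before relying on these numbers.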

