- pip install -r requirements.txt
- Tested on Linux
- Python 3.6.5
- PyTorch 0.4.1
- OpenCV 3.4.4
- NVIDIA GPU (12G or 24G memory) + CUDA cuDNN
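The environment above can be sanity-checked with a short script; the version pins (3.6.5 / 0.4.1 / 3.4.4) come from this list, not from the code itself:

```python
# Print the installed versions so they can be compared against the README.
import sys

print("Python:", sys.version.split()[0])

try:
    import torch
    print("PyTorch:", torch.__version__)
    print("CUDA available:", torch.cuda.is_available())
except ImportError:
    print("PyTorch not installed")

try:
    import cv2
    print("OpenCV:", cv2.__version__)
except ImportError:
    print("OpenCV not installed")
```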
Prepare your data before training. Your data should follow the format of the files in datasets.
Please note that the pedestrian selection was made manually; there is no automated process for this.
Running the training script
bash scripts/train_unet_256.sh
Running the testing script
bash scripts/test_unet_256.sh
Run python -m visdom.server to see the training process.
- After selection is done, resize the dataset to 512x256.
- After resizing, draw a bounding box around each selected pedestrian and apply salt-and-pepper noise inside that box (unfortunately, there is no automated process for this either).
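The noise step above can be sketched as follows. This is a minimal illustration, not the repo's actual tooling: the box coordinates and image are placeholders, and in practice you would load a photo resized to 512x256 (e.g. with cv2.resize(img, (512, 256))) and pick the box by hand:

```python
import numpy as np

def salt_and_pepper_in_box(img, box, amount=0.1):
    """Apply salt-and-pepper noise inside a bounding box (x1, y1, x2, y2)."""
    out = img.copy()
    x1, y1, x2, y2 = box
    # Random value per pixel inside the box decides salt, pepper, or untouched.
    mask = np.random.rand(y2 - y1, x2 - x1)
    region = out[y1:y2, x1:x2]           # view into the copy
    region[mask < amount / 2] = 0        # pepper: black pixels
    region[mask > 1 - amount / 2] = 255  # salt: white pixels
    return out

# Demo on a synthetic 512x256 gray image with a hand-picked box.
img = np.full((256, 512, 3), 128, dtype=np.uint8)
noised = salt_and_pepper_in_box(img, (100, 50, 200, 200))
```

Pixels outside the box are left untouched, matching the idea of corrupting only the selected pedestrian region.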
# Resize source images
python tools/process.py \
--input_dir photos/original \
--operation resize \
--output_dir photos/resized
# Combine resized images with blanked images
python tools/process.py \
--input_dir photos/resized \
--b_dir photos/blank \
--operation combine \
--output_dir photos/combined
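The combine step presumably produces pix2pix-style side-by-side A|B training pairs. A minimal numpy sketch of that layout, assuming same-size inputs (the arrays here stand in for a resized photo and its blanked/noised counterpart):

```python
import numpy as np

def combine_pair(a, b):
    """Concatenate two same-size images side by side (pix2pix A|B layout)."""
    if a.shape != b.shape:
        raise ValueError("images must have identical shapes")
    return np.concatenate([a, b], axis=1)  # join along the width axis

# Demo with synthetic 512x256 images standing in for resized/blank pairs.
a = np.zeros((256, 512, 3), dtype=np.uint8)
b = np.full((256, 512, 3), 255, dtype=np.uint8)
pair = combine_pair(a, b)  # a 256x1024 image: A on the left, B on the right
```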
This code borrows heavily from pix2pix, Pedestrian-Synthesis-GAN, and pix2pixHD.