Training hyperparameters in this repo are defined in train.py, including augmentation settings:
# Training hyperparameters
hyp = {'giou': 1.2,  # giou loss gain
       'xy': 4.062,  # xy loss gain
       'wh': 0.1845,  # wh loss gain
       'cls': 15.7,  # cls loss gain
       'cls_pw': 3.67,  # cls BCELoss positive_weight
       'obj': 20.0,  # obj loss gain
       'obj_pw': 1.36,  # obj BCELoss positive_weight
       'iou_t': 0.194,  # iou training threshold
       'lr0': 0.00128,  # initial learning rate
       'lrf': -4.,  # final LambdaLR learning rate = lr0 * (10 ** lrf)
       'momentum': 0.95,  # SGD momentum
       'weight_decay': 0.000201,  # optimizer weight decay
       'hsv_s': 0.8,  # image HSV-Saturation augmentation (fraction)
       'hsv_v': 0.388,  # image HSV-Value augmentation (fraction)
       'degrees': 1.2,  # image rotation (+/- deg)
       'translate': 0.119,  # image translation (+/- fraction)
       'scale': 0.0589,  # image scale (+/- gain)
       'shear': 0.401}  # image shear (+/- deg)

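The 'lrf' entry is an exponent, not a learning rate: per the comment above, the final LambdaLR learning rate is lr0 * (10 ** lrf). A quick sanity check with the values above (a minimal sketch, not repo code):

```python
# Final learning rate implied by the hyp values above
lr0 = 0.00128  # initial learning rate
lrf = -4.0     # exponent: final LR = lr0 * (10 ** lrf)

final_lr = lr0 * (10 ** lrf)
print(final_lr)  # ~1.28e-07
```

So over a full training run the scheduler sweeps the learning rate from 1.28e-3 down to roughly 1.28e-7, four orders of magnitude.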
We began with darknet defaults, then evolved the values using our hyperparameter evolution code:
python3 train.py --data data/coco.data --weights '' --img-size 320 --epochs 1 --batch-size 64 --accumulate 1 --evolve
The process is simple: for each new generation, the prior generation with the highest fitness (out of all previous generations) is selected for mutation. All parameters are mutated simultaneously based on a normal distribution with about 20% 1-sigma:
|
# Mutate
init_seeds(seed=int(time.time()))
s = [.15, .15, .15, .15, .15, .15, .15, .15, .15, .00, .05, .20, .20, .20, .20, .20, .20, .20]  # sigmas
for i, k in enumerate(hyp.keys()):
    x = (np.random.randn(1) * s[i] + 1) ** 2.0  # plt.hist(x.ravel(), 300)
    hyp[k] *= float(x)  # vary by sigmas

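The selection-plus-mutation step can be sketched end to end as follows. The `evolve.txt` row layout and the helper names here are illustrative assumptions, not the repo's exact code:

```python
import numpy as np

def mutate(hyp, sigmas, rng):
    # Multiplicative mutation: scale each parameter by a squared
    # unit-mean gaussian sample, mirroring the loop above.
    out = dict(hyp)
    for s, k in zip(sigmas, out):
        out[k] *= float((rng.standard_normal() * s + 1) ** 2.0)
    return out

def next_generation(evolve_rows, hyp_keys, sigmas, rng):
    # evolve_rows: one row per past generation, [fitness, hyp values...].
    # Select the single fittest prior generation, then mutate it.
    best = evolve_rows[np.argmax(evolve_rows[:, 0])]
    parent = dict(zip(hyp_keys, best[1:]))
    return mutate(parent, sigmas, rng)
```

Note that a zero sigma freezes a parameter (as the .00 entry does for 'lrf' above), since (0 * z + 1) ** 2 = 1.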
Fitness is defined as a weighted mAP and F1 combination at the end of epoch 0, under the assumption that better epoch 0 results correlate to better final results, which may or may not be true.
def fitness(x):
    # Returns fitness (for use with results.txt or evolve.txt)
    return 0.5 * x[:, 2] + 0.5 * x[:, 3]  # fitness = 0.5 * mAP + 0.5 * F1

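Applied to a couple of hypothetical result rows (columns 2 and 3 holding mAP and F1; the other columns are placeholders), fitness is simply the per-generation average of the two metrics:

```python
import numpy as np

def fitness(x):
    # fitness = 0.5 * mAP + 0.5 * F1 (columns 2 and 3 of each row)
    return 0.5 * x[:, 2] + 0.5 * x[:, 3]

# Two hypothetical generations: [P, R, mAP, F1]
results = np.array([[0.1, 0.2, 0.30, 0.40],
                    [0.1, 0.2, 0.50, 0.44]])
print(fitness(results))                  # [0.35 0.47]
print(int(np.argmax(fitness(results))))  # 1 -> the second generation is fittest
```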
An example snapshot of the results is shown here. Fitness is on the y axis (higher is better).
from utils.utils import *; plot_evolution_results(hyp)
