Note: Latency is measured on an NVIDIA T4 GPU with batch size 1 under FP16 precision using TensorRT (v10.6).
| Model | Input Size | AP50:95 | #Params (M) | GFLOPs | Latency (ms) | Config | Log | Checkpoint |
|---|---|---|---|---|---|---|---|---|
| ECDet-S | 640 | 51.7 | 10 | 26 | 5.41 | config | log | model |
| ECDet-M | 640 | 54.3 | 18 | 53 | 7.98 | config | log | model |
| ECDet-L | 640 | 57.0 | 31 | 101 | 10.49 | config | log | model |
| ECDet-X | 640 | 57.9 | 49 | 151 | 12.70 | config | log | model |
| Model | Input Size | AP50:95 | #Params (M) | GFLOPs | Latency (ms) | Config | Log | Checkpoint |
|---|---|---|---|---|---|---|---|---|
| ECSeg-S | 640 | 43.0 | 10 | 33 | 6.96 | config | log | model |
| ECSeg-M | 640 | 45.2 | 20 | 64 | 9.85 | config | log | model |
| ECSeg-L | 640 | 47.1 | 34 | 111 | 12.56 | config | log | model |
| ECSeg-X | 640 | 48.4 | 50 | 168 | 14.96 | config | log | model |
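The latencies above come from TensorRT engines, per the note under the tables. If you only want a rough PyTorch-side sanity check of relative speed between model variants (the numbers will not match the TensorRT measurements), a common pattern is CUDA-event timing. This is a generic sketch, not a script shipped with this repo; pass in whichever model you have constructed:

```python
import torch

@torch.no_grad()
def measure_latency_ms(model, input_size=(1, 3, 640, 640), warmup=50, iters=200):
    """Median forward-pass latency in milliseconds using CUDA events (FP16, batch 1)."""
    model = model.eval().half().cuda()
    x = torch.randn(*input_size, device='cuda', dtype=torch.half)
    for _ in range(warmup):  # warm up kernels and allocator
        model(x)
    torch.cuda.synchronize()
    times = []
    for _ in range(iters):
        start = torch.cuda.Event(enable_timing=True)
        end = torch.cuda.Event(enable_timing=True)
        start.record()
        model(x)
        end.record()
        torch.cuda.synchronize()
        times.append(start.elapsed_time(end))  # milliseconds
    times.sort()
    return times[len(times) // 2]
```

Numbers from a loop like this include PyTorch eager-mode overhead, so treat them only as a way to compare the ECDet/ECSeg variants against each other.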
```bash
# Install dependencies
pip install -r requirements.txt
```

The easiest way to test EdgeCrafter is to run inference on a sample image using a pre-trained model:

```bash
# 1. Download a pre-trained model (e.g., ECDet-L)
wget https://github.com/capsule2077/edgecrafter/releases/download/edgecrafterv1/ecdet_l.pth

# 2. Run PyTorch inference
# Make sure to replace `path/to/your/image.jpg` with an actual image path
python tools/inference/torch_inf.py -c configs/ecdet/ecdet_l.yml -r ecdet_l.pth -i path/to/your/image.jpg
```

### Object Detection
To train on your custom detection dataset in the COCO format, modify the `custom.yml` configuration file:

```yaml
task: detection

evaluator:
  type: CocoEvaluator
  iou_types: ['bbox']
  verbose: False  # Set to True to output per-category AP

num_classes: 80  # Number of classes in your dataset
remap_mscoco_category: False  # Set to False to prevent automatic remapping of category IDs

train_dataloader:
  type: DataLoader
  dataset:
    type: CocoDetection
    img_folder: /path/to/your/dataset/train
    ann_file: /path/to/your/dataset/train/annotations.json
  ...

val_dataloader:
  type: DataLoader
  dataset:
    type: CocoDetection
    img_folder: /path/to/your/dataset/val
    ann_file: /path/to/your/dataset/val/annotations.json
  ...
```
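Before training, it can help to sanity-check the annotation file and confirm that `num_classes` matches your data. A minimal sketch using `pycocotools` (not part of this repo's tooling; the path is a placeholder):

```python
from pycocotools.coco import COCO

# Placeholder path; point it at your training annotations
coco = COCO('/path/to/your/dataset/train/annotations.json')

cat_ids = coco.getCatIds()
cats = coco.loadCats(cat_ids)

# num_classes in custom.yml should match the number of categories here.
print(f'{len(cat_ids)} categories: {[c["name"] for c in cats]}')

# With remap_mscoco_category: False, category ids are used as-is,
# so check that they fit within num_classes.
print(f'category ids: {cat_ids}')

print(f'{len(coco.getImgIds())} images, {len(coco.getAnnIds())} annotations')
```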
Optional: To output per-category AP during evaluation, set `verbose: True` in the evaluator configuration:

```yaml
evaluator:
  type: CocoEvaluator
  iou_types: ['bbox']
  verbose: True  # Output per-category AP
```
### Instance Segmentation
To train on a custom segmentation dataset in COCO format, follow the detection setup above and update the `img_folder` and `ann_file` paths.

### Reproducing Results on COCO2017

To reproduce our results on COCO2017, follow these steps:
Note: Due to the non-deterministic backward pass of `grid_sample`, results may vary slightly (by roughly 0.2 AP). For further details, refer to this PyTorch discussion.
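As an aside, if you want to confirm that `grid_sample` is the source of run-to-run variation, PyTorch's deterministic-algorithms switch will raise an error when a kernel without a deterministic implementation is hit. A small diagnostic sketch (requires a CUDA device; not one of the reproduction steps):

```python
import torch
import torch.nn.functional as F

torch.use_deterministic_algorithms(True)

x = torch.randn(1, 3, 8, 8, device='cuda', requires_grad=True)
grid = torch.rand(1, 8, 8, 2, device='cuda') * 2 - 1  # sampling grid in [-1, 1]

out = F.grid_sample(x, grid, align_corners=False)
# On CUDA this backward call raises a RuntimeError, because
# grid_sampler_2d_backward_cuda has no deterministic implementation.
out.sum().backward()
```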
1. Download COCO2017 from OpenDataLab or the official COCO website.

2. Organize the dataset as follows:

   ```
   /path/to/COCO2017/
   ├── annotations/
   │   ├── instances_train2017.json
   │   └── instances_val2017.json
   ├── train2017/
   └── val2017/
   ```

3. Update paths in `coco.yml`:

   ```yaml
   train_dataloader:
     dataset:
       img_folder: /path/to/COCO2017/train2017/
       ann_file: /path/to/COCO2017/annotations/instances_train2017.json
   val_dataloader:
     dataset:
       img_folder: /path/to/COCO2017/val2017/
       ann_file: /path/to/COCO2017/annotations/instances_val2017.json
   ```
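A quick way to verify the layout before launching a long training run is a few-line check like the following (a convenience sketch, not a script from the repo; `root` is a placeholder):

```python
import json
from pathlib import Path

root = Path('/path/to/COCO2017')  # placeholder path

# Check the expected directory layout
for rel in ('annotations/instances_train2017.json',
            'annotations/instances_val2017.json',
            'train2017', 'val2017'):
    path = root / rel
    print(f'{"OK     " if path.exists() else "MISSING"} {path}')

# Spot-check that annotation entries match image files on disk
ann = json.loads((root / 'annotations/instances_val2017.json').read_text())
missing = [img['file_name'] for img in ann['images']
           if not (root / 'val2017' / img['file_name']).exists()]
print(f'{len(missing)} images referenced by annotations but not found')
```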
Model configuration files are located in `configs/ecdet`. Choose the appropriate configuration based on your computational budget and accuracy requirements.

For custom datasets, you may need to adjust specific parameters in your configuration file (e.g., `ecdet_s.yml`):
```yaml
__include__: [
  '../dataset/coco.yml',  # Base dataset configuration. Replace with custom.yml when using a custom dataset
  'ecdet.yml',
]

ViTAdapter:
  name: ecvitt
  embed_dim: 192
  num_heads: 3
  interaction_indexes: [10, 11]  # Indices of transformer blocks used for feature interaction/fusion
  weights_path: ecvits/ecvitt.pth  # Pretrained backbone, automatically downloaded on first use
  skip_load_backbone: False  # If True, the backbone is initialized from scratch (no pretrained weights)

optimizer:
  type: AdamW
  params:
    - # Backbone parameters excluding normalization layers and bias
      params: '^(?=.*backbone)(?!.*(?:norm|bn|bias)).*$'
      lr: 0.000025
    - # Backbone normalization layers (norm/bn) and bias parameters
      params: '^(?=.*backbone)(?=.*(?:norm|bn|bias)).*$'
      lr: 0.000025
      weight_decay: 0.0
    - # Non-backbone normalization layers and bias parameters
      params: '^(?!.*\.backbone)(?=.*(?:norm|bn|bias)).*$'
      weight_decay: 0.0
  lr: 0.0005  # Base learning rate for non-backbone parameters
  betas: [0.9, 0.999]  # AdamW beta coefficients
  weight_decay: 0.0001  # Weight decay applied to all parameters except norm/bias

epochs: 74  # Total training epochs: COCO 6× schedule (12 epochs = 1×) plus 2 extra epochs without augmentation
warmup_iter: 2000  # Number of iterations used for learning rate warmup
lr_gamma: 0.5  # Learning rate decay factor used by the LR scheduler

eval_spatial_size: [640, 640]  # Input resolution (height, width). Use [1280, 1280] for high-resolution training

train_dataloader:
  total_batch_size: 32  # Global batch size across all GPUs (adjust for available GPU memory)
  dataset:
    transforms:
      mosaic_epoch: 36  # Apply Mosaic augmentation until this epoch; recommended: half of stop_epoch
      mosaic_prob: 0.75  # Probability of applying Mosaic augmentation
      stop_epoch: 72  # Disable all augmentations after this epoch (last 2 epochs run without augmentation)
  collate_fn:
    mixup_prob: 0.75  # Probability of applying MixUp augmentation
    mixup_epoch: 36  # Apply MixUp augmentation until this epoch; recommended: half of stop_epoch
```
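The optimizer block above selects parameter groups with regular expressions over parameter names. As a rough illustration of how such patterns can be applied (a sketch of the general technique, not this repo's actual config loader):

```python
import re
import torch

def build_param_groups(model, specs):
    """Assign each named parameter to the first regex pattern it matches;
    unmatched parameters fall through to a default group."""
    groups = [dict(spec, params=[]) for spec in specs]
    default = {'params': []}
    for name, param in model.named_parameters():
        for spec, group in zip(specs, groups):
            if re.match(spec['params'], name):
                group['params'].append(param)
                break
        else:
            default['params'].append(param)
    return [g for g in groups + [default] if g['params']]

# Patterns mirror the config above; groups without an explicit lr or
# weight_decay fall back to the optimizer-level defaults.
specs = [
    {'params': r'^(?=.*backbone)(?!.*(?:norm|bn|bias)).*$', 'lr': 2.5e-5},
    {'params': r'^(?=.*backbone)(?=.*(?:norm|bn|bias)).*$', 'lr': 2.5e-5, 'weight_decay': 0.0},
    {'params': r'^(?!.*\.backbone)(?=.*(?:norm|bn|bias)).*$', 'weight_decay': 0.0},
]

model = torch.nn.Sequential(torch.nn.Linear(4, 4))  # stand-in for an actual EdgeCrafter model
optimizer = torch.optim.AdamW(build_param_groups(model, specs),
                              lr=5e-4, betas=(0.9, 0.999), weight_decay=1e-4)
```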
Model configuration files for segmentation are located in `configs/ecseg`. The configuration structure is identical to detection, but it inherits from the detection config and adds segmentation-specific settings. For custom datasets, you may need to adjust the following settings in your config file (e.g., `ecseg_s.yml`):

```yaml
__include__: [
  '../dataset/coco.yml',  # Base dataset configuration. Replace with custom.yml when using a custom dataset
  '../ecdet/ecdet_s.yml',  # Inherit detection model configuration
  'ecseg.yml',  # Segmentation-specific configuration
]

train_dataloader:  # Add only the parameters you wish to override
  ...
```

Train from scratch using 4 GPUs:
```bash
# Detection
CUDA_VISIBLE_DEVICES=0,1,2,3 torchrun --standalone --nproc_per_node=4 \
    train.py -c configs/ecdet/ecdet_{SIZE}.yml --use-amp --seed=0

# Segmentation
CUDA_VISIBLE_DEVICES=0,1,2,3 torchrun --standalone --nproc_per_node=4 \
    train.py -c configs/ecseg/ecseg_{SIZE}.yml --use-amp --seed=0
```

Replace `{SIZE}` with `s`, `m`, `l`, or `x` based on your chosen model size.
Evaluate a trained model:

```bash
# Detection
CUDA_VISIBLE_DEVICES=0,1,2,3 torchrun --standalone --nproc_per_node=4 \
    train.py -c configs/ecdet/ecdet_{SIZE}.yml --test-only -r /path/to/model.pth

# Segmentation
CUDA_VISIBLE_DEVICES=0,1,2,3 torchrun --standalone --nproc_per_node=4 \
    train.py -c configs/ecseg/ecseg_{SIZE}.yml --test-only -r /path/to/model.pth
```

Fine-tune from a pre-trained checkpoint:
```bash
# Detection
CUDA_VISIBLE_DEVICES=0,1,2,3 torchrun --standalone --nproc_per_node=4 \
    train.py -c configs/ecdet/ecdet_{SIZE}.yml --use-amp --seed=0 -t /path/to/model.pth

# Segmentation
CUDA_VISIBLE_DEVICES=0,1,2,3 torchrun --standalone --nproc_per_node=4 \
    train.py -c configs/ecseg/ecseg_{SIZE}.yml --use-amp --seed=0 -t /path/to/model.pth
```

Additional utilities and tools can be found in the `tools` directory:
- **Visualization Tools**

  PyTorch inference:

  ```bash
  # Detection
  python tools/inference/torch_inf.py -c configs/ecdet/ecdet_{SIZE}.yml -r ecdet_{SIZE}.pth -i example.jpg

  # Segmentation
  python tools/inference/torch_inf.py -c configs/ecseg/ecseg_{SIZE}.yml -r ecseg_{SIZE}.pth -i example.jpg
  ```

  ONNX inference:

  ```bash
  # Detection
  python tools/inference/onnx_inf.py -o ecdet_{SIZE}.onnx -i example.jpg

  # Segmentation
  python tools/inference/onnx_inf.py -o ecseg_{SIZE}.onnx -i example.jpg
  ```

- **Export Tools**

  Export to ONNX format:

  ```bash
  # Detection
  python tools/deployment/export_onnx.py -c configs/ecdet/ecdet_{SIZE}.yml -r ecdet_{SIZE}.pth

  # Segmentation
  python tools/deployment/export_onnx.py -c configs/ecseg/ecseg_{SIZE}.yml -r ecseg_{SIZE}.pth
  ```
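If you prefer to consume the exported ONNX model directly rather than through `onnx_inf.py`, a minimal ONNX Runtime sketch follows. The input name, shape, and preprocessing here are assumptions, not guarantees about the export; inspect `session.get_inputs()` first and mirror the transforms that `tools/inference/onnx_inf.py` applies:

```python
import numpy as np
import onnxruntime as ort
from PIL import Image

session = ort.InferenceSession('ecdet_l.onnx', providers=['CPUExecutionProvider'])

# Inspect the export's actual signature first; some detection exports take
# more than one input (e.g., the original image size).
for inp in session.get_inputs():
    print(inp.name, inp.shape, inp.type)

# Assumed preprocessing: RGB, resized to 640x640, scaled to [0, 1], NCHW float32.
img = Image.open('example.jpg').convert('RGB').resize((640, 640))
x = np.asarray(img, dtype=np.float32).transpose(2, 0, 1)[None] / 255.0

outputs = session.run(None, {session.get_inputs()[0].name: x})
for out in outputs:
    print(out.shape)
```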