
DANTE — DAfNe TrainEr

License: GPL v3

PyTorch-based model trainer for the Dafne segmentation framework. Trains 2D and 3D U-Net-style models on medical images and serializes them into the .model format used by dafne-dl.

Installation

pip install dante-trainer

Requires Python >= 3.9. A CUDA-capable GPU is strongly recommended for training.

Entry points

Command      Description
dante        Launch the PyQt5 GUI trainer
dante_train  Command-line training interface

Input data format

Training data must be .npz files, each containing:

  • data: the image volume (numpy array)
  • mask_<label>: one binary mask per anatomical structure (e.g. mask_muscle, mask_femur)
  • resolution: voxel spacing array

The data folder is scanned recursively. All .npz files found are split into train and validation sets automatically.
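A minimal script for producing a training file in this layout might look like the following sketch. The array shapes, the muscle label, and the file name are purely illustrative; only the key names (data, mask_<label>, resolution) come from the format described above.

```python
import os
import tempfile

import numpy as np

# Illustrative training case: one image volume, one binary mask, voxel spacing.
volume = np.random.rand(64, 64, 16).astype(np.float32)   # image volume
mask_muscle = (volume > 0.5).astype(np.uint8)            # binary mask for one structure
resolution = np.array([1.0, 1.0, 3.0])                   # voxel spacing (e.g. mm)

out_dir = tempfile.mkdtemp()
path = os.path.join(out_dir, "case_001.npz")
np.savez_compressed(path,
                    data=volume,
                    mask_muscle=mask_muscle,
                    resolution=resolution)

# Round-trip check: the stored keys match the expected layout.
loaded = np.load(path)
print(sorted(loaded.files))  # ['data', 'mask_muscle', 'resolution']
```

Additional structures would simply add more mask_<label> arrays (e.g. mask_femur) to the same file.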

Output

All files produced by a training run are saved inside a dedicated folder named after the model, created automatically under the output directory. For example, given --output /models/mymodel.model, the following structure is created:

/models/mymodel/
    mymodel.model          # final serialized model (DynamicTorchModel format)
    mymodel_best_model.pth # best checkpoint by validation Dice (removed after packaging)
    mymodel.csv            # per-epoch metrics log
    logs/
        train/             # TensorBoard training logs
        val/               # TensorBoard validation logs

The .model file embeds:

  • model weights
  • network architecture metadata (model name, spatial dims, patch size, spacing, etc.)
  • training metadata
  • EWC snapshot (Fisher Information Matrix + parameter snapshot, used for continual learning)
  • a dependency hint pointing to dafne-monai-inference for inference-time use

CLI usage

Training from scratch

dante_train --data <data_dir> --output <output_path> [options]

Argument           Short  Default   Description
--data             -d     required  Path to the folder containing training data
--output           -o     required  Output path for the .model file
--epochs                  50        Number of training epochs
--batch-size              2         Batch size
--lr                      0.001     Learning rate
--3d                      off       Train a 3D model (default: 2D)
--dynunet                 off       Use Dynamic U-Net with auto-computed parameters
--levels                  5         Number of U-Net encoder/decoder levels
--kernel-size             3         Convolution kernel size
--conv-layers             2         Number of convolutional layers per level
--early-stopping          off       Stop training when validation loss stops improving
--mixed-precision         off       Enable AMP (automatic mixed precision)
--scheduler               off       Enable learning rate scheduler

Example:

dante_train -d /data/training_set -o /models/my_model.model --epochs 100 --lr 0.0005 --early-stopping

Fine-tuning an existing model

Pass --pretrained with the path to an existing .model file, and set --mode to finetune, lora, or continual.

dante_train --data <data_dir> --output <output_path> --pretrained <model_path> --mode finetune [options]

Argument            Default  Description
--pretrained        none     Path to a pretrained .model file
--mode              scratch  Training mode: scratch, finetune, lora, or continual
--freeze-degree     0.5      Fraction of layers to freeze (used with --mode finetune)
--gradual-unfreeze  off      Gradually unfreeze frozen layers during training
--lora-rank         8        LoRA rank (used with --mode lora)
--lora-alpha        16       LoRA alpha scaling factor (used with --mode lora)
--lambda-reg        1.0      EWC regularization weight (used with --mode continual)

Example — fine-tuning with 70% of layers frozen:

dante_train -d /data/new_data -o /models/finetuned.model --pretrained /models/base.model \
    --mode finetune --freeze-degree 0.7 --gradual-unfreeze --epochs 30

Example — LoRA adaptation:

dante_train -d /data/new_data -o /models/lora.model --pretrained /models/base.model \
    --mode lora --lora-rank 8 --lora-alpha 16 --epochs 30
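
The roles of --lora-rank and --lora-alpha can be illustrated with a toy NumPy sketch of the general LoRA idea (this is not DANTE's actual implementation): a frozen weight matrix W is adapted through a trainable low-rank update B @ A, scaled by alpha / rank. Only A and B would receive gradients.

```python
import numpy as np

d_out, d_in = 32, 64   # layer dimensions (illustrative)
rank, alpha = 8, 16    # mirror --lora-rank and --lora-alpha

rng = np.random.default_rng(0)
W = rng.standard_normal((d_out, d_in))        # frozen base weight
A = rng.standard_normal((rank, d_in)) * 0.01  # trainable down-projection
B = np.zeros((d_out, rank))                   # trainable up-projection (zero init)

def lora_forward(x):
    # Frozen base path plus scaled low-rank adapter path.
    return W @ x + (alpha / rank) * (B @ (A @ x))

x = rng.standard_normal(d_in)
# With B initialized to zero, the adapter starts as an exact no-op:
assert np.allclose(lora_forward(x), W @ x)
print("adapter params:", A.size + B.size, "vs base:", W.size)  # 768 vs 2048
```

The parameter count shows why this suits low-data adaptation: the adapter trains a small fraction of the weights the full layer would require.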

Training modes

  • From scratch (--mode scratch): network architecture and preprocessing are derived automatically from dataset statistics (median spacing, median shape, label count).
  • Fine-tuning (--mode finetune): loads an existing .model file and resumes training, preserving the original architecture. Supports partial freezing and gradual unfreezing.
  • LoRA (--mode lora): injects low-rank adapter layers into the frozen base model. Only adapter weights are trained. Useful for adaptation with very little data.
  • Continual learning (--mode continual): fine-tunes on a new task while penalizing changes to weights that were important for the previous task, using Elastic Weight Consolidation (EWC). Requires --pretrained pointing to a .model file produced by a prior training run.
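
The EWC penalty used in continual mode can be sketched generically as follows (a toy NumPy illustration of the standard EWC formula, not DANTE's code): parameters are pulled back toward their previous-task snapshot, with the quadratic cost of each deviation weighted by that parameter's Fisher information. lambda_reg mirrors the --lambda-reg flag.

```python
import numpy as np

def ewc_penalty(theta, theta_star, fisher, lambda_reg=1.0):
    # 0.5 * lambda * sum_i F_i * (theta_i - theta*_i)^2
    return 0.5 * lambda_reg * np.sum(fisher * (theta - theta_star) ** 2)

theta_star = np.array([1.0, -2.0, 0.5])   # parameter snapshot from the prior task
fisher     = np.array([10.0, 0.1, 1.0])   # per-parameter importance (diagonal FIM)
theta      = np.array([1.2, -1.0, 0.5])   # current parameters

# A small move of the "important" first parameter (F=10) costs more than a
# large move of the unimportant second one (F=0.1):
print(ewc_penalty(theta, theta_star, fisher))  # ≈ 0.25
```

This is exactly the role of the EWC snapshot embedded in the .model file: it supplies theta_star and the Fisher information for the next continual-learning run.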
