Is your feature request related to a problem? Please describe.
Albumentations is an external dependency whose development has slowed down (last major release 1.4.3 in late 2023).
Describe the solution you'd like
Replace the current Albumentations-based augmentation system in the PyTorch pose estimation module (deeplabcut/pose_estimation_pytorch/data/transforms.py) with torchvision.transforms.v2.
Benefits:
- Dependency reduction: Fewer external dependencies
- Active development: torchvision.transforms.v2 is actively maintained
- Better integration: Native support for tv_tensors.KeyPoints and tv_tensors.BoundingBoxes
- Performance: Optimized for PyTorch with potential GPU acceleration
Describe alternatives you've considered
- Keep Albumentations: Continue using it, but this relies on an external library with slower development
- Kornia: Another option for differentiable augmentations, but less focused on standard image transforms
- Custom implementation: Implement all transforms from scratch (too much work)
Additional context
The current implementation uses more than 15 Albumentations transforms, including:
- Resize, LongestMaxSize, HorizontalFlip, Affine, PadIfNeeded
- Equalize, MotionBlur, GaussNoise, CoarseDropout, ElasticTransform
- Custom transforms: HFlip (keypoint-aware), KeypointAwareCrop, KeepAspectRatioResize, etc.
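To illustrate why some of these transforms are custom, here is a minimal sketch of a keypoint-aware horizontal flip written in plain PyTorch. It assumes that "keypoint-aware" means swapping left/right keypoints after the flip; the function name, the `symmetric_pairs` argument, and the `width - 1 - x` pixel convention are my assumptions, not the existing DLC implementation.

```python
import torch


def hflip_with_symmetry(image, keypoints, symmetric_pairs):
    """Flip a (C, H, W) image and its (K, 2) keypoints given as (x, y).

    symmetric_pairs is a hypothetical list of (i, j) keypoint indices,
    e.g. (left_ear, right_ear), whose identities swap under a flip.
    """
    width = image.shape[-1]
    flipped_image = torch.flip(image, dims=[-1])

    flipped_kpts = keypoints.clone()
    # Assumed convention: integer x coordinates index pixel columns.
    flipped_kpts[:, 0] = (width - 1) - flipped_kpts[:, 0]

    # Swap each left/right pair so labels stay anatomically correct.
    for i, j in symmetric_pairs:
        flipped_kpts[[i, j]] = flipped_kpts[[j, i]]
    return flipped_image, flipped_kpts


image = torch.arange(8, dtype=torch.float32).reshape(1, 2, 4)
keypoints = torch.tensor([[0.0, 0.0], [3.0, 1.0], [1.0, 1.0]])
new_image, new_keypoints = hflip_with_symmetry(image, keypoints, [(0, 1)])
print(new_keypoints)
```

A built-in v2.RandomHorizontalFlip would handle the coordinate flip, but the pair swap is dataset-specific, so it would probably still need a thin custom wrapper.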
Relevant files:
- deeplabcut/pose_estimation_pytorch/data/transforms.py (main transforms implementation)
- deeplabcut/pose_estimation_pytorch/data/preprocessor.py
- deeplabcut/pose_estimation_pytorch/data/dataset.py
I'd like to contribute to developing this feature but have some questions about the desired interface:
- Custom transforms vs. built-ins: Should we implement custom DLC-specific transforms (e.g., KeypointAwareCrop for cropping around annotated keypoints) using PyTorch, or adapt to use built-in v2 transforms?
- API compatibility: Should the config format remain the same (e.g., hflip, affine, crop_sampling), or take advantage of v2's declarative API?
- Backward compatibility: Should we maintain Albumentations as a fallback, or fully migrate to v2?
Thank you in advance!
Juan