Feature Request: Replace Albumentations with torchvision.transforms.v2 for PyTorch pose estimation #3240

## Is your feature request related to a problem? Please describe.

Albumentations is an external dependency whose development appears to have slowed (release 1.4.3 dates from the first half of 2024).

## Describe the solution you'd like

Replace the current Albumentations-based augmentation system in the PyTorch pose estimation module (`deeplabcut/pose_estimation_pytorch/data/transforms.py`) with `torchvision.transforms.v2`.

Benefits:

- **Dependency reduction**: one fewer third-party package to install, pin, and track
- **Active development**: `torchvision.transforms.v2` is actively maintained alongside PyTorch
- **Better integration**: native `tv_tensors.KeyPoints` and `tv_tensors.BoundingBoxes` types that are transformed together with the image
- **Performance**: transforms operate directly on tensors, so they can potentially run on the GPU
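
One caveat on the keypoint support above: a built-in flip can only move coordinates. It has no notion that a left/right pair of body parts should trade labels, which is presumably what the keypoint-aware `HFlip` listed under Additional context handles. Below is a minimal, library-free sketch of that logic. The function name `hflip_keypoints`, the `symmetric_pairs` argument, and the continuous-coordinate convention `x' = width - x` are assumptions for illustration, not the existing DLC implementation.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float]


def hflip_keypoints(
    keypoints: Sequence[Point],
    width: float,
    symmetric_pairs: Sequence[Tuple[int, int]],
) -> List[Point]:
    """Mirror keypoints horizontally and swap left/right labels.

    Assumes continuous (x, y) coordinates, so a point at x maps to width - x.
    `symmetric_pairs` holds index pairs such as (left_ear, right_ear).
    """
    # Step 1: the purely geometric flip, which a generic transform can do.
    flipped = [(width - x, y) for x, y in keypoints]
    # Step 2: the pose-specific part, swapping the labels of mirrored parts.
    for left, right in symmetric_pairs:
        flipped[left], flipped[right] = flipped[right], flipped[left]
    return flipped


# Hypothetical keypoints: index 0 = left ear, 1 = right ear, 2 = nose.
example = hflip_keypoints(
    [(10.0, 5.0), (25.0, 6.0), (20.0, 15.0)],
    width=40.0,
    symmetric_pairs=[(0, 1)],
)
```

If the migration relies on the built-in v2 flip, step 2 would still need to live somewhere, for example in a thin wrapper applied after the flip.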

## Describe alternatives you've considered

1. **Keep Albumentations**: Continue using it, at the cost of relying on an external library whose development has slowed
2. **Kornia**: Another option, geared toward differentiable augmentations but less focused on standard image transforms
3. **Custom implementation**: Implement all transforms from scratch, which would be a large implementation and maintenance effort

## Additional context

The current implementation uses 15 or more Albumentations transforms:
- Resize, LongestMaxSize, HorizontalFlip, Affine, PadIfNeeded
- Equalize, MotionBlur, GaussNoise, CoarseDropout, ElasticTransform
- Custom transforms: HFlip (keypoint-aware), KeypointAwareCrop, KeepAspectRatioResize, etc.
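
As a starting point for scoping the work, the transforms above can be inventoried against candidate `torchvision.transforms.v2` classes. The mapping below is a rough sketch: the v2 names are candidates rather than verified drop-in equivalents (for example, `RandomErasing` erases a single rectangle per call, so it is not the same as `CoarseDropout`), and `None` only means no obvious built-in counterpart was identified here.

```python
from typing import Dict, List, Optional, Tuple

# Albumentations / DLC transform name -> candidate v2 class name.
# None marks transforms that likely need custom code (an assumption,
# not a verified statement about torchvision).
CANDIDATES: Dict[str, Optional[str]] = {
    "Resize": "Resize",
    "LongestMaxSize": None,  # may be expressible via Resize; not verified
    "HorizontalFlip": "RandomHorizontalFlip",
    "Affine": "RandomAffine",
    "PadIfNeeded": "Pad",  # Pad is unconditional, so the "if needed" logic is missing
    "Equalize": "RandomEqualize",
    "MotionBlur": None,
    "GaussNoise": "GaussianNoise",
    "CoarseDropout": "RandomErasing",
    "ElasticTransform": "ElasticTransform",
    # DLC-specific transforms from the list above.
    "HFlip": None,
    "KeypointAwareCrop": None,
    "KeepAspectRatioResize": None,
}


def split_inventory(
    candidates: Dict[str, Optional[str]],
) -> Tuple[List[str], List[str]]:
    """Return (names with a v2 candidate, names needing custom work)."""
    builtin = sorted(name for name, v2_name in candidates.items() if v2_name)
    custom = sorted(name for name, v2_name in candidates.items() if not v2_name)
    return builtin, custom


builtin, custom = split_inventory(CANDIDATES)
print(f"{len(builtin)} with a v2 candidate, {len(custom)} needing custom work")
```

Each candidate would still need to be checked for keypoint and bounding-box support before being counted as covered.
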

Relevant files:
- `deeplabcut/pose_estimation_pytorch/data/transforms.py` - main transforms implementation
- `deeplabcut/pose_estimation_pytorch/data/preprocessor.py`
- `deeplabcut/pose_estimation_pytorch/data/dataset.py`

I'd like to contribute to developing this feature but have some questions about the desired interface:
1. **Custom transforms vs. built-ins**: Should we reimplement the DLC-specific transforms (e.g., `KeypointAwareCrop` for cropping around annotated keypoints) on top of PyTorch, or adapt the pipeline to use only built-in v2 transforms?
2. **API compatibility**: Should the config format remain the same (e.g., `hflip`, `affine`, `crop_sampling`), or take advantage of v2's declarative API?
3. **Backward compatibility**: Should we maintain Albumentations as a fallback, or fully migrate to v2?
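
To make questions 1 and 2 concrete, here is one possible shape for a compatibility layer: keep the existing config keys and translate them into a plan of v2 transforms, collecting any keys that would need a custom transform. This is only a sketch under assumptions. The sub-keys `p`, `rotation`, and `scaling`, the default flip probability of 0.5, and the helper names `plan_transforms` and `build_pipeline` are hypothetical and not taken from the current DLC code.

```python
from typing import Any, Dict, List, Optional, Tuple

# (v2 class name, constructor kwargs, probability for RandomApply or None)
PlanEntry = Tuple[str, Dict[str, Any], Optional[float]]


def plan_transforms(config: Dict[str, Any]) -> Tuple[List[PlanEntry], List[str]]:
    """Translate legacy-style config keys into a v2 plan plus unhandled keys."""
    plan: List[PlanEntry] = []
    needs_custom: List[str] = []
    for key, value in config.items():
        if value is None or value is False:
            continue  # transform disabled in the config
        if key == "hflip":
            # Assumed default probability. Swapping left/right keypoint
            # labels, if required, would still need custom logic.
            plan.append(("RandomHorizontalFlip", {"p": 0.5}, None))
        elif key == "affine":
            kwargs = {
                "degrees": value.get("rotation", 0),
                "scale": tuple(value.get("scaling", (1.0, 1.0))),
            }
            # RandomAffine has no probability argument, so keep it separately.
            plan.append(("RandomAffine", kwargs, value.get("p", 1.0)))
        else:
            # e.g. crop_sampling: no built-in keypoint-aware crop is assumed.
            needs_custom.append(key)
    return plan, needs_custom


def build_pipeline(plan: List[PlanEntry]):
    """Instantiate the plan. Requires torchvision, imported lazily."""
    from torchvision.transforms import v2

    steps = []
    for name, kwargs, p in plan:
        transform = getattr(v2, name)(**kwargs)
        steps.append(transform if p is None else v2.RandomApply([transform], p=p))
    return v2.Compose(steps)


plan, needs_custom = plan_transforms(
    {
        "hflip": True,
        "affine": {"p": 0.5, "rotation": 30, "scaling": [0.8, 1.2]},
        "crop_sampling": {"width": 448, "height": 448},
    }
)
```

With this split, the config format could stay unchanged while `needs_custom` makes explicit which transforms still require DLC-specific implementations.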

Thank you in advance! 
Juan
