Skip to content

V3.3#53

Merged
VisionDepth merged 4 commits intoMain-Stablefrom
V3.3
Aug 10, 2025
Merged

V3.3#53
VisionDepth merged 4 commits intoMain-Stablefrom
V3.3

Conversation

@VisionDepth
Copy link
Owner

3D Pipeline Improvements – v3.3

  • Black Bar Handling

    • Detects and crops letterbox bars before depth estimation.
    • Re-adds bars post-processing with neutral depth to prevent stereo artifacts.
  • Depth Stability & Quality

    • Added temporal depth filtering (EMA) for smoother frame-to-frame depth.
    • Percentile-based normalization for consistent depth range across scenes.
    • Midtone shaping to enhance perceived depth and separation.
    • Optional curvature enhancement for more natural object roundness.
  • Stereo & Parallax Control

    • Dynamic IPD scaling based on scene depth variance.
    • Shift smoothing for foreground, midground, and background layers.
    • Edge-aware masking and feathering to reduce tearing/ghosting.
    • Subject-tracked zero-parallax plane with eased floating window adjustments.
    • Optional dynamic convergence bias tied to subject position.
  • Image Quality

    • GPU-accelerated depth-of-field with multi-level Gaussian blending.
    • Brightness-preserving sharpening with anti-clipping.
  • Framing & Output

    • Per-frame optional black bar cropping in the stereo stage.
    • Aspect-ratio-safe resizing and padding for per-eye frame alignment.
    • Multiple stereo formats supported (Full-SBS, Half-SBS, VR, anaglyph, interlaced).
  • Encoding & Stability

    • FFmpeg pipeline over stdin with correct CRF/CQ handling.
    • Graceful cleanup and resource release on completion or cancel.
    • Detailed logging for crop, skip, and encoder decisions.

Depth Pipeline Improvements – v3.3

  • Letterbox Detection & Cropping

    • Added ignore_letterbox_bars option to automatically detect top/bottom black bars.
    • Crops bars before sending frames to the depth model for cleaner predictions.
    • Stores bar metadata (top/bottom size, original resolution) in a .letterbox.json sidecar file.
  • Preserve Original Resolution

    • After depth prediction, resizes the depth map back to the cropped region size.
    • Re-adds original bars with a neutral depth value to avoid 3D distortion.
    • Ensures depth output matches original width/height exactly.
  • Crash Prevention & Safety Checks

    • Resets bar values to zero if detections are invalid (too large or empty frames).
    • Handles fade-ins and all-black first frames without breaking processing.
  • Grayscale Conversion Refactor

    • Unified convert_depth_to_grayscale() into a single reusable function.
    • Works with PIL.Image, torch.Tensor, and numpy.ndarray.
    • Includes NaN checks, shape handling, and safe fallbacks for bad frames.
  • Output Consistency

    • Standardized grayscale depth conversion across all depth output paths.
    • Maintains uniform depth range handling for downstream 3D processing.

### 3D Pipeline Improvements – v3.3

- **Black Bar Handling**
  - Detects and crops letterbox bars before depth estimation.
  - Re-adds bars post-processing with neutral depth to prevent stereo artifacts.

- **Depth Stability & Quality**
  - Added temporal depth filtering (EMA) for smoother frame-to-frame depth.
  - Percentile-based normalization for consistent depth range across scenes.
  - Midtone shaping to enhance perceived depth and separation.
  - Optional curvature enhancement for more natural object roundness.

- **Stereo & Parallax Control**
  - Dynamic IPD scaling based on scene depth variance.
  - Shift smoothing for foreground, midground, and background layers.
  - Edge-aware masking and feathering to reduce tearing/ghosting.
  - Subject-tracked zero-parallax plane with eased floating window adjustments.
  - Optional dynamic convergence bias tied to subject position.

- **Image Quality**
  - GPU-accelerated depth-of-field with multi-level Gaussian blending.
  - Brightness-preserving sharpening with anti-clipping.

- **Framing & Output**
  - Per-frame optional black bar cropping in the stereo stage.
  - Aspect-ratio-safe resizing and padding for per-eye frame alignment.
  - Multiple stereo formats supported (Full-SBS, Half-SBS, VR, anaglyph, interlaced).

- **Encoding & Stability**
  - FFmpeg pipeline over stdin with correct CRF/CQ handling.
  - Graceful cleanup and resource release on completion or cancel.
  - Detailed logging for crop, skip, and encoder decisions.
### Depth Pipeline Improvements – v3.3

- **Letterbox Detection & Cropping**
  - Added `ignore_letterbox_bars` option to automatically detect top/bottom black bars.
  - Crops bars before sending frames to the depth model for cleaner predictions.
  - Stores bar metadata (top/bottom size, original resolution) in a `.letterbox.json` sidecar file.

- **Preserve Original Resolution**
  - After depth prediction, resizes the depth map back to the cropped region size.
  - Re-adds original bars with a neutral depth value to avoid 3D distortion.
  - Ensures depth output matches original width/height exactly.

- **Crash Prevention & Safety Checks**
  - Resets bar values to zero if detections are invalid (too large or empty frames).
  - Handles fade-ins and all-black first frames without breaking processing.

- **Grayscale Conversion Refactor**
  - Unified `convert_depth_to_grayscale()` into a single reusable function.
  - Works with `PIL.Image`, `torch.Tensor`, and `numpy.ndarray`.
  - Includes NaN checks, shape handling, and safe fallbacks for bad frames.

- **Output Consistency**
  - Standardized grayscale depth conversion across all depth output paths.
  - Maintains uniform depth range handling for downstream 3D processing.
Version Tag Change
@VisionDepth VisionDepth merged commit 3df9af9 into Main-Stable Aug 10, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant