Release notes from VisionDepth3D

VisionDepth3Dv3.8.2 Release

2026-02-11T22:18:49Z

VisionDepth3D v3.8.2

This release brings major depth engine upgrades, large real-time performance gains, and important stability fixes across both offline rendering and live 3D preview.

Expect faster playback, cleaner depth output, improved codec reliability, and a smoother overall workflow.

New Depth Engines

Depth Anything 3 (DA3) Integration

Native DA3 backend (not Hugging Face pipeline based)
Supports DA3 Small, Base, Large, Giant, and Metric variants
Proper resolution handling and depth normalization
Faster warm-up and improved batching support

Video Depth Anything (VDA)

Sequence-aware video depth inference
Temporal processing for smoother depth output
Target FPS control for heavy footage
Unified post-processing with other depth engines

Performance Improvements

Live 3D

40 to 70 percent FPS increase on most GPUs
Persistent GPU buffers eliminating per-frame allocations
Smoother depth refresh scheduling
Reduced jitter and stutter

Offline Depth and Rendering

Single-pass resizing reducing CPU overhead
Faster FP16 GPU inference
Optimized ONNX runtime sessions
FFmpeg piping enabled by default for faster encoding

Stability and Quality Fixes

ONNX Models

Fixed Distill-Any-Depth shape mismatch crashes
Enforced correct inference resolution (518×518)
Aspect-ratio safe preprocessing without stretching
Cleaner backend detection and logging
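
As a rough illustration of aspect-ratio safe preprocessing, the sketch below computes a scaled size plus padding that fits a frame inside a 518×518 model input without stretching. The helper name `fit_with_padding` and the centered padding are assumptions for illustration, not VisionDepth3D's actual code.

```python
def fit_with_padding(src_w, src_h, target=518):
    """Scale (src_w, src_h) to fit inside a target x target square, then pad.

    Returns (new_w, new_h, pad_left, pad_top, pad_right, pad_bottom).
    Hypothetical helper: centering the image in the padding is an assumption.
    """
    if src_w <= 0 or src_h <= 0:
        raise ValueError("source dimensions must be positive")

    # One scale factor for both axes keeps the aspect ratio intact.
    scale = min(target / src_w, target / src_h)
    new_w = max(1, round(src_w * scale))
    new_h = max(1, round(src_h * scale))

    # Whatever is left over becomes padding, split as evenly as possible.
    pad_x = target - new_w
    pad_y = target - new_h
    pad_left, pad_top = pad_x // 2, pad_y // 2
    return new_w, new_h, pad_left, pad_top, pad_x - pad_left, pad_y - pad_top
```

A 1920×1080 frame scales to 518×291 and the remaining rows are padded above and below, so the picture is never squeezed into a square.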

Letterbox Handling

Black bars no longer corrupt depth maps
Neutral depth fill prevents white banding artifacts
Stable detection across frames

3D Generator Improvements

Full render state reset per clip to eliminate drift artifacts
Smoother convergence and floating window behavior
Corrected output sizing for VR, SBS, and interlaced modes
Optional convergence crosshair overlay in Preview GUI
Cleaner encoding settings layout

GUI and Workflow Enhancements

Depth Estimation tab renamed to Depth Engine
Fixed preset loading from menu dropdown
Fixed Output Path menu action
Simplified File menu using preset system only
Built-in VisionDepth3D Updater in Help menu
Confirmation prompt before updating

Upgrade Note

Back up your weights/ and presets/ folders before uninstalling older versions.

Use VisionDepth3D_Setup_Downloader to install v3.8.2 and required .bin files.

Thanks to everyone supporting VisionDepth3D and helping shape each release.

VisionDepth3Dv3.8.1 - Release Bug patch

2025-12-28T17:42:36Z

VisionDepth3D v3.8.1 – Bug Patch

1) Depth Estimation Inference Error

Fixed a progress bar error in depth estimation that caused inference to fail.

Back up your weights/ and presets/ folders before uninstalling v3.8
Then run VisionDepth3D_Setup_Downloader to download the official
VisionDepth3D v3.8.1 Windows installer and required .bin files.

VisionDepth3Dv3.8 - Release

2025-12-18T02:28:52Z

VisionDepth3D v3.8 – Changelog

1) Depth Estimation Tab

Depth Models

Fixed ONNX model loading:
- Distill-Any-Depth (inference resolution 518×518, batch size 8)
- Video Depth Anything (inference resolution 512×288, batch size 8)
Implemented LBM depth model (dev version). Thanks to Aether for the implementation fix.
Removed depth models from the dropdown that returned no d_type.
Fixed Hugging Face model downloads and caching so zoo models consistently save inside the app weights/ directory (no more extra .cache downloads).
Updated Transformers image processor loading to prefer use_fast=True when available (with automatic fallback when unsupported).

Depth Backend

Implemented temporal smoothing in the depth pipeline to reduce flicker and improve temporal stability of depth map output.
Packaged VisionDepth3D.exe with Distill-Any-Depth (ONNX), Video Depth Anything (ONNX), and Depth Anything v2 Giant weights.

2) 3D Render Tab

UI Fixes

Added buttons for encoder settings and processing options.
Implemented multi-language support and tooltips for new dialog boxes.
Adjusted preview image window size and video info layout to prevent window overflow.
3D tab columns now stack correctly when resizing the window on smaller screens.

3D Backend

Reworked Auto Crop Black Bars to use first-frame detection with cached crop reuse.
Prevents per-frame crop jitter and depth/frame misalignment.
Improves stability for cinema content with subtle letterboxing.
Keep Audio checkbox now respects the user-selected output container instead of forcing MP4.

3) Frametool Backend

Reworked Frametool backend to support SSResNet models for feature model integration.

4) Console Improvements

Standardized startup console messages to clearly reflect which subsystems are initializing (Torch, depth estimation, upscaler, external 3D pipeline, language, settings).
Unified compute device reporting across pipelines for consistent and clearer console output.
Suppressed optional xFormers dependency warning on startup.
Prevented duplicate language loading during settings restore.

Summary

v3.8 focuses on stabilizing depth estimation, improving model compatibility,
and refining the 3D Render tab UI with better layout behavior, clearer diagnostics, and improved localization support.

Back up your weights/ and presets/ folders before uninstalling v3.7.
Then run VisionDepth3D_Setup_Downloader to download the official
VisionDepth3D v3.8 Windows installer and required .bin files.

(Optional but recommended) Clear the Hugging Face cache to free space and
avoid duplicate model downloads:
C:\Users\YOUR_USERNAME\.cache\huggingface

VisionDepth3Dv3.7 - Release

2025-11-26T17:27:10Z

VisionDepth3D v3.7 – Release Changelog

1) Live 3D Capture Overhaul

Live 3D Capture received a full stability and quality pass.

What is new:

Optional live audio passthrough for external capture devices, with device selection and audio delay control.
Audio routed through DirectShow and WASAPI, with an FFplay based monitor for low latency listening.
Color channel controls to fix purple and red tint issues on some capture cards.
Tuning for real time depth inference so Live Capture can run at practical frame rates on 1080p HDMI sources.
A headless mode (--no-preview) so you can run capture without a local preview window.
Early groundwork for browser based SBS VR streaming with synchronized audio and video.

What is fixed:

GUI settings (resolution, backend, FPS and more) are now correctly applied when starting Live Capture from the UI.
Capture failures like “no frames arriving” are resolved by enforcing the correct fourcc and backend.
Audio is now present in Live Capture sessions instead of silent output.
Frame pacing is smoother and depth plus stereo warp no longer hit the same FPS bottlenecks as before.

2) Floating Window, Depth Stability and Black Bar Handling

The stereo presentation pipeline has been tightened up for more comfortable 3D.

Dynamic Floating Window (DFW):

Rebuilt the floating window logic so it masks only one edge at a time, based on the dominant parallax direction.
Adds a minimum parallax threshold so the window stays off when depth is near the screen plane.
Uses temporal smoothing and easing so the window glides in and out instead of popping or flickering.
Supports both soft faded edges and solid black cinema bars through a single toggle.

Result: fewer edge violations, a cleaner frame in VR and on monitors, and a more cinema friendly presentation.

Frame jitter and temporal stability:

Fixed depth “breathing” where scenes would appear to move in and out over time.
Introduced several smoothing passes over subject depth, depth percentiles and convergence.
Added a global parallax smoother for foreground, midground and background layers.

Result: more stable parallax over time, less shimmer and a more comfortable stereo experience.

Auto crop for black bars:

Improved black bar detection during fades and dark transitions.
Added guards so detection does not update on very dark frames.
Handles changes in letterbox height without vertical drift.

Result: 2.35:1 and similar letterboxed content now auto crops in a reliable and repeatable way.

3) Unified Depth Pipeline and Platform Support

The Depth tab has been upgraded into a unified, cross platform pipeline.

Multi backend support (CUDA, ROCm, MPS, CPU):

Device detection has been rewritten so CUDA is no longer assumed by default.
The app now picks the best available backend automatically.

Supported depth backends:

CUDA on NVIDIA GPUs
ROCm on AMD GPUs
MPS on Apple Silicon
CPU fallback when no GPU is present

This prevents crashes on AMD and macOS, avoids accidental CPU only runs on capable GPUs, and lays the foundations for Linux builds.
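
One way to picture backend selection is a small priority function fed by PyTorch probes. This is a sketch only: the function names and the priority order are assumptions, not the app's rewritten detection code. It relies on the fact that ROCm builds of PyTorch report through `torch.cuda` and set `torch.version.hip`.

```python
def pick_backend(has_cuda, has_hip, has_mps):
    """Return a backend name from probe results (hypothetical priority order)."""
    if has_cuda and has_hip:
        return "rocm"   # ROCm builds expose the GPU through the torch.cuda API
    if has_cuda:
        return "cuda"
    if has_mps:
        return "mps"
    return "cpu"        # fallback when no GPU backend is usable


def probe_backend():
    """Probe PyTorch if it is installed; otherwise report CPU."""
    try:
        import torch
    except ImportError:
        return "cpu"
    has_cuda = torch.cuda.is_available()
    has_hip = getattr(torch.version, "hip", None) is not None
    mps = getattr(torch.backends, "mps", None)
    has_mps = mps is not None and mps.is_available()
    return pick_backend(has_cuda, has_hip, has_mps)
```

Keeping the decision in a pure function makes it easy to check the order without needing each kind of hardware on hand.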

Codec selection for depth exports:

The Depth tab now has a Video Codec dropdown.
You can select from hardware encoders (NVENC, AMF, QSV) as well as CPU encoders (libx264, libx265, AV1, and legacy MPEG-4 variants).
XVID and other problematic codecs now have safer defaults and better behavior on non NVIDIA systems.
AV1 has guard rails where OpenCV decoding is limited, with warnings where needed.
Codec support is now aligned with the 3D Converter and FrameTools.

Depth pipeline control:

Depth renders now support Pause, Resume and Cancel.
Pausing now releases resources more safely, and canceling avoids corrupt output files.
Clear status states show when a job is running, paused, canceling or completed.

4) 3D Pipeline and UX Polish

The main 3D converter pipeline has been cleaned up and extended.

New Keep Original Audio option to pass through source audio into the final 3D export without re encoding.
New image based 3D pipeline that runs through the same renderer, ideal for single frame 3D stills.
Mode selector is now wired to switch cleanly between Single, Batch and Image workflows inside the same UI.
A 3D filename suffix system automatically labels exports by format and eye mode
(examples: _LRF_Full_SBS, _LRF_Half_SBS, _VR, _Anaglyph, _Interlaced, _LRF_Left, _LRF_Right).
Multi language labels and tooltips across the app have been reviewed and cleaned up.
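
The filename suffix idea can be sketched as a lookup plus a path rewrite. The suffix strings come from the examples above; the dictionary keys and the function name `with_3d_suffix` are hypothetical.

```python
from pathlib import Path

# Suffix values are the examples from the notes; the keys are made up for this sketch.
SUFFIXES = {
    "full_sbs": "_LRF_Full_SBS",
    "half_sbs": "_LRF_Half_SBS",
    "vr": "_VR",
    "anaglyph": "_Anaglyph",
    "interlaced": "_Interlaced",
    "left": "_LRF_Left",
    "right": "_LRF_Right",
}


def with_3d_suffix(path, mode):
    """Insert the suffix for `mode` before the file extension."""
    p = Path(path)
    suffix = SUFFIXES[mode]
    if p.stem.endswith(suffix):
        return str(p)  # already labeled, leave it alone
    return str(p.with_name(p.stem + suffix + p.suffix))
```

For example, `with_3d_suffix("movie.mp4", "half_sbs")` gives `movie_LRF_Half_SBS.mp4`.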

5) Depth Blender Preview

The Depth Blender tab has been upgraded into a more visual tool.

Live preview now shows the base V2 depth map and the blended result side by side.
All blend parameters (white strength, feather blur, CLAHE, bilateral filters) update the preview in real time.
A frame scrubber lets you move through frames in a sequence and see how the blend behaves across time, before running a full batch on folders or videos.

Summary

VisionDepth3D v3.7 focuses on stability, cross platform support and workflow quality.

Live 3D Capture is more stable, more accurate and closer to being stream ready.
The stereo pipeline has better temporal behavior and cleaner edges.
The Depth tab now runs on NVIDIA, AMD ROCm, Apple Silicon and CPU only setups with flexible codecs and playback options.
The 3D converter and Depth Blender both gained quality of life improvements that make it easier to preview, tune and export 3D content.

These changes set the stage for future Linux builds, more advanced streaming paths and additional 3D presets in upcoming releases.

How to Install

Go to the VisionDepth3D Releases page
Download the latest installer .exe and .bin parts
Place all files in the same folder
Run the .exe installer and follow the prompts
Launch VisionDepth3D from the Start Menu or Desktop shortcut

Download the VisionDepth3D Release Installer to simplify fetching the most recent releases.

For source installation and advanced setup see the Installation Guide.

VisionDepth3Dv3.6.2 - Release

2025-10-08T18:14:52Z

VisionDepth3D v3.6.2 – Bug Patches & Cleanup

Adapters
• Fixed Hugging Face call in depthanything_adapter.py.
• Fixed Depth Anything V2 Giant download in VisionDepth3D.py.

UI & Codec
• Fixed threaded render button.
• Fixed codec bug where output wasn’t respecting selected codec.

Codebase Cleanup
• Removed broken/unused DepthCrafter files (depth_crafter_ppl.py, depthcrafter_adapter.py, weights dir).
• Cleaned up render_depth.py, dropped legacy/unused code.

Assets
• Deleted old previews and logo icon.
• Added updated UI photos.

VisionDepth3Dv3.6 - Release

2025-10-06T14:02:33Z

VisionDepth3D v3.6 Release

This update is all about quality and speed. A brand-new Depth Blender tab lets you mix models with precision for cleaner separation and smoother parallax, while HDR10 handling has been rebuilt to preserve true 10-bit color and metadata. The experimental Live 3D pipeline makes its debut, turning capture cards, consoles, and webcams into real-time 3D feeds. Upscaling and interpolation have been overhauled with threaded workers, dropping render times from 10 hours to ~1 hour on long projects. Add in clip-range rendering, direct Left/Right output, smarter padding, codec fixes, and a full UI overhaul — v3.6 is the most refined and flexible VisionDepth3D yet.

1) Upscaling & Interpolation – Massive Speed Boost

Rewritten Frames tab pipeline with threaded workers + queues
RIFE, ESRGAN, and FFmpeg writing now run concurrently instead of sequentially
Intelligent frame indexing and buffering preserve order while maximizing throughput
Render time reduced from 10 hours → ~1 hour on long clips
Result: creators can upscale and interpolate full-length videos in a fraction of the time without crashes or dropped frames
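
The concurrent design described above can be sketched with standard-library threads and bounded queues. This is a simplified illustration, not the Frames tab code: each stage runs in one thread, so frame order is preserved by construction, and the stage functions stand in for interpolation, upscaling, and writing. Error handling is omitted.

```python
import queue
import threading

_DONE = object()  # sentinel that marks the end of the frame stream


def _stage(func, inbox, outbox):
    """Pull (index, frame) items, apply func, and pass the result on."""
    while True:
        item = inbox.get()
        if item is _DONE:
            outbox.put(_DONE)  # forward the sentinel so later stages stop too
            return
        index, frame = item
        outbox.put((index, func(frame)))


def run_pipeline(frames, stages, maxsize=8):
    """Run the stage functions concurrently; return (index, result) in order."""
    # Bounded queues keep a fast stage from buffering unlimited frames.
    queues = [queue.Queue(maxsize=maxsize) for _ in range(len(stages) + 1)]
    threads = [
        threading.Thread(target=_stage, args=(f, queues[i], queues[i + 1]), daemon=True)
        for i, f in enumerate(stages)
    ]

    def feed():
        for i, frame in enumerate(frames):
            queues[0].put((i, frame))
        queues[0].put(_DONE)

    threads.append(threading.Thread(target=feed, daemon=True))
    for t in threads:
        t.start()

    results = []
    while True:
        item = queues[-1].get()
        if item is _DONE:
            break
        results.append(item)
    for t in threads:
        t.join()
    return results
```

While one frame is being "written", the next can already be in the upscale stage and a third in interpolation, which is where the throughput gain comes from.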

2) Depth Pipeline – Refinements & Blending

New Depth Blender tab with sliders for model blend weights
Improved 16-bit depth output handling for smoother disparity
Early percentile clipping reduces outliers without flattening depth
Added Depth Anything V2 Giant model support
Added FP16 precision toggle for faster inference and reduced VRAM use
Result: cleaner separation between foreground and background, less fuzz, and more consistent 3D parallax
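
Percentile clipping can be illustrated on a flat list of depth values. The 1st and 99th percentile defaults and the function names are assumptions for this sketch; the notes do not state the values the app uses.

```python
def percentile(sorted_vals, pct):
    """Linearly interpolated percentile of a sorted, non-empty list."""
    pos = (len(sorted_vals) - 1) * pct / 100.0
    lo = int(pos)
    hi = min(lo + 1, len(sorted_vals) - 1)
    frac = pos - lo
    return sorted_vals[lo] * (1 - frac) + sorted_vals[hi] * frac


def clip_and_normalize(depth, low_pct=1.0, high_pct=99.0):
    """Clip depth to [low_pct, high_pct] percentiles and rescale to 0..1."""
    vals = sorted(depth)
    lo = percentile(vals, low_pct)
    hi = percentile(vals, high_pct)
    if hi <= lo:
        return [0.0 for _ in depth]  # flat input: nothing to stretch
    return [min(max((d - lo) / (hi - lo), 0.0), 1.0) for d in depth]
```

A few extreme pixels no longer set the range, so the bulk of the scene keeps its depth contrast instead of being compressed.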

3) HDR10 Support – Preservation & Metadata

Fixed washed-out HDR outputs when re-encoding
Preserves:
- 10-bit pixel format (yuv420p10le)
- BT.2020 color space
- PQ curve (smpte2084)
- HDR metadata (Master Display / MaxCLL)
UI toggle: Preserve HDR10 Metadata
Result: HDR content now keeps its original punch and dynamic range

4) Experimental Live 3D (WIP)

Added real-time 3D pipeline for external inputs (consoles, capture cards, webcams)
Uses Depth Anything v2 Small by default (swap models if GPU allows)
Stereo conversion powered by the VisionDepth3D method
End-to-end capture → depth → stereo loop is working
Early tests show playable 3D console and video feeds
Performance optimizations ongoing for fps, latency, and GPU acceleration

5) General Fixes & Stability

Rendering

Restored Clip-range UI — set start/end times for partial renders
Added Left-only / Right-only output modes (no post-split required)
Extra padding + edge reflection reduce stereo bleed-through
Optimized per-eye resize, aspect ratio, and DOF/color grading checks
Fixed floating-window scaling in single-eye renders

UI & Error Handling

Patched white-edge artifact from 16-bit normalization
Better error handling when models fail to load
Synced language packs with new controls (HDR toggle, depth blender, etc.)
All buttons and inputs styled with a new dark theme

Codec & Output

Fixed FFmpeg forcing slow presets on GPU codecs
NVENC now uses correct encoder flags (preset p5, rc vbr, cq)
CPU codecs retain CRF + preset for consistent quality

6) UI & Workflow Enhancements

Full 3D Generator tab UI overhaul for a cleaner look
Hotkeys to import video & depth maps directly into workflow
Save/load presets with one click
Reset button and quick navigation to docs, bug reports, and GitHub
Result: smoother daily workflow and better testing inside VD3D

Summary

v3.6 delivers depth blending refinements, true HDR10 preservation, and massive speed boosts through concurrent processing.
It restores clip-range flexibility, adds direct eye outputs, and debuts the first Live 3D pipeline, moving VisionDepth3D toward real-time stereo rendering.

How to Install

Go to the VisionDepth3D Releases page
Download the latest installer .exe and .bin parts
Place all files in the same folder
Run the .exe installer and follow the prompts
Launch VisionDepth3D from the Start Menu or Desktop shortcut

Download the VisionDepth3D Release Installer to simplify fetching the most recent releases.

For source installation and advanced setup see the Installation Guide.

VisionDepth3D Setup Downloader

2025-12-14T18:30:17Z

VisionDepth3D Setup Instructions

This tool is a Setup Downloader.
It downloads the official VisionDepth3D installer files and then launches the setup wizard.

How to Install VisionDepth3D

1. Open the VisionDepth3D Setup Downloader

Select the latest release from the list.
Click Install.

2. Download Setup Files

The setup files will download automatically.
When finished, the VisionDepth3D setup window (Inno Setup) will open.

⚠️ Do not re-run the Setup Downloader after this point.

3. Complete the Setup Wizard

Choose where you want VisionDepth3D installed.
Follow the on-screen steps until installation is complete.

4. Launch VisionDepth3D

If prompted, you may launch VisionDepth3D immediately after setup.
Otherwise, open it from the Start Menu or Desktop shortcut created during setup.

5. (Optional) Clean Up Installer Files

After installation, you may use the “Remove installer files” button in the Setup Downloader.
This deletes the downloaded setup files only.
This does NOT remove VisionDepth3D itself.

Important Notes

The Setup Downloader is NOT the VisionDepth3D application.
After installation, always launch VisionDepth3D from the Start Menu or Desktop, not by re-running the downloader.
If VisionDepth3D does not open when installed in Program Files, install it to Documents or another user folder to avoid Windows permission issues.
If needed, try right-click → Run as administrator when launching VisionDepth3D.

⚠️ Do NOT select the Setup Downloader entry in the Setup application's drop-down, where you choose the latest version.

VisionDepth3Dv3.5 - Release

2025-08-24T06:06:30Z

VisionDepth3D v3.5 Release

This update transforms VD3D into a cinematic 2D-to-3D studio. Depth of Field has been rebuilt for buttery smooth bokeh with motion-adaptive focus, the Audio Tool is now pro-level with codec, bitrate, and sync offset control, and GPU-accelerated color grading puts saturation, contrast, and brightness right inside the workflow. A new IPD stereo slider lets you dial in the perfect 3D strength for any screen or headset. Add in a streamlined ONNX pipeline, clip rendering tools, and polished multi-language support, and v3.5 delivers the most powerful, creator-focused VisionDepth3D yet.

1) Depth of Field (DOF) – Rewritten and Stabilized

Fully rewritten as a GPU-accelerated, multi-level Gaussian pipeline
Uses per-pixel interpolation between blur levels for smooth transitions
Added motion-adaptive focal tracking:
- Exponential moving average (EMA) for stable focus
- Deadband to ignore micro noise
- Max-step limiter to prevent sudden focus jumps
DOF now applies after stereo warp using warped per-eye depth
DOF slider maps directly to max blur; setting it to 0 disables DOF
Result: smoother bokeh, no ring artifacts, more natural focal transitions
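
The three focus-tracking pieces (EMA, deadband, max-step limiter) combine naturally in one small update rule. The class below is a sketch under assumed parameter names and default values, not the shipped implementation.

```python
class FocusTracker:
    """Smooth a per-frame focal depth estimate (hypothetical sketch)."""

    def __init__(self, alpha=0.2, deadband=0.01, max_step=0.05):
        self.alpha = alpha        # EMA weight given to the new measurement
        self.deadband = deadband  # changes smaller than this are ignored
        self.max_step = max_step  # largest focus change allowed per frame
        self.focus = None

    def update(self, measured):
        if self.focus is None:
            self.focus = measured  # first frame: adopt the measurement as-is
            return self.focus
        delta = measured - self.focus
        if abs(delta) < self.deadband:
            return self.focus      # micro noise: hold the current focus
        step = self.alpha * delta  # EMA step toward the measurement
        step = max(-self.max_step, min(self.max_step, step))
        self.focus += step
        return self.focus
```

With `alpha=0.5` and `max_step=0.05`, a jump from 0.5 to 1.0 moves the focus only to 0.55 on the next frame, so a cut or a fast subject does not snap the blur plane.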

2) Audio Tool – Revamp and Codec Control

Added progress bar for encoding and attaching audio
Codec and bitrate can now be selected before muxing
- aac (default) and libmp3lame supported
- Configurable bitrate (128k, 192k, 320k)
Added offset slider for real-time sync adjustment
Clear distinction between copy vs re-encode:
- If codec/bitrate unchanged → fast copy (-c copy)
- If changed → re-encode
Safe handling of long videos (2+ hours) with progress feedback
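
The copy versus re-encode decision can be expressed as an FFmpeg argument builder. This is a sketch: the function name and parameters are hypothetical, "unchanged" is modeled as passing neither a codec nor a bitrate, and the sync offset is applied with FFmpeg's `-itsoffset` input option, which may differ from how the Audio Tool does it.

```python
def build_audio_mux_args(video, audio, output, codec=None, bitrate=None, offset_s=0.0):
    """Build an FFmpeg command that attaches `audio` to `video`."""
    args = ["ffmpeg", "-y", "-i", video]
    if offset_s:
        # -itsoffset is an input option: it shifts the input that follows it.
        args += ["-itsoffset", str(offset_s)]
    args += ["-i", audio, "-map", "0:v:0", "-map", "1:a:0", "-c:v", "copy"]
    if codec is None and bitrate is None:
        args += ["-c:a", "copy"]           # nothing changed: fast stream copy
    else:
        args += ["-c:a", codec or "aac"]   # re-encode with the chosen codec
        if bitrate:
            args += ["-b:a", bitrate]
    args.append(output)
    return args
```

The returned list can be handed to `subprocess.run`; the video stream is copied in both branches, so only the audio is ever re-encoded.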

3) Color Grading – GPU Accelerated and Integrated

Introduced GPU-accelerated color grading (apply_color_grade) with:
- Saturation
- Contrast
- Brightness
Color grading now applies after stereo warp and DOF, before packing
Added Preview GUI sync:
- Sliders update live in preview with debounced re-rendering
- Two-way binding with main UI controls
Preset/save/load support extended to include color grading
Result: fine-tune the image directly inside VD3D without external tools

4) Stereo Separation (IPD Adjustment) – New 3D Control

Added Interpupillary Distance (IPD) adjustment slider
Works as a global scale factor on pixel shifts (foreground, midground, background)
Allows creators to:
- Increase IPD for stronger 3D on large screens or VR
- Reduce IPD for comfortable viewing on smaller displays
Fully integrated into:
- Preset system (save/load)
- Preview GUI with real-time feedback
- Tooltip and i18n system
Result: match stereo depth strength to your display environment

5) General Fixes and Stability

Fixed tensor size mismatch crash in DOF when depth/resolution mismatched
Preview GUI sliders now wire correctly to main GUI sliders
Minor UI consistency fixes across tools
Language files clean-up:
- Removed duplicate keys and aligned all translations with en.json
- Verified FR/DE/ES/JA language packs
- Added missing entries (Apply Entries, Start Batch Render, scene detection, etc.)

6) New Session Additions – ONNX Pipeline and UI Enhancements

ONNX Integration
- Converted Video Depth Anything (pth → onnx) for faster inference
- Optimized ONNX pipeline path for efficient runtime
UI Enhancements
- New start and end time controls inside Encoding Settings
- Render short clips or preview segments without running full videos
- Inputs section refactored into its own dedicated frame for clarity
Result: streamlined workflow for experimenting with models, and flexible render ranges for testing

Summary

v3.4 gave creators fine-grained depth and subject control.
v3.5 brings cinematic polish with stabilized DOF, a true audio tool with sync and codec options, a GPU color grading suite, and stereo separation (IPD adjustment) for display comfort.

How to Install

Go to the VisionDepth3D Releases page
Download the latest installer .exe and .bin parts
Place all files in the same folder
Run the .exe installer and follow the prompts
Launch VisionDepth3D from the Start Menu or Desktop shortcut

For source installation and advanced setup see the Installation Guide in the repository.

VisionDepth3D v3.3 - Release

2025-08-12T13:42:14Z

VisionDepth3D v3.3 — Stability, Accuracy & Artifact Reduction

This update is a major overhaul to both the Depth Estimation Pipeline and the 3D Rendering Pipeline, with a focus on stability, accuracy, and artifact reduction.

How to Install

Because this package is larger than 2 GB, the installer is split into multiple files.

Download VisionDepth3D_v3.3_WIN_x64_SETUP.exe and all accompanying .bin files from the release page
Place them together in the same folder
Run the .exe and follow the on-screen instructions

Depth Pipeline Updates

Black Bar Cropping for Depth Estimation

New ignore_letterbox_bars detects bars in the first non-empty frame
Crops top/bottom bars before sending frames to the depth model
Re-applies bars after processing with neutral depth values, preventing black regions from appearing closer or farther than the main scene
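
The detect, crop, and re-apply flow can be sketched on a frame stored as rows of luma values. The darkness threshold, the neutral value of 0.5, and both function names are assumptions here; `ignore_letterbox_bars` itself is not reproduced.

```python
def detect_bars(frame, threshold=16):
    """Return (top, bottom) counts of near-black rows in a 2D luma frame."""
    height = len(frame)

    def is_dark(row):
        return max(row) <= threshold

    top = 0
    while top < height and is_dark(frame[top]):
        top += 1
    bottom = 0
    while bottom < height - top and is_dark(frame[height - 1 - bottom]):
        bottom += 1
    if top + bottom >= height:
        return 0, 0  # fully dark or empty frame: treat as "no bars"
    return top, bottom


def reapply_bars(depth_rows, top, bottom, neutral=0.5):
    """Pad a cropped depth map back to full height with neutral depth rows."""
    width = len(depth_rows[0])
    top_rows = [[neutral] * width for _ in range(top)]
    bottom_rows = [[neutral] * width for _ in range(bottom)]
    return top_rows + depth_rows + bottom_rows
```

The depth model only sees `frame[top:len(frame) - bottom]`, and the bars come back as a mid-range value so they sit at a neutral depth rather than reading as very near or very far.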

Output Resolution Preservation

Depth maps resized back to the original cropped resolution before re-adding bars
Ensures final depth video matches original width/height

Safety Checks

If bars exceed frame height or the frame is empty, bars reset to zero to prevent OpenCV errors

Unified Depth-to-Grayscale Conversion

convert_depth_to_grayscale() now handles:
- PIL.Image
- torch.Tensor
- numpy.ndarray
Cleans NaN values and fixes shape inconsistencies
Centralized for consistent grayscale output

Sidecar Metadata for Bars

Saves .letterbox.json with top, bottom, and original_resolution next to the depth video
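
A sidecar like this is straightforward to write with the standard library. In the sketch below the three keys come from the notes, while the function name, the exact file naming (video stem plus `.letterbox.json`), and the value types are assumptions.

```python
import json
from pathlib import Path


def write_letterbox_sidecar(depth_video_path, top, bottom, original_resolution):
    """Write <video stem>.letterbox.json next to the depth video."""
    video = Path(depth_video_path)
    sidecar = video.with_name(video.stem + ".letterbox.json")
    payload = {
        "top": int(top),
        "bottom": int(bottom),
        # Stored as [width, height]; the ordering is an assumption.
        "original_resolution": list(original_resolution),
    }
    sidecar.write_text(json.dumps(payload, indent=2), encoding="utf-8")
    return sidecar
```

A later stage can read the file back with `json.loads` to know how many rows were bars, without detecting them again.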

3D Pipeline Updates

Single or Batch Processing

Process one video or queue multiple for 3D rendering

Stability & Robustness

Render loop wrapped in try/except/finally for guaranteed cleanup
Defensive init for ffmpeg_proc and out
Early exit if VideoWriter fails
Pause handling keeps frame index + ETA/FPS updated
Cancel paths work during processing and pause
Automatic codec fallback if FFmpeg encoder is invalid
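
The cleanup pattern behind these fixes looks roughly like the sketch below. The function and its callbacks (`open_writer`, `process`) are hypothetical stand-ins; only the try/except/finally shape, the defensive initialization, and the early exit mirror the notes.

```python
def render_clip(frames, open_writer, process):
    """Process frames and always close the writer, even on failure."""
    writer = None  # defensive init so `finally` can safely inspect it
    written = 0
    try:
        writer = open_writer()
        if writer is None:
            return written  # early exit: the writer could not be opened
        for frame in frames:
            writer.write(process(frame))
            written += 1
    except Exception as exc:
        print(f"render failed after {written} frames: {exc}")
    finally:
        if writer is not None:
            writer.close()  # runs on success, on error, and on early return
    return written
```

Because the handle is initialized before the `try`, the `finally` block never raises a `NameError` when opening the writer is the step that fails.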

Depth Map Processing

TemporalDepthFilter (EMA smoothing) reduces depth flicker
Percentile-based normalization for consistent depth range
Midtone shaping (gamma) improves depth layering
Optional curvature enhancement for roundness
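
Per-pixel EMA smoothing and gamma midtone shaping are both short operations. The sketch below works on flat lists of normalized depth values; `EmaDepthSmoother` is an illustrative stand-in, not the project's TemporalDepthFilter, and the default `alpha` and `gamma` are assumptions.

```python
class EmaDepthSmoother:
    """Blend each new depth frame with the running average, pixel by pixel."""

    def __init__(self, alpha=0.3):
        if not 0.0 < alpha <= 1.0:
            raise ValueError("alpha must be in (0, 1]")
        self.alpha = alpha
        self.state = None

    def update(self, depth):
        if self.state is None or len(self.state) != len(depth):
            self.state = list(depth)  # first frame or size change: reset
        else:
            a = self.alpha
            self.state = [a * d + (1 - a) * s for d, s in zip(depth, self.state)]
        return list(self.state)


def shape_midtones(depth, gamma=0.8):
    """Apply a gamma curve to depth values in 0..1; endpoints stay fixed."""
    return [d ** gamma for d in depth]
```

A lower `alpha` damps flicker more strongly at the cost of slower response to real depth changes.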

Stereo / Parallax Control

ShiftSmoother damps rapid disparity changes
Edge-aware masking + feathering reduces tearing
Dynamic IPD scaling adapts stereo strength
Subject-tracked zero parallax with window easing
Optional dynamic convergence bias
IPD factor knob for global stereo strength

Image Quality

GPU depth-of-field blur with Gaussian blending
Brightness-preserving sharpening with highlight protection

Framing, Aspect & Output Formats

Aspect-ratio safe resizing with pad_to_aspect_ratio
Two modes:
- Preserve Original Aspect Ratio
- Target Output Aspect
Formats:
- Full-SBS
- Half-SBS
- VR 1440×1600
- Dubois anaglyph
- Passive interlaced

Encoding & I/O

FFmpeg over stdin:
- CRF for libx*
- CQ for NVENC with -b:v 0
CPU/GPU encoder mapping, OpenCV fallback
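
Encoding over stdin means FFmpeg reads raw frames from a pipe while the quality flag depends on the encoder family. The builder below is a sketch: the function name, the `bgr24` input format (typical for OpenCV frames), and the `yuv420p` output format are assumptions, while the CRF and CQ with `-b:v 0` split follows the notes.

```python
def build_encode_args(width, height, fps, codec, quality, output):
    """Build an FFmpeg command that encodes raw frames read from stdin."""
    args = [
        "ffmpeg", "-y",
        "-f", "rawvideo", "-pix_fmt", "bgr24",
        "-s", f"{width}x{height}", "-r", str(fps),
        "-i", "-",  # "-" tells FFmpeg to read the input from stdin
        "-c:v", codec,
    ]
    if codec.startswith("libx"):
        args += ["-crf", str(quality)]              # CPU encoders: constant rate factor
    elif codec.endswith("_nvenc"):
        args += ["-cq", str(quality), "-b:v", "0"]  # NVENC: constant quality
    else:
        raise ValueError(f"no quality mapping for codec {codec!r}")
    args += ["-pix_fmt", "yuv420p", output]
    return args
```

The command can be started with `subprocess.Popen(args, stdin=subprocess.PIPE)`, after which each frame's raw bytes are written to `proc.stdin` and the pipe is closed at the end.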

UX / Telemetry

Smooth, real-time progress/FPS/ETA — also while paused
More descriptive logging

Download

📥 Download VisionDepth3D v3.3

VisionDepth3D is free for personal and non-commercial use.
Commercial use or redistribution without consent is prohibited.

What's Changed

V3.2.5 by @VisionDepth in #48
V3.2.5 by @VisionDepth in #49
V3.2.6 by @VisionDepth in #52
V3.3 by @VisionDepth in #53

Full Changelog: Release-v3.2.4...Release-v3.3

VisionDepth3Dv3.2.4 - Release

2025-06-04T02:07:05Z

VisionDepth3D – Hybrid 2D-to-3D Converter

Convert any 2D video into immersive stereoscopic 3D using AI-powered depth estimation, real-time preview, and fully customizable stereo controls — all built for creators, VR tinkerers, and 3D enthusiasts.

Whether you're producing content for VR headsets, YouTube 3D, or your own Blu-ray collection, VisionDepth3D delivers sharp, artifact-free 3D with GPU-accelerated tools and formats for every workflow.

Download the .exe and both .bin files, and make sure they are in the same folder before installing.

Powered by the VisionDepth3D Method

At the core of this app is the VisionDepth3D Method — a custom real-time rendering technique designed to produce smooth, eye-comfortable stereo from AI depth maps.

Core Features of the Method:

Dynamic parallax scaling
Zero-parallax tracking
Edge-aware masking
Scene-aware stereo dampening

Want to dive deeper into how it works?

📄 Read the Method → VisionDepth3D_Method.md

Features

AI Depth Estimation — Supports 20+ models (DPT, MiDaS, Depth Anything v2, etc.) with CUDA acceleration
Batch Processing — Smart VRAM-aware queueing for images & video
3D Output Modes — Half-SBS, Full-SBS, Interlaced, Anaglyph
Frame Interpolation — Smooth motion via ONNX RIFE (2×–8×)
Super Resolution — Real-ESRGAN upscaling (e.g., 1080p → 4K)
Parallax Tuning — Independent controls for foreground, midground, and background
Smart Mask Effects — Built-in feathering and ghost suppression
Audio Tools — Attach AAC, MP3, or WAV using the FFmpeg GUI
Live Feedback — Real-time FPS, ETA, pause/resume/cancel support
Preview Modes — Heatmap, SBS, Anaglyph, Interlaced
Export-Ready — Output for YouTube 3D, Oculus Quest, and MP4/MKV/AVI with GPU encoding

Free to Use. Built for Creators.

If you find it helpful, consider donating — every bit goes toward:

New hardware for testing
Supporting more depth models
Continued updates and features

Thanks for supporting open 3D tools!

VisionDepth3D v3.2.4 – Changelog

Note: Although the last official version was listed as 3.1.9, several intermediate patches were applied via GitHub and consolidated under version 3.2.4.

GUI Enhancements

Inference Steps Control

Introduced inference_steps_entry field to support user-defined inference steps for diffusion models.
Includes input validation and fallback handling.
Dynamically updates on <Return> and <FocusOut> events.

Resolution Dropdown Improvements

Expanded resolution options to include model-native sizes:
- 512x256, 704x384, 960x540, 1024x576, and others for improved performance and visual quality.
Automatically strips display hints like " (DC-Fastest)" for cleaner parsing of dimensions.
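
Stripping a display hint before parsing can be done with one regular expression that reads only the leading `WIDTHxHEIGHT` part. The function name is hypothetical and the pattern is an assumption about the label format, based on the examples above.

```python
import re

# Match "WIDTHxHEIGHT" at the start; anything after it (the hint) is ignored.
_RESOLUTION = re.compile(r"^\s*(\d+)\s*[xX]\s*(\d+)")


def parse_resolution(label):
    """Return (width, height) from a dropdown label such as '512x256 (DC-Fastest)'."""
    match = _RESOLUTION.match(label)
    if match is None:
        raise ValueError(f"not a resolution label: {label!r}")
    return int(match.group(1)), int(match.group(2))
```

Raising on a malformed label keeps a bad dropdown entry from silently turning into a wrong inference size.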

CPU Offload Mode Selection

Added support for multiple modes:
- "model", "vae", "unet", "sequential", "none"
The selected value is passed directly to the pipeline logic via offload_mode_dropdown.get().

Sidebar Layout

Sidebar width increased from 22 to 30 for improved component spacing and usability.

DepthCrafter Integration (Work-in-Progress)

Pipeline integration

load_depthcrafter_pipeline() now supports the following arguments:
- inference_steps
- offload_mode
Additional parameters are currently hardcoded and will be configurable in future updates.
Device mapping is handled dynamically based on offload_mode:
- "sequential" runs all operations on the GPU.
- Other modes selectively offload components to CPU to manage VRAM usage.

Stability Fixes and Improvements

Warm-up logic now includes spinner feedback to prevent GUI freeze during model loading.
All models, including local ones, now run reliably. However, local models still require manual configuration of inference resolution and batch size due to unresolved dynamic resolution handling.
invert_var toggle is now functioning correctly for .
Subject depth smoothing introduced in the 3D pipeline to reduce temporal jitter in estimated depth maps.
Focal depth consistency added for stereo rendering: subject depth is now shared across both eye views.

License Notice

VisionDepth3D is free for personal and non-commercial use only.
Commercial use, modification, or redistribution is not permitted without prior written consent.

Full license terms available in the GitHub repository.