Tags: NVIDIA/warp
v1.12.1 Highlights:
- Fix kernel dispatch using incorrect block_dim across devices, causing crashes or memory corruption in tile kernels
- Fix silent precision loss in compile-time constants passed to 64-bit scalar constructors (wp.float64(), wp.int64(), wp.uint64())
- Fix wp.HashGrid neighbor queries missing results for negative coordinates
- Fix augmented assignments with subscript/attribute targets double-evaluating the target expression (e.g., s.field += expr, arr[i] *= expr)
- Fix wp.tile_matmul() and wp.tile_fft() ignoring module-level enable_backward
- Fix @wp.func with tile parameters failing to compile with shared-memory tiles
- Fix struct field assignments converting Warp scalar types to plain Python types
See the full changelog for more details: https://github.com/NVIDIA/warp/releases/tag/v1.12.1
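To make the augmented-assignment fix in v1.12.1 concrete, here is a plain-Python sketch of what "double-evaluating the target expression" means. It is not Warp code and does not reproduce Warp's code generator. The `make_index_source` helper is a hypothetical name used only for illustration. The sketch contrasts Python's own semantics, where the subscript is evaluated once, with a naive lowering of `arr[i] *= expr` into `arr[i] = arr[i] * expr`, where a side-effecting index expression runs twice.

```python
import itertools


def make_index_source():
    """Hypothetical index expression with a side effect: returns 0, 1, 2, ..."""
    counter = itertools.count()
    return lambda: next(counter)


# Augmented assignment: the subscript expression is evaluated exactly once.
arr = [3, 5, 7]
next_index = make_index_source()
arr[next_index()] *= 2
single_eval = list(arr)  # element 0 is doubled in place

# Naive lowering to "target = target * expr": the subscript expression runs
# twice, so the value is read from index 0 but written to index 1.
arr = [3, 5, 7]
next_index = make_index_source()
arr[next_index()] = arr[next_index()] * 2
double_eval = list(arr)

print(single_eval)  # [6, 5, 7]
print(double_eval)  # [3, 6, 7]
```

When the target expression has no side effects the two forms give the same result, which is why this class of bug can go unnoticed.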
v1.12.0 Highlights:
- Add experimental hardware-accelerated texture sampling on CUDA GPUs with wp.Texture1D/2D/3D and wp.texture_sample()
- Add subscript-style type hints (e.g., wp.array[float]) for better Pyright/Pylance compatibility
- Add tile arithmetic operators (*, /) with broadcast, differentiable FFT, and wp.tile_from_thread()
- Add jax.vmap() support for Warp kernels and callables via jax_kernel() and jax_callable()
- Add quaternion/spatial helpers, approximate math intrinsics, and wp.print_diagnostics()
- Add B-spline shape functions to warp.fem
- Allow NVRTC compilation without a CUDA driver for ahead-of-time compilation in Docker builds
See the full changelog for more details: https://github.com/NVIDIA/warp/releases/tag/v1.12.0
v1.11.1 Highlights:
- Fix wp.tile_matmul() sometimes producing NaN results when using the `c = wp.tile_matmul(a, b)` form due to reading uninitialized output memory
- Fix wp.static() incorrectly resolving loop variables to same-named global Python variables when used for static loop unrolling in kernels
- Fix segfault in conditional expressions (ternary if/else) when one branch accesses an array element and the other branch is taken
- Fix CUDA graphs with multiple temporary allocations using more memory than necessary due to improper sequencing of memory free operations
- Fix @wp.func decorated functions showing generic types in Pyright/Pylance instead of their actual signatures on Python 3.10+
See the full changelog for more details: https://github.com/NVIDIA/warp/releases/tag/v1.11.1
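The conditional-expression fix in v1.11.1 concerns an array access in the branch that is not taken. The plain-Python sketch below shows why the untaken branch of a ternary must not be evaluated. It is not Warp code, and the release notes do not state that this was the exact mechanism of the Warp bug. `eager_select` is a hypothetical helper that stands in for any lowering that evaluates both operands before choosing one.

```python
def eager_select(cond, if_true, if_false):
    """Hypothetical helper: the caller has already evaluated both operands."""
    return if_true if cond else if_false


data = [10, 20, 30]
i = 5  # deliberately out of range

# Conditional expression: only the taken branch is evaluated, so data[i]
# is never read when the bounds check fails.
lazy_result = data[i] if i < len(data) else -1

# Eager evaluation: data[i] is read before the selection happens, so the
# out-of-range access occurs even though its value would be discarded.
try:
    eager_result = eager_select(i < len(data), data[i], -1)
except IndexError:
    eager_result = "IndexError"

print(lazy_result)   # -1
print(eager_result)  # IndexError
```

In Python the bad access raises an exception. In compiled code the same pattern can read invalid memory instead.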
v1.11.0 Highlights:
- Add group-aware construction and queries for wp.Bvh and wp.Mesh to support multi-environment workloads
- Add wp.grad() to evaluate function gradients inline during the forward pass
- Add options to reduce JIT compilation time with precompiled headers, optimization level control, and parallel module compilation
- Extend wp.tile_map() to support n-ary operations (up to 8 arguments) and add wp.tile_randf()/wp.tile_randi() for random tile generation
- Add unpack operator (*) support in kernels for vectors, matrices, quaternions, and array slices
See the full changelog for more details: https://github.com/NVIDIA/warp/releases/tag/v1.11.0
v1.10.1 Highlights:
- Fix module="unique" kernels to properly reuse existing module objects, avoiding unnecessary overhead (especially noticeable on macOS)
- Fix kernel-local arrays (wp.zeros() in kernels): .ptr access, indexing, and shape parameter handling
- Fix code generation ordering for custom gradient functions (@wp.func_grad) when used with nested function calls
- Fix loops containing wp.static() expressions to unroll correctly regardless of max_unroll settings
- Fix reference cycles in wp.fem.Temporary and wp.fem.ShapeBasisSpace
See the full changelog for more details: https://github.com/NVIDIA/warp/releases/tag/v1.10.1
v1.10.0 Highlights:
- Add experimental JAX automatic differentiation support with jax_kernel(enable_backward=True)
- Add in-place wp.Bvh.rebuild() with CUDA graph support for allocation-free BVH updates
- Improve built-in function call performance from Python by up to 70× through caching
- Add tile programming enhancements: axis-specific reductions, component indexing, wp.tile_full()
- Remove warp.sim module (superseded by Newton library)
See the full changelog for more details: https://github.com/NVIDIA/warp/releases/tag/v1.10.0
v1.9.1 Highlights:
- Fix crash when using radix sort on multiple streams
- Fix memory management issues with shared tiles (double frees, leaks)
- Restore support for older GPU architectures (Maxwell, Pascal, Volta) when building with CUDA 12
- Fix TypeError with tuple type hints on Python 3.9/3.10
- Fix empty slice operations arr[i:i] that caused indexing errors
See the full changelog for more details: https://github.com/NVIDIA/warp/releases/tag/v1.9.1
v1.9.0 Highlights:
- wp.MarchingCubes rewrite in pure Warp, supporting CPU and GPU devices and differentiability
- wp.compile_aot_module() and wp.load_aot_module() to support basic ahead-of-time workflows
- More flexible indexing support for wp.matrix()/wp.vector()/wp.quaternion() types
- Support for IntEnum and IntFlag inside Warp kernels
- Add indexed tile operations: wp.tile_index_load(), wp.tile_index_store(), and wp.tile_index_atomic_add()
See the full changelog for more details: https://github.com/NVIDIA/warp/releases/tag/v1.9.0
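For readers unfamiliar with the enum types named in the v1.9.0 highlights, the sketch below shows the standard-library `IntEnum` and `IntFlag` semantics in plain Python: members behave as integers, and flags combine with bitwise operators. It does not use Warp and makes no claim about how Warp handles these types inside kernels. The class names `BodyType` and `ContactFlags` are hypothetical and chosen only for illustration.

```python
from enum import IntEnum, IntFlag


class BodyType(IntEnum):
    """Hypothetical enumeration; members compare equal to their int values."""
    STATIC = 0
    DYNAMIC = 1


class ContactFlags(IntFlag):
    """Hypothetical bit flags; members combine with | and test with &."""
    NONE = 0
    ENABLED = 1
    SELF_COLLIDE = 2


flags = ContactFlags.ENABLED | ContactFlags.SELF_COLLIDE

is_dynamic = BodyType.DYNAMIC == 1                     # True
flags_value = int(flags)                               # 3
self_collide = bool(flags & ContactFlags.SELF_COLLIDE)  # True
```

Because both types are integer subclasses, their members can be used wherever an integer constant is expected.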
v1.8.1 Highlights:
- Deprecate the graph_compatible boolean flag in jax_callable() in favor of the new graph_mode argument with GraphMode enum
- Support input-output aliasing in JAX FFI
- Support capturing jax_callable() using Warp via the new graph_mode parameter
- Fix missing cloth-body contact in wp.sim.VBDIntegrator with handle_self_contact=False
- Fix compile time regression for kernels using matmul, Cholesky, and FFT solvers
See the full changelog for more details: https://github.com/NVIDIA/warp/releases/tag/v1.8.1