Tags · nyo16/llama.cpp

b8833

ggml-webgpu: fix compiler warnings and refactor FlashAttention encoding (ggml-org#21052)

* Update workflows to remove dependence on llvmpipe

* Try setting Dawn_DIR

* remove c++20 initializers

* Move to proper guid

* Try avoiding segfaults on vulkan backend process exit

* Remove compiler warnings on parameter casting

* Fix soft_max and update reg_tile accumulation to f32 for better precision

* Refactor flash_attn a bit

* remove c++20 initializers and format

* Increase div precision for NVIDIA

* revert div precision and comment out ggml-ci node for now

* Formatting

* Try debugging on a failing CI node

* Revert "Try debugging on a failing CI node"

This reverts commit 1971e33.

Apr 17, 2026
45cac7c
