
Tags: nyo16/llama.cpp


b8833


Verified: this commit was created on GitHub.com and signed with GitHub's verified signature.
ggml-webgpu: fix compiler warnings and refactor FlashAttention encoding (ggml-org#21052)

* Update workflows to remove dependence on llvmpipe

* Try setting Dawn_DIR

* remove c++20 initializers

* Move to proper GUID

* Try avoiding segfaults on vulkan backend process exit

* Remove compiler warnings on parameter casting

* Fix soft_max and update reg_tile accumulation to f32 for better precision

* Refactor flash_attn a bit

* remove c++20 initializers and format

* Increase div precision for NVIDIA

* revert div precision and comment out ggml-ci node for now

* Formatting

* Try debugging on a failing CI node

* Revert "Try debugging on a failing CI node"

This reverts commit 1971e33.