
ggml-cpu: fix fallback for RVV kernels without zvfh#21157

Merged
ggerganov merged 2 commits into ggml-org:master from riseproject-dev:10x/riscv-fix on Apr 1, 2026
Conversation

@taimur-10x
Member

Summary

This PR fixes issues in the RISC-V zvfh path for the kernels in ggml/src/ggml-cpu/llamafile/sgemm.cpp and ggml/src/ggml-cpu/vec.h (pointed out in #21064).

Key Changes

  • Refactored llamafile_sgemm to fix inconsistent checks for zvfh.
  • The following kernels fell back to the GGML_SIMD path (which is undefined for fp16) when riscv_v_intrinsics is defined without zvfh. Refactored them so that the scalar implementation is called instead:
    • ggml_vec_scale_f16
    • ggml_vec_mad_f16
    • ggml_vec_dot_f16_unroll
  • Set GGML_RV_ZVFBFWMA to OFF by default in ggml/CMakeLists.txt.

@github-actions github-actions Bot added the ggml changes relating to the ggml tensor library for machine learning label Mar 29, 2026
@taimur-10x taimur-10x requested a review from xctan March 29, 2026 19:01
@ckastner
Collaborator

I've tested this and can confirm that it fixed #21064 for me. Thanks!

@ggerganov ggerganov merged commit 2b86e5c into ggml-org:master Apr 1, 2026
45 checks passed
slartibardfast pushed a commit to slartibardfast/llama.cpp that referenced this pull request Apr 12, 2026
* ggml-cpu: refactor sgemm; fix rvv checks

* ggml-cpu: refactor rvv kernels; set zvfbfwma default to off
Seunghhon pushed a commit to Seunghhon/llama.cpp that referenced this pull request Apr 26, 2026
* ggml-cpu: refactor sgemm; fix rvv checks

* ggml-cpu: refactor rvv kernels; set zvfbfwma default to off