Tags: jeromew/llama.cpp
Relax prefill parser to allow space. (ggml-org#21240)
* Relax prefill parser to allow space.
* Move changes from prefix() to parser generation
* Only allow spaces if we're not having a pure content parser next
chat : add Granite 4.0 chat template with correct tool_call role mapping (ggml-org#20804)
* chat : add Granite 4.0 chat template with correct tool_call role mapping

  Introduce `LLM_CHAT_TEMPLATE_GRANITE_4_0` alongside the existing Granite 3.x template (renamed `LLM_CHAT_TEMPLATE_GRANITE_3_X`).

  The Granite 4.0 Jinja template uses `<tool_call>` XML tags and maps the `assistant_tool_call` role to `<|start_of_role|>assistant<|end_of_role|><|tool_call|>`. Without a matching C++ handler, the fallback path emits the literal role `assistant_tool_call`, which the model does not recognize, breaking tool calling when `--jinja` is not used.

  Changes:
  - Rename `LLM_CHAT_TEMPLATE_GRANITE` to `LLM_CHAT_TEMPLATE_GRANITE_3_X` (preserves existing 3.x behavior unchanged)
  - Add `LLM_CHAT_TEMPLATE_GRANITE_4_0` enum, map entry, and handler
  - Detection: `<|start_of_role|>` + (`<tool_call>` or `<tools>`) → 4.0, otherwise → 3.x
  - Add production Granite 4.0 Jinja template
  - Add tests for both 3.x and 4.0 template paths (C++ and Jinja)

  Co-Authored-By: Claude Opus 4.6 <[email protected]>
* Code review: follow standard format and use common logic in test-chat-template.cpp
* Rename custom_conversation variable to extra_conversation to give it a more meaningful name
---------
Co-authored-by: Claude Opus 4.6 <[email protected]>
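
The detection rule and role mapping described in this entry can be sketched roughly as follows. This is a minimal illustration, not the code from the commit: the `granite_template` enum and the functions `detect_granite` and `granite_4_0_role_header` are hypothetical names, and only the marker strings and the 4.0-versus-3.x rule come from the commit message.

```cpp
#include <string>

// Hypothetical stand-in for the LLM_CHAT_TEMPLATE_GRANITE_3_X / _4_0 enum values.
enum class granite_template { none, v3_x, v4_0 };

static bool contains(const std::string & haystack, const std::string & needle) {
    return haystack.find(needle) != std::string::npos;
}

// Rule from the commit message:
// `<|start_of_role|>` + (`<tool_call>` or `<tools>`) -> 4.0, otherwise -> 3.x.
// Templates without `<|start_of_role|>` are treated as "not Granite" here (an assumption).
granite_template detect_granite(const std::string & tmpl) {
    if (!contains(tmpl, "<|start_of_role|>")) {
        return granite_template::none;
    }
    if (contains(tmpl, "<tool_call>") || contains(tmpl, "<tools>")) {
        return granite_template::v4_0;
    }
    return granite_template::v3_x;
}

// Role header for the 4.0 path: `assistant_tool_call` becomes the assistant
// role followed by `<|tool_call|>` instead of being emitted as a literal role.
std::string granite_4_0_role_header(const std::string & role) {
    if (role == "assistant_tool_call") {
        return "<|start_of_role|>assistant<|end_of_role|><|tool_call|>";
    }
    return "<|start_of_role|>" + role + "<|end_of_role|>";
}
```

Checking the more specific 4.0 markers before falling back to 3.x keeps existing 3.x templates on their original path, which matches the stated goal of leaving 3.x behavior unchanged.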
hexagon : add cumsum op support (ggml-org#21246)
* hexagon : add cumsum op support
* hexagon: enable dma for cumsum op
* Fix line-ending
---------
Co-authored-by: Max Krasnyansky <[email protected]>
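
For readers unfamiliar with the operation, a scalar reference for a cumulative sum might look like the sketch below. It assumes an inclusive running sum computed independently for each row of a row-major float matrix; the function name `cumsum_rows` is hypothetical, and nothing here reflects the Hexagon or DMA implementation in the commit.

```cpp
#include <cstddef>
#include <vector>

// Inclusive cumulative sum over each row of a row-major matrix.
// Assumes src.size() is a multiple of n_cols.
std::vector<float> cumsum_rows(const std::vector<float> & src, std::size_t n_cols) {
    std::vector<float> dst(src.size());
    for (std::size_t i = 0; i < src.size(); ++i) {
        // The running sum restarts at the first element of every row.
        const bool row_start = (n_cols == 0) || (i % n_cols == 0);
        dst[i] = row_start ? src[i] : dst[i - 1] + src[i];
    }
    return dst;
}
```

A scalar reference like this is mainly useful for comparing an accelerated backend's output against a known-good result.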
ggml-cpu: fix fallback for RVV kernels without zvfh (ggml-org#21157)
* ggml-cpu: refactor sgemm; fix rvv checks
* ggml-cpu: refactor rvv kernels; set zvfbfwma default to off