Tags: PMZFX/llama.cpp-sycl

b8679

Verified

This commit was created on GitHub.com and signed with GitHub’s verified signature.
llama-bench: add `-fitc` and `-fitt` to arguments (ggml-org#21304)

* llama-bench: add `-fitc` and `-fitt` to arguments

* update README.md

* address review comments

* update compare-llama-bench.py

b8678

vocab : add byte token handling to BPE detokenizer for Gemma4 (ggml-org#21488)

b8676

server : handle unsuccessful sink.write in chunked stream provider (ggml-org#21478)

Check the return value of sink.write() in the chunked content provider
and return false when the write fails, matching cpp-httplib's own
streaming contract. This prevents logging chunks as sent when the sink
rejected them and properly aborts the stream on connection failure.
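The contract described above can be sketched as follows. This is a minimal illustration, not the actual server code: `DataSink` is a stand-in for cpp-httplib's `httplib::DataSink`, and `send_chunks` is a hypothetical helper showing the pattern of propagating write failures.

```cpp
// Minimal sketch of the streaming contract described above: a chunked
// content provider must stop and report failure as soon as the sink
// rejects a write, rather than counting the chunk as sent.
#include <cassert>
#include <functional>
#include <string>
#include <vector>

// Stand-in for cpp-httplib's httplib::DataSink.
struct DataSink {
    std::function<bool(const char *, size_t)> write; // false = connection failed
};

// Hypothetical helper: returns true only if every chunk was accepted.
static bool send_chunks(const std::vector<std::string> & chunks, DataSink & sink) {
    for (const auto & chunk : chunks) {
        if (!sink.write(chunk.data(), chunk.size())) {
            return false; // abort the stream on write failure
        }
        // only after a successful write may the chunk be logged as sent
    }
    return true;
}
```

Returning `false` from the provider tells cpp-httplib to tear down the stream, which is exactly the behavior the fix restores.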

b8672

hexagon: slight optimization for argsort output init (ggml-org#21463)

b8671

llama : correct platform-independent loading of BOOL metadata (ggml-org#21428)

* model-loader : fix GGUF bool array conversion

* model-loader : fix remaining GGUF bool pointer uses
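The kind of fix described above can be sketched as follows. This is a hedged illustration, not the actual model-loader code: it assumes GGUF stores BOOL metadata as single bytes, and `gguf_read_bool` is a hypothetical helper showing why reading the raw byte beats reinterpreting the storage as `bool`, whose in-memory representation is not guaranteed across platforms.

```cpp
// Hypothetical sketch: load a GGUF BOOL value portably by copying the
// raw storage byte and comparing it against zero, instead of
// dereferencing the storage through a bool pointer.
#include <cassert>
#include <cstdint>
#include <cstring>

static bool gguf_read_bool(const void * data) {
    int8_t v;
    std::memcpy(&v, data, sizeof v); // copy the raw byte out of the buffer
    return v != 0;                   // any non-zero byte counts as true
}
```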

b8670

model : add HunyuanOCR support (ggml-org#21395)

* HunyuanOCR: add support for text and vision models

- Add HunyuanOCR vision projector (perceiver-based) with Conv2d merge
- Add separate HUNYUAN_OCR chat template (content-before-role format)
- Handle HunyuanOCR's invalid pad_token_id=-1 in converter
- Fix EOS/EOT token IDs from generation_config.json
- Support xdrope RoPE scaling type
- Add tensor mappings for perceiver projector (mm.before_rms, mm.after_rms, etc.)
- Register HunYuanVLForConditionalGeneration for both text and mmproj conversion

* fix proper mapping

* Update gguf-py/gguf/tensor_mapping.py

Co-authored-by: Xuan-Son Nguyen <[email protected]>

* Update tools/mtmd/clip.cpp

Co-authored-by: Xuan-Son Nguyen <[email protected]>

* address comments

* update

* Fix typecheck

* Update convert_hf_to_gguf.py

Co-authored-by: Sigbjørn Skjæret <[email protected]>

* Update convert_hf_to_gguf.py

Co-authored-by: Sigbjørn Skjæret <[email protected]>

* Update convert_hf_to_gguf.py

Co-authored-by: Sigbjørn Skjæret <[email protected]>

* Update convert_hf_to_gguf.py

Co-authored-by: Sigbjørn Skjæret <[email protected]>

---------

Co-authored-by: Xuan-Son Nguyen <[email protected]>
Co-authored-by: Sigbjørn Skjæret <[email protected]>

b8668

server : fix logging of build + system info (ggml-org#21460)

This PR changes the logging that occurs at startup of llama-server. Currently it is redundant (CPU information is logged twice) and missing the build + commit info.

b8667

ci: lower cuda12 floor to 12.8.1 for broader host compatibility (ggml-org#21438)

Co-authored-by: M1DNYT3 <[email protected]>

b8665

common : add gemma 4 specialized parser (ggml-org#21418)

* common : add gemma4 dedicated parser

* cont : add '<|tool_response>' as eog

* cont : emit JSON from Gemma4 tool call AST

* cont : more fixes

* cont : refactor convert function

* cont : refine rules and mapping

* cont : add more tests

* cont : clean up

* cont : remove autoparser gemma4 implementation

* cont : more cleanup

* cont : rename gemma4.jinja to match the others

* cont : add custom template to support interleaved thinking

* cont : preserve reasoning in model turns

* cont : fix initializer error

* cont : fix unused vars

* cont : fix accidental static

* cont : fix specialized_template signature

* fix extra semicolon

* remove debug line and extra space [no ci]

b8664

server: Fix undefined timing measurement errors in server context (ggml-org#21201)

Co-authored-by: Dan Hoffman <[email protected]>