Tags: vp2177/llama.cpp

b6255

Verified

This commit was created on GitHub.com and signed with GitHub’s verified signature.
chat : fix debug build assertion in trim function (ggml-org#15520)

b5937

metal : fuse add, mul + add tests (ggml-org#14596)

ggml-ci

b5936

graph : fix graph reuse reset of params (ggml-org#14760)

ggml-ci

b5935

parallel : add option for different RNG seeds (ggml-org#14757)

ggml-ci

b5934

cuda : Fix Gemma3n not executed as CUDA_GRAPH on NVGPUs (ggml-org#14741)

* Fix Gemma3n not executed as CUDA_GRAPH on NVGPUs

Gemma3n uses matrix-matrix addition as part of its input processing,
which wrongly triggers CUDA_GRAPH disablement on NVGPUs even when a batch size
of 1 is used.

* Exclude `project_per_layer_input` by matching node names

This ensures that all other graphs which don't exhibit this pattern do
not have their behavior changed.

* Revert unnecessary formatting changes
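
The name-based exclusion described in this commit message can be illustrated with a minimal sketch. It assumes the heuristic treats any addition whose second operand has more than one row as a sign of batched input, and that the exclusion is a substring match on the node name. The `Node` struct and both function names are hypothetical stand-ins for illustration; they are not the ggml types or the actual patch.

```cpp
#include <string>
#include <vector>

// Hypothetical stand-in for a compute-graph node (NOT the ggml tensor struct).
struct Node {
    std::string name;      // node name assigned when the graph is built
    bool        is_add;    // true if the node is an element-wise addition
    long long   src1_rows; // rows of the second operand; 1 for single-token decode
};

// Assumption: the exclusion is a substring match on the node name.
bool is_excluded_by_name(const Node & node) {
    return node.name.find("project_per_layer_input") != std::string::npos;
}

// Returns false when an addition looks like batched (matrix-matrix) work,
// unless the node is explicitly excluded by name.
bool can_use_cuda_graph(const std::vector<Node> & graph) {
    for (const Node & node : graph) {
        if (node.is_add && node.src1_rows > 1 && !is_excluded_by_name(node)) {
            return false; // heuristic: treat as batch size > 1
        }
    }
    return true;
}
```

Under these assumptions, a graph containing only a matrix-matrix addition named `project_per_layer_input` stays eligible, while the same addition under any other name still disables graph capture, so graphs that lack this pattern behave as before.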

b5933

graph : avoid huge warm-up graphs for MoE models (ggml-org#14753)

* graph : avoid huge warm-up graphs for MoE models

ggml-ci

* cont : bump max nodes to 8x model tensors

b5932

model : fix build after merge conflict (ggml-org#14754)

b5930

CUDA: set_rows + cpy.cu refactor (ggml-org#14712)

b5929

graph : refactor context to not pass gf explicitly (ggml-org#14629)

ggml-ci

b5928

graph : Pass the graph placeholder message in debug mode (ggml-org#14748)

Without that condition, this debug log clutters the screen for every batch processed during prompt processing, or for every token generated in Kobold.cpp.