Tags: vp2177/llama.cpp
graph : fix graph reuse reset of params (ggml-org#14760) ggml-ci
parallel : add option for different RNG seeds (ggml-org#14757) ggml-ci
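A minimal sketch of the per-sequence seeding idea: each parallel client gets its own sampler seeded differently, so identical prompts do not produce identical completions. `llama_sampler_chain_init`, `llama_sampler_chain_default_params`, `llama_sampler_chain_add`, and `llama_sampler_init_dist` are real llama.cpp APIs; `base_seed`, `n_clients`, and the seed-derivation scheme are illustrative, not necessarily what the PR does.

```cpp
// Sketch: give each parallel client its own RNG seed so that identical
// prompts can still diverge. base_seed + client index is one simple
// derivation scheme; the actual PR may use a different one.
#include "llama.h"

#include <vector>

std::vector<llama_sampler *> make_samplers(uint32_t base_seed, int n_clients) {
    std::vector<llama_sampler *> samplers;
    for (int i = 0; i < n_clients; ++i) {
        llama_sampler * smpl = llama_sampler_chain_init(llama_sampler_chain_default_params());
        // distinct seed per client; offsetting the base seed keeps runs reproducible
        llama_sampler_chain_add(smpl, llama_sampler_init_dist(base_seed + (uint32_t) i));
        samplers.push_back(smpl);
    }
    return samplers;
}
```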
cuda : Fix Gemma3n not executed as CUDA_GRAPH on NVGPUs (ggml-org#14741)
* Fix Gemma3n not executed as CUDA_GRAPH on NVGPUs. Gemma3n uses a matrix-matrix addition as part of its input processing, wrongly triggering CUDA_GRAPH disablement on NVGPUs even when a batch size of 1 is used.
* Exclude `project_per_layer_input` by matching node names. This ensures that all other graphs, which don't exhibit this pattern, do not have their behavior changed.
* Revert unnecessary formatting changes
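The shape of the exclusion, sketched below: before the batch-size heuristic disables CUDA graph capture, the node name is checked for the known Gemma3n pattern. `ggml_get_name`, `GGML_OP_ADD`, and the `ne` dimensions are real ggml fields/accessors; the surrounding functions are an illustration of the pattern, not the exact upstream code.

```cpp
// Sketch: skip the CUDA-graph disable heuristic for Gemma3n's
// per-layer-input addition by matching the node name.
#include "ggml.h"

#include <cstring>

static bool is_gemma3n_per_layer_input(const ggml_tensor * node) {
    // the Gemma3n graph labels this addition "project_per_layer_input";
    // other graphs never carry that name, so their behavior is unchanged
    return strstr(ggml_get_name(node), "project_per_layer_input") != nullptr;
}

static bool node_disables_cuda_graph(const ggml_tensor * node) {
    // a matrix-matrix ADD normally indicates a batched graph, for which
    // CUDA graph capture is not worthwhile ...
    if (node->op == GGML_OP_ADD && node->ne[1] > 1) {
        // ... except for Gemma3n's input projection, which performs such an
        // ADD even at batch size 1
        return !is_gemma3n_per_layer_input(node);
    }
    return false;
}
```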
graph : avoid huge warm-up graphs for MoE models (ggml-org#14753)
* graph : avoid huge warm-up graphs for MoE models ggml-ci
* cont : bump max nodes to 8x model tensors
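A rough sketch of the node-budget idea: size the graph's node budget relative to the number of model tensors rather than one huge fixed constant, so MoE models with many expert tensors don't force enormous warm-up graphs. The 8x multiplier comes from the commit message; the floor value and function shape are illustrative.

```cpp
// Sketch: cap the compute-graph size relative to the model.
// 8x the tensor count per the commit; the floor of 1024 is illustrative.
#include <algorithm>
#include <cstdint>

static int32_t graph_max_nodes(int32_t n_model_tensors) {
    // small floor so tiny models still get a workable graph
    return std::max<int32_t>(1024, 8 * n_model_tensors);
}
```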
graph : refactor context to not pass gf explicitly (ggml-org#14629) ggml-ci
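The shape of this refactor, sketched: the graph becomes state reached through the build context instead of a `gf` parameter threaded through every build function. `ggml_add` and `ggml_build_forward_expand` are real ggml calls; the struct and method names are illustrative, not the actual llama.cpp declarations.

```cpp
// Sketch: the graph pointer is context state, not an explicit parameter.
#include "ggml.h"

struct llm_build_sketch {
    ggml_context * ctx = nullptr;
    ggml_cgraph  * gf  = nullptr; // set once per graph build

    // before the refactor this would have been:
    //   ggml_tensor * add_node(ggml_cgraph * gf, ggml_tensor * a, ggml_tensor * b);
    ggml_tensor * add_node(ggml_tensor * a, ggml_tensor * b) {
        ggml_tensor * cur = ggml_add(ctx, a, b);
        ggml_build_forward_expand(gf, cur); // graph reached via the member
        return cur;
    }
};
```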
graph : Pass the graph placeholder message in debug mode (ggml-org#14748) Without that condition, this debug log clutters the screen for every batch processed during prompt processing, or for every token generated in Kobold.cpp.
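The gist is a verbosity gate: the placeholder message is emitted only when debug logging is enabled, so it no longer prints once per batch or per token. A minimal sketch of that pattern, assuming a runtime verbosity level; the macro names mirror llama.cpp's internal logging style but are used illustratively, and the message text is illustrative too.

```cpp
// Sketch: gate a noisy per-batch message behind a debug verbosity level.
#include <cstdio>

#define LOG_LEVEL_DEBUG 4
static int g_log_verbosity = 2; // runtime-configurable in the real code

#define LOG_DEBUG(...) \
    do { if (g_log_verbosity >= LOG_LEVEL_DEBUG) fprintf(stderr, __VA_ARGS__); } while (0)

static void on_graph_placeholder(void) {
    // previously an unconditional print; now visible only in debug mode
    LOG_DEBUG("%s: using graph placeholder\n", __func__);
}
```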