Tags · justinsb/llama.cpp

b4735

CUDA: use async data loading for FlashAttention (ggml-org#11894)

* CUDA: use async data loading for FlashAttention

---------

Co-authored-by: Diego Devesa <[email protected]>

Feb 17, 2025
73e2ed3

b4734

update release requirements (ggml-org#11897)

Feb 17, 2025
f7b1116

b4733

server : fix divide-by-zero in metrics reporting (ggml-org#11915)

Feb 17, 2025
c4d29ba

b4732

vulkan: implement several ops relevant for ggml_opt (ggml-org#11769)

* vulkan: support memset_tensor

* vulkan: support GGML_OP_SUM

* vulkan: implement GGML_OP_ARGMAX

* vulkan: implement GGML_OP_SUB

* vulkan: implement GGML_OP_COUNT_EQUAL

* vulkan: implement GGML_OP_OPT_STEP_ADAMW

* vulkan: fix check_results RWKV_WKV6 crash and memory leaks

* vulkan: implement GGML_OP_REPEAT_BACK

* tests: remove invalid test-backend-ops REPEAT_BACK tests

* vulkan: fix COUNT_EQUAL memset using a fillBuffer command

Feb 17, 2025
2eea03d

b4731

server : bump httplib to 0.19.0 (ggml-org#11908)

Feb 16, 2025
0f2bbe6

b4730

common : Fix a typo in help (ggml-org#11899)

This patch fixes a typo in command help.
prefx -> prefix

Signed-off-by: Masanari Iida <[email protected]>

Feb 16, 2025
fe163d5

b4728

vulkan: support multi/vision rope, and noncontiguous rope (ggml-org#11902)

Feb 16, 2025
bf42a23

b4727

metal : fix the crash caused by the lack of residency set support on Intel Macs. (ggml-org#11904)

Feb 16, 2025
c2ea16f

b4724

metal : optimize dequant q6_K kernel (ggml-org#11892)

Feb 15, 2025
2288510

b4722

repo : update links to new url (ggml-org#11886)

* repo : update links to new url

ggml-ci

* cont : more urls

ggml-ci

Feb 15, 2025
68ff663
