
Tags: Dibakar/llama.cpp


b3337

Verified: created on GitHub.com and signed with GitHub’s verified signature.
Update llama-cli documentation (ggml-org#8315)

* Update README.md

* Update README.md

* Update README.md

Fixed llama-cli/main references and templates in some commands, added chat template sections, and fixed typos in some areas

* Update README.md

* Update README.md

* Update README.md

b3162

Verified: signed with the committer’s verified signature (ggerganov Georgi Gerganov).
cuda : fix bounds check for src0 rows in MMVQ kernel (whisper/2231)

* cuda : fix bounds check for src0 rows in MMVQ kernel

* Update ggml-cuda/mmvq.cu

Co-authored-by: Johannes Gäßler <[email protected]>

---------

Co-authored-by: Johannes Gäßler <[email protected]>

b3087

common : refactor cli arg parsing (ggml-org#7675)

* common : gpt_params_parse do not print usage

* common : rework usage print (wip)

* common : valign

* common : rework print_usage

* infill : remove cfg support

* common : reorder args

* server : deduplicate parameters

ggml-ci

* common : add missing header

ggml-ci

* common : remove --random-prompt usages

ggml-ci

* examples : migrate to gpt_params

ggml-ci

* batched-bench : migrate to gpt_params

* retrieval : migrate to gpt_params

* common : change defaults for escape and n_ctx

* common : remove chatml and instruct params

ggml-ci

* common : passkey use gpt_params

b3078

llama : offload to RPC in addition to other backends (ggml-org#7640)

* llama : offload to RPC in addition to other backends

* - fix copy_tensor being called on the src buffer instead of the dst buffer

- always initialize views in the view_src buffer

- add RPC backend to Makefile build

- add endpoint to all RPC object names

* add rpc-server to Makefile

* Update llama.cpp

Co-authored-by: slaren <[email protected]>

---------

Co-authored-by: slaren <[email protected]>

b2774

switch to using localizedDescription (ggml-org#7010)

b2755

Fix more int overflow during quant (PPL/CUDA). (ggml-org#6563)

* Fix more int overflow during quant.

* Fix some more int overflow in softmax.

* Revert back to int64_t.

b2716

[SYCL] Windows default build instructions without -DLLAMA_SYCL_F16 flag activated (ggml-org#6767)

* Fix FP32/FP16 build instructions

* Fix typo

* Recommended build instruction

Co-authored-by: Neo Zhang Jianyu <[email protected]>

* Recommended build instruction

Co-authored-by: Neo Zhang Jianyu <[email protected]>

* Recommended build instruction

Co-authored-by: Neo Zhang Jianyu <[email protected]>

* Add comments in Intel GPU linux

---------

Co-authored-by: Anas Ahouzi <[email protected]>
Co-authored-by: Neo Zhang Jianyu <[email protected]>

b2710

`build`: generate hex dump of server assets during build (ggml-org#6661)

* `build`: generate hex dumps of server assets on the fly

* build: workaround lack of -n on gnu xxd

* build: don't use xxd in cmake

* build: don't call xxd from build.zig

* build: more idiomatic hexing

* build: don't use xxd in Makefile (od hackery instead)

* build: avoid exceeding max cmd line limit in makefile hex dump

* build: hex dump assets at cmake build time (not config time)

b2699

ci: add ubuntu latest release and fix missing build number (mac & ubuntu) (ggml-org#6748)

b2674

llama : add missing kv clear in llama_beam_search (ggml-org#6664)