Tags · Dibakar/llama.cpp

b3337

Update llama-cli documentation (ggml-org#8315)

* Update README.md

* Update README.md

* Update README.md

Fixed llama-cli/main and templates on some commands, added chat template sections, and fixed typos in some areas.

* Update README.md

* Update README.md

* Update README.md

Jul 7, 2024
a8db2a9

b3162

cuda : fix bounds check for src0 rows in MMVQ kernel (whisper/2231)

* cuda : fix bounds check for src0 rows in MMVQ kernel

* Update ggml-cuda/mmvq.cu

Co-authored-by: Johannes Gäßler

---------

Co-authored-by: Johannes Gäßler

Jun 16, 2024
19b7a83

b3087

common : refactor cli arg parsing (ggml-org#7675)

* common : gpt_params_parse do not print usage

* common : rework usage print (wip)

* common : valign

* common : rework print_usage

* infill : remove cfg support

* common : reorder args

* server : deduplicate parameters

ggml-ci

* common : add missing header

ggml-ci

* common : remove --random-prompt usages

ggml-ci

* examples : migrate to gpt_params

ggml-ci

* batched-bench : migrate to gpt_params

* retrieval : migrate to gpt_params

* common : change defaults for escape and n_ctx

* common : remove chatml and instruct params

ggml-ci

* common : passkey use gpt_params

Jun 4, 2024
1442677

b3078

llama : offload to RPC in addition to other backends (ggml-org#7640)

* llama : offload to RPC in addition to other backends

* - fix copy_tensor being called on the src buffer instead of the dst buffer

- always initialize views in the view_src buffer

- add RPC backend to Makefile build

- add endpoint to all RPC object names

* add rpc-server to Makefile

* Update llama.cpp

Co-authored-by: slaren

---------

Co-authored-by: slaren

Jun 3, 2024
bde7cd3

b2774

switch to using localizedDescription (ggml-org#7010)

Apr 30, 2024
f364eb6

b2755

Fix more int overflow during quant (PPL/CUDA). (ggml-org#6563)

* Fix more int overflow during quant.

* Fix some more int overflow in softmax.

* Revert back to int64_t.

Apr 28, 2024
e00b4a8

b2716

[SYCL] Windows default build instructions without -DLLAMA_SYCL_F16 flag activated (ggml-org#6767)

* Fix FP32/FP16 build instructions

* Fix typo

* Recommended build instruction

Co-authored-by: Neo Zhang Jianyu

* Recommended build instruction

Co-authored-by: Neo Zhang Jianyu

* Recommended build instruction

Co-authored-by: Neo Zhang Jianyu

* Add comments in Intel GPU linux

---------

Co-authored-by: Anas Ahouzi
Co-authored-by: Neo Zhang Jianyu

Apr 23, 2024
4e96a81

b2710

`build`: generate hex dump of server assets during build (ggml-org#6661)

* `build`: generate hex dumps of server assets on the fly

* build: workaround lack of -n on gnu xxd

* build: don't use xxd in cmake

* build: don't call xxd from build.zig

* build: more idiomatic hexing

* build: don't use xxd in Makefile (od hackery instead)

* build: avoid exceeding max cmd line limit in makefile hex dump

* build: hex dump assets at cmake build time (not config time)

Apr 21, 2024
5cf5e7d

b2699

ci: add ubuntu latest release and fix missing build number (mac & ubuntu) (ggml-org#6748)

Apr 19, 2024
0e4802b

b2674

llama : add missing kv clear in llama_beam_search (ggml-org#6664)

Apr 14, 2024
1958f7e
