
Tags: jeffmaury/llama.cpp


b3837


Verified

This commit was created on GitHub.com and signed with GitHub’s verified signature.
test-backend-ops : use flops for some performance tests (ggml-org#9657)

* test-backend-ops : use flops for some performance tests

- parallelize tensor quantization

- use a different set of cases for performance and correctness tests

- run each test for at least one second
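
The "at least one second" rule can be sketched as a repeat-until-elapsed loop that turns the run count into a throughput figure. This is an illustrative sketch only, not the actual test-backend-ops code; `measure_flops`, `perf_result` and `flops_per_run` are invented names.

```cpp
#include <cassert>
#include <chrono>
#include <cstdint>

struct perf_result {
    uint64_t runs;   // how many times the op executed
    double   gflops; // throughput derived from the known FLOP count per run
};

// Repeat op() until at least one second has elapsed, then report GFLOPS.
template <typename Op>
perf_result measure_flops(Op op, uint64_t flops_per_run) {
    using clock = std::chrono::steady_clock;
    const auto start = clock::now();
    uint64_t runs    = 0;
    double   elapsed = 0.0;
    do {
        op();
        ++runs;
        elapsed = std::chrono::duration<double>(clock::now() - start).count();
    } while (elapsed < 1.0); // run each test for at least one second
    return { runs, (double) (runs * flops_per_run) / elapsed / 1e9 };
}
```

Timing the whole loop rather than a single run keeps short ops measurable and amortizes clock overhead.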

b3835

vocab : refactor tokenizer to reduce init overhead (ggml-org#9449)

* refactor tokenizer

* llama : make llm_tokenizer more private

ggml-ci

* refactor tokenizer

* refactor tokenizer

* llama : make llm_tokenizer more private

ggml-ci

* remove unused files

* remove unused fields to avoid an unused-field build error

* avoid symbol linking error

* Update src/llama.cpp

* Update src/llama.cpp

---------

Co-authored-by: Georgi Gerganov <[email protected]>
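
The init-overhead reduction amounts to doing one-time setup in the tokenizer's constructor and keeping that state private, so per-call tokenization only does lookups. A minimal sketch of that shape, with invented names rather than the actual `llm_tokenizer` API:

```cpp
#include <cassert>
#include <string>
#include <unordered_map>
#include <vector>

class tokenizer {
public:
    // One-time init: build the vocab lookup table once, in the constructor,
    // instead of rebuilding it on every tokenize() call.
    explicit tokenizer(const std::vector<std::string> & vocab) {
        for (size_t i = 0; i < vocab.size(); ++i) {
            token_to_id[vocab[i]] = (int) i;
        }
    }

    // Greedy longest-match tokenization over the prebuilt table.
    std::vector<int> tokenize(const std::string & text) const {
        std::vector<int> out;
        size_t pos = 0;
        while (pos < text.size()) {
            size_t len = text.size() - pos;
            for (; len > 0; --len) {
                auto it = token_to_id.find(text.substr(pos, len));
                if (it != token_to_id.end()) {
                    out.push_back(it->second);
                    break;
                }
            }
            if (len == 0) { ++pos; continue; } // skip bytes with no match
            pos += len;
        }
        return out;
    }

private:
    std::unordered_map<std::string, int> token_to_id; // private, built once
};
```

Keeping `token_to_id` private mirrors the "make llm_tokenizer more private" bullets: callers can only tokenize, not poke at init state.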

b3834

llama : add support for Chameleon (ggml-org#8543)

* convert chameleon hf to gguf

* add chameleon tokenizer tests

* fix lint

* implement chameleon graph

* add swin norm param

* return qk norm weights and biases to original format

* implement swin norm

* suppress image token output

* rem tabs

* add comment to conversion

* fix ci

* check for k norm separately

* adapt to new lora implementation

* fix layer input for swin norm

* move swin_norm in gguf writer

* add comment regarding special token regex in chameleon pre-tokenizer

* Update src/llama.cpp

Co-authored-by: compilade <[email protected]>

* fix punctuation regex in chameleon pre-tokenizer (@compilade)

Co-authored-by: compilade <[email protected]>

* fix lint

* trigger ci

---------

Co-authored-by: compilade <[email protected]>
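
The "swin norm" bullets refer to where layer norm sits relative to the residual: standard pre-norm computes x + f(norm(x)), while the Swin-style variant normalizes the sub-block's output, x + norm(f(x)). A hedged sketch of the two orderings, with f() and plain float vectors standing in for the actual attention/FFN graph code:

```cpp
#include <cassert>
#include <cmath>
#include <vector>

using vec = std::vector<float>;

// Plain layer norm (no learned scale/shift, for brevity).
static vec layer_norm(const vec & x, float eps = 1e-5f) {
    float mean = 0.f, var = 0.f;
    for (float v : x) mean += v;
    mean /= (float) x.size();
    for (float v : x) var += (v - mean) * (v - mean);
    var /= (float) x.size();
    vec y(x.size());
    for (size_t i = 0; i < x.size(); ++i) {
        y[i] = (x[i] - mean) / std::sqrt(var + eps);
    }
    return y;
}

// Standard pre-norm residual block: x + f(norm(x))
template <typename F>
vec pre_norm_block(const vec & x, F f) {
    vec y = f(layer_norm(x));
    for (size_t i = 0; i < x.size(); ++i) y[i] += x[i];
    return y;
}

// Swin-norm residual block: x + norm(f(x))
template <typename F>
vec swin_norm_block(const vec & x, F f) {
    vec y = layer_norm(f(x));
    for (size_t i = 0; i < x.size(); ++i) y[i] += x[i];
    return y;
}
```

The `swin_norm` flag carried in the GGUF writer would then select between these two orderings when building each layer.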

b3832

ggml : add run-time detection of neon, i8mm and sve (ggml-org#9331)

* ggml: Added run-time detection of neon, i8mm and sve

Adds run-time detection of the Arm instruction set features
neon, i8mm and sve for Linux and Apple build targets.

* ggml: Extend feature detection to include non-aarch64 Arm architectures

* ggml: Move definition of ggml_arm_arch_features to the global data section
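
Run-time detection of this kind typically reads `getauxval(AT_HWCAP)` on Linux and `sysctlbyname()` on Apple platforms. A self-contained sketch under those assumptions (not the actual ggml code; non-Arm builds simply report everything unsupported):

```cpp
#include <cassert>
#include <cstddef>

#if defined(__linux__) && defined(__aarch64__)
#include <sys/auxv.h>   // getauxval
#include <asm/hwcap.h>  // HWCAP_* bits
#elif defined(__APPLE__) && defined(__aarch64__)
#include <sys/sysctl.h> // sysctlbyname
#endif

struct arm_features {
    bool neon = false;
    bool i8mm = false;
    bool sve  = false;
};

// Query the running CPU instead of relying on compile-time macros alone.
static arm_features detect_arm_features() {
    arm_features f;
#if defined(__linux__) && defined(__aarch64__)
    const unsigned long hwcap = getauxval(AT_HWCAP);
#ifdef HWCAP_ASIMD
    f.neon = (hwcap & HWCAP_ASIMD) != 0;
#endif
#ifdef HWCAP_SVE
    f.sve = (hwcap & HWCAP_SVE) != 0;
#endif
#if defined(AT_HWCAP2) && defined(HWCAP2_I8MM)
    f.i8mm = (getauxval(AT_HWCAP2) & HWCAP2_I8MM) != 0;
#endif
#elif defined(__APPLE__) && defined(__aarch64__)
    f.neon = true; // AdvSIMD is mandatory on Apple silicon
    int v = 0;
    size_t sz = sizeof(v);
    if (sysctlbyname("hw.optional.arm.FEAT_I8MM", &v, &sz, nullptr, 0) == 0) {
        f.i8mm = v != 0;
    }
    // Apple silicon does not expose SVE
#endif
    return f;
}
```

Moving the detected-feature struct into global data (the third bullet) lets it be filled once at init and read cheaply from the hot kernels.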

b3831

Enable use of the rebar feature to upload buffers to the device. (ggml-org#9251)

b3829

cmake : add option for common library (ggml-org#9661)

b3828

[SYCL] add missed dll file in package (ggml-org#9577)

* update oneapi to 2024.2

* use 2024.1

---------

Co-authored-by: arthw <[email protected]>

b3827

mtgpu: enable VMM (ggml-org#9597)

Signed-off-by: Xiaodong Ye <[email protected]>

b3825

ggml : remove assert for AArch64 GEMV and GEMM Q4 kernels (ggml-org#9217)

* ggml : remove assert for AArch64 GEMV and GEMM Q4 kernels

* added fallback mechanism when the offline re-quantized model is not optimized for the underlying target

* fix for build errors

* remove prints from the low-level code

* Rebase to the latest upstream
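
The fallback bullet replaces a hard assert with kernel selection: use the optimized path only when the repacked weight layout matches what the running CPU supports, otherwise take the generic path. An illustrative sketch with invented names, not the actual ggml dispatch code:

```cpp
#include <cassert>

// Signature shared by the optimized and generic GEMV paths.
typedef void (*gemv_fn)(const void * weights, const float * x, float * y, int n);

static void gemv_generic(const void *, const float *, float *, int) {
    // portable reference path (body elided)
}

static void gemv_aarch64_q4(const void *, const float *, float *, int) {
    // AArch64-optimized path for repacked Q4 weights (body elided)
}

// Previously this situation hit an assert; now it degrades gracefully.
static gemv_fn select_gemv(bool weights_repacked, bool cpu_supports_layout) {
    if (weights_repacked && cpu_supports_layout) {
        return gemv_aarch64_q4; // fast path
    }
    return gemv_generic;        // fallback instead of a hard assert
}
```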

b3824

server : add more env vars, improve gen-docs (ggml-org#9635)

* server : add more env vars, improve gen-docs

* update server docs

* LLAMA_ARG_NO_CONTEXT_SHIFT
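
These env vars mirror the server's command-line arguments. A hedged usage sketch: LLAMA_ARG_NO_CONTEXT_SHIFT comes from the commit above, while the model path and the companion LLAMA_ARG_MODEL variable are assumptions for illustration.

```shell
# Configure llama-server via LLAMA_ARG_* environment variables
# instead of CLI flags (equivalent to --model / --no-context-shift).
LLAMA_ARG_MODEL=./models/model.gguf \
LLAMA_ARG_NO_CONTEXT_SHIFT=1 \
./llama-server
```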