Tags · tmc/llama.cpp

b1627

english : use `typos` to fix comments and logs (ggml-org#4354)

Dec 12, 2023
9494d7c
zip
tar.gz

b1626

build : target Windows 8 for standard mingw-w64 (ggml-org#4405)

* build : target Windows 8 for standard mingw-w64

* make : fix missing console.o deps

This was causing a link error with `make all` on Windows.

Dec 12, 2023
6138963
zip
tar.gz

b1625

llama : document logits_all deprecation (ggml-org#4418)

llama_context_params.logits_all is a parameter for controlling
llama_eval. This documents that logits_all should not be used with
llama_decode and llama_batch.

Dec 12, 2023
6391817
zip
tar.gz

b1624

server : fix local model name in server (ggml-org#4420)

Dec 12, 2023
d9d4cfe
zip
tar.gz

b1623

ggml : increased GGML_MAX_PARAMS to allow finetuning of 70b models (ggml-org#4424)

Dec 12, 2023
41a11aa
zip
tar.gz

b1621

grammar : revert the replacement of llama_token_to_piece with id_to_token (ggml-org#4396)

Dec 9, 2023
e18f734
zip
tar.gz

b1620

sync : ggml (new ops, tests, backend, etc.) (ggml-org#4359)

* sync : ggml (part 1)

* sync : ggml (part 2, CUDA)

* sync : ggml (part 3, Metal)

* ggml : build fixes

ggml-ci

* cuda : restore lost changes

* cuda : restore lost changes (StableLM rope)

* cmake : enable separable compilation for CUDA

ggml-ci

* ggml-cuda : remove device side dequantize

* Revert "cmake : enable separable compilation for CUDA"

This reverts commit 09e35d0.

* cuda : remove assert for rope

* tests : add test-backend-ops

* ggml : fix bug in ggml_concat

* ggml : restore `ggml_get_n_tasks()` logic in `ggml_graph_plan()`

* ci : try to fix macOS

* ggml-backend : remove backend self-registration

* ci : disable Metal for macOS cmake build

ggml-ci

* metal : fix "supports family" call

* metal : fix assert

* metal : print resource path

ggml-ci

---------

Co-authored-by: slaren

Dec 7, 2023
fe680e3
zip
tar.gz

b1619

llama : per-layer KV cache + quantum K cache (ggml-org#4309)

* per-layer KV

* remove unnecessary copies

* less code duplication, offload k and v separately

* llama : offload KV cache per-layer

* llama : offload K shift tensors

* llama : offload for rest of the model arches

* llama : enable offload debug temporarily

* llama : keep the KV related layers on the device

* llama : remove mirrors, perform Device -> Host when partial offload

* common : add command-line arg to disable KV cache offloading

* llama : update session save/load

* llama : support quantum K cache (ggml-org#4312)

* llama : support quantum K cache (wip)

* metal : add F32 -> Q8_0 copy kernel

* cuda : add F32 -> Q8_0 copy kernel

ggml-ci

* cuda : use mmv kernel for quantum cache ops

* llama : pass KV cache type through API

* llama : fix build

ggml-ci

* metal : add F32 -> Q4_0 copy kernel

* metal : add F32 -> Q4_1 copy kernel

* cuda : wip

* cuda : add F32 -> Q4_0 and F32 -> Q4_1 copy kernels

* llama-bench : support type_k/type_v

* metal : use mm kernel only for quantum KV cache

* cuda : add comment

* llama : remove memory_f16 and kv_f16 flags

---------

Co-authored-by: slaren

* readme : add API change notice

---------

Co-authored-by: slaren

Dec 7, 2023
bcc0eb4
zip
tar.gz

b1618

train : fix ggml-org#4227 (double free in examples/train-text-from-scratch/train-text-from-scratch.cpp) (ggml-org#4351)

In commit b1108 (44c117f), xaedes added

    ggml_allocr * alloc = NULL;

    ... (many lines in between)

    if (alloc) {
        ggml_allocr_free(alloc);
    }

This is correct, but it is easy to lose track of the context with so many lines in between.

In commit b1287 (0e76a89), xaedes made a big change. From that point on, alloc is freed eagerly.

    alloc = ggml_allocr_new(...)
    ... (a few lines of code)
    ggml_allocr_free(alloc)

This happens a few times, but alloc is never set to NULL, and many lines below,
we still have

    if (alloc) {
        ggml_allocr_free(alloc);
    }

which causes a double-free.
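
One common way to make this pattern safe is to reset the pointer to NULL whenever it is freed eagerly, so that the final `if (alloc)` guard becomes a no-op. The sketch below illustrates that idea only; it is not necessarily the change made in the actual fix. The `demo_alloc` type and the `demo_alloc_*` helpers are hypothetical stand-ins, not the ggml API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for an allocator handle; not the ggml API. */
typedef struct demo_alloc { int unused; } demo_alloc;

/* Counts calls to the free routine so the behaviour can be checked. */
static int g_free_calls = 0;

static demo_alloc * demo_alloc_new(void) {
    return (demo_alloc *) calloc(1, sizeof(demo_alloc));
}

static void demo_alloc_free(demo_alloc * a) {
    g_free_calls++;
    free(a);
}

/* Eager free that also clears the caller's pointer. */
static void demo_alloc_release(demo_alloc ** a) {
    if (*a) {
        demo_alloc_free(*a);
        *a = NULL;
    }
}

/* Mirrors the flow described above; returns the number of frees performed. */
static int run_demo(void) {
    demo_alloc * alloc = NULL;
    g_free_calls = 0;

    /* First short-lived use: allocate, use, free eagerly. */
    alloc = demo_alloc_new();
    demo_alloc_release(&alloc);

    /* Second short-lived use. */
    alloc = demo_alloc_new();
    demo_alloc_release(&alloc);

    /* Many lines later: the original cleanup guard.
       alloc is NULL here, so nothing is freed a second time. */
    if (alloc) {
        demo_alloc_free(alloc);
    }
    return g_free_calls;
}
```

With two allocations and two eager releases, the trailing guard does nothing, so each handle is freed exactly once.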

Dec 7, 2023
81bc921
zip
tar.gz

b1617

server : recognize cache_prompt parameter in OAI API (ggml-org#4347)

Dec 6, 2023
05cd6e5
zip
tar.gz
