Tags: razims/llama.cpp
Define non-positive temperature behavior (ggml-org#720)
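One plausible reading of "defined behavior" for a non-positive temperature is to fall back to greedy selection instead of dividing the logits by zero or by a negative number. The sketch below illustrates that idea only. The helper names `argmax_logit` and `apply_temperature` are hypothetical and not taken from the llama.cpp source, and the actual change in ggml-org#720 may differ.

```c
#include <stddef.h>

/* Index of the largest logit (first one wins on ties). Assumes n > 0. */
static size_t argmax_logit(const float *logits, size_t n) {
    size_t best = 0;
    for (size_t i = 1; i < n; i++) {
        if (logits[i] > logits[best]) {
            best = i;
        }
    }
    return best;
}

/*
 * Hypothetical helper, for illustration only.
 * temp <= 0: return the greedy (argmax) token index and leave logits untouched.
 * temp  > 0: scale logits in place by 1/temp and return -1, meaning the caller
 *            should go on to sample from the scaled logits.
 */
long apply_temperature(float *logits, size_t n, float temp) {
    if (temp <= 0.0f) {
        return (long) argmax_logit(logits, n);
    }
    for (size_t i = 0; i < n; i++) {
        logits[i] /= temp;
    }
    return -1;
}
```

A caller would check the return value first: a non-negative result is the chosen token, while -1 means the usual sampling path still has to run.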
10+% performance improvement of ggml_vec_dot_q4_0 on AVX2 (ggml-org#654)
  * Performance improvement of AVX2 code
  * Fixed problem with MSVC compiler
  * Reviewer comments: removed double semicolon, deleted empty line 1962
Windows: reactivate SIGINT handler after each Ctrl-C (ggml-org#736)
Added api for getting/setting the kv_cache (ggml-org#685)
  The API provides access methods for retrieving the current memory buffer for the kv_cache and its token count. It also contains a method for setting the kv_cache from a memory buffer. This makes it possible to load/save history; maybe support a --cache-prompt parameter as well?
  Co-authored-by: Pavol Rusnak
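The save/restore pattern this entry describes can be sketched as follows. This is a toy model under stated assumptions, not the llama.cpp API: the `toy_ctx` struct, the `toy_*` accessors, and the snapshot helpers are all hypothetical names invented for illustration, and the real signatures added in ggml-org#685 are not shown in this listing.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Toy stand-in for a context that owns a KV cache buffer (hypothetical). */
typedef struct {
    uint8_t *kv_buf;   /* raw cache memory */
    size_t   kv_size;  /* size of the buffer in bytes */
    int      n_tokens; /* number of tokens currently cached */
} toy_ctx;

/* Getters: expose the buffer, its size, and the token count. */
const uint8_t *toy_get_kv_cache(const toy_ctx *ctx)             { return ctx->kv_buf; }
size_t         toy_get_kv_cache_size(const toy_ctx *ctx)        { return ctx->kv_size; }
int            toy_get_kv_cache_token_count(const toy_ctx *ctx) { return ctx->n_tokens; }

/* Setter: copy a saved buffer back in. Returns 0 on success, -1 on size mismatch. */
int toy_set_kv_cache(toy_ctx *ctx, const uint8_t *src, size_t size, int n_tokens) {
    if (size != ctx->kv_size) {
        return -1;
    }
    memcpy(ctx->kv_buf, src, size);
    ctx->n_tokens = n_tokens;
    return 0;
}

/* A saved copy of the cache; the caller frees `data` when done. */
typedef struct {
    uint8_t *data;
    size_t   size;
    int      n_tokens;
} toy_snapshot;

/* Save: copy the current cache contents into a freshly allocated snapshot. */
int toy_snapshot_save(const toy_ctx *ctx, toy_snapshot *out) {
    out->size     = toy_get_kv_cache_size(ctx);
    out->n_tokens = toy_get_kv_cache_token_count(ctx);
    out->data     = malloc(out->size);
    if (out->data == NULL) {
        return -1;
    }
    memcpy(out->data, toy_get_kv_cache(ctx), out->size);
    return 0;
}

/* Restore: write the snapshot back through the setter. */
int toy_snapshot_restore(toy_ctx *ctx, const toy_snapshot *snap) {
    return toy_set_kv_cache(ctx, snap->data, snap->size, snap->n_tokens);
}
```

Writing the snapshot bytes and token count to a file instead of a heap buffer would give the load/save-history behavior the commit message mentions.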
make : use -march=native -mtune=native on x86 (ggml-org#609)
llama : do not allocate KV cache for "vocab_only == true" (ggml-org#682)
  Fixes sanitizer CI
Enable -std= for cmake builds, fix warnings (ggml-org#598)