dev random (7478f102) at 14 Mar 13:29
nv-bench: add SGLang engine support, venv isolation, DRY engine-spe...
... and 1 more commit
dev random (abaea3b1) at 11 Mar 10:43
ad-hoc benchmark
non-trailing zero is invalid
also, don't we need to compare the indices?
we should only ignore trailing zeros, not zeros in any position
how are we using Hash?
dev random (1dd12c18) at 05 Mar 11:43
Add pyrightconfig and type stubs for LSP support
... and 5 more commits
dev random (fa8b6ea0) at 04 Mar 16:24
Trim AGENTS.md: move LiteLLM usage details to docs/operations/infer...
... and 5 more commits
dev random (7fb20194) at 04 Mar 16:11
more benchmarking
dev random (beaa9290) at 03 Mar 17:02
llama.cpp concurrency analysis: KV cache limits, prefill-decode con...
dev random (9c890018) at 02 Mar 20:24
Use inference.xc for LiteLLM API, make 'inference ls' alias for 'mo...
... and 4 more commits
dev random (c1d975aa) at 28 Feb 09:25
Switch inference model management from file-based to LiteLLM DB API
... and 1 more commit
dev random (a1eeb56f) at 27 Feb 22:28
Add /etc/inputrc to macOS nodes for Ctrl+Arrow word navigation
... and 3 more commits
dev random (64a9405f) at 24 Feb 10:40
Add Nvidia node bootstrap and CUDA/driver playbook
dev random (958850d4) at 19 Feb 14:30
Pass GLOBAL_LOG_LEVEL through to Open WebUI container, document deb...
dev random (5d4e2cf4) at 18 Feb 14:56
Add Open WebUI build/upgrade ops docs, remove ineffective stdbuf
dev random (631254ec) at 17 Feb 10:37
Fix macOS NAS mount persistence across NAS reboots
dev random (2cc49887) at 14 Feb 23:23
Increase Open WebUI nginx proxy timeouts to 10m and fix change dete...
dev random (d17009f4) at 11 Feb 20:46
Add LLM performance notes for context length throughput drop and KV...