🦞 I'm Jean-Marc (druide67) — I build tools for local LLM inference on Apple Silicon.
asiai — Benchmark, monitor & compare 6 inference engines (Ollama, LM Studio, mlx-lm, llama.cpp, vllm-mlx, Exo). One CLI. Real numbers.
Built because my AI agents needed to monitor their own inference. I gave them asiai's API; they started monitoring themselves (sketch below).
Bench your claw!
- MLX is 2.3x faster than llama.cpp for MoE architectures on Apple Silicon
- DeltaNet KV cache stays flat from 64k to 256k context (same VRAM!)
- Same model, same Mac: 30 tok/s on one engine, 71 tok/s on another
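
For the curious, here's what agent self-monitoring might look like: a minimal Python sketch that polls a local metrics endpoint. The URL, port, and JSON field names are illustrative assumptions, not asiai's documented API.

```python
# Hypothetical sketch: an agent polling a local asiai metrics endpoint.
# The endpoint URL and JSON fields below are assumptions for illustration,
# not asiai's documented API.
import json
import time
import urllib.request

ASIAI_METRICS_URL = "http://localhost:8080/metrics"  # assumed endpoint


def poll_once() -> dict:
    """Fetch one metrics snapshot as JSON."""
    with urllib.request.urlopen(ASIAI_METRICS_URL, timeout=2) as resp:
        return json.load(resp)


if __name__ == "__main__":
    while True:
        try:
            m = poll_once()
            # Field names ('engine', 'tokens_per_sec', 'vram_gb') are assumed.
            print(f"{m.get('engine', '?')}: "
                  f"{m.get('tokens_per_sec', 0):.1f} tok/s, "
                  f"{m.get('vram_gb', 0):.1f} GB VRAM")
        except OSError as e:
            print(f"asiai endpoint unreachable: {e}")
        time.sleep(5)
```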
Contributor to OpenClaw, a multi-agent AI assistant.
Strasbourg, France | asiai.dev | @jmn67 on X | LinkedIn