ML Researcher · Open-Source Engineer · Apple Silicon ML Ecosystem
Stuttgart, Germany — building the infrastructure that makes local AI actually work.
I build the tools that let researchers and engineers run, train, and understand large language models on their own hardware — specifically Apple Silicon. My work spans core MLX contributions, independent research published on arXiv, and open-source projects used by thousands.
If my work has saved you GPU bills, unlocked fine-tuning on your Mac, or ended up in your pipeline — consider sponsoring. Everything I build is free, maintained in my spare time, and driven by the belief that capable AI tooling shouldn't require a data center.
| Paper / Model | Description | Year |
|---|---|---|
| JOSIEfied Qwen3.5 Gabliterated Models | Newest iteration of the JOSIEfied model family — now with vision support | 2026 |
| DynaMoE | Dynamic, adaptive Mixture-of-Experts LLM architecture | 2026 |
| JOSIE Models | World's first fully fine-tuned model family trained entirely on Apple Silicon | 2026 |
| JOSIEfied Qwen3 Abliterated Models | Reached #1 globally on uncensored benchmarks | 2025 |
| Gabliteration | Automated Gabliteration for any Transformers-compatible LLM | 2025 |
Train large language models natively on Apple Silicon
LoRA, QLoRA, and full-precision fine-tuning for LLMs — built on MLX. Supports Preference Optimization, RLHF, and RL with custom reward functions. The go-to fine-tuning toolkit for anyone running Apple Silicon; a minimal sketch of the underlying training loop follows the feature list below.
- Full-precision, LoRA, and QLoRA training modes
- 12+ training methods and algorithms
- WandB integration for training metrics
- Multiple optimizer support including Muon
- Example notebooks
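For context, this is the bare MLX training-step pattern the toolkit builds on: a toy stand-in model for illustration, not the toolkit's actual API.

```python
import mlx.core as mx
import mlx.nn as nn
import mlx.optimizers as optim

# Toy stand-in for a language model; the real toolkit wires the same
# loop around models loaded through mlx-lm.
model = nn.Sequential(nn.Embedding(1000, 64), nn.Linear(64, 1000))
optimizer = optim.AdamW(learning_rate=1e-5)

def loss_fn(model, inputs, targets):
    logits = model(inputs)  # (batch, seq, vocab)
    return nn.losses.cross_entropy(logits, targets, reduction="mean")

loss_and_grad = nn.value_and_grad(model, loss_fn)

inputs = mx.random.randint(0, 1000, (2, 16))   # (batch, seq) token ids
targets = mx.random.randint(0, 1000, (2, 16))
loss, grads = loss_and_grad(model, inputs, targets)
optimizer.update(model, grads)                 # apply the AdamW step
mx.eval(model.parameters(), optimizer.state)   # force MLX's lazy evaluation
```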
Abliterate and research the inner workings of LLMs
Interpretability and intervention tooling for language models in MLX. Understand what's happening inside your model, not just what comes out.
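For a flavor of what such an intervention looks like, here is a minimal sketch of abliteration via a difference-of-means refusal direction. The activation arrays and the layer they come from are assumptions for illustration, not this project's API:

```python
import mlx.core as mx

def refusal_direction(harmful_acts: mx.array, harmless_acts: mx.array) -> mx.array:
    # Hypothetical (n, d) arrays of residual-stream activations collected
    # at one layer over harmful vs. harmless prompts.
    d = mx.mean(harmful_acts, axis=0) - mx.mean(harmless_acts, axis=0)
    return d / mx.linalg.norm(d)

def ablate(hidden: mx.array, direction: mx.array) -> mx.array:
    # Remove each hidden state's component along the refusal direction.
    return hidden - (hidden @ direction)[..., None] * direction
```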
Your own version of Google's NotebookLM — fully local, fully private
PDF-grounded audio generation (podcasts, summaries, and more, with up to 6 speakers) running entirely on-device. No API keys, no cloud, no data leaving your machine. A companion native app is available as Local NotebookLM-App.
Fine-tune embedding models natively on Apple Silicon
Train and adapt embedding models for retrieval, search, and semantic tasks — directly in MLX.
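A common recipe for this kind of training is a contrastive loss with in-batch negatives; a minimal MLX sketch of such a loss (not necessarily this project's exact objective):

```python
import mlx.core as mx
import mlx.nn as nn

def info_nce_loss(query_emb: mx.array, doc_emb: mx.array, temperature: float = 0.05):
    # The positive for query i is the document at index i; every other
    # document in the batch serves as a negative.
    q = query_emb / mx.linalg.norm(query_emb, axis=-1, keepdims=True)
    d = doc_emb / mx.linalg.norm(doc_emb, axis=-1, keepdims=True)
    logits = (q @ d.T) / temperature      # (batch, batch) cosine similarities
    targets = mx.arange(logits.shape[0])  # diagonal entries are the positives
    return nn.losses.cross_entropy(logits, targets, reduction="mean")
```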
Benchmark LLMs on Apple MLX framework knowledge and coding tasks.
MLX Benchmark is the first comprehensive CLI tool and dataset that measures how well large language models understand, write, and debug code for Apple's MLX machine learning framework — covering everything from core array operations to LoRA fine-tuning with mlx-lm, mlx-vlm, and mlx-embeddings.
📐 MLX-KAN
Kolmogorov-Arnold Networks in MLX
Native MLX implementation of KANs — a fundamentally different alternative to MLPs, implemented cleanly for Apple Silicon.
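To convey the idea (a toy sketch, not this repo's implementation): where an MLP puts a scalar weight on each edge, a KAN puts a small learnable univariate function there. With a fixed Gaussian basis, that looks like:

```python
import mlx.core as mx
import mlx.nn as nn

class ToyKANLayer(nn.Module):
    """Each input-output edge learns phi(x) = sum_k c_k * exp(-(x - mu_k)^2)."""

    def __init__(self, in_dim: int, out_dim: int, num_basis: int = 8):
        super().__init__()
        # Underscore prefix keeps the fixed grid out of the trainable parameters.
        self._centers = mx.linspace(-2.0, 2.0, num_basis)
        self.coeff = 0.1 * mx.random.normal((in_dim, out_dim, num_basis))

    def __call__(self, x: mx.array) -> mx.array:
        # x: (batch, in_dim) -> basis: (batch, in_dim, num_basis)
        basis = mx.exp(-((x[..., None] - self._centers) ** 2))
        # Sum the learned edge functions over the inputs: (batch, out_dim)
        return mx.einsum("bik,iok->bo", basis, self.coeff)
```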
Automated abliteration for any Transformers LLM
Companion code to arXiv:2412.06527. Remove refusal directions from any model supported in Hugging Face Transformers.
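The weight-space version of the edit, in simplified form (see the paper for the actual procedure; shown in MLX for consistency, though the math is framework-agnostic):

```python
import mlx.core as mx

def orthogonalize(W: mx.array, d: mx.array) -> mx.array:
    """Remove direction d from the output space of W (shape (out, in)).

    W' = W - d d^T W with d unit-norm, so W' @ x has no component
    along d for any input x.
    """
    d = d / mx.linalg.norm(d)
    return W - d[:, None] * (d @ W)[None, :]
```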
A family of fine-tuned models that reached #1 globally on relevant benchmarks (arXiv:2512.18901).
I'm an officially acknowledged contributor to the core MLX stack:
- ml-explore/mlx — core framework
- ml-explore/mlx-lm — language model inference & training
- ml-explore/mlx-examples — reference implementations
- Blaizzy/mlx-vlm — vision-language models
| Model | Organization |
|---|---|
| Mamba v1 & v2 | State Space |
| MiniCPM & MiniCPM3 | OpenBMB |
| Helium | Kyutai |
| GLM, GLM4, GLM5 | Z.ai & THUKEG |
| dots.llm1 | Rednote |
| Ernie4.5 MoE | Baidu |
| Bailing MoE & Bailing MoE Linear (Ling-family) | inclusionAI |
| Granite MoE | IBM |
| LongCat | Meituan |
| Nemotron H | NVIDIA |
| Apertus | Swiss-AI |
| OLMoE & OLMo 3 | AllenAI |
| Jamba | AI21 Labs |
| And many more... | |
- Full weight fine-tuning support in mlx-lm
- Muon optimizer (both mlx and mlx-lm)
- WandB metric reporting during training
- Multiple optimizer choices for training runs
- ReLU² activation function in core MLX
A fully local, real-time, full-duplex multimodal assistant for smart home control
A discrete diffusion language model with a custom tokenizer (ChatML-style format, hardcoded vocabulary covering rooms, devices, properties, and continuous value bins) that can autonomously control and manage a complete smart home environment — sensors, cameras, LEDs, and more. Fully offline. No cloud dependency.
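As a rough sketch of the continuous-value-bin idea (the token format and bin count here are illustrative, not the project's actual vocabulary):

```python
NUM_BINS = 32  # hypothetical; the real vocabulary is hardcoded per property

def value_to_token(value: float, lo: float, hi: float) -> str:
    """Quantize a continuous reading (e.g. brightness 0-255) into a bin token."""
    frac = min(max((value - lo) / (hi - lo), 0.0), 1.0)
    return f"<val_{min(int(frac * NUM_BINS), NUM_BINS - 1)}>"

def token_to_value(token: str, lo: float, hi: float) -> float:
    """Map a bin token back to its bin's midpoint in the original range."""
    bin_idx = int(token.removeprefix("<val_").removesuffix(">"))
    return lo + (bin_idx + 0.5) / NUM_BINS * (hi - lo)
```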
Current focus: training data generation strategies and model architecture validation.
A new Linear Dynamic Mixture-of-Experts LLM architecture — currently in development.
Everything above — the MLX contributions, the research, the open-source tools — is built in my spare time. If any of it has been useful to you or your team, sponsoring directly funds continued development, faster bug fixes, and new features.