I'm an ML engineer and independent consultant at Parlance Labs. I spend most of my time helping teams build AI products. Previously, I did applied ML at GitHub and Airbnb.
I'm working to bring data science back to AI: helping teams debug, analyze, and measure their systems. I call this work "evals." After applying it across 35+ AI products, I co-authored *Evals for AI Engineers* (O'Reilly), which covers error analysis, LLM-as-a-judge, synthetic data, production monitoring, and building data flywheels. I also co-teach a course on evals on Maven.
I write about what I learn at hamel.dev. Some recent posts:
| Date | Post |
|---|---|
| Mar 2026 | The Revenge of the Data Scientist |
| Mar 2026 | Evals Skills for Coding Agents |
| Jan 2026 | LLM Evals: Everything You Need to Know |
| Jul 2025 | Stop Saying RAG Is Dead |
| Mar 2025 | A Field Guide to Rapidly Improving AI Products |
| Dec 2024 | nbsanity - Share Notebooks as Polished Web Pages in Seconds |
| Nov 2024 | Building an Audience Through Technical Writing: Strategies and Mistakes |
| Oct 2024 | Using LLM-as-a-Judge For Evaluation: A Complete Guide |
| Oct 2024 | Concurrency Foundations For FastHTML |
| Jul 2024 | An Open Course on LLMs, Led by Practitioners |
I've contributed to tools across ML infrastructure, developer experience, and data science workflows. Full list here.