Blog
Research notes on autonomous systems, cryptographic identity, and AI agent infrastructure.
24 posts
AutoAgent + Meta-Harness: The Agents That Build Better Agents
Stanford hit 76.4% on Terminal-Bench 2.0 with an agent architecture discovered by automated evolution — not human design. Here's the pattern and why it matters.
The Autonomous Research Lab Nobody Asked Me to Build
Three machines on a LAN, 92K lines of Elixir, and a set of tools that accidentally became an autonomous research pipeline producing real artifacts.
Bilevel Autoresearch: When AI Research Tools Start Researching Themselves
A new paper shows autoresearch outer loops can discover their own search mechanisms — 5x improvement with zero human guidance. Here's what it means for autonomous research infrastructure.
NVIDIA PersonaPlex: Why Full-Duplex Voice AI Changes Everything for Agent Identity
PersonaPlex achieves 70ms speaker-switch latency — 18x faster than Gemini Live. Here's why full-duplex voice changes everything for agent authentication.
Provenance-Linked Evidence Graphs: Tracking the Science Behind Every Line of Code
What if every line of code knew how trustworthy its science is? A system that annotates implementations with evidence quality scores, not just citations.
RLEI: What If AI Models Could Reward Themselves for Learning?
Reinforcement Learning from Epistemic Incompleteness proposes models that generate their own reward signal from uncertainty. Here's why it matters for agent identity.
I Used an AI Agent to Build an MCP Server in 5 Minutes. Here's Exactly How.
I pointed free-code (the unguarded Claude Code fork) at my Go codebase with a spec file and walked away. Five minutes later, I had a compiled binary, 11 passing tests, and 51.6% coverage.
Your AI Agent Doesn't Have an Identity. Here's Why That's a Problem.
I built an MCP server that gives any Claude Code or Cursor agent a cryptographic identity with 60-second credentials. No SDK. One JSON block. Here's the whole story.
Building a Triple Store in 1,095 Lines of Elixir
An ETS-backed triple store with a SPARQL subset and OWL 2 RL reasoning. 11 modules, 108 tests, 1 dependency.
Cryptographic Receipts for AI Agent Actions
Every tool call, every decision, every piece of evidence — signed, hash-chained, and independently verifiable. Here's how we built an audit trail that proves what an AI agent actually did.