Sovereign AI agent runtime for every device.
Quick Start • Features • Architecture • Hardware • Build • Docs
NeuronOS is a self-contained AI agent engine written in pure C11. It runs complete autonomous agents (reasoning, memory, tool use, and inter-agent communication) on any device, from a Raspberry Pi to a cloud server, with zero runtime dependencies and zero cloud requirements.
Built on BitNet b1.58 ternary models, NeuronOS delivers useful AI agents on hardware as modest as 1.5 GB of RAM, entirely offline.
$ curl -fsSL https://raw.githubusercontent.com/Neuron-OS/NeuronOS/main/install.sh | sh
$ neuronos
> What files are in my project?
[tool: list_dir] Scanning ./...
Found 12 files. Here's what I see:
src/main.c - Entry point
src/utils.c - Helper functions
Makefile - Build configuration
...
> Remember that the deadline for this project is March 15
[tool: memory_store] Saved to archival memory.
Noted. I'll remember the March 15 deadline.
Universal Install (Linux, macOS, Android, Windows via WSL):
curl -fsSL https://raw.githubusercontent.com/Neuron-OS/NeuronOS/main/install.sh | sh

This single command will:
- Detect your OS (Debian, Fedora, Arch, macOS, Android/Termux).
- Install dependencies (Vulkan SDK, CMake, compilers) automatically.
- Build and install `neuronos` optimized for your hardware.
- Download the best 1.58-bit model for your RAM.
Manual Build:
git clone https://github.com/Neuron-OS/neuronos
cd neuronos
./install.sh --build

Web/WASM Build:

./install.sh --wasm

## Features

- ReAct reasoning loop: Think → Act → Observe cycles with transparent reasoning
- 12 built-in tools: shell, file read/write, directory listing, file search, PDF reading, HTTP requests, calculator, time, and 3 memory tools
- 10,000+ external tools via MCP client integration
- 3-format GBNF grammar: constrained generation for reliable tool calling
- Multi-turn conversations with persistent context
- Core Memory: key-value blocks injected into every prompt (persona, instructions)
- Recall Memory: full chat history per session, FTS5 full-text searchable
- Archival Memory: permanent facts with unique keys, searchable, access-tracked
- Automatic context compaction at ~85% capacity with summarization
- MCP Server: expose NeuronOS tools to any MCP-compatible client (JSON-RPC 2.0, STDIO)
- MCP Client: connect to external MCP servers, auto-discover and use their tools (~1,370 lines of pure C)
- OpenAI-compatible HTTP API: `/v1/chat/completions`, `/v1/models`, SSE streaming
- A2A Protocol: agent-to-agent communication (coming next; first C implementation worldwide)
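As a sketch of what the MCP server mode speaks over STDIO, here is a hypothetical JSON-RPC 2.0 exchange (one message per line). The `tools/list` and `tools/call` method names come from the MCP specification; the exact fields NeuronOS emits, and the `list_dir` schema shown, are illustrative, not taken from the implementation.

```json
{"jsonrpc": "2.0", "id": 1, "method": "tools/list"}
{"jsonrpc": "2.0", "id": 1, "result": {"tools": [{"name": "list_dir", "description": "List files in a directory", "inputSchema": {"type": "object", "properties": {"path": {"type": "string"}}}}]}}
{"jsonrpc": "2.0", "id": 2, "method": "tools/call", "params": {"name": "list_dir", "arguments": {"path": "."}}}
```

Any MCP-compatible client that launches the binary and pipes messages like these through stdin/stdout can discover and invoke the built-in tools.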
- BitNet b1.58 ternary models: 2B params in 1.71 GiB, runs on 1.5 GB RAM
- 21 tokens/sec generation on a laptop CPU (i7-12650H, 4 threads)
- 95 tokens/sec prompt processing on the same hardware
- Multi-model support: BitNet 2B, Falcon3-7B/10B (1.58-bit), Qwen2.5-3B/14B (Q4_K_M)
- Automatic model selection based on detected hardware capabilities
- 4 ISA backends with automatic runtime detection:
  - `hal_scalar`: pure C fallback (works everywhere)
  - `hal_x86_avx2`: Intel/AMD Haswell+ (2013+)
  - `hal_x86_avxvnni`: Intel Alder Lake+ (2021+)
  - `hal_arm_neon`: Apple Silicon, Raspberry Pi 4/5
- CUDA build available for NVIDIA GPUs (Q4_K_M models)
## Architecture

```
┌───────────────────────────────────────────────────┐
│ Layer 7: Applications                             │
│   CLI (8 modes) • HTTP Server • MCP Server        │
├───────────────────────────────────────────────────┤
│ Layer 6: Agent                                    │
│   ReAct Loop • Tool Dispatch • Step Callbacks     │
├───────────────────────────────────────────────────┤
│ Layer 5: Tools                                    │
│   Registry (12 built-in) • MCP Bridge • Sandbox   │
├───────────────────────────────────────────────────┤
│ Layer 4: Grammar                                  │
│   GBNF Constrained Generation (3 formats)         │
├───────────────────────────────────────────────────┤
│ Layer 3: Inference                                │
│   llama.cpp wrapper (BitNet I2_S kernels)         │
├───────────────────────────────────────────────────┤
│ Layer 2.5: Memory                                 │
│   SQLite 3.47.2 + FTS5 (MemGPT 3-tier)            │
├───────────────────────────────────────────────────┤
│ Layer 2: HAL                                      │
│   Runtime ISA dispatch (scalar/AVX2/VNNI/NEON)    │
├───────────────────────────────────────────────────┤
│ Layer 1: Hardware                                 │
│   x86-64 • ARM64 • RISC-V • WASM (planned)        │
└───────────────────────────────────────────────────┘
```
~9,400 lines of C11 across 19 source files. No C++ in the public API.
## Hardware Support

| Platform | CPU (AVX2/ARM) | GPU (Vulkan) | NPU | Web (WASM) |
|---|---|---|---|---|
| Linux | ✅ | ✅ | 🚧 | ✅ |
| macOS | ✅ | ✅ (MoltenVK) | 🚧 | ✅ |
| Windows | ✅ | ✅ | 🚧 | ✅ |
| Android | ✅ | ✅ | 🚧 | ✅ |
| iOS | - | - | - | ✅ (Safari) |
neuronos # Auto-detect model, launch agent
neuronos run "Summarize this" # Single prompt
neuronos agent                  # Explicit agent mode
neuronos --mcp                  # Load tools from ~/.neuronos/mcp.json
neuronos serve --port 8080      # Start API server
curl http://localhost:8080/v1/chat/completions \
-H "Content-Type: application/json" \
  -d '{"model":"neuronos","messages":[{"role":"user","content":"Hello"}]}'

neuronos mcp                    # JSON-RPC 2.0 over STDIO
neuronos hwinfo                 # Show detected hardware + backends
neuronos scan                   # Scan for available models

## Build

Requirements:

- C11 compiler (Clang 14+ or GCC 12+ recommended)
- CMake 3.20+
- ~2 GB disk space for build
cmake -B build -S . -DCMAKE_BUILD_TYPE=Release \
-DCMAKE_C_COMPILER=clang -DCMAKE_CXX_COMPILER=clang++
cmake --build build -j$(nproc)

./build/bin/test_hal && ./build/bin/test_engine && ./build/bin/test_memory
# Expected: 27/27 PASS

| Option | Description | Default |
|---|---|---|
| `CMAKE_BUILD_TYPE` | Release / Debug | Release |
| `BITNET_X86_TL2` | x86 TL2 kernel (experimental) | OFF |
| `CMAKE_EXPORT_COMPILE_COMMANDS` | Generate compile_commands.json | OFF |
```
neuronos/
├── include/neuronos/
│   ├── neuronos.h                    # Public API (694 lines, v0.9.1)
│   └── neuronos_hal.h                # HAL API (331 lines)
├── src/
│   ├── hal/                          # Hardware abstraction backends
│   │   ├── hal_registry.c            # Backend registry + CPUID detection
│   │   ├── hal_scalar.c              # Pure C fallback
│   │   ├── hal_x86_avx2.c            # AVX2 backend
│   │   ├── hal_x86_avxvnni.c         # AVX-VNNI backend
│   │   └── hal_arm_neon.c            # ARM NEON backend
│   ├── engine/
│   │   ├── neuronos_engine.c         # Inference engine (llama.cpp wrapper)
│   │   └── neuronos_model_selector.c # HW detection + model scoring
│   ├── memory/
│   │   └── neuronos_memory.c         # MemGPT 3-tier memory (SQLite+FTS5)
│   ├── agent/
│   │   ├── neuronos_agent.c          # ReAct agent loop + memory integration
│   │   └── neuronos_tool_registry.c  # Tool registry + 12 built-in tools
│   ├── cli/
│   │   └── neuronos_cli.c            # CLI with 8 modes
│   ├── interface/
│   │   └── neuronos_server.c         # HTTP server (OpenAI API + SSE)
│   └── mcp/
│       ├── neuronos_mcp_server.c     # MCP server (JSON-RPC STDIO)
│       └── neuronos_mcp_client.c     # MCP client (~1,370 lines)
├── 3rdparty/
│   ├── sqlite/                       # SQLite 3.47.2 amalgamation
│   └── sqlite-vec/                   # sqlite-vec v0.1.6 (prepared)
├── tests/
│   ├── test_hal.c                    # 4 HAL tests
│   ├── test_engine.c                 # 11 engine + agent tests
│   └── test_memory.c                 # 12 memory tests
└── grammars/
    ├── tool_call.gbnf                # Tool calling grammar
    └── json.gbnf                     # JSON output grammar
```
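The grammar files listed above drive the constrained generation described under Features. As an illustration only, a minimal tool-call grammar in llama.cpp's GBNF notation could look like the following; the shipped `grammars/tool_call.gbnf` may differ in rule names and structure.

```
# hypothetical sketch, not the shipped grammar
root     ::= "{" ws "\"tool\"" ws ":" ws toolname ws "," ws "\"args\"" ws ":" ws object ws "}"
toolname ::= "\"" [a-z_]+ "\""
object   ::= "{" ws (pair (ws "," ws pair)*)? ws "}"
pair     ::= string ws ":" ws value
value    ::= string | number
string   ::= "\"" [^"]* "\""
number   ::= "-"? [0-9]+
ws       ::= [ \t\n]*
```

A grammar like this forces every sampled token sequence to be a well-formed tool-call object, which is why tool dispatch can rely on the model's output parsing cleanly.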
## Docs

| Document | Description |
|---|---|
| ROADMAP.md | Strategic roadmap and execution plan |
| TRACKING.md | Iteration-by-iteration progress log |
| AGENTS.md | Instructions for AI coding agents |
| ARSENAL.md | Technology arsenal and market research |
- Not an inference speed benchmark. llama.cpp will always be faster. We optimize for agent utility.
- Not a cloud service. Everything runs locally. Your data never leaves your device.
- Not a Python framework. Pure C11, zero runtime dependencies. Compiles to a single binary.
- Not a replacement for GPT-5. Ternary models have limits. We bring intelligence where frontier models can't reach: offline, embedded, private, free.
We welcome contributions. Please read AGENTS.md for coding standards and architecture guidelines before submitting PRs.
All tests must pass before any commit:
./build/bin/test_hal && ./build/bin/test_engine && ./build/bin/test_memory

MIT License. See LICENSE for details.
SQLite is public domain. sqlite-vec is MIT/Apache-2.0.