NeuronOS

Sovereign AI agent runtime for every device.

Quick Start • Features • Architecture • Hardware • Build • Docs



NeuronOS is a self-contained AI agent engine written in pure C11. It runs complete autonomous agents (reasoning, memory, tool use, and inter-agent communication) on any device, from a Raspberry Pi to a cloud server, with zero runtime dependencies and zero cloud requirements.

Built on BitNet b1.58 ternary models, NeuronOS delivers useful AI agents on hardware as modest as 1.5 GB of RAM, entirely offline.

$ curl -fsSL https://raw.githubusercontent.com/Neuron-OS/NeuronOS/main/install.sh | sh
$ neuronos
> What files are in my project?
[tool: list_dir] Scanning ./...
Found 12 files. Here's what I see:
  src/main.c        - Entry point
  src/utils.c       - Helper functions
  Makefile          - Build configuration
  ...
> Remember that the deadline for this project is March 15
[tool: memory_store] Saved to archival memory.
Noted. I'll remember the March 15 deadline.

Quick Start

Universal Install (Linux, macOS, Android, Windows via WSL):

curl -fsSL https://raw.githubusercontent.com/Neuron-OS/NeuronOS/main/install.sh | sh

This single command will:

  1. Detect your OS (Debian, Fedora, Arch, macOS, Android/Termux).
  2. Install dependencies (Vulkan SDK, CMake, compilers) automatically.
  3. Build and install neuronos optimized for your hardware.
  4. Download the best 1.58-bit model for your RAM.

Manual Build:

git clone https://github.com/Neuron-OS/neuronos
cd neuronos
./install.sh --build

Web/WASM Build:

./install.sh --wasm

Features

Agent Engine

  • ReAct reasoning loop - Think → Act → Observe cycles with transparent reasoning
  • 12 built-in tools - shell, file read/write, directory listing, file search, PDF reading, HTTP requests, calculator, time, and 3 memory tools
  • 10,000+ external tools via MCP client integration
  • 3-format GBNF grammar - constrained generation for reliable tool calling
  • Multi-turn conversations with persistent context
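The Think → Act → Observe cycle hinges on dispatching the tool name the model emits to a registered handler. The sketch below illustrates that idea with a function-pointer table; `tool_entry`, `dispatch`, and `tool_time` are hypothetical names, not the NeuronOS API (the real loop lives in src/agent/).

```c
#include <stddef.h>
#include <string.h>

/* Illustrative sketch of tool dispatch inside a ReAct loop. Names are
 * hypothetical; the real implementation is in neuronos_agent.c and
 * neuronos_tool_registry.c. */
typedef const char *(*tool_fn)(const char *args);

typedef struct {
    const char *name;  /* tool name the model emits in its "Act" step */
    tool_fn     fn;    /* handler whose result becomes the "Observe" step */
} tool_entry;

/* A trivial built-in tool: report a fixed time. */
static const char *tool_time(const char *args) {
    (void)args;
    return "12:00";
}

/* Look up the named tool and run it. Unknown names become an error
 * observation the model can react to on its next "Think" step. */
static const char *dispatch(const tool_entry *tools, size_t count,
                            const char *name, const char *args) {
    for (size_t i = 0; i < count; i++)
        if (strcmp(tools[i].name, name) == 0)
            return tools[i].fn(args);
    return "error: unknown tool";
}
```

Returning the error as an observation, rather than aborting, is what lets a ReAct agent recover from a bad tool call within the same conversation.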

Memory (MemGPT 3-Tier)

  • Core Memory - key-value blocks injected into every prompt (persona, instructions)
  • Recall Memory - full chat history per session, FTS5 full-text searchable
  • Archival Memory - permanent facts with unique keys, searchable, access-tracked
  • Automatic context compaction at ~85% capacity with summarization
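The core tier is the simplest of the three: key-value blocks rendered verbatim into every prompt. A minimal sketch of that rendering step, with illustrative names that are not the NeuronOS public API:

```c
#include <stdio.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical sketch of MemGPT-style core memory: key-value blocks
 * rendered into a tagged section of the system prompt. */
typedef struct {
    const char *key;
    const char *value;
} core_block;

/* Write all blocks into `out`; returns bytes written, or -1 if the
 * buffer is too small. */
static int render_core_memory(char *out, size_t cap,
                              const core_block *blocks, size_t count) {
    int w = snprintf(out, cap, "<core_memory>\n");
    if (w < 0 || (size_t)w >= cap) return -1;
    size_t used = (size_t)w;
    for (size_t i = 0; i < count; i++) {
        w = snprintf(out + used, cap - used, "%s: %s\n",
                     blocks[i].key, blocks[i].value);
        if (w < 0 || used + (size_t)w >= cap) return -1;
        used += (size_t)w;
    }
    w = snprintf(out + used, cap - used, "</core_memory>\n");
    if (w < 0 || used + (size_t)w >= cap) return -1;
    return (int)(used + (size_t)w);
}
```

Because the blocks ride along in every prompt, anything stored here (persona, standing instructions) survives context compaction without a retrieval step.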

Protocols

  • MCP Server - expose NeuronOS tools to any MCP-compatible client (JSON-RPC 2.0, STDIO)
  • MCP Client - connect to external MCP servers, auto-discover and use their tools (~1,370 lines of pure C)
  • OpenAI-compatible HTTP API - /v1/chat/completions, /v1/models, SSE streaming
  • A2A Protocol - agent-to-agent communication (coming next - first C implementation worldwide)
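Both MCP endpoints speak JSON-RPC 2.0, so each request is a single JSON object naming a method such as `tools/list` (a standard MCP method). The helper below is an illustrative sketch of building such a request line, not the actual client code in src/mcp/:

```c
#include <stdio.h>
#include <stddef.h>
#include <string.h>

/* Build a minimal JSON-RPC 2.0 request, as an MCP client would write
 * over STDIO. Illustrative helper; the real client is in
 * src/mcp/neuronos_mcp_client.c. Returns bytes written, or -1. */
static int build_rpc_request(char *buf, size_t cap,
                             int id, const char *method) {
    int w = snprintf(buf, cap,
                     "{\"jsonrpc\":\"2.0\",\"id\":%d,\"method\":\"%s\"}",
                     id, method);
    return (w < 0 || (size_t)w >= cap) ? -1 : w;
}
```

The matching response carries the same `id`, which is how the client pairs answers with requests over a single STDIO pipe.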

Inference

  • BitNet b1.58 ternary models - 2B params in 1.71 GiB, runs on 1.5 GB RAM
  • 21 tokens/sec generation on a laptop CPU (i7-12650H, 4 threads)
  • 95 tokens/sec prompt processing on the same hardware
  • Multi-model support - BitNet 2B, Falcon3-7B/10B (1.58-bit), Qwen2.5-3B/14B (Q4_K_M)
  • Automatic model selection based on detected hardware capabilities
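Automatic selection can be pictured as a RAM-keyed ladder over the supported models. The thresholds below are assumptions inferred from the figures above, not the real scoring logic in neuronos_model_selector.c, which also weighs CPU features:

```c
#include <string.h>

/* Illustrative RAM-keyed model choice. Thresholds are assumptions;
 * the real selector scores models against full hardware detection. */
static const char *pick_model(double ram_gib) {
    if (ram_gib >= 16.0) return "Qwen2.5-14B (Q4_K_M)";
    if (ram_gib >= 8.0)  return "Falcon3-7B (1.58-bit)";
    if (ram_gib >= 1.5)  return "BitNet b1.58 2B";
    return "none";  /* below the 1.5 GB floor, no model fits */
}
```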

Hardware Abstraction

  • 5 ISA backends with automatic runtime detection:
    • hal_scalar - pure C fallback (works everywhere)
    • hal_x86_avx2 - Intel/AMD Haswell+ (2013+)
    • hal_x86_avxvnni - Intel Alder Lake+ (2021+)
    • hal_arm_neon - Apple Silicon, Raspberry Pi 4/5
    • CUDA build available for NVIDIA GPUs (Q4_K_M models)
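Runtime backend selection boils down to compile-time architecture guards plus a CPUID probe, for which GCC and Clang provide `__builtin_cpu_supports`. This simplified sketch omits the AVX-VNNI check; the actual detection and registry live in src/hal/hal_registry.c:

```c
#include <string.h>

/* Simplified backend pick: compile-time ISA guards plus runtime CPUID
 * via __builtin_cpu_supports (GCC/Clang). The real hal_registry.c also
 * probes AVX-VNNI. */
static const char *select_backend(void) {
#if defined(__aarch64__)
    return "hal_arm_neon";
#elif defined(__x86_64__) || defined(__i386__)
    if (__builtin_cpu_supports("avx2"))
        return "hal_x86_avx2";
    return "hal_scalar";
#else
    return "hal_scalar";   /* pure C fallback works everywhere */
#endif
}
```

Keeping the decision at startup, behind one function, is what lets a single binary run unchanged from a Raspberry Pi to a modern x86 server.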

Architecture

┌─────────────────────────────────────────────────┐
│  Layer 7: Applications                          │
│    CLI (8 modes) • HTTP Server • MCP Server     │
├─────────────────────────────────────────────────┤
│  Layer 6: Agent                                 │
│    ReAct Loop • Tool Dispatch • Step Callbacks  │
├─────────────────────────────────────────────────┤
│  Layer 5: Tools                                 │
│    Registry (12 built-in) • MCP Bridge • Sandbox│
├─────────────────────────────────────────────────┤
│  Layer 4: Grammar                               │
│    GBNF Constrained Generation (3 formats)      │
├─────────────────────────────────────────────────┤
│  Layer 3: Inference                             │
│    llama.cpp wrapper (BitNet I2_S kernels)      │
├─────────────────────────────────────────────────┤
│  Layer 2.5: Memory                              │
│    SQLite 3.47.2 + FTS5 (MemGPT 3-tier)         │
├─────────────────────────────────────────────────┤
│  Layer 2: HAL                                   │
│    Runtime ISA dispatch (scalar/AVX2/VNNI/NEON) │
├─────────────────────────────────────────────────┤
│  Layer 1: Hardware                              │
│    x86-64 • ARM64 • RISC-V • WASM (planned)     │
└─────────────────────────────────────────────────┘

~9,400 lines of C11 across 19 source files. No C++ in the public API.

Supported Hardware

Platform   CPU (AVX2/ARM)   GPU (Vulkan)    NPU   Web (WASM)
Linux      ✅               ✅              🚧    ✅
macOS      ✅               ✅ (MoltenVK)   🚧    ✅
Windows    ✅               ✅              🚧    ✅
Android    ✅               ✅              🚧    ✅
iOS        -                -               -     ✅ (Safari)

Usage

Interactive Agent (default)

neuronos                          # Auto-detect model, launch agent
neuronos run "Summarize this"     # Single prompt
neuronos agent                    # Explicit agent mode

With MCP Tools

neuronos --mcp                    # Load tools from ~/.neuronos/mcp.json

HTTP Server (OpenAI-compatible)

neuronos serve --port 8080        # Start API server
curl http://localhost:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{"model":"neuronos","messages":[{"role":"user","content":"Hello"}]}'

MCP Server

neuronos mcp                      # JSON-RPC 2.0 over STDIO

Hardware Info

neuronos hwinfo                   # Show detected hardware + backends
neuronos scan                     # Scan for available models

Building from Source

Requirements

  • C11 compiler (Clang 14+ or GCC 12+ recommended)
  • CMake 3.20+
  • ~2 GB disk space for build

Build

cmake -B build -S . -DCMAKE_BUILD_TYPE=Release \
  -DCMAKE_C_COMPILER=clang -DCMAKE_CXX_COMPILER=clang++
cmake --build build -j$(nproc)

Test

./build/bin/test_hal && ./build/bin/test_engine && ./build/bin/test_memory
# Expected: 27/27 PASS

Build Options

Option                         Description                       Default
CMAKE_BUILD_TYPE               Release / Debug                   Release
BITNET_X86_TL2                 x86 TL2 kernel (experimental)     OFF
CMAKE_EXPORT_COMPILE_COMMANDS  Generate compile_commands.json    OFF

Project Structure

neuronos/
├── include/neuronos/
│   ├── neuronos.h              # Public API (694 lines, v0.9.1)
│   └── neuronos_hal.h          # HAL API (331 lines)
├── src/
│   ├── hal/                    # Hardware abstraction backends
│   │   ├── hal_registry.c      # Backend registry + CPUID detection
│   │   ├── hal_scalar.c        # Pure C fallback
│   │   ├── hal_x86_avx2.c      # AVX2 backend
│   │   ├── hal_x86_avxvnni.c   # AVX-VNNI backend
│   │   └── hal_arm_neon.c      # ARM NEON backend
│   ├── engine/
│   │   ├── neuronos_engine.c   # Inference engine (llama.cpp wrapper)
│   │   └── neuronos_model_selector.c  # HW detection + model scoring
│   ├── memory/
│   │   └── neuronos_memory.c   # MemGPT 3-tier memory (SQLite + FTS5)
│   ├── agent/
│   │   ├── neuronos_agent.c    # ReAct agent loop + memory integration
│   │   └── neuronos_tool_registry.c   # Tool registry + 12 built-in tools
│   ├── cli/
│   │   └── neuronos_cli.c      # CLI with 8 modes
│   ├── interface/
│   │   └── neuronos_server.c   # HTTP server (OpenAI API + SSE)
│   └── mcp/
│       ├── neuronos_mcp_server.c  # MCP server (JSON-RPC STDIO)
│       └── neuronos_mcp_client.c  # MCP client (~1,370 lines)
├── 3rdparty/
│   ├── sqlite/                 # SQLite 3.47.2 amalgamation
│   └── sqlite-vec/             # sqlite-vec v0.1.6 (prepared)
├── tests/
│   ├── test_hal.c              # 4 HAL tests
│   ├── test_engine.c           # 11 engine + agent tests
│   └── test_memory.c           # 12 memory tests
└── grammars/
    ├── tool_call.gbnf          # Tool calling grammar
    └── json.gbnf               # JSON output grammar

Documentation

Document      Description
ROADMAP.md    Strategic roadmap and execution plan
TRACKING.md   Iteration-by-iteration progress log
AGENTS.md     Instructions for AI coding agents
ARSENAL.md    Technology arsenal and market research

What NeuronOS Is Not

  • Not an inference speed benchmark. llama.cpp will always be faster. We optimize for agent utility.
  • Not a cloud service. Everything runs locally. Your data never leaves your device.
  • Not a Python framework. Pure C11, zero runtime dependencies. Compiles to a single binary.
  • Not a replacement for GPT-5. Ternary models have limits. We bring intelligence where frontier models can't reach: offline, embedded, private, free.

Contributing

We welcome contributions. Please read AGENTS.md for coding standards and architecture guidelines before submitting PRs.

All tests must pass before any commit:

./build/bin/test_hal && ./build/bin/test_engine && ./build/bin/test_memory

License

MIT License. See LICENSE for details.

SQLite is public domain. sqlite-vec is MIT/Apache-2.0.
