RUX - AI Orchestration Engine

A local AI orchestration backend that turns natural language into safe, persistent state changes with validation, observability, feedback, and critique.

Architecture Snapshot

flowchart TD
    U["User Message"] --> P["Planner"]
    P --> E["Executor / Trust Boundary"]
    E --> T["Tool Adapters"]
    T --> D["Domain Services"]
    D --> R["Repositories"]
    R --> DB["PostgreSQL"]

    E --> O["Observability"]
    E --> C["Confidence + Critic"]

    D --> X["Expense Domain"]
    D --> Y["Project Domain"]
    D --> Z["Future Memory / Knowledge"]

Status

RUX is under active development and is currently being refactored toward a more modular domain-based architecture.

What RUX Is

Most toy agents look like this:

LLM -> tool -> response

RUX is built around a stronger runtime contract:

User
 -> Planner
 -> Executor (trust boundary)
 -> Tool Adapter
 -> Domain Service
 -> Repository
 -> PostgreSQL
 -> Observability / Outcome Tracking / Critique / Confidence
 -> Final Response

The core idea is simple: the LLM is not trusted. Anything before the executor is probabilistic. Anything after schema validation is expected to be deterministic, auditable, and safe to reason about.

Key Design Decisions

1. The Trust Boundary

The Executor is where trust is established. LLM output is treated as untrusted input and must pass schema validation with extra="forbid" before any tool is called. This catches hallucinated field names, invented action types, and malformed JSON before they reach domain logic.

LLM output         -> untrusted - can hallucinate anything
Executor (schema)  -> trust boundary
Tool onward        -> deterministic, validated, safe
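
As a minimal sketch of that boundary, assuming Pydantic v2: the model names (`PlannedAction`, `LogExpenseParams`), the `log_expense` action, and the `validate_plan` helper are hypothetical and only illustrate how `extra="forbid"` rejects output the schema does not describe.

```python
import json
from typing import Literal, Optional

from pydantic import BaseModel, ConfigDict, ValidationError


class LogExpenseParams(BaseModel):
    # extra="forbid" rejects any field the schema does not declare.
    model_config = ConfigDict(extra="forbid")

    amount: float
    category: str
    note: Optional[str] = None


class PlannedAction(BaseModel):
    model_config = ConfigDict(extra="forbid")

    # Only known action types are accepted; anything else fails validation.
    action: Literal["log_expense"]
    params: LogExpenseParams


def validate_plan(raw_llm_output: str) -> Optional[PlannedAction]:
    """Return a validated action, or None if the output cannot be trusted."""
    try:
        payload = json.loads(raw_llm_output)
    except json.JSONDecodeError:
        return None  # malformed JSON never reaches a tool
    try:
        return PlannedAction.model_validate(payload)
    except ValidationError:
        return None  # hallucinated fields or invented actions stop here
```

Returning `None` keeps the sketch short; a real executor would more likely keep the validation error so it can be logged and fed back to the planner.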

2. Why the Planner Doesn't Call Tools Directly

LLMs are probabilistic. Tools are deterministic, state-mutating, and potentially destructive. Mixing these responsibilities makes the system harder to test, harder to reason about, and much easier to break.

Planner  -> intent extraction only
Executor -> structural validation
Tool     -> domain gateway
Service  -> business rules
Memory   -> persistence
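
The split can be sketched roughly as follows. Everything here is illustrative rather than RUX's actual code: the class names, the in-memory repository standing in for the database, and the single budget rule are all assumptions.

```python
from dataclasses import dataclass, field


@dataclass
class InMemoryExpenseRepository:
    """Stand-in for a database-backed repository (hypothetical)."""
    rows: list = field(default_factory=list)

    def add(self, user_id: str, amount: float, category: str) -> None:
        self.rows.append((user_id, amount, category))

    def total_for(self, user_id: str) -> float:
        return sum(amount for uid, amount, _ in self.rows if uid == user_id)


class BudgetExceeded(Exception):
    """Raised when an expense would push the user past the budget."""


@dataclass
class ExpenseService:
    """Business rules live here, not in the tool or the planner."""
    repo: InMemoryExpenseRepository
    budget: float

    def log_expense(self, user_id: str, amount: float, category: str) -> float:
        if amount <= 0:
            raise ValueError("amount must be positive")
        if self.repo.total_for(user_id) + amount > self.budget:
            raise BudgetExceeded(f"budget of {self.budget} would be exceeded")
        self.repo.add(user_id, amount, category)
        return self.repo.total_for(user_id)


def log_expense_tool(service: ExpenseService, user_id: str, params: dict) -> dict:
    """Thin adapter: translate validated params into one service call."""
    try:
        total = service.log_expense(user_id, params["amount"], params["category"])
    except (ValueError, BudgetExceeded) as exc:
        return {"status": "error", "message": str(exc)}
    return {"status": "ok", "message": f"logged; total is {total:.2f}"}
```

Because the adapter contains no rules of its own, the service can be tested without any LLM in the loop, and the planner can be tested without touching state.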

3. Three-Layer Planner

Not everything should reach the LLM.

Layer 1 -> greeting keywords -> instant deterministic reply
Layer 2 -> action intent     -> LLM extracts structured JSON
Layer 3 -> open question     -> LLM responds conversationally

This also protects the integrity of confidence scores. In an earlier version, greeting-like inputs could accidentally flow into action logic and pollute the outcome history.
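
One way to picture the routing step is shown below. This is a sketch under assumptions: the keyword sets and the `route_message` function are hypothetical, and the source does not say how RUX actually detects action intent.

```python
import re
from enum import Enum


class Route(Enum):
    GREETING = "greeting"          # Layer 1: deterministic reply, no LLM call
    ACTION = "action"              # Layer 2: LLM extracts structured JSON
    CONVERSATION = "conversation"  # Layer 3: LLM answers conversationally


# Hypothetical keyword lists; a real planner would tune or replace these.
GREETINGS = {"hi", "hello", "hey", "thanks", "bye"}
ACTION_HINTS = {"log", "spent", "add", "create", "delete", "budget"}


def route_message(message: str) -> Route:
    words = re.findall(r"[a-z']+", message.lower())
    # Layer 1 only fires when the whole message is greeting words, so
    # "hi, log 20 for lunch" still reaches the action layer.
    if words and all(word in GREETINGS for word in words):
        return Route.GREETING
    if any(word in ACTION_HINTS for word in words):
        return Route.ACTION
    return Route.CONVERSATION
```

Only messages routed to `Route.ACTION` would produce outcome records, which is what keeps greetings out of the confidence history.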

4. Confidence from Data, Not from the LLM

Asking an LLM how confident it is usually produces weak signals. RUX is designed to calculate confidence from real historical outcomes:

-- Assumes was_correct is stored as 0/1; if it is BOOLEAN, use AVG(was_correct::int).
SELECT domain, task_type,
       COUNT(*)         AS samples,
       AVG(was_correct) AS accuracy
FROM agent_outcomes
WHERE user_id = :user_id
GROUP BY domain, task_type;

Confidence should only surface when there is enough history to justify it. Otherwise the system should return something like "Confidence: insufficient data" instead of fabricating certainty.
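
The gating rule could look something like this. It is a sketch, not RUX's implementation: `MIN_SAMPLES`, `confidence_label`, and the demo data are hypothetical, and `sqlite3` stands in for PostgreSQL so the example runs without a server (with `was_correct` stored as 0/1).

```python
import sqlite3

MIN_SAMPLES = 5  # hypothetical threshold, not a value taken from RUX

QUERY = """
SELECT COUNT(*) AS samples, AVG(was_correct) AS accuracy
FROM agent_outcomes
WHERE user_id = :user_id AND domain = :domain AND task_type = :task_type
"""


def confidence_label(conn: sqlite3.Connection, user_id: str,
                     domain: str, task_type: str) -> str:
    params = {"user_id": user_id, "domain": domain, "task_type": task_type}
    samples, accuracy = conn.execute(QUERY, params).fetchone()
    # With too little history, say so instead of reporting a number.
    if samples < MIN_SAMPLES:
        return "Confidence: insufficient data"
    return f"Confidence: {accuracy:.0%} ({samples} samples)"


def demo_connection() -> sqlite3.Connection:
    """Build an in-memory table with made-up outcomes for the demo."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE agent_outcomes "
        "(user_id TEXT, domain TEXT, task_type TEXT, was_correct INTEGER)"
    )
    rows = [("u1", "expense", "log_expense", 1)] * 6
    rows += [("u1", "expense", "log_expense", 0)] * 2
    rows += [("u1", "project", "create_project", 1)] * 2
    conn.executemany("INSERT INTO agent_outcomes VALUES (?, ?, ?, ?)", rows)
    return conn
```

With the demo data, the expense task has eight samples and reports a percentage, while the project task has only two and falls back to the "insufficient data" message.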

5. Critic Uses a Different Model

If the Planner and Critic use the same model, the Critic tends to agree with the original reasoning too easily. The idea behind RUX is that critique should be structurally independent, so the second opinion can challenge the first instead of just echoing it.
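
A rough sketch of enforcing that independence in code follows. The `ModelClient` wrapper, the APPROVE/REJECT protocol, and the prompt wording are assumptions made for illustration; the source does not describe how the Critic is invoked.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class ModelClient:
    """A named model plus a prompt -> text callable (hypothetical interface)."""
    name: str
    complete: Callable[[str], str]


def plan_and_review(planner: ModelClient, critic: ModelClient, message: str) -> dict:
    # Refuse to run if both roles point at the same model.
    if planner.name == critic.name:
        raise ValueError("critic must use a different model than the planner")
    plan = planner.complete(message)
    # The critic sees the request and the plan, not the planner's reasoning.
    verdict = critic.complete(
        f"User asked: {message}\nProposed plan: {plan}\nReply APPROVE or REJECT."
    )
    approved = verdict.strip().upper().startswith("APPROVE")
    return {"plan": plan, "approved": approved}
```

In tests, both clients can be plain functions returning canned strings, so the review flow can be exercised without a model server.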

Core Ideas

Trust boundary: planner output is treated as untrusted until it passes schema validation.
Thin tools: tools translate validated params into domain service calls.
Domain-first structure: business behavior lives inside domains, not inside runtime glue.
Observable execution: runs and outcomes are logged for inspection and feedback.
Confidence from history: confidence is derived from past correctness, not model self-reported certainty.
Critique as a second layer: decisions can be reviewed independently instead of trusting a single model pass.

Current Domains

Expense: expense logging, budget enforcement, spend analysis
Project: project creation and deletion flows
In progress: modular runtime cleanup, hybrid memory design, future knowledge layer

Current Structure

rux/
├── api/                # FastAPI routes
├── core/               # current runtime layer
├── domains/
│   ├── expense/
│   └── project/
├── repositories/       # shared persistence adapters
├── services/           # shared services + some legacy files
├── memory/             # legacy memory path, planned for refactor
├── tests/
├── database.py
├── models.py
└── main.py

Response Model

RUX is moving toward a shared internal tool contract:

ToolResponse.status
ToolResponse.message
ToolResponse.data
ToolResponse.error
ToolResponse.metadata

This makes tool execution easier to validate, log, test, and later route cleanly through the executor.
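
The five fields above might be modeled along these lines. This sketch uses a plain dataclass to stay dependency-free (a Pydantic model would work equally well); the field types, the `"success"`/`"error"` status values, and the `ok`/`fail` helpers are assumptions, since the source only names the fields.

```python
from dataclasses import dataclass, field
from typing import Any, Literal, Optional


@dataclass(frozen=True)
class ToolResponse:
    status: Literal["success", "error"]  # assumed status values
    message: str                         # human-readable summary
    data: Optional[dict[str, Any]] = None
    error: Optional[str] = None          # machine-readable error code
    metadata: dict[str, Any] = field(default_factory=dict)

    @classmethod
    def ok(cls, message: str, data: Optional[dict[str, Any]] = None,
           **metadata: Any) -> "ToolResponse":
        return cls(status="success", message=message, data=data, metadata=metadata)

    @classmethod
    def fail(cls, message: str, error: str, **metadata: Any) -> "ToolResponse":
        return cls(status="error", message=message, error=error, metadata=metadata)
```

A tool would then return `ToolResponse.ok("Expense logged", {"id": 1}, tool="log_expense")` on success, giving the executor one shape to validate and log regardless of which domain produced it.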

Tech Stack

Python
FastAPI
SQLAlchemy async ORM
PostgreSQL
Pydantic
Local LLM serving via LM Studio

Setup

# Clone
git clone https://github.com/rahulT-17/RUX-AI-Companion.git
cd RUX-AI-Companion

# Create virtual environment
python -m venv .venv

# Activate (PowerShell)
.\.venv\Scripts\Activate.ps1
# On macOS/Linux, activate with: source .venv/bin/activate

# Install dependencies
python -m pip install -r requirements.txt

# Initialize database tables
python init_db.py

# Run the API
python -m uvicorn main:app --reload

What Works Now

planner -> executor -> domain tool flow
expense logging and budget enforcement
project creation and deletion
database-backed persistence
execution logging and feedback-oriented infrastructure
smoke tests for expense and project tool adapters

Roadmap

Make the executor consume ToolResponse end-to-end
Unify confirmed actions with the normal execution pipeline
Remove legacy duplicated service/repository files
Build hybrid memory: short-term, episodic, semantic retrieval
Add a knowledge layer for reusable facts, concepts, and sources
Improve deployment and production config hygiene

Why I Built This

I built RUX to understand what actually breaks in AI agent systems when you move past demos: unreliable tool calls, weak trust boundaries, missing feedback loops, and no real way to measure correctness over time.

The goal is not to build another chatbot wrapper. The goal is to build the runtime layer underneath an AI agent system: validation, orchestration, observability, critique, and eventually memory and knowledge.

Built as a learning project. Actively evolving.
