AIaaS (AI as a Service) — Create AI projects and consume them via a simple REST API.
Demo: https://ai.ince.pt — Username: demo Password: demo
```bash
git clone https://github.com/apocas/restai && cd restai
make install
make dev  # → http://localhost:9000/admin (admin / admin)
```

Or with Docker:

```bash
docker compose --env-file .env up --build
```

- Multi-project AI platform — RAG, Agents, Routers, SQL-to-NL, Block (visual logic), and Inference in one place
- Full Web UI included — React dashboard with analytics, not just an API
- Any LLM — OpenAI, Anthropic, Ollama, Gemini, Groq, LiteLLM, vLLM, Azure, and more
- Feature complete — Teams, RBAC, OAuth/LDAP, token tracking, Kubernetes-native
- Extensible tools — MCP (Model Context Protocol) for unlimited agent integrations
- Token tracking & cost analytics — built-in dashboard with daily usage, per-project costs, and top LLM charts
Track token usage, costs, and project activity from a centralized dashboard. Daily charts, top projects, and LLM distribution at a glance.
Create and manage AI projects. Each project has its own LLM, system prompt, tools, and configuration. Test instantly in the built-in chat playground.
Upload documents and query them with LLM-powered retrieval. Supports multiple vector stores, reranking (ColBERT / LLM-based), sandboxed mode to reduce hallucination, and evaluation via deepeval.
Connect a MySQL or PostgreSQL database — RESTai crawls the schema and translates natural language questions into SQL queries automatically.
Zero-shot ReAct agents with built-in tools and MCP (Model Context Protocol) server support for extensible tool access. Connect any MCP-compatible server via HTTP/SSE or stdio.
Direct LLM chat and completion. Supports sending images alongside text using any vision-capable model (LLaVA, Gemini, GPT-4o, etc.).
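An image is sent as an inline base64 data URL inside the standard OpenAI vision message shape, which vision-capable models accept through the OpenAI-compatible `/v1/chat/completions` endpoint. A minimal sketch — the helper name `vision_message` is ours, not part of RESTai:

```python
import base64

def vision_message(text: str, image_path: str) -> dict:
    """Build an OpenAI-style chat message mixing text and an inline base64 image."""
    with open(image_path, "rb") as f:
        encoded = base64.b64encode(f.read()).decode()
    return {
        "role": "user",
        "content": [
            {"type": "text", "text": text},
            {"type": "image_url",
             "image_url": {"url": f"data:image/jpeg;base64,{encoded}"}},
        ],
    }
```

Pass the result as one entry of `messages` when calling a vision-capable model such as `gpt-4o`.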
Routes queries to the most suitable project automatically. Similar to a zero-shot ReAct strategy, but each route is a project scored by relevance.
Build processing logic visually using a Blockly-based IDE — no LLM required. Drag-and-drop blocks to define how input is transformed into output. Use the "Call Project" block to invoke other RESTai projects, enabling composition of AI pipelines without writing code.
Supported blocks: text operations, math, logic, variables, loops, and custom RESTai blocks (Get Input, Set Output, Call Project, Log).
Local and remote image generators are loaded dynamically. Supports Stable Diffusion, Flux, DALL-E, RMBG2, and more.
RESTai automatically detects NVIDIA GPUs at startup and displays detailed hardware information in the admin settings — model name, VRAM, temperature, utilization, power draw, driver and CUDA versions. GPU support is auto-enabled when hardware is detected, or can be toggled manually.
make install also detects GPUs automatically and installs GPU dependencies when available.
Use LLMs, image generators, and audio transcription directly via OpenAI-compatible API endpoints — no project required. Team-level permissions control which models each user can access, and all usage counts toward team budgets.
Supported endpoints:
- `POST /v1/chat/completions` — Chat with any LLM (streaming supported)
- `POST /v1/images/generations` — Generate images via DALL-E, Flux, Stable Diffusion, etc.
- `POST /v1/audio/transcriptions` — Transcribe audio files
Works with any OpenAI-compatible SDK:
```python
from openai import OpenAI

client = OpenAI(base_url="http://localhost:9000/v1", api_key="YOUR_API_KEY")

response = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Hello!"}],
)
```

Each team has its own users, admins, projects, and LLM/embedding access controls — including image and audio generator permissions. Users can belong to multiple teams.
Connect any project to Telegram via BotFather. Messages are processed through the project's chat pipeline and responses are sent back automatically.
White-label the UI, configure currency for cost tracking, set agent iteration limits, manage LLM proxy, and more.
Any LLM provider supported by LlamaIndex. Each model has a configurable context window with automatic chat memory management — older messages are summarized rather than dropped.
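The actual memory handling comes from LlamaIndex; purely as an illustration of the policy (fold old turns into a summary instead of dropping them), a minimal sketch with hypothetical names:

```python
def trim_history(messages: list[str], summarize, max_messages: int = 8) -> list[str]:
    """Keep the newest messages; fold older ones into a single summary entry."""
    if len(messages) <= max_messages:
        return messages
    old, recent = messages[:-max_messages], messages[-max_messages:]
    # summarize() would normally be an LLM call over the dropped turns.
    return [f"[summary] {summarize(old)}"] + recent
```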
| Provider | Class | Type |
|---|---|---|
| Ollama | Ollama / OllamaMultiModal | chat / vision |
| OpenAI | OpenAI | chat |
| Anthropic | Anthropic | chat |
| Google Gemini | Gemini / GeminiMultiModal | chat / vision |
| Groq | Groq | chat |
| Grok (xAI) | Grok | chat |
| LiteLLM | LiteLLM | chat |
| vLLM | vLLM | chat |
| Azure OpenAI | AzureOpenAI | chat |
| OpenAI-Compatible | OpenAILike | chat |
- Backend: FastAPI · SQLAlchemy · LlamaIndex · Alembic
- Frontend: React 18 · MUI v5 · Redux Toolkit
- Vector Stores: ChromaDB · PGVector · Weaviate · Pinecone
- Databases: SQLite (default) · PostgreSQL · MySQL
- Package Manager: uv
All endpoints are documented via Swagger and the API reference.
Create a project:

```bash
curl -X POST http://localhost:9000/projects \
  -u admin:admin \
  -H 'Content-Type: application/json' \
  -d '{
    "name": "my-rag",
    "type": "rag",
    "llm": "gpt-4o",
    "embeddings": "text-embedding-3-small",
    "vectorstore": "chroma"
  }'
```

Chat with a project:

```bash
curl -X POST http://localhost:9000/projects/my-rag/chat \
  -u admin:admin \
  -H 'Content-Type: application/json' \
  -d '{"message": "What is RESTai?"}'
```

RESTai uses uv for dependency management. Python 3.11+ required.
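The same two calls can be made from Python with only the standard library; credentials, project name, and payloads mirror the curl examples above:

```python
import base64
import json
import urllib.request

BASE = "http://localhost:9000"
AUTH = "Basic " + base64.b64encode(b"admin:admin").decode()

def api_post(path: str, payload: dict) -> dict:
    """POST JSON to the RESTai API with HTTP basic auth and decode the reply."""
    req = urllib.request.Request(
        BASE + path,
        data=json.dumps(payload).encode(),
        headers={"Content-Type": "application/json", "Authorization": AUTH},
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)

if __name__ == "__main__":
    api_post("/projects", {
        "name": "my-rag", "type": "rag", "llm": "gpt-4o",
        "embeddings": "text-embedding-3-small", "vectorstore": "chroma",
    })
    print(api_post("/projects/my-rag/chat", {"message": "What is RESTai?"}))
```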
```bash
make install  # Install deps, initialize DB, build frontend
make dev      # Development server with hot reload (port 9000)
make start    # Production server (4 workers, port 9000)
```

Default credentials: admin / admin (configurable via RESTAI_DEFAULT_PASSWORD).
```bash
# Edit .env with your configuration, then:
docker compose --env-file .env up --build
```

Optional profiles for additional services:

```bash
docker compose --env-file .env --profile redis up --build     # + Redis
docker compose --env-file .env --profile postgres up --build  # + PostgreSQL
docker compose --env-file .env --profile mysql up --build     # + MySQL
```

A Helm chart is provided in chart/restai/.
```bash
helm install restai chart/restai/ \
  --set config.database.postgres.host=my-postgres \
  --set secrets.postgresPassword=mypassword
```

For production with multiple replicas, set fixed secrets for JWT and encryption:

```bash
helm install restai chart/restai/ \
  --set config.database.postgres.host=postgres \
  --set secrets.postgresPassword=mypassword \
  --set secrets.authSecret=$(openssl rand -base64 48) \
  --set secrets.ssoSecretKey=$(openssl rand -base64 48) \
  --set secrets.fernetKey=$(python -c 'from cryptography.fernet import Fernet; print(Fernet.generate_key().decode())')
```

See chart/restai/ for full Helm values and configuration options.
No state stored in the RESTai service — ideal for horizontal scaling.
Direct interaction with the GPU layer — ideal for small deployments.
| Variable | Description | Default |
|---|---|---|
| RESTAI_DEFAULT_PASSWORD | Admin user password | admin |
| RESTAI_DEV | Enable dev mode with hot reload | false |
| RESTAI_GPU | Enable GPU features (image gen) | auto-detected |
| OPENAI_API_KEY | OpenAI API key | — |
| ANTHROPIC_API_KEY | Anthropic API key | — |
| POSTGRES_HOST | Use PostgreSQL instead of SQLite | — |
| MYSQL_HOST | Use MySQL instead of SQLite | — |
| REDIS_HOST / REDIS_PORT | Redis for persistent chat history | — |
| CHROMADB_HOST / CHROMADB_PORT | Remote ChromaDB for vector storage | — |
| MCP_SERVER | Enable MCP server endpoint | — |
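A minimal `.env` sketch combining some of the variables above (all values are illustrative):

```ini
RESTAI_DEFAULT_PASSWORD=change-me
RESTAI_DEV=false
OPENAI_API_KEY=sk-yourkey
POSTGRES_HOST=postgres
REDIS_HOST=redis
REDIS_PORT=6379
```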
Full configuration in restai/config.py.
Contributions are welcome! Please open an issue or submit a pull request.
```bash
make dev     # Run dev server
pytest tests # Run tests
make code    # Format with black
```

Pedro Dias - @pedromdias
Licensed under the Apache License, Version 2.0. See LICENSE for details.