HyprStream is agentic cloud infrastructure for applications that learn, build, and run, integrating continuous development, training, integration, and deployment of software and AI/ML models. Its primary feature is an LLM inference and training engine built in Rust on PyTorch, with integrated training capabilities, version control, and secure tool use via microVM containers.
Users can communicate with open-weight and custom LLMs through Hyprstream's OpenAI-compatible API.
Getting started is easy: download the AppImage and it auto-detects your NVIDIA or ROCm GPU. See docs/quickstart.md for a full walkthrough.
- Frontend-ready: Use the included TUI for ease of use and share terminals with collaborators and agents.
- Collaborative: Multi-user, multi-agent interfaces through a high-speed compositing multiplexer.
- LLM Inference & Training: Supporting the dense Qwen3.5 and Qwen3 model architectures.
- Test Time Training: Models train models using MCP tools, test-time-training, and the Muon optimizer.
- Security-minded: Zero-trust cryptographic architecture with ZK stream proxies, Casbin Policy, and OpenID integration.
- Industry-compatible: Providing compatibility with OpenAI's OpenAPI specification.
- Hardware Accelerated: NVIDIA CUDA and AMD ROCm support, universal binary.
- Version Controlled: Manages source and weights with Git, compatible with HuggingFace.
- Systemd Integration - Optional user-level service management for background workers, long-running services, and containers.
- Powered by Torch: Built on the stable PyTorch C++ API (libtorch) using tch-rs.
- Workers - Isolated workload execution using Kata microvms with cloud-hypervisor.
- [Workflows] - Git workflow file support for local continuous integration, deployment, and functions-as-a-service.
- [Metrics] - Structured knowledge engine and time-series aggregation database powered by DuckDB, ADBC, and Flight.
Hyprstream requires git and git-lfs (available in all major Linux distros).
Download the Universal AppImage. We publish AppImages for each CPU/GPU configuration; the Universal image is recommended for ease-of-use and GPU auto-detection.
# Download and install (Universal recommended)
chmod +x hyprstream-v0.3.0-x86_64.AppImage
# Installer Path (v0.4.0+):
./hyprstream-v0.4.0-x86_64.AppImage wizard # add `-y` for autoinstall
# Manual path (< v0.4.0):
./hyprstream-v0.3.0-x86_64.AppImage service install
# Add to PATH
export PATH="$HOME/.local/bin:$PATH"
# Apply policy template (hyprstream is deny-by-default)
hyprstream quick policy apply-template local
hyprstream service start
See docs/quickstart.md for prerequisites, source build, and first-time setup.
NOTE: For CUDA systems, make sure you have installed CUDA Toolkit and set LD_PRELOAD:
systemctl --user set-environment LD_PRELOAD=libtorch_cuda.so && systemctl --user restart hyprstream-model
The installed files will be located in $HOME/.local/hyprstream/ and $HOME/.local/bin/.
# Set LIBTORCH to your libtorch path, or use --features download-libtorch
cargo build --release
See docs/quickstart.md for prerequisites and DEVELOP.md for detailed build instructions.
Hyprstream can run inside containers. See README-Docker.md for Docker/Kubernetes deployment.
Hyprstream supports Qwen3 model inference from Git repositories (HuggingFace, GitHub, etc.).
# Clone a model
hyprstream quick clone https://huggingface.co/Qwen/Qwen3-0.6B
# Clone with a custom name
hyprstream quick clone https://huggingface.co/Qwen/Qwen3-0.6B --name qwen3-small
Worktrees are automatically managed by hyprstream.
# List all cached models
hyprstream quick list
# Get detailed model information (model:branch format)
hyprstream quick info qwen3-small
hyprstream quick info qwen3-small:main
# Basic inference
hyprstream quick infer qwen3-small:main \
--prompt "Explain quantum computing in simple terms"
# With options
hyprstream quick infer qwen3-small:main \
--prompt "Write a Python function to sort a list" \
--temperature 0.7 \
--top-p 0.9 \
--max-tokens 1024
HyprStream provides an OpenAI-compatible API endpoint for easy integration with existing tools and libraries:
# Start API server
hyprstream server --port 6789
# List available models (worktree-based)
curl http://localhost:6789/oai/v1/models
# Example response shows models as model:branch format
# {
# "object": "list",
# "data": [
# {
# "id": "qwen3-small:main",
# "object": "model",
# "created": 1762974327,
# "owned_by": "system driver:overlay2, saved:2.3GB, age:2h cached"
# },
# {
# "id": "qwen3-small:experiment-1",
# "object": "model",
# "created": 1762975000,
# "owned_by": "system driver:overlay2, saved:1.8GB, age:30m"
# }
# ]
# }
# Make chat completions request (OpenAI-compatible)
# NOTE: Models must be referenced with branch (model:branch format)
curl -X POST http://localhost:6789/oai/v1/chat/completions \
-H "Content-Type: application/json" \
-d '{
"model": "qwen3-small:main",
"messages": [
{"role": "user", "content": "Hello, world!"}
],
"max_tokens": 100,
"temperature": 0.7
}'
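The same request can be assembled from Python's standard library. This is a minimal sketch mirroring the curl example above; `chat_completion_request` is a hypothetical helper name, and only the endpoint path and `model:branch` convention come from this document.

```python
import json
import urllib.request

def chat_completion_request(base_url, model, content,
                            max_tokens=100, temperature=0.7):
    """Build (but do not send) an OpenAI-style chat completions request.

    NOTE: models must be referenced with a branch, e.g. "qwen3-small:main".
    """
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": content}],
        "max_tokens": max_tokens,
        "temperature": temperature,
    }
    return urllib.request.Request(
        f"{base_url}/oai/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )

req = chat_completion_request("http://localhost:6789", "qwen3-small:main",
                              "Hello, world!")
# Send with urllib.request.urlopen(req) once the server is running.
```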
# Or use with any OpenAI-compatible client
export OPENAI_API_KEY="dummy"
export OPENAI_BASE_URL="http://localhost:6789/oai/v1"
# Now use any OpenAI client library
# Note: Specify model as "qwen3-small:main" not just "qwen3-small"
HyprStream uses Git worktrees for model management. The /v1/models endpoint lists all worktrees (not base models):
- Format: Models are always shown as `model:branch` (e.g., `qwen3-small:main`)
- Multiple Versions: Each worktree (branch) appears as a separate model
- Metadata: The `owned_by` field includes worktree metadata:
  - Storage driver (e.g., `driver:overlay2`)
  - Space saved via CoW (e.g., `saved:2.3GB`)
  - Worktree age (e.g., `age:2h`)
  - Cache status (`cached` if loaded in memory)
Example: If you have a model qwen3-small with branches main, experiment-1, and training, the API will list three separate entries:
- `qwen3-small:main`
- `qwen3-small:experiment-1`
- `qwen3-small:training`
This allows you to work with multiple versions of the same model simultaneously, each in its own worktree with isolated changes.
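As a sketch of consuming these listings, the `id` and `owned_by` fields from the example response above can be unpacked with a small stdlib-only helper. `parse_model_entry` is an illustrative name, and the `owned_by` layout is assumed to match the `system driver:... saved:... age:... cached` format shown earlier.

```python
def parse_model_entry(entry):
    """Split a /oai/v1/models entry into model, branch, and worktree metadata."""
    model, _, branch = entry["id"].partition(":")
    meta = {}
    tokens = entry.get("owned_by", "").split()
    # First token is the owner ("system"); the rest are key:value pairs,
    # except a bare "cached" flag marking models loaded in memory.
    for tok in tokens[1:]:
        if tok == "cached":
            meta["cached"] = True
        elif ":" in tok:
            key, _, value = tok.partition(":")
            meta[key] = value.rstrip(",")
    return {"model": model, "branch": branch, **meta}

entry = {
    "id": "qwen3-small:main",
    "object": "model",
    "created": 1762974327,
    "owned_by": "system driver:overlay2, saved:2.3GB, age:2h cached",
}
info = parse_model_entry(entry)
```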
HyprStream includes a built-in Model Context Protocol server that exposes inference, model management, and repository operations as tools for AI coding assistants.
1. Configure Claude Code:
claude mcp add --transport http hyprstream http://localhost:6790/mcp
2. Authenticate
Use /mcp, select hyprstream, and select Authenticate or Re-authenticate.
3. Available tools:
Once connected, Claude Code can use hyprstream tools directly:
| Tool | Description |
|---|---|
| `model.load` | Load a model for inference |
| `model.list` | List loaded models |
| `model.status` | Get model status and memory usage |
| `registry.list` | List all cloned repositories |
| `registry.clone` | Clone a model from HuggingFace/GitHub |
| `repo.*` | Branch, worktree, merge, and tag operations |
| `policy.*` | Policy checks and token management |
Configuration:
The MCP server listens on port 6790 by default. To change it, set in your hyprstream config:
[mcp]
host = "127.0.0.1"
http_port = 6790
Or configure via the OAI-compatible API on port 6789 for non-MCP clients.
HyprStream can be configured via environment variables with the HYPRSTREAM_ prefix:
# Server configuration
export HYPRSTREAM_SERVER_HOST=0.0.0.0
export HYPRSTREAM_SERVER_PORT=6789
export HYPRSTREAM_API_KEY=your-api-key
# CORS settings
export HYPRSTREAM_CORS_ENABLED=true
export HYPRSTREAM_CORS_ORIGINS="*"
# Model management
export HYPRSTREAM_PRELOAD_MODELS=model1,model2,model3
export HYPRSTREAM_MAX_CACHED_MODELS=5
export HYPRSTREAM_MODELS_DIR=/custom/models/path
# Performance tuning
export HYPRSTREAM_USE_MMAP=true
export HYPRSTREAM_GENERATION_TIMEOUT=120
Hyprstream implements layered security-in-depth:
| Layer | Technology | Purpose |
|---|---|---|
| Transport | CURVE encryption (TCP) | End-to-end encryption for TCP connections |
| Application | Ed25519 signed envelopes | Request authentication and integrity |
| Authorization | Casbin policy engine | RBAC/ABAC access control |
| Isolation | Kata Containers (optional) | VM-level workload isolation for workers |
All inter-service communication uses ZeroMQ with Cap'n Proto serialization:
- REQ/REP: Synchronous RPC calls (policy checks, model queries)
- PUB/SUB: Event streaming (sandbox lifecycle, training progress)
- XPUB/XSUB: Steerable proxy for event distribution
Every request is wrapped in a SignedEnvelope:
- Ed25519 signature over the request payload
- Nonce for replay protection
- Timestamp for clock skew validation
- Request identity (Local user, API token, Peer, or Anonymous)
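The nonce and timestamp checks described above can be sketched as pure logic. This is illustrative only, not HyprStream's implementation: the Ed25519 signature verification step is assumed to happen first and is out of scope here, and all names (`EnvelopeValidator`, `ReplayError`) are hypothetical.

```python
import time

class ReplayError(Exception):
    pass

class EnvelopeValidator:
    """Illustrative nonce-replay and clock-skew checks for a signed envelope."""

    def __init__(self, max_skew_secs=30.0):
        self.max_skew_secs = max_skew_secs
        self.seen_nonces = set()

    def validate(self, nonce, timestamp, now=None):
        now = time.time() if now is None else now
        # Reject envelopes whose timestamp is too far from local time.
        if abs(now - timestamp) > self.max_skew_secs:
            raise ReplayError("timestamp outside allowed clock skew")
        # Reject nonces we have already accepted (replay protection).
        if nonce in self.seen_nonces:
            raise ReplayError("nonce already used")
        self.seen_nonces.add(nonce)

v = EnvelopeValidator(max_skew_secs=30.0)
v.validate("nonce-1", timestamp=1000.0, now=1005.0)  # within skew, fresh nonce
```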
Services can run in multiple modes:
- Tokio task: In-process async execution
- Dedicated thread: For `!Send` types (e.g., tch-rs tensors)
- Subprocess: Isolated process with systemd or standalone backend
See docs/rpc-architecture.md for detailed RPC infrastructure documentation.
Quick Start:
# View current policy
hyprstream policy show
# Check if a user has permission
hyprstream policy check alice model:qwen3-small infer
# Create an API token
hyprstream policy token create \
--user alice \
--name "dev-token" \
--expires 30d \
--scope "model:*"
# Apply a built-in template -- allow all local users access to all actions on all resources
hyprstream policy apply-template local
Built-in Templates:
- `local` - Full access for local users (default)
- `public-inference` - Anonymous inference access
- `public-read` - Anonymous read-only registry access
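A deny-by-default check over (user, resource, action) triples can be sketched as follows. The rule format here is illustrative glob matching, not Casbin's actual model syntax; the `local_rules` triple merely approximates the allow-all semantics of the `local` template.

```python
from fnmatch import fnmatch

def is_allowed(rules, user, resource, action):
    """Deny-by-default: a request passes only if some rule matches it.

    Each rule is a (user, resource, action) pattern triple; "*" matches
    anything, and resource patterns like "model:*" use glob matching.
    """
    return any(
        fnmatch(user, ru) and fnmatch(resource, rr) and fnmatch(action, ra)
        for ru, rr, ra in rules
    )

# Rough approximation of the built-in "local" template: allow everything.
local_rules = [("*", "*", "*")]

# A narrower policy, e.g. an API token scoped to "model:*" inference.
token_rules = [("alice", "model:*", "infer")]
```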
Worker Resources (experimental):
| Resource | Description |
|---|---|
| `sandbox:*`, `sandbox:{id}` | Pod sandbox (Kata VM) operations |
| `container:*`, `container:{id}` | Container lifecycle within sandboxes |
| `image:*`, `image:{name}` | Image pull/push/list operations |
| `workflow:*`, `workflow:{path}` | Workflow execution (.github/workflows/*.yml) |
| `tool:*`, `tool:{name}` | MCP tool access (`tool:bash`, `tool:read_file`) |
Policy History & Rollback:
# View policy commit history
hyprstream policy history
# Compare draft vs running policy
hyprstream policy diff
# Rollback to previous version
hyprstream policy rollback HEAD~1
REST API Authentication:
# Create a token
hyprstream policy token create --user alice --name "my-token" --expires 1d
# Use with API requests
curl -H "Authorization: Bearer eyJ..." http://localhost:6789/v1/models
See docs/rpc-architecture.md for detailed RPC and service infrastructure documentation.
HyprStream supports OpenTelemetry for distributed tracing, enabled via the otel feature flag.
# Build with otel support
cargo build --features otel --release
# Combine with other features
cargo build --no-default-features --features tch-cuda,otel --release
| Environment Variable | Purpose | Default |
|---|---|---|
| `HYPRSTREAM_OTEL_ENABLE` | Enable/disable telemetry | `false` |
| `OTEL_EXPORTER_OTLP_ENDPOINT` | OTLP backend endpoint | `http://localhost:4317` |
| `OTEL_SERVICE_NAME` | Service name in traces | `hyprstream` |
| `HYPRSTREAM_LOG_DIR` | File logging directory | None (console only) |
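Resolving these variables with their table defaults might look like the following sketch; `otel_settings` is a hypothetical helper, not part of HyprStream, and only the variable names and defaults come from the table above.

```python
import os

def otel_settings(env=None):
    """Resolve telemetry settings using the documented defaults."""
    env = os.environ if env is None else env
    return {
        "enabled": env.get("HYPRSTREAM_OTEL_ENABLE", "false").lower() == "true",
        "endpoint": env.get("OTEL_EXPORTER_OTLP_ENDPOINT", "http://localhost:4317"),
        "service_name": env.get("OTEL_SERVICE_NAME", "hyprstream"),
        "log_dir": env.get("HYPRSTREAM_LOG_DIR"),  # None => console only
    }

settings = otel_settings({"HYPRSTREAM_OTEL_ENABLE": "true"})
```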
Local development (stdout exporter):
export HYPRSTREAM_OTEL_ENABLE=true
export RUST_LOG=hyprstream=debug
hyprstream server --port 6789
# Spans printed to console
Production (OTLP to Jaeger/Tempo):
export HYPRSTREAM_OTEL_ENABLE=true
export OTEL_EXPORTER_OTLP_ENDPOINT=http://jaeger:4317
export OTEL_SERVICE_NAME=hyprstream-prod
hyprstream server --port 6789
File logging (separate from OTEL):
export HYPRSTREAM_LOG_DIR=/var/log/hyprstream
hyprstream server --port 6789
# Creates daily-rotated logs at /var/log/hyprstream/hyprstream.log
- OTLP: Used automatically when running the `server` command; sends traces to backends like Jaeger, Tempo, or Datadog
- Stdout: Used for CLI commands; prints spans to console for debugging
If the Universal AppImage is not detecting your GPU, you may override the settings:
# List all available backends
./hyprstream-v0.2.0-x86_64.AppImage --list-backends
# Detect available backends
./hyprstream-v0.2.0-x86_64.AppImage --detect-gpu
# Override backend selection for Universal AppImage:
HYPRSTREAM_BACKEND=cuda130 ./hyprstream-v0.2.0-x86_64.AppImage server
- Operating System: Linux (x86_64, ARM64)
- Inference Service Requirements (optional):
- CPU: Full support (x86_64, ARM64)
- CUDA: NVIDIA host kernel modules (`nvidia-smi` works)
- ROCm: AMDGPU kernel modules and userland (`rocm-smi` works)
- Workers Service Requirements (optional, experimental):
- Nested Virtualization: The host running hyprstream-workers must support and enable nested virtualization; this may require a physical machine, a bare-metal VM, or appropriate QEMU/KVM configuration.
- 8GB+ RAM for inference, 16GB+ for training
- Optional Dependencies:
- `systemd` - For service management and worker process isolation
- `cloud-hypervisor` - For Kata container workers (experimental)
See CONTRIBUTING.md for guidelines.
This project uses a dual-licensing model:
AGPL-3.0 - The end-user experience and crates providing public APIs:
- `hyprstream` (main application)
- `hyprstream-metrics`
- `hyprstream-flight`
See LICENSE-AGPLV3 for details.
MIT - Library crates for broader reuse:
- `git2db` - Git repository management
- `gittorrent` - P2P git transport
- `git-xet-filter` - XET large file storage filter
- `cas-serve` - CAS server for XET over SSH
- `hyprstream-rpc` - RPC infrastructure
- `hyprstream-rpc-derive` - RPC derive macros
See LICENSE-MIT for details.
Built with:
- PyTorch - Deep learning framework
- tch - Rust bindings for PyTorch
- SafeTensors - Efficient tensor serialization
- Git2 - Git operations in Rust
- Tokio - Async runtime
- Casbin - Authorization library for policy engine
- Kata Containers - VM-based container isolation (experimental)
- cloud-hypervisor - Virtual machine monitor (experimental)
