
HyprStream: agentic infrastructure for continuously learning applications


Overview

HyprStream is agentic cloud infrastructure for applications that learn, build, and run, integrating continuous development, training, integration, and deployment of software and AI/ML models. Its primary feature is an LLM inference and training engine built in Rust on PyTorch, with integrated training capabilities, version control, and secure tool use via microVM containers.

Users can communicate with open-weight and custom LLMs through Hyprstream's OpenAI-compatible API.

Easy to get started: download the AppImage and it auto-detects your NVIDIA or ROCm GPU. See docs/quickstart.md for a full walkthrough.

Core Features

  • Frontend-ready: Use the included TUI for ease of use and share terminals with collaborators and agents.
  • Collaborative: Multi-user, multi-agent interfaces through a high-speed compositing multiplexer.
  • LLM Inference & Training: Supporting the dense Qwen3.5 and Qwen3 model architectures.
  • Test Time Training: Models train models using MCP tools, test-time-training, and the Muon optimizer.
  • Security-minded: Zero-trust cryptographic architecture with ZK stream proxies, Casbin Policy, and OpenID integration.
  • Industry-compatible: Provides an API compatible with OpenAI's specification.
  • Hardware Accelerated: NVIDIA CUDA and AMD ROCm support, universal binary.
  • Version Controlled: Manages source and weights with Git, compatible with HuggingFace.
  • Systemd Integration: Optional user-level service management for background workers, long-running services, and containers.
  • Powered by Torch: Built on stable PyTorch C++ API (libtorch) using tch-rs.

Experimental Features

  • Workers - Isolated workload execution using Kata microvms with cloud-hypervisor.
  • Workflows - Git workflow file support for local continuous integration, deployment, and functions-as-a-service.
  • Metrics - Structured knowledge engine and time-series aggregation database powered by DuckDB, ADBC, and Flight.

Installation

Quick Install (AppImage, Linux)

Hyprstream requires git and git-lfs (available in all major Linux distros).

Download the Universal AppImage. We publish AppImages for each CPU/GPU configuration; the Universal image is recommended for ease-of-use and GPU auto-detection.

# Make the downloaded AppImage executable (Universal recommended)
chmod +x hyprstream-v0.3.0-x86_64.AppImage

# Installer Path (v0.4.0+):

./hyprstream-v0.4.0-x86_64.AppImage wizard # add `-y` for autoinstall

# Manual path (v0.3.0 and earlier):
./hyprstream-v0.3.0-x86_64.AppImage service install

# Add to PATH
export PATH="$HOME/.local/bin:$PATH"

# Apply policy template (hyprstream is deny-by-default)
hyprstream quick policy apply-template local

hyprstream service start

See docs/quickstart.md for prerequisites, source build, and first-time setup.

NOTE: For CUDA systems, make sure you have installed CUDA Toolkit and set LD_PRELOAD:

systemctl --user set-environment LD_PRELOAD=libtorch_cuda.so && systemctl --user restart hyprstream-model

The installed files will be located in $HOME/.local/hyprstream/ and $HOME/.local/bin/.

Building from source

# Set LIBTORCH to your libtorch path, or use --features download-libtorch
cargo build --release

See docs/quickstart.md for prerequisites and DEVELOP.md for detailed build instructions.

Container deployment

Hyprstream can run inside containers. See README-Docker.md for Docker/Kubernetes deployment.

Quick Start

Clone a model

Hyprstream supports Qwen3 model inference from Git repositories (HuggingFace, GitHub, etc.).

# Clone a model
hyprstream quick clone https://huggingface.co/Qwen/Qwen3-0.6B

# Clone with a custom name
hyprstream quick clone https://huggingface.co/Qwen/Qwen3-0.6B --name qwen3-small

Managing models

Worktrees are automatically managed by hyprstream.

# List all cached models
hyprstream quick list

# Get detailed model information (model:branch format)
hyprstream quick info qwen3-small
hyprstream quick info qwen3-small:main

Run inference

# Basic inference
hyprstream quick infer qwen3-small:main \
    --prompt "Explain quantum computing in simple terms"

# With options
hyprstream quick infer qwen3-small:main \
    --prompt "Write a Python function to sort a list" \
    --temperature 0.7 \
    --top-p 0.9 \
    --max-tokens 1024

Architecture


Integrating Hyprstream into your business or workflow

OpenAI-Compatible REST API

HyprStream provides an OpenAI-compatible API endpoint for easy integration with existing tools and libraries:

# Start API server
hyprstream server --port 6789

# List available models (worktree-based)
curl http://localhost:6789/oai/v1/models

# Example response shows models as model:branch format
# {
#   "object": "list",
#   "data": [
#     {
#       "id": "qwen3-small:main",
#       "object": "model",
#       "created": 1762974327,
#       "owned_by": "system driver:overlay2, saved:2.3GB, age:2h cached"
#     },
#     {
#       "id": "qwen3-small:experiment-1",
#       "object": "model",
#       "created": 1762975000,
#       "owned_by": "system driver:overlay2, saved:1.8GB, age:30m"
#     }
#   ]
# }

# Make chat completions request (OpenAI-compatible)
# NOTE: Models must be referenced with branch (model:branch format)
curl -X POST http://localhost:6789/oai/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "qwen3-small:main",
    "messages": [
      {"role": "user", "content": "Hello, world!"}
    ],
    "max_tokens": 100,
    "temperature": 0.7
  }'

# Or use with any OpenAI-compatible client
export OPENAI_API_KEY="dummy"
export OPENAI_BASE_URL="http://localhost:6789/oai/v1"
# Now use any OpenAI client library
# Note: Specify model as "qwen3-small:main" not just "qwen3-small"
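The same call can be made from Python with the standard library alone. The chat_request() helper below is illustrative (not part of Hyprstream); it builds the OpenAI-style payload and applies the model:branch rule noted above.

```python
# Sketch: calling Hyprstream's OpenAI-compatible endpoint with stdlib only.
# BASE_URL matches the server started above; chat_request() is a hypothetical
# helper, not a Hyprstream API.
import json
import urllib.request

BASE_URL = "http://localhost:6789/oai/v1"

def chat_request(model: str, prompt: str, **params) -> urllib.request.Request:
    # Hyprstream expects worktree references; default to the main branch.
    if ":" not in model:
        model = f"{model}:main"
    body = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        **params,
    }
    return urllib.request.Request(
        f"{BASE_URL}/chat/completions",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )

if __name__ == "__main__":
    # Requires a running server: hyprstream server --port 6789
    req = chat_request("qwen3-small", "Hello, world!", max_tokens=100)
    with urllib.request.urlopen(req) as resp:
        print(json.load(resp)["choices"][0]["message"]["content"])
```

Because the helper normalizes bare model names to model:main, existing OpenAI client code only needs its base URL pointed at the Hyprstream endpoint.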

Worktree-Based Model References

HyprStream uses Git worktrees for model management. The /v1/models endpoint lists all worktrees (not base models):

  • Format: Models are always shown as model:branch (e.g., qwen3-small:main)
  • Multiple Versions: Each worktree (branch) appears as a separate model
  • Metadata: The owned_by field includes worktree metadata:
    • Storage driver (e.g., driver:overlay2)
    • Space saved via CoW (e.g., saved:2.3GB)
    • Worktree age (e.g., age:2h)
    • Cache status (cached if loaded in memory)

Example: If you have a model qwen3-small with branches main, experiment-1, and training, the API will list three separate entries:

  • qwen3-small:main
  • qwen3-small:experiment-1
  • qwen3-small:training

This allows you to work with multiple versions of the same model simultaneously, each in its own worktree with isolated changes.
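The worktree metadata is packed into a single owned_by string. A hypothetical parser for it, with the format inferred from the example /oai/v1/models response earlier (so treat the field names as illustrative, not an official schema):

```python
# Hypothetical helper: unpack the worktree metadata Hyprstream encodes in
# `owned_by`, e.g. "system driver:overlay2, saved:2.3GB, age:2h cached".
# Format inferred from the example response; not an official schema.
def parse_owned_by(owned_by: str) -> dict:
    tokens = [t.rstrip(",") for t in owned_by.split()]
    meta = {"owner": tokens[0], "cached": "cached" in tokens}
    for token in tokens[1:]:
        if ":" in token:
            key, value = token.split(":", 1)
            meta[key] = value  # e.g. driver -> overlay2, saved -> 2.3GB
    return meta
```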

MCP Integration (Claude Code, Cursor, etc.)

HyprStream includes a built-in Model Context Protocol server that exposes inference, model management, and repository operations as tools for AI coding assistants.

1. Configure Claude Code:

claude mcp add --transport http hyprstream http://localhost:6790/mcp

2. Authenticate

Use /mcp, select hyprstream, and select Authenticate or Re-authenticate.

3. Available tools:

Once connected, Claude Code can use hyprstream tools directly:

| Tool | Description |
|------|-------------|
| model.load | Load a model for inference |
| model.list | List loaded models |
| model.status | Get model status and memory usage |
| registry.list | List all cloned repositories |
| registry.clone | Clone a model from HuggingFace/GitHub |
| repo.* | Branch, worktree, merge, and tag operations |
| policy.* | Policy checks and token management |

Configuration:

The MCP server listens on port 6790 by default. To change it, set in your hyprstream config:

[mcp]
host = "127.0.0.1"
http_port = 6790

Or configure via the OAI-compatible API on port 6789 for non-MCP clients.

Advanced deployments

HyprStream can be configured via environment variables with the HYPRSTREAM_ prefix:

# Server configuration
export HYPRSTREAM_SERVER_HOST=0.0.0.0
export HYPRSTREAM_SERVER_PORT=6789
export HYPRSTREAM_API_KEY=your-api-key

# CORS settings
export HYPRSTREAM_CORS_ENABLED=true
export HYPRSTREAM_CORS_ORIGINS="*"

# Model management
export HYPRSTREAM_PRELOAD_MODELS=model1,model2,model3
export HYPRSTREAM_MAX_CACHED_MODELS=5
export HYPRSTREAM_MODELS_DIR=/custom/models/path

# Performance tuning
export HYPRSTREAM_USE_MMAP=true
export HYPRSTREAM_GENERATION_TIMEOUT=120

Security & Authentication

Hyprstream implements layered, defense-in-depth security:

Security Layers

| Layer | Technology | Purpose |
|-------|------------|---------|
| Transport | CURVE encryption (TCP) | End-to-end encryption for TCP connections |
| Application | Ed25519 signed envelopes | Request authentication and integrity |
| Authorization | Casbin policy engine | RBAC/ABAC access control |
| Isolation | Kata Containers (optional) | VM-level workload isolation for workers |

RPC Architecture

All inter-service communication uses ZeroMQ with Cap'n Proto serialization:

  • REQ/REP: Synchronous RPC calls (policy checks, model queries)
  • PUB/SUB: Event streaming (sandbox lifecycle, training progress)
  • XPUB/XSUB: Steerable proxy for event distribution

Every request is wrapped in a SignedEnvelope:

  • Ed25519 signature over the request payload
  • Nonce for replay protection
  • Timestamp for clock skew validation
  • Request identity (Local user, API token, Peer, or Anonymous)

Service Spawning

Services can run in multiple modes:

  • Tokio task: In-process async execution
  • Dedicated thread: For !Send types (e.g., tch-rs tensors)
  • Subprocess: Isolated process with systemd or standalone backend

See docs/rpc-architecture.md for detailed RPC infrastructure documentation.

Policy Engine

Quick Start:

# View current policy
hyprstream policy show

# Check if a user has permission
hyprstream policy check alice model:qwen3-small infer

# Create an API token
hyprstream policy token create \
  --user alice \
  --name "dev-token" \
  --expires 30d \
  --scope "model:*"

# Apply a built-in template -- allow all local users access to all actions on all resources
hyprstream policy apply-template local

Built-in Templates:

  • local - Full access for local users (default)
  • public-inference - Anonymous inference access
  • public-read - Anonymous read-only registry access

Worker Resources (experimental):

| Resource | Description |
|----------|-------------|
| sandbox:*, sandbox:{id} | Pod sandbox (Kata VM) operations |
| container:*, container:{id} | Container lifecycle within sandboxes |
| image:*, image:{name} | Image pull/push/list operations |
| workflow:*, workflow:{path} | Workflow execution (.github/workflows/*.yml) |
| tool:*, tool:{name} | MCP tool access (tool:bash, tool:read_file) |
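The * patterns above are wildcard matches; as a rough illustration of how a scope like tool:* covers tool:bash (the real matching is done by the Casbin policy engine, whose matchers are more expressive than this):

```python
# Illustrative only: glob-style matching of policy resource patterns.
# Actual enforcement is Casbin's matcher, not fnmatch.
from fnmatch import fnmatch

def resource_matches(pattern: str, resource: str) -> bool:
    return fnmatch(resource, pattern)
```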

Policy History & Rollback:

# View policy commit history
hyprstream policy history

# Compare draft vs running policy
hyprstream policy diff

# Rollback to previous version
hyprstream policy rollback HEAD~1

REST API Authentication:

# Create a token
hyprstream policy token create --user alice --name "my-token" --expires 1d

# Use with API requests
curl -H "Authorization: Bearer eyJ..." http://localhost:6789/v1/models

See docs/rpc-architecture.md for detailed RPC and service infrastructure documentation.

Telemetry & Observability

HyprStream supports OpenTelemetry for distributed tracing, enabled via the otel feature flag.

Building with OpenTelemetry

# Build with otel support
cargo build --features otel --release

# Combine with other features
cargo build --no-default-features --features tch-cuda,otel --release

OpenTelemetry Configuration

| Environment Variable | Purpose | Default |
|----------------------|---------|---------|
| HYPRSTREAM_OTEL_ENABLE | Enable/disable telemetry | false |
| OTEL_EXPORTER_OTLP_ENDPOINT | OTLP backend endpoint | http://localhost:4317 |
| OTEL_SERVICE_NAME | Service name in traces | hyprstream |
| HYPRSTREAM_LOG_DIR | File logging directory | None (console only) |

Usage Examples

Local development (stdout exporter):

export HYPRSTREAM_OTEL_ENABLE=true
export RUST_LOG=hyprstream=debug
hyprstream server --port 6789
# Spans printed to console

Production (OTLP to Jaeger/Tempo):

export HYPRSTREAM_OTEL_ENABLE=true
export OTEL_EXPORTER_OTLP_ENDPOINT=http://jaeger:4317
export OTEL_SERVICE_NAME=hyprstream-prod
hyprstream server --port 6789

File logging (separate from OTEL):

export HYPRSTREAM_LOG_DIR=/var/log/hyprstream
hyprstream server --port 6789
# Creates daily-rotated logs at /var/log/hyprstream/hyprstream.log

Exporter Modes

  • OTLP: Used automatically when running server command; sends traces to backends like Jaeger, Tempo, or Datadog
  • Stdout: Used for CLI commands; prints spans to console for debugging

Debugging GPU detection issues:

If the Universal AppImage is not detecting your GPU, you may override the settings:

# List all available backends
./hyprstream-v0.2.0-x86_64.AppImage --list-backends

# Detect available backends
./hyprstream-v0.2.0-x86_64.AppImage --detect-gpu

# Override backend selection for Universal AppImage:
HYPRSTREAM_BACKEND=cuda130 ./hyprstream-v0.2.0-x86_64.AppImage server

System Requirements

  • Operating System: Linux (x86_64, ARM64)
  • Inference Service Requirements (optional):
    • CPU: Full support (x86_64, ARM64)
    • CUDA: NVIDIA host kernel modules (nvidia-smi works)
    • ROCm: AMDGPU kernel modules and userland (rocm-smi works)
  • Workers Service Requirements (optional, experimental):
    • Nested Virtualization: The host running hyprstream-workers must support nested virtualization and have it enabled; this may require a physical machine, a bare-metal VM, or appropriate QEMU/KVM configuration.
  • 8GB+ RAM for inference, 16GB+ for training
  • Optional Dependencies:
    • systemd - For service management and worker process isolation
    • cloud-hypervisor - For Kata container workers (experimental)

Contributing

See CONTRIBUTING.md for guidelines.

License

This project uses a dual-licensing model:

AGPL-3.0 - The end-user experience and crates providing public APIs:

  • hyprstream (main application)
  • hyprstream-metrics
  • hyprstream-flight

See LICENSE-AGPLV3 for details.

MIT - Library crates for broader reuse:

  • git2db - Git repository management
  • gittorrent - P2P git transport
  • git-xet-filter - XET large file storage filter
  • cas-serve - CAS server for XET over SSH
  • hyprstream-rpc - RPC infrastructure
  • hyprstream-rpc-derive - RPC derive macros

See LICENSE-MIT for details.

