Invite Beta
Policy routing + market settlement

Market-grade routing for AI inference

One OpenAI-compatible endpoint for the best available models. Route by policy, match live seller supply, and keep pricing, reliability, and settlement visible from the first request.

OpenAI-compatible
Live best ask
Transparent reserve

AI model endpoints

24+

DeepSeek, Qwen, GLM, Kimi, MiniMax and more.

Buyer routing modes

3x

Cost-first, balanced, and reliability-first profiles.

OpenAI-compatible API

100%

One base URL for chat, embeddings, audio, image, and video.

Settlement stream

1 ledger

Route attempts, fills, fees, reserve, and payouts in one flow.

policy-route.py
matching engine online
from openai import OpenAI

client = OpenAI(
    base_url="https://dragonfly-api.com/v1",
    api_key="sk-df-your-key",
    default_headers={
        "x-routing-profile": "balanced",
    }
)

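stream = client.chat.completions.create(
    model="deepseek/deepseek-chat",
    messages=[{"role": "user", "content": "Route this to the best live ask."}],
    stream=True,
)

# price, seller eligibility, and settlement remain visible
# With stream=True the call returns an iterator of chunks, so print each delta as it arrives.
for chunk in stream:
    if chunk.choices and chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end="", flush=True)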
route result: filled

best ask selected

Buyer policy matched against live seller capacity with price, latency, and trust filters.

settlement: visible

fee + reserve tracked

Route attempt, payout, reserve, and platform margin stay in one ledger instead of disappearing into a black box.

Market board

Live best-ask pricing, not a static catalog.

Use Dragonfly like a developer console, but keep the market visible: current best input/output asks, seller depth, and board freshness.

Snapshot mode

Last refresh: waiting for first board fetch

board state

snapshot

visible models

4

active sellers

board://top-of-book
fallback reference
USD per million tokens
Product surface

A routing console first, a marketing site second.

Dragonfly should feel like an execution venue for developers: pricing visible, policy explicit, and operations traceable.

Matching

Aggregated routing

Demand is matched against listed supply with transparent best-ask pricing per model instead of hand-wavy provider switching.
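
As a rough illustration of best-ask matching, the sketch below picks the cheapest listing for a model that still has free capacity. The Listing fields, seller names, and prices are hypothetical and are not Dragonfly's actual schema or data.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Listing:
    seller: str               # hypothetical seller id
    model: str
    ask_usd_per_mtok: float   # ask price in USD per million tokens (illustrative)
    capacity: int             # free request slots right now (assumed field)

def best_ask(listings, model):
    """Return the cheapest listing for `model` that still has capacity, or None."""
    eligible = [l for l in listings if l.model == model and l.capacity > 0]
    return min(eligible, key=lambda l: l.ask_usd_per_mtok, default=None)

book = [
    Listing("seller-a", "deepseek/deepseek-chat", 0.30, 4),
    Listing("seller-b", "deepseek/deepseek-chat", 0.27, 0),  # cheapest, but no capacity
    Listing("seller-c", "deepseek/deepseek-chat", 0.28, 2),
]
print(best_ask(book, "deepseek/deepseek-chat"))
```

Here seller-b has the lowest ask but no capacity, so the match goes to seller-c. A real engine would presumably also weigh latency and trust filters.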

Control

Policy-driven buying

Bind one routing profile to many keys and keep cost, latency, provider, and privacy constraints enforceable at request time.

Recovery

Fallback ladder

When a strict budget has no fill, the next price tier can step in deliberately rather than failing silently or overpaying by default.
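
One way to picture a fallback ladder is a list of ascending price ceilings that are tried in order. The function name, tier values, and asks below are assumptions made for illustration, not the platform's implementation.

```python
def route_with_ladder(asks, ladder):
    """Try each price ceiling in order and return the first fill.

    asks:   list of (seller, price) tuples, price in USD per million tokens
    ladder: ascending list of max-price tiers (hypothetical values)
    Returns (seller, price, tier_index), or None when no tier can fill.
    """
    for tier, max_price in enumerate(ladder):
        fills = [(s, p) for s, p in asks if p <= max_price]
        if fills:
            seller, price = min(fills, key=lambda sp: sp[1])
            return seller, price, tier
    return None  # explicit "no fill" instead of a silent failure

asks = [("seller-a", 0.42), ("seller-b", 0.55)]
print(route_with_ladder(asks, [0.30, 0.45, 0.60]))
print(route_with_ladder(asks, [0.30]))
```

The first call finds nothing under the strict 0.30 ceiling and fills at tier 1; the second has no further tier to step to and returns None, so the caller can decide what to do next.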

Visibility

Full ledger trail

Each request emits route_attempt, route_result, and settlement_result so operations and finance see the same truth.
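
A minimal sketch of what such a trail could look like, assuming JSON events keyed by a shared request id. Only the three event names come from the text above; every field name and value is hypothetical.

```python
import json

def ledger_events(request_id, model, seller, status):
    """Yield the three events for one request, in order (field names are assumed)."""
    base = {"request_id": request_id, "model": model}
    yield {"event": "route_attempt", **base}
    yield {"event": "route_result", **base, "seller": seller, "status": status}
    yield {"event": "settlement_result", **base, "seller": seller,
           "billable": status == "filled"}

events = list(ledger_events("req-123", "deepseek/deepseek-chat", "seller-a", "filled"))
for event in events:
    print(json.dumps(event))
```

Because every event carries the same request id, operations and finance can join on it and read the same record.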

Marketplace

Buyer + seller surfaces

Buyers define guardrails while sellers publish listings, capacity, health, and payout floors inside the same market system.

Rollout

Invite beta discipline

The product promise stays narrow and credible: supported clients, reserve policy, verified access, and observable execution.

How Dragonfly Works

Buyer intent meets seller supply through policy-based matching.

Buyer Request

Key-bound profile sets budget, latency, provider, and privacy constraints.

Matching Engine

Finds best eligible listing, applies fallback ladder, and emits route events.


Seller Listing

Request executes on listed capacity, then settlement finalizes after cooldown.

Why this market is efficient

Transparent competition and explicit policy controls.

Live ask competition

Sellers compete with real ask prices, so buyers route into a true market spread.

Operational telemetry

Every route and settlement is recorded. You can debug reliability and economics from one stream.

Transparent settlement

Buyer pay-in, seller payout, platform fee, and reserve are explicit and auditable.
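
To make the audit identity concrete, the sketch below splits one buyer pay-in into fee, reserve, and payout so the parts always sum back to the pay-in. The 5% fee and 10% reserve are made-up rates chosen for the example, and the assumption that fee and reserve are both taken out of the pay-in is not confirmed here.

```python
from decimal import Decimal, ROUND_HALF_UP

MICRO = Decimal("0.000001")  # round to micro-dollars (assumed precision)

def settle(pay_in, fee_rate, reserve_rate):
    """Split a buyer pay-in; the payout is the remainder, so the parts always sum."""
    fee = (pay_in * fee_rate).quantize(MICRO, rounding=ROUND_HALF_UP)
    reserve = (pay_in * reserve_rate).quantize(MICRO, rounding=ROUND_HALF_UP)
    payout = pay_in - fee - reserve
    return {
        "buyer_pay_in": pay_in,
        "platform_fee": fee,
        "reserve": reserve,
        "seller_payout": payout,
    }

s = settle(Decimal("1.000000"), Decimal("0.05"), Decimal("0.10"))  # hypothetical rates
print(s)
```

Computing the payout as the remainder, rather than rounding it independently, keeps rounding drift out of the ledger.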

Policy guardrails

Profiles enforce max price, provider allow/deny sets, and latency/throughput constraints.
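
The guardrails above could be checked per listing along these lines. The Profile fields, listing keys, provider names, and thresholds are all hypothetical, chosen only to mirror the constraints named in the text.

```python
from dataclasses import dataclass, field

@dataclass
class Profile:
    max_price: float                          # USD per million tokens
    allow: set = field(default_factory=set)   # empty set means "any provider"
    deny: set = field(default_factory=set)
    max_latency_ms: float = float("inf")
    min_tokens_per_s: float = 0.0

def violations(profile, listing):
    """Return the guardrails a listing breaks; an empty list means eligible."""
    out = []
    if listing["price"] > profile.max_price:
        out.append("max_price")
    if profile.allow and listing["provider"] not in profile.allow:
        out.append("provider_not_allowed")
    if listing["provider"] in profile.deny:
        out.append("provider_denied")
    if listing["latency_ms"] > profile.max_latency_ms:
        out.append("latency")
    if listing["tokens_per_s"] < profile.min_tokens_per_s:
        out.append("throughput")
    return out

balanced = Profile(max_price=0.50, deny={"provider-x"},
                   max_latency_ms=800, min_tokens_per_s=40)
ok = {"provider": "provider-a", "price": 0.42, "latency_ms": 350, "tokens_per_s": 75}
bad = {"provider": "provider-x", "price": 0.61, "latency_ms": 350, "tokens_per_s": 75}
print(violations(balanced, ok))
print(violations(balanced, bad))
```

Returning the list of broken guardrails, instead of a bare true/false, makes it easier to explain why a listing was skipped.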