Architecture

This document describes prefixd's design decisions and data flow.

Overview

prefixd is a policy daemon that sits between attack detectors and BGP routers. It neither detects attacks nor filters packets; it decides when and how to announce FlowSpec rules.

┌─────────────────────────────────────────────────────────────────────────────┐
│                              prefixd                                        │
│                                                                             │
│  ┌──────────┐    ┌──────────────┐    ┌────────────┐    ┌──────────────┐   │
│  │  HTTP    │───▶│   Policy     │───▶│ Guardrails │───▶│  FlowSpec    │   │
│  │  API     │    │   Engine     │    │            │    │  Manager     │   │
│  └──────────┘    └──────────────┘    └────────────┘    └──────┬───────┘   │
│       │                │                   │                   │           │
│       │                │                   │                   │           │
│       ▼                ▼                   ▼                   ▼           │
│  ┌──────────┐    ┌──────────────┐    ┌────────────┐    ┌──────────────┐   │
│  │ Events   │    │  Playbooks   │    │  Safelist  │    │   GoBGP      │   │
│  │   DB     │    │   (YAML)     │    │    DB      │    │   gRPC       │   │
│  └──────────┘    └──────────────┘    └────────────┘    └──────────────┘   │
│                                                                             │
│                        ┌──────────────────┐                                │
│                        │  Reconciliation  │                                │
│                        │      Loop        │                                │
│                        └──────────────────┘                                │
└─────────────────────────────────────────────────────────────────────────────┘
         │                                                        │
         │                                                        │
         ▼                                                        ▼
┌─────────────────┐                                    ┌─────────────────┐
│    Detectors    │                                    │     Routers     │
│   (FastNetMon,  │                                    │   (Juniper,     │
│    Prometheus)  │                                    │    Arista)      │
└─────────────────┘                                    └─────────────────┘

Design Principles

1. Signal-Driven, Not Detection

prefixd doesn't detect attacks. Detectors (FastNetMon, Prometheus alerts, custom scripts) signal attack events, and prefixd applies policy.

Why? Detection is hard and domain-specific. Rate-based detection misses low-and-slow attacks; ML-based detection has false positives. By separating detection from policy, you can:

Use the best detector for each attack type
Tune detection thresholds independently
Correlate multiple weak signals (planned for v1.5)

2. Fail-Open

If prefixd crashes or becomes unavailable:

Existing mitigations continue until their TTL expires
No new mitigations are created (attacks may go unblocked)
No permanent rules are left behind

This is intentional. The alternative (fail-closed with permanent rules) risks blocking legitimate traffic indefinitely if prefixd dies.

3. /32 Only (IPv4)

prefixd enforces that mitigations target single IPs (/32 for IPv4, /128 for IPv6). Broader prefixes are rejected.

Why? A misconfigured detector that sends a /24 would block 256 addresses at once. With host-only prefixes, the blast radius of a false positive is a single address.
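As a rough illustration of the prefix-length rule, the check can be expressed with Python's standard `ipaddress` module. This is a sketch, not prefixd's actual Rust code; the function name `is_single_host` is hypothetical.

```python
import ipaddress


def is_single_host(prefix: str) -> bool:
    """Return True only for a /32 (IPv4) or /128 (IPv6) prefix."""
    try:
        net = ipaddress.ip_network(prefix, strict=True)
    except ValueError:
        # Malformed input, or host bits set for the given mask.
        return False
    # max_prefixlen is 32 for IPv4 networks and 128 for IPv6 networks.
    return net.prefixlen == net.max_prefixlen
```

A `/24` or `/64` fails the comparison, so a broad prefix from a misbehaving detector never reaches the announcement step.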

4. Mandatory TTL

Every mitigation must have a TTL. There are no permanent rules.

Why? Without TTL, a false positive requires manual intervention. With TTL, false positives auto-resolve when the mitigation expires.

5. Guardrails Everywhere

Every mitigation passes through guardrails:

Safelist check (is this IP protected?)
Prefix length check (/32 for IPv4, /128 for IPv6)
TTL bounds check (min/max)
Port count check (max 8)
Quota check (per-customer, per-POP, global)

Guardrails can't be bypassed. They're the last line of defense against bad data.

Data Flow

Event Ingestion

POST /v1/events
     │
     ▼
┌─────────────────┐
│ Parse & Validate│───▶ 400 Bad Request
└────────┬────────┘
         │
         ▼
┌─────────────────┐
│ Rate Limit Check│───▶ 429 Too Many Requests
└────────┬────────┘
         │
         ▼
┌─────────────────┐
│ Safelist Check  │───▶ 403 Forbidden (safelisted)
└────────┬────────┘
         │
         ▼
┌─────────────────┐
│ Inventory Lookup│───▶ Find customer/service context
└────────┬────────┘
         │
         ▼
┌─────────────────┐
│ Playbook Match  │───▶ Find policy for this vector
└────────┬────────┘
         │
         ▼
┌─────────────────┐
│ Guardrails      │───▶ 422 Validation Failed
└────────┬────────┘
         │
         ▼
┌─────────────────┐
│ Create/Extend   │───▶ Duplicate? Extend TTL
│ Mitigation      │
└────────┬────────┘
         │
         ▼
┌─────────────────┐
│ Announce via    │───▶ GoBGP gRPC AddPath
│ GoBGP           │
└────────┬────────┘
         │
         ▼
     201 Created
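The "Create/Extend Mitigation" step in the flow above can be sketched as follows. This is an illustrative Python sketch, not prefixd's implementation: the fields that feed `scope_hash` (POP, victim IP, vector), the in-memory `active` dict standing in for the database, and the rule that an extension never shortens an existing expiry are all assumptions.

```python
import hashlib
from datetime import datetime, timedelta


def scope_hash(pop: str, victim_ip: str, vector: str) -> str:
    # Assumption: these three fields define "the same mitigation".
    raw = f"{pop}|{victim_ip}|{vector}".encode()
    return hashlib.sha256(raw).hexdigest()


def create_or_extend(active: dict, pop: str, victim_ip: str, vector: str,
                     ttl_seconds: int, now: datetime) -> str:
    """Create a mitigation, or extend the TTL of an existing duplicate."""
    key = scope_hash(pop, victim_ip, vector)
    new_expiry = now + timedelta(seconds=ttl_seconds)
    if key in active:
        # Duplicate event: push the expiry out, never pull it in.
        active[key]["expires_at"] = max(active[key]["expires_at"], new_expiry)
        return "extended"
    active[key] = {
        "pop": pop,
        "victim_ip": victim_ip,
        "vector": vector,
        "expires_at": new_expiry,
    }
    return "created"
```

Keying on a hash of the scope means repeated signals for an ongoing attack refresh one rule rather than creating a new rule per event.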

Reconciliation Loop

Runs every 30 seconds:

┌─────────────────────────────────────────────────────────────────┐
│                     Reconciliation Loop                         │
│                                                                 │
│  1. Load desired state (active mitigations from DB)             │
│                         │                                       │
│                         ▼                                       │
│  2. Load actual state (FlowSpec routes from GoBGP RIB)          │
│                         │                                       │
│                         ▼                                       │
│  3. Find expired mitigations (TTL passed)                       │
│     └──▶ Withdraw from GoBGP                                   │
│     └──▶ Mark as "expired" in DB                               │
│                         │                                       │
│                         ▼                                       │
│  4. Find missing rules (in DB, not in RIB)                      │
│     └──▶ Re-announce via GoBGP                                 │
│                         │                                       │
│                         ▼                                       │
│  5. Find orphan rules (in RIB, not in DB)                       │
│     └──▶ Withdraw from GoBGP                                   │
│                                                                 │
└─────────────────────────────────────────────────────────────────┘

Why reconciliation?

GoBGP might restart, losing RIB state
prefixd might restart, missing the window to withdraw
Network issues might cause announcement failures
Manual GoBGP CLI operations might create inconsistency

Reconciliation ensures that actual state (the GoBGP RIB) eventually converges to desired state (the database).
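The core of one reconciliation pass is a set comparison between the database and the RIB. The sketch below is illustrative Python under simplifying assumptions: rules are identified by an opaque string key (the `nlri_hash` column is a natural candidate), `desired` maps each key to its expiry, and `actual` is the set of keys currently in the RIB. The function name and return shape are hypothetical.

```python
from datetime import datetime


def plan_reconciliation(desired: dict, actual: set, now: datetime) -> dict:
    """Compute the actions for one reconciliation pass without performing them."""
    expired = {key for key, expires_at in desired.items() if expires_at <= now}
    live = set(desired) - expired
    return {
        # Step 3: expired mitigations are withdrawn and marked in the DB.
        "withdraw_expired": sorted(expired & actual),
        "mark_expired": sorted(expired),
        # Step 4: rules that should exist but are missing from the RIB.
        "reannounce": sorted(live - actual),
        # Step 5: rules in the RIB that the DB knows nothing about.
        "withdraw_orphans": sorted(actual - set(desired)),
    }
```

Separating the plan from its execution keeps the diff logic easy to test without a running GoBGP instance.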

Component Details

Policy Engine

The policy engine evaluates playbooks to determine mitigation actions.

# Playbook example
playbooks:
  - name: udp_flood
    match:
      vector: udp_flood
    steps:
      - action: police
        rate_bps: 10000000
        ttl_seconds: 120
      - action: discard
        ttl_seconds: 300
        require_confidence_at_least: 0.8

Evaluation order:

1. Find playbook matching event vector
2. Check escalation eligibility (is there an existing mitigation?)
3. Select step based on confidence threshold
4. Apply port exclusions from inventory
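One plausible reading of the escalation and confidence steps is sketched below in Python. The exact semantics are an assumption: a first-time event always starts at the first step, and an event for an already-mitigated target may escalate to the last step whose `require_confidence_at_least` it satisfies. The function name `select_step` is hypothetical, and port exclusions are left out.

```python
# The playbook from the example above, as a parsed structure.
PLAYBOOK = {
    "name": "udp_flood",
    "match": {"vector": "udp_flood"},
    "steps": [
        {"action": "police", "rate_bps": 10_000_000, "ttl_seconds": 120},
        {"action": "discard", "ttl_seconds": 300, "require_confidence_at_least": 0.8},
    ],
}


def select_step(playbook, confidence, has_existing_mitigation):
    """Pick the playbook step to apply for an incoming event."""
    conf = 0.0 if confidence is None else confidence
    steps = playbook["steps"]
    # Assumption: escalation past the first step needs an existing mitigation.
    candidates = steps if has_existing_mitigation else steps[:1]
    chosen = None
    for step in candidates:
        # A step without a threshold accepts any confidence.
        if conf >= step.get("require_confidence_at_least", 0.0):
            chosen = step
    return chosen
```

Under these assumptions a high-confidence first event is policed, and a repeated high-confidence event escalates to `discard`.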

Guardrails

Guardrails are non-negotiable safety checks:

| Check | Rejection Reason |
| --- | --- |
| `require_ttl` | TTL is mandatory |
| `min_ttl_seconds` | TTL below minimum |
| `max_ttl_seconds` | TTL above maximum |
| `dst_prefix_maxlen` | Prefix broader than /32 |
| `max_ports` | Too many ports (router memory) |
| `max_active_per_customer` | Customer quota exceeded |
| `max_active_per_pop` | POP quota exceeded |
| `max_active_global` | Global quota exceeded |
| Safelist | IP is protected |
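The checks in this table can be read as an ordered chain that stops at the first failure. The Python sketch below is illustrative only. The TTL bounds and quota limits are placeholder values, the `req` and `active` dictionary shapes are hypothetical, and only the port limit of 8 comes from this document.

```python
import ipaddress
from dataclasses import dataclass


@dataclass
class Guardrails:
    min_ttl_seconds: int = 30          # assumed value
    max_ttl_seconds: int = 1800        # assumed value
    max_ports: int = 8                 # limit stated in this document
    max_active_per_customer: int = 5   # assumed value
    max_active_per_pop: int = 200      # assumed value
    max_active_global: int = 500       # assumed value


def first_violation(g, req, safelist, active):
    """Return the name of the first failed check, or None if the request passes."""
    net = ipaddress.ip_network(req["dst_prefix"], strict=False)
    for entry in safelist:
        protected = ipaddress.ip_network(entry, strict=False)
        # subnet_of() raises TypeError across IP versions, so compare versions first.
        if net.version == protected.version and net.subnet_of(protected):
            return "safelist"
    if net.prefixlen != net.max_prefixlen:
        return "dst_prefix_maxlen"
    ttl = req.get("ttl_seconds")
    if ttl is None:
        return "require_ttl"
    if ttl < g.min_ttl_seconds:
        return "min_ttl_seconds"
    if ttl > g.max_ttl_seconds:
        return "max_ttl_seconds"
    if len(req.get("dst_ports", [])) > g.max_ports:
        return "max_ports"
    # `active` holds current counts of active mitigations per scope.
    if active["customer"] >= g.max_active_per_customer:
        return "max_active_per_customer"
    if active["pop"] >= g.max_active_per_pop:
        return "max_active_per_pop"
    if active["global"] >= g.max_active_global:
        return "max_active_global"
    return None
```

Returning the check name rather than a boolean lets the caller record it as the rejection reason.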

FlowSpec Manager

Translates mitigations into FlowSpec NLRI:

Mitigation                    FlowSpec NLRI
─────────────────────────────────────────────────
dst_prefix: 203.0.113.10/32   Type 1: Destination Prefix
protocol: UDP                 Type 3: IP Protocol
dst_ports: [!53]              Type 5: Destination Port (negated)
action: police 10M            Extended Community: traffic-rate
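The mapping above can be sketched as a pure function from a mitigation to a symbolic FlowSpec description. This Python sketch is illustrative: the dictionary layout and the `to_flowspec` name are hypothetical, and it produces a readable structure rather than wire-format NLRI. It relies on two details of the FlowSpec specification: the traffic-rate community carries a rate in bytes per second, and a rate of 0 means discard.

```python
PROTOCOL_NUMBERS = {"ICMP": 1, "TCP": 6, "UDP": 17}  # IANA protocol numbers


def to_flowspec(mitigation):
    """Translate a mitigation into a symbolic FlowSpec match plus action."""
    nlri = [{"type": 1, "name": "destination-prefix",
             "value": mitigation["dst_prefix"]}]
    if "protocol" in mitigation:
        nlri.append({"type": 3, "name": "ip-protocol",
                     "value": PROTOCOL_NUMBERS[mitigation["protocol"]]})
    ports = []
    for spec in mitigation.get("dst_ports", []):
        # A leading "!" marks a negated (excluded) port, e.g. "!53".
        negated = spec.startswith("!")
        ports.append({"op": "!=" if negated else "==",
                      "port": int(spec.lstrip("!"))})
    if ports:
        nlri.append({"type": 5, "name": "destination-port", "value": ports})

    if mitigation["action"] == "police":
        rate = mitigation["rate_bps"] / 8   # bits/s to bytes/s
    elif mitigation["action"] == "discard":
        rate = 0.0                          # traffic-rate 0 drops all matches
    else:
        raise ValueError(f"unsupported action: {mitigation['action']}")
    return {"nlri": nlri,
            "action": {"community": "traffic-rate", "bytes_per_second": rate}}
```

For the example row, `police 10M` becomes a traffic-rate of 1,250,000 bytes per second. A real encoder would also need to set the and/or bits when combining several port operators, which this sketch ignores.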

GoBGP Client

gRPC client for GoBGP v4.x:

AddPath - Announce FlowSpec rule
DeletePath - Withdraw FlowSpec rule
ListPath - Query RIB for reconciliation
ListPeer - Monitor BGP session status

Features:

Connection pooling
Automatic retry with exponential backoff
Timeout handling (10s connect, 30s request)
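Retry with exponential backoff, as listed above, can be sketched generically. This Python sketch is illustrative, not the client's actual code: the attempt count and delays are assumed values, `ConnectionError` stands in for whatever transport error the real gRPC client raises, and `make_flaky` is a hypothetical helper for demonstration.

```python
import time


def call_with_backoff(fn, attempts=5, base_delay=0.1, max_delay=5.0,
                      sleep=time.sleep):
    """Call fn(), retrying on ConnectionError with doubling delays."""
    for attempt in range(attempts):
        try:
            return fn()
        except ConnectionError:
            if attempt == attempts - 1:
                raise  # out of attempts: surface the last error
            # Delay doubles each attempt, capped at max_delay.
            sleep(min(max_delay, base_delay * 2 ** attempt))


def make_flaky(failures, result):
    """Demo helper: a callable that fails `failures` times, then succeeds."""
    state = {"calls": 0}

    def fn():
        state["calls"] += 1
        if state["calls"] <= failures:
            raise ConnectionError("GoBGP unavailable")
        return result

    return fn
```

Passing `sleep` as a parameter keeps the retry logic testable without real waiting. A production version would typically add jitter so that several instances do not retry in lockstep.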

State Management

PostgreSQL Schema

-- Events (001 + 003)
CREATE TABLE events (
    event_id UUID PRIMARY KEY,
    external_event_id TEXT,
    source TEXT NOT NULL,
    event_timestamp TIMESTAMPTZ NOT NULL,
    ingested_at TIMESTAMPTZ NOT NULL,
    victim_ip TEXT NOT NULL,
    vector TEXT NOT NULL,
    protocol INTEGER,
    bps BIGINT,
    pps BIGINT,
    top_dst_ports_json TEXT NOT NULL DEFAULT '[]',
    confidence REAL,
    schema_version INTEGER NOT NULL DEFAULT 1,
    raw_details JSONB,                          -- 003
    action TEXT NOT NULL DEFAULT 'ban',          -- 003
    UNIQUE(source, external_event_id)
);

-- Mitigations (001 + 005)
CREATE TABLE mitigations (
    mitigation_id UUID PRIMARY KEY,
    scope_hash TEXT NOT NULL,
    pop TEXT NOT NULL,
    customer_id TEXT,
    service_id TEXT,
    victim_ip TEXT NOT NULL,
    vector TEXT NOT NULL,
    schema_version INTEGER NOT NULL DEFAULT 1,
    match_json TEXT NOT NULL,
    action_type TEXT NOT NULL,
    action_params_json TEXT,
    status TEXT NOT NULL,                        -- pending, active, escalated, expired, withdrawn
    created_at TIMESTAMPTZ NOT NULL,
    updated_at TIMESTAMPTZ NOT NULL,
    expires_at TIMESTAMPTZ NOT NULL,
    withdrawn_at TIMESTAMPTZ,
    triggering_event_id UUID NOT NULL,
    last_event_id UUID NOT NULL,
    escalated_from_id UUID,
    reason TEXT,
    rejection_reason TEXT,
    acknowledged_at TIMESTAMPTZ,                 -- 005
    acknowledged_by TEXT                          -- 005
);

-- FlowSpec announcements (001)
CREATE TABLE flowspec_announcements (
    announcement_id UUID PRIMARY KEY,
    mitigation_id UUID NOT NULL REFERENCES mitigations(mitigation_id),
    pop TEXT NOT NULL,
    peer_name TEXT NOT NULL,
    peer_address TEXT NOT NULL,
    nlri_hash TEXT NOT NULL,
    nlri_json TEXT NOT NULL,
    action_json TEXT NOT NULL,
    status TEXT NOT NULL,
    announced_at TIMESTAMPTZ,
    withdrawn_at TIMESTAMPTZ,
    last_error TEXT,
    retry_count INTEGER NOT NULL DEFAULT 0
);

-- Audit log (001)
CREATE TABLE audit_log (
    audit_id UUID PRIMARY KEY,
    timestamp TIMESTAMPTZ NOT NULL,
    schema_version INTEGER NOT NULL DEFAULT 1,
    actor_type TEXT NOT NULL,
    actor_id TEXT,
    action TEXT NOT NULL,
    target_type TEXT,
    target_id TEXT,
    details_json TEXT NOT NULL
);

-- Safelist (001)
CREATE TABLE safelist (
    prefix TEXT PRIMARY KEY,
    added_at TIMESTAMPTZ NOT NULL,
    added_by TEXT NOT NULL,
    reason TEXT,
    expires_at TIMESTAMPTZ
);

-- Config snapshots (001)
CREATE TABLE config_snapshots (
    snapshot_id UUID PRIMARY KEY,
    timestamp TIMESTAMPTZ NOT NULL,
    config_hash TEXT NOT NULL,
    config_json TEXT NOT NULL
);

-- Operators (002)
CREATE TABLE operators (
    operator_id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    username TEXT UNIQUE NOT NULL,
    password_hash TEXT NOT NULL,
    role TEXT NOT NULL DEFAULT 'operator',
    created_at TIMESTAMPTZ NOT NULL DEFAULT NOW(),
    created_by TEXT,
    last_login_at TIMESTAMPTZ
);

-- Sessions (002) — tower-sessions-sqlx-store
CREATE TABLE tower_sessions.session (
    id TEXT PRIMARY KEY NOT NULL,
    data BYTEA NOT NULL,
    expiry_date TIMESTAMPTZ NOT NULL
);

-- Schema migrations (004)
CREATE TABLE schema_migrations (
    version INTEGER PRIMARY KEY,
    name TEXT NOT NULL,
    applied_at TIMESTAMPTZ NOT NULL DEFAULT NOW()
);

-- Notification preferences (006)
CREATE TABLE notification_preferences (
    operator_id UUID PRIMARY KEY REFERENCES operators(operator_id) ON DELETE CASCADE,
    muted_events TEXT[] NOT NULL DEFAULT '{}',
    quiet_hours_start SMALLINT,
    quiet_hours_end SMALLINT,
    updated_at TIMESTAMPTZ NOT NULL DEFAULT NOW()
);

Multi-POP Model

Multiple prefixd instances share one PostgreSQL:

prefixd (iad1)  ──┐
prefixd (fra1)  ──┼──▶ PostgreSQL
prefixd (sin1)  ──┘

Each instance:

Filters by its own pop field
Announces to its local GoBGP
Has cross-POP visibility via ?pop=all

Security Model

Authentication Layers

Network level - API should be on a private network or behind a load balancer
Bearer token - For API/CLI access
Session cookie - For dashboard access
mTLS - Optional mutual TLS for zero-trust environments

Trust Boundaries

┌─────────────────────────────────────────────────────────────────┐
│ Trusted Zone                                                    │
│                                                                 │
│  ┌──────────┐    ┌──────────┐    ┌──────────┐                 │
│  │ prefixd  │◀──▶│ PostgreSQL│    │  GoBGP   │                 │
│  └──────────┘    └──────────┘    └──────────┘                 │
│       ▲                               │                        │
│       │                               ▼                        │
│       │                         ┌──────────┐                   │
│       │                         │  Routers │                   │
│       │                         └──────────┘                   │
└───────┼─────────────────────────────────────────────────────────┘
        │
        │ HTTPS + Auth
        │
┌───────┼─────────────────────────────────────────────────────────┐
│ Untrusted Zone                                                  │
│       │                                                         │
│       ▼                                                         │
│  ┌──────────┐    ┌──────────┐    ┌──────────┐                 │
│  │Detectors │    │Dashboard │    │   CLI    │                 │
│  └──────────┘    └──────────┘    └──────────┘                 │
└─────────────────────────────────────────────────────────────────┘

Key principle: Detectors are untrusted. They can suggest mitigations, but guardrails have final say.

Performance Characteristics

Latency

| Operation | Typical | P99 |
| --- | --- | --- |
| Event ingestion | <5ms | <20ms |
| Inventory lookup | <1μs | <10μs |
| DB query | <2ms | <10ms |
| GoBGP announce | ~10ms | ~50ms |

Throughput

| Operation | Sustained | Burst |
| --- | --- | --- |
| Event ingestion | ~4,700/s | 10,000/s |
| DB writes | 2,000/s | 5,000/s |
| GoBGP announces | 100/s | 200/s |

Bottleneck: GoBGP

GoBGP gRPC is the bottleneck for high-volume events. Each announcement requires:

gRPC round-trip (~10ms)
BGP UPDATE construction
RIB insertion

Sustaining 1,000+ mitigations/second would require batching announcements, which is not yet implemented.
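The relationship between round-trip time and announcement rate is simple arithmetic, sketched here in Python. The function is hypothetical, and it assumes announcements are issued one after another and that a batch costs the same round trip as a single announcement, which makes the batched figure an optimistic upper bound.

```python
def announce_rate(round_trip_ms: float, batch_size: int = 1) -> float:
    """Upper bound on announcements per second for sequential gRPC calls."""
    calls_per_second = 1000.0 / round_trip_ms
    return batch_size * calls_per_second
```

With a 10 ms round trip this gives 100 announcements per second, matching the sustained figure in the throughput table. A batch size of 10 would be needed to approach 1,000 per second.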

Failure Modes

prefixd Crash

Existing mitigations continue (TTL-based expiry)
No new mitigations until restart
Reconciliation syncs state on restart

PostgreSQL Unavailable

Event ingestion fails (503)
Reconciliation pauses
GoBGP rules remain (no withdrawal)

GoBGP Unavailable

Event ingestion fails at announcement step
Mitigation saved to DB as "pending"
Reconciliation retries on GoBGP recovery

Router Crash

GoBGP session drops
Router recovers, re-establishes BGP
GoBGP re-advertises routes
No prefixd intervention required

Future Architecture

v1.5: Multi-Signal Correlation

FastNetMon  ──┐
Prometheus  ──┼──▶ Correlation ──▶ Policy Engine
Router CPU  ──┘    Engine

Combine weak signals into high-confidence decisions.

v2.0: Native BGP Speaker

Replace GoBGP sidecar with embedded BGP:

prefixd ──── BGP ────▶ Routers
  │
  └── No GoBGP dependency

Reduces operational complexity but increases code complexity.

Why These Choices?

Why GoBGP sidecar instead of native BGP?

Time to market - GoBGP is battle-tested
Flexibility - GoBGP supports features we might need later
Debugging - gobgp CLI is useful for troubleshooting
Risk isolation - BGP bugs don't crash prefixd

Why PostgreSQL instead of embedded DB?

Multi-POP - Shared state across instances
Operational familiarity - Teams know Postgres
Tooling - pgAdmin, backups, monitoring
HA - Postgres replication is well-understood

Why Rust?

Performance - Minimal latency in hot path
Safety - No null pointer dereferences or data races in safe Rust
Ecosystem - Great async runtime (tokio), gRPC (tonic)
Binary size - Single ~15MB binary, no runtime deps
