High-performance Go replacement for the Node.js proxy handler (port 9212).
```
Coding Agents (Claude Code, Cursor, etc.)
        |
  [Go binary :9212]               <-- hot path, streaming, routing, failover
        | reads/writes
  [SQLite DB ./data/codegate.db]  <-- shared state
        ^ writes
  [Node.js :9211]                 <-- dashboard UI, config CRUD, sessions
```
The Go proxy reads accounts, configs, and settings from the same SQLite database that the Node.js dashboard manages. Both processes can run simultaneously.
Build the binary:

```sh
cd go
go build -o ../bin/codegate-proxy ./cmd/codegate-proxy
```

Or using Make:

```sh
cd go
make build
```
Start the Node.js dashboard (handles UI on port 9211):

```sh
npm run dev   # or: node dist/server/index.js
```

Start the Go proxy (handles LLM requests on port 9212):

```sh
cd go
make run
```
| Variable | Default | Description |
|---|---|---|
| `PROXY_PORT` | `9212` | Port for the LLM proxy |
| `DATA_DIR` | `./data` | Path to the SQLite database directory |
| `PROXY_API_KEY` | (empty) | Optional API key for proxy authentication |
- Health check endpoint (`/health`)
- Models endpoint (`/v1/models`)
- Anthropic Messages API proxy (`/v1/messages`)
- OpenAI Chat Completions proxy (`/v1/chat/completions`)
- SQLite database access (shared with Node.js)
- Account decryption (AES-256-GCM)
- Config-based routing with tier detection
- Routing strategies (priority, round-robin, least-used, budget-aware)
- Sliding-window rate limiting
- Exponential backoff cooldown
- Multi-account failover
- SSE streaming passthrough with token extraction
- Provider dispatch (Anthropic, OpenAI, OpenRouter, custom)
- Async usage recording
- Request logging
- Bidirectional format conversion (Anthropic <-> OpenAI)
- Stream format conversion (Anthropic SSE <-> OpenAI SSE)
- Guardrails pipeline (anonymize/deanonymize)
- 12 pattern-based guardrails (email, phone, SSN, credit card, etc.)
- API key detection (40+ vendor prefixes + entropy analysis)
- Password detection (key-value, URL, connection string)
- Deterministic AES-256-CTR encryption for guardrails
- SSE stream deanonymization
- Name guardrail (full dictionary port)
- OAuth token refresh
- Model limits / max_tokens clamping