Welcome to Dataiku's Kiji Privacy Proxy! This guide will help you get started with installation, configuration, and your first release.
- What is Kiji Privacy Proxy?
- Quick Installation
- Platform-Specific Installation
- First Run
- Your First Release
- Next Steps
Dataiku's Kiji Privacy Proxy is a privacy-preserving proxy with integrated PII (Personally Identifiable Information) detection and masking capabilities. It intercepts HTTP/HTTPS traffic to LLM providers, automatically detects sensitive data using machine learning or regex patterns, and masks it before forwarding requests. When responses arrive, the proxy restores the original values so your application receives the expected data.
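The mask-and-restore cycle can be illustrated with a minimal, self-contained sketch. This is not the proxy's actual detector (which uses ML models or configurable regexes); it only shows the idea of masking values on the way out and restoring them on the way back:

```python
# Conceptual sketch of mask-and-restore, NOT the proxy's implementation.
import re

EMAIL_RE = re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+")

def mask(text: str) -> tuple[str, dict[str, str]]:
    """Replace each email with a placeholder and remember the mapping."""
    mapping: dict[str, str] = {}
    def repl(m: re.Match) -> str:
        placeholder = f"<EMAIL_{len(mapping)}>"
        mapping[placeholder] = m.group(0)
        return placeholder
    return EMAIL_RE.sub(repl, text), mapping

def restore(text: str, mapping: dict[str, str]) -> str:
    """Put the original values back into the provider's response."""
    for placeholder, original in mapping.items():
        text = text.replace(placeholder, original)
    return text

masked, mapping = mask("Contact alice@example.com about the report.")
assert "alice@example.com" not in masked   # the provider never sees the value
assert "alice@example.com" in restore(masked, mapping)  # the client does
```

The real proxy applies the same principle to emails, phone numbers, SSNs, credit cards, and other PII types.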
Kiji Privacy Proxy supports multiple LLM providers out of the box:
- OpenAI (ChatCompletions API)
- Anthropic (Messages API)
- Google Gemini (GenerateContent API)
- Mistral (ChatCompletions API)
The proxy automatically detects which provider to route to based on request characteristics. See Provider Detection for details.
Kiji Privacy Proxy can be deployed in two ways:
| Mode | Description | Platform |
|---|---|---|
| Desktop App | Electron-based GUI for configuration and monitoring. Bundles the Go backend. | macOS |
| Standalone Backend | Headless Go server configured via config file or environment variables. No UI. | Linux, macOS |
The backend supports two proxy modes that can run simultaneously:
| Mode | Default Port | Description |
|---|---|---|
| Forward Proxy | 8080 | Clients send requests directly to this port with the provider's API path (e.g., /v1/chat/completions). Provider detected from path or optional provider field in body. Best for CLI tools and programmatic access. |
| Transparent Proxy | 8081 | MITM proxy that intercepts HTTPS traffic to provider domains. Requires CA certificate trust. Provider detected from request host. On macOS, PAC auto-configuration routes browser traffic automatically. |
Key Features:
- Multi-provider support (OpenAI, Anthropic, Gemini, Mistral)
- ML-powered PII detection (emails, phone numbers, SSNs, credit cards, etc.)
- Automatic PII masking and restoration
- Forward proxy and transparent MITM proxy modes
- Desktop app (macOS) or standalone server (Linux/macOS)
- Request/response logging with masked data
- Download the latest DMG from Releases
- Open the DMG file
- Drag "Kiji Privacy Proxy" to Applications
- Launch the app
The desktop app provides a GUI for configuration and monitoring. It bundles the Go backend and manages both proxy modes automatically.
The app is code-signed and notarized, so it should launch without any macOS Gatekeeper warnings.
- Download the latest tarball from Releases
- Extract:

  ```bash
  tar -xzf kiji-privacy-proxy-*-linux-amd64.tar.gz
  cd kiji-privacy-proxy-*-linux-amd64
  ```

- Run:

  ```bash
  ./run.sh
  ```

The standalone backend runs as a headless server. Configure it via environment variables or a config file (see Configuration).
System Requirements:
- macOS 10.13 or later
- 500MB free disk space
- Intel or Apple Silicon processor
Installation Steps:
- Download DMG:
  - Visit Releases
  - Download `Kiji-Privacy-Proxy-{version}.dmg`

- Install:

  ```bash
  # Mount the DMG
  open Kiji-Privacy-Proxy-*.dmg

  # Drag to the Applications folder, or via the command line:
  cp -r "/Volumes/Kiji Privacy Proxy/Kiji Privacy Proxy.app" /Applications/
  ```

- First Launch:

  ```bash
  open "/Applications/Kiji Privacy Proxy.app"
  ```
Installing CA Certificate (Required for HTTPS):
The proxy uses a self-signed certificate for MITM interception. You must trust it:
```bash
# System-wide trust (recommended)
sudo security add-trusted-cert \
  -d \
  -r trustRoot \
  -k /Library/Keychains/System.keychain \
  ~/.kiji-proxy/certs/ca.crt
```

Or use the Keychain Access GUI:

- Open Keychain Access
- File → Import Items → Select `~/.kiji-proxy/certs/ca.crt`
- Double-click the "Kiji Privacy Proxy CA" certificate
- Expand Trust → Set to Always Trust
See Advanced Topics: Transparent Proxy for details.
Automatic Proxy Configuration (PAC) - macOS:
For automatic transparent proxying without setting environment variables, run the proxy with sudo:
```bash
# Start the proxy with automatic system configuration
sudo /Applications/Kiji\ Privacy\ Proxy.app/Contents/MacOS/kiji-proxy

# Or, if running from source:
sudo ./build/kiji-proxy
```

This automatically configures your system to route api.openai.com and openai.com through the proxy. Browsers and GUI apps work without manual configuration.
Note: CLI tools like curl still need HTTP_PROXY environment variables (see test examples below).
See transparent-proxy-setup.md for complete details.
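Conceptually, a PAC file routes requests for provider domains to the proxy and sends everything else direct. The sketch below generates such a PAC file in Python; the exact PAC contents the app actually serves, and the port, are assumptions for illustration:

```python
# Sketch: what a PAC file for the transparent proxy might look like.
# Assumes the transparent proxy listens on 127.0.0.1:8081; this is an
# illustration, not the file the app actually generates.

def make_pac(domains: list[str], proxy: str = "127.0.0.1:8081") -> str:
    checks = " || ".join(f'dnsDomainIs(host, "{d}")' for d in domains)
    return (
        "function FindProxyForURL(url, host) {\n"
        f'  if ({checks}) return "PROXY {proxy}";\n'
        '  return "DIRECT";\n'
        "}\n"
    )

pac = make_pac(["api.openai.com", "api.anthropic.com"])
assert "PROXY 127.0.0.1:8081" in pac
assert '"DIRECT"' in pac
```

Browsers that load this PAC route only the listed provider domains through the proxy, which is why no per-app configuration is needed.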
System Requirements:
- Linux kernel 3.10+
- 200MB free disk space
- x86_64 architecture
- GCC runtime libraries (usually pre-installed)
Installation Steps:
- Download and Extract:

  ```bash
  # Download
  wget https://github.com/dataiku/kiji-proxy/releases/download/v{version}/kiji-privacy-proxy-{version}-linux-amd64.tar.gz

  # Verify checksum (optional)
  wget https://github.com/dataiku/kiji-proxy/releases/download/v{version}/kiji-privacy-proxy-{version}-linux-amd64.tar.gz.sha256
  sha256sum -c kiji-privacy-proxy-{version}-linux-amd64.tar.gz.sha256

  # Extract
  tar -xzf kiji-privacy-proxy-{version}-linux-amd64.tar.gz
  cd kiji-privacy-proxy-{version}-linux-amd64
  ```

- Install System-Wide (Optional):

  ```bash
  # Copy to /opt
  sudo cp -r . /opt/kiji-privacy-proxy

  # Create a service user
  sudo useradd -r -s /bin/false kiji
  sudo chown -R kiji:kiji /opt/kiji-privacy-proxy
  ```

- Configure Environment:

  ```bash
  # Create the environment file
  sudo tee /etc/kiji-proxy.env << EOF
  OPENAI_API_KEY=your-api-key-here
  PROXY_PORT=:8080
  LOG_PII_CHANGES=false
  EOF
  ```

- Install Systemd Service (Optional):

  ```bash
  sudo cp kiji-proxy.service /etc/systemd/system/
  sudo systemctl daemon-reload
  sudo systemctl enable kiji-proxy
  sudo systemctl start kiji-proxy

  # Check status
  sudo systemctl status kiji-proxy
  ```
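The optional checksum verification from the download step can also be done programmatically. Here is a Python sketch equivalent to `sha256sum -c` (the file names are illustrative):

```python
# Sketch: verify a release tarball against its .sha256 file,
# equivalent to `sha256sum -c`. File names are placeholders.
import hashlib
from pathlib import Path

def sha256_of(path: Path) -> str:
    h = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(65536), b""):
            h.update(chunk)
    return h.hexdigest()

def verify(tarball: Path, checksum_file: Path) -> bool:
    # A .sha256 file has the form "<hexdigest>  <filename>"
    expected = checksum_file.read_text().split()[0]
    return sha256_of(tarball) == expected

# Demo with a throwaway file:
tmp = Path("demo.bin")
tmp.write_bytes(b"hello")
Path("demo.bin.sha256").write_text(f"{sha256_of(tmp)}  demo.bin\n")
assert verify(tmp, Path("demo.bin.sha256"))
```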
Installing CA Certificate (Required for HTTPS):
```bash
# Ubuntu/Debian
sudo cp ~/.kiji-proxy/certs/ca.crt /usr/local/share/ca-certificates/kiji-proxy-ca.crt
sudo update-ca-certificates

# RHEL/CentOS/Fedora
sudo cp ~/.kiji-proxy/certs/ca.crt /etc/pki/ca-trust/source/anchors/kiji-proxy-ca.crt
sudo update-ca-trust

# Arch Linux
sudo cp ~/.kiji-proxy/certs/ca.crt /etc/ca-certificates/trust-source/anchors/kiji-proxy-ca.crt
sudo trust extract-compat
```
- Launch the app:

  ```bash
  open "/Applications/Kiji Privacy Proxy.app"
  ```

- Configure via UI:
  - Set your API keys (OpenAI, Anthropic, Gemini, Mistral)
  - Configure proxy ports (forward proxy default: 8080, transparent proxy default: 8081)
  - Enable/disable PII logging

- Test the proxy:

  Transparent Proxy (browsers with PAC):
  - Open Safari/Chrome and make requests to any configured provider domain
  - Requests to `api.openai.com`, `api.anthropic.com`, etc. are automatically intercepted
  - No client configuration needed; PAC handles routing

  Forward Proxy (with curl):

  ```bash
  # Send directly to the forward proxy port (8080) with the provider's path
  curl -X POST http://127.0.0.1:8080/v1/chat/completions \
    -H "Content-Type: application/json" \
    -H "Authorization: Bearer $OPENAI_API_KEY" \
    -d '{"model": "gpt-4", "messages": [{"role": "user", "content": "Hello"}]}'
  ```

  Note: CLI tools like curl don't use PAC, so they cannot use the transparent proxy without explicit configuration. Use the forward proxy (port 8080) for CLI testing.
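For script-based clients, the same forward-proxy request can be built with Python's standard library. This sketch only constructs the request; actually sending it assumes the proxy is running on port 8080 and a valid API key is exported:

```python
# Sketch: the curl example above, built with urllib. Construct-only;
# urllib.request.urlopen(req) would send it through the proxy.
import json
import os
import urllib.request

body = {
    "model": "gpt-4",
    "messages": [{"role": "user", "content": "Hello"}],
}
req = urllib.request.Request(
    "http://127.0.0.1:8080/v1/chat/completions",
    data=json.dumps(body).encode(),
    headers={
        "Content-Type": "application/json",
        "Authorization": f"Bearer {os.environ.get('OPENAI_API_KEY', '')}",
    },
    method="POST",
)
assert req.get_method() == "POST"
```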
- Start the server:

  ```bash
  ./run.sh
  ```

- Check health:

  ```bash
  curl http://localhost:8080/health
  # Response: {"status":"healthy"}
  ```

- Check version:

  ```bash
  curl http://localhost:8080/version
  # Response: {"version":"0.1.1"}
  ```

- Test proxy functionality:

  ```bash
  # Set environment variables
  export OPENAI_API_KEY="your-key"
  export HTTP_PROXY=http://127.0.0.1:8081
  export HTTPS_PROXY=http://127.0.0.1:8081

  # Make a request through the proxy
  curl https://api.openai.com/v1/models \
    -H "Authorization: Bearer $OPENAI_API_KEY"
  ```

Note: Linux doesn't support automatic PAC configuration. Always use the `HTTP_PROXY` environment variables.
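Python scripts pick these variables up automatically: `urllib` (like most HTTP clients) reads `HTTP_PROXY`/`HTTPS_PROXY` from the environment. A quick sketch:

```python
# Sketch: how a Python client sees the proxy variables exported above.
# urllib.request.urlopen() would then route through the proxy
# automatically; here we only inspect the detected settings.
import os
import urllib.request

os.environ["HTTP_PROXY"] = "http://127.0.0.1:8081"
os.environ["HTTPS_PROXY"] = "http://127.0.0.1:8081"

proxies = urllib.request.getproxies()
assert proxies["http"] == "http://127.0.0.1:8081"
assert proxies["https"] == "http://127.0.0.1:8081"
```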
The proxy can be configured via:
- Environment variables (recommended for Linux)
- Config file (`config.json`)
- UI settings (macOS only)
Environment Variables:

```bash
# Proxy settings
export PROXY_PORT=":8080"

# Provider API keys
export OPENAI_API_KEY="sk-..."
export ANTHROPIC_API_KEY="sk-ant-..."
export GEMINI_API_KEY="AIza..."
export MISTRAL_API_KEY="..."

# Provider base URLs (optional, defaults to the official API domains)
export OPENAI_BASE_URL="api.openai.com"
export ANTHROPIC_BASE_URL="api.anthropic.com"
export GEMINI_BASE_URL="generativelanguage.googleapis.com"
export MISTRAL_BASE_URL="api.mistral.ai"

# PII detection
export DETECTOR_NAME="onnx_model_detector"  # or "regex_detector", "model_detector"
export LOG_PII_CHANGES="true"
export LOG_VERBOSE="true"
export LOG_REQUESTS="true"
export LOG_RESPONSES="true"

# Database (optional)
export DB_ENABLED="false"

# Transparent proxy settings
export TRANSPARENT_PROXY_ENABLED="true"
export TRANSPARENT_PROXY_PORT=":8081"
export TRANSPARENT_PROXY_CA_PATH="~/.kiji-proxy/certs/ca.crt"
export TRANSPARENT_PROXY_KEY_PATH="~/.kiji-proxy/certs/ca.key"
```

Config File Example:
```json
{
  "providers": {
    "default_providers_config": {
      "openai_subpath": "openai"
    },
    "openai_provider_config": {
      "api_domain": "api.openai.com",
      "api_key": "sk-...",
      "additional_headers": {}
    },
    "anthropic_provider_config": {
      "api_domain": "api.anthropic.com",
      "api_key": "sk-ant-...",
      "additional_headers": {}
    },
    "gemini_provider_config": {
      "api_domain": "generativelanguage.googleapis.com",
      "api_key": "AIza...",
      "additional_headers": {}
    },
    "mistral_provider_config": {
      "api_domain": "api.mistral.ai",
      "api_key": "...",
      "additional_headers": {}
    }
  },
  "ProxyPort": ":8080",
  "DetectorName": "onnx_model_detector",
  "Logging": {
    "LogRequests": true,
    "LogResponses": true,
    "LogPIIChanges": true,
    "LogVerbose": true
  },
  "Proxy": {
    "transparent_enabled": true,
    "proxy_port": ":8081",
    "ca_path": "~/.kiji-proxy/certs/ca.crt",
    "key_path": "~/.kiji-proxy/certs/ca.key",
    "enable_pac": true
  }
}
```

Provider Configuration Notes:

- `default_providers_config.openai_subpath`: When using the forward proxy, OpenAI and Mistral share the same API subpath (`/v1/chat/completions`). This setting determines which provider is used by default when only the subpath is available. Valid values: `"openai"` or `"mistral"`.
- `additional_headers`: Custom headers to include with requests to each provider.
- API keys can be set in the config file or overridden via environment variables (recommended for security).
Kiji Privacy Proxy supports multiple LLM providers (OpenAI, Anthropic, Gemini, Mistral) and automatically detects which provider to route requests to based on the proxy mode.
Forward Proxy Mode (default port 8080)
In forward proxy mode, the proxy determines the provider using these methods, in order:

1. Optional `provider` field in the request body: You can explicitly specify the provider by including a `"provider"` field in your JSON request body. Valid values: `"openai"`, `"anthropic"`, `"gemini"`, `"mistral"`. This field is automatically stripped before forwarding.

   ```json
   {
     "provider": "openai",
     "model": "gpt-4",
     "messages": [...]
   }
   ```

2. Request subpath: If no `provider` field is present, the proxy determines the provider from the API endpoint path:
   - `/v1/chat/completions` → OpenAI or Mistral (based on `default_providers_config.openai_subpath`)
   - `/v1/messages` → Anthropic
   - `/v1beta/models/*/generateContent` → Gemini
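The resolution order above can be sketched as a small function. This is an illustration, not the proxy's actual code; the function name and the `"unknown"` fallback are made up:

```python
# Illustrative forward-proxy provider resolution, following the order
# described above. Not the proxy's implementation.

SUBPATHS = {
    "/v1/messages": "anthropic",
}

def resolve_provider(path: str, body: dict, openai_subpath: str = "openai") -> str:
    # 1. An explicit "provider" field wins; it is stripped before forwarding.
    if "provider" in body:
        return body.pop("provider")
    # 2. Otherwise, fall back to the request subpath.
    if path == "/v1/chat/completions":
        # OpenAI and Mistral share this subpath; the configured
        # default_providers_config.openai_subpath breaks the tie.
        return openai_subpath
    if path.endswith("generateContent"):
        return "gemini"
    return SUBPATHS.get(path, "unknown")

assert resolve_provider("/v1/messages", {"provider": "openai"}) == "openai"
assert resolve_provider("/v1/chat/completions", {}) == "openai"
assert resolve_provider("/v1beta/models/gemini-pro:generateContent", {}) == "gemini"
```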
Transparent Proxy Mode (default port 8081)
In transparent proxy mode (MITM), the proxy determines the provider from the request host, e.g.:
- `api.openai.com` → OpenAI
- `api.anthropic.com` → Anthropic
- `generativelanguage.googleapis.com` → Gemini
- `api.mistral.ai` → Mistral
The transparent proxy intercepts HTTPS traffic using the configured CA certificate. Requests to non-configured domains are passed through without interception.
Intercept Domains
The list of domains to intercept is automatically derived from all configured provider API domains. Only traffic to these domains is processed for PII masking; all other traffic is passed through unchanged.
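Host-based routing and the derived intercept set amount to a simple lookup, sketched here (illustrative only, not the proxy's implementation):

```python
# Sketch of transparent-mode routing: the intercept set is derived
# from the configured provider domains; anything else passes through.

PROVIDER_DOMAINS = {
    "api.openai.com": "openai",
    "api.anthropic.com": "anthropic",
    "generativelanguage.googleapis.com": "gemini",
    "api.mistral.ai": "mistral",
}

def route(host: str) -> str:
    """Return the provider to intercept for, or 'passthrough'."""
    return PROVIDER_DOMAINS.get(host, "passthrough")

assert route("api.openai.com") == "openai"
assert route("example.com") == "passthrough"
```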
If you're a contributor or maintainer, here's how to create your first release using Changesets.
- Node.js 20+ installed
- Git configured with GitHub credentials
- Write access to the repository
- All pending changes committed
1. Install Dependencies:

   ```bash
   cd src/frontend
   npm install
   ```

2. Verify Current Version:

   ```bash
   make info

   # Or directly:
   cd src/frontend
   node -p "require('./package.json').version"
   ```

3. Make Your Changes:

   For this example, we'll create a changeset documenting your feature:

   ```bash
   cd src/frontend
   npm run changeset
   ```

   Follow the prompts:
   - Select the bump type: `patch`, `minor`, or `major`
   - Write a description of your changes
   - A changeset file is created in `.changeset/`

4. Commit and Push:

   ```bash
   git add .
   git commit -m "feat: add new feature

   - Detailed description
   - of what changed

   Closes #123"
   git push origin main
   ```

5. Wait for Changesets Action:
After pushing to main:
- Go to Actions tab
- Find "Changesets Release" workflow
- Wait for completion (~1-2 minutes)
What happens:
- Changesets detects your changeset
- Bumps version (e.g., 1.0.0 → 1.0.1)
- Updates `CHANGELOG.md`
- Creates a PR titled "chore: version packages"
6. Review and Merge Version PR:
- Go to Pull Requests
- Find PR titled "chore: version packages"
- Review the changes:
  - `package.json`: version updated
  - `CHANGELOG.md`: your changes documented
  - Changeset file removed
- Merge the PR
7. Create Release Tag:
After merging the version PR:
```bash
# Pull the latest changes
git checkout main
git pull origin main

# Verify the new version
make info

# Create an annotated tag
git tag -a v1.0.1 -m "Release version 1.0.1

Summary of changes:
- Feature 1
- Feature 2
- Bug fix 3
"

# Push the tag
git push origin v1.0.1
```

8. Wait for Release Builds:
Both macOS and Linux builds start automatically:
- macOS DMG build (~15 minutes)
- Linux tarball build (~12 minutes)
9. Verify Release:
- Go to Releases
- Find "Release v1.0.1"
- Verify the artifacts:
  - `Kiji-Privacy-Proxy-1.0.1.dmg`
  - `kiji-privacy-proxy-1.0.1-linux-amd64.tar.gz`
  - `kiji-privacy-proxy-1.0.1-linux-amd64.tar.gz.sha256`
10. Test the Release:
Download and test on your platform:
```bash
# macOS
open Kiji-Privacy-Proxy-1.0.1.dmg

# Linux
tar -xzf kiji-privacy-proxy-1.0.1-linux-amd64.tar.gz
cd kiji-privacy-proxy-1.0.1-linux-amd64
./run.sh
```

Congratulations! You've created your first release.
Now that you have Kiji Privacy Proxy running, here's what to explore next:
- Configure HTTPS Interception:
  - Install the CA certificate (see above)
  - Test with HTTPS endpoints
  - Review the Transparent Proxy Guide

- Test PII Detection:
  - Send requests containing sensitive data
  - Review the logs to see masked values
  - Configure PII detection sensitivity

- Production Deployment:
  - Set up a systemd service (Linux)
  - Configure monitoring
  - Set up log rotation

- Set Up Development Environment:
  - Read the Development Guide
  - Configure the VSCode debugger
  - Run tests

- Build From Source:
  - Review Building & Deployment
  - Build platform-specific packages
  - Customize the build process

- Contribute:
  - Read the contributing guidelines
  - Create changesets for your changes
  - Submit pull requests
- Development Guide - Development setup and workflows
- Building & Deployment - Building from source
- Release Management - Versioning and releases
- Advanced Topics - MITM proxy, model signing, troubleshooting
- Documentation Issues: Open an issue on GitHub
- Bug Reports: Use GitHub Issues with reproduction steps
- Questions: Start a GitHub Discussion
- Security Issues: Email [email protected] (do not open public issues)
"SSL certificate problem"
- Install and trust the CA certificate (see installation steps above)
"Port 8080 already in use"
# Find what's using the port
lsof -i :8080
# Kill it or use a different port
export PROXY_PORT=:8081"Model files not found"
- Ensure Git LFS pulled the files:
git lfs pull - Check file size:
ls -lh model/quantized/model_quantized.onnx(should be ~63MB)
"Permission denied"
# Make binary executable (Linux)
chmod +x bin/kiji-proxyFor more troubleshooting, see Advanced Topics: Troubleshooting.