Guardian is a high-performance, scalable spam/phishing detection and enforcement service designed to run next to your MTA and filtering engine.
It analyzes incoming emails ultra-fast (structure fingerprinting + proximity detection), applies immediate local learning from operator/user reports, and only reaches out to the Mailuminati Oracle when needed for shared, collaborative intelligence.
Guardian is built for anyone operating email infrastructure, from large providers to small and community-run servers, who wants fast decisions with minimal overhead.
- Quick Start
- Prerequisites & Requirements
- Installation Options
- Configuration
- How Guardian Works
- Architecture & Ecosystem
- API Reference
- License
Install Guardian with a single command:

    /bin/bash -c "$(curl -fsSL https://guardian.mailuminati.com/install.sh)"

The installer will:
- Detect your system configuration
- Install Guardian and dependencies
- Integrate with your existing email filtering system (Rspamd, SpamAssassin, etc.)
- Start the service automatically
Custom installation options:

    /bin/bash -c "$(curl -fsSL https://guardian.mailuminati.com/install.sh)" -- --redis-host 192.168.1.50 --redis-port 6380

(Note the double dashes -- before the options.)
Install using our official Docker image:

    docker pull mailuminati/guardian:latest

For detailed prerequisites, configuration options, and environment variables, see below.
- Linux server
- redis server for local caching and learning (can be on the same host or remote)
- POSIX compatible shell (/bin/sh or /bin/bash)
- curl
- tar
- sudo (unless installing as root)
- systemd for service management
- An anti-spam engine capable of calling HTTP APIs
  Examples: Rspamd, SpamAssassin, custom filters
- An IMAP server supporting Sieve
  Examples: Dovecot, Cyrus, or equivalent
Guardian does not require:
- IMAP credentials
- Access to raw mailbox content
- Heavy runtime dependencies
You can customize the installation by passing arguments to the installer.
See all available options:

    /bin/bash -c "$(curl -fsSL https://guardian.mailuminati.com/install.sh)" -- --help

Common options:
- Redis Configuration: if your Redis instance is not on localhost (or redis for Docker):
  --redis-host 192.168.1.50 --redis-port 6380
- Filter Integration: skip all filter integration prompts:
  --no-filter-integration
  Disable a specific integration even if installed:
  --no-rspamd --no-spamassassin
- Force Re-installation: force the re-installation of the Guardian engine even if the version matches:
  --force-reinstall
Guardian can be configured via environment variables or a configuration file, depending on your installation method.
For Source installations:
The configuration file is located at /etc/mailuminati-guardian/guardian.conf.
You can edit this file to change settings. To apply changes without restarting the service (hot-reload), run:

    sudo systemctl reload mailuminati-guardian

For Docker installations:
Configuration is primarily managed via environment variables in docker-compose.yaml.
| Variable | Description | Default |
|---|---|---|
| REDIS_HOST | Hostname or IP of the Redis server | localhost (Source) / redis (Docker) |
| REDIS_PORT | Port of the Redis server | 6379 |
| GUARDIAN_BIND_ADDR | The network interface IP to bind to. Use 127.0.0.1 for localhost only, or 0.0.0.0 for all interfaces. | 127.0.0.1 |
| MI_ENABLE_IMAGE_ANALYSIS | Set to 1 to enable the analysis of external images for low-text emails. | 0 (Disabled) |
| FORCE_REINSTALL | Set to 1 to force re-installation of the Guardian engine. | 0 |
| SPAM_WEIGHT | Weight applied to hashes reported as spam. | 1 |
| HAM_WEIGHT | Weight applied to hashes reported as ham (false positive). | 2 |
| SPAM_THRESHOLD | Minimum score required for a message to be considered spam locally. By default (1), a single spam report (with weight 1) is enough to block similar messages. Increase this value (e.g., to 2) to require multiple reports before blocking. | 1 |
| LOCAL_RETENTION_DAYS | Retention period (in days) for local learning entries. | 15 |
| LOG_LEVEL | Logging verbosity level (DEBUG, INFO, WARN, ERROR). | INFO |
| LOG_FORMAT | Format of logs (JSON for tools/ELK, TEXT for human reading). | JSON |
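
To illustrate how these variables and their defaults fit together, here is a minimal Python sketch that reads them from an environment mapping. The `GuardianConfig` class and `load_config` helper are hypothetical names invented for this example and are not part of Guardian. The defaults are copied from the table above (using the Source default for `REDIS_HOST`), and treating the weights and threshold as integers is an assumption.

```python
import os
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class GuardianConfig:
    """Hypothetical container mirroring the documented variables."""
    redis_host: str
    redis_port: int
    bind_addr: str
    image_analysis: bool
    spam_weight: int
    ham_weight: int
    spam_threshold: int
    local_retention_days: int
    log_level: str
    log_format: str


def load_config(env: Mapping[str, str] = os.environ) -> GuardianConfig:
    """Read each documented variable, falling back to its documented default."""
    return GuardianConfig(
        redis_host=env.get("REDIS_HOST", "localhost"),
        redis_port=int(env.get("REDIS_PORT", "6379")),
        bind_addr=env.get("GUARDIAN_BIND_ADDR", "127.0.0.1"),
        image_analysis=env.get("MI_ENABLE_IMAGE_ANALYSIS", "0") == "1",
        spam_weight=int(env.get("SPAM_WEIGHT", "1")),        # integer type assumed
        ham_weight=int(env.get("HAM_WEIGHT", "2")),          # integer type assumed
        spam_threshold=int(env.get("SPAM_THRESHOLD", "1")),  # integer type assumed
        local_retention_days=int(env.get("LOCAL_RETENTION_DAYS", "15")),
        log_level=env.get("LOG_LEVEL", "INFO"),
        log_format=env.get("LOG_FORMAT", "JSON"),
    )
```

Calling `load_config({})` yields the documented defaults, while `load_config({"REDIS_HOST": "192.168.1.50", "REDIS_PORT": "6380"})` mirrors the remote Redis example from the installer options.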
The weight and threshold variables work together to give you full control over the local learning mechanism:
- Detection Logic: A message is considered spam if its calculated score is greater than or equal to SPAM_THRESHOLD.
- Spam Reports: Reporting a message as spam adds SPAM_WEIGHT to its score.
- Ham Reports: Reporting a message as legit (ham) subtracts HAM_WEIGHT from its score.
Example Scenarios:
- Default (Aggressive): SPAM_WEIGHT=1, SPAM_THRESHOLD=1.
  - 1 Spam Report = Score 1. Since 1 >= 1, it is blocked immediately.
- Cautious: SPAM_WEIGHT=1, SPAM_THRESHOLD=2.
  - 1 Spam Report = Score 1. Not blocked (1 < 2).
  - 2 Spam Reports = Score 2. Blocked (2 >= 2).
  - 1 Spam Report + 1 Ham Report = Score 0. Not blocked (0 < 2).
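
The scoring rules above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not Guardian's implementation: `local_score` and `is_spam` are hypothetical helpers, and the score is assumed to be a plain, unclamped sum. Under that assumption, the "1 Spam Report + 1 Ham Report = Score 0" scenario corresponds to `HAM_WEIGHT=1`; with the default `HAM_WEIGHT=2` this sketch would give -1, which is still below the threshold.

```python
def local_score(spam_reports: int, ham_reports: int,
                spam_weight: int = 1, ham_weight: int = 2) -> int:
    """Each spam report adds SPAM_WEIGHT; each ham report subtracts HAM_WEIGHT.

    Assumption: the score is a simple sum with no clamping at zero.
    """
    return spam_reports * spam_weight - ham_reports * ham_weight


def is_spam(score: int, spam_threshold: int = 1) -> bool:
    """A message is treated as spam when score >= SPAM_THRESHOLD."""
    return score >= spam_threshold


# Default (aggressive): one report is enough.
assert is_spam(local_score(1, 0), spam_threshold=1)

# Cautious: two reports are required.
assert not is_spam(local_score(1, 0), spam_threshold=2)
assert is_spam(local_score(2, 0), spam_threshold=2)
```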
Guardian combines local intelligence with shared threat detection to provide fast, accurate spam filtering.
Local Intelligence:
- Instant analysis and learning from operator-specific threats
- Immediate impact after user/operator reports
- Works even when disconnected from the Oracle
- Zero-latency decisions for most messages
Shared Intelligence (via Oracle):
- Cross-operator correlation of spam campaigns
- Shared threat clusters from independent reports
- Protection against large-scale, fast-moving threats
- Early detection of previously unseen campaigns
By querying the Oracle only when meaningful proximity is detected, Guardian benefits from collective intelligence without sacrificing performance or privacy.
For each incoming email, Guardian:
- Normalizes textual and HTML content
- Extracts meaningful attachments
- Computes one or more TLSH structural fingerprints
This process is fast, deterministic, and does not rely on external calls.
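
As a rough illustration of the normalization step, the Python sketch below reduces HTML to lowercased, whitespace-collapsed text using only the standard library. Guardian's actual normalization rules are not described here, so every choice in this example (dropping script and style content, lowercasing, collapsing whitespace) is an assumption, and `normalize_html` is a hypothetical helper. The TLSH hashing itself would come from a TLSH library and is not shown.

```python
import re
from html.parser import HTMLParser


class _TextExtractor(HTMLParser):
    """Collects text nodes, skipping the contents of <script> and <style>."""

    def __init__(self) -> None:
        super().__init__(convert_charrefs=True)
        self._chunks: list[str] = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        if self._skip_depth == 0:
            self._chunks.append(data)

    def text(self) -> str:
        return " ".join(self._chunks)


def normalize_html(html: str) -> str:
    """Return visible text, lowercased, with runs of whitespace collapsed."""
    parser = _TextExtractor()
    parser.feed(html)
    parser.close()  # flush any buffered text
    return re.sub(r"\s+", " ", parser.text()).strip().lower()
```

Because the function depends only on its input, the same message always produces the same normalized text, which is what makes a downstream fingerprint deterministic.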
Image Analysis (Optional):
When enabled via MI_ENABLE_IMAGE_ANALYSIS=1, Guardian can fetch and analyze external images for emails containing very little text. This is beneficial for detecting "image-only" spam where the message content is hidden in a remote picture to bypass text-based filters.
⚠️ Performance & Privacy Warning:
- Latency: Guardian must download images from external servers. If the remote server is slow or under load, this will increase scan time.
- Tracking: Downloading external images may trigger "read receipts" (tracking pixels) on the sender's side.
Each fingerprint is split into overlapping bands using LSH (Locality-Sensitive Hashing) techniques.
Guardian checks:
- Its local learning database
- A locally cached subset of Oracle band data
If sufficient proximity is detected, Guardian may:
- Classify the message locally
- Flag it as a partial or suspicious match
- Escalate to the Oracle for confirmation
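
The banding idea can be sketched as follows. This is a simplified Python illustration, not Guardian's implementation: the band size, the stride, the position-tagged band format, and the helper names `split_into_bands` and `count_band_matches` are all assumptions made for the example.

```python
def split_into_bands(fingerprint: str, band_size: int = 8, stride: int = 4) -> list[str]:
    """Split a fingerprint string into overlapping, position-tagged bands.

    With stride < band_size, consecutive bands share characters, so a small
    local change in the fingerprint only invalidates a few bands.
    """
    if band_size <= 0 or stride <= 0:
        raise ValueError("band_size and stride must be positive")
    bands = []
    for start in range(0, len(fingerprint) - band_size + 1, stride):
        # The offset prefix keeps identical substrings at different positions distinct.
        bands.append(f"{start}:{fingerprint[start:start + band_size]}")
    return bands


def count_band_matches(fingerprint: str, known_bands: set[str],
                       band_size: int = 8, stride: int = 4) -> int:
    """Count how many bands of `fingerprint` appear in a set of known bands."""
    return sum(1 for band in split_into_bands(fingerprint, band_size, stride)
               if band in known_bands)
```

In this sketch, `known_bands` stands in for both the local learning database and the cached subset of Oracle band data. A set lookup per band keeps the check cheap, and the number of matching bands serves as the proximity signal that decides whether to classify locally or escalate.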
Guardian contacts the Oracle only when proximity thresholds are met, in order to:
- Compute exact distances against known threat clusters
- Compare fingerprints against cluster medoids built from confirmed reports
- Receive a final verdict
This design ensures that only a small fraction of messages require remote confirmation.
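
The escalation logic described above can be summarized in a short Python sketch. It is a hedged illustration only: the `Decision` type, its `source` field, the `min_band_matches` parameter, and the fallback to a local "allow" when the Oracle cannot be reached are assumptions for this example, not documented Guardian behavior.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Decision:
    action: str  # "allow" or "spam"
    source: str  # "local" or "oracle" (hypothetical field for this sketch)


def decide(local_is_spam: bool, band_matches: int, min_band_matches: int,
           ask_oracle: Callable[[], str]) -> Decision:
    """Decide locally when possible; call the Oracle only on sufficient proximity."""
    if local_is_spam:
        # Local learning already flags this message: no network call needed.
        return Decision("spam", "local")
    if band_matches < min_band_matches:
        # No meaningful proximity: the Oracle is never contacted.
        return Decision("allow", "local")
    try:
        return Decision(ask_oracle(), "oracle")
    except OSError:
        # Assumed fallback: keep working locally if the Oracle is unreachable.
        return Decision("allow", "local")
```

Passing the Oracle call as a function makes it easy to see that the remote round trip happens on only one of the three paths.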
Guardian supports learning through reports such as:
- User complaints (via IMAP/Sieve integration)
- Operator validation
- Abuse desk signals
Confirmed reports immediately reinforce local detection and can be shared with the Oracle, contributing to global Mailuminati intelligence.

    Incoming Email
          |
          v
    +---------------------+
    |     Mailuminati     |
    |  Guardian (Local)   |
    +---------------------+
          |         |
          |         +--------------------+
          |                              |
          v                              v
    Local Analysis                Local Learning
    (TLSH + LSH)                  (Immediate Effect)
          |
          | No proximity
          |-----------------------------> ALLOW / LOCAL DECISION
          |
          | Proximity detected
          v
    +---------------------+
    |     Mailuminati     |
    |   Oracle (Remote)   |
    +---------------------+
          |
          v
    Shared Intelligence
    (Clusters, Medoids,
     Community Reports)
          |
          v
    Verdict Returned
          |
          v
    Local Enforcement
    (Spam / Allow / Flag)

- Very low latency — Most decisions made locally without network calls
- Immediate learning — Reports take effect instantly
- Minimal resources — Low CPU and memory footprint
- Privacy-preserving — No raw email content sharing
- Resilient — Works even when Oracle is unavailable
- Scalable — Suitable for high-volume and small operators alike
Guardian is responsible for:
- Local spam/phishing analysis of incoming emails
- Structural fingerprinting using TLSH
- Fast proximity detection via locality-sensitive hashing (LSH)
- Immediate application of local learning
- Remote confirmation through the Mailuminati Oracle
- Enforcing final decisions (allow, spam, proximity match)
It acts as the first line of defense, minimizing latency and resource usage while remaining connected to a broader community-driven detection network.
Guardian typically runs as:
- A local HTTP service on port 12421
- A bridge between your MTA and the Mailuminati ecosystem
- A containerized service alongside Redis
Your email filtering engine (Rspamd, SpamAssassin, etc.) calls Guardian's /analyze endpoint for each incoming email, then acts on the verdict.
- Guardian performs local detection, learning, and enforcement
- Oracle provides shared intelligence and collaborative confirmation
Guardian can operate independently. Its effectiveness increases when connected to the Oracle, where local signals become part of a collective defense.
Guardian exposes a simple HTTP API on port 12421.
⚠️ Security Warning: Guardian provides no authentication on its API. It is strongly recommended to:
- NOT expose port 12421 to the Internet
- Block external access with a firewall
- Allow only localhost or your internal network
Base URL: http://<guardian-host>:12421
GET /status: Health and version information endpoint.
Example:

    curl -sS http://localhost:12421/status | jq

Response:

    {
      "node_id": "6c0a5e16-2b32-4f86-9b3d-2b2e3df5c7d8",
      "current_seq": 0,
      "version": "0.3.2"
    }

POST /analyze: Analyzes an email provided as raw RFC822/MIME bytes. Maximum request size: 15 MB.
Request:

    curl -sS -X POST \
      -H 'Content-Type: message/rfc822' \
      --data-binary @message.eml \
      http://localhost:12421/analyze | jq

Response:

    {
      "action": "allow",
      "proximity_match": false,
      "hashes": [
        "T1A9B0E0F2D3C4B5A6..."
      ]
    }

Response Fields:
- action: allow | spam
- label (optional): e.g., local_spam, oracle_spam
- proximity_match: boolean indicating if similar spam was detected
- distance (optional): TLSH distance to nearest threat (lower = more similar)
- hashes (optional): array of computed TLSH signatures
Notes:
- If the email lacks a Message-ID header, Guardian will still analyze it, but /report won't be able to reference it later.
- The hashes field contains the computed TLSH fingerprints for the message.
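
For filters written in Python, the same call can be made with the standard library. The sketch below mirrors the curl example above; the helper names (`build_analyze_request`, `parse_verdict`, `analyze`), the 5-second timeout, and the interpretation of 15 MB as 15 * 1024 * 1024 bytes are assumptions for illustration.

```python
import json
import urllib.request

# Documented limit is 15 MB; the exact byte count is an assumption.
MAX_REQUEST_BYTES = 15 * 1024 * 1024


def build_analyze_request(raw_message: bytes,
                          base_url: str = "http://localhost:12421") -> urllib.request.Request:
    """Build the POST /analyze request for a raw RFC822/MIME message."""
    if len(raw_message) > MAX_REQUEST_BYTES:
        raise ValueError("message exceeds the documented 15 MB request limit")
    return urllib.request.Request(
        url=f"{base_url}/analyze",
        data=raw_message,
        headers={"Content-Type": "message/rfc822"},
        method="POST",
    )


def parse_verdict(body: bytes) -> dict:
    """Extract the documented response fields, tolerating absent optional ones."""
    verdict = json.loads(body)
    return {
        "action": verdict["action"],
        "proximity_match": verdict.get("proximity_match", False),
        "label": verdict.get("label"),
        "distance": verdict.get("distance"),
        "hashes": verdict.get("hashes", []),
    }


def analyze(raw_message: bytes, base_url: str = "http://localhost:12421",
            timeout: float = 5.0) -> dict:
    """Send the message to Guardian and return the parsed verdict."""
    request = build_analyze_request(raw_message, base_url)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return parse_verdict(response.read())
```

Separating request construction from the network call keeps the size check and response parsing easy to exercise without a running Guardian instance.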
POST /report: Reports a previously scanned email to improve Guardian's learning.
Guardian will:
- Apply local learning immediately (spam or ham correction)
- Forward the report to the Oracle for shared intelligence
Request:

    curl -sS -X POST \
      -H 'Content-Type: application/json' \
      -d '{"message-id":"<your-message-id@example>","report_type":"spam"}' \
      http://localhost:12421/report

Request Body:

    {
      "message-id": "<your-message-id@example>",
      "report_type": "spam"
    }

Report Types:
- spam: Reports a missed spam (false negative)
- ham: Reports a false positive (legitimate email incorrectly flagged)
Notes:
- Guardian must have previously scanned this email (identified by Message-ID)
- Returns 404 Not Found if no scan data exists for this Message-ID
- Response is proxied from the Oracle when reachable
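
A report can be sent from Python in much the same way as the curl example. In this sketch, `build_report_request` and `send_report` are hypothetical helpers, the timeout is arbitrary, and mapping the documented 404 to a `False` return value is a design choice made for the example.

```python
import json
import urllib.error
import urllib.request


def build_report_request(message_id: str, report_type: str,
                         base_url: str = "http://localhost:12421") -> urllib.request.Request:
    """Build the POST /report request with the documented JSON body."""
    if report_type not in ("spam", "ham"):
        raise ValueError("report_type must be 'spam' or 'ham'")
    payload = json.dumps({"message-id": message_id, "report_type": report_type})
    return urllib.request.Request(
        url=f"{base_url}/report",
        data=payload.encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send_report(message_id: str, report_type: str,
                base_url: str = "http://localhost:12421", timeout: float = 5.0) -> bool:
    """Return True if the report was accepted, False if Guardian has no scan data (404)."""
    request = build_report_request(message_id, report_type, base_url)
    try:
        with urllib.request.urlopen(request, timeout=timeout):
            return True
    except urllib.error.HTTPError as exc:
        if exc.code == 404:
            return False  # Guardian never scanned a message with this Message-ID
        raise
```

Validating `report_type` before sending avoids a round trip for values the API does not document.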
GET /metrics: Exposes internal metrics in Prometheus format for monitoring.
Example:

    curl -sS http://localhost:12421/metrics

Available Metrics:
- mailuminati_guardian_scanned_total: Total emails scanned
- mailuminati_guardian_local_match_total: Emails detected using local intelligence
- mailuminati_guardian_oracle_match_total: Emails matched via Oracle
- mailuminati_guardian_cache_hits_total: Cache hit efficiency
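
For a quick check without a full Prometheus setup, the counters can be read from the text output with a few lines of Python. This sketch handles only simple, label-free samples of the form `name value`; whether Guardian's metrics carry labels or timestamps is not stated here, so that simplification is an assumption, and `parse_counters` and `local_match_ratio` are hypothetical helpers.

```python
def parse_counters(text: str, prefix: str = "mailuminati_guardian_") -> dict[str, float]:
    """Parse label-free `name value` samples from Prometheus text output."""
    counters: dict[str, float] = {}
    for line in text.splitlines():
        line = line.strip()
        # Skip blanks, HELP/TYPE comments, and labelled samples (not handled here).
        if not line or line.startswith("#") or "{" in line:
            continue
        parts = line.split()
        if len(parts) >= 2 and parts[0].startswith(prefix):
            counters[parts[0]] = float(parts[1])
    return counters


def local_match_ratio(counters: dict[str, float]) -> float:
    """Share of scanned emails that were matched by local intelligence."""
    scanned = counters.get("mailuminati_guardian_scanned_total", 0.0)
    if scanned == 0:
        return 0.0
    return counters.get("mailuminati_guardian_local_match_total", 0.0) / scanned
```

Feeding the body returned by the curl command above into `parse_counters` gives a dictionary keyed by metric name, from which ratios such as the local match share can be derived.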
This client is open-source software licensed under the GNU GPLv3.
Copyright © 2025 Simon Bressier.
Please note: This license applies strictly to the client-side code contained in this repository.
See the LICENSE file for details.
