Mailuminati Guardian

Guardian is a high-performance, scalable spam/phishing detection and enforcement service designed to run next to your MTA and filtering engine.

It analyzes incoming emails ultra-fast (structure fingerprinting + proximity detection), applies immediate local learning from operator/user reports, and only reaches out to the Mailuminati Oracle when needed for shared, collaborative intelligence.

Guardian is built for anyone operating email infrastructure, from large providers to small and community-run servers, who wants fast decisions with minimal overhead.

Quick Start

Install Guardian with a single command:

/bin/bash -c "$(curl -fsSL https://guardian.mailuminati.com/install.sh)"

The installer will:

Detect your system configuration
Install Guardian and dependencies
Integrate with your existing email filtering system (Rspamd, SpamAssassin, etc.)
Start the service automatically

Custom installation options:

/bin/bash -c "$(curl -fsSL https://guardian.mailuminati.com/install.sh)" -- --redis-host 192.168.1.50 --redis-port 6380

(note the double dashes -- before the options)

Install using our official Docker image:

docker pull mailuminati/guardian:latest

For detailed prerequisites configuration options and environment variables, see below.

Prerequisites & Requirements

Mandatory

Linux server
redis server for local caching and learning (can be on the same host or remote)
POSIX compatible shell (/bin/sh or /bin/bash)
curl
tar
sudo (unless installing as root)

Optional but Recommended

systemd for service management
An anti-spam engine capable of calling HTTP APIs
Examples: Rspamd, SpamAssassin, custom filters
An IMAP server supporting Sieve
Examples: Dovecot, Cyrus, or equivalent

What Guardian Does NOT Require

IMAP credentials
Access to raw mailbox content
Heavy runtime dependencies

Installation Options

You can customize the installation by passing arguments to the installer.

See all available options:

/bin/bash -c "$(curl -fsSL https://guardian.mailuminati.com/install.sh)" -- --help

Common options:

Redis Configuration:
If your Redis instance is not on localhost (or redis for Docker):
```
--redis-host 192.168.1.50 --redis-port 6380
```
Filter Integration:
Skip all filter integration prompts:
```
--no-filter-integration
```
Disable a specific integration even if installed:
```
--no-rspamd
--no-spamassassin
```
Force Re-installation:
Force the re-installation of the Guardian engine even if the version matches:
```
--force-reinstall
```

Configuration

Guardian can be configured via environment variables or a configuration file, depending on your installation method.

For Source installations:
The configuration file is located at /etc/mailuminati-guardian/guardian.conf.
You can edit this file to change settings. To apply changes without restarting the service (hot-reload), run:

sudo systemctl reload mailuminati-guardian

For Docker installations:
Configuration is primarily managed via environment variables in docker-compose.yaml.

Available Configuration Variables

Variable	Description	Default
`REDIS_HOST`	Hostname or IP of the Redis server	`localhost` (Source) / `redis` (Docker)
`REDIS_PORT`	Port of the Redis server	`6379`
`GUARDIAN_BIND_ADDR`	The network interface IP to bind to. Use `127.0.0.1` for localhost only, or `0.0.0.0` for all interfaces.	`127.0.0.1`
`MI_ENABLE_IMAGE_ANALYSIS`	Set to `1` to enable the analysis of external images for low-text emails.	`0` (Disabled)
`FORCE_REINSTALL`	Set to `1` to force re-installation of the Guardian engine.	`0`
`SPAM_WEIGHT`	Weight applied to hashes reported as spam.	`1`
`HAM_WEIGHT`	Weight applied to hashes reported as ham (false positive).	`2`
`SPAM_THRESHOLD`	Minimum score required for a message to be considered spam locally. By default (`1`), a single spam report (with weight 1) is enough to block similar messages. Increase this value (e.g., to `2`) to require multiple reports before blocking.	`1`
`LOCAL_RETENTION_DAYS`	Retention period (in days) for local learning entries.	`15`
`LOG_LEVEL`	Logging verbosity leval (`DEBUG`, `INFO`, `WARN`, `ERROR`).	`INFO`
`LOG_FORMAT`	Format of logs (`JSON` for tools/ELK, `TEXT` for human reading).	`JSON`

The weight and threshold variables work together to give you full control over the local learning mechanism:

Detection Logic: A message is considered spam if its calculated score is greater than or equal to SPAM_THRESHOLD.
Spam Reports: Reporting a message as spam adds SPAM_WEIGHT to its score.
Ham Reports: Reporting a message as legit (ham) subtracts HAM_WEIGHT from its score.

Example Scenarios:

Default (Aggressive): SPAM_WEIGHT=1, SPAM_THRESHOLD=1.
- 1 Spam Report = Score 1. Since 1 >= 1, it is blocked immediately.
Cautious: SPAM_WEIGHT=1, SPAM_THRESHOLD=2.
- 1 Spam Report = Score 1. Not blocked (1 < 2).
- 2 Spam Reports = Score 2. Blocked (2 >= 2).
- 1 Spam Report + 1 Ham Report = Score 0. Not blocked (0 < 2).

How Guardian Works

Guardian combines local intelligence with shared threat detection to provide fast, accurate spam filtering.

Core Concepts

Local Intelligence:

Instant analysis and learning from operator-specific threats
Immediate impact after user/operator reports
Works even when disconnected from the Oracle
Zero-latency decisions for most messages

Shared Intelligence (via Oracle):

Cross-operator correlation of spam campaigns
Shared threat clusters from independent reports
Protection against large-scale, fast-moving threats
Early detection of previously unseen campaigns

By querying the Oracle only when meaningful proximity is detected, Guardian benefits from collective intelligence without sacrificing performance or privacy.

Analysis Pipeline

1. Local Analysis

For each incoming email, Guardian:

Normalizes textual and HTML content
Extracts meaningful attachments
Computes one or more TLSH structural fingerprints

This process is fast, deterministic, and does not rely on external calls.

Image Analysis (Optional):
When enabled via MI_ENABLE_IMAGE_ANALYSIS=1, Guardian can fetch and analyze external images for emails containing very little text. This is beneficial for detecting "image-only" spam where the message content is hidden in a remote picture to bypass text-based filters.

⚠️ Performance & Privacy Warning:

Latency: Guardian must download images from external servers. If the remote server is slow or under load, this will increase scan time.

Tracking: Downloading external images may trigger "read receipts" (tracking pixels) on the sender's side.

2. Local Proximity Detection

Each fingerprint is split into overlapping bands using LSH (Locality-Sensitive Hashing) techniques.

Guardian checks:

Its local learning database
A locally cached subset of Oracle band data

If sufficient proximity is detected, Guardian may:

Classify the message locally
Flag it as a partial or suspicious match
Escalate to the Oracle for confirmation

3. Oracle Confirmation (When Needed)

Only when proximity thresholds are met, Guardian contacts the Oracle to:

Compute exact distances against known threat clusters
Compare fingerprints against cluster medoids built from confirmed reports
Receive a final verdict

This design ensures that only a small fraction of messages require remote confirmation.

4. Learning and Feedback

Guardian supports learning through reports such as:

User complaints (via IMAP/Sieve integration)
Operator validation
Abuse desk signals

Confirmed reports immediately reinforce local detection and can be shared with the Oracle, contributing to global Mailuminati intelligence.

Architecture Diagram

Incoming Email
      |
      v
+---------------------+
|  Mailuminati        |
|  Guardian (Local)   |
+---------------------+
   |           |
   |           +--------------------+
   |                                |
   v                                v
Local Analysis                  Local Learning
(TLSH + LSH)                (Immediate Effect)
   |
   |  No proximity
   |----------------------------->  ALLOW / LOCAL DECISION
   |
   |  Proximity detected
   v
+---------------------+
|   Mailuminati       |
|   Oracle (Remote)   |
+---------------------+
        |
        v
Shared Intelligence
(Clusters, Medoids,
Community Reports)
        |
        v
   Verdict Returned
        |
        v
Local Enforcement
(Spam / Allow / Flag)

Design Goals

Very low latency — Most decisions made locally without network calls
Immediate learning — Reports take effect instantly
Minimal resources — Low CPU and memory footprint
Privacy-preserving — No raw email content sharing
Resilient — Works even when Oracle is unavailable
Scalable — Suitable for high-volume and small operators alike

Architecture & Ecosystem

Role in the Mailuminati Ecosystem

Guardian is responsible for:

Local spam/phishing analysis of incoming emails
Structural fingerprinting using TLSH
Fast proximity detection via locality-sensitive hashing (LSH)
Immediate application of local learning
Remote confirmation through the Mailuminati Oracle
Enforcing final decisions (allow, spam, proximity match)

It acts as the first line of defense, minimizing latency and resource usage while remaining connected to a broader community-driven detection network.

Deployment Model

Guardian typically runs as:

A local HTTP service on port 12421
A bridge between your MTA and the Mailuminati ecosystem
A containerized service alongside Redis

Your email filtering engine (Rspamd, SpamAssassin, etc.) calls Guardian's /analyze endpoint for each incoming email, then acts on the verdict.

Relationship to Other Components

Guardian performs local detection, learning, and enforcement
Oracle provides shared intelligence and collaborative confirmation

Guardian can operate independently. Its effectiveness increases when connected to the Oracle, where local signals become part of a collective defense.

API Reference

Guardian exposes a simple HTTP API on port 12421.

⚠️ Security Warning

Guardian provides no authentication on its API. It is strongly recommended to:

NOT expose port 12421 to the Internet

Block external access with a firewall

Allow only localhost or your internal network

Base URL

http://<guardian-host>:12421

Endpoints

GET /status

Health and version information endpoint.

Example:

curl -sS http://localhost:12421/status | jq

Response:

{
  "node_id": "6c0a5e16-2b32-4f86-9b3d-2b2e3df5c7d8",
  "current_seq": 0,
  "version": "0.3.2"
}

POST /analyze

Analyzes an email provided as raw RFC822/MIME bytes. Maximum request size: 15 MB.

Request:

curl -sS -X POST \
  -H 'Content-Type: message/rfc822' \
  --data-binary @message.eml \
  http://localhost:12421/analyze | jq

Response:

{
  "action": "allow",
  "proximity_match": false,
  "hashes": [
    "T1A9B0E0F2D3C4B5A6..."
  ]
}

Response Fields:

action: allow | spam
label (optional): e.g., local_spam, oracle_spam
proximity_match: boolean indicating if similar spam was detected
distance (optional): TLSH distance to nearest threat (lower = more similar)
hashes (optional): array of computed TLSH signatures

Notes:

If the email lacks a Message-ID header, Guardian will still analyze it, but /report won't be able to reference it later.
The hashes field contains the computed TLSH fingerprints for the message.

POST /report

Reports a previously scanned email to improve Guardian's learning.

Guardian will:

Apply local learning immediately (spam or ham correction)
Forward the report to the Oracle for shared intelligence

Request:

curl -sS -X POST \
  -H 'Content-Type: application/json' \
  -d '{"message-id":"<your-message-id@example>","report_type":"spam"}' \
  http://localhost:12421/report

Request Body:

{
  "message-id": "<your-message-id@example>",
  "report_type": "spam"
}

Report Types:

spam: Reports a missed spam (false negative)
ham: Reports a false positive (legitimate email incorrectly flagged)

Notes:

Guardian must have previously scanned this email (identified by Message-ID)
Returns 404 Not Found if no scan data exists for this Message-ID
Response is proxied from the Oracle when reachable

GET /metrics

Exposes internal metrics in Prometheus format for monitoring.

Example:

curl -sS http://localhost:12421/metrics

Available Metrics:

mailuminati_guardian_scanned_total: Total emails scanned
mailuminati_guardian_local_match_total: Emails detected using local intelligence
mailuminati_guardian_oracle_match_total: Emails matched via Oracle
mailuminati_guardian_cache_hits_total: Cache hit efficiency

License

This client is open-source software licensed under the GNU GPLv3.

Please note: This license applies strictly to the client-side code contained in this repository.

See the LICENSE file for details.

Name		Name	Last commit message	Last commit date
Latest commit History 218 Commits
.github		.github
configs		configs
docs/assets/images		docs/assets/images
install		install
src/mi_guardian		src/mi_guardian
.env		.env
.gitattributes		.gitattributes
.gitignore		.gitignore
.release-it.json		.release-it.json
CHANGELOG.md		CHANGELOG.md
Dockerfile.Mailuminati-Guardian		Dockerfile.Mailuminati-Guardian
LICENSE		LICENSE
README.md		README.md
docker-compose.yaml		docker-compose.yaml
install.sh		install.sh

Folders and files

Latest commit

History

Repository files navigation

Mailuminati Guardian

Table of Contents

Quick Start

Prerequisites & Requirements

Mandatory

Optional but Recommended

What Guardian Does NOT Require

Installation Options

Configuration

Available Configuration Variables

How Guardian Works

Core Concepts

Analysis Pipeline

1. Local Analysis

2. Local Proximity Detection

3. Oracle Confirmation (When Needed)

4. Learning and Feedback

Architecture Diagram

Design Goals

Architecture & Ecosystem

Role in the Mailuminati Ecosystem

Deployment Model

Relationship to Other Components

API Reference

Base URL

Endpoints

GET /status

POST /analyze

POST /report

GET /metrics

License

About

Resources

License

Code of conduct

Contributing

Security policy

Uh oh!

Stars

Watchers

Forks

Releases 13

Packages 0

Uh oh!

Uh oh!

Contributors

Uh oh!

Languages

Packages