
AnomaVision 🔍

High-performance visual anomaly detection. Fast, lightweight, production-ready.

AnomaVision detects defects without ever seeing defective examples during training.



Live Demo · Docs · Quickstart · Models · Tasks · Integrations · Issues · Discussions


🤗 Live Demo

Try AnomaVision instantly — no installation required.


The live demo runs a PaDiM model trained on MVTec bottle images and shows:

  • 🌡️ Anomaly Heatmap — spatial score map highlighting defect regions
  • 🖼️ Overlay — original image with anomaly contours drawn
  • 🎭 Predicted Mask — binary segmentation of detected defects
  • Real-time inference — results in milliseconds on CPU

Upload your own bottle image or pick from the provided samples to see anomaly detection in action.


What is AnomaVision?

AnomaVision delivers visual anomaly detection optimized for production deployment. Based on PaDiM, it learns the distribution of normal images in a single forward pass — no labels, no segmentation masks, no lengthy training loops.

The result: a 15 MB model that runs at 43 FPS on CPU and 547 FPS on GPU, with higher image and pixel AUROC than the Anomalib PaDiM baseline.


🚀 Quickstart

Install

⚠️ torch is hardware-specific. A plain pip install anomavision skips PyTorch entirely. Always install with an [extra] to get the right binaries for your hardware.

Don't have uv? Install it first — it's faster than pip and handles PyTorch's hardware routing correctly:

pip install uv

Option A — From Source (development)

git clone https://github.com/DeepKnowledge1/AnomaVision.git
cd AnomaVision

# Create and activate a virtual environment
uv venv --python 3.11 .venv
source .venv/bin/activate        # Windows: .venv\Scripts\Activate.ps1

# Install with your hardware extra
uv sync --extra cpu              # CPU
uv sync --extra cu121            # CUDA 12.1

Or install from requirements.txt directly:

uv pip install -r requirements.txt

Option B — From PyPI (production / quick start)

# CPU  ·  Mac, CI runners, edge devices
uv pip install "anomavision[cpu]"

# NVIDIA GPU  ·  pick your CUDA version
uv pip install "anomavision[cu118]"   # CUDA 11.8
uv pip install "anomavision[cu121]"   # CUDA 12.1
uv pip install "anomavision[cu124]"   # CUDA 12.4

Option C — Already installed without extras?

If you're seeing ModuleNotFoundError: No module named 'torch', add PyTorch into your current environment:

# CPU
uv pip install torch torchvision torchaudio

# GPU (CUDA 12.1)
uv pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu121

Verify

python -c "import anomavision, torch; print('✅ Ready —', torch.__version__)"

CLI

AnomaVision ships a unified anomavision command — no need to run individual scripts directly.

# Train
anomavision train --config config.yml

# Detect (images or folder)
anomavision detect --config config.yml --img_path ./test_images --thresh 13.0

# Evaluate on MVTec
anomavision eval --config config.yml --enable_visualization

# Export to ONNX / TorchScript / OpenVINO / all
anomavision export --config config.yml --model model.pt --format all --precision fp16

Every command has full --help:

anomavision --help
anomavision train --help
anomavision export --help

🐍 Python API

Use the Python API when you want to embed AnomaVision into a larger pipeline, run it inside a notebook, or integrate it with your own data loading logic.

import torch
import anomavision
from torch.utils.data import DataLoader

# --- 1. Dataset ---
# AnodetDataset reads normal images from a folder and applies preprocessing.
# You only need train/good/ — no labels, no anomalous images required.
dataset = anomavision.AnodetDataset(
    image_directory_path="./dataset/bottle/train/good",
    resize=(224, 224),     # resize before crop
    crop_size=(224, 224),  # center crop to this size
    normalize=True,        # ImageNet mean/std normalization
)
loader = DataLoader(dataset, batch_size=16)

# --- 2. Train ---
# fit() does a single forward pass — no gradient updates, no epochs.
# Fits a multivariate Gaussian at each spatial position of the feature map.
# Typical training time: under 10 seconds on CPU for ~200 images.
model = anomavision.Padim(
    backbone="resnet18",  # or "wide_resnet50" for higher accuracy
    device="cpu",         # or "cuda"
    feat_dim=100,         # number of random feature dimensions to keep
)
model.fit(loader)

# --- 3. Save ---
torch.save(model, "model.pt")                  # full model (for export or further use)
model.save_statistics("model.pth", half=True)  # stats-only (smaller, faster to load)

# --- 4. Infer ---
# scores: (batch_size,)      — scalar anomaly score per image. Higher = more anomalous.
# maps:   (batch_size, H, W) — spatial heatmap showing *where* the anomaly is.
#                              Pass directly to matplotlib.imshow() to visualize.
batch, *_ = next(iter(loader))
scores, maps = model.predict(batch)

See the FAQ for how to pick a classification threshold from scores.
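As the comments above note, maps can go straight to matplotlib. A minimal NumPy sketch of turning predict() outputs into displayable artifacts — random data stands in for a real prediction here, and the threshold value is purely illustrative:

```python
import numpy as np

rng = np.random.default_rng(0)
# Stand-in for `scores, maps = model.predict(batch)` with batch_size=2
maps = rng.random((2, 224, 224)) * 20
thresh = 13.0  # illustrative; see the FAQ for choosing a real value

for i, m in enumerate(maps):
    # Normalize to 0-1 so imshow() renders a comparable heatmap per image
    heat = (m - m.min()) / (m.max() - m.min() + 1e-8)
    # Binary defect mask, like the one shown in the live demo
    mask = (m >= thresh).astype(np.uint8)
    print(f"image {i}: max score {m.max():.1f}, "
          f"{mask.mean():.1%} of pixels above threshold")
```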

🌐 REST API

Use the REST API when you want to integrate AnomaVision into an existing service, call it from any language, or expose it on a network without installing Python on the client.

First, start the FastAPI server (keep this terminal open):

uvicorn apps.api.fastapi_app:app --host 0.0.0.0 --port 8000

Then send images from any client:

import requests

with open("image.jpg", "rb") as f:
    r = requests.post("http://localhost:8000/predict", files={"file": f})

print(r.json()["anomaly_score"])   # e.g. 14.3
print(r.json()["is_anomaly"])      # True / False

Full docs at http://localhost:8000/docs once the server is running.


📊 Models & Performance

MVTec AD — Average over 15 Classes

| Model | Image AUROC ↑ | Pixel AUROC ↑ | CPU FPS ↑ | GPU FPS ↑ | Size ↓ |
|---|---|---|---|---|---|
| AnomaVision (resnet18) | 0.850 | 0.956 | 43.4 | 547 | 15 MB |
| Anomalib PaDiM (baseline) | 0.810 | 0.935 | 13.0 | 356 | 40 MB |
| Δ | +4.9% | +2.2% | +233% | +54% | −62% |

CPU: Intel Core i9 (single process). GPU: NVIDIA A100. Batch size 1. Reproduce: anomavision eval --config config.yml

VisA — Average over 12 Classes

| Model | Image AUROC ↑ | Pixel AUROC ↑ | CPU FPS ↑ |
|---|---|---|---|
| AnomaVision | 0.812 | 0.962 | 44.8 |
| Anomalib PaDiM | 0.783 | 0.954 | 13.5 |

📋 Per-class MVTec breakdown (AV = AnomaVision, AL = Anomalib PaDiM)

| Class | AV Image AUROC | AL Image AUROC | AV Pixel AUROC | AL Pixel AUROC | AV FPS |
|---|---|---|---|---|---|
| bottle | 0.997 | 0.996 | 0.984 | 0.987 | 42.2 |
| cable | 0.772 | 0.742 | 0.936 | 0.935 | 36.1 |
| capsule | 0.839 | 0.846 | 0.929 | 0.977 | 40.2 |
| carpet | 0.908 | 0.594 | 0.971 | 0.987 | 44.0 |
| grid | 0.881 | 0.832 | 0.964 | 0.965 | 41.3 |
| hazelnut | 0.984 | 0.949 | 0.978 | 0.974 | 29.0 |
| leather | 0.985 | 0.879 | 0.985 | 0.982 | 48.7 |
| metal_nut | 0.940 | 0.878 | 0.963 | 0.963 | 41.4 |
| pill | 0.793 | 0.773 | 0.957 | 0.964 | 45.4 |
| screw | 0.941 | 0.787 | 0.970 | 0.982 | 42.4 |
| tile | 0.851 | 0.876 | 0.969 | 0.971 | 46.0 |
| toothbrush | 0.978 | 0.883 | 0.993 | 0.989 | 44.8 |
| transistor | 0.800 | 0.853 | 0.968 | 0.962 | 42.2 |
| wood | 0.986 | 0.915 | 0.973 | 0.975 | 45.3 |
| zipper | 0.914 | 0.979 | 0.972 | 0.971 | 41.0 |

🎯 Tasks & Modes

Task coverage across the train / detect / eval / export / stream / REST modes:

  • Anomaly Detection — image-level score
  • Anomaly Localization — pixel-level map
  • Normal / Anomalous Classification

Export Formats

| Format | Flag | Quantization |
|---|---|---|
| PyTorch .pt | pt | — |
| ONNX .onnx | onnx | INT8 dynamic / static |
| TorchScript .torchscript | torchscript | — |
| OpenVINO (dir) | openvino | FP16 |
| TensorRT .engine | engine | FP16 |

The exported ONNX model is also what the C++ ONNX Runtime deployment consumes — see docs/cpp/.

anomavision export \
  --model_data_path ./distributions/anomav_exp \
  --model model.pt \
  --format onnx \
  --precision fp16 \
  --quantize-dynamic

📺 Streaming Sources

Run inference on live sources without changing your model or code:

| Source | stream_source.type | Use case |
|---|---|---|
| Webcam | webcam | Lab / demo |
| Video file | video | Offline replay |
| MQTT | mqtt | Industrial IoT cameras |
| TCP socket | tcp | High-throughput line scanners |

# stream_config.yml
stream_mode: true
stream_source:
  type: webcam
  camera_id: 0
model: model.onnx
thresh: 13.0
enable_visualization: true

anomavision detect --config stream_config.yml

⚙️ Configuration

All scripts accept --config config.yml and CLI overrides. CLI always wins.

# Minimal working config.yml
dataset_path:    ./dataset
class_name:      bottle

resize:          [256, 192]
crop_size:       [224, 224]
normalize:       true
norm_mean:       [0.485, 0.456, 0.406]
norm_std:        [0.229, 0.224, 0.225]

backbone:        resnet18
batch_size:      16
feat_dim:        100
layer_indices:   [0, 1, 2]
output_model:    model.pt
run_name:        exp1
model_data_path: ./distributions/anomav_exp

model:           model.onnx
device:          auto        # auto | cpu | cuda
thresh:          13.0

log_level:       INFO

Full key reference: docs/config.md

🔌 Integrations
| Integration | Description |
|---|---|
| FastAPI | REST API — /predict, /predict/batch, Swagger UI at /docs |
| Streamlit | Browser demo — heatmap overlay, threshold slider, batch upload |
| Gradio | Live HuggingFace Space — try it instantly |
| C++ Runtime | ONNX + OpenCV, no Python required — see docs/cpp/ |
| OpenVINO | Intel CPU/VPU edge optimization |
| TensorRT | NVIDIA GPU maximum throughput |
| INT8 Quantization | Dynamic + static INT8 via ONNX Runtime |

# Terminal 1 — backend
uvicorn apps.api.fastapi_app:app --host 0.0.0.0 --port 8000

# Terminal 2 — UI
streamlit run apps/ui/streamlit_app.py -- --port 8000

Open http://localhost:8501

📂 Dataset Format

AnomaVision uses MVTec AD layout. Custom datasets work with the same structure:

dataset/
└── <class_name>/
    ├── train/
    │   └── good/          ← normal images only (no anomalies needed)
    └── test/
        ├── good/          ← normal test images
        └── <defect_name>/ ← anomalous test images (any subfolder name)
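Before training, it can be worth sanity-checking a folder against this layout. A small sketch — check_layout is an illustrative helper written for this README, not part of AnomaVision:

```python
from pathlib import Path

def check_layout(root: str, class_name: str) -> list[str]:
    """Return a list of problems with the dataset layout (empty = OK)."""
    problems = []
    base = Path(root) / class_name
    good = base / "train" / "good"
    if not good.is_dir():
        problems.append(f"missing {good} (normal training images)")
    elif not any(good.iterdir()):
        problems.append(f"{good} is empty")
    if not (base / "test").is_dir():
        problems.append(f"missing {base / 'test'} (optional, needed for eval only)")
    return problems

# Example:
# for p in check_layout("./dataset", "bottle"):
#     print("⚠", p)
```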

🏗️ Architecture

AnomaVision architecture

Key design decisions:

PaDiM needs no gradient training. Features are extracted once with a frozen ResNet. The model fits a multivariate Gaussian at each spatial location — training is a matrix decomposition, not backprop. That's why it finishes in ~8 seconds.
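The single-pass fit can be sketched in plain NumPy — shapes, the ridge term, and variable names below are illustrative, not AnomaVision's internals:

```python
# Fit a Gaussian (mean + covariance) at every spatial location of a feature
# map, then score test features by Mahalanobis distance. No backprop involved.
import numpy as np

rng = np.random.default_rng(0)
N, C, H, W = 50, 16, 8, 8               # images, channels, spatial size
feats = rng.normal(size=(N, C, H, W))   # stand-in for frozen-ResNet features

# "Training" = one pass of statistics over the normal images:
mean = feats.mean(axis=0)               # (C, H, W)
centered = feats - mean
# Per-location covariance, with a small ridge so it is invertible
cov = np.einsum("nchw,ndhw->cdhw", centered, centered) / (N - 1)
cov += 0.01 * np.eye(C)[:, :, None, None]
cov_inv = np.linalg.inv(cov.transpose(2, 3, 0, 1))   # (H, W, C, C)

def score_map(x):
    """Mahalanobis distance of one feature map x (C, H, W) per location."""
    d = (x - mean).transpose(1, 2, 0)                 # (H, W, C)
    return np.sqrt(np.einsum("hwc,hwcd,hwd->hw", d, cov_inv, d))

normal_scores = score_map(feats[0])
print(normal_scores.shape)  # (8, 8) anomaly map
```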

ModelWrapper makes the backend transparent. The same predict(batch) → (scores, maps) call works whether you loaded .pt, .onnx, .engine, or an OpenVINO directory. Every downstream caller — CLI, FastAPI, Streamlit, eval loop — uses the same interface.

Adaptive Gaussian post-processing is applied to score maps after inference. The kernel is sized relative to the image resolution, which is a key factor behind the Pixel AUROC gain over baseline.
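As an illustration of resolution-relative smoothing — the scaling rule and function names below are guesses for the sketch, not the library's exact formula:

```python
# Smooth a score map with a Gaussian kernel whose width scales with the
# input resolution, so post-processing behaves consistently across sizes.
import numpy as np

def gaussian_kernel1d(sigma: float) -> np.ndarray:
    radius = int(3 * sigma)
    x = np.arange(-radius, radius + 1)
    k = np.exp(-x**2 / (2 * sigma**2))
    return k / k.sum()                  # normalized: preserves overall level

def smooth_score_map(score_map: np.ndarray, image_size: int) -> np.ndarray:
    sigma = 4.0 * image_size / 224      # illustrative: kernel grows with size
    k = gaussian_kernel1d(sigma)
    pad = len(k) // 2
    out = np.pad(score_map, pad, mode="edge")
    # Separable filter: convolve rows, then columns
    out = np.apply_along_axis(lambda r: np.convolve(r, k, mode="valid"), 1, out)
    out = np.apply_along_axis(lambda c: np.convolve(c, k, mode="valid"), 0, out)
    return out
```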


🛠️ Development

# Clone and create environment
git clone https://github.com/DeepKnowledge1/AnomaVision.git
cd AnomaVision

uv venv --python 3.11 .venv
source .venv/bin/activate        # Windows: .venv\Scripts\Activate.ps1

# Install with dev dependencies
uv sync --extra cpu              # or --extra cu121 for GPU

# Install the package in editable mode
uv pip install -e .

# Verify CLI is working
anomavision --help

# Test
pytest tests/

# Format + lint
black . && isort . && flake8 .

Commit convention:

feat(export):  add TensorRT INT8 calibration
fix(detect):   handle empty directories
docs(api):     improve ModelWrapper examples

Types: feat · fix · docs · refactor · test · chore

PRs must pass pytest + flake8 and include doc updates if behavior changes. See docs/contributing.md.


🚢 Deploy

Docker

FROM python:3.11-slim

RUN apt-get update && apt-get install -y \
    git git-lfs libsm6 libxext6 libgl1 libglib2.0-0 \
    && rm -rf /var/lib/apt/lists/*

RUN useradd -m -u 1000 user

RUN pip install --upgrade pip setuptools wheel && \
    pip install --no-cache-dir uv && \
    uv pip install --system "anomavision[cpu]"

USER user
ENV PATH="/home/user/.local/bin:/usr/local/bin:$PATH"

WORKDIR /home/user/app
COPY --chown=user . .

EXPOSE 7860
CMD ["python", "app.py"]

docker build -t anomavision .
docker run -p 7860:7860 -v $(pwd)/distributions:/home/user/app/distributions anomavision

Production (Gunicorn + Uvicorn)
gunicorn apps.api.fastapi_app:app \
  --workers 4 \
  --worker-class uvicorn.workers.UvicornWorker \
  --bind 0.0.0.0:8000 \
  --timeout 120

Production tip: Serve ONNX or TensorRT models — .pt inference is 2–3× slower than ONNX Runtime at batch size 1.
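To verify the speedup on your own hardware, a minimal timing harness — the lambda below is a stand-in for whichever backend's predict call you load:

```python
# Median-latency benchmark: warm up first, then time repeated predictions.
import time
import statistics

def benchmark(predict, batch, warmup=5, runs=50):
    for _ in range(warmup):             # warm caches / lazy initialization
        predict(batch)
    times = []
    for _ in range(runs):
        t0 = time.perf_counter()
        predict(batch)
        times.append((time.perf_counter() - t0) * 1e3)
    return statistics.median(times)     # median ms per batch

# Example with a dummy workload in place of a real model:
ms = benchmark(lambda b: sum(b), list(range(1000)))
print(f"{ms:.3f} ms / batch")
```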


❓ FAQ

Training is slow on CPU

Lower resize (e.g. [128, 128]), reduce batch_size, or use --device cuda. PaDiM training is a single forward pass — it should finish in under 30 s for most datasets even on CPU.

All anomaly scores are low / nothing detected

Run anomavision eval --config config.yml first to see the score distribution histogram. Set --thresh just above the peak of the normal score distribution. Typical values: 10–20 for ResNet18 with default preprocessing.
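One common heuristic — not a fixed rule — is to take a high percentile of the scores that normal validation images receive. A sketch with made-up scores:

```python
import numpy as np

# Anomaly scores of *normal* validation images (illustrative values)
normal_scores = np.array([9.8, 10.2, 11.0, 10.5, 9.9, 12.1, 10.8])

# 99th percentile: almost all normal images fall below this value
thresh = float(np.percentile(normal_scores, 99))
print(round(thresh, 2))
# Pass this as --thresh; images scoring above it are flagged anomalous.
```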

RuntimeError: Input size mismatch during inference

Your resize / crop_size must match what was used at training time. Load the config saved alongside the model: --config ./distributions/anomav_exp/exp1/config.yml.

CUDA version mismatch
pip install torch torchvision --index-url https://download.pytorch.org/whl/cu121

Replace cu121 with your actual CUDA version (cu118, cu124, etc.).

Unsupported operator during ONNX export

Try --opset 16. If it still fails, use --format torchscript — TorchScript has no ONNX operator constraints.

Can I use my own dataset without MVTec structure?

Yes. Put your normal training images in <any_root>/train/good/. For evaluation, add test images under <root>/test/<defect_name>/. No anomalous images are needed at training time.

More: docs/troubleshooting.md


🗺️ Roadmap

  • Pre-trained model zoo for all 15 MVTec classes
  • Few-shot adaptation (5–10 anomalous examples)
  • Native TensorRT export in export.py
  • Pixel-level mask in REST /predict response
  • ONNX Runtime Web (browser inference via WASM)
  • Helm chart for Kubernetes deployment

Request a feature →


📚 Documentation

| Doc | Description |
|---|---|
| Quick Start | Train → detect → eval → export in 5 minutes |
| CLI Reference | All arguments for all anomavision subcommands |
| Python API | Library usage and class reference |
| Config Guide | Every YAML key explained |
| Benchmarks | Full per-class results vs Anomalib |
| FastAPI Backend | REST API setup and endpoints |
| C++ Inference | Deploy without Python |
| Troubleshooting | Common issues and fixes |
| Contributing | Development workflow |

💬 Community


Citation

@software{anomavision2025,
  title   = {AnomaVision: Edge-Ready Visual Anomaly Detection},
  author  = {DeepKnowledge Contributors},
  year    = {2025},
  url     = {https://github.com/DeepKnowledge1/AnomaVision},
}

License

Released under the MIT License. Built on Anodet — thanks to the original authors.