Fuzzing framework for sparse tensor compilers — find crashes and silent miscompilations automatically.
First open-source implementation of TENSURE (NDSS Fuzzing Workshop 2026) — constraint-based fuzzing for sparse tensor compilers.
Sparse tensor compilers (TACO, Finch, etc.) are critical infrastructure for scientific computing, machine learning, and data analytics. But they have serious bugs:
- TACO: ~65% defect rate on valid einsum operations (crashes + wrong results)
- Finch: Crashes on valid inputs
- Traditional grammar-based fuzzers achieve only 3.3% valid test generation for tensor expressions
TENSURE solves this with a constraint-based approach that guarantees 100% semantically valid test generation, combined with metamorphic testing to catch silent miscompilations.
Source: "TENSURE: Fuzzing Sparse Tensor Compilers" — NDSS Fuzzing Workshop 2026
```
[1] Constraint-Based Einsum Generator
    Generate random but guaranteed-valid Einstein summation expressions
    (100% validity vs 3.3% for grammar-based fuzzers)
         |
         v
[2] Sparse Tensor Data Generator
    Create tensors with controlled sparsity patterns
    (Dense, CSR, CSC, COO formats)
         |
         v
[3] Metamorphic Mutation
    Apply semantics-preserving transformations:
    - Operand permutation (algebraic commutativity)
    - Storage format heterogeneity (same math, different code paths)
         |
         v
[4] Multi-Backend Execution
    Run on the reference backend (NumPy) and the target backends
    (pydata/sparse, opt_einsum, TACO, Finch)
         |
         v
[5] Oracle Comparison
    Detect crashes, wrong results, NaN/Inf, and shape mismatches,
    with configurable floating-point tolerance
         |
         v
[6] Bug Report
    Classify bugs by type and severity;
    console output + JSON report
```
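The six stages compress into a short loop. The sketch below is illustrative only — a fixed expression and NumPy stand in for the generators and backends, and none of the names are TENSURE's actual API:

```python
import numpy as np

def fuzz_once(rng):
    """One pipeline iteration, compressed into a sketch (hypothetical names)."""
    # [1]-[2] a fixed valid expression + random dense data stand in for the
    # constraint-based generator and the sparse tensor generator.
    A, B = rng.random((3, 4)), rng.random((4, 5))
    # [3] one metamorphic mutant: permute the operands (semantics-preserving).
    variants = [("ij,jk->ik", (A, B)), ("jk,ij->ik", (B, A))]
    # [4] execute every variant (here: all on the reference backend).
    results = [np.einsum(e, *ops) for e, ops in variants]
    # [5] oracle: all variants must agree within floating-point tolerance.
    ok = all(np.allclose(results[0], r) for r in results[1:])
    # [6] a disagreement would be reported as a wrong-result bug.
    return ok

assert fuzz_once(np.random.default_rng(0))
```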
Constraint-Based Generation: Einsum expressions are context-sensitive (not context-free), so grammar-based fuzzers fail catastrophically. TENSURE treats generation as a constraint satisfaction problem:
- Every contraction index must appear in >= 2 input tensors
- Output indices must be a subset of input indices
- Dimension sizes must be consistent across all tensors sharing an index
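A minimal sketch of what a constraint-driven generator looks like — illustrative only, not the paper's Algorithm 3 or this repo's `generator.py`. Because every expression is built to satisfy the three constraints, `np.einsum` accepts all of them:

```python
import random
import numpy as np

def generate_einsum(rng, max_operands=3, max_rank=3, max_dim=5):
    """Generate a random einsum expression that satisfies the validity rules."""
    n_ops = rng.randint(2, max_operands)
    # Fix one size per index letter up front, so every tensor sharing an
    # index uses the same dimension (constraint 3).
    letters = "abcdefgh"
    dims = {c: rng.randint(1, max_dim) for c in letters}

    # Each operand gets a random set of distinct index letters.
    operands = [rng.sample(letters, rng.randint(1, max_rank))
                for _ in range(n_ops)]

    counts = {}
    for idx in operands:
        for c in idx:
            counts[c] = counts.get(c, 0) + 1

    # Indices appearing in >= 2 operands may be contracted (constraint 1);
    # indices appearing once must stay in the output. The output is always
    # a subset of the input indices (constraint 2).
    shared = [c for c, n in counts.items() if n >= 2]
    free = [c for c, n in counts.items() if n == 1]
    out = free + [c for c in shared if rng.random() < 0.3]  # keep some as batch indices

    expr = ",".join("".join(i) for i in operands) + "->" + "".join(sorted(out))
    shapes = [tuple(dims[c] for c in i) for i in operands]
    return expr, shapes

rng = random.Random(42)
for _ in range(100):
    expr, shapes = generate_einsum(rng)
    np.einsum(expr, *[np.ones(s) for s in shapes])  # never raises: 100% valid
```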
Metamorphic Testing: Instead of requiring a reference compiler, TENSURE uses algebraic properties as test oracles:
- Operand permutation: `einsum("ij,jk->ik", A, B)` must equal `einsum("jk,ij->ik", B, A)`
- Format equivalence: the same computation in CSR vs CSC vs COO vs Dense must produce identical results
```bash
pip install numpy click rich   # Core dependencies
pip install sparse             # Optional: pydata/sparse backend
pip install opt_einsum         # Optional: optimized contraction backend
```

```bash
# Basic: 1000 iterations with NumPy self-testing
tensure fuzz

# With a seed for reproducibility
tensure fuzz -n 5000 --seed 42

# Test the pydata/sparse backend against NumPy
tensure fuzz -n 2000 -b sparse -o report.json

# Custom parameters
tensure fuzz \
  -n 10000 \
  --max-operands 5 \
  --max-rank 3 \
  --max-dim 8 \
  --mutations 10 \
  --seed 42 \
  -o report.json
```

```bash
# Matrix multiplication
tensure test "ij,jk->ik" -s "3x4,4x5"

# 3D tensor contraction
tensure test "ijk,jkl->il" -s "2x3x4,3x4x5"

# With the sparse backend
tensure test "ij,jk->ik" -s "10x20,20x15" -b sparse -d 0.1

# Show available backends
tensure info
```

```python
from tensure.fuzzer import Fuzzer
from tensure.models import FuzzConfig

# Configure and run
config = FuzzConfig(
    num_iterations=5000,
    max_operands=4,
    max_rank=3,
    seed=42,
)
fuzzer = Fuzzer(config)
stats = fuzzer.run()

print(f"Found {stats.total_bugs} bugs in {stats.duration_seconds:.1f}s")
print(f"  Crashes: {stats.crashes}")
print(f"  Wrong results: {stats.wrong_results}")
print(f"  Defect rate: {stats.defect_rate:.1%}")

for bug in stats.bugs:
    print(f"  [{bug.bug_id}] {bug.bug_type.value}: {bug.description}")
```

```
╭───────── TENSURE: 3 bug(s) found ─────────╮
│ Iterations:          1000                 │
│ Expressions:         1000                 │
│ Mutations:           4523                 │
│ Total Executions:   10046                 │
│ Duration:           12.3s                 │
│                                           │
│ Crashes:                2                 │
│ Wrong Results:          1                 │
│ Timeouts:               0                 │
│ Rejections:             0                 │
│                                           │
│ Total Bugs:             3                 │
│ Defect Rate:        0.03%                 │
╰───────────────────────────────────────────╯

┌─────────┬──────────────┬──────────┬─────────┬────────────────┬───────────────────┐
│ ID      │ Type         │ Severity │ Backend │ Einsum         │ Description       │
├─────────┼──────────────┼──────────┼─────────┼────────────────┼───────────────────┤
│ BUG-A1B │ Crash        │ CRITICAL │ sparse  │ abc,bcd->ad    │ IndexError: ...   │
│ BUG-C3D │ Wrong Result │ HIGH     │ sparse  │ ij,jk->ik      │ max diff = 1.2e-2 │
│ BUG-E5F │ Crash        │ CRITICAL │ sparse  │ ijkl,jl->ik    │ ValueError: ...   │
└─────────┴──────────────┴──────────┴─────────┴────────────────┴───────────────────┘
```
| Backend | Package | Status | Description |
|---|---|---|---|
| NumPy | `numpy` | Always available | Reference oracle (ground truth) |
| pydata/sparse | `sparse` | Optional | N-dimensional sparse arrays (COO, GCXS) |
| opt_einsum | `opt_einsum` | Optional | Optimized contraction path finding |
| TACO | `tensora` | Optional | Tensor Algebra Compiler (C++ backend) |

NumPy is always the reference oracle. When no external backends are specified, TENSURE tests `np.einsum(optimize=True)` against `np.einsum(optimize=False)` as a self-consistency check.
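That self-consistency check is easy to reproduce standalone. A sketch (`self_consistency` is a hypothetical name, not TENSURE's API):

```python
import numpy as np

def self_consistency(expr, arrays, rtol=1e-5, atol=1e-8):
    """NumPy-only oracle: the optimized contraction path must agree with
    the naive left-to-right evaluation. A mismatch flags a wrong result."""
    naive = np.einsum(expr, *arrays, optimize=False)
    fast = np.einsum(expr, *arrays, optimize=True)
    return np.allclose(naive, fast, rtol=rtol, atol=atol)

rng = np.random.default_rng(1)
ops = [rng.random((4, 5)), rng.random((5, 6)), rng.random((6, 3))]
assert self_consistency("ij,jk,kl->il", ops)
```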
| Mutation | Metamorphic Relation | What It Tests |
|---|---|---|
| Operand Permutation | AB = BA (element-wise commutativity) | Different iteration schedules |
| Format Heterogeneity | CSR(A)*CSC(B) = Dense(A)*Dense(B) | Format-specific code generation |
Both mutations are semantics-preserving — any output difference is a confirmed compiler bug.
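The format-heterogeneity relation can be illustrated with SciPy's sparse matrices — an assumption for illustration only, since TENSURE's own backends use pydata/sparse and TACO, not SciPy:

```python
import numpy as np
from scipy import sparse

# Same math, different storage formats and code paths: a CSR x CSC product
# must agree with the dense product.
rng = np.random.default_rng(2)
A = rng.random((6, 8)); A[A < 0.7] = 0.0   # roughly 70% sparse
B = rng.random((8, 4)); B[B < 0.7] = 0.0

dense = A @ B
mixed = (sparse.csr_matrix(A) @ sparse.csc_matrix(B)).toarray()
assert np.allclose(dense, mixed)
```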
| Option | Default | Description |
|---|---|---|
| `--iterations` / `-n` | 1000 | Number of fuzzing iterations |
| `--max-operands` | 4 | Max input tensors per expression |
| `--max-rank` | 4 | Max tensor dimensions |
| `--max-dim` | 10 | Max size per dimension |
| `--mutations` | 5 | Mutations per expression |
| `--timeout` | 30.0 | Per-execution timeout (seconds) |
| `--seed` | random | Random seed for reproducibility |
| `--backend` / `-b` | numpy | Target backend(s) |
| `--rtol` | 1e-5 | Relative tolerance |
| `--atol` | 1e-8 | Absolute tolerance |
| `--output-json` / `-o` | None | Save JSON report |
```
tensure/
├── src/tensure/
│   ├── models.py                 # 12 core data models (EinsumExpr, Bug, FuzzConfig, etc.)
│   ├── generator.py              # Constraint-based einsum generator (Algorithm 3)
│   ├── tensor_gen.py             # Sparse tensor data generator
│   ├── mutator.py                # Metamorphic mutation operators
│   ├── oracle.py                 # Comparison oracle (crash, wrong result, NaN/Inf)
│   ├── reducer.py                # Test case minimization (delta debugging)
│   ├── fuzzer.py                 # Main fuzzing engine (orchestrator)
│   ├── reporter.py               # Console + JSON reporting
│   ├── cli.py                    # Click CLI (fuzz, test, info)
│   └── backends/
│       ├── base.py               # Abstract backend interface
│       ├── numpy_backend.py      # NumPy reference (always available)
│       ├── sparse_backend.py     # pydata/sparse backend
│       └── opt_einsum_backend.py # opt_einsum backend
└── tests/                        # 127 tests across 8 test files
```
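The delta-debugging idea behind `reducer.py` boils down to the classic ddmin loop. This is a simplified, complement-only sketch (hypothetical, not the repo's actual implementation), shown shrinking a failing operand list:

```python
def ddmin(operands, still_fails):
    """Shrink `operands` while `still_fails` keeps reproducing the bug."""
    n = 2  # current number of chunks
    while len(operands) >= 2:
        chunk = max(1, len(operands) // n)
        subsets = [operands[i:i + chunk] for i in range(0, len(operands), chunk)]
        reduced = False
        for s in subsets:
            # Try removing one chunk; keep the complement if it still fails.
            complement = [x for x in operands if x not in s]
            if complement and still_fails(complement):
                operands, n = complement, max(n - 1, 2)
                reduced = True
                break
        if not reduced:
            if n >= len(operands):
                break          # already at single-element granularity
            n = min(len(operands), n * 2)  # refine granularity and retry
    return operands

# Toy failure: the bug reproduces whenever operand "C" is present.
assert ddmin(list("ABCDE"), lambda ops: "C" in ops) == ["C"]
```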
This project implements the core concepts from:
"TENSURE: Fuzzing Sparse Tensor Compilers" (NDSS Fuzzing Workshop 2026)
- Paper: NDSS Symposium
- Authors: Kabilan Mahathevan, Yining Zhang, Muhammad Ali Gulzar, Kirshanthan Sundararajah (Virginia Tech)
Key findings:
- TACO: 14,400 crash bugs + 5,758 wrong-code bugs in 6 hours (~65% defect rate)
- Finch: 57 crash bugs in 6 hours
- 100% valid test generation (vs 3.3% for grammar-based fuzzers like Grammarinator)
- Einsum expressions are context-sensitive — grammar-based approaches fundamentally cannot handle them
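The context-sensitivity point is easy to demonstrate: an expression can be grammatically well-formed yet semantically invalid, and no context-free grammar can enforce the cross-operand constraint that rules it out:

```python
import numpy as np

# "ij,jk->il" parses fine as einsum syntax, but the output index "l" never
# appears in any input -- a context-sensitive constraint, so NumPy rejects it.
try:
    np.einsum("ij,jk->il", np.ones((2, 3)), np.ones((3, 4)))
    raised = False
except ValueError:
    raised = True
assert raised
```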
This is an independent open-source implementation not affiliated with the original authors.
- Backend coverage: Currently tests NumPy, pydata/sparse, and opt_einsum. Native TACO/Finch integration requires additional setup.
- Floating-point tolerance: Some false positives may occur for numerically unstable expressions. Adjust `--rtol` and `--atol` as needed.
- Test case reduction: The delta-debugging reducer is basic. Advanced minimization (C-Reduce style) is planned.
- Performance: Python-based execution. The original paper used compiled C/Julia kernels for faster throughput.
This is the first open-source implementation of the TENSURE paper. We'd love your feedback!
- Found a real compiler bug? Please open an issue with the bug report JSON — we'll help you file it upstream.
- False positive? Share the expression and tolerance settings so we can improve the oracle.
- New backend? PRs welcome for PyTorch, TensorFlow, JAX, or other tensor libraries.
- Feature request? Open an issue describing your use case.
MIT License. See LICENSE for details.