All notable changes to Comprexx are documented in this file.
The format is based on Keep a Changelog, and this project adheres to Semantic Versioning.
## 0.3.0 - 2026-04-11
- Benchmarking module (`comprexx.benchmark`): `cx.benchmark()` measures real inference latency with configurable warmup/iters, reporting mean, median, std, p50/p90/p99, min/max, and throughput. `cx.compare_benchmarks()` returns a before/after comparison with speedup and latency/throughput deltas. Quantized models are automatically run on CPU. New `comprexx bench` CLI command.
- Example notebooks with cell outputs: ResNet18 edge deployment (fuse, prune, benchmark, ONNX export) and linear-layer compression (SVD, weight-only INT4, dynamic INT8).
- GitHub Actions CI workflow running `pytest` on Python 3.10, 3.11, and 3.12, plus a `ruff check` lint job.
- `CHANGELOG.md` with history for v0.1.0 and v0.2.0.
- Silenced the "`torch.ao.quantization` is deprecated" warning inside the PTQ dynamic and static stages. The underlying API is still used, with a TODO marking the upcoming migration to `torchao.quantization`.
- Fixed the package `__version__` to report the correct version.
- Tightened the codebase against `ruff check` and added a per-file ignore for `E741` in tests.
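The latency statistics that `cx.benchmark()` reports can be sketched in plain Python. This is a minimal illustration of a warmup/iters timing loop with percentile reporting; the function name, keyword arguments, and result keys here are illustrative, not Comprexx's actual signature:

```python
import statistics
import time

def benchmark(fn, *args, warmup=5, iters=50):
    """Time fn(*args) and report latency statistics in milliseconds.

    Illustrative sketch only: warmup iterations are run but not timed,
    then `iters` timed samples feed the summary statistics.
    """
    for _ in range(warmup):  # untimed warmup to stabilize caches/JIT
        fn(*args)
    samples = []
    for _ in range(iters):
        t0 = time.perf_counter()
        fn(*args)
        samples.append((time.perf_counter() - t0) * 1e3)  # milliseconds
    samples.sort()

    def pct(p):  # nearest-rank percentile over the sorted samples
        return samples[min(len(samples) - 1, int(p / 100 * len(samples)))]

    mean = statistics.fmean(samples)
    return {
        "mean_ms": mean,
        "median_ms": statistics.median(samples),
        "std_ms": statistics.stdev(samples),
        "p50_ms": pct(50), "p90_ms": pct(90), "p99_ms": pct(99),
        "min_ms": samples[0], "max_ms": samples[-1],
        "throughput_per_s": 1e3 / mean,  # items/s at batch size 1
    }
```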
## 0.2.0 - 2026-04-07
- Unstructured pruning stage: magnitude or random element-wise pruning with global/local scope and optional gradual cubic schedule.
- N:M sparsity stage: structured N-of-M sparsity (default 2:4) for NVIDIA Ampere sparse tensor cores.
- Weight-only quantization stage: group-wise INT4/INT8 with symmetric or asymmetric scaling for Linear and Conv2d layers.
- Low-rank decomposition stage: truncated SVD factorization of Linear layers, with fixed rank-ratio or energy-threshold selection modes.
- Operator fusion stage: Conv2d + BatchNorm2d folding via `torch.fx`, with a graceful fallback on non-traceable models.
- Weight clustering stage: per-layer k-means codebook clustering.
- `cx.analyze_sensitivity()`: per-layer sensitivity probing via prune or noise perturbation. Returns a `SensitivityReport` that ranks layers by metric drop and can suggest `exclude_layers` above a threshold.
- New techniques are wired through the recipe schema and loader, and exported from `comprexx.stages`.
- 163 passing tests (up from 91).
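The N:M sparsity stage above keeps the N largest-magnitude weights in every contiguous group of M, which is what Ampere sparse tensor cores accelerate. A minimal NumPy sketch of the default 2:4 pattern (the helper name and signature are illustrative, not Comprexx's implementation):

```python
import numpy as np

def nm_sparsify(weight: np.ndarray, n: int = 2, m: int = 4) -> np.ndarray:
    """Zero all but the n largest-magnitude values in each group of m.

    Illustrative sketch: assumes a 2-D weight whose last dimension is
    divisible by m, with groups taken contiguously along that dimension.
    """
    rows, cols = weight.shape
    groups = weight.reshape(rows, cols // m, m)
    # Indices of the (m - n) smallest magnitudes in each group get zeroed.
    drop = np.argsort(np.abs(groups), axis=-1)[..., : m - n]
    mask = np.ones_like(groups, dtype=bool)
    np.put_along_axis(mask, drop, False, axis=-1)
    return (groups * mask).reshape(rows, cols)
```

Every group of four then contains exactly two nonzeros, the hardware-friendly invariant the stage enforces.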
## 0.1.0 - 2026-04-06
Initial release.
- Model analysis and profiling via `cx.analyze()`.
- Structured pruning with L1/L2/random criteria and global/local scope.
- Post-training dynamic and static INT8 quantization.
- ONNX export with manifest and optional
onnxruntimevalidation. - Recipe-driven pipelines (YAML) validated via Pydantic.
- CLI commands: `comprexx analyze`, `compress`, and `export`.
- Accuracy guards with halt/warn actions.
- Per-stage compression reports persisted under `comprexx_runs/`.
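The structured pruning stage in this release ranks whole output channels by a norm criterion. A minimal NumPy sketch of L1-based channel selection (the helper name and `amount` parameter are illustrative, not Comprexx's API):

```python
import numpy as np

def l1_channel_mask(weight: np.ndarray, amount: float = 0.5) -> np.ndarray:
    """Select output channels to keep, pruning those with smallest L1 norm.

    Illustrative sketch: `weight` has shape (out_channels, ...) and
    `amount` is the fraction of channels to remove. Returns a boolean
    mask where True means the channel is kept.
    """
    # L1 norm per output channel, flattening all remaining dimensions.
    norms = np.abs(weight).reshape(weight.shape[0], -1).sum(axis=1)
    n_prune = int(amount * len(norms))
    pruned = np.argsort(norms)[:n_prune]  # channels with smallest L1 norm
    mask = np.ones(len(norms), dtype=bool)
    mask[pruned] = False
    return mask
```

With a "global" scope, the same ranking would instead pool channel norms across layers before choosing what to drop.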