
Changelog

All notable changes to Comprexx are documented in this file.

The format is based on Keep a Changelog, and this project adheres to Semantic Versioning.

0.3.0 - 2026-04-11

Added

  • Benchmarking module (comprexx.benchmark): cx.benchmark() measures real inference latency with configurable warmup/iters, reporting mean, median, std, p50/p90/p99, min/max, and throughput. cx.compare_benchmarks() returns a before/after comparison with speedup and latency/throughput deltas. Quantized models are automatically run on CPU. New comprexx bench CLI command.
  • Example notebooks with cell outputs: ResNet18 edge deployment (fuse, prune, benchmark, ONNX export) and linear layer compression (SVD, weight-only INT4, dynamic INT8).
  • GitHub Actions CI workflow running pytest on Python 3.10, 3.11, 3.12 plus a ruff check lint job.
  • CHANGELOG.md with history for v0.1.0 and v0.2.0.
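The statistics cx.benchmark() reports can be illustrated with plain Python timing. The sketch below is a hypothetical stand-in under the same warmup/iters parameters, not Comprexx's implementation:

```python
import time
import statistics

def benchmark(fn, warmup=5, iters=50):
    """Measure per-call latency of fn, mimicking the stats cx.benchmark() reports."""
    for _ in range(warmup):          # warmup runs are timed but discarded
        fn()
    samples = []
    for _ in range(iters):
        start = time.perf_counter()
        fn()
        samples.append((time.perf_counter() - start) * 1e3)  # milliseconds
    samples.sort()

    def pct(p):                      # nearest-rank percentile over sorted samples
        return samples[min(len(samples) - 1, int(p / 100 * len(samples)))]

    return {
        "mean_ms": statistics.mean(samples),
        "median_ms": statistics.median(samples),
        "std_ms": statistics.stdev(samples),
        "p50_ms": pct(50), "p90_ms": pct(90), "p99_ms": pct(99),
        "min_ms": samples[0], "max_ms": samples[-1],
        "throughput_per_s": 1e3 / statistics.mean(samples),
    }
```

cx.compare_benchmarks() would then reduce two such dicts to speedup and latency/throughput deltas.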

Changed

  • Silenced the "torch.ao.quantization is deprecated" warning emitted by the PTQ dynamic and static stages. The deprecated API is still used, with a TODO marking the planned migration to torchao.quantization.
  • Fixed the package __version__ to report the correct version.
  • Tightened the codebase against ruff check and added a per-file ignore for E741 in tests.
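Suppressing a specific known warning around a call while leaving other warnings visible can be sketched with the standard warnings module; the wrapper name and message pattern here are illustrative, not Comprexx's code:

```python
import warnings

def run_with_silenced_deprecation(fn, *args, **kwargs):
    """Run fn while suppressing a known DeprecationWarning by message pattern."""
    with warnings.catch_warnings():
        # Match only the known deprecation text, so other warnings still surface.
        warnings.filterwarnings(
            "ignore",
            message=".*is deprecated.*",
            category=DeprecationWarning,
        )
        return fn(*args, **kwargs)
```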

0.2.0 - 2026-04-07

Added

  • Unstructured pruning stage: magnitude or random element-wise pruning with global/local scope and optional gradual cubic schedule.
  • N:M sparsity stage: structured N-of-M sparsity (default 2:4) for NVIDIA Ampere sparse tensor cores.
  • Weight-only quantization stage: group-wise INT4/INT8 with symmetric or asymmetric scaling for Linear and Conv2d layers.
  • Low-rank decomposition stage: truncated SVD factorization of Linear layers, with fixed rank-ratio or energy-threshold selection modes.
  • Operator fusion stage: Conv2d + BatchNorm2d folding via torch.fx with graceful fallback on non-traceable models.
  • Weight clustering stage: per-layer k-means codebook clustering.
  • cx.analyze_sensitivity(): per-layer sensitivity probing via prune or noise perturbation. Returns a SensitivityReport that ranks layers by metric drop and can suggest exclude_layers above a threshold.
  • New techniques are wired through the recipe schema and loader, and exported from comprexx.stages.
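The core idea of the N:M sparsity stage can be shown in plain Python: within each contiguous group of M weights, keep the N largest by magnitude and zero the rest. This is a hypothetical sketch over a flat list, not Comprexx's tensor implementation:

```python
def nm_sparsify(weights, n=2, m=4):
    """Apply N:M sparsity: in each group of m consecutive weights, keep the
    n largest-magnitude values and zero the rest (2:4 by default, the pattern
    NVIDIA Ampere sparse tensor cores accelerate)."""
    out = list(weights)
    for start in range(0, len(out) - len(out) % m, m):
        group = out[start:start + m]
        # indices of the n largest-magnitude entries in this group
        keep = sorted(range(m), key=lambda i: abs(group[i]), reverse=True)[:n]
        for i in range(m):
            if i not in keep:
                out[start + i] = 0.0
    return out
```

For example, nm_sparsify([0.1, -0.9, 0.4, 0.05]) keeps only -0.9 and 0.4, yielding [0.0, -0.9, 0.4, 0.0].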

Tests

  • 163 passing (up from 91).

0.1.0 - 2026-04-06

Initial release.

Added

  • Model analysis and profiling via cx.analyze().
  • Structured pruning with L1/L2/random criteria and global/local scope.
  • Post-training dynamic and static INT8 quantization.
  • ONNX export with manifest and optional onnxruntime validation.
  • Recipe-driven pipelines (YAML) validated via Pydantic.
  • CLI commands: comprexx analyze, compress, export.
  • Accuracy guards with halt/warn actions.
  • Per-stage compression reports persisted under comprexx_runs/.
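The halt/warn behavior of the accuracy guards can be sketched generically; the function name, signature, and threshold below are illustrative assumptions, not Comprexx's API:

```python
import warnings

def check_accuracy_guard(baseline, current, max_drop=0.02, action="halt"):
    """Compare accuracy after a compression stage against the baseline.

    If the absolute drop exceeds max_drop, either raise (action='halt')
    or emit a warning and continue (action='warn').
    """
    drop = baseline - current
    if drop > max_drop:
        msg = f"accuracy dropped {drop:.4f} (> {max_drop:.4f} allowed)"
        if action == "halt":
            raise RuntimeError(msg)
        warnings.warn(msg)
        return False
    return True
```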