Releases: cachevector/comprexx
## v0.3.0

### Added
- Benchmarking module (`comprexx.benchmark`): `cx.benchmark()` measures inference latency with warmup, percentiles (p50/p90/p99), and throughput. `cx.compare_benchmarks()` returns before/after comparisons with speedup deltas. New `comprexx bench` CLI command.
- Example notebooks with cell outputs: ResNet18 edge deployment (fuse, prune, benchmark, ONNX export) and linear layer compression (SVD, weight-only INT4, dynamic INT8).
- GitHub Actions CI: pytest on Python 3.10/3.11/3.12 + ruff lint.
- CHANGELOG.md covering v0.1.0 through v0.3.0.
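The warmup-then-percentiles flow described above can be sketched in plain Python. This is an illustrative sketch, not the `cx.benchmark()` implementation; `measure_latency`, `percentile`, and the toy workload are hypothetical names:

```python
import time

def percentile(sorted_vals, p):
    """Nearest-rank percentile over a pre-sorted list of samples."""
    idx = min(len(sorted_vals) - 1, max(0, round(p / 100 * (len(sorted_vals) - 1))))
    return sorted_vals[idx]

def measure_latency(fn, warmup=10, iters=100):
    """Call fn() repeatedly; report p50/p90/p99 latency (ms) and throughput (calls/s)."""
    for _ in range(warmup):            # warmup runs are timed-out of the stats
        fn()
    times_ms = []
    for _ in range(iters):
        t0 = time.perf_counter()
        fn()
        times_ms.append((time.perf_counter() - t0) * 1e3)
    times_ms.sort()
    return {
        "p50_ms": percentile(times_ms, 50),
        "p90_ms": percentile(times_ms, 90),
        "p99_ms": percentile(times_ms, 99),
        "throughput_per_s": 1e3 * iters / sum(times_ms),
    }

stats = measure_latency(lambda: sum(i * i for i in range(1000)))
```

Reporting percentiles rather than a mean is what makes the before/after speedup deltas robust to occasional latency spikes.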
### Changed
- Silenced `torch.ao.quantization is deprecated` warnings in PTQ stages.
- Fixed `__version__` to report the correct version.
- Cleaned up all ruff lint errors across the codebase.
### Stats
- 174 tests passing
- 9 compression techniques + sensitivity analysis
- Python 3.10+, PyTorch 2.0+
Full changelog: v0.2.0...v0.3.0
## v0.2.0

### New compression techniques
- Unstructured pruning: magnitude or random, with optional gradual cubic schedule
- N:M sparsity: default 2:4, for NVIDIA Ampere sparse tensor cores
- Weight-only INT4/INT8 quantization: group-wise, symmetric or asymmetric
- Low-rank decomposition: truncated SVD for Linear layers, with rank-ratio or energy-threshold selection
- Operator fusion: Conv2d + BatchNorm2d folding via `torch.fx`
- Weight clustering: per-layer k-means codebook
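The 2:4 pattern above keeps the two largest-magnitude weights in every contiguous group of four, which is the layout Ampere sparse tensor cores accelerate. A minimal pure-Python sketch of the masking rule (`nm_sparsify` is an illustrative name, not the comprexx API):

```python
def nm_sparsify(weights, n=2, m=4):
    """Zero out all but the n largest-magnitude values in each group of m."""
    assert len(weights) % m == 0, "row length must be a multiple of m"
    out = []
    for i in range(0, len(weights), m):
        group = weights[i:i + m]
        # indices of the n largest magnitudes within this group
        keep = sorted(range(m), key=lambda j: abs(group[j]), reverse=True)[:n]
        out.extend(w if j in keep else 0.0 for j, w in enumerate(group))
    return out

row = [0.9, -0.1, 0.05, -1.2, 0.3, 0.02, -0.4, 0.01]
sparse = nm_sparsify(row)  # → [0.9, 0.0, 0.0, -1.2, 0.3, 0.0, -0.4, 0.0]
```

Because the constraint is local to each group of four, the pruned rows compress to a fixed-size index-plus-value encoding that the hardware can decode without irregular gather patterns.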
### New analysis tools

- `cx.analyze_sensitivity()`: probes each Conv2d/Linear layer with a prune or noise perturbation, ranks layers by metric drop, and can suggest `exclude_layers` above a threshold
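The perturb-and-rank idea can be illustrated on a toy model: perturb one layer at a time, measure how much a quality metric drops, and rank layers by that drop. Everything here (the dict-of-lists "model", the `score` metric, `analyze_sensitivity`) is a hypothetical sketch, not comprexx internals:

```python
import copy

INPUTS = [1.0, 2.0, 3.0]

def score(model):
    """Toy quality metric (stand-in for accuracy): sum of layer dot-products."""
    return sum(sum(w * x for w, x in zip(ws, INPUTS)) for ws in model.values())

def analyze_sensitivity(model, prune_frac=0.5):
    """Zero the smallest-magnitude weights in each layer, one layer at a time,
    and return layers ranked by metric drop (most sensitive first)."""
    base = score(model)
    drops = {}
    for name, ws in model.items():
        perturbed = copy.deepcopy(model)
        k = int(len(ws) * prune_frac)
        smallest = sorted(range(len(ws)), key=lambda i: abs(ws[i]))[:k]
        perturbed[name] = [0.0 if i in smallest else w for i, w in enumerate(ws)]
        drops[name] = base - score(perturbed)
    return sorted(drops.items(), key=lambda kv: kv[1], reverse=True)

model = {"fc1": [0.1, 0.2, 5.0], "fc2": [2.0, 3.0, 4.0]}
ranking = analyze_sensitivity(model)  # fc2 drops more, so it ranks first
```

Layers at the top of the ranking are the ones a threshold-based `exclude_layers` suggestion would protect from aggressive compression.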
163 tests passing (up from 91).
## v0.1.0

Initial release.

### Features
- Model analysis and profiling (`cx.analyze`)
- Structured pruning (L1/L2/random, global or per-layer scope)
- Post-training quantization (dynamic and static INT8)
- ONNX export with manifest and onnxruntime validation
- Recipe-driven pipelines (YAML)
- CLI: `comprexx analyze`, `compress`, `export`
- Accuracy guards with halt/warn actions
- Per-stage compression reports persisted to `comprexx_runs/`
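The L1 variant of structured pruning removes whole rows (output channels) with the smallest L1 norm, shrinking the layer rather than just zeroing entries. A minimal per-layer sketch; `l1_structured_prune` is an illustrative name, not the comprexx API:

```python
def l1_structured_prune(weight_rows, keep_ratio=0.5):
    """Keep the keep_ratio fraction of rows with the largest L1 norm.

    Returns (pruned_rows, kept_indices); kept_indices preserve original order.
    """
    n_keep = max(1, int(len(weight_rows) * keep_ratio))
    norms = [sum(abs(w) for w in row) for row in weight_rows]
    by_norm = sorted(range(len(weight_rows)), key=lambda i: norms[i], reverse=True)
    kept = sorted(by_norm[:n_keep])
    return [weight_rows[i] for i in kept], kept

rows = [[0.1, -0.1], [1.0, 2.0], [0.0, 0.05], [-3.0, 0.5]]
pruned, kept = l1_structured_prune(rows)  # keeps rows 1 and 3 (largest L1 norms)
```

Unlike unstructured masking, dropping whole rows changes the layer's output dimension, so the following layer's input slice must be pruned with the same `kept` indices.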
91 tests passing.