Releases: cachevector/comprexx
## v0.3.0

### Added
- Benchmarking module (`comprexx.benchmark`): `cx.benchmark()` measures inference latency with warmup, percentiles (p50/p90/p99), and throughput. `cx.compare_benchmarks()` returns before/after comparisons with speedup deltas. New `comprexx bench` CLI command.
- Example notebooks with cell outputs: ResNet18 edge deployment (fuse, prune, benchmark, ONNX export) and linear layer compression (SVD, weight-only INT4, dynamic INT8).
- GitHub Actions CI: pytest on Python 3.10/3.11/3.12 + ruff lint.
- CHANGELOG.md covering v0.1.0 through v0.3.0.
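The warmup-then-percentiles flow described above can be sketched in plain Python. This is an illustrative sketch, not the `cx.benchmark()` implementation; `measure_latency`, `percentile`, and the toy workload are hypothetical names:

```python
import time

def percentile(sorted_vals, p):
    """Nearest-rank percentile over a pre-sorted list of samples."""
    idx = min(len(sorted_vals) - 1, max(0, round(p / 100 * (len(sorted_vals) - 1))))
    return sorted_vals[idx]

def measure_latency(fn, warmup=10, iters=100):
    """Call fn() repeatedly; report p50/p90/p99 latency (ms) and throughput (calls/s)."""
    for _ in range(warmup):            # warmup runs are timed-out of the stats
        fn()
    times_ms = []
    for _ in range(iters):
        t0 = time.perf_counter()
        fn()
        times_ms.append((time.perf_counter() - t0) * 1e3)
    times_ms.sort()
    return {
        "p50_ms": percentile(times_ms, 50),
        "p90_ms": percentile(times_ms, 90),
        "p99_ms": percentile(times_ms, 99),
        "throughput_per_s": 1e3 * iters / sum(times_ms),
    }

stats = measure_latency(lambda: sum(i * i for i in range(1000)))
```

Reporting percentiles rather than a mean is what makes the before/after speedup deltas robust to occasional latency spikes.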
### Changed
- Silenced `torch.ao.quantization is deprecated` warnings in PTQ stages.
- Fixed `__version__` to report the correct version.
- Cleaned up all ruff lint errors across the codebase.
### Stats
- 174 tests passing
- 9 compression techniques + sensitivity analysis
- Python 3.10+, PyTorch 2.0+
Full changelog: v0.2.0...v0.3.0
## v0.2.0

### New compression techniques
- Unstructured pruning: magnitude or random, with optional gradual cubic schedule
- N:M sparsity: default 2:4, for NVIDIA Ampere sparse tensor cores
- Weight-only INT4/INT8 quantization: group-wise, symmetric or asymmetric
- Low-rank decomposition: truncated SVD for Linear layers, with rank-ratio or energy-threshold selection
- Operator fusion: Conv2d + BatchNorm2d folding via `torch.fx`
- Weight clustering: per-layer k-means codebook
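The 2:4 pattern above keeps the two largest-magnitude weights in every contiguous group of four, which is the layout Ampere sparse tensor cores accelerate. A minimal pure-Python sketch of the masking rule (`nm_sparsify` is an illustrative name, not the comprexx API):

```python
def nm_sparsify(weights, n=2, m=4):
    """Zero out all but the n largest-magnitude values in each group of m."""
    assert len(weights) % m == 0, "row length must be a multiple of m"
    out = []
    for i in range(0, len(weights), m):
        group = weights[i:i + m]
        # indices of the n largest magnitudes within this group
        keep = sorted(range(m), key=lambda j: abs(group[j]), reverse=True)[:n]
        out.extend(w if j in keep else 0.0 for j, w in enumerate(group))
    return out

row = [0.9, -0.1, 0.05, -1.2, 0.3, 0.02, -0.4, 0.01]
sparse = nm_sparsify(row)  # → [0.9, 0.0, 0.0, -1.2, 0.3, 0.0, -0.4, 0.0]
```

Because the constraint is local to each group of four, the pruned rows compress to a fixed-size index-plus-value encoding that the hardware can decode without irregular gather patterns.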
### New analysis tools

- `cx.analyze_sensitivity()`: probes each Conv2d/Linear layer with a prune or noise perturbation, ranks layers by metric drop, and can suggest `exclude_layers` above a threshold
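The perturb-and-rank idea can be illustrated on a toy model: perturb one layer at a time, measure how much a quality metric drops, and rank layers by that drop. Everything here (the dict-of-lists "model", the `score` metric, `analyze_sensitivity`) is a hypothetical sketch, not comprexx internals:

```python
import copy

INPUTS = [1.0, 2.0, 3.0]

def score(model):
    """Toy quality metric (stand-in for accuracy): sum of layer dot-products."""
    return sum(sum(w * x for w, x in zip(ws, INPUTS)) for ws in model.values())

def analyze_sensitivity(model, prune_frac=0.5):
    """Zero the smallest-magnitude weights in each layer, one layer at a time,
    and return layers ranked by metric drop (most sensitive first)."""
    base = score(model)
    drops = {}
    for name, ws in model.items():
        perturbed = copy.deepcopy(model)
        k = int(len(ws) * prune_frac)
        smallest = sorted(range(len(ws)), key=lambda i: abs(ws[i]))[:k]
        perturbed[name] = [0.0 if i in smallest else w for i, w in enumerate(ws)]
        drops[name] = base - score(perturbed)
    return sorted(drops.items(), key=lambda kv: kv[1], reverse=True)

model = {"fc1": [0.1, 0.2, 5.0], "fc2": [2.0, 3.0, 4.0]}
ranking = analyze_sensitivity(model)  # fc2 drops more, so it ranks first
```

Layers at the top of the ranking are the ones a threshold-based `exclude_layers` suggestion would protect from aggressive compression.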
163 tests passing (up from 91).
## v0.1.0

Initial release.

### Features
- Model analysis and profiling (`cx.analyze`)
- Structured pruning (L1/L2/random, global or per-layer scope)
- Post-training quantization (dynamic and static INT8)
- ONNX export with manifest and onnxruntime validation
- Recipe-driven pipelines (YAML)
- CLI: `comprexx analyze`, `compress`, `export`
- Accuracy guards with halt/warn actions
- Per-stage compression reports persisted to `comprexx_runs/`
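The L1 variant of structured pruning removes whole rows (output channels) with the smallest L1 norm, shrinking the layer rather than just zeroing entries. A minimal per-layer sketch; `l1_structured_prune` is an illustrative name, not the comprexx API:

```python
def l1_structured_prune(weight_rows, keep_ratio=0.5):
    """Keep the keep_ratio fraction of rows with the largest L1 norm.

    Returns (pruned_rows, kept_indices); kept_indices preserve original order.
    """
    n_keep = max(1, int(len(weight_rows) * keep_ratio))
    norms = [sum(abs(w) for w in row) for row in weight_rows]
    by_norm = sorted(range(len(weight_rows)), key=lambda i: norms[i], reverse=True)
    kept = sorted(by_norm[:n_keep])
    return [weight_rows[i] for i in kept], kept

rows = [[0.1, -0.1], [1.0, 2.0], [0.0, 0.05], [-3.0, 0.5]]
pruned, kept = l1_structured_prune(rows)  # keeps rows 1 and 3 (largest L1 norms)
```

Unlike unstructured masking, dropping whole rows changes the layer's output dimension, so the following layer's input slice must be pruned with the same `kept` indices.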
91 tests passing.