Releases: cachevector/comprexx

v0.3.0

11 Apr 08:41

Added

  • Benchmarking module (comprexx.benchmark): cx.benchmark() measures inference
    latency after warmup runs and reports percentiles (p50/p90/p99) and
    throughput. cx.compare_benchmarks() returns before/after comparisons with
    speedup deltas. Also adds a new CLI command, comprexx bench.
  • Example notebooks with cell outputs: ResNet18 edge deployment (fuse, prune,
    benchmark, ONNX export) and linear layer compression (SVD, weight-only INT4,
    dynamic INT8).
  • GitHub Actions CI: pytest on Python 3.10/3.11/3.12 + ruff lint.
  • CHANGELOG.md covering v0.1.0 through v0.3.0.
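
For readers unfamiliar with the measurements named in the benchmarking entry, here is a minimal standard-library sketch of warmup, latency percentiles, and throughput. It is not the comprexx implementation: the function names `benchmark` and `speedup`, their parameters, and the returned keys are hypothetical and chosen only for illustration.

```python
import statistics
import time


def benchmark(fn, warmup=10, iters=100):
    """Time fn() and summarize latency. Hypothetical helper, not comprexx's API."""
    # Warmup calls are discarded so one-time costs (caches, lazy init)
    # do not skew the measured latencies.
    for _ in range(warmup):
        fn()

    latencies_ms = []
    for _ in range(iters):
        start = time.perf_counter()
        fn()
        latencies_ms.append((time.perf_counter() - start) * 1000.0)

    # quantiles(n=100) returns 99 cut points; index k-1 is the k-th percentile.
    cuts = statistics.quantiles(latencies_ms, n=100, method="inclusive")
    total_s = sum(latencies_ms) / 1000.0
    return {
        "p50_ms": cuts[49],
        "p90_ms": cuts[89],
        "p99_ms": cuts[98],
        "throughput_per_s": iters / total_s if total_s > 0 else float("inf"),
    }


def speedup(before, after, key="p50_ms"):
    """A ratio above 1.0 means the 'after' run is faster on the chosen metric."""
    return before[key] / after[key]
```

Calling `benchmark` once on the original model's forward pass and once on the compressed one, then passing both results to `speedup`, gives the kind of before/after comparison the entry describes. On a GPU the timed call would also need device synchronization, which this sketch leaves out.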

Changed

  • Silenced the "torch.ao.quantization is deprecated" warnings in PTQ stages.
  • Fixed __version__ to report the correct version.
  • Cleaned up all ruff lint errors across the codebase.

Stats

  • 174 tests passing
  • 9 compression techniques + sensitivity analysis
  • Python 3.10+, PyTorch 2.0+

Full changelog: v0.2.0...v0.3.0

v0.2.0

07 Apr 10:07

New compression techniques

  • Unstructured pruning: magnitude or random, with optional gradual cubic schedule
  • N:M sparsity: default 2:4, for NVIDIA Ampere sparse tensor cores
  • Weight-only INT4/INT8 quantization: group-wise, symmetric or asymmetric
  • Low-rank decomposition: truncated SVD for Linear layers, with rank-ratio or energy-threshold selection
  • Operator fusion: Conv2d + BatchNorm2d folding via torch.fx
  • Weight clustering: per-layer k-means codebook
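
To make the N:M sparsity entry concrete, here is a small sketch of a 2:4 pattern in plain Python: in every contiguous group of 4 weights, the 2 with the largest magnitude are kept and the rest are zeroed. The function name `nm_sparsity_mask` is hypothetical, and the sketch works on a flat list where a real implementation would operate on tensors. It shows the pattern only, not how comprexx applies it.

```python
def nm_sparsity_mask(weights, n=2, m=4):
    """Return a 0/1 mask keeping the n largest-magnitude values per group of m.

    Hypothetical illustration of the N:M pattern, not comprexx's API.
    """
    if len(weights) % m != 0:
        raise ValueError("length must be a multiple of m")
    mask = []
    for start in range(0, len(weights), m):
        group = weights[start:start + m]
        # Indices of the n largest magnitudes within this group.
        order = sorted(range(m), key=lambda i: abs(group[i]), reverse=True)
        keep = set(order[:n])
        mask.extend(1 if i in keep else 0 for i in range(m))
    return mask


row = [0.9, -0.1, 0.05, -0.7, 0.2, 0.3, -0.4, 0.01]
mask = nm_sparsity_mask(row)                 # [1, 0, 0, 1, 0, 1, 1, 0]
pruned = [w * k for w, k in zip(row, mask)]  # exactly 2 nonzeros per group of 4
```

The fixed per-group budget is what distinguishes N:M from unstructured magnitude pruning: overall sparsity is always 1 - n/m, and the regular layout is what sparse tensor cores can exploit.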

New analysis tools

  • cx.analyze_sensitivity(): probes each Conv2d/Linear layer with a prune or noise perturbation, ranks layers by metric drop, and can suggest exclude_layers above a threshold
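
The ranking step described above can be sketched as follows. This is an assumption-laden illustration, not the cx.analyze_sensitivity() implementation: `rank_sensitivity`, its arguments, and the layer names are hypothetical, and the per-layer metrics are taken as already measured (one evaluation per perturbed layer).

```python
def rank_sensitivity(baseline, perturbed_metrics, threshold=None):
    """Rank layers by metric drop and suggest layers to exclude.

    baseline: metric of the unperturbed model (higher is better).
    perturbed_metrics: {layer_name: metric with only that layer perturbed}.
    Hypothetical helper for illustration only.
    """
    drops = {name: baseline - metric for name, metric in perturbed_metrics.items()}
    # Most sensitive layers (largest drop) come first.
    ranking = sorted(drops.items(), key=lambda kv: kv[1], reverse=True)
    exclude = [
        name for name, drop in ranking
        if threshold is not None and drop > threshold
    ]
    return ranking, exclude


# Made-up metrics, purely to show the shape of the result.
ranking, exclude = rank_sensitivity(
    baseline=0.90,
    perturbed_metrics={"conv1": 0.60, "layer1.0.conv1": 0.88, "fc": 0.75},
    threshold=0.10,
)
```

Here `exclude` would contain "conv1" and "fc", the two layers whose drop exceeds the threshold. A list like this is what one would pass as exclude_layers so later stages leave those layers untouched.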

163 tests passing (up from 91).

v0.1.0

07 Apr 10:07

Initial release.

Features

  • Model analysis and profiling (cx.analyze)
  • Structured pruning (L1/L2/random, global or per-layer scope)
  • Post-training quantization (dynamic and static INT8)
  • ONNX export with manifest and onnxruntime validation
  • Recipe-driven pipelines (YAML)
  • CLI: comprexx analyze, compress, export
  • Accuracy guards with halt/warn actions
  • Per-stage compression reports persisted to comprexx_runs/

91 tests passing.