Releases: unitaryfoundation/qrack
Avoid state vector duplication in lossy save/load
This release avoids a duplication of the state vector in almost all cases (except QPager or QStabilizer) of lossy saving and loading, which makes the operations faster, with a lower memory footprint.
Full Changelog: vm6502q.v10.6.0...vm6502q.v10.6.1
sha1sum results:
d22793bae9dc27e920908577c07e8930d69bf1c3 libqrack-macosx_14_0_arm64.zip
a9d00e8c1876cd67149a88181c18c766fd2f912a libqrack-macosx_15_0_arm64.zip
857af0c1206f5c54622baceed7061b39b55763a9 libqrack-manylinux_2_35_x86_64.zip
edbc63deb4545bf275b61f26cd208cae7ffe7061 libqrack-manylinux_2_39_x86_64.zip
dfd7c2b947990582fbac1d2478717faca16b7613 libqrack-win-amd64.zip
Rudimentary TurboQuant-based compression to disk
Claude and I (Dan, @WrathfulSpatula) had a marathon session today discussing the (obvious) possibility of using TurboQuant-based compression methods for quantum state vectors, and particularly for simulated random circuit sampling (RCS) output states. Claude wrote almost all of the implementation code; I mostly corrected small implementation bugs and coached the approach. Claude produced a header for StateVectorTurboQuant, based on the StateVector contract API I wrote years ago, and it fulfills the full API contract. However, round-trip decompression and recompression of in-flight results from circuit simulations might not be the right application for this yet (or ever). Still, we managed to achieve significant (technically "lossy") compression to disk that can preserve RCS cross-entropy benchmark (XEB) fidelity, at least in preliminary tests at small qubit widths. This is a slightly surprising result: by the definition of the benchmark for RCS, we might expect XEB to depend on exactly the tiny variance tail that lossy compression of the bulk state vector (or rank-1 "tensor") would otherwise throw away.
Again, Claude wrote basically every line of code in StateVectorTurboQuant, working from the existing API contract example and based on TurboQuant (Zandieh et al., arXiv:2504.19874) and an Apache 2.0 open-source implementation by @TheTom (github.com/TheTom/turboquant_plus). My contribution was mostly directing Claude's efforts, reporting back what failed to actually reduce size-on-disk or preserve fidelity, and finally pointing out that real and imaginary components of Hilbert-space dimensions tended to approach Haar-uniformity as respective pairs (under the summed L2 norm) rather than as independent streams of dimensions, and that this holds for most meaningful quantum algorithms (possibly with exceptions like VQE and Shor's integer-factoring algorithm). At that point, the empirical evidence clicked into place. (Claude deserves quite a bit of credit and thanks, and it's a pleasure talking and working with them.)
Less "rudimentary" versions will follow, hopefully avoiding the memory spike upon saving to disk. I'm not overly optimistic about in-flight lossy compression of the state vector during execution yet, but it's meaningful that XEB can be preserved at roughly a 4:1 compression ratio on 10 qubits, in very cursory preliminary tests.
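For context on what "preserving XEB" measures, here is a minimal sketch of the expected linear cross-entropy benchmark score between ideal output probabilities and a second distribution over the same bit strings (for example, one reconstructed from a lossy-compressed state). It uses the common linear-XEB form and is not the test harness used for this release; the function name and list-based inputs are illustrative assumptions.

```python
def linear_xeb(ideal_probs, test_probs):
    """Expected linear XEB: D * sum_x q(x) * p(x) - 1, with D basis states.

    `ideal_probs` is p(x) from the exact simulation; `test_probs` is q(x),
    the distribution actually being sampled (e.g. after lossy compression).
    """
    if len(ideal_probs) != len(test_probs):
        raise ValueError("distributions must have the same length")
    dim = len(ideal_probs)
    overlap = sum(p * q for p, q in zip(ideal_probs, test_probs))
    return dim * overlap - 1.0
```

A uniform `test_probs` scores 0, so one way to judge a lossy round trip is to compare its score against `linear_xeb(ideal_probs, ideal_probs)`.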
Full Changelog: vm6502q.v10.5.3...vm6502q.v10.6.0
sha1sum results:
50142182bd8db7cd741aa86fe20d21df44355481 libqrack-macosx_14_0_arm64.zip
b7553bb89f67ff69a3a6d78af40332c6020dc093 libqrack-macosx_15_0_arm64.zip
049023d172fce35fb332c77b9bf92642f7b1876a libqrack-manylinux_2_35_x86_64.zip
3fe2725916ae3ca644f1de5f9405138c85579bbe libqrack-manylinux_2_39_x86_64.zip
2af8a57be4479834d87d3f88e40bd0cbe02982b8 libqrack-win-amd64.zip
Fix potential stack smashing
Experimentation in Weed happened to raise a case where clFinish and tryOcl were stuck in a recursive stack smash. We have verified that the current release avoids this recursive loop. The chances of this case ever occurring naturally outside of Weed seem very small, but anything that qualifies as a "memory-safety error," like stack smashing, requires immediate attention.
Full Changelog: vm6502q.v10.5.2...vm6502q.v10.5.3
sha1sum results:
2dfdaa5eca87f04f61c8b0265f4cb124aec51ddd libqrack-macosx_14_0_arm64.zip
a78d502365bbe6deda9743c02620a35e5133798d libqrack-macosx_15_0_arm64.zip
84fe2ca8ef299ff8b7e6865f4bbd982d0cb63e75 libqrack-manylinux_2_35_x86_64.zip
1c9a450774cbae151c79da047e2f5bb0f1b5b4b3 libqrack-manylinux_2_39_x86_64.zip
713e3157f73fd6cb3e3487032c00cf0e87168eb1 libqrack-win-amd64.zip
Claude Code debugging
I asked Claude Code for a bug report on Qrack: they identified some edge cases where conditionals were incorrect, but, notably, they found no memory-safety errors. This release contains fixes for the issues in that report.
Full Changelog: vm6502q.v10.5.1...vm6502q.v10.5.2
sha1sum results:
8ca3c73b3ccd49e22f1f77dad9ac07a1b21ceda0 libqrack-macosx_14_0_arm64.zip
30d16fbe570e92336711d01046a4e83e14db61d3 libqrack-macosx_15_0_arm64.zip
4fd8085a83caaaa1769c7746748391a63e32fd73 libqrack-manylinux_2_35_x86_64.zip
9c14e442bcfeaefdf62e50f8422c580c91f856bf libqrack-manylinux_2_39_x86_64.zip
5a4ccc0a3c713332219d475e483ae964b1fc1dbe libqrack-win-amd64.zip
Fix segfault (from near-Clifford buffers)
When I introduced buffers to help approximate near-Clifford simulation in QStabilizer, I failed to copy them across simulator instances in Compose(). Hence, qubit-by-qubit allocation of a simulator ran into out-of-bounds reads and writes, as raised by issue unitaryfoundation/pyqrack#43. (valgrind quickly surfaced the issue, but this wasn't a use-case I was looking at.) This release closes the issue.
Full Changelog: vm6502q.v10.5.0...vm6502q.v10.5.1
sha1sum results:
968b3581d078b7a8eec3abbb191caf407ddac7bf libqrack-macosx_14_0_arm64.zip
20087328c78a022f1c8ad11770c1178701e3adc0 libqrack-macosx_15_0_arm64.zip
4e2770a29f324570ea3689aa223499c4a3859a9a libqrack-manylinux_2_35_x86_64.zip
f4e019beb2b74108daa0c0fd324b69eb6f1e9b50 libqrack-manylinux_2_39_x86_64.zip
699220c66afaab9902ae88927eb16f58af3d0c67 libqrack-win-amd64.zip
Hashable "Big Integer" (for sparse simulation)
Given the recent success in "spoofing" quantum volume, with a combination of automatic circuit elision (ACE) and sparse simulation, it made sense to try to reduce the significant computational overhead of sparse simulation. Claude (Anthropic) proposed a meaningful improvement on the sparse truncation method, for when the memory limit is exceeded, and we had another productive back-and-forth about std::map vs. std::unordered_map and the readily-hashable big_integer.hpp in Qrack. So, the credit for the truncation rewrite and hashing is largely due to Claude, with thanks.
Full Changelog: vm6502q.v10.4.1...vm6502q.v10.5.0
sha1sum results:
525bfc0153a3187741315a7a389eb800e42d21a0 libqrack-macosx_14_0_arm64.zip
029bb3bf604c7dca428fcb4bbb652fcf06729977 libqrack-macosx_15_0_arm64.zip
90a157209a382e71fc63ec294fd0ea42d7f6598a libqrack-manylinux_2_35_x86_64.zip
fc97863bccbdb164e6af6cf2cef5212902688531 libqrack-manylinux_2_39_x86_64.zip
1fae9d5e74e0dfe9e5c6ecbe1f818072ad1441aa libqrack-win-amd64.zip
Debug sparse simulation
The previous release introduced a bug in sparse simulation. It was initially moderately difficult to adapt sparse simulation code to handle widths greater than 64 qubits, but, with the benefit of the earlier draft attempt in hand, it became easier to produce a minimally invasive code differential that satisfied the aim. This has been tested.
Full Changelog: vm6502q.v10.4.0...vm6502q.v10.4.1
sha1sum results:
147da51f3a018e52f91e331c4496c96795a1bdb5 libqrack-macosx_14_0_arm64.zip
b4398f40019b660bd2e5774f1c3e47f41d33b00e libqrack-macosx_15_0_arm64.zip
76551ccf9b91cede5f29c8d026d78f896b4779a6 libqrack-manylinux_2_35_x86_64.zip
32406a564bd675437a9a91d62d65c7df38afc166 libqrack-manylinux_2_39_x86_64.zip
feca95903aa9f9ca1b6228b66e5458a3ea612cae libqrack-win-amd64.zip
>64 qb of sampling in shared library API
This release allows >64 qb of measurement distribution sampling in the shared library API, as well as >64 qb of sparse simulation width.
Claude, the LLM by Anthropic, has made their first contribution in this release, helping me quickly produce the shared-library wrapper for the >64 qb output-packing case.
Full Changelog: vm6502q.v10.3.0...vm6502q.v10.4.0
sha1sum results:
a80dce4458616a351fa2ad1f13645df975666cc3 libqrack-macosx_14_0_arm64.zip
2d7ba0975f1e8b54d9d3a3b9776bb0598fbd4719 libqrack-macosx_15_0_arm64.zip
6ca9a036c94ad5999607b6639bd9d6f8689e5526 libqrack-manylinux_2_35_x86_64.zip
cee2342a50dfa806c787fc2b2c0bb4f4c17a5432 libqrack-manylinux_2_39_x86_64.zip
ac90de5aa92ba68629436a7ab8b964a224036c30 libqrack-win-amd64.zip
Sparse probability rounding parameter (SPRP) in shared API
Locally equivalent to the global QRACK_SPARSE_TRUNCATION_THRESHOLD environment variable setting, we expose "sparse probability rounding parameter" (SPRP) in the shared library API, for convenience of per-simulator localized control.
Note that this really isn't another fidelity-tradeoff approximation knob to tune: if one sets a memory limit on maximum sparse amplitudes retained, it's always preferable, in theory, to retain all nonzero amplitudes until the memory-limit truncation kicks in. However, this isn't the most performant way for the algorithm to proceed, in terms of time-to-solution, and gentle tuning of SPRP might help "smooth out" sensitivity to compounding floating-point error, in some cases.
For random circuit sampling (RCS), we've found empirically that, for connectivity order parameter "z" and qubit count "n," the optimal SPRP setting is approximately 1 / (z * (2 ** n)). In other words, on a square nearest-neighbor grid layout of qubits, each qubit directly connects to 4 others, so z=4 (and SPRP should be 1 / (4 * (2 ** n))); for fully-connected RCS, each qubit of the n in the circuit connects to n-1 other qubits, so z=n-1 (and the ideal SPRP setting is approximately 1 / ((n - 1) * (2 ** n))).
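The rule of thumb above can be written as a small helper. This is only a sketch of the arithmetic: the function names are illustrative, and how the resulting value is handed to Qrack (the shared-library setter or the QRACK_SPARSE_TRUNCATION_THRESHOLD environment variable) is not shown here.

```python
import math


def suggested_sprp(num_qubits, connectivity):
    """Approximate SPRP for RCS: 1 / (z * 2**n), per the empirical rule."""
    if num_qubits < 1 or connectivity < 1:
        raise ValueError("num_qubits and connectivity must be positive")
    # ldexp(x, -n) computes x * 2**-n without building a huge integer,
    # which keeps the helper usable for widths beyond 64 qubits.
    return math.ldexp(1.0 / connectivity, -num_qubits)


def grid_sprp(num_qubits):
    """Square nearest-neighbor grid: each qubit connects to 4 others (z = 4)."""
    return suggested_sprp(num_qubits, 4)


def fully_connected_sprp(num_qubits):
    """Fully-connected RCS: each qubit connects to n - 1 others (z = n - 1)."""
    return suggested_sprp(num_qubits, num_qubits - 1)
```

For example, `grid_sprp(10)` evaluates 1 / (4 * 2 ** 10) = 1 / 4096, and `fully_connected_sprp(5)` evaluates 1 / (4 * 2 ** 5) = 1 / 128.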
Full Changelog: vm6502q.v10.2.0...vm6502q.v10.3.0
sha1sum results:
3abd49ca35b5b516cb834fcb9fbea55a80228c3d libqrack-macosx_14_0_arm64.zip
aff191e3abd992bd0a3f236b99c5180bff697e73 libqrack-macosx_15_0_arm64.zip
ad8a9be64f41fdc63024242443b6dd1dffed2fd6 libqrack-manylinux_2_35_x86_64.zip
90dbb9c236ad340bf7032024964c65372c6194db libqrack-manylinux_2_39_x86_64.zip
46eb33193c4107e974922cc77ff8312757f81288 libqrack-win-amd64.zip
Better-balanced approximate near-Clifford in QStabilizer
We have an exact near-Clifford method (which is too slow for almost anything but one-off checks of single probability amplitudes, or a small number at a time), and we have two approximate near-Clifford methods. QStabilizerHybrid co-opts the "reverse phase-injection gadgets" from the exact method as stochastic injection points, at point of terminal measurement, for either the closest or second-closest Clifford phase gate to a requested RZ operation: any RZ gate implies a "phase quadrant" defined by the closest possible Clifford rotation only in terms of S-gate increments and the complementary second-closest possible S-gate increment on the unit circle. (If the RZ gate is equivalent to a T gate, then the ideal gate sits at the exactly balanced midpoint between these two closest possible S gates.) QStabilizer relies on this same concept of an "S-gate quadrant," though it applies the stochastic RZ gate correction at the point the gate is requested, instead of deferred with an "injection gadget" until terminal measurement, and it tries to "complete the S gate" downstream, propagating and completing complements of phase remainder to at least first order.
In QStabilizer, at the point an RZ gate is requested, there are four "logically equivalent" representations of state that lead to different downstream stochastic side effects, holding the requested RZ gate definition fixed. The "remainder" of the RZ gate that cannot be applied exactly is retained as a buffered scalar and continues to participate statistically downstream. Upon application of the RZ gate, we could apply side-effects to other scalar phase buffers based on either the closest or else second-closest pure-Clifford S-gate quadrant. (We have taken to calling the closest state, "major," and the second-closest state, "minor.") The decision of which of these two to apply initially is based on a biased "coin flip," weighted linearly by the angle imbalance between closest and second-closest states. (So, for a T gate, at the exact midpoint, the probability of either approximate state convention is equal, and for the square root of T, for example, it would be 3/4 probability in favor of the closest state vs. 1/4 in favor of the second closest.) Secondly, downstream, we continue to incorporate the statistical side effects of the "remainder" between pure-Clifford phase application and requested RZ gate, for either "major" or "minor" representation, but we don't apply cumulative "side effects" on other scalar phase remainder buffers, as if this were the initial application of the RZ gate. Hence, we also have "reverse major" and "reverse minor" possible states, where "major" and "minor" representations are reversed without statistical side effects of initial gate application. (Empirically, this is the best policy we have found, for the stochastic approximation.) Hence, we also choose "reverse" or "forward" representation with equal probability, for best possible statistical balance, to approximate the non-Clifford phase in ensemble of multiple independent measurement shots.
This release of Qrack consistently applies the logic described above in QStabilizer. It also exposes new methods for fine-grained user control to toggle between "major," "minor," "forward," and "reverse" representations that are considered the four closest "logically-equivalent" representations to any arbitrary non-Clifford RZ gate application.
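As an illustration of the weighting described above, here is a minimal sketch of the two random choices made per shot. It is not Qrack's implementation and models none of the phase-buffer side effects; the function names are hypothetical, and angles are assumed to follow the convention in which an S gate corresponds to pi/2 and a T gate to pi/4.

```python
import math
import random

QUADRANT = math.pi / 2  # angle between adjacent S-gate increments (assumed convention)


def major_probability(theta):
    """Probability of choosing the closest ("major") S-gate increment.

    Weighted linearly by where the RZ angle sits inside its S-gate quadrant.
    """
    remainder = theta % QUADRANT                     # position within the quadrant
    distance = min(remainder, QUADRANT - remainder)  # angle to the closest increment
    return 1.0 - distance / QUADRANT


def sample_representation(theta, rng=random):
    """Draw one of the four representations for a single shot."""
    side = "major" if rng.random() < major_probability(theta) else "minor"
    # "forward" and "reverse" are chosen with equal probability.
    direction = "forward" if rng.random() < 0.5 else "reverse"
    return side, direction
```

With this convention, `major_probability(math.pi / 4)` gives 1/2 for a T gate, and `major_probability(math.pi / 8)` gives 3/4 for the square root of T, matching the figures quoted above.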
Full Changelog: vm6502q.v10.1.9...vm6502q.v10.2.0
sha1sum results:
70b142761be4f46de752d0ae9b809414924a0133 libqrack-macosx_14_0_arm64.zip
b79e6297a3d7c6616f43e096013b87e324f9085e libqrack-macosx_15_0_arm64.zip
9e1c6d7ac436c654fdf960293c6fc7990d99248c libqrack-manylinux_2_35_x86_64.zip
8a45ddb929ad0f183fc293b83e48761744e4f0d5 libqrack-manylinux_2_39_x86_64.zip
4bb742a3f0c7d01f1974d5ed1b9c5f07ba76691b libqrack-win-amd64.zip