Add serialization/hashing for integer/natural #6163
Conversation
I think the overall structure is fine. Adding

As far as the format goes, my opinion is based on no actual usage data. However, my expectation is that most of these numbers are still going to be small. So maybe something that doesn't necessarily use 8 bytes would still be a good idea. Just off the top of my head, since you're writing them big endian, maybe you can serialize just the first

However, it might also be good to just use

My other comment is that you only changed the version 5 value format. At this point, it's probably what everyone is using. But it's probably good for us to support this in the earlier formats as well. The earlier format might also be used if you try to hash (cryptographic, not murmur) the value. That just involves the same changes in other

Also, I just want to make sure: this

That's correct,

OK @dolio I have changed to a varint encoding. LMK if this is better.

The implementation looks good. I think it'd be a good idea to make the tests use

Do we want to wait for the tests? My gut says yes, but someone let me know.

Overview
This change adds support for serializing and hashing arbitrary-precision `Integer` and `Natural` values in the Unison runtime.

Problem: When a Unison value containing an `Integer` or `Natural` was serialized via `reflectValue`, the runtime would fail with the error: `reflectValue: cannot prepare value for serialization: foreign value`. This caused operations like storing values in event logs or transferring them across the network to fail.

Solution: Added `BigInt` and `BigNat` constructors to the `BLit` (boxed literal) type in the ANF representation, with corresponding serialization, deserialization, and hashing support.

User experience: Code that previously failed with the error above when serializing values containing `Integer` or `Natural` types now works correctly.

Implementation approach and notes
- ANF.hs: Added `BigInt Integer` and `BigNat Natural` constructors to the `BLit` data type, alongside existing literals like `Text`, `Bytes`, `Pos`, `Neg`, etc.
- Tags.hs: Added `BigIntT` (tag 15) and `BigNatT` (tag 16) to the `BLTag` enum for wire format identification.
- ValueV5.hs & ANF/Serialize.hs: Added serialization functions that encode arbitrary-precision numbers as:
  - `Natural`: length-prefixed list of Word64 chunks (most significant first)
  - `Integer`: sign byte (0=positive, 1=negative) + Natural magnitude
- MurmurHash/Untyped.hs: Added hashing support for `BigInt` and `BigNat` by hashing the sign and Word64 chunks.
- Machine.hs: Updated `reflectValue` (`goF`) to convert `WrapInteger`/`WrapNatural` foreign values to `ANF.BigInt`/`ANF.BigNat`, and updated `reifyValue` (`goL`) to convert back.

Interesting/controversial decisions
- Chose `BLit` over alternatives: `BLit` already contains other serializable foreign values without literal syntax (`Code`, `Quote`, `BArr`, `Arr`), so adding `BigInt`/`BigNat` here is consistent with existing patterns.
- Serialization format: Used a simple length-prefixed Word64 chunk format rather than a more compact variable-length encoding. Happy to change this if controversial.
- Tail recursion: Made the `naturalToWord64s` and `integerToWord64s` helper functions tail-recursive and strict.

Test coverage
`Unison/Test/Runtime/ANF/Serialization.hs`
- `genInteger` generator that produces small integers (fit in Int64), large positive integers (2^64 to 2^256), and large negative integers
- `genNatural` generator that produces small naturals (fit in Word64) and large naturals (2^64 to 2^256)
- `genBLit`, which feeds into the existing `valueRoundtrip` property test

All 276 runtime tests pass.
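For readers who want to see the shape of the chunk format and the value classes the generators cover, here is a minimal sketch using only `base`. It is not the code from this PR: the names `natToChunks`, `chunksToNat`, `encodeNat`, `encodeInteger`, and their decoders are hypothetical, and the encoded form is modelled as plain Haskell lists rather than the runtime's byte-level putters and getters.

```haskell
{-# LANGUAGE BangPatterns #-}

import Data.Bits (shiftL, shiftR, (.|.))
import Data.List (foldl')
import Data.Word (Word64, Word8)
import Numeric.Natural (Natural)

-- Split a Natural into Word64 chunks, most significant first.
-- Tail-recursive with a strict accumulator; zero yields no chunks.
natToChunks :: Natural -> [Word64]
natToChunks = go []
  where
    go !acc 0 = acc
    -- fromIntegral keeps the low 64 bits of n
    go !acc n = go (fromIntegral n : acc) (n `shiftR` 64)

-- Rebuild a Natural from most-significant-first chunks.
chunksToNat :: [Word64] -> Natural
chunksToNat = foldl' (\acc w -> (acc `shiftL` 64) .|. fromIntegral w) 0

-- Length-prefixed form: the chunk count followed by the chunks.
encodeNat :: Natural -> [Word64]
encodeNat n = fromIntegral (length cs) : cs
  where
    cs = natToChunks n

decodeNat :: [Word64] -> Maybe Natural
decodeNat (len : cs)
  | fromIntegral len == length cs = Just (chunksToNat cs)
decodeNat _ = Nothing

-- Sign byte (0 = non-negative, 1 = negative) plus the magnitude.
encodeInteger :: Integer -> (Word8, [Word64])
encodeInteger i = (if i < 0 then 1 else 0, encodeNat (fromInteger (abs i)))

decodeInteger :: (Word8, [Word64]) -> Maybe Integer
decodeInteger (sign, ws) = do
  n <- decodeNat ws
  case sign of
    0 -> Just (toInteger n)
    1 -> Just (negate (toInteger n))
    _ -> Nothing

-- Hand-picked values from each class named above: small, at and beyond
-- the 64-bit boundary, and large negatives.
samples :: [Integer]
samples = [0, 1, -1, 2 ^ 63 - 1, 2 ^ 64, 2 ^ 256, negate (2 ^ 64), 1 - 2 ^ 256]

roundtrips :: Bool
roundtrips = all (\i -> decodeInteger (encodeInteger i) == Just i) samples
```

A property test would draw these values from generators instead of a fixed list, but the boundary cases (just below and at 2^64) are the ones most likely to expose an off-by-one in the chunking.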
Loose ends
Serialization format can be debated.
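As a point of comparison for that debate, the sketch below shows one common variable-length encoding for naturals (LEB128-style: 7 payload bits per byte, least significant group first, high bit set on every byte except the last). This is an assumption for illustration only; the PR text does not say which varint layout was eventually adopted, and the names `encodeVarNat` and `decodeVarNat` are hypothetical.

```haskell
{-# LANGUAGE BangPatterns #-}

import Data.Bits (shiftL, shiftR, testBit, (.&.), (.|.))
import Data.Word (Word8)
import Numeric.Natural (Natural)

-- Emit 7 bits at a time, lowest group first. The high bit of each
-- byte marks "more bytes follow".
encodeVarNat :: Natural -> [Word8]
encodeVarNat n
  | n < 0x80 = [fromIntegral n]
  | otherwise = (fromIntegral (n .&. 0x7f) .|. 0x80) : encodeVarNat (n `shiftR` 7)

-- Decode one varint from the front of the input, returning the value
-- and the unconsumed bytes. Fails if the input ends mid-number.
decodeVarNat :: [Word8] -> Maybe (Natural, [Word8])
decodeVarNat = go 0 0
  where
    go :: Natural -> Int -> [Word8] -> Maybe (Natural, [Word8])
    go _ _ [] = Nothing
    go !acc !sh (b : rest)
      | testBit b 7 = go acc' (sh + 7) rest
      | otherwise = Just (acc', rest)
      where
        acc' = acc .|. (fromIntegral (b .&. 0x7f) `shiftL` sh)
```

Under this layout a value below 128 takes a single byte, whereas a length prefix plus one Word64 chunk takes considerably more, which is the trade-off raised in the review for the expected common case of small numbers.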