Spec: GAAIM Core v0.1-draft §4.2 (Canonical Serialization)

License: Apache-2.0

Status: Reference implementations, validated against shared test vectors
This package contains reference implementations of the GAAIM canonical JSON serialization in TypeScript and C#. Both produce bit-identical output for the same input, enabling cross-platform signature interoperability.
The canonicalizer takes a GAAIM event (a JSON object) and produces the canonical UTF-8 byte sequence that is signed per §5. This is the single source of truth for what gets hashed and signed — implementations that canonicalize differently will not verify each other's signatures.
The canonicalization procedure, per §4.2:
- Remove top-level `signature` and `signaturekey` attributes.
- Apply JCS (RFC 8785) to the remaining object:
  - Sort object keys lexicographically by UTF-16 code unit, recursively.
  - No insignificant whitespace.
  - Numbers serialized per ECMAScript `ToString(Number)`.
  - Strings emitted as UTF-8 with minimal I-JSON escaping.
- Emit UTF-8 bytes (no BOM).
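As a rough illustration of the procedure above, the sketch below leans on `JSON.stringify` for primitives, since RFC 8785 bases its number and string serialization on ECMAScript's behavior. The `Json` type and the names `serializeCanonical` and `canonicalizeSketch` are invented for this example and are not the package's API. The sketch also omits the lone-surrogate rejection that a complete JCS implementation needs.

```typescript
type Json = null | boolean | number | string | Json[] | { [key: string]: Json };

// Serialize one JSON value with sorted keys and no insignificant whitespace.
function serializeCanonical(value: Json): string {
  if (value === null || typeof value !== "object") {
    if (typeof value === "number" && !Number.isFinite(value)) {
      throw new RangeError("NaN and Infinity are not valid JSON numbers");
    }
    // JSON.stringify applies ECMAScript number formatting and minimal escaping.
    return JSON.stringify(value);
  }
  if (Array.isArray(value)) {
    return "[" + value.map((item) => serializeCanonical(item)).join(",") + "]";
  }
  const obj: { [key: string]: Json } = value;
  // The default sort compares strings by UTF-16 code unit.
  const members = Object.keys(obj)
    .sort()
    .map((key) => JSON.stringify(key) + ":" + serializeCanonical(obj[key]));
  return "{" + members.join(",") + "}";
}

// Strip the top-level signature fields, then emit UTF-8 bytes (TextEncoder adds no BOM).
function canonicalizeSketch(event: { [key: string]: Json }): Uint8Array {
  const unsigned: { [key: string]: Json } = { ...event };
  delete unsigned["signature"];
  delete unsigned["signaturekey"];
  return new TextEncoder().encode(serializeCanonical(unsigned));
}

const sketchBytes = canonicalizeSketch({
  b: 2,
  a: 1.0,
  signature: "ed25519:placeholder",
  nested: { z: null, y: "é" },
});
console.log(new TextDecoder().decode(sketchBytes));
// {"a":1,"b":2,"nested":{"y":"é","z":null}}
```

Only the top-level `signature` and `signaturekey` keys are removed; a nested key with the same name would be kept.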
Per §4.2.1, absent attributes MUST NOT be synthesized as null. Both implementations enforce this: only keys present in the source object are emitted in the canonical form.
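To make the distinction between an absent key and an explicit `null` concrete, here is a small sketch of a producer that copies only the attributes that were actually set. The `DraftEvent` interface, the `toWireObject` helper, and the `subject` attribute are hypothetical and exist only for this illustration.

```typescript
interface DraftEvent {
  id: string;
  prev?: string | null; // null is a real value; undefined means "not present"
  subject?: string;     // hypothetical optional attribute
}

// Copy only the keys that are present; never invent a null for a missing one.
function toWireObject(draft: DraftEvent): Record<string, string | null> {
  const wire: Record<string, string | null> = { id: draft.id };
  if (draft.prev !== undefined) wire["prev"] = draft.prev;
  if (draft.subject !== undefined) wire["subject"] = draft.subject;
  return wire;
}

console.log(JSON.stringify(toWireObject({ id: "a" })));             // {"id":"a"}
console.log(JSON.stringify(toWireObject({ id: "a", prev: null }))); // {"id":"a","prev":null}
```

The two outputs differ in bytes, so they hash and sign differently, which is why a producer and a verifier have to agree on this rule.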
canonicalize/
├── README.md                              ← this file
├── test-vectors/
│   └── canonical-vectors-v0.1.json        ← 6 shared test vectors with SHA-256 hashes
├── typescript/
│   ├── package.json
│   ├── tsconfig.json
│   └── src/
│       ├── canonicalize.ts                ← the implementation
│       └── test.ts                        ← test runner
└── csharp/
    ├── src/
    │   ├── Gaaim.Canonicalization.csproj
    │   └── Canonicalizer.cs               ← the implementation
    └── tests/
        ├── Gaaim.Canonicalization.Tests.csproj
        └── Program.cs                     ← test runner
The test-vectors/canonical-vectors-v0.1.json file contains six normative vectors. Every conformant canonicalizer MUST produce matching output for all six. The vectors cover:
| Vector | What it exercises | Byte length | SHA-256 prefix |
|---|---|---|---|
| `full-event-no-audit` | Complete A.1 event, all optional attributes present | 903 | `8d028f41...` |
| `minimal-event` | REQUIRED fields only, many absent attributes | 378 | `d6249473...` |
| `unicode-content` | Non-ASCII in keys, values, tags | 502 | `f22007e2...` |
| `signature-stripping` | Source has `signature`/`signaturekey` that must be removed | 378 | `81a503c8...` |
| `prev-null` | Explicit `prev: null` preserved vs absent keys dropped | 411 | `8d59333b...` |
| `number-edge-cases` | Integer vs float (`1.0` → `1`), zero, negative, fractional | 421 | `73bce3c5...` |
Full hashes and expected canonical forms are in the vectors file.
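If you are writing your own canonicalizer, a vector check might look like the sketch below. The `TestVector` field names (`input`, `expectedByteLength`, `expectedSha256`) are assumptions made for this example, so consult the vectors file for its actual schema; `checkVector` and `CanonicalizeFn` are likewise hypothetical names.

```typescript
import { createHash } from "node:crypto";

// Hypothetical vector shape; the real file may use different field names.
interface TestVector {
  label: string;
  input: unknown;
  expectedByteLength: number;
  expectedSha256: string; // lowercase hex
}

type CanonicalizeFn = (input: unknown) => Uint8Array;

// Returns a list of mismatches; an empty list means the vector passed.
function checkVector(vector: TestVector, canonicalizeFn: CanonicalizeFn): string[] {
  const bytes = canonicalizeFn(vector.input);
  const sha256 = createHash("sha256").update(bytes).digest("hex");
  const problems: string[] = [];
  if (bytes.length !== vector.expectedByteLength) {
    problems.push(`${vector.label}: got ${bytes.length} bytes, expected ${vector.expectedByteLength}`);
  }
  if (sha256 !== vector.expectedSha256) {
    problems.push(`${vector.label}: got SHA-256 ${sha256}, expected ${vector.expectedSha256}`);
  }
  return problems;
}
```

Comparing the byte length as well as the hash tends to make failures easier to diagnose: a length mismatch usually points to stray whitespace or escaping, while a same-length hash mismatch often points to key ordering.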
cd typescript
npm install
npm run test

Expected output:
Running 6 test vectors from .../canonical-vectors-v0.1.json
✓ full-event-no-audit
✓ minimal-event
✓ unicode-content
✓ signature-stripping
✓ prev-null
✓ number-edge-cases
Verifying canonicalSha256Hex() async helper:
✓ full-event-no-audit (async)
... [all 12 checks pass]
==================================================
6 passed, 0 failed
Status in this package: validated end-to-end. All six vectors pass, plus six additional async-helper validations.
import { canonicalize, canonicalizeToString, canonicalSha256Hex } from '@gaaim/canonicalize';
const event = {
specversion: "1.0",
id: "01HQ5P3KJ6X8W2YQGMZB9N4T7R",
source: "ide-plugin://example.org/code-adapter/instance-7",
type: "gaaim.core.artifact.created",
time: "2026-04-04T21:30:15.123Z",
gaaimversion: "1.0",
profile: "core",
data: { /* ... */ }
};
// Get canonical UTF-8 bytes (this is what you sign)
const canonicalBytes: Uint8Array = canonicalize(event);
// Or get the canonical string
const canonicalStr: string = canonicalizeToString(event);
// Or compute SHA-256 directly
const hash: string = await canonicalSha256Hex(event);

cd csharp/tests
dotnet run -- ../../test-vectors/canonical-vectors-v0.1.json

Requires .NET 8 SDK or later. Uses System.Text.Json (no external dependencies).
Status in this package: written and verified by inspection against the TypeScript reference. Number-formatting edge cases (very large/small doubles using scientific notation) are the most likely divergence point; run the test vectors to confirm on your target .NET version.
using Gaaim.Canonicalization;
using System.Text.Json;
string eventJson = "{ \"specversion\": \"1.0\", /* ... */ }";
// Get canonical UTF-8 bytes (this is what you sign)
byte[] canonicalBytes = Canonicalizer.Canonicalize(eventJson);
// Or from a JsonElement
using var doc = JsonDocument.Parse(eventJson);
byte[] bytes2 = Canonicalizer.Canonicalize(doc.RootElement);
// Get canonical string
string canonicalStr = Canonicalizer.CanonicalizeToString(doc.RootElement);
// Compute SHA-256 directly
string hash = Canonicalizer.CanonicalSha256Hex(eventJson);

Number serialization is the trickiest part of JCS. Both implementations handle:
- Integer fast path: `2847` → `"2847"`, `1247` → `"1247"`
- Whole-valued doubles: `1.0` → `"1"` (matches `String(1.0) === "1"` in ECMAScript)
- Negative zero: `-0.0` → `"0"`
- Fractional values: `0.92` → `"0.92"` via shortest-round-trip formatting
- Rejects NaN and Infinity (not valid JSON)
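In ECMAScript these cases all fall out of the built-in number-to-string conversion, so a minimal sketch only needs to add the NaN/Infinity rejection. The function name `serializeNumber` is hypothetical and is not taken from the package.

```typescript
// Format a number the way JCS requires, relying on ECMAScript's Number-to-String conversion.
function serializeNumber(value: number): string {
  if (!Number.isFinite(value)) {
    throw new RangeError("NaN and Infinity cannot be represented in JSON");
  }
  return String(value); // shortest round-trip digits; -0 is formatted as "0"
}

[2847, 1.0, -0, 0.92, 1e20, 1e21, 1e-6, 1e-7].forEach((sample) => {
  console.log(serializeNumber(sample));
});
// 2847
// 1
// 0
// 0.92
// 100000000000000000000
// 1e+21
// 0.000001
// 1e-7
```

The last four lines show where ECMAScript switches to exponent notation, which is the behavior a port to another language has to reproduce.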
Known limitation: the exponent threshold where scientific notation kicks in can differ between the ECMAScript algorithm and .NET's default double.ToString("R"). ECMAScript switches to exponent form at ≥ 1e21 or < 1e-6. .NET's threshold is similar but not guaranteed identical for all edge cases. Audit-event payloads rarely contain such extreme values (token counts, lines changed, durations in ms), so this matters only in unusual cases.
The C# NormalizeExponent helper converts "1E+21" → "1e+21" and ensures the + sign is present on positive exponents, matching ECMAScript's output format.
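The idea behind that helper can be illustrated with a small string transform, written here in TypeScript for consistency with the other examples. This is not a port of the C# code: the function below is a hypothetical sketch, and its trimming of zero-padded exponents (such as `1E-07`) is an assumption about what other formatters may emit, not something the description above states.

```typescript
// Rewrite an exponent-form number string into ECMAScript's style:
// lowercase "e", an explicit sign, and no zero padding in the exponent.
function normalizeExponent(formatted: string): string {
  const match = /^(-?[0-9.]+)[eE]([+-]?)0*([0-9]+)$/.exec(formatted);
  if (match === null) {
    return formatted; // not in exponent form; leave untouched
  }
  const mantissa = match[1];
  const sign = match[2] === "-" ? "-" : "+";
  const digits = match[3];
  return `${mantissa}e${sign}${digits}`;
}

console.log(normalizeExponent("1E+21")); // 1e+21
console.log(normalizeExponent("1E21"));  // 1e+21
console.log(normalizeExponent("1E-07")); // 1e-7
console.log(normalizeExponent("0.92"));  // 0.92
```

A transform like this only fixes the spelling of the exponent. It does not change which values are written in exponent form, so the threshold difference noted above still needs to be handled separately.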
Per RFC 8785 §3.2.2.2, JCS uses I-JSON (RFC 7493) string encoding: escape only the minimal set.
- Always escaped: `"`, `\`, and control characters U+0000 through U+001F
- Short escapes preferred: `\b` `\t` `\n` `\f` `\r`
- Other control chars: `\u00XX` lowercase hex
- Non-ASCII characters emitted as UTF-8 bytes, NOT as `\uXXXX` escapes
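The escaping rules listed above can be written out by hand as in the sketch below. The name `escapeJsonString` is hypothetical; in ECMAScript, `JSON.stringify` already produces the same result for well-formed strings, so this is mainly useful as a guide when porting to another language. Lone surrogates, which JCS treats as an error, are not checked here.

```typescript
const SHORT_ESCAPES = new Map<string, string>([
  ['"', '\\"'],
  ["\\", "\\\\"],
  ["\b", "\\b"],
  ["\t", "\\t"],
  ["\n", "\\n"],
  ["\f", "\\f"],
  ["\r", "\\r"],
]);

// Quote a string using the minimal escape set; everything else passes through unchanged.
function escapeJsonString(text: string): string {
  let out = '"';
  for (let i = 0; i < text.length; i++) {
    const unit = text.charAt(i);
    const code = text.charCodeAt(i);
    const short = SHORT_ESCAPES.get(unit);
    if (short !== undefined) {
      out += short;
    } else if (code < 0x20) {
      // Remaining control characters: \u00XX with lowercase hex digits.
      out += "\\u" + ("0000" + code.toString(16)).slice(-4);
    } else {
      out += unit; // includes non-ASCII; UTF-8 encoding happens when bytes are emitted
    }
  }
  return out + '"';
}

console.log(escapeJsonString('tab\there "日本語"')); // "tab\there \"日本語\""
console.log(escapeJsonString("\u001f"));             // "\u001f"
```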
The unicode test vector exercises this — a canonicalizer that emits \u65E5\u672C\u8A9E instead of the UTF-8 bytes for 日本語 will fail.
JCS requires lexicographic sort by UTF-16 code unit. JavaScript's default Array.prototype.sort() does this. In .NET, StringComparer.Ordinal is the equivalent — it compares strings char-by-char, and char in .NET is a UTF-16 code unit.
For characters outside the Basic Multilingual Plane (code points > U+FFFF), both languages represent them as surrogate pairs in their string types, and both sort them the same way when using code-unit comparison.
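The difference between code-unit order and code-point order shows up only when a supplementary character is compared against a BMP character above the surrogate range. The sketch below uses an explicit comparator, `compareCodeUnits` (a hypothetical name), to make the rule visible; the sample keys are chosen for illustration.

```typescript
// Compare two strings by UTF-16 code unit, the ordering JCS requires.
function compareCodeUnits(a: string, b: string): number {
  const shared = Math.min(a.length, b.length);
  for (let i = 0; i < shared; i++) {
    const diff = a.charCodeAt(i) - b.charCodeAt(i);
    if (diff !== 0) return diff;
  }
  return a.length - b.length; // a proper prefix sorts first
}

// "\uD83D\uDE00" is U+1F600 as a surrogate pair; "\uFB33" is a BMP character.
const sampleKeys = ["\uFB33", "\uD83D\uDE00", "b", "a"];
const byCodeUnit = [...sampleKeys].sort(compareCodeUnits);
console.log(byCodeUnit.map((k) => k.charCodeAt(0).toString(16)));
// [ '61', '62', 'd83d', 'fb33' ]
```

Sorting by code point would place U+FB33 before U+1F600, so a port that compares code points (or uses a locale-aware collation) can produce a different key order for such inputs.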
When you integrate these into your GAAIM producer or verifier:
- Run the test vectors. All six MUST pass.
- Do not modify the canonicalizer output with pretty-printing, logging instrumentation, or encoding transforms before signing.
- Remember: sign the bytes from `canonicalize()`, not the string. Re-encoding the string could introduce BOMs or normalization differences.
- Verifier: re-canonicalize on receipt before verifying signature. Don't trust that the sender canonicalized correctly; the bytes you verify must be the canonical bytes you produce from the received event.
- After Fix 4 integration (the `auditeligible` attribute), regenerate your production test vectors to include `auditeligible: true` on L1+ events.
function verifyEvent(event, registry):
    # 1. Extract signature and signaturekey
    signature = event.signature        # e.g., "ed25519:<base64url>"
    signaturekey = event.signaturekey  # e.g., "https://keys.example.com/v1/keys/adapter-2026q2"

    # 2. Resolve public key from registry
    keyRecord = registry.get(signaturekey)
    if keyRecord.keyUri != signaturekey: reject("registry-mismatch")
    if keyRecord.revokedAt != null: reject("key-withdrawn")

    # 3. Canonicalize event (signature fields stripped automatically)
    canonicalBytes = canonicalize(event)

    # 4. Verify signature over canonical bytes
    sigBytes = base64urlDecode(signature.after(":"))
    publicKey = parseSpkiBase64(keyRecord.publicKey)
    if not Ed25519.verify(publicKey, canonicalBytes, sigBytes):
        reject("signature-invalid")

    # 5. Verify chain continuity (§5.6.2) if event.auditeligible
    if event.auditeligible and event.prev != lastSeenEventId:
        flag("chain-gap")

    accept(event)
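Step 4 of the pseudocode could be realized in Node.js roughly as follows, using the built-in `node:crypto` module. The sketch assumes that the registry's `publicKey` field holds standard base64 of a DER-encoded SubjectPublicKeyInfo, which is an interpretation of `parseSpkiBase64` and not something the spec excerpt confirms. The name `verifyEd25519` and the throwaway demo key pair are hypothetical.

```typescript
import { createPublicKey, generateKeyPairSync, sign, verify } from "node:crypto";

// Verify an "ed25519:<base64url>" signature over already-canonicalized bytes.
function verifyEd25519(signature: string, spkiBase64: string, canonicalBytes: Uint8Array): boolean {
  const separator = signature.indexOf(":");
  if (separator < 0 || signature.slice(0, separator) !== "ed25519") {
    return false; // unknown or missing algorithm prefix
  }
  const sigBytes = Buffer.from(signature.slice(separator + 1), "base64url");
  const publicKey = createPublicKey({
    key: Buffer.from(spkiBase64, "base64"),
    format: "der",
    type: "spki",
  });
  // Ed25519 takes no separate digest algorithm, hence the null first argument.
  return verify(null, canonicalBytes, publicKey, sigBytes);
}

// Round trip with a throwaway key pair (demo only).
const demoPair = generateKeyPairSync("ed25519");
const demoBytes = new TextEncoder().encode('{"id":"demo"}');
const demoSignature = "ed25519:" + sign(null, demoBytes, demoPair.privateKey).toString("base64url");
const demoSpki = demoPair.publicKey.export({ type: "spki", format: "der" }).toString("base64");
console.log(verifyEd25519(demoSignature, demoSpki, demoBytes)); // true
```

Registry lookup, revocation checks, and chain continuity (steps 2 and 5) are left out here; only the signature check over the canonical bytes is shown.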
Apache-2.0. See the individual project files for details. This is reference code intended to be copied, modified, and embedded in GAAIM producers and verifiers.