Swift Santorini engine + MLX training + web UI.
Game rules run in Swift (native and SwiftWasm), and the web UI uses ONNX Runtime Web for neural inference.
- `Packages/SantoriniCore`: game rules, encoding, MCTS implementation.
- `Sources/NeuralNetwork`: MLX model (`SantoriniNet`).
- `Sources/Training`: self-play, replay buffer, trainer/evaluation pipeline.
- `Sources/AlphaSantorini`: CLI commands (`train`, `inspect`, `arena`, etc.).
- `Packages/SantoriniWasm`: SwiftWasm build used by the browser UI.
- `web/`: Vite + Three.js frontend.
- `tools/export_onnx.py`: exports MLX safetensors checkpoints to ONNX.
- `tools/verify_onnx.js`: runtime sanity check for exported ONNX.
- `experiments.md`: experiment history and rationale for current defaults.
The model input is NHWC `[batch, 5, 5, 9]`.
Per-cell planes (one-hot):
`H0`, `H1`, `H2`, `H3`, `DOME`, `CURRENT_W1`, `CURRENT_W2`, `OTHER_W1`, `OTHER_W2`
Important:
- Worker planes are turn-relative (`CURRENT_*` vs `OTHER_*`), not absolute player-color planes.
- This encoding is defined in `Packages/SantoriniCore/Sources/Santorini/Encoding/GameStateEncoding.swift`.
- In the web UI, MCTS currently reconstructs this 9-plane tensor from `getStateSummary()` to match the training format.
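As a concrete illustration of the layout, here is a minimal sketch of the 9-plane tensor in Python/NumPy. The plane order and the helper names (`encode_state`, the `*_W1`/`*_W2` constants) are assumptions for illustration; the authoritative encoding is in `GameStateEncoding.swift`.

```python
import numpy as np

# Hypothetical sketch of the 9-plane, turn-relative encoding described above.
# Assumed plane order: H0, H1, H2, H3, DOME, CURRENT_W1, CURRENT_W2, OTHER_W1, OTHER_W2.
H0, H1, H2, H3, DOME, CUR_W1, CUR_W2, OTH_W1, OTH_W2 = range(9)

def encode_state(heights, domes, current_workers, other_workers):
    """heights: 5x5 ints in 0..3; domes: set of (r, c);
    *_workers: two (r, c) positions each, from the side to move's perspective."""
    planes = np.zeros((5, 5, 9), dtype=np.float32)  # NHWC without the batch dim
    for r in range(5):
        for c in range(5):
            if (r, c) in domes:
                planes[r, c, DOME] = 1.0
            else:
                planes[r, c, H0 + heights[r][c]] = 1.0  # one-hot height
    for plane, (r, c) in zip((CUR_W1, CUR_W2), current_workers):
        planes[r, c, plane] = 1.0
    for plane, (r, c) in zip((OTH_W1, OTH_W2), other_workers):
        planes[r, c, plane] = 1.0
    return planes

# Example: empty board, side to move on one diagonal, opponent on the other.
x = encode_state([[0] * 5 for _ in range(5)], set(),
                 [(0, 0), (4, 4)], [(0, 4), (4, 0)])
```

Because the worker planes are turn-relative, the same physical position encodes differently depending on whose turn it is.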
Total policy size is 153 actions:
- 25 placements (5x5)
- 128 moves (2 workers * 8 move dirs * 8 build dirs)
Defined in Packages/SantoriniCore/Sources/Santorini/Encoding/ActionEncoding.swift.
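The arithmetic behind the 153-action space can be sketched as below. The index layout (placements first, then moves in `(worker, moveDir, buildDir)` order) is an assumption for illustration; the authoritative mapping lives in `ActionEncoding.swift`.

```python
# Hypothetical index layout for the 153-action policy vector.
# Assumed order: 25 placement actions (row-major cells), then
# 128 move actions packed as (worker, moveDir, buildDir).
NUM_PLACEMENTS = 5 * 5                     # 25
NUM_MOVES = 2 * 8 * 8                      # 128
POLICY_SIZE = NUM_PLACEMENTS + NUM_MOVES   # 153

def placement_index(row, col):
    return row * 5 + col

def move_index(worker, move_dir, build_dir):
    return NUM_PLACEMENTS + (worker * 8 + move_dir) * 8 + build_dir
```

Under this layout the last move action, `move_index(1, 7, 7)`, lands exactly at index 152.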
SantoriniNet (MLX) is a residual convolutional model:
- Input block: `Conv3x3(9->256) + BatchNorm + ReLU`
- Residual tower: 8 residual blocks (default), each:
  - `Conv3x3 + BN + ReLU`
  - `Conv3x3 + BN`
  - skip add + ReLU
- Policy head:
  - `Conv1x1(256->2) + BN + ReLU`
  - flatten `2*5*5 = 50`
  - `Linear(50->153)` (logits; softmax applied at inference/loss)
- Value head:
  - `Conv1x1(256->1) + BN + ReLU`
  - flatten `25`
  - `Linear(25->64) + ReLU`
  - `Linear(64->1) + tanh` (value in `[-1, 1]`)
Implementation: Sources/NeuralNetwork/SantoriniNet.swift.
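For readers more familiar with PyTorch than MLX, the architecture above can be mirrored roughly as follows. This is an illustrative sketch only (the real model is MLX/Swift in `SantoriniNet.swift`); class and attribute names here are invented.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class ResBlock(nn.Module):
    """One residual block: Conv3x3+BN+ReLU, Conv3x3+BN, skip add + ReLU."""
    def __init__(self, ch=256):
        super().__init__()
        self.c1 = nn.Conv2d(ch, ch, 3, padding=1)
        self.b1 = nn.BatchNorm2d(ch)
        self.c2 = nn.Conv2d(ch, ch, 3, padding=1)
        self.b2 = nn.BatchNorm2d(ch)

    def forward(self, x):
        y = F.relu(self.b1(self.c1(x)))
        y = self.b2(self.c2(y))
        return F.relu(x + y)  # skip add + ReLU

class SantoriniNetSketch(nn.Module):
    """Hypothetical PyTorch mirror of the MLX SantoriniNet described above."""
    def __init__(self, blocks=8, ch=256):
        super().__init__()
        self.stem = nn.Sequential(nn.Conv2d(9, ch, 3, padding=1),
                                  nn.BatchNorm2d(ch), nn.ReLU())
        self.tower = nn.Sequential(*[ResBlock(ch) for _ in range(blocks)])
        self.p_conv = nn.Sequential(nn.Conv2d(ch, 2, 1),
                                    nn.BatchNorm2d(2), nn.ReLU())
        self.p_fc = nn.Linear(2 * 5 * 5, 153)        # flatten 50 -> 153 logits
        self.v_conv = nn.Sequential(nn.Conv2d(ch, 1, 1),
                                    nn.BatchNorm2d(1), nn.ReLU())
        self.v_fc = nn.Sequential(nn.Linear(25, 64), nn.ReLU(),
                                  nn.Linear(64, 1), nn.Tanh())

    def forward(self, x):                  # x: [N, 5, 5, 9] (NHWC, as trained)
        x = x.permute(0, 3, 1, 2)          # torch convs expect NCHW
        h = self.tower(self.stem(x))
        logits = self.p_fc(self.p_conv(h).flatten(1))  # [N, 153]
        value = self.v_fc(self.v_conv(h).flatten(1))   # [N, 1] in [-1, 1]
        return logits, value
```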
Training loop phases per iteration:
- Self-play (batched async MCTS + batched NN eval)
- SGD updates from replay buffer
- Arena evaluation vs best checkpointed model
- Checkpoint save
Defaults come from `TrainingConfig` and CLI `train` options:
- `iterations: 500`
- `gamesPerIteration: 100`
- `mctsSimulations: 256`
- `trainingStepsPerIteration: 100`
- `batchSize: 128`
- `learningRate: 0.001`
- `replayBufferSize: 100000`
- `symmetryAugmentation: true` (8x via board symmetries)
- Dirichlet noise: `epsilon = 0.25`, `alpha = 0.3`
- Noise anneal: to floor `0.05` across `150` iterations
- Value target strategy: `terminal` (option: `mcts`)
- Evaluation: every `10` iterations, `20` games, promotion threshold `0.55`
- Checkpoints: every `10` iterations
- Early stop: `100` iterations without promotion
Batching defaults:
- `selfPlayBatchSize: 128`
- `selfPlayConcurrency: 150`
- `batchTimeoutMicroseconds: 100`
Note: self-play runs `max(gamesPerIteration, selfPlayConcurrency)` games to keep concurrency saturated. With defaults, that means 150 games even if `gamesPerIteration = 100`.
- Policy loss: cross-entropy over legal-action support with masked logits: illegal logits are set to `-1e9` before `logSoftmax`.
- Value loss: MSE.
Implementation: Sources/Training/SantoriniTrainer.swift.
- Swift 6.2+ (native build + training)
- Python 3.9+ with `torch`, `safetensors`, `onnx` (ONNX export)
- Node 20+ (web UI)
- SwiftWasm SDK (for WASM bundle generation)
Build:

```
xcrun swift build
```

Fast core tests (rules + MCTS only):

```
xcrun swift test --package-path Packages/SantoriniCore
```

Full suite (core + NN + training + integration):

```
xcrun swift test
```

Integration test note: it runs a tiny 1-iteration training loop and writes a temporary checkpoint.
Default training run:

```
xcrun swift run AlphaSantorini
```

Explicit training with overrides:

```
xcrun swift run AlphaSantorini train \
  --iterations 500 \
  --games-per-iteration 100 \
  --mcts-simulations 256
```

Resume from checkpoint:

```
xcrun swift run AlphaSantorini train \
  --resume-checkpoint checkpoints/checkpoint_480.safetensors
```

Inspect priors/value on a position:

```
xcrun swift run AlphaSantorini inspect \
  --checkpoint checkpoints/final.safetensors \
  --advance 10 \
  --mcts-simulations 200 \
  --top-k 10
```

Checkpoint-vs-checkpoint arena:

```
xcrun swift run AlphaSantorini arena \
  --checkpoint-a checkpoints/checkpoint_350.safetensors \
  --checkpoint-b checkpoints/checkpoint_480.safetensors \
  --games 50
```

Checkpoint vs uniform-policy baseline:

```
xcrun swift run AlphaSantorini arena-baseline \
  --checkpoint checkpoints/final.safetensors \
  --games 50
```

Export checkpoint to ONNX:

```
python3 tools/export_onnx.py \
  --checkpoint checkpoints/final.safetensors \
  --output web/public/models/santorini.onnx
```

The exporter auto-detects the architecture (current conv net and legacy FC variants) and writes a single self-contained `.onnx` file.
Verify ONNX output:

```
cd web
npm install
NODE_PATH=./node_modules node ../tools/verify_onnx.js public/models/santorini.onnx
```

List installed SDKs first:

```
xcrun swift sdk list
```

Then build the WASM bundle (PackageToJS):

```
cd Packages/SantoriniWasm
swift package --swift-sdk <wasm-sdk-id> \
  --allow-writing-to-directory ../../web/public/wasm \
  js --product SantoriniWasm --output ../../web/public/wasm
```

Outputs:

- `web/public/wasm/index.js`
- `web/public/wasm/SantoriniWasm.wasm`
If you hit toolchain linker issues with `swift`, use `xcrun swift ...`.
```
cd web
npm install
npm run dev
```

Expected runtime assets:

- `web/public/wasm/index.js`
- `web/public/wasm/SantoriniWasm.wasm`
- `web/public/wasm-loader.js`
- `web/public/models/santorini.onnx` (optional but recommended)
If the model is missing or fails to load, the UI falls back to baseline MCTS behavior (uniform-network prior/value).
Build static site:

```
cd web
npm run build
```

Common wrappers from repo root:

```
make build
make wasm
make onnx CHECKPOINT=checkpoints/final.safetensors
make web-build
make web-dev
```

- Run core tests: `xcrun swift test --package-path Packages/SantoriniCore`
- Run full tests: `xcrun swift test`
- Train a short run (`--iterations 1` or `--iterations 10`) to verify your environment
- Export one checkpoint to ONNX
- Build WASM bundle
- Launch web UI and verify model-backed moves