spear-ai/webway

Spear Data Normalization Gateway

A Rust workspace that decodes legacy binary messages using types generated from XSD schemas or C header files. Designed for airgapped deployment via a pre-built, fully-vendored container image.


Getting started

Rust workspace: install the toolchain and system dependencies, then build and test (see Building locally).

Release tooling: install the Node dependencies used by Changesets once (see Releases):

npm install

Architecture

┌─────────────────────────────────────────────────────────────┐
│  Code generation                                            │
│                                                             │
│   .xsd files  ──► spear-gen  ──► types.proto                │
│                           └──► types.rs                     │
│                                (decode_raw / encoded_size)  │
│                                                             │
│   .h files    ──► header-gen ──► structs.rs  (decode())     │
│                           ├──► messages.proto               │
│                           ├──► mappers.rs                   │
│                           └──► review_report.txt            │
└─────────────────────────────────────────────────────────────┘

┌─────────────────────────────────────────────────────────────┐
│  Decode pipeline (spear-gateway)                            │
│                                                             │
│  legacy broker ──raw binary──► decode_raw() → Rust struct   │
│       or                                                    │
│  captured file ──raw binary──► decode_raw() → Rust struct   │
└─────────────────────────────────────────────────────────────┘

┌─────────────────────────────────────────────────────────────┐
│  Future: normalization + publish (spear-lib)                │
│                                                             │
│  decoded struct                                             │
│      │                                                      │
│      ▼                                                      │
│  prost encode → ProtoEnvelope → Publisher → Redpanda        │
│                                                 │           │
│                                        new consumers        │
└─────────────────────────────────────────────────────────────┘

Crates

| Crate | Type | Purpose |
| --- | --- | --- |
| spear-gen | binary | Code generator: XSD → .proto + .rs with decode_raw/encoded_size |
| header-gen | binary | Code generator: C headers → Rust structs + .proto + mapping functions |
| spear-lib | library | Runtime: WSDL parser, ProtoEnvelope, Redpanda publisher |
| spear-gateway | binary | Decode pipeline: raw binary bytes → generated types → printed output |

Building locally

# Requires: Rust stable, cmake, libcurl (for rdkafka in spear-lib)
cargo build --workspace --exclude header-gen
cargo test --workspace --exclude header-gen

# header-gen additionally requires llvm-dev + libclang-dev (Linux) or
# brew install llvm (macOS). Build and test it separately:
DYLD_LIBRARY_PATH=/opt/homebrew/opt/llvm/lib cargo test -p header-gen  # macOS
cargo test -p header-gen                                                  # Linux

spear-gen: XSD → code generation

Takes a directory of .xsd files and emits:

  • --out-proto: proto3 schema for downstream consumers
  • --out-rust: Rust structs with decode_raw(buf, same_endianness) and encoded_size() for the legacy binary wire format, plus prost::Message and serde::Deserialize derives

cargo run -p spear-gen -- \
  --input     schemas/synthetic \
  --out-proto generated/types.proto \
  --out-rust  generated/types.rs

See docs/xsd-proto-mapping.md for the full XSD → proto3/Rust mapping rules.
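
As a rough illustration of the --out-rust output, the sketch below hand-writes a type carrying the two methods named above. The TrackPoint struct, its fields, the packed 6-byte layout, the String error type, and the exact signatures are assumptions made for illustration; real spear-gen output is derived from the XSD and also carries the prost and serde derives, which are omitted here to keep the sketch dependency-free. It further assumes that same_endianness == true means the buffer is already in host byte order.

```rust
/// Hypothetical message type; real spear-gen output is derived from the XSD.
#[derive(Debug, PartialEq)]
pub struct TrackPoint {
    pub id: u32,
    pub altitude: i16,
}

impl TrackPoint {
    /// Size of the legacy wire form in bytes (assumed packed layout: 4 + 2).
    pub fn encoded_size(&self) -> usize {
        6
    }

    /// Decode from the legacy binary wire format.
    /// Assumption: `same_endianness == true` means the sender used the host
    /// byte order; otherwise each field is byte-swapped after reading.
    pub fn decode_raw(buf: &[u8], same_endianness: bool) -> Result<Self, String> {
        if buf.len() < 6 {
            return Err(format!("need 6 bytes, got {}", buf.len()));
        }
        let mut id = u32::from_ne_bytes([buf[0], buf[1], buf[2], buf[3]]);
        let mut altitude = i16::from_ne_bytes([buf[4], buf[5]]);
        if !same_endianness {
            id = id.swap_bytes();
            altitude = altitude.swap_bytes();
        }
        Ok(TrackPoint { id, altitude })
    }
}
```

A caller would pass the raw message bytes and use encoded_size() to know how far to advance in a buffer that holds several messages back to back.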


header-gen: C header → code generation

Takes a directory of .h files and emits three files per struct:

  • --out-rust: Rust structs with decode(bytes: &[u8]) (offset-based, configurable endianness), plus review_report.txt listing anything that requires manual review (bitfields, unions, unresolved types)
  • --out-proto: proto3 message definitions
  • --out-mapping: explicit map_*() functions from each Rust struct to its proto message
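
To make the three outputs concrete, here is a minimal hand-written sketch of how they could fit together for one struct. The C layout, the field names, the Option return type, and the MissionStatusProto stand-in (a plain struct instead of a prost message) are all assumptions for illustration, not actual header-gen output.

```rust
/// Stand-in for a C struct such as:
///   struct MissionStatus { unsigned int id; short state; };
/// Field names and offsets (0 and 4) are hypothetical; little-endian assumed.
#[derive(Debug, PartialEq)]
pub struct MissionStatus {
    pub id: u32,
    pub state: i16,
}

impl MissionStatus {
    /// Offset-based decode: each field is read from a fixed byte offset.
    pub fn decode(bytes: &[u8]) -> Option<Self> {
        if bytes.len() < 6 {
            return None;
        }
        let id = u32::from_le_bytes([bytes[0], bytes[1], bytes[2], bytes[3]]);
        let state = i16::from_le_bytes([bytes[4], bytes[5]]);
        Some(MissionStatus { id, state })
    }
}

/// Plain struct standing in for the generated proto message (hypothetical).
/// proto3 has no 16-bit integer type, so `short` is widened to a 32-bit field.
#[derive(Debug, PartialEq)]
pub struct MissionStatusProto {
    pub id: u32,
    pub state: i32,
}

/// Explicit mapping function in the map_*() style described above.
pub fn map_mission_status(src: &MissionStatus) -> MissionStatusProto {
    MissionStatusProto {
        id: src.id,
        state: i32::from(src.state),
    }
}
```

Keeping the mapping as an explicit function, rather than a blanket conversion, leaves a single place to adjust by hand when the review report flags a field.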

From the spear-dev container (recommended — no system libclang install required):

podman exec -it spear-dev header-gen \
  --input      /spear/headers \
  --include    /spear/includes \
  --include    /spear/lm-includes \
  --endian     little \
  --word-size  32 \
  --define     LINUX \
  --out-rust   /workspace/generated/rust \
  --out-proto  /workspace/generated/proto \
  --out-mapping /workspace/generated/mapping

From source (requires Rust + libclang):

cargo run -p header-gen -- \
  --input      headers/ \
  --include    /usr/include \
  --endian     little \
  --word-size  32 \
  --define     LINUX \
  --out-rust   generated/rust \
  --out-proto  generated/proto \
  --out-mapping generated/mapping

--word-size controls how long/unsigned long are mapped:

  • 32 → i32/u32 (LP32/ILP32 ABI)
  • 64 → i64/u64 (LP64 ABI)

--endian controls the decode method emitted (from_le_bytes vs from_be_bytes).
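
The combined effect of the two flags can be sketched with a single C long field. The helper names below are hypothetical and only illustrate the width and byte-order choice; the generated decode() bodies may be structured differently.

```rust
/// C `long` under --word-size 32 --endian little: 4 bytes, little-endian.
pub fn read_long_32_le(bytes: &[u8], offset: usize) -> Option<i32> {
    let end = offset.checked_add(4)?;
    let b = bytes.get(offset..end)?;
    Some(i32::from_le_bytes([b[0], b[1], b[2], b[3]]))
}

/// C `long` under --word-size 64 --endian big: 8 bytes, big-endian.
pub fn read_long_64_be(bytes: &[u8], offset: usize) -> Option<i64> {
    let end = offset.checked_add(8)?;
    let b = bytes.get(offset..end)?;
    Some(i64::from_be_bytes([
        b[0], b[1], b[2], b[3], b[4], b[5], b[6], b[7],
    ]))
}
```

The same eight bytes yield different values under the two settings, which is why both flags need to match the platform that produced the capture.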

--include PATH (repeatable) adds an extra clang include search path. Use this for system headers or cross-compilation sysroots that your --input headers #include. Structs defined in --include directories are not emitted — only structs whose definition lives under --input appear in the output.

--verbose / -v prints per-struct filter decisions to stderr. Use this to diagnose 0-struct output without a redeploy cycle:

header-gen: input_dir as given   = `headers/`
header-gen: input_dir canonical  = `/spear/mission/headers`
[filter] MissionStatus  @  /spear/mission/headers/mission.h  ->  PASS
[filter] _IO_FILE       @  /usr/include/libio.h              ->  SKIP

If all structs show SKIP, canonical_input_dir and the paths libclang reports do not share a common prefix — check that --input points at the actual header directory, not a parent or symlink that resolves differently.
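
The prefix check described above can be approximated as follows. This is a sketch under the assumption that the filter compares the canonical input directory with the declaring file's path component by component; the function name is hypothetical and the real implementation in header-gen may differ.

```rust
use std::path::Path;

/// Returns true (PASS) when the file that declares a struct lives under the
/// canonical input directory, false (SKIP) otherwise.
/// `Path::starts_with` compares whole path components, so
/// `/spear/mission/headers-old` is not treated as being under
/// `/spear/mission/headers`.
pub fn passes_filter(canonical_input_dir: &Path, decl_file: &Path) -> bool {
    decl_file.starts_with(canonical_input_dir)
}
```

With the paths from the sample output, /spear/mission/headers/mission.h passes and /usr/include/libio.h is skipped. Comparing against the non-canonical headers/ would skip everything, which matches the all-SKIP symptom.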

Deployment: header-gen is deployed via the spear-dev container image, which installs libclang at the OS level. There is no standalone release binary — LLVM's transitive shared-library dependencies make portable binary distribution impractical.


Synthetic schemas

schemas/synthetic/ contains representative XSD files used for local development and CI. The classified-side XSDs drop in as a direct replacement.

| File | Demonstrates |
| --- | --- |
| track.xsd | Nested complex types, optional fields, enumerations |
| alert.xsd | xs:choice, maxOccurs="unbounded", cross-file enum refs |
| status.xsd | xs:extension (inheritance), 3-level nesting, plain string enums |
| sub/credentials.xsd | Subdirectory scanning, primitive type aliases (xs:base64Binary → Vec<u8>) |

Airgapped deployment

The classified side has no crate registry. The workflow is:

1. Build the dev container (internet-connected machine)

./scripts/build-image.sh
# → vendors all crates, builds linux/amd64 image, saves to spear-dev.tar.gz

Transfer spear-dev.tar.gz to the classified side.

2. Load and run the container (classified side)

podman load < spear-dev.tar.gz

podman run -d --name spear-dev \
  -v /path/to/workspace:/workspace \
  spear-dev:latest

podman exec -it spear-dev bash

The container includes the full Rust toolchain, all vendored crate sources, and pre-compiled build artifacts. Rebuilds inside the container recompile only the Rust code that changed; the heavy C dependencies are already built.

3. Generate types from real schemas (inside container)

# From XSD files
spear-gen \
  --input     /workspace/xsds \
  --out-proto /workspace/types.proto \
  --out-rust  /workspace/types.rs

# From C header files
header-gen \
  --input      /workspace/headers \
  --endian     little \
  --word-size  32 \
  --define     LINUX \
  --out-rust   /workspace/generated/rust \
  --out-proto  /workspace/generated/proto \
  --out-mapping /workspace/generated/mapping

4. Plug in generated types and rebuild

cp /workspace/types.rs /spear/crates/spear-gateway/src/types.rs
# In crates/spear-gateway/src/main.rs:
#   1. Uncomment include!("types.rs")
#   2. Add decode_raw call in decode_and_print()
cargo build --offline --release -p spear-gateway
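
The second edit in main.rs might look roughly like the sketch below. Only the names decode_and_print and decode_raw come from the steps above; the Heartbeat type (standing in for a type from the generated types.rs), the function signature, the returned String, and the choice of same_endianness = true are assumptions for illustration.

```rust
/// Stand-in for a type from the generated types.rs (hypothetical).
#[derive(Debug)]
pub struct Heartbeat {
    pub seq: u16,
}

impl Heartbeat {
    /// Same shape as the generated decode_raw; the body here is a stub.
    pub fn decode_raw(buf: &[u8], same_endianness: bool) -> Result<Self, String> {
        if buf.len() < 2 {
            return Err(format!("need 2 bytes, got {}", buf.len()));
        }
        let seq = u16::from_ne_bytes([buf[0], buf[1]]);
        Ok(Heartbeat {
            seq: if same_endianness { seq } else { seq.swap_bytes() },
        })
    }
}

/// Decode one raw message and print its Debug representation.
/// Returns the printed line so callers and tests can inspect it.
pub fn decode_and_print(buf: &[u8]) -> Result<String, String> {
    // Assumption: the capture was produced on a host with the same byte order.
    let msg = Heartbeat::decode_raw(buf, true)?;
    let line = format!("{:?}", msg);
    println!("{}", line);
    Ok(line)
}
```

In the real gateway the Heartbeat stub would be replaced by whichever generated type matches the captured message.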

5. Decode a captured binary

# File mode — decode a raw binary captured from the wire
./target/release/spear-gateway --file /workspace/captures/msg.bin

# Live mode — connect to the legacy broker (C integration, coming later)
./target/release/spear-gateway --live

CI

| Job | What it checks |
| --- | --- |
| test | cargo test --workspace --exclude header-gen on Ubuntu + macOS |
| test-header-gen | cargo test -p header-gen on Ubuntu (requires libclang) |
| check-musl | cargo check -p spear-gen (musl target) |
| lint | cargo fmt --check + cargo clippy -D warnings (full workspace) |

header-gen is excluded from the cross-platform test matrix because macOS runners don't ship with libclang. It gets full test coverage in the dedicated test-header-gen job on Linux.

Releases

Releases are managed with Changesets. Install Node dependencies once with npm install — see Getting started.

To include a change in a release, add a changeset file to your PR:

npm run changeset

Or create .changeset/<your-change>.md manually:

---
"@spear-ai/webway": patch
---

Description of what changed.

Bump types: patch (bug fix), minor (new feature), major (breaking change).

When your PR merges to main, the prepare-release workflow opens or updates a "Release 🚀" PR with the version bump applied across all crates. Merging that PR tags the release and triggers the build.

Artifacts attached to each GitHub Release:

  • spear-gen-linux-x86_64-musl.tar.gz — airgapped deployment binary (musl static)
  • spear-gen-linux-x86_64.tar.gz
  • spear-gen-macos-arm64.tar.gz
  • spear-gen-macos-x86_64.tar.gz
  • spear-dev-<version>.tar.gz — container image

header-gen is not released as a standalone binary — it is deployed via the spear-dev container image.


Project phases

| Phase | Status | Description |
| --- | --- | --- |
| Phase 1 | Done | spear-gen (XSD → proto + Rust) + spear-lib (envelope, publisher, WSDL parser) |
| Phase 2 | Done | spear-gateway decode pipeline + offline dev container (spear-dev) |
| Phase 2b | Done | header-gen (C headers → Rust structs + proto + mapping functions) |
| Phase 3 | Planned | Live legacy broker integration → normalize → publish to Redpanda |
| Phase 4 | Planned | Hardening, observability, airgapped K8s manifests |
