benzsevern/README.md

Hi, I'm Ben

Website LinkedIn PyPI

I build open-source data quality and entity resolution tools in Python. Everything I ship lands on PyPI and the MCP Registry so it works out of the box with LLM agents.


Golden Suite

A modular toolkit where each piece works standalone or chains together via GoldenPipe.

  CSV / DB / API
       │
       ▼
┌─────────────┐    ┌─────────────┐    ┌─────────────┐
│ GoldenCheck │───▶│ GoldenFlow  │───▶│ GoldenMatch │
│  Validate   │    │  Transform  │    │   Resolve   │
└─────────────┘    └─────────────┘    └─────────────┘
       └──────────────────┬──────────────────┘
                          ▼
                    GoldenPipe
                   (orchestrator)
| Stage | Project | Highlights |
|---|---|---|
| Resolve | GoldenMatch | 97.2% F1 on DBLP-ACM · 30 MCP tools · 10 A2A skills |
| Validate | GoldenCheck | Zero-config profiling & drift detection · 19 MCP tools |
| Transform | GoldenFlow | 76 transforms · DQBench Transform: 100/100 · 10 MCP tools |
| Orchestrate | GoldenPipe | Chains Check → Flow → Match · 4 MCP tools |
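
The Check → Flow → Match chaining in the diagram can be illustrated with a minimal, library-free sketch. Every name below (`validate`, `transform`, `resolve`, `run_pipeline`) is hypothetical and is not the Golden Suite API; the stages are toy stand-ins meant only to show how independent steps compose into one pipeline.

```python
from typing import Callable

Record = dict[str, str]
Stage = Callable[[list[Record]], list[Record]]


def validate(records: list[Record]) -> list[Record]:
    """Toy validation stage: keep only records with a non-blank name."""
    return [r for r in records if r.get("name", "").strip()]


def transform(records: list[Record]) -> list[Record]:
    """Toy transform stage: collapse whitespace and lowercase the name."""
    return [{**r, "name": " ".join(r["name"].split()).lower()} for r in records]


def resolve(records: list[Record]) -> list[Record]:
    """Toy resolution stage: exact-match dedup on the normalized name."""
    first_seen: dict[str, Record] = {}
    for r in records:
        first_seen.setdefault(r["name"], r)  # keep the first record per name
    return list(first_seen.values())


def run_pipeline(records: list[Record], stages: list[Stage]) -> list[Record]:
    """Hypothetical orchestrator: feed each stage's output into the next."""
    for stage in stages:
        records = stage(records)
    return records


rows = [{"name": "Ada  Lovelace"}, {"name": "ada lovelace"}, {"name": ""}]
result = run_pipeline(rows, [validate, transform, resolve])
print(result)
```

Because each stage takes and returns a plain list of records, any one of them can also be called on its own, which mirrors the "standalone or chained" design described above.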

Other Projects

DQBench — The standard benchmark for data quality tools. 4 categories, 12 tiers, 161 tests. Used to score Golden Suite and compare against Great Expectations, Pandera, Soda Core, and others.

InferMap — Inference-driven schema mapping for Python & TypeScript. 7 scorers, domain dictionaries, cross-language parity (F1 0.84).

DevPilot — Dev server supervisor for AI coders. CLI + MCP server with 10 tools. Lifecycle management, health checks, crash recovery.


Tech

Python Rust TypeScript Polars

Pinned

  1. goldenmatch (Public)

    Entity resolution and deduplication toolkit — outperforms Splink, dedupe, and RecordLinkage on cross-domain benchmarks. Zero-config. MST cluster auto-splitting. Quality-weighted survivorship. 30 MC…

    Python · 30 stars · 5 forks

  2. goldencheck (Public)

    Data validation that discovers rules from your data. Python + TypeScript. DQBench Score: 88.40.

    Python · 1 star

  3. goldenflow (Public)

    Data transformation toolkit — standardize, reshape, and normalize messy data. Python & TypeScript. 83 transforms, zero-config mode, MCP server, edge-safe. DQBench 100/100.

    Python · 1 star

  4. infermap (Public)

    Inference-driven schema mapping engine for Python and TypeScript. 7 built-in scorers, domain dictionaries (healthcare/finance/ecommerce), confidence calibration, cross-language accuracy benchmark (…

    Python

  5. goldenpipe (Public)

    Golden Suite orchestrator — chains validation (GoldenCheck), transformation (GoldenFlow), and entity resolution (GoldenMatch). 4 MCP tools on Smithery. DQBench Pipeline: 88.07.

    Python

  6. dqbench (Public)

    The standard benchmark for data quality tools — detection, transformation, entity resolution, and pipeline orchestration. 4 categories, 12 tiers, 161 tests.

    Python