TypeScript API

benzsevern edited this page Apr 8, 2026 · 1 revision

Full API reference for the infermap npm package — the TypeScript port of the Python library with bit-for-bit mapping parity.

Install:

npm install infermap

Exports

| Entrypoint | Contents | Runtime |
| --- | --- | --- |
| infermap / infermap/core | Types, map(), MapEngine, all 6 scorers, Hungarian assignment, in-memory/CSV/JSON/schema-file providers, config loader | edge-safe |
| infermap/node | extractSchemaFromFile, extractDbSchema (SQLite/Postgres/DuckDB) | Node only |

Convenience function

map(source, target, options?)

The simplest entry point. Normalizes polymorphic inputs to SchemaInfo and runs MapEngine.

import { map } from "infermap";

const result = map(
  { records: [{ fname: "A", email_addr: "a@example.com" }] },
  { records: [{ first_name: "", email: "" }] }
);

MapInput — accepted shapes for source and target:

type MapInput =
  | SchemaInfo                                                 // pre-extracted
  | { records: Array<Record<string, unknown>>; sourceName?: string }   // plain records
  | { csvText: string; sourceName?: string }                           // CSV as string
  | { jsonText: string; sourceName?: string }                          // JSON array as string
  | { schemaDefinition: string | object; sourceName?: string };        // JSON schema file
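Because every non-schema shape carries a distinct key, the union can be told apart with plain `in` checks. The following is only a sketch of that idea: `describeInput` and `SchemaInfoLike` are hypothetical names defined locally, and the library's own normalization may work differently.

```typescript
// Trimmed local stand-in for SchemaInfo so the sketch runs on its own.
interface SchemaInfoLike {
  fields: unknown[];
  sourceName: string;
  requiredFields: string[];
}

type MapInput =
  | SchemaInfoLike
  | { records: Array<Record<string, unknown>>; sourceName?: string }
  | { csvText: string; sourceName?: string }
  | { jsonText: string; sourceName?: string }
  | { schemaDefinition: string | object; sourceName?: string };

// Hypothetical helper: report which branch of the union an input belongs to.
function describeInput(input: MapInput): string {
  if ("records" in input) return `records (${input.records.length} rows)`;
  if ("csvText" in input) return "csv";
  if ("jsonText" in input) return "json";
  if ("schemaDefinition" in input) return "schema-definition";
  // Only the pre-extracted schema shape is left at this point.
  return `schema (${input.fields.length} fields)`;
}

console.log(describeInput({ records: [{ fname: "A" }] }));
console.log(describeInput({ csvText: "fname,email\nA,a@example.com" }));
```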

MapOptions:

interface MapOptions {
  required?: string[];              // extra required target fields
  schemaFile?: SchemaInfo;          // merge aliases from here
  sampleSize?: number;              // profile sample cap (default 500)
  config?: string | EngineConfig;   // scorer overrides + alias extensions
  engineOptions?: MapEngineOptions; // minConfidence, explicit scorer list, etc.
}

Core classes

MapEngine

import { MapEngine, defaultScorers } from "infermap";

const engine = new MapEngine({
  minConfidence: 0.3,       // drop mappings below this combined score
  scorers: defaultScorers(),// explicit scorer chain; overrides config
  onScorerError: (info) => {// optional per-scorer exception handler
    console.warn(info.scorer, info.error);
  },
});

const result = engine.mapSchemas(srcSchema, tgtSchema, {
  required: ["email"],
  schemaFile: loadedSchemaFile, // merges aliases into target metadata
});

Scorer interface

interface Scorer {
  readonly name: string;
  readonly weight: number;
  score(source: FieldInfo, target: FieldInfo): ScorerResult | null;
}

Returning null means the scorer abstains, so it is excluded from the weighted average. Returning makeScorerResult(0, "...") is a real negative and counts toward the average.
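The difference shows up once scores are combined. The sketch below assumes a plain weighted average over the scorers that returned a result; `combine` and `WeightedResult` are hypothetical names defined locally, not the library's internals.

```typescript
// Local copy of the documented result shape, so the sketch runs on its own.
interface ScorerResult {
  score: number;
  reasoning: string;
}

interface WeightedResult {
  weight: number;
  result: ScorerResult | null; // null = the scorer abstained
}

// Hypothetical helper: weighted average over the scorers that did not abstain.
function combine(results: WeightedResult[]): number {
  let weightedSum = 0;
  let totalWeight = 0;
  for (const { weight, result } of results) {
    if (result === null) continue; // abstentions add nothing to either sum
    weightedSum += weight * result.score;
    totalWeight += weight;
  }
  return totalWeight === 0 ? 0 : weightedSum / totalWeight;
}

const hit: ScorerResult = { score: 0.9, reasoning: "names are similar" };
const miss: ScorerResult = { score: 0, reasoning: "types conflict" };

// Second scorer abstains: the first score passes through unchanged.
const withAbstain = combine([
  { weight: 1, result: hit },
  { weight: 1, result: null },
]);

// Second scorer returns a real zero: the average is pulled down.
const withNegative = combine([
  { weight: 1, result: hit },
  { weight: 1, result: miss },
]);

console.log(withAbstain, withNegative); // 0.9 0.45
```

A scorer that has no opinion about a field pair should therefore return null rather than zero, otherwise it drags down matches that other scorers support.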

Built-in scorers

All six mirror the Python package exactly:

import {
  ExactScorer,       // 1.0 on case-insensitive exact name match
  AliasScorer,       // 0.95 on canonical alias match; accepts extraAliases
  PatternTypeScorer, // regex-based semantic type detection from samples
  ProfileScorer,     // weighted profile comparison (dtype + null/uniq/len/card)
  FuzzyNameScorer,   // Jaro-Winkler on normalized names
  LLMScorer,         // stub with pluggable adapter; always abstains in sync path
} from "infermap";

Default chain (via defaultScorers()) includes the first five. LLMScorer is opt-in.
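To make the FuzzyNameScorer description concrete, here is a minimal Jaro-Winkler sketch. It is a standalone illustration of the standard algorithm, not the package's code; the `normalize` rule (lowercase, drop non-alphanumerics) and the function names are assumptions.

```typescript
// Assumed normalization: lowercase and strip separators such as "_" or "-".
function normalize(name: string): string {
  return name.toLowerCase().replace(/[^a-z0-9]/g, "");
}

// Jaro similarity: matching characters within a sliding window, minus transpositions.
function jaro(a: string, b: string): number {
  if (a === b) return 1;
  if (a.length === 0 || b.length === 0) return 0;
  const window = Math.max(0, Math.floor(Math.max(a.length, b.length) / 2) - 1);
  const aMatched = new Array<boolean>(a.length).fill(false);
  const bMatched = new Array<boolean>(b.length).fill(false);
  let matches = 0;
  for (let i = 0; i < a.length; i++) {
    const lo = Math.max(0, i - window);
    const hi = Math.min(b.length - 1, i + window);
    for (let j = lo; j <= hi; j++) {
      if (!bMatched[j] && a[i] === b[j]) {
        aMatched[i] = true;
        bMatched[j] = true;
        matches++;
        break;
      }
    }
  }
  if (matches === 0) return 0;
  // Count matched characters that appear in a different order.
  let outOfOrder = 0;
  let k = 0;
  for (let i = 0; i < a.length; i++) {
    if (!aMatched[i]) continue;
    while (!bMatched[k]) k++;
    if (a[i] !== b[k]) outOfOrder++;
    k++;
  }
  const t = outOfOrder / 2;
  return (matches / a.length + matches / b.length + (matches - t) / matches) / 3;
}

// Winkler adjustment: boost pairs that share a prefix of up to 4 characters.
function jaroWinkler(a: string, b: string, scaling = 0.1): number {
  const j = jaro(a, b);
  const maxPrefix = Math.min(4, a.length, b.length);
  let prefix = 0;
  while (prefix < maxPrefix && a[prefix] === b[prefix]) prefix++;
  return j + prefix * scaling * (1 - j);
}

// Hypothetical wrapper: compare two field names after normalization.
function fuzzyNameScore(source: string, target: string): number {
  return jaroWinkler(normalize(source), normalize(target));
}

console.log(fuzzyNameScore("first_name", "FirstName")); // 1
console.log(fuzzyNameScore("fname", "first_name"));
```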

defineScorer(name, fn, weight?)

Register a function-style scorer without a class:

import { defineScorer, makeScorerResult } from "infermap";

const mine = defineScorer(
  "DomainMatcher",
  (src, tgt) => {
    if (src.name.startsWith("cust_") && tgt.name.startsWith("customer_")) {
      return makeScorerResult(0.9, "shared customer prefix");
    }
    return null;
  },
  0.6 // weight
);

Providers (edge-safe)

import {
  inferSchemaFromRecords,     // plain Array<Record<string, unknown>>
  inferSchemaFromCsvText,     // CSV text (RFC 4180)
  inferSchemaFromJsonText,    // JSON array of records
  parseSchemaDefinition,      // JSON { fields: [...] } with aliases/required
} from "infermap";

Providers (Node-only)

These live under the infermap/node subpath and require filesystem access or database drivers.

import {
  extractSchemaFromFile,  // reads .csv or .json from disk
  extractDbSchema,        // connects via URI: sqlite://, postgresql://, duckdb://
} from "infermap/node";

const schema = await extractDbSchema("postgresql://user:pass@host/db", {
  table: "customers",
});

DB drivers are optional peer dependencies. Install only what you need:

npm install better-sqlite3       # for sqlite://
npm install pg                    # for postgresql://
npm install @duckdb/node-api      # for duckdb://
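The scheme-to-driver pairing above can be written down as a small lookup. This is an illustrative sketch only: `driverPackageFor` is a hypothetical helper, and how the library actually resolves its drivers is not documented here.

```typescript
// Pairing taken from the install commands above.
const DRIVER_BY_SCHEME: Record<string, string> = {
  sqlite: "better-sqlite3",
  postgresql: "pg",
  duckdb: "@duckdb/node-api",
};

// Hypothetical helper: which optional peer dependency a connection URI needs.
function driverPackageFor(uri: string): string {
  const idx = uri.indexOf("://");
  if (idx <= 0) {
    throw new Error(`Not a connection URI: ${uri}`);
  }
  const scheme = uri.slice(0, idx).toLowerCase();
  const pkg = DRIVER_BY_SCHEME[scheme];
  if (pkg === undefined) {
    throw new Error(`Unsupported scheme: ${scheme}`);
  }
  return pkg;
}

console.log(driverPackageFor("postgresql://user:pass@host/db")); // pg
```

A check like this can run at startup to report a missing driver before any connection is attempted.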

Config

EngineConfig

interface EngineConfig {
  scorers?: {
    [name: string]: { enabled?: boolean; weight?: number };
  };
  aliases?: {
    [canonical: string]: string[];
  };
}
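As an illustration of the shape, a config might look like the following. The scorer keys are assumed to match the built-in class names and the alias values are invented for the example; check the package for the exact keys it expects.

```typescript
// Local copy of the documented interface, so the sketch runs on its own.
interface EngineConfig {
  scorers?: {
    [name: string]: { enabled?: boolean; weight?: number };
  };
  aliases?: {
    [canonical: string]: string[];
  };
}

const config: EngineConfig = {
  scorers: {
    FuzzyNameScorer: { weight: 0.5 },  // assumed key: lower this scorer's weight
    ProfileScorer: { enabled: false }, // assumed key: switch this scorer off
  },
  aliases: {
    // canonical name -> alternative spellings seen in source data
    email: ["email_addr", "e_mail"],
  },
};

// The same object serialized, for storing alongside the project.
const configJson = JSON.stringify(config, null, 2);
console.log(configJson);
```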

Pass it as the config option of map(source, target, { config }), or load it from JSON with loadEngineConfig(json).

fromConfig(json) / mapResultToConfigJson(result)

Persist a MapResult as JSON and reload it later. Compatible with the Python from_config loader when the JSON shape is used (YAML is not supported in the TS port).

import { readFile, writeFile } from "node:fs/promises";
import { mapResultToConfigJson, fromConfig } from "infermap";

await writeFile("mapping.json", mapResultToConfigJson(result));
// later:
const restored = fromConfig(await readFile("mapping.json", "utf8"));

Types

interface FieldInfo {
  name: string;
  dtype: "string" | "integer" | "float" | "boolean" | "date" | "datetime";
  sampleValues: string[];
  nullRate: number;       // 0..1
  uniqueRate: number;     // 0..1
  valueCount: number;
  metadata: Record<string, unknown>;
}

interface SchemaInfo {
  fields: FieldInfo[];
  sourceName: string;
  requiredFields: string[];
}

interface ScorerResult {
  score: number;          // clamped to [0, 1]
  reasoning: string;
}

interface FieldMapping {
  source: string;
  target: string;
  confidence: number;
  breakdown: Record<string, ScorerResult>;
  reasoning: string;
}

interface MapResult {
  mappings: FieldMapping[];
  unmappedSource: string[];
  unmappedTarget: string[];
  warnings: string[];
  metadata: Record<string, unknown>;
}
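A typical use of a MapResult is renaming source records into the target schema. The sketch below works on trimmed local copies of the types above; `applyMappings` is a hypothetical helper, not part of the package.

```typescript
// Trimmed local copies of the documented types: only the fields this sketch reads.
interface FieldMappingLike {
  source: string;
  target: string;
  confidence: number;
}

interface MapResultLike {
  mappings: FieldMappingLike[];
  unmappedSource: string[];
}

// Hypothetical helper: copy each mapped source column to its target name.
// Columns listed in unmappedSource have no mapping and are dropped.
function applyMappings(
  records: Array<Record<string, unknown>>,
  result: MapResultLike
): Array<Record<string, unknown>> {
  return records.map((record) => {
    const out: Record<string, unknown> = {};
    for (const { source, target } of result.mappings) {
      if (source in record) out[target] = record[source];
    }
    return out;
  });
}

// Hand-written result for illustration; a real one would come from map().
const exampleResult: MapResultLike = {
  mappings: [
    { source: "fname", target: "first_name", confidence: 0.8 },
    { source: "email_addr", target: "email", confidence: 0.95 },
  ],
  unmappedSource: ["legacy_id"],
};

const renamed = applyMappings(
  [{ fname: "A", email_addr: "a@example.com", legacy_id: 7 }],
  exampleResult
);
console.log(renamed);
```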

Next.js usage

Works in every Next.js context: Server Components, Server Actions, Route Handlers, middleware, and the Edge Runtime. The default entrypoint uses no Node built-ins.

// app/api/infer/route.ts
import { map } from "infermap";

export const runtime = "edge";

export async function POST(req: Request) {
  const { sourceCsv, targetCsv } = await req.json();
  const result = map({ csvText: sourceCsv }, { csvText: targetCsv });
  return Response.json(result);
}

For the Node runtime with filesystem or database access, remove the runtime = "edge" export and import from infermap/node.
