Releases: VynFi/VynFi-python
v1.4.0 — DataSynth 3.0 + VynFi API Adoption
Minor release adding support for DataSynth 3.0 features: scenario packs,
fingerprint synthesis, adversarial ONNX probing, AI-assisted config tuning,
and the dashboard co-pilot. All features verified end-to-end against the live API.
New Endpoints
Scenario Packs (client.scenarios.packs())
Eleven built-in counterfactual simulations across four categories:
| Category | Packs |
|---|---|
| Fraud | vendor_collusion_ring, management_override, ghost_employee, procurement_kickback, channel_stuffing |
| Control failures | sox_material_weakness, it_control_breakdown |
| Macro | recession_2008_replay, supply_chain_disruption_q3, interest_rate_shock_300bp |
| Operational | erp_migration_cutover |
```python
packs = client.scenarios.packs()
scenario = client.scenarios.create(
    name="Q3 revenue stress",
    generation_config={
        "sector": "retail", "rows": 10000,
        "scenarios": {"enabled": True, "packs": ["channel_stuffing"]},
    },
)
client.scenarios.run(scenario.id)
diff = client.scenarios.diff(scenario.id)
```

AI Tuning (client.jobs.tune(), Scale+)
```python
suggestion = client.jobs.tune(job_id, target_scores={"overall": 0.95})
print(suggestion.explanation)
# -> {original_config, suggested_config, explanation, quality_summary}
```

Dashboard Co-pilot (client.ai.chat(), Scale+)
```python
reply = client.ai.chat("Which fraud packs are right for audit training?")
```

Fingerprint Synthesis (client.fingerprint.synthesize(), Team+)
```python
# Privacy-preserving synthesis from a .dsf fingerprint
submission = client.fingerprint.synthesize(
    "./private_data.dsf",
    rows=10000,
    backend="statistical",  # or "neural"/"hybrid" (Scale+)
)
```

Adversarial Probing (client.adversarial.probe(), Enterprise)
```python
# Probe an ONNX fraud detector for decision-boundary weaknesses
probe = client.adversarial.probe("./model.onnx", n_probes=10000)
results = client.adversarial.results(probe.id)
```

Config-side DS 3.0 features (no SDK changes needed)
- Neural diffusion: `diffusion.backend = "neural" | "hybrid"` with a `neural.*` subsection (Scale+)
- Quality gates: `qualityGates.profile = "standard" | "strict" | "audit"` (Team+)
- Custom interventions: `scenarios.interventions[].target/value/timing` (Scale+)
Upstream DataSynth fixes now live
- OCPM fields populated on JE headers — `ocpm_event_ids`, `ocpm_object_ids`, `ocpm_case_id` now carry full process mining metadata (they were empty in 2.3.x). Verified on 209/300 entries in a sample retail job.
- Also confirmed: `is_fraud` on document flow records, `display_name` on banking customers, `numericMode: native`, and the analytics/labels/process_mining output dirs.
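A spot-check like the one above can be done locally once the archive is downloaded. This is a minimal sketch over plain dicts — the field names come from the release notes, but the sample records and the `ocpm_coverage` helper are illustrative, not SDK code:

```python
# Sketch: check how many JE headers carry the new OCPM metadata.
# Sample records are illustrative; only the field names come from the notes.
sample_headers = [
    {"je_id": "JE-001", "ocpm_event_ids": ["ev-1", "ev-2"], "ocpm_case_id": "case-9"},
    {"je_id": "JE-002", "ocpm_event_ids": [], "ocpm_case_id": None},
]

def ocpm_coverage(headers):
    """Fraction of JE headers with at least one OCPM event id."""
    populated = [h for h in headers if h.get("ocpm_event_ids")]
    return len(populated) / len(headers) if headers else 0.0

print(f"{ocpm_coverage(sample_headers):.0%} of headers carry OCPM metadata")
```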
Still upstream
- `exportLayout: flat` hangs the DataSynth binary — use the default nested layout until the upstream fix lands.
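In practice the workaround is simply to leave `exportLayout` unset. A minimal config sketch (keys as shown elsewhere in these notes; omitting the key keeps the default nested layout):

```python
# Workaround sketch: do NOT set exportLayout: "flat" -- omitting the key
# keeps the default nested layout, which does not trigger the hang.
safe_config = {
    "sector": "retail",
    "rows": 1000,
    "output": {
        "numericMode": "native",
        # "exportLayout": "flat",  # avoid until the upstream fix lands
    },
}
```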
Other changes
- Default client timeout bumped from 30s to 60s (the server-side `generate_quick` limit is 30s, so a 30s client default left no headroom for network latency).
- `Scenarios.create()` contract updated to the DS 3.0 shape (`{name, generation_config}`). Legacy `template_id`/`interventions` kwargs still work — they are auto-folded into the config.
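The legacy-kwarg folding can be pictured roughly as below. This is an illustrative sketch only, not the SDK's actual implementation — in particular, where `template_id` lands inside the config is an assumption here:

```python
# Illustrative sketch of folding legacy kwargs into the DS 3.0
# {name, generation_config} payload. NOT the SDK's real code.
def fold_legacy_kwargs(name, template_id=None, interventions=None,
                       generation_config=None):
    config = dict(generation_config or {})
    if template_id is not None:
        config.setdefault("template_id", template_id)  # placement assumed
    if interventions is not None:
        config.setdefault("scenarios", {})["interventions"] = interventions
    return {"name": name, "generation_config": config}

payload = fold_legacy_kwargs(
    "Q3 stress",
    template_id="tmpl_retail",
    interventions=[{"target": "revenue", "value": -0.2, "timing": "q3"}],
)
```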
Four new examples
Full Changelog
v1.3.0 — DataSynth 2.3 + VynFi API 2.0 Features
Major release adding support for DataSynth 2.3 + VynFi API 2.0 features.
All features verified end-to-end against the live API.
New Endpoints
```python
# Pre-built statistical analytics for a completed job
a = client.jobs.analytics(job_id)
print(f"Benford MAD: {a.benford_analysis.mad:.4f}")
print(f"AML coverage: {a.banking_evaluation.aml.typology_coverage:.2%}")
```
```python
# Rate-controlled NDJSON streaming for TB-scale jobs (Scale tier+)
for envelope in client.jobs.stream_ndjson(job_id, rate=500, progress_interval=1000):
    if envelope.get("type") == "_progress":
        print(f"  {envelope['lines_emitted']:,} lines emitted")
    else:
        my_pipeline.send(envelope)
```
```python
# Storage quota validation for TB-scale jobs
size = client.configs.estimate_size(config=my_config)
print(f"~{size.estimated_files} files, ~{size.estimated_bytes / 1e9:.1f} GB")
print(f"Tier quota: {size.tier_quota_bytes / 1e12:.1f} TB")
```
```python
# Raw DataSynth YAML config submission (Scale tier+)
result = client.configs.submit_raw(yaml="rows: 1000\nsector: retail")
```

Transparent Archive Backends
JobArchive now seamlessly handles both legacy zip archives and the new TB-scale managed_blob manifests with presigned URLs:
```python
archive = client.jobs.download_archive(job_id)
print(archive.backend)  # "zip" or "managed_blob"
entries = archive.json("journal_entries.json")  # lazy fetch via presigned URL if blob
```

DataSynth 2.3 Output Modes
```python
job = client.jobs.generate_config(config={
    "sector": "retail",
    "rows": 1000,
    "output": {
        "exportLayout": "flat",   # one row per line, header merged ✓ verified live
        "numericMode": "native",  # JSON numbers (upstream DataSynth bug pending)
    },
})
```

New Models
- Analytics (15 models): `JobAnalytics`, `BenfordAnalysis`, `AmountDistributionAnalysis`, `VariantAnalysis`, `BankingEvaluation`, `KycCompletenessAnalysis`, `AmlDetectabilityAnalysis`, `CrossLayerCoherenceAnalysis`, `VelocityQualityAnalysis`, `FalsePositiveAnalysis`, `TypologyDetection`
- Sizing: `EstimateSizeResponse`, `SizeBucket`
- Raw config: `RawConfigResponse`
Bug Fixes
- CamelCase deserialization for `JobFileList`, `JobFile`, `EstimateSizeResponse` — these were silently returning defaults when the API actually had data
- Download timeout extended from 30s to 5 min (was breaking on large archive downloads)
Process Mining Notebook Enhanced
05_process_mining_ocel.ipynb now covers:
- All 8 DataSynth processes (O2C, P2P, S2C, H2R, MFG, Banking, Audit, BankRecon)
- OCEL 2.0 readiness section
- Cross-process traceability via `cross_process_links.json`
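Once loaded, the links file lends itself to simple traceability queries. A sketch under stated assumptions — the record shape below (`source_process`, `target_process`, `doc_id`) is illustrative, not the actual `cross_process_links.json` schema:

```python
# Sketch: group cross-process links by source process to trace documents
# across O2C/P2P/etc. Record shape is assumed, not the real schema.
from collections import defaultdict

sample_links = [
    {"source_process": "O2C", "target_process": "Banking", "doc_id": "INV-100"},
    {"source_process": "P2P", "target_process": "Banking", "doc_id": "PO-200"},
    {"source_process": "O2C", "target_process": "Audit", "doc_id": "INV-101"},
]

links_by_source = defaultdict(list)
for link in sample_links:
    links_by_source[link["source_process"]].append(link["doc_id"])

print(dict(links_by_source))
```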
New Examples
- `analytics_export.py` — pre-built analytics workflow
- `ndjson_streaming.py` — rate-controlled streaming for TB-scale jobs
- `native_mode.py` — DataSynth 2.3 native + flat layout
New Output Categories (DataSynth 2.3)
| Category | Description |
|---|---|
| `analytics/` | Pre-built statistical evaluations (Benford, distributions, variants, banking) |
| `labels/` | Anomaly labels + fraud red flags (CSV/JSON/JSONL formats) |
| `process_mining/` | Full OCEL 2.0 event log + objects + relationships (19,974 events + 7,381 objects in a sample retail job) |
Verification
10 of 11 server-side fixes verified live. See docs/v1.3.0-verification-report.md for details.
Full Changelog
v1.2.0 — File Listing, Output Estimates, Per-File Download
What's New
Ships support for 3 API features deployed today by the API team.
File listing with schemas
List all files in a completed job's archive without downloading the full zip:
```python
file_list = client.jobs.list_files(job_id)
print(f"{file_list.total_files} files, {file_list.total_size_bytes / 1e6:.0f} MB")
for f in file_list.files:
    cols = ", ".join(s.name for s in f.schema_[:3])
    print(f"  {f.path} ({f.size_bytes:,} bytes) [{cols}, ...]")
```

Output size estimates
estimate_cost() now returns expected output dimensions before you run a job:
```python
est = client.configs.estimate_cost(config=my_config)
print(f"Credits: {est.total_credits}")
print(f"Output: ~{est.output.estimated_files} files, ~{est.output.estimated_size_bytes / 1e6:.0f} MB")
print(f"Note: {est.output.note}")
```

Per-file download (now working)
Download individual files from a job without pulling the full archive:
```python
data = client.jobs.download_file(job_id, "journal_entries.json")
# Also supports subdirectory paths:
data = client.jobs.download_file(job_id, "banking/banking_customers.json")
```

New types
- `JobFileList`, `JobFile`, `FileSchema` — file listing response models
- `OutputEstimate` — output size estimate on `EstimateCostResponse.output`
Full Changelog
v1.1.0 — JobArchive, Examples Suite, Endpoint Fixes
What's New
JobArchive — ergonomic archive access
Downloaded job archives are now wrapped in a JobArchive class for easy file access:
```python
archive = client.jobs.download_archive(job_id)
archive.files()       # list all 80+ files
archive.categories()  # ['banking', 'document_flows', 'esg', ...]
archive.json("journal_entries.json")  # parse JSON directly
archive.find("esg/*")   # glob-style search
archive.summary()       # file counts and sizes by category
archive.extract_to("./output")  # extract to disk
```

pandas: archive_to_dataframes()
Convert all JSON files in an archive to DataFrames in one call, with automatic header/lines flattening for journal entries:
```python
from vynfi.integrations.pandas import archive_to_dataframes

frames = archive_to_dataframes(archive)
# {'journal_entries.json': DataFrame(95881 rows), 'banking/banking_customers.json': DataFrame(620 rows), ...}
```

14 examples — notebooks + scripts
| Notebook | Use Case |
|---|---|
| `01_quickstart` | 5-minute getting started |
| `02_audit_data_deep_dive` | Benford's law, debit/credit validation, SOX controls |
| `03_fraud_detection_lab` | Labeled fraud data, RF classifier (98.3% accuracy) |
| `04_document_flow_audit_trail` | P2P/O2C chains, three-way matching, gap analysis |
| `05_process_mining_ocel` | Event log reconstruction, variant analysis |
| `06_esg_sustainability_reporting` | Emissions, energy, diversity, materiality matrix |
| `07_aml_compliance_testing` | KYC, transaction monitoring, risk scoring, SAR |
Plus 7 standalone scripts: quickstart, streaming, pandas workflow, config management, multi-period sessions, what-if scenarios, quality monitoring.
Bug fixes
- Config endpoints used the wrong URL path (`/v1/configs/...` → `/v1/config/...` for validate, estimate-cost, compose)
- Sessions `generate_next()` used the wrong URL path (`/generate-next` → `/generate`)
All fixes confirmed against the Rust SDK reference and live API.
Full Changelog
v1.0.0
VynFi Python SDK v1.0.0
First stable release. Full API parity with the VynFi Rust SDK reference implementation.
New Resources
- Configs — save, validate, estimate cost, and compose generation configs
- Credits — purchase prepaid packs, check balance, view history
- Sessions — multi-period generation sessions
- Scenarios — what-if scenarios with causal graph templates
- Notifications — list and mark-read
New Methods on Existing Resources
- `jobs.generate_config()`, `jobs.download_file()`, `jobs.wait()`
- `catalog.list_templates()`
- `billing.checkout()`, `billing.portal()`
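Conceptually, `jobs.wait()` polls until the job reaches a terminal state. The sketch below illustrates that pattern with a fake status source — it is not the SDK's implementation, and the status names are assumptions:

```python
import time

# Illustrative polling sketch of what a wait() helper does conceptually:
# poll a status source until it reports a terminal state.
def wait_for(get_status, poll_interval=0.0, terminal=("completed", "failed")):
    while True:
        status = get_status()
        if status in terminal:
            return status
        time.sleep(poll_interval)

statuses = iter(["queued", "running", "completed"])
final = wait_for(lambda: next(statuses))
# final == "completed"
```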
Ecosystem Integrations
- `pip install vynfi[pandas]` — download job output as DataFrames
- `pip install vynfi[polars]` — Polars DataFrame support
Other
- 12 resources, 52 tests, all verified against the live API
- `ForbiddenError` (403), dedicated `QuickJobResponse`/`CancelJobResponse` types
- All Pydantic models aligned with actual API response shapes
See CHANGELOG.md for full details.
v0.1.0
Initial release
First public release of the VynFi Python SDK.
Features
- Full coverage of VynFi API resources: Jobs, Catalog, Usage, API Keys, Quality, Webhooks, Billing
- Automatic retry on 429/5xx with exponential backoff
- SSE streaming for job progress
- Typed responses via Pydantic v2
- Python 3.9–3.13 support
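The retry behavior listed above can be pictured as exponential backoff with jitter on retryable status codes. A minimal illustration, not the SDK's internals (the exact base, cap, and jitter are assumptions):

```python
import random

# Sketch of retry-on-429/5xx with exponential backoff; illustrative only.
RETRYABLE = {429, 500, 502, 503, 504}

def backoff_delays(max_retries=3, base=0.5, cap=8.0):
    """Yield one delay per retry: base * 2**attempt, capped, plus up to 10% jitter."""
    for attempt in range(max_retries):
        delay = min(cap, base * (2 ** attempt))
        yield delay + random.uniform(0, delay * 0.1)

for status, delay in zip([500, 429, 503], backoff_delays()):
    print(f"retrying after HTTP {status} in {delay:.2f}s")
```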
Installation
```shell
pip install vynfi
```

Quick start

```python
from vynfi import VynFi

client = VynFi(api_key="vf_live_...")
job = client.jobs.generate(sector="banking", tables={"transactions": 1000})
```