Evidpath helps teams validate a recommender before launch by running interaction tests, saving clear evidence, and comparing two versions side by side. With it you can:
- check that your recommender endpoint is wired correctly
- run a repeatable audit against a real target URL
- open a report that shows who struggled and why
- compare a baseline and a candidate before launch
Install the package:

```
python -m pip install evidpath
```

Check your endpoint:

```
evidpath check-target --domain recommender --target-url http://127.0.0.1:8051
```

Run one audit:

```
evidpath audit --domain recommender --target-url http://127.0.0.1:8051 --scenario returning-user-home-feed --seed 7
```

That run writes an output folder with files such as report.md, results.json, and traces.jsonl.
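The output files can be inspected directly or post-processed in a few lines. The sketch below is a minimal, hypothetical example: the `results.json` schema it assumes (a top-level `"checks"` list with a `"status"` per entry) is an illustration, not the documented Evidpath format.

```python
import json
import tempfile
from pathlib import Path

def summarize_run(run_dir):
    """Count failed checks in a run folder's results.json.

    The schema used here (a "checks" list with a "status" field per
    entry) is an assumption for illustration only.
    """
    results = json.loads((Path(run_dir) / "results.json").read_text())
    checks = results.get("checks", [])
    failed = [c for c in checks if c.get("status") != "pass"]
    return {"total": len(checks), "failed": len(failed)}

# Demo against a synthetic run folder so the sketch is self-contained.
run_dir = Path(tempfile.mkdtemp())
(run_dir / "results.json").write_text(
    json.dumps({"checks": [{"status": "pass"}, {"status": "fail"}]})
)
summary = summarize_run(run_dir)
print(summary)  # {'total': 2, 'failed': 1}
```

Replace the synthetic folder with a real audit output folder once you have run `evidpath audit`, adjusting the field names to whatever the actual report contains.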
Product links:
- product guide: products/evidpath/README.md
- PyPI: https://pypi.org/project/evidpath/
- TestPyPI: https://test.pypi.org/project/evidpath/
- releases: https://github.com/AlankritVerma01/limitation/releases
- demo guide: products/evidpath/DEMO.md
- external target contract: products/evidpath/EXTERNAL_TARGET_CONTRACT.md
- contributing: CONTRIBUTING.md
This repository contains two closely related things:
- the product package under products/evidpath
- the public proof and study under studies/01-recommender-offline-eval
If you are here to use the product, start with the product guide. If you are here to understand the original proof behind the direction, read the study.
The study package shows the original argument behind Evidpath: offline ranking metrics can miss important user-level tradeoffs.
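A tiny synthetic example (made-up NDCG numbers, not taken from the study) illustrates the argument: two models can tie on an aggregate metric while one of them regresses badly for a whole segment of users.

```python
# Synthetic illustration with made-up numbers: per-segment NDCG for a
# baseline model and a candidate model.
baseline = {"power_users": 0.80, "new_users": 0.60}
candidate = {"power_users": 0.95, "new_users": 0.45}

def avg(scores):
    # Unweighted mean over segments, rounded for a clean comparison.
    return round(sum(scores.values()) / len(scores), 2)

print(avg(baseline), avg(candidate))  # 0.7 0.7 -- a tie in aggregate

# Per-segment deltas expose the tradeoff the average hides:
# power users gain, new users lose.
deltas = {seg: round(candidate[seg] - baseline[seg], 2) for seg in baseline}
print(deltas)  # {'power_users': 0.15, 'new_users': -0.15}
```

An offline leaderboard that reports only the aggregate would call these models equivalent; a per-user or per-segment breakdown, which is what Evidpath's audits aim to surface, would not.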
Useful links:
- study README: studies/01-recommender-offline-eval/README.md
- canonical report: studies/01-recommender-offline-eval/artifacts/canonical/official_demo_report.md
- canonical JSON: studies/01-recommender-offline-eval/artifacts/canonical/official_demo_results.json
- product docs: products/evidpath/README.md
- PyPI README: products/evidpath/README_PYPI.md
- plans: plans/evidpath-v0/README.md
- code of conduct: CODE_OF_CONDUCT.md
- security: SECURITY.md
- support: SUPPORT.md
The earlier public write-up that motivated this direction is here:
https://dev.to/alankritverma/why-offline-evaluation-is-not-enough-for-recommendation-systems-15ii