seanbrar/seanbrar

Sean Brar

ML systems engineer building evaluation and verification infrastructure for AI systems. Google DeepMind GSoC 2025 alumnus.

My work focuses on one question: when an AI system produces output, how do you know it's correct? I build the harnesses, validation pipelines, and correctness guarantees that answer it, from statistical evaluation of retrieval systems, to schema-constrained LLM output validation, to infrastructure-level correctness in LLM orchestration.

Currently pursuing post-baccalaureate CS and Mathematics, preparing for graduate research in ML evaluation and verification.

Selected Projects

Pollux — Async multimodal LLM orchestration library with deterministic content-hash caching, single-flight deduplication, and retry-policy separation for generation vs. side-effect calls. 90% API cost reduction on fan-out workloads. GSoC 2025 with Google DeepMind. Published on PyPI.
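To illustrate what content-hash caching combined with single-flight deduplication can look like, here is a minimal asyncio sketch. It is not Pollux's actual API: `SingleFlightCache`, `key_for`, and `fake_fetch` are hypothetical names invented for this example, and the in-memory dictionaries stand in for whatever storage a real library would use.

```python
import asyncio
import hashlib


class SingleFlightCache:
    """Hypothetical sketch: cache keyed by content hash, with single-flight dedup."""

    def __init__(self, fetch):
        self._fetch = fetch      # assumed async callable: str -> result
        self._results = {}       # content hash -> completed result
        self._in_flight = {}     # content hash -> task for a call still running

    @staticmethod
    def key_for(content: str) -> str:
        # Deterministic key: identical content always maps to the same hash.
        return hashlib.sha256(content.encode("utf-8")).hexdigest()

    async def get(self, content: str):
        key = self.key_for(content)
        if key in self._results:
            return self._results[key]
        task = self._in_flight.get(key)
        if task is None:
            # First caller for this key starts the real call; later callers share it.
            task = asyncio.ensure_future(self._fetch(content))
            self._in_flight[key] = task
        try:
            result = await task
        finally:
            self._in_flight.pop(key, None)
        self._results[key] = result
        return result


async def demo():
    calls = []

    async def fake_fetch(content):
        # Stand-in for an expensive API call.
        calls.append(content)
        await asyncio.sleep(0.01)
        return content.upper()

    cache = SingleFlightCache(fake_fetch)
    # Five concurrent requests with identical content (a fan-out workload).
    results = await asyncio.gather(*(cache.get("same prompt") for _ in range(5)))
    return results, len(calls)


results, call_count = asyncio.run(demo())
```

In this sketch the five concurrent requests trigger a single underlying call, which is the mechanism by which fan-out workloads can avoid repeated API charges.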

ContextRAG — RAG evaluation harness computing 7 retrieval metrics with TOST equivalence testing, bootstrap CIs, and Holm-Bonferroni correction. Validated a preregistered null hypothesis across 60+ experiment runs and 3 datasets.
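For readers unfamiliar with the Holm-Bonferroni correction mentioned above, the following is a minimal standard-library sketch of the step-down procedure. It is not ContextRAG's code: the function name `holm_bonferroni` and the example p-values are made up for illustration.

```python
def holm_bonferroni(p_values, alpha=0.05):
    """Return a reject/keep decision per hypothesis using Holm's step-down rule."""
    m = len(p_values)
    # Visit hypotheses from the smallest p-value to the largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    reject = [False] * m
    for rank, idx in enumerate(order):
        # The k-th smallest p-value (rank = k - 1) is compared to alpha / (m - rank).
        if p_values[idx] <= alpha / (m - rank):
            reject[idx] = True
        else:
            # Once one test fails, all remaining (larger) p-values are retained.
            break
    return reject


# Illustrative p-values only, not results from any real experiment.
decisions = holm_bonferroni([0.01, 0.04, 0.03, 0.005], alpha=0.05)
```

Compared with a plain Bonferroni threshold of `alpha / m` for every test, the step-down thresholds loosen as hypotheses are rejected, so the procedure controls the family-wise error rate while retaining more power.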

gh-templates — Schema-constrained LLM extraction pipeline across 3,746 repositories. Pydantic contracts validating structured Gemini output with transient/permanent error taxonomy at 99.97% success rate.

paperweight — arXiv paper discovery and triage CLI with golden-set validation, offline integration testing, and Tenacity retry architecture. Published on PyPI.

Connect

seanbrar.com · LinkedIn · [email protected]
