Sahil Saxena SahilSaxena007

Sahil Saxena

Computer Science · University of Manchester
Incoming Software Engineering Analyst · Deutsche Bank (London)

📫 How to reach me: [email protected]
💼 Current focus: Safety primitives for AI autonomy & transactional agent execution

🚀 About Me

I’m a systems-focused engineer building infrastructure that bridges large-model reasoning with safe real-world action. My work spans applied machine learning pipelines, real-time multimodal interaction systems, and foundational execution layer design for agentic AI.

I thrive on problems at the intersection of AI reasoning, systems architecture, and trustworthy automation — building not just models, but the runtime primitives that make autonomous behavior safe to deploy.

🔭 Current Projects

🔹 Backtrack — Transactional Execution Layer for Agentic AI

Backtrack introduces a novel execution model for AI actions:
plan → preview → approve → execute → undo.

It compiles natural-language intent into structured, reversible action plans using the Gemini 3 API, presents a clear diff for human approval, and executes with transactional guarantees plus deterministic undo. Designed as a foundational runtime layer for autonomous systems, Backtrack enables safe, auditable, and reversible autonomy across real systems.
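As a minimal sketch of the plan → preview → approve → execute → undo lifecycle, the stages can be modelled as a small state machine that refuses to skip a step (for example, executing before approval). The names `Stage`, `NEXT`, and `PlanLifecycle` are hypothetical and used only for illustration; this is not Backtrack's actual API.

```typescript
// Hypothetical names throughout; an illustration, not Backtrack's real code.
type Stage = "planned" | "previewed" | "approved" | "executed" | "undone";

// The single legal successor of each stage, mirroring
// plan -> preview -> approve -> execute -> undo.
const NEXT: Record<Stage, Stage | null> = {
  planned: "previewed",
  previewed: "approved",
  approved: "executed",
  executed: "undone",
  undone: null,
};

class PlanLifecycle {
  private stage: Stage = "planned";

  get current(): Stage {
    return this.stage;
  }

  // Move to `target` only if it is the legal next stage; otherwise throw.
  advance(target: Stage): void {
    if (NEXT[this.stage] !== target) {
      throw new Error(`illegal transition: ${this.stage} -> ${target}`);
    }
    this.stage = target;
  }
}
```

With this shape, calling `advance("executed")` on a plan that has only been previewed throws, so human approval cannot be bypassed by accident.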

Key technical highlights:

• Structured plan compilation with Gemini 3 (multi-step reasoning, constrained outputs)
• Append-only execution ledger + reversible action graph
• Hybrid inverse-ops + selective checkpointing for guaranteed undo
• TypeScript/Node backend with Electron desktop UI
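The ledger and inverse-op ideas in the highlights above can be sketched as follows. This is an illustration under stated assumptions, not Backtrack's real data model: `Action`, `LedgerEntry`, `setKey`, and `Executor` are hypothetical names, the "system" is just an in-memory `Map`, and selective checkpointing is left out entirely.

```typescript
// Hypothetical types; a sketch of inverse-op undo with an append-only ledger.
interface Action {
  description: string;
  apply(state: Map<string, string>): void;
  invert(state: Map<string, string>): void; // restores the pre-apply state
}

interface LedgerEntry {
  readonly seq: number;
  readonly kind: "apply" | "undo";
  readonly description: string;
}

// A reversible "set key" action: it remembers the prior value when applied.
function setKey(key: string, value: string): Action {
  let had = false;
  let previous: string | undefined;
  return {
    description: `set ${key}=${value}`,
    apply(state) {
      had = state.has(key);
      previous = state.get(key);
      state.set(key, value);
    },
    invert(state) {
      if (had) state.set(key, previous as string);
      else state.delete(key);
    },
  };
}

class Executor {
  private readonly state: Map<string, string>;
  private readonly entries: LedgerEntry[] = [];
  private readonly applied: Action[] = [];

  constructor(state: Map<string, string>) {
    this.state = state;
  }

  // Read-only view: entries are only ever appended, never edited or removed.
  get ledger(): readonly LedgerEntry[] {
    return this.entries;
  }

  execute(action: Action): void {
    action.apply(this.state);
    this.applied.push(action);
    this.record("apply", action.description);
  }

  // Undo the most recent action; the undo is logged as a new entry.
  undoLast(): boolean {
    const action = this.applied.pop();
    if (!action) return false;
    action.invert(this.state);
    this.record("undo", action.description);
    return true;
  }

  private record(kind: "apply" | "undo", description: string): void {
    this.entries.push({ seq: this.entries.length + 1, kind, description });
  }
}
```

Because an undo appends an entry instead of deleting one, the ledger keeps a full audit trail of both what was done and what was rolled back.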

Backtrack targets the trust barrier in agentic AI — a critical unmet need if systems are to act safely and autonomously in production environments.

🎓 Dissertation Project — HITL Alzheimer’s Classifier

• Designed a Human-in-the-Loop ensemble classifier (CatBoost, Random Forest, Neural Network) for early Alzheimer’s staging using clinical biomarker data, improving accuracy from 73.7% to 82.8% over an AI-only baseline.
• Integrated SHAP-based explainability into an interactive clinician review dashboard, enabling transparent, auditable decision support for uncertain model predictions.
• Reduced misclassification cost by up to 58% through structured human oversight integration, validated with McNemar’s test (p < 0.001), demonstrating rigorous statistical evaluation of model behaviour under real-world uncertainty.
• Built a comprehensive evaluation framework spanning probability calibration, cross-validation, and controlled AI-only vs. HITL comparative experiments to quantify the measurable value of human oversight over automated decision-making.
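For readers unfamiliar with McNemar’s test, here is a minimal sketch of its exact (binomial) form for comparing two classifiers on the same cases. It uses only the discordant pairs: `b` counts cases the AI-only model got right and the HITL system got wrong, and `c` the reverse. The function name `mcnemarExact` is hypothetical and the code is not taken from the dissertation; the summary above does not say which variant of the test was used.

```typescript
// Exact two-sided McNemar test on discordant pair counts.
// b: baseline correct, HITL wrong.  c: baseline wrong, HITL correct.
// Under the null hypothesis, each discordant pair is a fair coin flip,
// so min(b, c) follows Binomial(b + c, 0.5).
function mcnemarExact(b: number, c: number): number {
  const n = b + c;
  if (n === 0) return 1; // no discordant pairs: no evidence of a difference

  const k = Math.min(b, c);
  let pmf = Math.pow(0.5, n); // P(X = 0); underflows for very large n
  let tail = 0;
  for (let i = 0; i <= k; i++) {
    tail += pmf;
    pmf = (pmf * (n - i)) / (i + 1); // P(X = i + 1) from P(X = i)
  }
  return Math.min(1, 2 * tail); // two-sided p-value, capped at 1
}
```

Cases where both systems agree do not enter the calculation, which is why the test suits paired AI-only vs. HITL comparisons on one evaluation set.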

🛠️ Languages & Tools

📈 Projects & Impact

• Designed and deployed production ML pipelines for time-series forecasting and root-cause NLP analysis at Deutsche Bank.
• Built EmpathAI, a real-time multimodal system combining vision, audio, and generative reasoning for empathy inference.
• Contributed to internal tooling and automation frameworks in startup environments.

📫 Let’s Connect

I’m open to collaborations on systems-level AI infrastructure, agentic safety, and scalable reasoning platforms. Let’s build the primitives of tomorrow.
