Computer Science · University of Manchester
Incoming Software Engineering Analyst · Deutsche Bank (London)
- 📫 How to reach me: [email protected]
- 💼 Current focus: Safety primitives for AI autonomy & transactional agent execution
I’m a systems-focused engineer building infrastructure that bridges large-model reasoning with safe real-world action. My work spans applied machine learning pipelines, real-time multimodal interaction systems, and foundational execution-layer design for agentic AI.
I thrive on problems at the intersection of AI reasoning, systems architecture, and trustworthy automation: building not just models, but the runtime primitives that make autonomous behavior safe to deploy.
Backtrack introduces a novel execution model for AI actions:
plan → preview → approve → execute → undo.
It compiles natural-language intent into structured, reversible action plans using the Gemini 3 API, presents a clear diff for human approval, and executes with transactional guarantees plus deterministic undo. Designed as a foundational runtime layer for autonomous systems, Backtrack enables safe, auditable, and reversible autonomy across real systems.
Key technical highlights:
- Structured plan compilation with Gemini 3 (multi-step reasoning, constrained outputs)
- Append-only execution ledger + reversible action graph
- Hybrid inverse-ops + selective checkpointing for guaranteed undo
- TypeScript/Node backend with Electron desktop UI
Backtrack targets the trust barrier in agentic AI: systems cannot be allowed to act autonomously in production environments until their actions can be previewed, approved, and reversed.
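To make the ledger and inverse-op ideas above concrete, here is a minimal sketch of how an append-only ledger and inverse operations could combine into all-or-nothing execution with undo. It is an illustration under stated assumptions, not Backtrack's actual code: the names `Action`, `LedgerEntry`, `setKey`, and `Runtime` are hypothetical, the "system" is just an in-memory `Map`, and checkpointing is left out.

```typescript
// Hypothetical sketch: none of these names are taken from Backtrack itself.
type State = Map<string, string>;

interface Action {
  description: string;
  // Applies the change and returns the inverse operation that reverts it.
  apply(state: State): () => void;
}

interface LedgerEntry {
  seq: number;
  kind: "apply" | "undo";
  description: string;
}

interface AppliedStep {
  description: string;
  inverse: () => void;
}

// Sets a key, capturing the previous value so the change can be reverted.
function setKey(key: string, value: string): Action {
  return {
    description: `set ${key}=${value}`,
    apply(state: State): () => void {
      const had = state.has(key);
      const prev = state.get(key);
      state.set(key, value);
      return () => {
        if (had) state.set(key, prev as string);
        else state.delete(key);
      };
    },
  };
}

class Runtime {
  readonly state: State;
  // Append-only: entries are pushed, never edited or removed.
  readonly ledger: LedgerEntry[] = [];
  private undoStack: AppliedStep[] = [];

  constructor(state: State) {
    this.state = state;
  }

  private record(kind: "apply" | "undo", description: string): void {
    this.ledger.push({ seq: this.ledger.length, kind, description });
  }

  // Runs every action in the plan, or rolls back and rethrows if one fails.
  execute(plan: Action[]): void {
    const done: AppliedStep[] = [];
    try {
      for (const action of plan) {
        const inverse = action.apply(this.state);
        done.push({ description: action.description, inverse });
        this.record("apply", action.description);
      }
    } catch (err) {
      // Revert the steps that did succeed, newest first.
      for (const step of done.reverse()) {
        step.inverse();
        this.record("undo", step.description);
      }
      throw err;
    }
    this.undoStack.push(...done);
  }

  // Reverts everything executed so far, newest first.
  undoAll(): void {
    while (this.undoStack.length > 0) {
      const step = this.undoStack.pop()!;
      step.inverse();
      this.record("undo", step.description);
    }
  }
}

// Usage: apply a two-step plan, then revert it.
const demo = new Runtime(new Map<string, string>([["mode", "draft"]]));
demo.execute([setKey("mode", "live"), setKey("owner", "agent")]);
demo.undoAll();
console.log(demo.state, demo.ledger);
```

Recording undos as new ledger entries, rather than deleting the original ones, keeps the history auditable. Operations whose inverse cannot be computed cheaply are where a real runtime would presumably fall back to checkpoints.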
- Designed a Human-in-the-Loop ensemble classifier (CatBoost, Random Forest, Neural Network) for early Alzheimer’s staging using clinical biomarker data, improving accuracy from 73.7% to 82.8% over an AI-only baseline.
- Integrated SHAP-based explainability into an interactive clinician review dashboard, enabling transparent, auditable decision support for uncertain model predictions.
- Reduced misclassification cost by up to 58% through structured human oversight integration, validated with McNemar’s test (p < 0.001), demonstrating rigorous statistical evaluation of model behaviour under real-world uncertainty.
- Built a comprehensive evaluation framework spanning probability calibration, cross-validation, and controlled AI-only vs. HITL comparative experiments to quantify the measurable value of human oversight over automated decision-making.
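For readers unfamiliar with McNemar’s test: it compares two classifiers evaluated on the same cases and uses only the cases where they disagree. In the standard form below, $b$ and $c$ are assumed labels for the number of cases that only the AI-only pipeline, or only the HITL pipeline, classified correctly. The underlying counts are not reported here.

$$
\chi^2 = \frac{(b - c)^2}{b + c}
$$

Under the null hypothesis that both pipelines have the same error rate, the statistic approximately follows a chi-squared distribution with one degree of freedom.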
- Designed and deployed production ML pipelines for time-series forecasting and root-cause NLP analysis at Deutsche Bank.
- Built EmpathAI, a real-time multimodal system combining vision, audio, and generative reasoning for empathy inference.
- Contributed to internal tooling and automation frameworks in startup environments.
I’m open to collaborations on systems-level AI infrastructure, agentic safety, and scalable reasoning platforms. Let’s build the primitives of tomorrow.

