
Ruby Jha

Engineering Manager · Applied AI · Cloud

The next decade belongs to engineering leaders who build with AI.

I spent the last two decades leading engineering teams at State Street, Centene, and EY, building products that handle real money, real patient data, and real regulatory scrutiny. The kind of work where downtime means someone's claim doesn't get processed and a bad deployment means financial loss.

Now I am applying that same engineering discipline to AI. I am building 9 AI systems covering RAG pipelines, embedding fine-tuning, and multi-agent orchestration. Each one has evaluation frameworks, architecture decision records, and metrics I would trust in a code review. The same standards I would hold any production system to.

The Full Stack

Leadership

People Management · Hiring & Team Building · Performance & Promotions · Executive Communication · Technical Strategy

Technical

Python · Java · TypeScript · OpenAI API · LangChain · CrewAI · FastAPI · ChromaDB · Azure · Docker · Kubernetes · React · Spring Boot · Astro

Featured Projects

What I'm building

Project 01
Demo: Streamlit · Completed

Synthetic Data Generation Pipeline

I built a pipeline that generates synthetic training data, validates it with an LLM judge, and self-corrects until every record passes. Started with a 20% failure rate, ended at zero.

Python · Pydantic · OpenAI API · GPT-4o-mini · GPT-4o · +1
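
The generate, judge, and self-correct loop described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the project's actual code: `generate_until_valid`, `Verdict`, `fake_generate`, and `fake_judge` are hypothetical names, and the two `fake_*` functions stand in for real LLM calls so the example runs offline.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Verdict:
    """Result of judging one record (hypothetical shape)."""
    passed: bool
    feedback: str = ""


def generate_until_valid(
    generate: Callable[[Optional[str]], dict],
    judge: Callable[[dict], Verdict],
    max_attempts: int = 3,
) -> tuple[dict, int]:
    """Regenerate a record, passing the judge's feedback back in, until it passes."""
    feedback = None
    for attempt in range(1, max_attempts + 1):
        record = generate(feedback)
        verdict = judge(record)
        if verdict.passed:
            return record, attempt
        # Carry the failure reason into the next generation attempt.
        feedback = verdict.feedback
    raise RuntimeError(f"record still failing after {max_attempts} attempts")


# Stand-ins for the LLM generator and LLM judge (assumptions, no API calls).
def fake_generate(feedback: Optional[str]) -> dict:
    # The first draft omits a field; the retry fills it in once feedback exists.
    return {"question": "q", "answer": "a"} if feedback else {"question": "q"}


def fake_judge(record: dict) -> Verdict:
    if "answer" in record:
        return Verdict(True)
    return Verdict(False, "missing field: answer")


record, attempts = generate_until_valid(fake_generate, fake_judge)
```

Capping the number of attempts keeps a record that can never pass from looping forever; a real pipeline would likely log such records rather than raise.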

Project 02
Demo: Streamlit · Completed

RAG Evaluation Pipeline

I tested 16 RAG configurations and found that semantic chunking + OpenAI embeddings + Cohere reranking gets 0.747 Recall@5 on structured Markdown docs. This is how I got there.

Python · LangChain · RAGAS · Sentence-Transformers · Braintrust · +4
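
For readers unfamiliar with the metric, Recall@5 is the fraction of a query's relevant documents that appear in the top five retrieved results, averaged over queries. A minimal sketch, assuming document IDs are strings and each query has a known relevant set; the function names and the toy data are illustrative only and are not taken from the project:

```python
def recall_at_k(retrieved: list[str], relevant: set[str], k: int = 5) -> float:
    """Fraction of the relevant documents found in the top-k retrieved IDs."""
    if not relevant:
        raise ValueError("relevant set must not be empty")
    hits = len(set(retrieved[:k]) & relevant)
    return hits / len(relevant)


def mean_recall_at_k(runs: list[tuple[list[str], set[str]]], k: int = 5) -> float:
    """Average Recall@k over (retrieved, relevant) pairs, one pair per query."""
    return sum(recall_at_k(ret, rel, k) for ret, rel in runs) / len(runs)


# Toy data: two queries, each with two relevant documents.
runs = [
    (["d1", "d7", "d3", "d9", "d2"], {"d1", "d2"}),   # both found
    (["d4", "d5", "d6", "d8", "d0"], {"d4", "d11"}),  # one of two found
]
score = mean_recall_at_k(runs, k=5)
```

With this toy data the first query scores 1.0 and the second 0.5, so the mean is 0.75.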

Project 03
Completed

Contrastive Embedding Fine-Tuning

I fine-tuned all-MiniLM-L6-v2 on 1,475 dating profile pairs and flipped Spearman from -0.22 to +0.85. LoRA got 96.9% of that using 0.32% of the parameters.

Python · Sentence-Transformers · PEFT/LoRA · PyTorch · UMAP · +2

Project 05
Completed

ShopTalk Knowledge Management Agent

I built a RAG system from scratch with no LangChain, tested 46 configurations across 5 chunking strategies, 4 embedding models, and 3 retrieval methods, and found that heading-aware chunking + OpenAI embeddings hits NDCG@5 = 0.896 and Recall@5 = 1.0.

Python · PyMuPDF · FAISS · SentenceTransformers · OpenAI · +6
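
NDCG@5 rewards rankings that place relevant results near the top by discounting each result's gain by the log of its rank, then normalizing by the best possible ordering. A minimal sketch using the standard `gain / log2(rank + 1)` discount; it assumes every relevant document already appears somewhere in the ranked list, so the ideal ordering is the same gains sorted descending. The function names are illustrative, not the project's:

```python
import math


def dcg_at_k(gains: list[float], k: int) -> float:
    """Discounted cumulative gain over the first k ranked results."""
    return sum(g / math.log2(rank + 1) for rank, g in enumerate(gains[:k], start=1))


def ndcg_at_k(gains: list[float], k: int = 5) -> float:
    """DCG@k divided by the DCG@k of the ideal (descending) ordering."""
    ideal = dcg_at_k(sorted(gains, reverse=True), k)
    return dcg_at_k(gains, k) / ideal if ideal > 0 else 0.0


# Relevance of each ranked result, in retrieval order (1 = relevant).
perfect = ndcg_at_k([1, 1, 0, 0, 0])    # relevant docs already on top
shuffled = ndcg_at_k([1, 0, 1, 0, 0])   # second relevant doc pushed to rank 3
```

A ranking with all relevant results on top scores 1.0; moving a relevant result lower reduces the score even when Recall@5 stays the same, which is why the two metrics are usually reported together.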

Latest Blog Posts

leadership May 1, 2026

When Standups Feel Like Interrogations

How to diagnose whether tight oversight is a trust problem or a legitimate need, and how to hand back autonomy without losing accountability.

6 min read

leadership Apr 24, 2026

Your Team Is Doing the Work. Someone Else Is Taking the Credit.

How to fix invisible attribution in distributed teams, and why credit theft is the most corrosive trust pattern a manager can inherit.

6 min read

leadership Apr 17, 2026

The Reorg Nobody Told Your Team About

How to rebuild trust when your team learned about their own reorg from an org chart update, not a conversation.

5 min read

More about my background