
Rehan Malik

Senior AI/ML Engineer · Cloud Solution Architect (AWS) · Open to Opportunities

5+ years building production AI systems at enterprise scale: GenAI, LLMs, RAG, RLHF, Computer Vision, Voice AI, Cloud Architecture.

LinkedIn Kaggle Email


About Me

I’m a Senior AI/ML Engineer with 5+ years of hands-on experience shipping production AI systems across healthcare, finance, retail, media, and enterprise operations. I’ve worked with companies ranging from 10-person startups to 10,000+ employee enterprises like MARS, solving problems that move business metrics.

What I do best: Take AI from research to production. I’ve fine-tuned LLMs (LLaMA, Mistral) with LoRA/QLoRA, built RLHF pipelines with PPO, architected RAG systems over 2TB+ corpora, deployed real-time voice infrastructure handling 500+ concurrent calls, and shipped fraud detection models processing applications in real time, all on AWS/GCP at scale.

What I’m looking for: Senior AI/ML Engineer, Staff ML Engineer, or Lead AI Engineer roles where I can build and ship production AI systems.

B.S. Computer Science (COMSATS University Islamabad, 2016–2020)


What I’ve Built

The work I’m most proud of: production systems processing real data, serving real users, driving real business impact.

Voice AI Infrastructure

Real-time concurrent voice processing with zero-latency ingestion engines

  • Built voice-to-data pipelines handling 500+ simultaneous calls using WebSockets, Apache Kafka, and streaming architectures
  • Developed gRPC microservices with C++ modules (CUDA, Eigen), reducing inference latency by 25%
  • Designed speech-to-text, sentiment analysis, and sales insights extraction from live audio streams

Fraud Detection AI Co-Pilot

Ensemble ML + GenAI explainability for financial services

  • Engineered 650+ predictive features from raw application data: behavioral anomalies, timing patterns, identity verification signals
  • Built ensemble model (XGBoost + Isolation Forest) achieving 50% fraud detection on holdout test sets
  • Discovered 3 applicant personas via unsupervised clustering (UMAP + HDBSCAN); the “Digital Ghost” persona has 70% fraud concentration
  • Implemented GenAI-powered explainable PDF reports via Amazon Bedrock, translating SHAP values into plain English
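The last bullet can be illustrated with a minimal sketch of a step that might precede the LLM call: ranking per-feature SHAP contributions and rendering the largest ones as short sentences. This is not the project's implementation. The function name `summarize_shap`, the feature names, and the values are all hypothetical, and no Bedrock call is shown.

```python
from typing import Dict, List


def summarize_shap(contributions: Dict[str, float], top_k: int = 3) -> List[str]:
    """Render the top_k largest-magnitude SHAP contributions as sentences."""
    # Rank by absolute value so strong negative drivers are not ignored.
    ranked = sorted(contributions.items(), key=lambda kv: abs(kv[1]), reverse=True)
    lines = []
    for feature, value in ranked[:top_k]:
        direction = "raised" if value > 0 else "lowered"
        lines.append(f"{feature} {direction} the fraud score by {abs(value):.2f}")
    return lines


# Hypothetical feature names and SHAP values for a single application.
example = {
    "minutes_to_complete_form": 0.42,
    "email_domain_age_days": -0.10,
    "device_reuse_count": 0.31,
    "declared_income": 0.05,
}
summary = summarize_shap(example)
```

Sentences like these could be placed in a prompt so that the model rewrites them for a non-technical reader instead of interpreting raw numbers.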

Enterprise RAG Pipelines

Knowledge retrieval across 2TB+ structured and unstructured data

  • Architected multi-index retrieval (FAISS + ChromaDB + PG-Vector) with cross-encoder re-ranking
  • Built hallucination detection and citation tracking for grounded LLM responses
  • Deployed on AWS SageMaker with auto-scaling, for a 40% cost reduction vs hosted API models
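As an illustration of how ranked results from several indexes can be merged before re-ranking, here is a minimal sketch using reciprocal rank fusion (RRF). RRF is an assumption chosen for the example; the source does not state which fusion method the pipeline uses. The document IDs and hit lists are hypothetical stand-ins for per-index search results.

```python
from collections import defaultdict
from typing import Dict, List


def reciprocal_rank_fusion(rankings: List[List[str]], k: int = 60) -> List[str]:
    """Merge several ranked lists of document IDs into one fused ranking."""
    scores: Dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            # Each list contributes 1 / (k + rank); k dampens top-rank dominance.
            scores[doc_id] += 1.0 / (k + rank)
    # Sort by descending score, breaking ties by ID for deterministic output.
    return sorted(scores, key=lambda d: (-scores[d], d))


# Hypothetical top-3 hits from three separate indexes.
faiss_hits = ["doc_a", "doc_b", "doc_c"]
chroma_hits = ["doc_b", "doc_d", "doc_a"]
pgvector_hits = ["doc_b", "doc_c", "doc_e"]

fused = reciprocal_rank_fusion([faiss_hits, chroma_hits, pgvector_hits])
```

The fused list would then typically be truncated and passed to the cross-encoder, which scores each query-document pair directly.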

LLM Fine-Tuning & RLHF

Parameter-efficient fine-tuning and human alignment for production LLMs

  • Fine-tuned LLaMA-2, Mistral with LoRA, QLoRA, PEFT; served via vLLM with CUDA optimization
  • Built full RLHF pipeline: SFT → Reward Modeling → PPO optimization with KL divergence constraints
  • Achieved 68% win rate vs SFT baseline and 96% safety compliance
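One common way to express a KL constraint in PPO-based RLHF is a per-token penalty on the log-probability gap between the policy and a frozen reference model, with the reward-model score added at the final token. The sketch below assumes that penalty form; the source does not say which formulation was used, and `beta` and all numbers are illustrative.

```python
from typing import List


def kl_shaped_rewards(
    policy_logprobs: List[float],
    ref_logprobs: List[float],
    reward_model_score: float,
    beta: float = 0.1,
) -> List[float]:
    """Per-token rewards: -beta * (log pi - log pi_ref), plus the RM score at the end."""
    if not policy_logprobs or len(policy_logprobs) != len(ref_logprobs):
        raise ValueError("log-prob sequences must be non-empty and equal in length")
    # Penalize tokens where the policy is more confident than the reference.
    rewards = [-beta * (lp - lr) for lp, lr in zip(policy_logprobs, ref_logprobs)]
    # The scalar reward-model score is credited to the last generated token.
    rewards[-1] += reward_model_score
    return rewards


# Illustrative log-probabilities for a three-token response.
rewards = kl_shaped_rewards(
    policy_logprobs=[-1.0, -0.5, -2.0],
    ref_logprobs=[-1.2, -0.5, -1.5],
    reward_model_score=1.0,
)
```

A larger `beta` keeps the policy closer to the SFT model at the cost of slower reward improvement.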

Autonomous AI Agents

Multi-agent systems executing complex workflows without human intervention

  • Built 8+ specialized agents for insurance underwriting, multilingual caregiving (100+ languages), content generation, and admissions automation
  • LangChain Agent orchestration connecting LLMs to databases, APIs, and messaging platforms
  • Reduced processing time by 50% for student admissions workflows
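The idea of connecting an LLM to databases and APIs can be sketched as a small tool registry that dispatches model-requested calls by name and returns errors as data. This is a framework-agnostic illustration and not LangChain's API. `ToolRegistry`, `get_application_status`, and the in-memory records are hypothetical.

```python
from typing import Any, Callable, Dict


class ToolRegistry:
    """Maps tool names to Python callables and dispatches calls by name."""

    def __init__(self) -> None:
        self._tools: Dict[str, Callable[..., Any]] = {}

    def register(self, name: str) -> Callable[[Callable[..., Any]], Callable[..., Any]]:
        def decorator(fn: Callable[..., Any]) -> Callable[..., Any]:
            if name in self._tools:
                raise ValueError(f"tool already registered: {name}")
            self._tools[name] = fn
            return fn
        return decorator

    def call(self, name: str, **kwargs: Any) -> Dict[str, Any]:
        # Return failures as data so the agent loop can recover or re-plan.
        if name not in self._tools:
            return {"ok": False, "error": f"unknown tool: {name}"}
        try:
            return {"ok": True, "result": self._tools[name](**kwargs)}
        except Exception as exc:
            return {"ok": False, "error": f"{type(exc).__name__}: {exc}"}


registry = ToolRegistry()

# Hypothetical in-memory stand-in for an admissions database.
_APPLICATIONS = {"A-100": "under_review", "A-101": "accepted"}


@registry.register("get_application_status")
def get_application_status(application_id: str) -> str:
    return _APPLICATIONS[application_id]


ok = registry.call("get_application_status", application_id="A-101")
missing = registry.call("get_application_status", application_id="A-999")
unknown = registry.call("send_email", to="x@example.com")
```

Returning structured errors instead of raising lets the orchestrating loop feed the failure back to the model as an observation.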

Computer Vision at Scale

Object detection and digital avatar generation

  • BiiView: Real-time object detection using Meta AI’s Segment Anything Model (SAM), with 90% accuracy across 11M+ images and 1.1B+ masks
  • Digital People Platform: Hyper-realistic talking avatars with SadTalker + SpeechT5 TTS, with a 70% realism improvement and a 30% user satisfaction increase
  • KYC Platform: Identity verification with OpenCV + AI, with 99.9% accuracy and 50% faster document processing

Professional Experience

| Role | Company | Period | Highlights |
|---|---|---|---|
| Senior ML/AI Engineer | Verticiti | Mar 2024 – Present | RAG pipelines (2TB+), LLM fine-tuning (LoRA/QLoRA), agentic workflows, C++ inference optimization, SAM object detection at scale. |
| Senior Generative AI Engineer | MARS (10K+ employees) | Oct 2024 – Jan 2026 | Led $1M+ GenAI enterprise transformation. RAG architectures, LLM orchestration, multi-agent frameworks for regulated industries. |
| Cloud Solution Architect | Cloud Kinetics USA | Aug 2024 – Jan 2026 | Designed cloud-native AI solutions on AWS, Azure, GCP for enterprise clients. ETL/ELT, data migration, real-time pipelines. |
| Senior AI Engineer | Reallytics.ai | Oct 2022 – Jan 2026 | Voice AI infra (500+ calls), fraud detection, autonomous agents, RLHF frameworks, cloud architecture on AWS/GCP. |
| Senior ML Engineer | Afiniti | Oct 2022 – Nov 2023 | Production ETL at scale for $1M+ accounts, churn modeling, call routing optimization. |
| AI Product Engineer | Afiniti | Apr 2021 – Oct 2022 | ML pipelines for call-routing, feature engineering on millions of daily records, production monitoring. |
| Python Engineer | MeryCure | May 2020 – Apr 2021 | IoT data pipelines (1000+ devices), anomaly detection, predictive maintenance, Power BI dashboards. |

Featured Projects

Sentinel AI: Fraud Detection
Ensemble XGBoost + Isolation Forest with 650+ features and GenAI explainability via Amazon Bedrock.

Python XGBoost AWS Bedrock SageMaker

Voice-AI-Platform
Real-time voice processing: 500+ concurrent calls, WebSockets, Kafka, gRPC/C++.

Python Kafka gRPC C++ AWS

BiiView: Object Detection
Meta AI SAM for video object detection, with 90% accuracy across 11M+ images.

Python SAM OpenCV PyTorch

RAG-Enterprise-Search
Enterprise RAG with multi-index fusion, re-ranking, and hallucination detection.

Python LangChain FAISS ChromaDB

LLM-Fine-Tuning-LoRA
Fine-tuning LLaMA/Mistral with LoRA, QLoRA, PEFT, with a 40% cost reduction vs hosted APIs.

Python HuggingFace vLLM CUDA

RLHF-LLM-Optimization
Full RLHF pipeline: SFT, reward modeling, PPO with KL constraints.

Python PyTorch HuggingFace TRL

Digital People Platform
Talking avatars with SadTalker + SpeechT5 TTS, with a 70% realism improvement.

Python SadTalker OpenAI PyTorch

Agentic-AI-Workflows
Autonomous AI agents for enterprise automation with LangChain orchestration.

Python LangChain OpenAI FastAPI

Sunshine Care: Daycare Management SaaS
Production-grade multi-center childcare SaaS with 13 modules, 28 REST APIs, real-time multi-site switching, and notifications. Competes with Brightwheel, Tadpoles, and Lillio.

Next.js TypeScript Prisma SQLite Tailwind CSS

IPM-Website-V2
Professional web platform with modern UI/UX, responsive design, and production deployment.

Next.js TypeScript Tailwind CSS

View all repositories →


Kaggle: Research & Technical Notebooks

Hands-on explorations, architecture deep-dives, and production-tested techniques, published on Kaggle.

🤖 Agentic AI: Multi-Agent Orchestration from Scratch
Building a multi-agent system with tool registries, planning loops, and guardrails; framework-agnostic patterns from production.

🔌 LLM Function Calling and Tool Use: Complete Guide
End-to-end function calling: schema design, validation, chaining, error recovery, and production deployment patterns.

🔍 Advanced RAG: Production Retrieval Guide
Multi-query RAG, hybrid search, cross-encoder re-ranking, and hallucination detection, going beyond basic retrieve-and-generate.

🎯 Prompt Engineering That Actually Works (2026)
Chain-of-thought, few-shot, self-consistency, structured output: real techniques with measured results.

👁️ Multimodal AI: Vision-Language Pipeline
Vision encoders, cross-attention fusion, image captioning, visual QA: building multimodal systems from components.

💳 Fraud Detection: XGBoost + Isolation Forest Ensemble
Ensemble anomaly detection with SHAP explainability, t-SNE visualization, and DBSCAN clustering on imbalanced data.

💬 Sentiment Analysis: NLP Pipeline Comparison
TF-IDF vs BERT vs DistilBERT: benchmarking classical and transformer approaches on real text data.

📚 RAG Pipeline: LangChain + FAISS for Document QA
End-to-end retrieval-augmented generation with chunk strategies, embedding models, and answer grounding.

🧬 LLM Fine-Tuning: LoRA and QLoRA Guide
Parameter-efficient fine-tuning walkthrough: LoRA, QLoRA, PEFT with memory profiling and serving benchmarks.

📈 Time Series: XGBoost Forecasting
Feature engineering for temporal data: lag features, rolling stats, calendar effects, walk-forward validation.

🚢 Titanic: Stacking Ensemble Pipeline
Advanced stacking with cross-validated base learners, meta-learner optimization, and feature engineering.

👉 View all notebooks on Kaggle →

Featured Writeups & Datasets

Technical writeups published as Kaggle Datasets: production insights, benchmarks, and reference architectures.

| Writeup | What’s Inside |
|---|---|
| Agentic AI Tool Schemas: Production Patterns | 50+ tool/function schemas, 8 agent configs, benchmark data from 500 agent executions |
| RAG Evaluation Benchmark 2026 | 1,000 QA pairs with human-annotated relevance scores across 50 retrieval configs |
| LLM Prompt Engineering Templates | 100+ prompt templates with A/B test results from 200 production experiments |
| Fraud Detection: Feature Engineering Guide | 650+ feature catalog, interaction analysis, and 3 fraud persona profiles |
| ML System Design Patterns: Production | 40+ patterns, 25+ anti-patterns, decision frameworks for production ML |

Tech Stack

Languages & Frameworks

Python C++ PyTorch TensorFlow scikit-learn OpenCV FastAPI Flask

Generative AI & LLMs

LangChain OpenAI Claude HuggingFace vLLM FAISS ChromaDB Pinecone

Cloud & Infrastructure

AWS SageMaker Bedrock Azure GCP Docker Kubernetes Terraform CUDA

Data Engineering

Kafka PySpark Airflow PostgreSQL MongoDB Redis DynamoDB gRPC WebSockets


Education & Certifications

  • B.S. Computer Science, COMSATS University Islamabad (2016–2020)
  • Foundations: Data, Data, Everywhere (Google)
  • PostgreSQL: Advanced Queries (LinkedIn Learning)
  • SQL Essential Training (LinkedIn Learning)


📰 Latest AI Research Articles

Auto-generated articles with AI-crafted images, published daily to AI-Engineering-Notes

Model Context Protocol and Tool Use (2026-04-15)

LLM Fine-Tuning at Scale with LoRA (2026-04-14)

Production RAG Pipelines with Re-ranking (2026-04-13)

Real-Time Multimodal LLM Integration (2026-04-12)

📚 View all articles →


⚑ Recent Activity

πŸ“ Opened issue [Feature] Built-in collection versioning for zero-downtime i in chroma-core/chroma (2026-04-15)

πŸ’¬ Commented on Qwen3.5 Image Lora post-training in axolotl-ai-cloud/axolotl (2026-04-15)

πŸ’¬ Commented on [Bug] Responses API with code_interpreter file_ids does not in BerriAI/litellm (2026-04-15)

⭐ Starred mage-ai/mage-ai (2026-04-15)

⭐ Starred Avaiga/taipy (2026-04-15)

πŸ’¬ Commented on Liger-Kernel is now supported on LLaMA-Factory + NPU in hiyouga/LlamaFactory (2026-04-14)

πŸ’¬ Commented on [BUG] convertSegmentMetadataToModel debug logs nil instead o in chroma-core/chroma (2026-04-14)

⭐ Starred umbertogriffo/rag-chatbot (2026-04-14)


🔬 Currently Researching

Topics discovered daily by a multi-model AI research engine (GPT-4.1, Grok-3, DeepSeek R1, Llama-4)

🔬 Model Context Protocol and Tool Use

🔬 Agentic Coding Assistants Architecture

🔬 LLM Fine-Tuning at Scale with LoRA

🔬 Production RAG Pipelines with Re-ranking

🔬 Real-Time Multimodal LLM Integration

🔬 Real-Time Data Quality Monitoring for ML


📌 Latest Code Snippets

📌 Token Budget Manager: LLM Context Window Optimization (Python) (2026-04-15)

📌 Retry with Exponential Backoff & Jitter: Production HTTP Client (Python) (2026-04-14)

📌 Async LLM Gateway with Circuit Breaker & Retry: Production Pattern (Python) (2026-04-13)
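For readers unfamiliar with the retry pattern named in the list above, here is a minimal sketch of exponential backoff with full jitter. It is not the published snippet. The function name, parameters, and defaults are assumptions, and `sleep` and `rng` are injectable only so the example runs without waiting.

```python
import random
import time
from typing import Callable, List, TypeVar

T = TypeVar("T")


def retry_with_backoff(
    fn: Callable[[], T],
    max_attempts: int = 5,
    base_delay: float = 0.5,
    max_delay: float = 8.0,
    sleep: Callable[[float], None] = time.sleep,
    rng: Callable[[], float] = random.random,
) -> T:
    """Call fn until it succeeds, sleeping a jittered, doubling delay between tries."""
    for attempt in range(max_attempts):
        try:
            return fn()
        except Exception:
            if attempt == max_attempts - 1:
                raise  # Out of attempts: surface the last error to the caller.
            # The cap doubles each attempt; full jitter picks a delay in [0, cap).
            cap = min(max_delay, base_delay * (2 ** attempt))
            sleep(rng() * cap)
    raise RuntimeError("max_attempts must be at least 1")


# Demo: a call that fails twice, with sleep recorded instead of executed.
calls = {"n": 0}


def flaky() -> str:
    calls["n"] += 1
    if calls["n"] < 3:
        raise ConnectionError("transient failure")
    return "ok"


delays: List[float] = []
result = retry_with_backoff(flaky, sleep=delays.append, rng=lambda: 1.0)
```

Randomizing the delay spreads out retries from many clients so they do not hit a recovering service at the same moment.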

🤖 Profile auto-updated on 2026-04-15 09:16 UTC



Currently open to Senior AI/ML Engineer, Staff ML Engineer, or Lead AI Engineer roles.
If you’re building production AI systems and need someone who ships, let’s talk.