Anusha Nandy
Data Scientist & Machine Learning Engineer. Bridging the gap between experimental notebooks and production systems.
01 / About
I’ve always liked stories. To me, Data Science is just a way of finding the truth hidden inside a mess of numbers. I operate at the intersection of data and engineering, acting as a student, engineer, and relentless troubleshooter.
Currently, I’m finishing up my M.S. in Data Science, focusing heavily on Machine Learning Engineering. I enjoy the logic puzzle of system design: taking a theoretical concept and building the infrastructure to support it. Whether it's stitching images together or predicting customer churn, my goal is always the same: to build tools that are reliable and scalable.
Outside of tech, I try to stay offline as much as I can. I love exploring new places and I write about tech and life on Medium. I’m simply looking for a team where I can ask a million questions, solve hard problems, and keep learning.
02 / Stack
Languages
- Python (NumPy, Pandas)
- SQL (PostgreSQL, MySQL)
- R Programming
- Bash / Shell Scripting
Machine Learning
- PyTorch & TensorFlow
- Scikit-learn
- XGBoost / LightGBM
- HuggingFace Transformers
Infrastructure
- Docker & Kubernetes
- AWS (SageMaker, S3, EC2)
- Terraform
- GitHub Actions (CI/CD)
Tools & Platforms
- MLflow & DVC
- Apache Airflow
- Tableau & PowerBI
- FastAPI / Flask
03 / Work
Real-Time Fraud Detection Pipeline
Built a real-time fraud detection pipeline processing ~6K events/sec with <100 ms latency. Implemented drift monitoring and automated retraining to maintain stable PR-AUC (~0.94) across streaming data.
RAG-based Chatbot for Document QA
Built a production-grade RAG pipeline integrating LLAMA with structured citation enforcement. Improved retrieval precision by 30-45% and reduced hallucinations by 35%.
Language Model from Scratch
Trained a 22M-parameter Transformer LM from scratch in PyTorch, implementing tokenization, attention, masking, and training loop end-to-end. Evaluated various positional encoding strategies.
Personalised Book Recommendation
Built a scalable retrieval-based recommender using implicit feedback and a two-tower neural architecture with FAISS KNN for efficient candidate retrieval.
Multitask BERT for Spam Detection
Designed a multitask learning framework by fine-tuning BERT jointly on spam detection, sentiment analysis, and toxicity classification, achieving 96.9% accuracy on YouTube spam detection.
04 / Path
M.S. Data Science
University of Alabama at Birmingham
Specializing in Machine Learning, Big Data Analytics, and Cloud Computing.
AI Intern
Indian Oil Company
Optimized Indian vehicle recognition algorithms, enabling real-time monitoring on Android with 30% faster inference.
B.Tech. Artificial Intelligence
Mahindra University