hello there! i'm

Anusha Nandy

Data Scientist & Machine Learning Engineer. Bridging the gap between experimental notebooks and production systems.

01 / About

I’ve always liked stories. To me, data science is just a way of finding the truth hidden inside a mess of numbers. I operate at the intersection of data and engineering, acting as a student, engineer, and relentless troubleshooter.

Currently, I’m finishing up my M.S. in Data Science, focusing heavily on Machine Learning Engineering. I enjoy the logic puzzle of system design: taking a theoretical concept and building the infrastructure to support it. Whether it's stitching images together or predicting customer churn, my goal is always the same: to build tools that are reliable and scalable.

Outside of tech, I try to stay offline as much as I can. I love exploring new places, and I write about tech and life on Medium. I’m looking for a team where I can ask a million questions, solve hard problems, and keep learning.

Current Focus

Reliable ML End-to-End Pipelines Cloud Architecture

02 / Stack

Languages

Python (NumPy, Pandas)
SQL (PostgreSQL, MySQL)
R Programming
Bash / Shell Scripting

Machine Learning

PyTorch & TensorFlow
Scikit-learn
XGBoost / LightGBM
HuggingFace Transformers

Infrastructure

Docker & Kubernetes
AWS (SageMaker, S3, EC2)
Terraform
GitHub Actions (CI/CD)

Tools & Platforms

MLflow & DVC
Apache Airflow
Tableau & Power BI
FastAPI / Flask

03 / Work

Real-Time Fraud Detection Pipeline

Built a real-time fraud detection pipeline processing ~6K events/sec with <100ms latency. Implemented drift monitoring and automated retraining to maintain stable PR-AUC (~0.94) across streaming data.

Kafka XGBoost MLflow Streamlit

RAG-based Chatbot for Document QA

Built a production-grade RAG pipeline integrating Llama with structured citation enforcement. Improved retrieval precision by 30-45% and reduced hallucinations by 35%.

LlamaIndex FAISS Sentence Transformers

Language Model from Scratch

Trained a 22M-parameter Transformer LM from scratch in PyTorch, implementing tokenization, attention, masking, and the training loop end to end. Evaluated various positional encoding strategies.

PyTorch Transformers
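
For readers unfamiliar with the attention and masking steps mentioned in this project, here is a minimal sketch of single-head scaled dot-product attention with a causal mask. It is written in plain Python for readability and is not the project's PyTorch code; the function names `softmax` and `causal_attention` and the toy inputs are illustrative assumptions.

```python
import math


def softmax(xs):
    # Numerically stable softmax over a list of floats.
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def causal_attention(q, k, v):
    """Single-head scaled dot-product attention with a causal mask.

    q, k, v are lists of T vectors (each a list of floats).
    Position t may attend only to positions 0..t.
    Returns (outputs, attention_weights).
    """
    d = len(q[0])
    outputs, weights = [], []
    for t, q_t in enumerate(q):
        # Score only the keys at or before position t (the causal mask).
        scores = [
            sum(a * b for a, b in zip(q_t, k[s])) / math.sqrt(d)
            for s in range(t + 1)
        ]
        w = softmax(scores)
        # Future positions receive exactly zero weight.
        w_full = w + [0.0] * (len(k) - t - 1)
        weights.append(w_full)
        outputs.append([
            sum(w_full[s] * v[s][j] for s in range(len(v)))
            for j in range(len(v[0]))
        ])
    return outputs, weights


# Toy input (hypothetical): three tokens with 2-dimensional vectors.
x = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
out, attn = causal_attention(x, x, x)
```

Each row of `attn` sums to 1, and every entry above the diagonal is zero, which is what prevents a token from seeing later tokens during training.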

Personalised Book Recommendation

Built a scalable retrieval-based recommender using implicit feedback and a two-tower neural architecture with FAISS KNN for efficient candidate retrieval.

FAISS Neural Networks Collaborative Filtering
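
The candidate-retrieval step of a two-tower recommender can be illustrated with a small sketch. It uses exact brute-force inner-product search in plain Python as a stand-in for the FAISS index used in the project; the embeddings, item IDs, and the function name `top_k_by_inner_product` are hypothetical.

```python
import heapq


def top_k_by_inner_product(user_vec, item_vecs, k):
    """Return the k (item_id, score) pairs with the highest inner product.

    This is exact brute-force search; an approximate or optimized index
    (such as FAISS) serves the same purpose at much larger scale.
    """
    scored = (
        (item_id, sum(u * i for u, i in zip(user_vec, vec)))
        for item_id, vec in item_vecs.items()
    )
    return heapq.nlargest(k, scored, key=lambda pair: pair[1])


# Hypothetical outputs of the item tower and the user tower.
item_embeddings = {
    "book_a": [0.9, 0.1],
    "book_b": [0.2, 0.8],
    "book_c": [0.7, 0.6],
}
user_embedding = [1.0, 0.0]

candidates = top_k_by_inner_product(user_embedding, item_embeddings, k=2)
```

Because the two towers are trained so that a user vector and a relevant item vector have a high inner product, retrieval reduces to a nearest-neighbor lookup over precomputed item vectors.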

Multitask BERT for Spam Detection

Designed a multitask learning framework by fine-tuning BERT jointly on spam detection, sentiment analysis, and toxicity classification, achieving 96.9% accuracy on YouTube spam detection.

BERT Multitask Learning NLP

04 / Path

2024 — Present

M.S. Data Science

University of Alabama at Birmingham

Specializing in Machine Learning, Big Data Analytics, and Cloud Computing.

2022

AI Intern

Indian Oil Company

Optimized Indian vehicle recognition algorithms, enabling real-time monitoring on Android with 30% faster inference.

2020 — 2024

B.Tech. Artificial Intelligence

Mahindra University

Resume.pdf