Deepali Balakrishna Ksheersagar Deepali-BK

Hi, I'm Deepali 👋

Data Scientist | NLP & LLM Engineer | NYU M.S. Data Science '26

📍 New York, NY · 📧 [email protected] · 💼 linkedin.com/in/deepali-bk · 🌐 Portfolio

About Me

I'm a Data Science graduate student at New York University (GPA: 3.6) with a background in software engineering and machine learning. My work sits at the intersection of NLP, LLMs, and production ML systems — from processing millions of unstructured documents to evaluating emotional intelligence in large language models.

I've built classifiers, information extraction pipelines, and multilingual NLP systems across healthcare, enterprise, and research domains. I care about models that work in the real world, not just on benchmarks.

🔬 Graduate Research Assistant @ NYU Rory Meyers College of Nursing
🏆 Violet Internship & Research Award 2025
🌐 Women in Data Science Ambassador 2026

🛠 Tech Stack

Languages & Tools: Python · R · SQL · C++ · Git

Machine Learning & Deep Learning: Scikit-learn · PyTorch · TensorFlow · XGBoost · Flair · FastText · Computer Vision

NLP & Generative AI: HuggingFace Transformers · LangChain · SpaCy · BERT · RoBERTa · GPT-4 · RAG Systems · Sentiment Analysis · Topic Modeling · Named Entity Recognition · Few-shot Prompting

Data & Visualization: Pandas · NumPy · Matplotlib · Seaborn · Tableau · Spark · Hadoop (HDFS)

Statistics: Statistical Modeling · Bayesian Methods · A/B Testing · Hypothesis Testing

🚀 Featured Projects

🔍 Muck Rack — HTML Quality Detection Pipeline (Capstone)

Production-grade ML pipeline combining BeautifulSoup + XGBoost with rule-based heuristics for HTML quality detection.

Achieved 0.98 precision / 0.920 F1 score
Built a GPT-4.1-mini few-shot labeling pipeline to expand an imbalanced seed dataset
Eliminated manual QA bottlenecks entirely

Python · XGBoost · BeautifulSoup · GPT-4 · Few-shot Learning · ML Pipelines
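
A minimal sketch of the general idea (structural features from HTML plus a rule-based check), not the capstone code itself. To stay dependency-free it uses the standard library's `html.parser` in place of BeautifulSoup and omits the XGBoost model entirely. The feature names, the `rule_flags_low_quality` helper, and the `min_text_chars` threshold are all hypothetical, chosen only for illustration.

```python
from html.parser import HTMLParser


class FeatureParser(HTMLParser):
    """Collects a few simple structural counts from an HTML string."""

    def __init__(self):
        super().__init__()
        self.tag_count = 0
        self.script_style_count = 0
        self.text_chars = 0
        self._skip_depth = 0  # > 0 while inside <script> or <style>

    def handle_starttag(self, tag, attrs):
        self.tag_count += 1
        if tag in ("script", "style"):
            self.script_style_count += 1
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Count only visible text, not script or style bodies.
        if self._skip_depth == 0:
            self.text_chars += len(data.strip())


def extract_features(html):
    """Returns a feature dict that a downstream classifier could consume."""
    parser = FeatureParser()
    parser.feed(html)
    parser.close()
    return {
        "tag_count": parser.tag_count,
        "script_style_count": parser.script_style_count,
        "text_chars": parser.text_chars,
        "text_per_tag": parser.text_chars / max(parser.tag_count, 1),
    }


def rule_flags_low_quality(features, min_text_chars=50):
    """Hypothetical heuristic: almost no visible text suggests low quality."""
    return features["text_chars"] < min_text_chars
```

In a setup like the one described above, a feature dict of this kind would typically be turned into a numeric vector for a gradient-boosted classifier, with rules like this one acting as a cheap pre-filter or as extra features.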

🧠 Emotion Learning Evaluation for LLMs

Benchmarked emotional intelligence capabilities of Gemma, Qwen, and Llama using zero-shot and few-shot prompt engineering.

Demonstrated above-chance performance across all three models
Leveraged LangChain and HuggingFace for evaluation orchestration

LangChain · HuggingFace · Prompt Engineering · Zero-shot · Few-shot · LLM Evaluation
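
A minimal sketch of two pieces an evaluation like this usually needs: a prompt builder that covers both zero-shot (no examples) and few-shot (a handful of labeled examples), and a check for whether accuracy beats chance. The project description does not say how "above chance" was assessed; the one-sided exact binomial test below is one common choice and is an assumption here, as is treating chance as uniform guessing over the label set. The prompt wording and function names are hypothetical.

```python
from math import comb


def build_prompt(query, labels, examples=()):
    """Builds a zero-shot prompt (no examples) or a few-shot prompt."""
    lines = ["Classify the emotion as one of: " + ", ".join(labels) + "."]
    for text, label in examples:
        lines.append(f"Text: {text}\nEmotion: {label}")
    # The query goes last, with the label left for the model to fill in.
    lines.append(f"Text: {query}\nEmotion:")
    return "\n\n".join(lines)


def p_value_above_chance(correct, total, num_labels):
    """One-sided exact binomial test: P(X >= correct) under uniform guessing."""
    p = 1.0 / num_labels
    return sum(
        comb(total, k) * p**k * (1 - p) ** (total - k)
        for k in range(correct, total + 1)
    )
```

A small p-value means the observed accuracy would be unlikely if the model were guessing uniformly. With imbalanced labels, a majority-class baseline is a stricter comparison than `1 / num_labels`.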

💼 Experience

Graduate Research Assistant — NYU Rory Meyers College of Nursing (Sep 2025 – Present)

Performed statistical modeling on survey data from 90+ countries for COVID-19 healthcare pattern recognition
Built an interactive Tableau dashboard for the organization's public-facing website
Built a BERT-based multi-class classifier achieving 87% accuracy on 5,000+ survey responses on violence against medical professionals

Data Science Intern — Global Consortium of Nursing & Midwifery Studies (Jun – Sep 2025)

Analyzed 21,000+ survey responses across 40+ languages using multilingual RoBERTa
Applied sentiment analysis and topic modeling on PPE usage data during COVID-19

Software Engineer (ML) — CGI Inc. (Feb 2022 – Jun 2023)

Led NER implementation as SME for a Fortune 500 telecom client — extracted entities from 2M+ unstructured documents using SpaCy
Reduced manual review time by 71% through automated document processing
Built classification system with Flair + FastText to categorize 10,000+ reviews for C-suite strategic planning

🎓 Education

Degree	Institution	Year
M.S. Data Science	New York University	2024 – 2026
B.E. Electronics & Communication	B.N.M Institute of Technology	2016 – 2020

Relevant coursework: Deep Learning · Machine Learning · Natural Language Understanding · Reinforcement Learning · Big Data · AI Applications in Business (NYU Stern)

🏅 Awards

🥇 Violet Internship & Research Award 2025 — Competitive funding for excellence in research and internship performance
🌟 Women in Data Science Ambassador 2026 — Selected to represent and promote WiDS initiatives

Always open to research collaborations and data science opportunities. Feel free to reach out!