
Hi, I'm Deepali 👋

Data Scientist | NLP & LLM Engineer | NYU M.S. Data Science '26

📍 New York, NY · 📧 [email protected] · 💼 linkedin.com/in/deepali-bk · 🌐 Portfolio


About Me

I'm a Data Science graduate student at New York University (GPA: 3.6) with a background in software engineering and machine learning. My work sits at the intersection of NLP, LLMs, and production ML systems, from processing millions of unstructured documents to evaluating emotional intelligence in large language models.

I've built classifiers, information extraction pipelines, and multilingual NLP systems across healthcare, enterprise, and research domains. I care about models that work in the real world, not just on benchmarks.

  • 🔬 Graduate Research Assistant @ NYU Rory Meyers College of Nursing
  • 🏆 Violet Internship & Research Award 2025
  • 🌐 Women in Data Science Ambassador 2026

🛠 Tech Stack

Languages & Tools: Python · R · SQL · C++ · Git

Machine Learning & Deep Learning: Scikit-learn · PyTorch · TensorFlow · XGBoost · Flair · FastText · Computer Vision

NLP & Generative AI: HuggingFace Transformers · LangChain · SpaCy · BERT · RoBERTa · GPT-4 · RAG Systems · Sentiment Analysis · Topic Modeling · Named Entity Recognition · Few-shot Prompting

Data & Visualization: Pandas · NumPy · Matplotlib · Seaborn · Tableau · Spark · Hadoop (HDFS)

Statistics: Statistical Modeling · Bayesian Methods · A/B Testing · Hypothesis Testing


🚀 Featured Projects

🔍 Muck Rack: HTML Quality Detection Pipeline (Capstone)

Production-grade ML pipeline combining BeautifulSoup + XGBoost with rule-based heuristics for HTML quality detection.

  • Achieved 0.98 precision / 0.920 F1 score
  • Built a GPT-4.1-mini few-shot labeling pipeline to expand an imbalanced seed dataset
  • Eliminated manual QA bottlenecks entirely

Python · XGBoost · BeautifulSoup · GPT-4 · Few-shot Learning · ML Pipelines


🧠 Emotion Learning Evaluation for LLMs

Benchmarked emotional intelligence capabilities of Gemma, Qwen, and Llama using zero-shot and few-shot prompt engineering.

  • Demonstrated above-chance performance across all three models
  • Leveraged LangChain and HuggingFace for evaluation orchestration

LangChain · HuggingFace · Prompt Engineering · Zero-shot · Few-shot · LLM Evaluation


💼 Experience

Graduate Research Assistant, NYU Rory Meyers College of Nursing (Sep 2025 – Present)

  • Performed statistical modeling on survey data from 90+ countries for COVID-19 healthcare pattern recognition
  • Built an interactive Tableau dashboard for the organization's public-facing website
  • Built a BERT-based multi-class classifier that reached 87% accuracy on 5,000+ survey responses about violence against medical professionals

Data Science Intern, Global Consortium of Nursing & Midwifery Studies (Jun – Sep 2025)

  • Analyzed 21,000+ survey responses across 40+ languages using multilingual RoBERTa
  • Applied sentiment analysis and topic modeling on PPE usage data during COVID-19

Software Engineer (ML), CGI Inc. (Feb 2022 – Jun 2023)

  • Led NER implementation as SME for a Fortune 500 telecom client, extracting entities from 2M+ unstructured documents using SpaCy
  • Reduced manual review time by 71% through automated document processing
  • Built a classification system with Flair + FastText to categorize 10,000+ reviews for C-suite strategic planning

🎓 Education

| Degree | Institution | Year |
| --- | --- | --- |
| M.S. Data Science | New York University | 2024 – 2026 |
| B.E. Electronics & Communication | B.N.M Institute of Technology | 2016 – 2020 |

Relevant coursework: Deep Learning · Machine Learning · Natural Language Understanding · Reinforcement Learning · Big Data · AI Applications in Business (NYU Stern)


🏅 Awards

  • 🥇 Violet Internship & Research Award 2025: Competitive funding for excellence in research and internship performance
  • 🌟 Women in Data Science Ambassador 2026: Selected to represent and promote WiDS initiatives

Always open to research collaborations and data science opportunities. Feel free to reach out!
