Welcome to my GitHub profile! I'm passionate about data science, AI/ML, and responsible computing. From building machine learning pipelines to leading teams, I aim to make an impact through innovative, data-driven solutions. Feel free to explore my repositories and connect with me on LinkedIn or by email.
- Sophomore Data Science student at SJSU
- Passionate about solving complex problems with AI/ML, data science, and responsible tech practices
- Dedicated to leading teams to achieve impactful goals
- Python
- Java
- HTML, CSS, JavaScript
- Scikit-learn
- TensorFlow
- Keras
- Pandas & NumPy
- React & Next.js
- Jupyter Notebook
Check out my team's repository for the real-world ML project we worked on for KPMG for over 3 months as part of the Break Through Tech AI Program's Fall 2024 AI Studio!
- Analyzed 40,000+ zip codes and 3,000+ donations for C5LA using the CRISP-DM framework and Agile methodology
- Achieved 88% accuracy with a Random Forest model: selected features with RFE, tuned hyperparameters with RandomizedSearchCV, and applied preprocessing and trend analysis using Pandas, NumPy, and Scikit-learn
- Presented findings and actionable recommendations to KPMG staff and C5LA to improve donor retention
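
The modeling steps above (RFE feature selection, a Random Forest, and a randomized hyperparameter search) can be sketched roughly as follows. This is a minimal illustration on synthetic data, not the project's actual code or data; the feature counts and parameter ranges are assumptions chosen to keep it fast.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_selection import RFE
from sklearn.model_selection import RandomizedSearchCV, train_test_split
from sklearn.pipeline import Pipeline

# Synthetic stand-in data (assumption): the real donor dataset is not reproduced here.
X, y = make_classification(
    n_samples=300, n_features=12, n_informative=5, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

# Step 1: recursive feature elimination keeps the 6 highest-ranked features.
# Step 2: a Random Forest is trained on the selected features.
pipeline = Pipeline([
    ("select", RFE(RandomForestClassifier(n_estimators=25, random_state=0),
                   n_features_to_select=6)),
    ("model", RandomForestClassifier(random_state=0)),
])

# Illustrative search space (assumption), sampled at random rather than exhaustively.
param_distributions = {
    "model__n_estimators": [25, 50, 100],
    "model__max_depth": [None, 4, 8],
    "model__min_samples_leaf": [1, 2, 4],
}

search = RandomizedSearchCV(
    pipeline, param_distributions, n_iter=4, cv=3,
    scoring="accuracy", random_state=0,
)
search.fit(X_train, y_train)

test_accuracy = search.score(X_test, y_test)
n_selected = int(search.best_estimator_.named_steps["select"].support_.sum())
print(f"Selected features: {n_selected}, held-out accuracy: {test_accuracy:.2f}")
```

Placing RFE inside the pipeline means feature selection is refit within each cross-validation fold, which avoids leaking information from the validation split into the selection step.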
- Led all exploratory data analysis (EDA) for a Kaggle competition focused on building a fair computer vision model to classify skin conditions across diverse skin tones.
- Conducted fairness-driven analysis using the Fitzpatrick skin tone scale and visualized class imbalances across 21 medical skin conditions.
- Insights from my EDA guided fairness-aware model development to address bias and healthcare disparities in dermatological AI.
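
A class-imbalance check like the one described above can be sketched with Pandas. The rows and the column names `condition` and `fitzpatrick` below are made up for illustration and are not taken from the competition data.

```python
import pandas as pd

# Toy rows (assumption): condition labels and Fitzpatrick types are illustrative only.
samples = pd.DataFrame({
    "condition": ["eczema", "eczema", "eczema", "psoriasis", "psoriasis", "melanoma"],
    "fitzpatrick": [1, 2, 2, 3, 5, 6],
})

# How many images each condition has, and how lopsided the classes are.
class_counts = samples["condition"].value_counts()
imbalance_ratio = class_counts.max() / class_counts.min()

# Condition-by-skin-tone table: empty cells highlight groups with no coverage.
by_tone = pd.crosstab(samples["condition"], samples["fitzpatrick"])

print(class_counts)
print(f"Largest-to-smallest class ratio: {imbalance_ratio:.1f}")
print(by_tone)
```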
A linear algebra-based project focused on dimensionality reduction using PCA:
- Applied PCA to reduce dataset dimensions while preserving variance
- Visualized key patterns and feature contributions with intuitive plots
- Demonstrated practical applications using real-world datasets like the Iris dataset
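
As a rough sketch of the idea, the snippet below applies PCA to the Iris dataset with Scikit-learn. Standardizing the features first and keeping two components are assumptions made for this example, not necessarily the choices made in the project.

```python
from sklearn.datasets import load_iris
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

iris = load_iris()

# Standardize so each feature contributes on the same scale (assumed preprocessing).
X_scaled = StandardScaler().fit_transform(iris.data)

# Project the 4 original features onto the 2 directions of greatest variance.
pca = PCA(n_components=2)
X_2d = pca.fit_transform(X_scaled)

# Share of total variance captured by each component.
explained = pca.explained_variance_ratio_
print(f"Reduced shape: {X_2d.shape}")
print(f"Variance retained by 2 components: {explained.sum():.1%}")

# Rows of components_ show how strongly each original feature loads on a component.
for name, loading in zip(iris.feature_names, pca.components_[0]):
    print(f"{name}: {loading:+.2f}")
```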
A machine learning project predicting Airbnb prices using:
- Scikit-learn pipelines for feature engineering
- Natural Language Processing (NLP) on amenities data
- Robust performance evaluation metrics
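
One way to combine numeric features with amenities text in a single Scikit-learn pipeline is sketched below. The toy listings, the column names, the bag-of-words encoding, and the Ridge regressor are all assumptions for illustration, not the project's actual setup.

```python
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import Ridge
from sklearn.metrics import mean_absolute_error
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Made-up toy listings (assumption); column names are hypothetical.
listings = pd.DataFrame({
    "bedrooms": [1, 2, 3, 1, 2, 4],
    "accommodates": [2, 4, 6, 2, 3, 8],
    "amenities": [
        "wifi kitchen",
        "wifi parking pool",
        "wifi kitchen pool gym",
        "wifi",
        "kitchen parking",
        "wifi kitchen parking pool gym",
    ],
    "price": [80.0, 150.0, 240.0, 60.0, 110.0, 320.0],
})

# Scale numeric columns; turn amenities text into binary "has this amenity" features.
preprocess = ColumnTransformer([
    ("numeric", StandardScaler(), ["bedrooms", "accommodates"]),
    ("amenities", CountVectorizer(binary=True), "amenities"),
])

model = Pipeline([
    ("preprocess", preprocess),
    ("regressor", Ridge(alpha=1.0)),
])

X = listings.drop(columns="price")
y = listings["price"]
model.fit(X, y)

# In-sample error only; a real evaluation would use a held-out split or cross-validation.
predictions = model.predict(X)
in_sample_mae = mean_absolute_error(y, predictions)
n_features = model.named_steps["preprocess"].transform(X).shape[1]
print(f"Features after preprocessing: {n_features}, in-sample MAE: {in_sample_mae:.1f}")
```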
- Analyzed customer feedback sentiment
- Preprocessed text data with TF-IDF vectorization
- Visualized sentiment trends
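
A minimal sketch of TF-IDF-based sentiment classification is shown below. The example reviews are invented, and pairing TF-IDF with logistic regression is an assumption for illustration; the original project may have used a different classifier.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Invented feedback snippets (assumption), labeled by hand for illustration.
reviews = [
    "great product and fast shipping",
    "love the quality, works great",
    "excellent support, very helpful",
    "happy with this purchase",
    "terrible quality, arrived broken",
    "awful support and slow shipping",
    "broken after one day, very disappointed",
    "waste of money, do not buy",
]
labels = ["positive"] * 4 + ["negative"] * 4

# TF-IDF turns each review into a weighted word-frequency vector;
# the classifier then learns which words signal each sentiment.
classifier = make_pipeline(
    TfidfVectorizer(lowercase=True, stop_words="english"),
    LogisticRegression(),
)
classifier.fit(reviews, labels)

new_feedback = ["support was quick and helpful", "the item arrived broken"]
predicted = classifier.predict(new_feedback)
for text, sentiment in zip(new_feedback, predicted):
    print(f"{sentiment}: {text}")
```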

