I'm a data scientist / AI engineer in Shanghai. I recently completed an MS in Computational Data Analytics (data science) at Georgia Tech with a 4.0 GPA, after ten years in international education spanning recruitment, teacher training, and co-founding two small schools in Shanghai (~35 staff).
I work on applied ML and AI systems around language, audio, and education. I've built tools for students, educators, and a few non-education clients. My GPT-based test-prep apps for Georgia Tech and other courses have logged roughly 8,000 student sessions.
These were mostly built to help myself and others prep for Georgia Tech's OMSA/OMSCS exams. The collection also includes an IBDP Economics prep app.
| Project | Description | Live / Repo |
|---|---|---|
| ISYE 6501 Midterm Prep App | Streamlit app with GPT-generated multiple-choice and open-ended questions. Used a few thousand times by fellow students. | demo • repo |
| GT Test-Prep Tools | Collection of similar apps for other courses (e.g. MGT 6203, CS 7643, ISYE 6740). | repo |
Most of these involve classification, ranking, or evaluation.
| Project | Description | Link |
|---|---|---|
| Birdsong Classifier (ChirpID) | Classifies ~30 bird species using VGG16 on Mel spectrograms. Deployed with FastAPI on Google Cloud Run (archived). | demo • repo |
| Deepfake Detector | Compares model robustness to diffusion-generated fakes (CLIP/Xception/SPSL + LoRA vs full fine-tuning). | paper • repo |
| Essay Scorer | Longformer + CatBoost pipeline for scoring essays in 100-word chunks. Includes QWK evaluation. | paper • repo |
| RV List-Price Model | Gradient-boosted model to predict suggested list prices for a multi-location RV dealer. Cleaned and joined internal inventory, vehicle-spec feeds, and scraped competitor listings; engineered pricing and calendar features; achieved ≈2% MAPE on suggested selling price. | report |
| Project | Description | Live / Repo |
|---|---|---|
| Stock-Tweet Sentiment Dashboard | 5 million tweets processed with BERTweet. Backend in FastAPI, frontend with D3. | demo • repo |
| Nurse Retain Lab | Interactive Streamlit app: users train models on synthetic attrition data, choosing features, sampling methods, and algorithms. | demo • repo |
| Project | Description |
|---|---|
| FlyingEnglish-Shanghai.com | Static site I built and deployed for the language school I co-founded. GitHub Pages + Porkbun domain. Still live, still used. |
If you’re working on something similar — or something weird — feel free to reach out. Open to collaborations, freelance projects, or ideas that break things in interesting ways.
