smyh1989/README.md

👋 Hey, I'm Somi

Junior Data Analyst | Toronto, Canada
Former educator & designer pivoting into data analytics.
I enjoy turning messy real-world data into clear stories and practical recommendations.


🧠 What I'm focused on

  • Building end-to-end data analytics projects (Cyclistic, health datasets, etc.)
  • Applying SQL, Python, R, and spreadsheets to solve real-world problems
  • Communicating insights clearly for non-technical stakeholders
  • Documenting and sharing my work on GitHub and LinkedIn

Domain | Tools & Skills
R Programming | tidyverse, dplyr, lubridate, ggplot2, R Markdown
Python | pandas, NumPy, Jupyter Notebooks
SQL | SELECT, JOIN, GROUP BY, filtering, window functions, BigQuery
Spreadsheets | Excel & Google Sheets: formulas, pivot tables, charts
Data Analysis | EDA, descriptive stats, data cleaning, aggregation, feature creation
Visualization | ggplot2, storytelling, R Markdown HTML/PDF
Other Interests | UX research, product thinking, documentation, teaching

📌 Recent Highlights

  • 🚲 Cyclistic Bike-Share Case Study
    Cleaned and analyzed ~5.6M bike-share rides in R to compare member vs casual behavior and propose marketing actions.

  • 🎓 Google Data Analytics Certificate
    Completed end-to-end data analysis projects following the full workflow: ask, prepare, process, analyze, share, and act.

  • 🤖 AI Job Market Assistant (DSI Project – RAG + LLM)
    Built an end-to-end data application to analyze job postings and evaluate resumes using real-world data.

  • 🕵️‍♀️ SQL Practice Projects
    Solved investigative scenarios (SQL Murder Mystery) to develop querying, joins, and analytical problem-solving skills.


📂 Selected Projects

  • AI Job Market Assistant (RAG + LLM + Resume Evaluation)
    Built an end-to-end data application that analyzes job market trends and evaluates resumes using real job postings.

    • Collected live job data via Adzuna API and stored it in a vector database (ChromaDB)
    • Implemented semantic search using embeddings and Retrieval-Augmented Generation (RAG)
    • Developed an AI-powered resume evaluator to match candidate skills with job requirements
    • Designed a pipeline using LangChain for query rewriting, retrieval, and response generation
    🔗 View the Repository
  • Cyclistic Bike-Share Case Study (R)
    Google Data Analytics capstone-style project analyzing member vs casual rider behavior.
    👉 View the repository
    🌐 Live Report: https://smyh1989.github.io/cyclistic-case-study/

  • Bellabeat Fitness Tracker Case Study (R)
    Google Data Analytics capstone-style project analyzing smart device fitness data from Bellabeat to uncover activity patterns, user behavior trends, and insights for improving engagement.
    🔗 View the project:
    📁 View the Repository
    🌐 Live Report: https://smyh1989.github.io/fitbit-activity-weight-analysis/bellabeat.html

  • SQL Practice Projects (SQL)
    Solved investigative scenarios (SQL Murder Mystery) to strengthen querying skills.

    • Complex SELECT queries
    • JOINs and filtering
    • Analytical problem-solving
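
The retrieval step described for the AI Job Market Assistant above (embed the query, rank stored postings by similarity, pass the top matches to the model as context) can be sketched roughly as follows. This is a minimal illustration only, not the project's code: it assumes a toy word-count embedding in place of a real embedding model and a plain Python list in place of ChromaDB, and every name and sample posting here (`embed`, `retrieve`, `build_prompt`, `POSTINGS`) is hypothetical.

```python
import math
from collections import Counter


def embed(text):
    """Toy embedding: lowercase word counts (a stand-in for a real embedding model)."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def retrieve(query, documents, k=2):
    """Return the k documents most similar to the query."""
    q = embed(query)
    ranked = sorted(documents, key=lambda d: cosine(q, embed(d)), reverse=True)
    return ranked[:k]


def build_prompt(query, documents, k=2):
    """Assemble retrieved context and the question into one prompt string."""
    context = "\n".join(f"- {d}" for d in retrieve(query, documents, k))
    return f"Context:\n{context}\n\nQuestion: {query}"


# Hypothetical sample postings, invented for illustration.
POSTINGS = [
    "junior data analyst sql python tableau",
    "senior backend engineer java kubernetes",
    "marketing coordinator social media campaigns",
]

if __name__ == "__main__":
    print(build_prompt("data analyst with sql and python", POSTINGS, k=1))
```

In a real pipeline the prompt returned by `build_prompt` would be sent to a language model; that call is left out here because it depends on the model provider.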

🀝 How I Work

  • I like clear, reproducible code (R Markdown, commented scripts, organized repos).
  • I care about explanations as much as numbers: why the result matters, not just what it is.
  • My background in teaching helps me translate technical findings into everyday language.

🌐 Connect with Me

If you're hiring for junior data analyst / BI roles (GTA or remote), I'd love to connect.
