Junior Data Analyst | Toronto, Canada
Former educator & designer pivoting into data analytics.
I enjoy turning messy real-world data into clear stories and practical recommendations.
- Building end-to-end data analytics projects (Cyclistic, health datasets, etc.)
- Applying SQL, Python, R, and spreadsheets to solve real-world problems
- Communicating insights clearly for non-technical stakeholders
- Documenting and sharing my work on GitHub and LinkedIn
| Domain | Tools & Skills |
|---|---|
| R Programming | tidyverse, dplyr, lubridate, ggplot2, R Markdown |
| Python | pandas, NumPy, Jupyter Notebooks |
| SQL | SELECT, JOIN, GROUP BY, filtering, window functions, BigQuery |
| Spreadsheets | Excel & Google Sheets: formulas, pivot tables, charts |
| Data Analysis | EDA, descriptive stats, data cleaning, aggregation, feature creation |
| Visualization | ggplot2, storytelling, R Markdown HTML/PDF |
| Other Interests | UX research, product thinking, documentation, teaching |
- Cyclistic Bike-Share Case Study
  Cleaned and analyzed ~5.6M bike-share rides in R to compare member vs. casual behavior and propose marketing actions.
- Google Data Analytics Certificate
  Completed end-to-end data analysis projects following the full workflow: ask, prepare, process, analyze, share, and act.
- AI Job Market Assistant (DSI Project, RAG + LLM)
  Built an end-to-end data application to analyze job postings and evaluate resumes using real-world data.
- SQL Practice Projects
  Solved investigative scenarios (SQL Murder Mystery) to develop querying, joins, and analytical problem-solving skills.

AI Job Market Assistant (RAG + LLM + Resume Evaluation)
Built an end-to-end data application that analyzes job market trends and evaluates resumes using real job postings.
- Collected live job data via the Adzuna API and stored it in a vector database (ChromaDB)
- Implemented semantic search using embeddings and Retrieval-Augmented Generation (RAG)
- Developed an AI-powered resume evaluator to match candidate skills with job requirements
- Designed a pipeline using LangChain for query rewriting, retrieval, and response generation

View the Repository
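
The retrieval step described above can be illustrated with a minimal sketch. This is not the project's code: it assumes toy word-count vectors in place of learned embeddings, a plain dictionary in place of ChromaDB, and three invented postings in place of Adzuna data. The names `POSTINGS`, `embed`, `retrieve`, and `build_prompt` are hypothetical.

```python
import math
import re
from collections import Counter

# Hypothetical job postings, invented for illustration (not Adzuna data).
POSTINGS = {
    "job-1": "Junior data analyst with SQL and Python, dashboards in Tableau",
    "job-2": "Graphic designer for print and branding campaigns",
    "job-3": "BI analyst building SQL reports and Excel pivot tables",
}


def embed(text):
    """Toy 'embedding': a bag-of-words count vector (stand-in for a real model)."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query, k=2):
    """Return the ids of the k postings most similar to the query."""
    q = embed(query)
    ranked = sorted(
        POSTINGS,
        key=lambda pid: cosine(q, embed(POSTINGS[pid])),
        reverse=True,
    )
    return ranked[:k]


def build_prompt(query):
    """Assemble the retrieved postings into the context an LLM would receive."""
    context = "\n".join(f"- {POSTINGS[pid]}" for pid in retrieve(query))
    return f"Context:\n{context}\n\nQuestion: {query}"


print(build_prompt("SQL analyst roles"))
```

In a real pipeline, the count vectors would be replaced by model embeddings and the dictionary lookup by a vector-database query; the rank-then-prompt shape stays the same.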

Cyclistic Bike-Share Case Study (R)
Google Data Analytics capstone-style project analyzing member vs. casual rider behavior.
View the repository | Live Report: https://smyh1989.github.io/cyclistic-case-study/

Bellabeat Fitness Tracker Case Study (R)
Google Data Analytics capstone-style project analyzing Bellabeat smart-device fitness data to uncover activity patterns, user behavior trends, and insights for improving engagement.
View the Repository | Live Report: https://smyh1989.github.io/fitbit-activity-weight-analysis/bellabeat.html

SQL Practice Projects (SQL)
Solved investigative scenarios (SQL Murder Mystery) to strengthen querying skills.
- Complex SELECT queries
- JOINs and filtering
- Analytical problem-solving
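
As a small illustration of the JOIN-and-filter pattern listed above, here is a sketch using Python's built-in `sqlite3` module. The `person` and `gym_checkin` tables, their columns, and all rows are invented for this example and are not taken from the SQL Murder Mystery dataset.

```python
import sqlite3

# In-memory database with two hypothetical tables (invented for illustration).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE person (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE gym_checkin (person_id INTEGER, checkin_date TEXT);
    INSERT INTO person VALUES (1, 'Avery'), (2, 'Blake'), (3, 'Casey');
    INSERT INTO gym_checkin VALUES
        (1, '2018-01-09'), (2, '2018-01-09'),
        (2, '2018-01-10'), (3, '2018-02-01');
""")

# Join people to their check-ins, keep January 2018 only,
# and count visits per person.
QUERY = """
    SELECT p.name, COUNT(*) AS visits
    FROM person AS p
    JOIN gym_checkin AS g ON g.person_id = p.id
    WHERE g.checkin_date BETWEEN '2018-01-01' AND '2018-01-31'
    GROUP BY p.name
    ORDER BY visits DESC, p.name
"""
rows = conn.execute(QUERY).fetchall()
for name, visits in rows:
    print(name, visits)
```

Storing dates as ISO `YYYY-MM-DD` text keeps the `BETWEEN` filter correct, because string order then matches chronological order.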
- I like clear, reproducible code (R Markdown, commented scripts, organized repos).
- I care about explanations as much as numbers: why the result matters, not just what it is.
- My background in teaching helps me translate technical findings into everyday language.
- LinkedIn: https://www.linkedin.com/in/somi-shafiee89
If you're hiring for junior data analyst / BI roles (GTA or remote), I'd love to connect.