Bridging the gap between core data science and production-ready software engineering.
Hi there, I'm Parth. I'm a Data Scientist, Machine Learning Engineer, and AI Engineer. I turn raw, complex datasets into actionable business intelligence through predictive modeling, natural language processing, and generative AI workflows. For me, the work doesn't stop at training a model: I care just as much about architecting modular ML systems, developing Retrieval-Augmented Generation (RAG) applications, and building end-to-end automated MLOps pipelines. I enjoy pairing advanced statistical techniques with secure, high-performance backends, then deploying the results as scalable analytical tools with FastAPI and Streamlit.

I am currently based in Greater Sudbury and am open to relocating for an onsite or hybrid opportunity.
Artificial Intelligence & Machine Learning
Generative AI, NLP & Agents
Data Engineering, Cloud & MLOps
Databases, Governance & Security
Fundamentals & Version Control
- Multi-Doc-Chat RAG System
  A modular Retrieval-Augmented Generation (RAG) pipeline that ingests, processes, and queries multiple complex documents at once. It combines LLMs with vector embeddings so users can ask conversational questions over large text corpora and get synthesized answers grounded in the source documents, which reduces hallucination.
- Financial Crew AI: Multi-Agent MLOps System
  Designed a production-ready, multi-agent AI architecture that ingests and processes live cryptocurrency time-series data from external REST APIs. Engineered the core orchestration and routing layer with FastAPI to enforce strict modularity, so new agent logic can be deployed without system downtime. Includes an automated MLOps pipeline built with Azure DevOps for continuous data ingestion and scheduled retraining of the predictive models.
- Secure Lens: Enterprise NLP & Privacy Gateway
  Architected a scalable Data Loss Prevention (DLP) backend using Python and FastAPI. The gateway runs a hybrid NLP inference pipeline built on Microsoft Presidio and spaCy for high-precision Named Entity Recognition (NER) of sensitive PII and PHI. Implemented a server-side Role-Based Access Control (RBAC) engine that applies contextual, in-memory data masking for zero-trust querying, and reduced latency through targeted DataFrame sampling.
- Streaming Voice AI Assistant
  Engineered a modular ML pipeline for real-time API orchestration and voice interaction. The assistant uses Faster Whisper for streaming Automatic Speech Recognition (ASR) and a Mistral LLM for contextual natural language processing. GPU acceleration and isolated virtual environments are used to target low-latency, deployment-ready performance.
- Chicago Crash Analysis & Predictive Modeling
  Conducted comprehensive Exploratory Data Analysis (EDA) on large-scale temporal and traffic datasets. Trained and evaluated classical machine learning algorithms (Decision Trees, k-NN, clustering) to uncover underlying accident patterns, delivering actionable statistical insights and standardized reporting to support data-driven policy planning.
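
To make the role-based masking idea from Secure Lens concrete, here is a minimal sketch. It is not the project's code. The regex detectors stand in for the Presidio and spaCy pipeline, and the names `PII_PATTERNS`, `ROLE_CAN_VIEW`, and `mask_for_role`, along with the three roles, are hypothetical and chosen only for illustration.

```python
import re

# Hypothetical regex detectors standing in for a real NER pipeline
# (the project itself uses Microsoft Presidio and spaCy).
PII_PATTERNS = {
    "EMAIL": re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+"),
    "PHONE": re.compile(r"\b\d{3}[-.\s]\d{3}[-.\s]\d{4}\b"),
}

# Hypothetical role policy: the entity types each role may see unmasked.
ROLE_CAN_VIEW = {
    "admin": {"EMAIL", "PHONE"},
    "analyst": {"EMAIL"},
    "guest": set(),
}


def mask_for_role(text: str, role: str) -> str:
    """Replace every detected entity the role may not view with a <TYPE> tag."""
    # Unknown roles get an empty allow-list, so everything is masked (deny by default).
    allowed = ROLE_CAN_VIEW.get(role, set())
    for entity_type, pattern in PII_PATTERNS.items():
        if entity_type not in allowed:
            text = pattern.sub(f"<{entity_type}>", text)
    return text


if __name__ == "__main__":
    record = "Contact jane.doe@example.com or 705-555-0199."
    for role in ("admin", "analyst", "guest"):
        print(role, "->", mask_for_role(record, role))
```

The masking runs on the server, on an in-memory copy of the text, so the stored record is never modified and a caller only ever receives the view its role allows. A role missing from the policy falls through to the empty allow-list, which is the deny-by-default behavior a zero-trust setup would want.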

