Software Engineer • AI/ML Systems • Graduate CS Student

Vishal Vilas Shinde

Graduate computer science student at California State University, Fullerton, focused on Applied AI and LLM engineering. Developed and shipped AI-powered products using fine-tuned LLMs and retrieval-augmented generation (RAG) systems. Strong backend foundation from prior industry internships.

Based in California, United States

Email [email protected]



Experience

Engineering, research, and teaching work

Experience across teaching infrastructure, computer vision research, backend systems, and full-stack product engineering with a focus on performance, reliability, and measurable outcomes.

Teaching Assistant

California State University, Fullerton

California, USA

Aug 2025 - Present

Hold office hours and lead lab sessions for CPSC 223P: Python Programming.
Built a GitHub Actions CI/CD pipeline with pytest-based autograding, eliminating manual test runs for 39 student submissions per week.
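
A pytest-based autograder of this kind typically comes down to test functions that exercise a student's code. The sketch below is illustrative only and is not taken from the course: `word_count` stands in for a hypothetical assignment and is defined inline so the example is self-contained, whereas a real grader would import it from the submission.

```python
# Hypothetical student submission. In a real setup this would be imported
# from the student's module instead of being defined here.
def word_count(text):
    """Count case-insensitive occurrences of each whitespace-separated word."""
    counts = {}
    for word in text.lower().split():
        counts[word] = counts.get(word, 0) + 1
    return counts


# pytest collects functions whose names start with "test_" and reports
# each failing assert as a failed test.
def test_counts_repeated_words():
    assert word_count("a b a") == {"a": 2, "b": 1}


def test_is_case_insensitive():
    assert word_count("Hi hi") == {"hi": 2}


def test_empty_string_has_no_words():
    assert word_count("") == {}
```

Running `pytest` against a file like this in a CI job produces a pass or fail result per test, which is what replaces the manual test runs.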

Research Assistant

California State University, Fullerton

California, USA

Jun 2025 - Present

Working with Dr. Sampson Akwafuo on a deep learning pipeline for retinal image segmentation, using Neural Style Transfer augmentation to mitigate domain shift and overfitting, and applying Grad-CAM-based visualization for model interpretability.
Achieved 91% accuracy and a quadratic weighted kappa (QWK) of 0.9167.
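
For readers unfamiliar with the metric, QWK (quadratic weighted kappa) measures agreement between predicted and true ordinal labels, penalizing each disagreement by the squared distance between the two labels. The following is a minimal sketch of the standard formula in plain Python. The function name and arguments are illustrative and are not taken from the research code.

```python
def quadratic_weighted_kappa(y_true, y_pred, num_classes):
    """Cohen's kappa with quadratic weights for integer labels in [0, num_classes)."""
    n = len(y_true)
    if n == 0 or n != len(y_pred):
        raise ValueError("inputs must be non-empty and of equal length")
    if num_classes < 2:
        raise ValueError("num_classes must be at least 2")

    # Observed confusion matrix: rows are true labels, columns are predictions.
    observed = [[0] * num_classes for _ in range(num_classes)]
    for t, p in zip(y_true, y_pred):
        observed[t][p] += 1

    true_hist = [sum(row) for row in observed]
    pred_hist = [sum(observed[i][j] for i in range(num_classes))
                 for j in range(num_classes)]

    weighted_observed = 0.0
    weighted_expected = 0.0
    for i in range(num_classes):
        for j in range(num_classes):
            weight = (i - j) ** 2 / (num_classes - 1) ** 2
            # Expected count if true labels and predictions were independent.
            expected = true_hist[i] * pred_hist[j] / n
            weighted_observed += weight * observed[i][j]
            weighted_expected += weight * expected

    if weighted_expected == 0.0:
        # Degenerate case: every label and prediction is the same single class.
        return 1.0
    return 1.0 - weighted_observed / weighted_expected
```

A value of 1.0 means perfect agreement, and 0.0 means agreement no better than chance.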

Software Engineering Intern

Digital Product School by UnternehmerTUM

Munich, Germany

May 2023 - Jul 2023

Developed an MVC-based backend API using Spring Boot and PostgreSQL to ingest and persist real-time data streams from 4 radar sensors. Added indexes on the sensor ID and timestamp columns to keep activity queries fast under continuous writes, and used async processing to keep the API non-blocking during concurrent inserts.
Improved a scikit-learn activity classification model from 74% to 91% accuracy by analyzing radar sensor feature distributions and correcting feature construction that caused the model to conflate overlapping activity states.

Full-stack Developer Intern

Mezchip

Bengaluru, India

Jul 2022 - Oct 2022

Built backend APIs for an omnichannel customer support platform in Flask, serving 1,000-plus monthly users, including message send and read, ticket management, paginated message history, and Shopify and Instamojo store integrations.
Implemented JWT authentication.
Reduced chatbot widget bundle size by 42% by removing a third-party timing dependency and rewriting message timestamp logic in vanilla JavaScript.
Optimized the search functionality by introducing debounced autocomplete, eliminating redundant API requests.

Projects

Selected work

LLM tooling, cloud-native RAG systems, developer infrastructure, and real-time communication experiments built for practical use and clear engineering tradeoffs.

DoodleDojo

Built an AI-powered sketching coach that turns photos or text prompts into simplified line-art references, guides users through stroke-by-stroke drawing sessions with real-time Gemini live voice feedback, and animates final sketches into short videos.

Won 1st place at the Google DeepMind Hackathon with an interactive AI drawing coach experience
Engineered a custom stroke extraction pipeline in Python using OpenCV, scikit-image, and NetworkX

Gemini FastAPI Next.js Canvas Drawing Python

Submission GitHub

SafePrompt

Fine-tuned Llama 3.2-3B with QLoRA on 34,800 PII redaction samples, using a custom collator that masks the loss on prompt tokens (located via the prefix length delta) so the model trains only on output tokens. Adapters are published on Hugging Face for public use.
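
The loss-masking idea can be sketched as follows. This is a simplified illustration, not the project's collator: it works on plain lists of token IDs, takes the prompt length as given (in practice it would presumably come from comparing the tokenized prompt prefix with the full tokenized example), and uses -100, the label value that PyTorch's cross-entropy loss ignores by default. The function names, padding ID, and batch layout are assumptions.

```python
IGNORE_INDEX = -100  # label value skipped by PyTorch's CrossEntropyLoss by default


def mask_prompt_labels(input_ids, prompt_len, pad_to=None, pad_id=0):
    """Return (input_ids, labels, attention_mask) with loss limited to output tokens."""
    if not 0 <= prompt_len <= len(input_ids):
        raise ValueError("prompt_len must lie within the sequence")

    ids = list(input_ids)
    # Prompt positions get IGNORE_INDEX so they contribute nothing to the loss.
    labels = [IGNORE_INDEX] * prompt_len + ids[prompt_len:]
    attention_mask = [1] * len(ids)

    if pad_to is not None:
        if pad_to < len(ids):
            raise ValueError("pad_to is shorter than the sequence")
        pad = pad_to - len(ids)
        ids += [pad_id] * pad
        labels += [IGNORE_INDEX] * pad  # padding is never trained on
        attention_mask += [0] * pad

    return ids, labels, attention_mask


def collate(examples, pad_id=0):
    """Collate (input_ids, prompt_len) pairs, padding to the longest sequence."""
    longest = max(len(ids) for ids, _ in examples)
    rows = [mask_prompt_labels(ids, plen, pad_to=longest, pad_id=pad_id)
            for ids, plen in examples]
    return {
        "input_ids": [r[0] for r in rows],
        "labels": [r[1] for r in rows],
        "attention_mask": [r[2] for r in rows],
    }
```

In a real training loop these lists would be converted to tensors before being passed to the model.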

Developed a CPU-only FastAPI inference service that loads the LoRA adapter offline on commodity hardware.
Evaluated on 300 held-out samples, achieving 0.90 placeholder micro-F1 and 0.0% formatting error rate; shipped a Chrome MV3 extension.
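
Micro-F1 pools true positives, false positives, and false negatives across all samples before computing precision and recall. The sketch below shows one way to compute it over placeholders. How the project matches placeholders is not stated, so this assumes exact string matches counted as multisets per sample. The placeholder strings shown in the usage note are hypothetical.

```python
from collections import Counter


def placeholder_micro_f1(gold, predicted):
    """Micro-averaged F1 over per-sample lists of placeholder strings."""
    if len(gold) != len(predicted):
        raise ValueError("gold and predicted must have the same number of samples")

    tp = fp = fn = 0
    for gold_items, pred_items in zip(gold, predicted):
        gold_counts = Counter(gold_items)
        pred_counts = Counter(pred_items)
        # Multiset intersection: each gold placeholder can be matched at most once.
        overlap = sum((gold_counts & pred_counts).values())
        tp += overlap
        fp += sum(pred_counts.values()) - overlap
        fn += sum(gold_counts.values()) - overlap

    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)
```

For example, with gold placeholders `[["[NAME]", "[EMAIL]"], ["[PHONE]"]]` and predictions `[["[NAME]"], ["[PHONE]", "[EMAIL]"]]`, the pooled counts are 2 true positives, 1 false positive, and 1 false negative.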

QLoRA Llama 3.2 FastAPI Web Extension

Model Adapter Repository

AskCourse

Built and deployed a RAG API on the Cloudflare Workers Python runtime (Pyodide), bridging JS and Python object boundaries via JsProxy and Object.fromEntries for all AI, Vectorize, and KV bindings.

Deterministic ingestion pipeline from CSV to strict NDJSON metadata
Rate limiting, observability, and staging/production wrangler environments
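
A deterministic CSV-to-NDJSON step like the one listed above can be sketched as follows. This is an illustration under stated assumptions rather than the project's pipeline: the required columns (`id`, `title`) are a hypothetical schema, "strict" is interpreted as rejecting rows with missing values, and determinism is obtained by sorting records by `id` and serializing with sorted keys.

```python
import csv
import io
import json

REQUIRED_FIELDS = ("id", "title")  # hypothetical schema for illustration


def csv_to_ndjson(csv_text):
    """Convert CSV text to NDJSON with a stable record and key order."""
    reader = csv.DictReader(io.StringIO(csv_text))
    records = []
    for row_no, row in enumerate(reader, start=1):
        record = {}
        for field in REQUIRED_FIELDS:
            value = (row.get(field) or "").strip()
            if not value:
                raise ValueError(f"data row {row_no}: missing required field {field!r}")
            record[field] = value
        records.append(record)

    # Sorting by id makes the output independent of the input row order.
    records.sort(key=lambda r: r["id"])
    return "".join(
        json.dumps(r, sort_keys=True, separators=(",", ":"), ensure_ascii=False) + "\n"
        for r in records
    )
```

Because the same input always yields byte-identical output, reruns of the ingestion step can be diffed or hashed to detect real changes.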

RAG Llama 3.1 Cloudflare Workers Vectorize

Try Repository

Learn WebRTC

Built a full-stack WebRTC playground with custom signaling for real-time video, audio, and data channel communication, plus an interactive debugging assistant.

Custom Node.js / Socket.io signaling server and RTCPeerConnection flows
Gemini-powered debugging assistant for SDP, signaling, and ICE analysis

TypeScript WebRTC Next.js Socket.io Google Gemini

Try Repository

Contact

Let’s build something useful

I’m open to Applied AI and backend opportunities. Reach out for collaboration, internships, or full-time roles.