AI/ML Engineer | Agentic AI & LLMs | Computer Vision
AI/ML Engineer working at the intersection of deep learning and agentic AI. Experienced in building high-accuracy computer vision pipelines and LLM-driven diagnostic tools that turn complex data into actionable intelligence. My work combines agentic AI and machine learning to create self-evolving systems capable of real-time decision-making. Actively seeking Spring 2026 opportunities to scale autonomous solutions.
- Generative AI & LLMs: Agentic Workflows (LangGraph, CrewAI), Knowledge Graphs (Neo4j), Prompt Engineering, Retrieval-Augmented Generation (RAG), OpenAI, Gemini, Whisper (Speech-to-Text), and Hugging Face Transformers.
- Database & MLOps: Cypher Query Language, Vector Databases (Pinecone, Qdrant, Milvus, Weaviate, Chroma), Knowledge Graph Construction, SQL/PLSQL (Optimization & Automation), Oracle EBS, Docker, Cloud Platforms (AWS, GCP, Azure), and Gradio (Deployment).
- Deep Learning & Vision: YOLO (v8), YOLO-World (VLM), Vision Transformers (ViT), Ensemble Transfer Learning (ResNet, DenseNet, VGG), OpenCV, and Image Segmentation.
- Machine Learning: Supervised/Unsupervised Learning, Multi-Agent Reinforcement Learning (MARL), NLP (Intent Classification, NER), and Time-Series Forecasting (LSTM).
- Robotics & Engineering: ROS2 (Foxy), Robot Control (Unitree Go2), PyTorch, TensorFlow, Linux/Unix, and Git.
Researcher | A2IL Lab, University at Buffalo (Aug 2025 – Dec 2025)
- Developed the B.O.L.T (Behavioral Object Locomotion & Tracking) system for the Unitree Go2 quadruped, enabling autonomous, real-time vision-guided object following.
- Designed a decoupled, low-latency architecture that separates perception from motion control to ensure control decisions are driven by the most recent sensory feedback.
- Engineered a perception pipeline using a quantized YOLOv8n model optimized with ONNX for edge deployment on a Jetson Nano, achieving a near-perfect mAP@0.5 of 0.995.
- Implemented a Finite-State Machine (FSM) control layer to abstract noisy perception data into stable behaviors: Search, Approach, Creep, Hold, and Backup.
- Achieved an average end-to-end system latency of 48 ms with a target tracking success rate exceeding 90% in indoor environments.
- Optimized real-world distance and orientation estimation using camera intrinsic calibration and focal-length-based geometric depth approximation.
- Created and annotated a custom dataset of target objects, applying geometric and photometric augmentations to improve model generalization under varied lighting and backgrounds.
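The FSM control layer described above can be illustrated with a minimal sketch. The five state names come from the project description; the distance thresholds, the `confirm_frames` debounce rule, and the names `propose_state` and `BehaviorFSM` are hypothetical and are not taken from the B.O.L.T code.

```python
from enum import Enum, auto
from typing import Optional


class State(Enum):
    SEARCH = auto()
    APPROACH = auto()
    CREEP = auto()
    HOLD = auto()
    BACKUP = auto()


# Hypothetical distance thresholds in metres (assumed for illustration only).
TOO_CLOSE_M = 0.4
HOLD_M = 0.8
CREEP_M = 1.5


def propose_state(distance_m: Optional[float]) -> State:
    """Map one distance estimate (None = no detection) to a candidate behavior."""
    if distance_m is None:
        return State.SEARCH
    if distance_m < TOO_CLOSE_M:
        return State.BACKUP
    if distance_m < HOLD_M:
        return State.HOLD
    if distance_m < CREEP_M:
        return State.CREEP
    return State.APPROACH


class BehaviorFSM:
    """Switch state only after the same candidate is seen on several consecutive frames."""

    def __init__(self, confirm_frames: int = 3) -> None:
        self.state = State.SEARCH
        self.confirm_frames = confirm_frames
        self._candidate = State.SEARCH
        self._count = 0

    def update(self, distance_m: Optional[float]) -> State:
        proposed = propose_state(distance_m)
        if proposed == self.state:
            # Observation agrees with the current behavior: drop any pending switch.
            self._candidate = proposed
            self._count = 0
        elif proposed == self._candidate:
            self._count += 1
        else:
            self._candidate = proposed
            self._count = 1
        if self._candidate != self.state and self._count >= self.confirm_frames:
            self.state = self._candidate
            self._count = 0
        return self.state
```

With `confirm_frames=3`, a single dropped detection while approaching does not send the robot back to Search; the state changes only after three consecutive frames agree.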
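Focal-length-based distance and orientation estimation can be sketched with the pinhole camera model. This is a minimal illustration, assuming a target of known physical width and a calibrated focal length in pixels; the function names and parameters are hypothetical, not the project's actual code.

```python
import math


def estimate_distance_m(focal_px: float, real_width_m: float, bbox_width_px: float) -> float:
    """Pinhole approximation: distance = focal length * real width / apparent width."""
    if bbox_width_px <= 0:
        raise ValueError("bounding-box width must be positive")
    return focal_px * real_width_m / bbox_width_px


def estimate_bearing_rad(focal_px: float, principal_x_px: float, bbox_center_x_px: float) -> float:
    """Horizontal angle to the target; positive when the target is right of the image center."""
    return math.atan2(bbox_center_x_px - principal_x_px, focal_px)
```

For example, with an assumed focal length of 600 px, a 0.2 m wide target that appears 120 px wide is estimated to be about 1 m away.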
Analyst | Capgemini (Dec 2022 – May 2024)
- Optimized complex SQL queries and PL/SQL packages within Oracle EBS, resulting in a 40% reduction in execution time for high-volume supply chain operations.
- Automated critical data transformations and streamlined business processes across Order Management, Procurement, and Inventory Control systems.
- Resolved over 100 production incidents and maintained 100% SLA compliance while supporting mission-critical batch jobs for an aerospace client.
- Developed structured operational dashboards and reports focused on BOMs and inventory to support strategic stakeholder decision-making.
ML Intern | EQlibria (Dec 2021 – Feb 2022)
- Deployed speech-to-text and summarization pipelines using OpenAI Whisper and Gemini APIs to process long-form therapist-patient video sessions.
- Built a context-aware multilingual chatbot utilizing NLP (NER and intent classification) and Naive Bayes models to achieve 93% accuracy in user personality classification.
- Developed web scraping pipelines to curate metadata for therapist discovery and integrated this data directly into the mobile application engine.
- Implemented class weighting techniques to resolve data imbalance, ensuring robust performance for automated mental health interventions.
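Class weighting for imbalanced data can be sketched as follows. The exact scheme used in the project is not stated; this example assumes the common "balanced" heuristic, where each class weight is n_samples / (n_classes * class_count), and the function name is hypothetical.

```python
from collections import Counter
from typing import Dict, Hashable, Sequence


def balanced_class_weights(labels: Sequence[Hashable]) -> Dict[Hashable, float]:
    """Give rarer classes proportionally larger weights (inverse-frequency heuristic)."""
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {cls: n_samples / (n_classes * count) for cls, count in counts.items()}
```

With 8 samples of one class and 2 of another, the majority class gets weight 0.625 and the minority class 2.5, so each class contributes equally to a weighted loss.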
Digital Image Processing Intern | Grey Scientific Labs (Sept 2021 – Oct 2021)
- Engineered image preprocessing and noise reduction pipelines for medical datasets, organizing 400+ DICOM images for AI-based diagnostic analysis.
- Developed automation scripts to convert and stitch high-resolution microscopic medical scans, reducing manual preprocessing time by 15%.
Buffalo accident risk prediction & resource allocation: Architected and deployed a custom Multi-Agent Reinforcement Learning (MARL) system to optimize network-wide emergency response for 145 units. Utilized a PPO-driven policy agent to process live data feeds and trigger autonomous, real-time resource reallocation decisions.
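For context on the PPO-driven policy, the standard per-sample clipped surrogate objective is sketched below. This is the generic PPO formulation, not the project's training code; the function name and the default `eps=0.2` are assumptions.

```python
def ppo_clip_objective(ratio: float, advantage: float, eps: float = 0.2) -> float:
    """Clipped surrogate: min(r * A, clip(r, 1 - eps, 1 + eps) * A).

    ratio is pi_new(a|s) / pi_old(a|s); advantage is the estimated advantage A.
    """
    clipped_ratio = max(1.0 - eps, min(ratio, 1.0 + eps))
    return min(ratio * advantage, clipped_ratio * advantage)
```

Clipping removes the incentive to move the new policy far from the old one in a single update, which is the property that makes PPO comparatively stable to train.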
DeepSpeech: Built a real-time acoustic pattern recognition system using a Convolutional Autoencoder to diagnose speech disorders with 96% accuracy. Integrated an LLM feedback loop via GPT-4 prompt engineering to deliver personalized, context-aware articulation tips based on identified faulty articulation.
DermAI: Engineered a diagnostic system for dermatological image classification leveraging Ensemble Transfer Learning (ResNet50, DenseNet121, VGG-19, EfficientNet-B0) to improve robustness over custom models. Deployed a web-based tool using Gradio that combines real-time classification with GPT-4 feedback and Knowledge Graph concepts for grounded recommendations.
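One common way to combine several fine-tuned backbones is soft voting, where per-model class probabilities are averaged before taking the argmax. How DermAI actually fuses its models is not stated, so the sketch below is an assumption, and `softmax` and `soft_vote` are illustrative names.

```python
import math
from typing import List, Sequence, Tuple


def softmax(logits: Sequence[float]) -> List[float]:
    """Numerically stable softmax over one model's class logits."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def soft_vote(per_model_logits: Sequence[Sequence[float]]) -> Tuple[int, List[float]]:
    """Average class probabilities across models and return (predicted class, averaged probabilities)."""
    probs = [softmax(logits) for logits in per_model_logits]
    n_models = len(probs)
    averaged = [sum(column) / n_models for column in zip(*probs)]
    return averaged.index(max(averaged)), averaged
```

Averaging probabilities lets a confident model outweigh an uncertain one, which tends to be more robust than majority voting on hard labels when the models disagree.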
- 2nd Place, AI For Good Hackathon (Spring 2025): Developed an AI-driven snow management solution using LSTM models.
- Best Student Volunteer of the Standing Committee of the Section - SAC (IEEE Bombay Section, 2022)
- Winner: IEEE R10 Connect Logo Design Competition
- MS in Artificial Intelligence | University at Buffalo (Dec 2025)
- BE in Electronics & Telecommunication | University of Mumbai (May 2022)
- LinkedIn: linkedin.com/in/sahilsawant01
- Email: [email protected]
- Portfolio: github.com/sahillarious

