Inspiration
There is a growing global shortage of skilled mechanics. With heavy machinery downtime more expensive than ever, a breakdown can leave field workers waiting days to diagnose the issue, identify the part, and initiate a replacement.
We created CATVision to empower technicians and field workers to handle more of this manual process themselves. Rather than waiting days, they can identify damaged parts automatically and move toward repair.
What it does
CATVision is an inspection assistant for heavy machinery. The workflow:
- The field worker takes photos of each component, guided by a diagram of the vehicle
- After identifying the part, CATVision marks the component as Critical (Red), Needs Assistance (Orange), or Ok (Green) based on the official inspection and maintenance documentation
- If the part is Critical or Needs Assistance, CATVision displays a direct link to the replacement part on the Caterpillar Inc. website
- CATVision compiles an inspection report covering every item and its inspection status, with the option to email the CSV to the worker's manager
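As a rough illustration of the status tagging and CSV report described above, here is a minimal sketch. The names `Status`, `InspectionItem`, and `build_report_csv`, along with the column layout and the urgency-first ordering, are assumptions made for this example and are not taken from the CATVision codebase.

```python
import csv
import io
from dataclasses import dataclass
from enum import Enum


class Status(Enum):
    # The three inspection states named in the write-up.
    CRITICAL = "Critical (Red)"
    NEEDS_ASSISTANCE = "Needs Assistance (Orange)"
    OK = "Ok (Green)"


@dataclass
class InspectionItem:
    component: str
    status: Status
    replacement_url: str = ""  # hypothetical field; only relevant for non-Ok items


def build_report_csv(items):
    """Serialize inspection items to CSV text, most urgent first."""
    order = {Status.CRITICAL: 0, Status.NEEDS_ASSISTANCE: 1, Status.OK: 2}
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["component", "status", "replacement_url"])
    for item in sorted(items, key=lambda i: order[i.status]):
        # Ok parts never carry a replacement link in the report.
        url = "" if item.status is Status.OK else item.replacement_url
        writer.writerow([item.component, item.status.value, url])
    return buffer.getvalue()
```

The returned string could then be attached to an email or written to disk; how the report is actually delivered is outside the scope of this sketch.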
How we built it
We built CATVision using:
The Gemini API to run image detection on machine photos and detect whether components are damaged
A document parsing system to extract part information from the official safety and maintenance PDFs on Caterpillar's website
A structured parts database to map flagged components to the official product listings
A search layer to retrieve suggested part links and their pricing from Caterpillar Inc.
By combining vision, document intelligence, and product retrieval, we created a full inspection-to-purchase pipeline.
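One way to picture how these stages connect is the sketch below. It is not the project's actual code: `Finding`, `run_inspection`, and the `catalog` dictionary are hypothetical names, and the vision step is passed in as a plain callable so that no particular Gemini API call is assumed.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional


@dataclass
class Finding:
    part_name: str
    status: str  # "Critical", "Needs Assistance", or "Ok"
    part_url: Optional[str] = None


def run_inspection(
    photos: List[bytes],
    detect: Callable[[bytes], Finding],
    catalog: Dict[str, str],
) -> List[Finding]:
    """Run each photo through detection, then attach a listing link if needed."""
    findings = []
    for photo in photos:
        # Vision step: `detect` is a stand-in for the model call.
        finding = detect(photo)
        if finding.status != "Ok":
            # Retrieval step: look up the listing by normalized part name.
            key = finding.part_name.strip().lower()
            finding.part_url = catalog.get(key)
        findings.append(finding)
    return findings
```

Injecting `detect` keeps the pipeline testable with a fake detector, and a missing catalog entry simply leaves `part_url` as `None` rather than raising.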
Challenges we ran into
Many machinery components look similar, making accurate part identification difficult
Part naming conventions in manuals don’t always match product listing titles
Extracting structured data from maintenance PDFs required careful parsing
Mapping image-detected components to official SKUs required fuzzy matching and validation
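The fuzzy matching and validation mentioned above could look something like the following sketch, which uses `difflib.get_close_matches` from the Python standard library. The `normalize` and `match_sku` helpers, the 0.6 cutoff, and the SKU values in the usage note are assumptions for illustration; the write-up does not say which matching technique the team actually used.

```python
import difflib


def normalize(name):
    """Lowercase, treat hyphens as spaces, and collapse repeated whitespace."""
    return " ".join(name.lower().replace("-", " ").split())


def match_sku(detected_name, listings, cutoff=0.6):
    """Map a detected part name to a (sku, listing_title) pair.

    `listings` maps listing titles to SKUs. Returns None when no title is
    similar enough, which acts as the validation step: a weak match is
    rejected instead of being linked to the wrong part.
    """
    by_norm = {normalize(title): title for title in listings}
    candidates = difflib.get_close_matches(
        normalize(detected_name), list(by_norm), n=1, cutoff=cutoff
    )
    if not candidates:
        return None
    title = by_norm[candidates[0]]
    return listings[title], title
```

For example, with the made-up listings `{"Hydraulic Hose Assembly": "SKU-001", "Track Shoe": "SKU-002"}`, calling `match_sku("hydraulic-hose assy", listings)` resolves to the hose listing, while a name with no close title returns `None`. The cutoff would need tuning against real catalog data.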
Accomplishments that we’re proud of
Building an end-to-end pipeline from image capture to replacement link
Successfully extracting structured inspection data from complex maintenance documents
Reducing what could take hours of manual lookup to seconds
Creating a scalable foundation that could support thousands of machinery parts
Most importantly, we built something that directly reduces downtime
What we learned
AI is most powerful when it connects perception (vision) to action (procurement)
Real-world industrial systems require clean data pipelines, not just smart models
The biggest impact comes from solving workflow friction, not just technical challenges
Empowering workers is more valuable than replacing them
What’s next for CATVision
We want to introduce a hands-free “Glove Mode” by integrating voice AI through ElevenLabs, allowing technicians to navigate inspections and confirm parts without using their hands. We will also build a managerial dashboard to track inspection statuses, monitor recurring issues, and analyze downtime across job sites. To further reduce repair delays, we aim to integrate real-time dealer inventory and delivery timelines from Caterpillar Inc. and add downtime cost estimation to prioritize urgent fixes. Ultimately, we plan to expand to multiple machinery brands, enable automatic purchase order generation, and deploy a mobile-first version built specifically for field environments.
Built With
- flask
- gemini
- python
- react-native
- supabase

