Inspiration
Inspired by Sherlock Holmes' incredible ability to deduce entire narratives from the smallest details, we wanted to empower real-world investigators with similar superpowers. Currently, forensic teams spend countless hours manually combing through 3D scene reconstructions and photos, a process that is slow, painstaking, and prone to human error. We built SceneSplat to bridge that gap, turning complex 3D scenes into actionable insights in minutes by combining AI-powered detection with an immersive 3D viewer.
What it does
SceneSplat is an interactive web application that transforms static 3D crime scene models into dynamic, intelligent environments for forensic analysis.
- Explore Immersive 3D Scenes: Investigators can load .glb models and navigate them with intuitive orbit, zoom, and pan controls, examining every angle of a reconstructed scene.
- Automate Evidence Detection with AI: With a single click, the app analyzes the scene to find objects, anomalies, and potential evidence, automatically placing labeled 3D markers with descriptions and confidence scores.
- Leverage Dual Analysis Modes:
  - Quick Analysis: Instantly extracts evidence from the 3D model's geometry and metadata for rapid insights.
  - Deep Vision Analysis: Captures a screenshot of the current view and sends it to Google Gemini Flash, using its powerful multimodal capabilities to visually identify points of interest just as a human would.
- Manage Cases and Evidence: The system organizes scenes into parent cases and child evidence items. Uploading a new .glb model to a case automatically adds it as a piece of evidence for individual inspection.
- Consolidate Notes and Files: Users can attach case notes and upload supporting files (images, documents) directly to a scene, keeping all investigative materials in one place.
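The parent-case and child-evidence organization described above could be modeled roughly as follows. This is a minimal sketch, not SceneSplat's actual data model: `CaseRecord`, `EvidenceItem`, `isGlb`, and `addUpload` are hypothetical names, and routing non-.glb files to a plain attachment list is an assumption.

```typescript
// Hypothetical types for illustration; not taken from the SceneSplat codebase.
interface EvidenceItem {
  id: string;
  fileName: string; // the uploaded .glb file
  notes: string[];
}

interface CaseRecord {
  id: string;
  title: string;
  evidence: EvidenceItem[]; // child evidence items, one per uploaded model
  attachments: string[]; // supporting files such as images and documents
}

// Case-insensitive check on the file extension.
function isGlb(fileName: string): boolean {
  return fileName.toLowerCase().endsWith(".glb");
}

// Returns a new case with the upload recorded: a .glb file becomes a child
// evidence item, anything else is kept as a supporting attachment.
function addUpload(record: CaseRecord, fileName: string): CaseRecord {
  if (isGlb(fileName)) {
    const item: EvidenceItem = {
      id: `${record.id}-ev${record.evidence.length + 1}`,
      fileName,
      notes: [],
    };
    return { ...record, evidence: [...record.evidence, item] };
  }
  return { ...record, attachments: [...record.attachments, fileName] };
}
```

Returning a new object instead of mutating the case keeps the function easy to use with React state updates.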
How we built it
We built SceneSplat using a modern, full-stack TypeScript architecture designed for performance and rapid development.
- Frontend Framework: Next.js with the App Router provided a robust foundation, allowing us to build a fast, server-aware single-page application and handle backend logic through API routes.
- 3D Rendering: We used React Three Fiber and drei to declaratively manage a three.js canvas, enabling us to render complex .glb models and overlay interactive UI elements smoothly in the browser.
- AI & Vision: The core intelligence is powered by the Google Gemini Flash model via the @google/generative-ai SDK. We engineered a sophisticated pipeline that sends visual data (screenshots) to Gemini and parses its structured JSON response to place 3D markers in the scene.
- UI/UX: The user interface was built with TailwindCSS for styling and Radix UI for accessible, unstyled component primitives, allowing us to create a clean and intuitive experience.
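To illustrate the response-parsing step, here is a minimal sketch of pulling a JSON array out of a model reply that may include extra text, then keeping only well-formed entries. The `Marker` shape and the function names are assumptions made for this example; the actual schema SceneSplat asks Gemini for is not stated here.

```typescript
// Hypothetical marker shape; the real response schema may differ.
interface Marker {
  label: string;
  description: string;
  confidence: number; // expected in the range 0..1
  position: [number, number, number];
}

// Slice from the first "[" to the last "]" so surrounding chatter or
// markdown fences are ignored. Returns [] when nothing parseable is found.
function extractJsonArray(text: string): unknown[] {
  const start = text.indexOf("[");
  const end = text.lastIndexOf("]");
  if (start === -1 || end <= start) return [];
  try {
    const parsed: unknown = JSON.parse(text.slice(start, end + 1));
    return Array.isArray(parsed) ? parsed : [];
  } catch {
    return [];
  }
}

// Runtime type guard: checks every field before the entry is trusted.
function isMarker(value: unknown): value is Marker {
  if (typeof value !== "object" || value === null) return false;
  const v = value as Record<string, unknown>;
  const label = v["label"];
  const description = v["description"];
  const confidence = v["confidence"];
  const position = v["position"];
  return (
    typeof label === "string" &&
    typeof description === "string" &&
    typeof confidence === "number" &&
    confidence >= 0 &&
    confidence <= 1 &&
    Array.isArray(position) &&
    position.length === 3 &&
    position.every((n: unknown) => typeof n === "number" && Number.isFinite(n))
  );
}

// Malformed entries are dropped instead of failing the whole analysis.
function parseMarkers(modelText: string): Marker[] {
  return extractJsonArray(modelText).filter(isMarker);
}
```

The bracket-slicing heuristic is deliberately simple: it would misfire if the surrounding text itself contained square brackets, so a stricter extraction may be needed in practice.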
The data flow is simple: a user selects a scene, which is rendered in the 3D viewer. They can then trigger an AI analysis, which sends a request to our Next.js backend. The backend processes the request, either by parsing the model file or by querying Gemini, and returns a list of evidence points that are visualized as interactive markers in the 3D space.
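The branching between the two analysis paths can be sketched as below. Everything in this block is hypothetical: `AnalysisRequest`, `EvidencePoint`, `VisionClient`, and the idea that the quick path works from node names are assumptions chosen to show the shape of the flow, not the backend's real interface.

```typescript
// All names are illustrative; they are not SceneSplat's actual API.
interface EvidencePoint {
  label: string;
  source: "quick" | "deep";
}

// A request carries either data parsed from the model (quick)
// or a screenshot of the current view (deep).
type AnalysisRequest =
  | { mode: "quick"; nodeNames: string[] }
  | { mode: "deep"; screenshotBase64: string };

// Stand-in for the vision-model call, injected so the flow can run offline.
type VisionClient = (screenshotBase64: string) => Promise<EvidencePoint[]>;

// Quick path (assumed): treat each non-empty node name as a candidate point.
function quickAnalysis(nodeNames: string[]): EvidencePoint[] {
  return nodeNames
    .filter((name) => name.trim().length > 0)
    .map((name): EvidencePoint => ({ label: name.trim(), source: "quick" }));
}

// Dispatches on the request mode and always resolves to evidence points.
async function analyze(
  request: AnalysisRequest,
  vision: VisionClient
): Promise<EvidencePoint[]> {
  if (request.mode === "quick") {
    return quickAnalysis(request.nodeNames);
  }
  return vision(request.screenshotBase64);
}
```

Because both paths resolve to the same `EvidencePoint[]` type, the viewer can render markers without knowing which mode produced them.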
Challenges we ran into
- 3D Coordinate Mapping: One of our biggest hurdles was mapping AI detections from a 2D screenshot back to accurate 3D positions within the scene's world space. This required careful prompt engineering and coordinate system translation.
- Structured AI Output: Getting Gemini to consistently return valid, structured JSON without extra conversational text required extensive prompt refinement and robust backend parsing logic.
- Performance Tuning: Rendering large, detailed .glb models in real time while maintaining a smooth user experience required careful optimization of the React Three Fiber scene.
- Dynamic File Handling: Building a system that could not only accept uploads but also dynamically recognize new .glb files and integrate them into the case hierarchy as child evidence was a complex but rewarding challenge.
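One common way to approach the 2D-to-3D mapping problem is to turn a screenshot pixel into a camera ray and intersect that ray with the scene. The sketch below is not necessarily the approach SceneSplat uses. It assumes a perspective camera that looks down the negative Z axis with no rotation, and it intersects with a flat ground plane instead of the model's meshes. All function names are hypothetical.

```typescript
type Vec3 = [number, number, number];

// Pixel coordinates (origin top-left) to normalized device coordinates
// in [-1, 1], with y pointing up.
function pixelToNdc(px: number, py: number, width: number, height: number): [number, number] {
  return [(px / width) * 2 - 1, 1 - (py / height) * 2];
}

// Unit ray direction for a perspective camera looking down -Z.
// fovYDeg is the vertical field of view; aspect is width / height.
function ndcToCameraRay(ndcX: number, ndcY: number, fovYDeg: number, aspect: number): Vec3 {
  const halfHeight = Math.tan((fovYDeg * Math.PI) / 360);
  const x = ndcX * halfHeight * aspect;
  const y = ndcY * halfHeight;
  const len = Math.hypot(x, y, 1);
  return [x / len, y / len, -1 / len];
}

// Intersects a ray with the horizontal plane y = planeY.
// Returns null when the ray is parallel to the plane or points away from it.
function intersectGround(origin: Vec3, dir: Vec3, planeY: number): Vec3 | null {
  if (Math.abs(dir[1]) < 1e-9) return null;
  const t = (planeY - origin[1]) / dir[1];
  if (t < 0) return null;
  return [origin[0] + t * dir[0], planeY, origin[2] + t * dir[2]];
}
```

In a three.js scene, the same idea is usually expressed with `Raycaster.setFromCamera` followed by `intersectObjects`, which also accounts for the camera's rotation and hits the actual geometry.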
Accomplishments that we're proud of
- We are incredibly proud of creating a full, end-to-end flow: from uploading a 3D model to seeing AI-generated evidence markers appear in the scene moments later.
- The reliable screenshot-to-detection pipeline using Gemini is our core technical achievement. It feels like magic to see the AI "look" at the scene and point out what's important.
- We achieved smooth 3D interactions, including the evidence markers and labels, within a React-based web application.
- We built a clean, professional, and highly functional UI within the time constraints of a hackathon.
What we learned
- The Power of Multimodal AI: We learned firsthand how to combine 3D rendering with advanced vision models. The possibilities for spatial analysis are immense.
- Prompt Engineering is Key: The quality of our AI's output was directly tied to the clarity and structure of our prompts. We learned to "think like the model" to get the best results.
- The Tradeoffs of Analysis: We learned to balance the speed of local metadata parsing with the deep, contextual understanding provided by a cloud-based vision model, offering users multiple ways to analyze a scene.
- Designing for a Niche Workflow: Building a tool for a specific, professional workflow like forensic investigation taught us a lot about prioritizing clarity, precision, and utility in UI/UX design.
What's next for SceneSplat
- Fine-Tuned Models: Expand the object taxonomy and improve detection precision and recall by training or fine-tuning specialized models for forensic analysis.
- Collaboration Features: Introduce multi-user support, allowing teams to collaborate on a case in real time, share notes, and manage access.
- Reporting and Chain of Custody: Implement features to export formal evidence reports and integrate with chain-of-custody tracking systems.
- Expanded Format Support: Add support for more 3D formats (like .ply for point clouds) and implement streaming for exceptionally large scene files.
- Offline and Edge Inference: Explore options for on-device or edge-based models to enable use in remote locations with limited internet connectivity.
Built With
- gemini
- next.js
- polycam
- react-three-fiber
- typescript
