Doodle World

Users can start by creating their own world or by choosing a template world to explore and express themselves in.
This is one of the preset worlds where users can start exploring and interacting with their 3D environment.
The whiteboard lets you add characters or objects to your world, alongside text or image generation.
Generated character from the whiteboard! :)

🌟 Inspiration

We were inspired by the idea of reimagining how we capture and relive our life experiences, not just as static memories but as immersive, interactive worlds.

We wanted to empower people to express themselves creatively and shape their own digital realities, giving them a way to reimagine their future through playful creation and exploration.

Our goal was to blur the line between imagination and reality - to make creating your own world as intuitive as doodling on a piece of paper.

🌍 What It Does

Doodle World lets users generate fully realized 3D worlds from simple inputs like images, sketches, or text prompts.
Once a world is created, users can:

Add and manipulate 3D models using sketches, natural language, or uploaded images.
Interact with the generated environment, models, and characters (NPCs).
Continuously refine or evolve their worlds through creative exploration.

In short, Doodle World turns creativity into an iterative, generative playground for imagination and self-expression.

🏗️ How We Built It

Doodle World: Technical Architecture

We developed Doodle World on a modern web stack designed for real-time 3D interaction, AI-driven content creation, and scalability.

Frontend & Real-Time 3D Engine: The core interactive experience is built with Three.js. This powerful library manages our rendering loop, first-person camera controls, user input, and physics-based object interactions directly in the browser.
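
The writeup does not say how the rendering loop and physics updates are scheduled, so the following is only a sketch of one common pattern: advancing the simulation in fixed time steps while rendering once per display frame. The names `advanceFixed`, `StepResult`, and `maxSteps` are hypothetical and chosen for this example.

```typescript
// Result of advancing the simulation for one rendered frame.
interface StepResult {
  steps: number;     // how many fixed steps were run
  remainder: number; // unsimulated time carried into the next frame
}

// Runs `step` zero or more times so the simulation advances in
// fixed-size increments regardless of the display frame rate.
function advanceFixed(
  accumulator: number,
  frameDelta: number,
  fixedStep: number,
  step: (dt: number) => void,
  maxSteps: number = 5,
): StepResult {
  let acc = accumulator + frameDelta;
  let steps = 0;
  while (acc >= fixedStep && steps < maxSteps) {
    step(fixedStep);
    acc -= fixedStep;
    steps++;
  }
  // If the cap was hit and time is still owed (e.g. after a long
  // tab pause), drop the backlog instead of trying to catch up.
  if (acc >= fixedStep) {
    acc = 0;
  }
  return { steps, remainder: acc };
}
```

In a browser, a function like this would typically be called once per animation frame, with `remainder` passed back in as the next `accumulator`, followed by a single Three.js `renderer.render(scene, camera)` call.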

AI Generation Core: Our creative pipeline is powered by Gemini, which acts as an intelligent interpreter for user input. It enhances simple sketches, expands text prompts, and analyzes photos to create depths for our 3D generation services.

3D Environment & Asset Pipelines: We use a multi-API approach for robust content generation. Worlds and models are generated via services like Meshy, Tripo3D, and World Labs' Marble API. Real-time updates to models, lighting, and world state are synchronized between users and the server using WebSockets.
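
As a rough illustration of the synchronization described above, the sketch below models world updates as typed messages and applies them to a local copy of the world state. The message kinds (`model-added`, `model-moved`, `light-changed`), the `WorldState` type, and `applyUpdate` are hypothetical names chosen for this example; they are not Doodle World's actual protocol.

```typescript
type Vec3 = [number, number, number];

// Hypothetical update messages; the real message format is an assumption.
type WorldUpdate =
  | { kind: "model-added"; id: string; url: string; position: Vec3 }
  | { kind: "model-moved"; id: string; position: Vec3 }
  | { kind: "light-changed"; intensity: number };

interface WorldState {
  models: Record<string, { url: string; position: Vec3 }>;
  lightIntensity: number;
}

// Returns a new state with the update applied; the input is not mutated.
function applyUpdate(state: WorldState, update: WorldUpdate): WorldState {
  switch (update.kind) {
    case "model-added":
      return {
        ...state,
        models: {
          ...state.models,
          [update.id]: { url: update.url, position: update.position },
        },
      };
    case "model-moved": {
      const existing = state.models[update.id];
      if (!existing) {
        return state; // ignore moves for models this client has not seen
      }
      return {
        ...state,
        models: {
          ...state.models,
          [update.id]: { ...existing, position: update.position },
        },
      };
    }
    case "light-changed":
      return { ...state, lightIntensity: update.intensity };
  }
}
```

Each client would parse incoming WebSocket messages as JSON and feed them through a function like `applyUpdate`, so every participant converges on the same state as long as updates arrive in the same order.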

Backend Services: Our Node.js backend orchestrates the entire platform. It manages user sessions, saves world data, handles asynchronous calls to our AI generation APIs, and serves optimized 3D assets to the client for a smooth, performant experience.

⚙️ Challenges We Ran Into

Mesh vs. Splat Rendering Conflicts: A key challenge was layering traditional physics meshes (.glb) on top of photorealistic Gaussian Splats (.spz). Balancing scene depth, aligning the two data types, and preventing texture conflicts without breaking the render pipeline required significant fine-tuning.
Achieving Performant Real-Time Rendering: This was our first major Three.js project, and the learning curve was steep. Optimizing scene loading, managing memory, and keeping rendering and physics smooth in the browser took extensive trial and error.
Dynamic World Stitching for 'Infinite' Exploration: Implementing a procedurally generated, "infinite" world was a major architectural hurdle. We had to develop a system for dynamically mapping and stitching together multiple scenes, each composed of a Gaussian Splat file (.spz) and a corresponding physics mesh (.glb). Aligning their coordinate systems and streaming these world "chunks" without performance degradation was a complex challenge.
Server & Asset Synchronization: Ensuring that all 3D assets loaded correctly and asynchronously was difficult. We spent considerable time debugging model and animation failures, especially when users would generate and introduce complex, high-poly assets into a shared environment.
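
To make the world-stitching challenge above more concrete, here is a minimal sketch of grid-based chunk bookkeeping. It assumes square chunks on a flat grid, with each chunk's splat (.spz) and physics mesh (.glb) placed at the same origin so their coordinate systems line up. `CHUNK_SIZE`, `chunkAt`, `chunkOrigin`, `chunksAround`, and `diffChunks` are hypothetical names; the actual system may be organized quite differently.

```typescript
type ChunkCoord = { cx: number; cz: number };

// Assumed chunk edge length in world units (not taken from the project).
const CHUNK_SIZE = 50;

// Grid cell containing the world-space point (x, z).
function chunkAt(x: number, z: number, size: number = CHUNK_SIZE): ChunkCoord {
  return { cx: Math.floor(x / size), cz: Math.floor(z / size) };
}

// World-space origin shared by a chunk's splat and its physics mesh.
function chunkOrigin(c: ChunkCoord, size: number = CHUNK_SIZE): [number, number, number] {
  return [c.cx * size, 0, c.cz * size];
}

function chunkKey(c: ChunkCoord): string {
  return `${c.cx},${c.cz}`;
}

// All chunks within `radius` cells of `center` (a square neighborhood).
function chunksAround(center: ChunkCoord, radius: number): ChunkCoord[] {
  const result: ChunkCoord[] = [];
  for (let dz = -radius; dz <= radius; dz++) {
    for (let dx = -radius; dx <= radius; dx++) {
      result.push({ cx: center.cx + dx, cz: center.cz + dz });
    }
  }
  return result;
}

// Which chunk keys to stream in and which to release.
function diffChunks(
  loaded: Set<string>,
  wanted: ChunkCoord[],
): { load: string[]; unload: string[] } {
  const wantedKeys = new Set(wanted.map(chunkKey));
  const load = Array.from(wantedKeys).filter((k) => !loaded.has(k));
  const unload = Array.from(loaded).filter((k) => !wantedKeys.has(k));
  return { load, unload };
}
```

As the player moves, recomputing `chunksAround(chunkAt(x, z), radius)` and diffing it against the loaded set gives a small list of chunks to fetch and a list to dispose, which keeps memory bounded while the world appears unbounded.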

🏅 Accomplishments We’re Proud Of

Unified Full-Stack Experience: We successfully integrated multiple complex domains—real-time 3D rendering, generative AI, WebAssembly physics, and backend services—into a single, cohesive web application. Our team's diverse expertise was the key to building a seamless user journey from a 2D prompt to an interactive 3D world.
Achieving Seamless Feature Integration: We systematically resolved complex bugs between our rendering, physics, and AI systems. By tackling difficult issues like animation drift, physics glitches, and rendering latency, we ensured that user-generated content functions correctly and cohesively within the world.
Radical Performance Optimization: Through targeted optimization of our 3D asset delivery pipeline, we successfully cut average model and animation load times by 40%. This enhancement is critical for maintaining an immersive, real-time experience as users populate their worlds with more complex creations.
We had fun!!!

💡 What We Learned

Mastering the Real-Time Web Stack: This project was a deep dive into the modern 3D web. We learned how to balance the computational demands of generative AI, the rendering complexity of Three.js and WebGL, and the performance requirements of a real-time physics engine, all within the constraints of a web browser.
The Art of Intuitive Tooling: We discovered that the most powerful creative tools are the ones that feel invisible. Our focus shifted from just building features to designing a workflow that closes the gap between a user's simple idea (a sketch, a photo, a few words) and the immediate, tangible experience of exploring it in 3D.
Technology Should Be Playful: Ultimately, building Doodle World reinforced a core belief: technology is at its best when it feels alive, personal, and playful. Our goal was not just to engineer a product, but to create a space for joyful, immediate creation.

🚀 What’s Next for Doodle World

We’re excited to expand Doodle World with next-gen features:

Apple Vision Pro Integration: Allowing for truly immersive world-building and exploration in spatial computing.
Interactive AI Agents: Moving beyond static models to generative characters that can react to you and the environment.
Natural Language NPC Communication: Enabling users to talk, collaborate, or learn from the characters they create.
Robust Content Moderation: Implementing AI-powered systems to ensure all shared worlds are safe and inclusive.
Collaborative Multiplayer: Building shared spaces where multiple users can co-create, merge their worlds, and build persistent, shared experiences together.