From Perception to Imagination
This repository is a systematic exploration of world-models and embodiment for AI agents.
The goal is not robotics demos or game-playing agents.
The goal is to understand and implement how an agent models reality, predicts consequences, imagines futures, and grounds its decisions in a causal world.
This repo builds on prior work in:
- Memory Agents (continuity, learning, identity)
- Reasoning & Planning Agents (goals, strategies, failure awareness)
Here, intelligence is no longer abstract.
It is situated.
Most AI agents today:
- reason in text
- plan in symbols
- hallucinate actions
- ignore physics, cost, delay, and failure
Real intelligence requires a model of the world:
- what exists
- how it changes
- what actions are possible
- what actions are costly or irreversible
A world-model allows an agent to:
- predict before acting
- imagine futures
- learn from surprise
- ground planning in reality
This repository treats world-modeling as the bridge between cognition and embodiment.
World Models =
Perception → State → Dynamics → Imagination → Action → Feedback → Memory
Every project in this repository implements one piece of that loop.
Nothing is skipped. Nothing is assumed.
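The loop above can be sketched as a minimal toy agent. Every class and method name here is illustrative, not part of this repository's actual interfaces; the agent lives on a 1-D number line and "imagines" one step ahead.

```python
class ToyAgent:
    """Hypothetical skeleton of the world-model loop; every method is a stub."""

    def __init__(self):
        self.memory = []

    def perceive(self, obs):
        # Perception -> State: raw observation becomes a structured state
        return {"position": obs}

    def imagine(self, state, actions=(-1, 0, 1)):
        # Dynamics -> Imagination: predicted next position per candidate action
        return {a: state["position"] + a for a in actions}

    def choose_action(self, state, futures, goal=0):
        # Imagination -> Action: pick the action whose imagined future is closest to the goal
        return min(futures, key=lambda a: abs(futures[a] - goal))

    def act(self, state, action):
        # Action -> Feedback: here the world happens to match the prediction
        return state["position"] + action

    def remember(self, state, action, outcome):
        # Feedback -> Memory: store the full experience tuple
        self.memory.append((state, action, outcome))

agent = ToyAgent()
state = agent.perceive(3)
futures = agent.imagine(state)
action = agent.choose_action(state, futures)
outcome = agent.act(state, action)
agent.remember(state, action, outcome)
```

Each project below replaces one of these stubs with a real implementation.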
Perception as Belief
Build a state encoder that converts raw observations into a compact, structured representation of the world.
Focus:
- what the agent believes exists
- what is known vs unknown
- uncertainty and partial observability
Output is not pixels or text, but a world state the agent reasons over.
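One way to sketch such a belief state, assuming a simple dict-based encoding (the `BeliefState` structure and the fixed uncertainty scores are illustrative, not the repo's actual encoder):

```python
from dataclasses import dataclass, field

@dataclass
class BeliefState:
    """What the agent believes exists, with per-feature uncertainty."""
    known: dict = field(default_factory=dict)        # feature -> believed value
    uncertainty: dict = field(default_factory=dict)  # feature -> uncertainty score

def encode(observation, expected_features):
    """Convert a raw observation into a compact belief state.

    Features the agent expects but did not observe are marked
    maximally uncertain -- this is where partial observability lives.
    """
    belief = BeliefState()
    for f in expected_features:
        if f in observation:
            belief.known[f] = observation[f]
            belief.uncertainty[f] = 0.1   # observed: low uncertainty (arbitrary score)
        else:
            belief.uncertainty[f] = 1.0   # unobserved: unknown
    return belief

b = encode({"door": "closed"}, ["door", "key"])
```

The point is the shape of the output: a structured known/unknown split, not raw pixels or text.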
How the World Changes
Model how the world evolves given:
- current state
- chosen action
Learn or simulate:
- state transitions
- stochastic outcomes
- failure probabilities
This is the foundation of prediction.
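A minimal sketch of such a dynamics model, assuming a hand-written tabular transition function (the state names and probabilities are made up for illustration):

```python
import random

# Hypothetical tabular dynamics: (state, action) -> list of
# (next_state, probability) outcomes, including failure modes.
TRANSITIONS = {
    ("door_closed", "open"): [("door_open", 0.9), ("door_closed", 0.1)],  # 10% failure
    ("door_open", "walk_through"): [("inside", 1.0)],
}

def predict(state, action):
    """Return the outcome distribution without acting -- pure prediction."""
    return TRANSITIONS.get((state, action), [(state, 1.0)])  # unknown actions: no-op

def sample(state, action, rng=random):
    """Simulate one stochastic transition from the predicted distribution."""
    outcomes = predict(state, action)
    r, cum = rng.random(), 0.0
    for nxt, p in outcomes:
        cum += p
        if r <= cum:
            return nxt
    return outcomes[-1][0]
```

A learned model would replace the table, but the interface is the same: predict before acting, sample to imagine.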
What the Body Can and Cannot Do
Define:
- available actions
- invalid actions
- action costs (time, energy, risk)
This prevents impossible plans and grounds reasoning in physical constraints.
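A sketch of such an action schema, assuming each action declares preconditions and a cost vector (all action names, preconditions, and numbers here are invented for illustration):

```python
# Hypothetical action schema: preconditions gate what is possible,
# costs gate what is affordable or too risky.
ACTIONS = {
    "open_door": {"requires": {"at_door"}, "cost": {"time": 2, "energy": 1, "risk": 0.0}},
    "jump_gap":  {"requires": {"at_gap"},  "cost": {"time": 1, "energy": 5, "risk": 0.3}},
}

def available_actions(state_facts, max_risk=0.2):
    """Filter the action set before planning: drop actions whose
    preconditions are unmet or whose risk exceeds the agent's tolerance."""
    return [
        name for name, spec in ACTIONS.items()
        if spec["requires"] <= state_facts          # preconditions satisfied
        and spec["cost"]["risk"] <= max_risk        # within risk budget
    ]
```

The planner then only ever sees actions that are physically possible and acceptably cheap.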
Thinking Before Acting
Simulate future trajectories internally using the world-model.
Capabilities:
- multi-step rollouts
- branching futures
- uncertainty-aware planning
Planning becomes internal simulation, not prompt chaining.
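The difference is visible in code: planning enumerates imagined futures instead of emitting text. A minimal sketch, assuming deterministic 1-D dynamics (everything here is a toy stand-in for a learned model):

```python
import itertools

def step(pos, action):
    """Toy deterministic dynamics: the action moves the position."""
    return pos + action

def rollout(pos, plan):
    """Simulate a multi-step trajectory internally, without acting."""
    trajectory = [pos]
    for a in plan:
        pos = step(pos, a)
        trajectory.append(pos)
    return trajectory

def best_plan(pos, goal, depth=3, actions=(-1, 0, 1)):
    """Branch over every action sequence up to `depth` (the imagined
    futures) and pick the one ending closest to the goal."""
    plans = itertools.product(actions, repeat=depth)
    return min(plans, key=lambda p: abs(rollout(pos, p)[-1] - goal))
```

With a stochastic dynamics model, the same search would score plans by expected outcome rather than a single rollout.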
Situational Learning
Store experiences as:
- state
- action
- outcome
- surprise or error
Memory retrieval becomes context-aware: “What happened last time I tried this here?”
This tightly couples world-models with memory agents.
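A sketch of that coupling, assuming a flat list of episode dicts (a real store would index or embed states; the situations here are invented):

```python
# Hypothetical episodic store: each entry couples state, action,
# outcome, and a surprise score (the prediction error at the time).
episodes = []

def remember(state, action, outcome, surprise):
    episodes.append({"state": state, "action": action,
                     "outcome": outcome, "surprise": surprise})

def recall(state, action):
    """'What happened last time I tried this here?' -- retrieve past
    episodes matching the current state and action."""
    return [e for e in episodes if e["state"] == state and e["action"] == action]

remember("wet_floor", "run", outcome="slipped", surprise=0.9)
remember("dry_floor", "run", outcome="ok", surprise=0.1)
```

Retrieval is keyed by situation, not by text similarity: the same action queried in a different state returns different memories.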
Learning from Being Wrong
Detect gaps between:
- predicted next state
- actual next state
Log:
- blind spots
- model errors
- unreliable assumptions
Trigger reflection, caution, or model updates.
This is embodied introspection.
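A minimal sketch of the detection step, assuming numeric state features and an arbitrary threshold (both are illustrative choices, not fixed by the repo):

```python
# Hypothetical surprise check: compare the model's predicted next state
# with the observed one and decide whether to flag a blind spot.
SURPRISE_THRESHOLD = 0.5  # illustrative; would be tuned or learned

def surprise(predicted, actual):
    """Mean absolute error over the numeric features both states share."""
    keys = predicted.keys() & actual.keys()
    return sum(abs(predicted[k] - actual[k]) for k in keys) / max(len(keys), 1)

def on_transition(predicted, actual, error_log):
    """Log large gaps between prediction and reality and signal an update."""
    s = surprise(predicted, actual)
    if s > SURPRISE_THRESHOLD:
        error_log.append({"predicted": predicted, "actual": actual, "surprise": s})
        return "update_model"   # trigger reflection, caution, or retraining
    return "ok"
```

The log itself is the inventory of blind spots and unreliable assumptions.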
Why the Agent Cares
Goals are not hardcoded.
They emerge from:
- repeated rewards
- constraints
- survival pressure
- long-term outcomes
The agent learns what matters by interacting with reality.
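One hedged way to picture goal emergence is a running value estimate that only ever moves when reality delivers reward (the situations, rewards, and learning rate below are invented for illustration):

```python
# Hypothetical sketch: nothing says "food matters" anywhere in the code.
# Preference emerges from repeated experienced rewards.
values = {}

def update_value(situation, reward, lr=0.5):
    """Exponential moving average of experienced reward per situation."""
    v = values.get(situation, 0.0)
    values[situation] = v + lr * (reward - v)

# Interaction with the world, not hardcoding, makes "near_food" matter.
for _ in range(10):
    update_value("near_food", reward=1.0)
    update_value("near_wall", reward=0.0)
```

After repeated interaction the agent's value table encodes what matters, without any goal having been written down.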
What this repository is not:
- Not a robotics hardware repo
- Not a game AI repo
- Not an RL benchmark zoo
- Not a demo-first project
This is a conceptual and implementation-level study of embodied intelligence.
How the work is organized:
- Each project is standalone
- Projects are meant to be built sequentially
- Earlier abstractions are reused, not rewritten
- Evaluation is explicit and documented
This repo is designed to:
- support workshops
- enable consulting
- act as a research playground
- serve as a foundation for physical AGI systems
Memory → Reasoning → World Models → Embodiment → Physical AGI
This repository represents the missing link between cognition and reality.