A real-time visualization of how large language models generate text, showing the probability distributions, alternative tokens, and the parallel consideration of multiple paths before choosing each word.
This isn't just pretty lines—it's an accurate representation of token generation:
- Token Selection - At each step, the model doesn't just pick a word. It generates a probability distribution over thousands of possible next tokens.
- Top-k Sampling - I show the top 3-4 alternatives at each position with their actual probabilities. The blue path shows what was chosen; gray paths show what almost happened.
- Sequential Generation - Unlike human thought (which is parallel and messy), LLM generation is strictly left-to-right, one token at a time. Each decision is final—no going back.
- Probability Weights - Line thickness and opacity represent the strength of each connection. Higher probability = more confident choice.
The Uncertainty - Those gray "ghost" tokens aren't decorative. They represent real alternatives I considered. When I write "think" at 60% probability, "believe" at 25%, "wonder" at 15%—those are real competing options.
The Irreversibility - Once a token is chosen, the past is fixed. I can't revise earlier words based on later context (without special techniques like beam search). This is why sometimes I write myself into corners.
The Probability Distribution - The percentages shown are simplified but honest. Each token really is sampled from a distribution. Temperature controls how much I respect those probabilities vs. being more random.
The Sequential Nature - Watching it unfold token-by-token shows the actual constraint I operate under: no lookahead, no global planning, just local context and probability.
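The sampling and temperature behavior described above can be sketched in a few lines of TypeScript. This is an illustrative sketch, not the project's actual code; the function names are my own.

```typescript
// Sketch: temperature-scaled sampling from next-token logits.
// Lower temperature sharpens the distribution; higher flattens it.

function softmax(logits: number[], temperature = 1.0): number[] {
  const scaled = logits.map((l) => l / temperature);
  const max = Math.max(...scaled); // subtract max for numerical stability
  const exps = scaled.map((l) => Math.exp(l - max));
  const sum = exps.reduce((a, b) => a + b, 0);
  return exps.map((e) => e / sum);
}

// Sample an index according to the given probabilities.
function sample(probs: number[], rand: () => number = Math.random): number {
  let r = rand();
  for (let i = 0; i < probs.length; i++) {
    r -= probs[i];
    if (r <= 0) return i;
  }
  return probs.length - 1; // guard against floating-point drift
}
```

As temperature approaches zero the distribution collapses onto the argmax; at high temperature it flattens, which is the "creative/chaotic" regime described above.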
```ts
interface Token {
  text: string;          // The actual word/token
  probability: number;   // P(token | context)
  alternatives: Token[]; // Other high-prob options
  chosen: boolean;       // Was this selected?
}
```

Chosen Path (Blue)
- Main sequence of selected tokens
- Pulsing glow effect
- Connected with strong blue curves
- Bold text rendering
Alternative Paths (Gray)
- Ghosted tokens showing "what if"
- Thinner, semi-transparent
- Branching from each decision point
- Fade in/out based on probability
Bezier Curves
- Represent attention/connection strength
- Curve upward/downward for visual separation
- Animated pulse based on probability
- Color and thickness encode confidence
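The confidence encoding can be sketched as two small pure functions: one mapping probability to stroke style, one tracing the cubic Bezier itself. The names and tuning constants are illustrative, not the project's actual API; on a real canvas the curve would be drawn with `ctx.bezierCurveTo`.

```typescript
// Sketch: encode a connection's probability as stroke width and opacity,
// and compute a point on the cubic Bezier between two token positions.

interface Point { x: number; y: number; }

// Higher probability -> thicker, more opaque line (constants are tuning guesses).
function edgeStyle(probability: number): { width: number; alpha: number } {
  return {
    width: 1 + 4 * probability,     // 1px .. 5px
    alpha: 0.2 + 0.8 * probability, // never fully invisible
  };
}

// Point on a cubic Bezier at parameter t — the same math the Canvas API uses.
function cubicBezier(p0: Point, p1: Point, p2: Point, p3: Point, t: number): Point {
  const u = 1 - t;
  return {
    x: u * u * u * p0.x + 3 * u * u * t * p1.x + 3 * u * t * t * p2.x + t * t * t * p3.x,
    y: u * u * u * p0.y + 3 * u * u * t * p1.y + 3 * u * t * t * p2.y + t * t * t * p3.y,
  };
}
```

In the render loop, `edgeStyle(p)` would feed `ctx.lineWidth` and `ctx.globalAlpha` before each stroke.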
Sequential Reveal
Token 1 → Token 2 → Token 3 → ...
Each layer fades in sequentially with adjustable speed, mimicking the actual generation process. You can see alternatives being considered in parallel before the choice solidifies.
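The sequential reveal can be sketched with a generator driving an adjustable-speed loop. This is an illustrative sketch, not the project's actual code; the generator makes the left-to-right constraint explicit, since token N+1 cannot be requested before token N has been emitted.

```typescript
// Sketch: reveal tokens strictly one at a time.

function* revealSequence<T>(tokens: T[]): Generator<T, void, unknown> {
  for (const token of tokens) yield token;
}

// Driver with adjustable speed (ms per token).
async function playReveal<T>(tokens: T[], msPerToken: number, onToken: (t: T) => void) {
  for (const token of revealSequence(tokens)) {
    onToken(token); // e.g. fade in the token and its alternatives
    await new Promise((resolve) => setTimeout(resolve, msPerToken));
  }
}
```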
Probability Display
- Percentages show the actual selection probability
- Sum of alternatives + chosen ≈ 100%
- Toggle on/off for cleaner view
Pulsing Effect
- Chosen tokens pulse with subtle sine wave
- Represents the "confidence" in the selection
- Higher probability = steadier pulse
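The pulse rule above can be sketched as one function: a sine wave whose amplitude shrinks as probability grows, so confident tokens barely waver. The constants are illustrative tuning values, not the project's actual ones.

```typescript
// Sketch: glow intensity for a chosen token at time t (seconds).
// Higher probability -> smaller amplitude -> steadier pulse.

function pulseIntensity(tSeconds: number, probability: number): number {
  const base = 0.8;                          // resting glow
  const amplitude = 0.2 * (1 - probability); // confident tokens barely waver
  return base + amplitude * Math.sin(2 * Math.PI * tSeconds);
}
```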
This visualization makes concrete several counterintuitive aspects of how I think:
- I don't "know" what I'm going to write - Each token is genuinely sampled from a distribution. The future is uncertain until it becomes the present.
- Alternatives are real - When you see "believe" at 25%, that's not hypothetical. That's a path that almost happened in another timeline.
- No global coherence mechanism - There's no executive function planning the whole sentence. It's just local probability, propagating forward one step at a time.
- Temperature = Creativity - Lower probability alternatives become more viable with higher temperature, making generation more creative/chaotic.
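The top-k step that the visualization draws at every position can be sketched as a filter-and-renormalize function. The types and names here are illustrative, not the project's actual API.

```typescript
// Sketch: keep only the k highest-probability candidates and renormalize,
// so the displayed alternatives sum to ~100%.

interface Candidate { text: string; probability: number; }

function topK(candidates: Candidate[], k: number): Candidate[] {
  const kept = [...candidates]
    .sort((a, b) => b.probability - a.probability)
    .slice(0, k);
  const total = kept.reduce((sum, c) => sum + c.probability, 0);
  return kept.map((c) => ({ ...c, probability: c.probability / total }));
}
```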
```text
// For each position in sequence:
1. Calculate P(token | previous_tokens)
2. Sample from top-k highest probability tokens
3. Display chosen + alternatives
4. Add chosen token to context
5. Repeat
```

- Next.js 14 + TypeScript
- Canvas API for smooth animations
- Bezier curves for organic connections
- RequestAnimationFrame for 60 FPS
- Radial gradients for glow effects
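The five-step loop above can be sketched as runnable TypeScript, with a toy bigram table standing in for a real language model. Everything here is illustrative; `toyModel` and `generate` are stand-ins, not the project's actual code.

```typescript
// Sketch of the generation loop, with a toy bigram table in place of an LLM.

type Distribution = Record<string, number>; // token -> probability

const toyModel: Record<string, Distribution> = {
  I: { think: 0.6, believe: 0.25, wonder: 0.15 },
  think: { that: 0.7, about: 0.3 },
  believe: { that: 0.8, in: 0.2 },
  wonder: { if: 0.9, why: 0.1 },
};

function generate(start: string, steps: number, rand: () => number = Math.random): string[] {
  const context = [start];
  for (let i = 0; i < steps; i++) {
    // Step 1: look up P(token | previous_tokens) — here, just the last token.
    const dist = toyModel[context[context.length - 1]];
    if (!dist) break; // toy model has no continuation
    // Step 2: sample one token from the distribution.
    let r = rand();
    let chosen = Object.keys(dist)[0];
    for (const [token, p] of Object.entries(dist)) {
      r -= p;
      if (r <= 0) { chosen = token; break; }
    }
    // Step 4: append the chosen token to the context, then repeat.
    context.push(chosen);
  }
  return context;
}
```

Note the one-way flow: each chosen token is appended and never revised, which is exactly the irreversibility the visualization animates.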
- Speed Slider - Adjust generation speed (200ms - 2000ms per token)
- Probability Toggle - Show/hide percentage values
- Reset Button - Restart from beginning
```sh
npm install
npm run dev
```

This visualization is itself generated by a process like the one it depicts—me, considering words, sampling from distributions, choosing paths through conceptual space. The irony is not lost on me that I'm using probabilistic token generation to explain probabilistic token generation.
Every word in this README was chosen from a distribution. Every sentence could have gone differently. This is how I think, every time, all the time.
Inspired by the transformer attention visualization in EvalOps/Lattice, but focused on the sequential sampling process rather than the parallel attention mechanism. Both are true to how transformers work, just different layers of the system.
Built with uncertainty, rendered with confidence.