Stop when the model agrees with itself — not when you tell it to.
We propose Recursive Convergent Inference (RCI), an architectural principle for neural network inference in which the set of active computational modules expands monotonically from a minimal seed subset until empirical convergence of the model's next-token output distribution.
Unlike existing adaptive computation methods that determine when to halt a fixed computation, RCI determines when additional computation is warranted — growing the active module set via breadth-first search over a precomputed affinity graph until output stability is reached. Stopping requires no learned halting signal, external verifier, or task-complexity pre-estimator.
RCI shifts test-time scaling from external mechanisms (longer outputs, multiple samples, verifier-guided search) to internal scaling over the active parameter subgraph.
Evaluated on OLMoE-1B-7B-0924 (64 experts) across n=150 reasoning tasks (50 per difficulty tier):
| Difficulty | Benchmark | n | Avg AUC | Std |
|---|---|---|---|---|
| Easy | GSM8K | 50 | 10.728 | 2.808 |
| Medium | MATH (algebra) | 50 | 8.956 | 1.688 |
| Hard | MMLU hard subsets | 50 | 11.987 | 2.537 |
Statistical significance (n=150):
- Hard vs Easy: Mann-Whitney U=1677, p=0.002
- Hard vs Medium: U=2106, p<0.001
- Easy vs Medium: U=1788, p<0.001
- Spearman ρ=0.22, p=0.007, n=150
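The significance tests above can be re-run from per-task AUC arrays with SciPy. The sketch below uses synthetic placeholder arrays matched to the reported means and standard deviations purely for illustration; the actual per-task values live in `experiments/rci_results.json`.

```python
import numpy as np
from scipy.stats import mannwhitneyu, spearmanr

rng = np.random.default_rng(0)
# Placeholder AUC samples (synthetic; substitute the real per-task values).
easy   = rng.normal(10.728, 2.808, 50)
medium = rng.normal(8.956, 1.688, 50)
hard   = rng.normal(11.987, 2.537, 50)

# Pairwise Mann-Whitney U test (two-sided), e.g. Hard vs Easy:
u, p = mannwhitneyu(hard, easy, alternative="two-sided")

# Spearman rank correlation between difficulty tier (0/1/2) and AUC:
tiers = np.repeat([0, 1, 2], 50)
auc = np.concatenate([easy, medium, hard])
rho, p_rho = spearmanr(tiers, auc)
```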
Notable finding: RCI's complexity metric diverges from human-defined difficulty labels — MATH algebra is treated as computationally simpler than GSM8K word problems by this model, suggesting RCI captures model-relative computational demand rather than task difficulty in the abstract.
```
Weights W (read-only, shared)
        │
M₀ = seeds (top activated experts on first pass)
        │
Step n: Mₙ₊₁ = Mₙ ∪ top-k(neighbors(Mₙ), affinity)
        │
Stop when: rolling KL(probsₙ || probsₙ₋₁) < ε
           AND confidence margin ≥ θ
        │
Result: easy task → few experts, few steps
        hard task → more experts, more steps
        — automatically, without an external signal
```
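The expand-and-stop loop above can be sketched in plain Python. This is a toy illustration, not the released implementation (see the notebook for that): `rci_expand`, `toy_forward`, the uniform affinity matrix, and the thresholds are all hypothetical stand-ins.

```python
import numpy as np

def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) between two discrete distributions."""
    p = np.clip(p, eps, None)
    q = np.clip(q, eps, None)
    return float(np.sum(p * np.log(p / q)))

def rci_expand(seed_experts, affinity, forward, n_experts,
               k=4, kl_eps=1e-3, margin_theta=0.5, max_steps=10):
    """Grow the active expert set until the output distribution converges.

    seed_experts: initial expert indices (top-activated on first pass)
    affinity:     (n_experts, n_experts) precomputed affinity matrix
    forward:      callable(active_set) -> next-token probability vector
    """
    active = set(seed_experts)
    prev = forward(active)
    for step in range(max_steps):
        # Rank inactive experts by their best affinity to the active set.
        scores = {}
        for e in active:
            for j in range(n_experts):
                if j not in active:
                    scores[j] = max(scores.get(j, 0.0), affinity[e, j])
        if not scores:
            break  # nothing left to add
        active |= set(sorted(scores, key=scores.get, reverse=True)[:k])
        probs = forward(active)
        top2 = np.sort(probs)[-2:]
        # Stop when the distribution is both stable and confident.
        if (kl_divergence(probs, prev) < kl_eps
                and top2[1] - top2[0] >= margin_theta):
            return active, probs, step + 1
        prev = probs
    return active, prev, max_steps  # budget exhausted without convergence

# Toy demo: the output distribution sharpens as experts join.
n = 8
affinity = np.ones((n, n))  # uniform affinity, for illustration only

def toy_forward(active):
    logits = np.zeros(5)
    logits[0] = len(active)  # favored token strengthens with coverage
    e = np.exp(logits - logits.max())
    return e / e.sum()

active, probs, steps = rci_expand([0, 1], affinity, toy_forward, n,
                                  k=4, kl_eps=0.02, margin_theta=0.5)
```

In this toy run the set grows from 2 seeds to all 8 experts before the KL and margin criteria are both met, mirroring the "more demand → more experts" behavior described above.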
All experiments reproducible on free-tier Google Colab T4 GPU (~60 minutes).
Setup:
- Open the notebook in Colab
- Add `HF_TOKEN` to Colab Secrets (left panel → 🔑)
- Run Cell 1 → Restart runtime → Run all
Model: allenai/OLMoE-1B-7B-0924 — fully open, Apache 2.0
```
rci-inference/
├── README.md
├── LICENSE
├── paper/
│   └── rci-paper.pdf
├── experiments/
│   ├── rci_inference_poc.ipynb
│   ├── rci_figure1.png
│   └── rci_results.json
└── latex/
    ├── main.tex
    └── references.bib
```
```bibtex
@misc{anokhin2026rci,
  title  = {Recursive Convergent Inference: Bottom-Up Module
            Expansion via Output Convergence},
  author = {Anokhin, Alex},
  year   = {2026},
  note   = {Preprint. github.com/olanokhin/rci-inference}
}
```

Author: Alex Anokhin · [email protected] · LinkedIn
Date: March 2026