A note on billiard-ball computers, SBEN and diffusion-based LLMs

I remembered this and thought it is important to mention, for the readers who are bored by the dispute around symplectic bipotentials.

The much-discussed uncited reference [10], arXiv:1902.04598, is not only the first place where symplectic bipotentials appear in a proof, but also the first to cover unilateral contact.

As a consequence, any algorithm that minimizes a cost functional like the one in SBEN is Turing complete, and therefore may take an arbitrarily long time, except for some trivial inputs.

Indeed, as first mentioned in this post, for the argument we may use the billiard-ball computers of Fredkin and Toffoli,

which may then, in theory, be simulated by minimizing the SBEN cost functional over the possible evolution trajectories of the system,

G(c) =  \int_{0}^{T} I\left( c(t),\dot{c}(t),\dot{q}(t) - \frac{\partial H}{\partial p} (c(t),t), - \dot{p}(t) - \frac{\partial H}{\partial q}  (c(t),t) \right) \mbox{d}t

over all admissible evolution curves c(t) = (q(t), p(t)).
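To picture what scoring a whole trajectory means, here is a minimal numerical sketch. It is only an illustration under loud assumptions: the Hamiltonian is taken to be the harmonic oscillator H = (p^2 + q^2)/2, and a smooth quadratic penalty |\eta|^2/2 is swapped in for the information content I, which is not the choice made in SBEN. The function names are hypothetical.

```python
import math

def derivative(f, t, h=1e-5):
    """Central finite-difference approximation of f'(t)."""
    return (f(t + h) - f(t - h)) / (2 * h)

def deviation(q, p, t):
    """eta = (q' - dH/dp, -p' - dH/dq), the deviation from Hamiltonian evolution.

    ASSUMPTION: H = (p^2 + q^2) / 2, so dH/dp = p and dH/dq = q.
    """
    eta_q = derivative(q, t) - p(t)
    eta_p = -derivative(p, t) - q(t)
    return eta_q, eta_p

def cost(q, p, T=2 * math.pi, steps=2000):
    """Midpoint-rule approximation of the integral over [0, T] of a penalty.

    ASSUMPTION: the penalty |eta|^2 / 2 is a smooth stand-in, not the I of the post.
    """
    dt = T / steps
    total = 0.0
    for k in range(steps):
        eta_q, eta_p = deviation(q, p, (k + 0.5) * dt)
        total += 0.5 * (eta_q ** 2 + eta_p ** 2) * dt
    return total

# The true evolution q = cos t, p = -sin t has (numerically) zero cost.
exact_cost = cost(math.cos, lambda t: -math.sin(t))
# A curve that is not an evolution of the system gets a strictly positive cost.
perturbed_cost = cost(math.cos, lambda t: -0.5 * math.sin(t))
```

With this stand-in, the cost vanishes on the true evolution and is positive on the perturbed curve, which is the general shape of the minimization problem: the unknown is the whole curve c(t), not a single state.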

Here, for unilateral contact we use an information content of the form (see [10])

I(z, \dot{z}, \eta) \, = \, \chi_{M}(q) \, + \, \chi_{N(q\mid M)} \left( \eta_{p} \right) \, + \, \chi_{T(q\mid M)} \left(\dot{q}\right) \, - \, \langle\langle \dot{z} , \eta \rangle\rangle
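As a small illustration of the three \chi terms, here is a sketch for the simplest unilateral constraint, a particle on the half-line M = [0, \infty) with a wall at q = 0. It assumes the standard convex-analysis conventions (\chi is 0 on the set and +\infty outside, N(q|M) is the polar of T(q|M)); the conventions in [10] may differ, and the pairing term is left out on purpose. All function names are hypothetical.

```python
import math

def chi(holds):
    """Convex-analysis indicator: 0 if the condition holds, +infinity otherwise."""
    return 0.0 if holds else math.inf

def in_tangent_cone(q, v):
    """T(q|M) for M = [0, inf): any velocity away from the wall, only v >= 0 on it."""
    return q > 0 or (q == 0 and v >= 0)

def in_normal_cone(q, n):
    """N(q|M) for M = [0, inf): only n = 0 away from the wall, any n <= 0 on it.

    Exact float comparisons are used for illustration only.
    """
    return (q > 0 and n == 0) or (q == 0 and n <= 0)

def constraint_terms(q, q_dot, eta_p):
    """Sum of the three chi terms of I; the pairing term is NOT included."""
    return chi(q >= 0) + chi(in_normal_cone(q, eta_p)) + chi(in_tangent_cone(q, q_dot))

free_flight = constraint_terms(1.0, -2.0, 0.0)        # away from the wall, no reaction
resting_on_wall = constraint_terms(0.0, 0.0, -3.0)    # on the wall, with a reaction
penetrating = constraint_terms(0.0, -1.0, 0.0)        # velocity points into the wall
reaction_in_air = constraint_terms(1.0, 0.0, -3.0)    # reaction without contact
```

In this sketch the first two configurations give 0 and the last two give +\infty, so any curve that violates the contact conditions at some time gets an infinite cost.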

This argument shows that any minimization algorithm, for a generic input (i.e. a generic concrete physical system), may take an arbitrarily long time, except for some trivial inputs.

That’s why we need a path other than this naive minimization over trajectories, one based on likelihoods.
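For readers who have not met the billiard-ball computers used in the argument: they compute with conservative, reversible logic, whose standard universal primitive is the Fredkin gate (a controlled swap). The sketch below is only the logical gate, not a simulation of colliding balls, and the helper names are hypothetical.

```python
from itertools import product

def fredkin(c, a, b):
    """Controlled swap: if the control bit c is 1, swap a and b; otherwise pass through."""
    return (c, b, a) if c else (c, a, b)

def and_via_fredkin(x, y):
    """AND from one Fredkin gate: control x, data y, constant 0; read the third output."""
    return fredkin(x, y, 0)[2]

all_inputs = list(product((0, 1), repeat=3))

# Reversible: applying the gate twice returns the input.
is_self_inverse = all(fredkin(*fredkin(*bits)) == bits for bits in all_inputs)

# Conservative: the number of 1s (think: the number of balls) never changes.
conserves_ones = all(sum(fredkin(*bits)) == sum(bits) for bits in all_inputs)

# Truth table of AND for (0,0), (0,1), (1,0), (1,1).
and_table = [and_via_fredkin(x, y) for x, y in product((0, 1), repeat=2)]
```

Reversibility and conservation are what make the gate compatible with a Hamiltonian system of elastic collisions.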

See also the slides Artificial physics for artificial chemistry (2019) for a talk aimed at physicists, where after a description of chemlambda I go in the direction of arXiv:1902.04598.

Interestingly, the problem is somewhat similar to that of diffusion-based LLMs (dLLMs), which enable parallel token generation through iterative denoising (arXiv:2509.25188).
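As a toy picture of parallel token generation through iterative denoising, here is a sketch of a confidence-based unmasking loop. It is not the method of arXiv:2509.25188 nor of any real model: `toy_denoiser` is a hypothetical stand-in that cheats by looking at the target string, and its confidence score is arbitrary. Only the control flow is the point: every step proposes tokens for all masked positions at once, then commits a few of them.

```python
MASK = "_"

def toy_denoiser(seq, target):
    """HYPOTHETICAL stand-in for a trained model (it peeks at `target`).

    Returns {position: (proposed_token, confidence)} for every masked position,
    all computed from the same current sequence, i.e. in one parallel pass.
    """
    n = len(seq)
    proposals = {}
    for i in range(n):
        if seq[i] == MASK:
            revealed = sum(seq[j] != MASK for j in (i - 1, i + 1) if 0 <= j < n)
            confidence = 10 * revealed + (7 * i) % n  # arbitrary toy score
            proposals[i] = (target[i], confidence)
    return proposals

def generate(target, per_step=3):
    """Start fully masked; each step commits the `per_step` most confident proposals."""
    seq = [MASK] * len(target)
    history = ["".join(seq)]
    while MASK in seq:
        proposals = toy_denoiser(seq, target)
        # Highest confidence first; ties are broken by position.
        best = sorted(proposals, key=lambda i: (-proposals[i][1], i))[:per_step]
        for i in best:
            seq[i] = proposals[i][0]
        history.append("".join(seq))
    return history

history = generate("diffusion")
```

Here the nine tokens are filled in three denoising steps instead of nine sequential ones; the whole sequence is the unknown at every step, much like the whole curve c(t) in the minimization over trajectories.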

See and play with the Gemini experimental text-diffusion model.

What is the similarity: the minimization of the SBEN cost over trajectories resembles parallel token generation through iterative denoising, and the minimization algorithm is Turing complete, just as Attention is Turing-Complete.

All this is for the moment the intuition of a mathematician, and nothing more.