Third-order self-attention

An experimental third-order multihead self-attention model (requires N^3 memory 😰). Details to come.
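
To give a feel for the N^3 memory cost, here is a rough back-of-the-envelope sketch. It is not taken from the model itself, whose details are still to come. It assumes that N is the sequence length, that each head stores one dense score tensor with one entry per triple of positions, and that entries are 4-byte floats. The function name `score_tensor_bytes` and its parameters are hypothetical.

```python
def score_tensor_bytes(seq_len, num_heads=1, order=3, bytes_per_element=4):
    """Estimate bytes for one dense attention-score tensor per head.

    Assumption (not confirmed by the source): an order-k attention layer
    stores seq_len ** k scores per head, so order=2 is ordinary pairwise
    attention and order=3 is the third-order case.
    """
    if seq_len < 0 or num_heads < 0 or order < 1 or bytes_per_element < 1:
        raise ValueError("arguments must be non-negative, with order >= 1")
    return num_heads * seq_len ** order * bytes_per_element


if __name__ == "__main__":
    # Compare pairwise and third-order score storage for a few lengths.
    for n in (128, 512, 2048):
        pairwise = score_tensor_bytes(n, num_heads=8, order=2)
        third = score_tensor_bytes(n, num_heads=8, order=3)
        print(
            f"N={n}: pairwise {pairwise / 2**20:.1f} MiB, "
            f"third-order {third / 2**30:.2f} GiB"
        )
```

Under these assumptions the third-order tensor is N times larger than the pairwise one, so with 8 heads and N = 1024 the scores alone would take 32 GiB per sequence. Activations, gradients, and batch size are ignored here.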