Would be interesting to combine it with Reasoning In the Latent Space: feed the ...

vessenes · 2025-11-13T13:25:07 1763040307

I like this plan, but don't you already have this from the input vector in the prompt, at least if the inference is 'chunk wise' - generating a latent space vector, decoding it, outputting it, doing the next one.

What if you trained a separate thinking phase using the auto encoder, though? Might be more efficient, and then you've got it using neuralese internally.

Actually, reading the (summary) paper - they tried your idea and had trouble with it for a different reason:

   > Once the generative head predicts the next vector , a natural next step would be to feed it directly as input to the Transformer for predicting . However, we found that the model struggles to unpack the semantic information from such a compact representation. Instead, we ground the autoregressive process back in the more structured discrete space, where the predicted  is passed through the autoencoder to reconstruct the K tokens.