We don't generate chains of tokens with a constant per-token error rate, so errors don't compound. Don't ask me what we do instead, for I have no clue, but whatever it is, it works better than next-token prediction.
Hey, maybe humans aren't just like LLMs after all.