It's not strictly that, though: it's next-word prediction with regularization.
And the reason LLMs are interesting is that they /fail/ to learn it, but in a good way. If it were purely a "next word predictor", it wouldn't answer questions; it would continue them.
Also, it's a next-token predictor, not a word predictor - which is important because the "just a predictor" theory now can't explain how it can form words at all!
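
To make the token/word distinction concrete, here's a minimal sketch using the tiktoken library (the encoding name and sample word are just illustrative choices, not anything from the thread): a single long word is split into several subword tokens, so emitting it means the model has to predict each piece in sequence and compose them into a valid word.

    # pip install tiktoken
    import tiktoken

    enc = tiktoken.get_encoding("cl100k_base")  # a GPT-4-family BPE vocabulary

    word = "antidisestablishmentarianism"
    token_ids = enc.encode(word)

    # One word, several tokens: each ID is a separate prediction step.
    print(token_ids)
    print([enc.decode([t]) for t in token_ids])
    # prints something like ['ant', 'idis', ...] - several fragments, not one word

The point of the sketch: "predicting the next token" operates below the level of words, so producing even a single coherent word already requires the model to have learned structure beyond a lookup of likely continuations.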