When a language model is trained for chain-of-thought reasoning, particularly on...

philipov · on March 3, 2025

> As a result, this adherence to learned patterns can overshadow genuine logical relationships, causing the model to rely on familiar sequences instead of understanding why one step logically follows from another.

To be honest, even humans rarely get above this level of understanding for many tasks. I don't think most people really understand math above the level of following the recipes they learned by rote in school.

Or beyond following the runbook in their IT department's documentation system.

And when the recipe doesn't work, they are helpless to figure out why.

kingkongjaffa · on March 3, 2025

> instead of understanding why one step logically follows from another

There’s currently 0% chance of “understanding” happening at any point with this technology.

ikanreed · on March 3, 2025

I mostly agree, but struggle with saying this with perfect certainty.

Understanding in the "have mental model of the world, apply it, derive thoughts from that model, derive words from thoughts" pattern is a thing they don't do the way we do.

But understanding of some kinds CAN be encoded into tokens and their relationships. They're clearly capable of novel, correct inferences, that are not directly contained within their training sets.

I all-but-guarantee my "My fish suffocated when I brought it to space, even though I gave it a space suit filled with pure water, why?" test case is not something it was explicitly trained on, but it correctly inferred "Because fish need oxygenated water"

jvanderbot · on March 3, 2025

How do we define understanding?

TeMPOraL · on March 3, 2025

There are many ways to define it. Taking, for example the "definition" from Wikipedia here[0], you could say that LLMs are understanding, in a distilled form, because relationships is precisely what they're made of.

--

[0] - https://en.wikipedia.org/wiki/Understanding#Definition - though this feels more like vague musings than a definition proposal.

joe_the_user · on March 3, 2025

That claim sounds like a quote from a paper but it's not from the currently linked paper. The paper itself seems more like antidote to the problem and does seem to roughly assume the claim.

I like the claim and I'd guess it's true but this seems like a weird way to introduce it.

smrtinsert · on March 3, 2025

Isn't that what the study you linked to roughly proposes?

belter · on March 3, 2025

But John Carmack promised me AGI....

zeknife · on March 3, 2025

I haven't kept up with his tweets, but I got the impression he deliberately chose to not get involved in LLM hype in his own AI research?