Humans don't have this fixed split into "context" and "weights", at least not over non-trivial time spans.

For better or worse, everything we see and do ends up modifying our "weights", which is something current LLMs architecturally can't do: once training is finished, the weights are read-only at inference time.
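
A minimal sketch of that split, in PyTorch with a toy stand-in model (the module, shapes, and loop are illustrative assumptions, not any real LLM): at inference the parameters are frozen, and the only state that changes from turn to turn is the growing context.

    import torch
    import torch.nn as nn

    # toy stand-in for an LLM; real models are far larger
    model = nn.TransformerEncoderLayer(d_model=64, nhead=4, batch_first=True)
    for p in model.parameters():
        p.requires_grad_(False)       # weights are read-only from here on
    model.eval()

    context = torch.randn(1, 1, 64)   # stand-in for the token context

    with torch.no_grad():             # no gradients -> no learning
        for _ in range(3):            # three "turns" of a conversation
            out = model(context)
            # only the context grows; model.parameters() never change
            context = torch.cat([context, out[:, -1:, :]], dim=1)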



This is why I actually argue that LLMs don't use natural language. Natural language isn't just what's spoken by speakers right now. It's a living thing. Every day, in conversation with fellow humans, your own natural language model changes. You'll hear some things for the first time, you'll hear others less, you'll say things that get your point across effectively the first time, and you'll say things that require a second or even third try. All of this is feedback to your model.

All I hear from LLM people is "you're just not using it right" or "it's all in the prompt" etc. That's not natural language. That's no different from programming any computer system.

I've found LLMs quite useful for language stuff like "rename this service across my whole Kubernetes cluster". But for specific things like "sort this API endpoint alphabetically", I find the time it takes to learn to construct an appropriate prompt is about the same as if I'd just learnt to program, which I already have done. And then there's the energy used by the LLM to do its thing, which is enormously wasteful.


> All I hear from LLM people is "you're just not using it right" or "it's all in the prompt" etc. That's not natural language. That's no different from programming any computer system.

This right here hits the nail on the head. When you use language to ask a computer to return a response, there's a word for that, and it's "programming". You're programming the computer to return data. This is programming at a higher level, but we've always been raising the level at which we program; this is just a continuation of that. These systems are not magical, nor will they ever be.


I agree. I'm mostly trying to illustrate how difficult it is to fit our working model of the world into the LLM paradigm. A lot of comments here keep comparing the accuracy of LLMs with that of humans, and I feel that glosses over just how different the two are.


Honestly, we have no idea what the human split is between "context" and "weights", aside from a superficial understanding that there are long-term and short-term memories. Long-term memory/experience seems a lot closer to context than to dynamic weights. We don't suddenly forget how to do a math problem when we pick up an instrument (i.e. our "weights" don't seem to update as easily or as quickly as an LLM's context does).



