[flagged] Ask HN: When LLMs make stuff up, call it 'confabulating', not 'hallucinating'
36 points by irdc on Oct 17, 2023 | 65 comments
In humans, a hallucination is formally defined as being a sensory experience without an external stimulus. LLMs have no sensors to experience the world with (other than their text input) and (probably) don’t even have a subjective experience in the same way humans do.

A more suitable term would be confabulation, which is what humans do when, due to a memory error (e.g. due to Korsakoff syndrome), we produce distorted memories of ourselves. This may sometimes sound very believable to outsiders; the comparison with LLMs making stuff up is rather apt!

So please call it confabulating instead of hallucinating when LLMs make stuff up.



Similar suggestions have been made over and over, and they've never stuck. "Hallucinating" has already hit mainstream media. It's probably going to stay that way out of sheer inertia because nobody outside of a small group of nerds on hackernews is going to sit down and think about whether "hallucinating" or "confabulating" more accurately describes the nature of the error. The existing term already captures the general idea well enough.


And here I thought us nerds ruled the world.


The millennium parties were all on New Year's Eve of 2000, not New Year's Eve of 2001. Some nerds had to be worrying about Y2K date/time bugs instead of partying too.


The true nerds would be partying the next year, with the millennium ending/beginning on Jan 1, 2001.


Exactly. Pretty lonely endeavor.


What exact date do you think "New Year's Eve of 2000" is? 2000-12-31?


It was the eve of the new year, the evening that started at the end of 1999-12-31 and ended in time for brunch on 2000-01-01. I was partying with the fun people after arguing the numbers for a few days in 1998 or 1999.


Hard disagree. I’m browsing the internet not surfing the Information Super Highway. Mainstream media needs to rely less on allegory once technology becomes mainstream.


LLMs aren't mainstream yet. When they are, it will simply be common cultural knowledge that they can (lie,hallucinate,confabulate,whatever) and metaphors won't be necessary, the way the internet became mainstream once it was common knowledge not to "feed trolls" or "click on spam."

At the moment, most people still think LLMs are basically like the computers from Star Trek: rational, sentient, and always correct. Like those lawyers who used ChatGPT to generate legal arguments - it didn't even occur to them that AI could fabricate data; they assumed it was just a search engine you could talk to like a person.

This is why we still need metaphors to spread cultural knowledge. To that end I think it's less important to be technically accurate than to impart an idea clearly. "Hallucinate" and "confabulate" get the same point across, but the former is more widely understood.

Even "confabulate" isn't great, since it carries the connotation of either deception or senility/mental illness. But the "confabulation" of LLMs is inherent to their design. They aren't intended to discern truth, or accuracy, but statistical similarity to a stream of language tokens.

And humans don't really do that, so we don't really have language fit to describe how LLMs operate without resorting to anthropomorphism and metaphors from human behavior.


But hallucinating is not an allegory... Just like we're still using browsing.


> I’m browsing the internet not surfing the Information Super Highway.

Hard disagree. You're browsing the Web not exploring the Internet.


When educating the general public about the risks and limitation of LLMs, I think "hallucinating" is a useful term - it is something people can understand and it conveys the idea of LLMs being somewhat random and unreliable in their responses. I'm not sure "confabulating" is so easily understood or accessible.


Hallucinating also gets the point across that the LLM will sometimes be 100% sensible and 100% confident in its claims, while being 100% wrong in those claims.


I agree that confabulate is more accurate and descriptive. But in general usage "hallucinate" is a commonly understood word and "confabulate" is not. Using the latter runs the risk of sounding off-puttingly technical or obscure.


This is exactly the problem: I'd heard the word "confabulate" before, but it has a weird 19th-century feel to it, and I didn't know that it was specifically about memory reproduction errors.

I think that humans actually confabulate in minor ways a lot more than we realize; just ask your parents to compare stories with their siblings about something that happened to them growing up -- in my experience you often get completely different versions of events. Given that, it might actually be a good opportunity (as the comment sibling suggests) to introduce that into the language.


The standard meaning of "confabulate" is just "chat". The "memory reproduction error" is psychology jargon. For this reason, I support continued use of "hallucinate" for LLM errors.


Right; so could we compare trying to get people to stop saying "hallucinate" to trying to get people to stop saying "tidal wave"?

(My understanding is that people use the word "tidal wave" to mean the wave around the earth caused by the moon's gravity, which causes the tides; obvs this is quite different from a large wave caused by an earthquake.)


My mind immediately goes to the episode of Star Trek: TNG with Mark Twain when I hear confabulate.


Good opportunity to change that!


It’s too late. People already call it hallucination and confabulating sounds weird. The idea that you think you can reverse the trend with a HN post is interesting though.


Language changes all the time, nothing is set in stone, and HN is one of the better places where this change can start happening. Media copycats will follow.


The question is, why? Yeah, confabulation might capture the phenomenon better, but no one knows the word; it is not commonly used, compared to hallucination. And that is across languages:

    Russian:
        Hallucination: Галлюцинация (Gallyutsinatsiya)
        Confabulation: Конфабуляция (Konfabulyatsiya)

    German:
        Hallucination: Halluzination
        Confabulation: Konfabulation

    French:
        Hallucination: Hallucination
        Confabulation: Confabulation

    Spanish:
        Hallucination: Alucinación
        Confabulation: Confabulación

    Italian:
        Hallucination: Allucinazione
        Confabulation: Confabulazione

    Portuguese:
        Hallucination: Alucinação
        Confabulation: Confabulação
If hallucination isn't a perfect fit, but everyone and their dog know roughly what it indicates in relation to LLMs then that's still better than forcing a word no-one ever uses to describe something slightly different but overall rather similar.


French also has affabulation, which means roughly the same thing (confabulation is actually defined as "affabulation in some severe pathological cases"), and is, AFAIK, commonly known.


> Language changes all the time

...towards less complexity and redundancy. Scientifically accurate terms are called that because they're usually not part of everyday language. Hallucination is a much more common word and thus more self-explanatory to most people. For a random person AI confabulation sounds about as intuitive as flux capacitor.


I don't see this as a more suitable term; I had to check the definition of "confabulate" and the first hit in the macOS dictionary is "formal; engage in conversation" and the second is "psychiatry; fabricate imaginary experiences".

The fact I had to look it up to make sure, and that it isn't the primary definition, makes this a bad alternative to a well understood word which has already become established.


In terms of communicating with laypeople, I don't think this is a hill worth dying on. "Confabulating" is not a term I think that many people outside of the field of psychology are familiar with in the first place.

If you do want to use another term for laypeople, I think "bullshitting" or "BSing" would have connotations that are more relevant than "hallucinating".


I find the easiest way of explaining LLMs to laypeople is "Bulls*t Engine". If tuned well, they're going to answer like a salesperson or internet troll: if they don't know, they will BS rather than not answer. It's not hallucinating or confabulating, it's BS.

That's not to say they're not useful. The ability to BS is well regarded among humans, as long as you, as a consumer, have a decent BS detector. And like a good BS artist, if you stay within their area of expertise they can be really useful. It's when you ask them something that they should know, or almost know, that they start to be full of s**.


To me, "confabulating" implies deliberately attempting to mislead. I like "bullshit" better -- bullshitting might be lying, but it can also mean simply trying your best in the attempt to please. To the degree that an LLM "wants" something, it's to give you an answer that makes you happy, even if it doesn't know the truth.

The fact that BS is also used for deliberate lying to mislead is a strike against that word. Using "bullshit" and "hallucinate" in conjunction somewhat paints a picture of the quality of the answer you get and the "motivation" used to get there.


The problem with describing factual errors as "hallucinating" is that they're no more or less hallucinations than when the model generates correct content. The entire point of these LLMs is that they synthesise content not in the original input. They're always "hallucinating"; it's just that sometimes they get it right and sometimes they get it wrong.


This is a better term and a suggestion I’ve already come across. If popularizers of science and tech as well as the AI space were to begin using the term consistently, the mainstream media would follow suit eventually.

> LLMs have no sensors to experience the world with (other than their text input) and (probably) don’t even have a subjective experience in the same way humans do.

Computers are not subjects, period, and to attribute intentionality to them is nothing short of projection and superstition. LLMs do not somehow transcend what a computer is. Computers are machines that transform conventional representations into other conventional representations, but these representations require human interpretation to have meaning (during which the representation is related to the conventionally assigned meaning, which is itself the terminus that requires no further interpretation). That meaning or content is the intentionality in question. You won't find it in the machine, by definition.

A source of popular misunderstanding probably comes from the confusion of what the computer is objectively doing with the metaphorical language being used to talk about it. Even “confabulation” is, strictly speaking, metaphorical or analogical, but within the analogical schema we’re using, it is a better fit than hallucination.


The purpose of words and language in general is to communicate which includes intent and meaning. If I go ask 100 random people what confabulate means, I doubt even a third know what it means. But regardless of “accuracy” if I tell someone a robot hallucinated, they more or less get what I mean.

Besides both the general public and academics have been calling it “hallucination” for well over a year now. I’m sorry to say this ship has long sailed.


Semantics, but useful ones. By the way, since you wrote '(probably) don’t even have a subjective experience in the same way humans do.', I want to talk about that.

Couldn't LLMs (or rather generative transformers) sort of confabulate their own subjective experience? Imagine you added other senses/inputs beyond just text, all acting on the network: a 'memory', a feeling of passing time (an internal clock or whatever; this is weirdly a sense every viable human gets, as do all complex animal lifeforms). Is this a technical possibility? If a GPT were able to remember and 'internalize' recent input and its output (I mean that recent input is passed into its training function, not at all like how ChatGPT 'remembers' previous input/output it experienced very recently), and you asked it 'why was that your response', would it confabulate too? Would it trick itself? And could it experience 'déjà vu' if the 'internalization' bugged?


Interesting question. I guess a related question would be ‘are human minds something like a really advanced ChatGPT, or are they something completely different?’

Sorry I don’t have an answer though.


The first time I heard somebody say an LLM hallucinated, I knew exactly what they meant, so I think hallucination is a good term.


> A more suitable term would be confabulation, which is what humans do when, due to a memory error (e.g. due to Korsakoff syndrome), we produce distorted memories of ourselves. This may sometimes sound very believable to outsiders; the comparison with LLMs making stuff up is rather apt!

Why? LLMs don't have memories any more than they have sensors.


The term describes externally-observable behaviour rather than falsely positing the existence of LLM internal mind-states.


That is a contradiction of the definition that you supplied.

Maybe decide what you think the words mean before you decide how other people will use them?


My dad has Parkinson's, and one of the pamphlets said such neurological diseases can result in confabulating behavior. So I can definitely see the relevance of confabulate instead of hallucinate. It turns out some medical experts wrote these opinions recently, but I don't have access to the full text:

"ChatGPT: these are not hallucinations – they’re fabrications and falsifications" https://www.nature.com/articles/s41537-023-00379-4#author-in...

"Chatbot Confabulations Are Not Hallucinations" https://jamanetwork.com/journals/jamainternalmedicine/articl...


As a neophyte wordsmith I like discussions of word choice, especially on HN (disappointed to see this flagged).

While I agree in principle, I'm not sure 'confabulating' is doing enough work. My hypothesis is that given two words--one more accurate and one less accurate to describe the same phenomenon--people will choose the more expansive or imaginative notion. That is, people will choose a word which generates more ideas.

We are in a time of expanding imagination about what LLMs and "AI" will do in the future. By definition 'hallucinating' LLMs are ridiculous, but the thoughts extending from the 'hallucinating' label are nevertheless expansive, in a similar way to our general overall feelings towards these technologies. Whereas 'confabulation' does nothing for my imagination.


I had literally never heard this word before people started complaining about LLMs hallucinating. I think I'll stick with the one that makes sense and wait for dictionaries to catch up. Words literally can mean multiple things.


On a related note, are there promising models / techniques to detect these sort of instances?

Say for instance I summarize something and want to check that the result doesn't contain hallucinations (confabulations :)) or, more specifically, that the summary contains only information present in the original text. What's the current state of the art for something like this and how well does it perform? I've read some about entailment models and fine-tuned LLMs for this sort of thing but haven't found many great resources.
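
One baseline along the entailment-model line is to score each summary sentence against the source text with an off-the-shelf NLI model and flag anything that isn't entailed. A minimal sketch, assuming the Hugging Face transformers library and the facebook/bart-large-mnli checkpoint; the label ordering and the 0.5 threshold here are assumptions to verify and tune, not a definitive recipe:

    # Minimal sketch: flag summary sentences not entailed by the source text.
    # Assumes transformers + torch installed and the facebook/bart-large-mnli checkpoint.
    import torch
    from transformers import AutoTokenizer, AutoModelForSequenceClassification

    MODEL = "facebook/bart-large-mnli"
    tokenizer = AutoTokenizer.from_pretrained(MODEL)
    model = AutoModelForSequenceClassification.from_pretrained(MODEL)

    def entailment_score(premise: str, hypothesis: str) -> float:
        # Probability that the premise entails the hypothesis under the NLI model.
        inputs = tokenizer(premise, hypothesis, return_tensors="pt", truncation=True)
        with torch.no_grad():
            logits = model(**inputs).logits
        probs = logits.softmax(dim=-1)[0]
        # Assumed label order for this checkpoint: 0=contradiction, 1=neutral, 2=entailment.
        return probs[2].item()

    def flag_unsupported(source: str, summary_sentences: list[str], threshold: float = 0.5):
        # Return summary sentences whose entailment score falls below the threshold.
        return [s for s in summary_sentences if entailment_score(source, s) < threshold]

Sentence-level checks like this miss errors that only show up across sentences or long sources, so it's a rough first pass rather than a guarantee.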


All seem valid to me. The inputs to training are the "sensors" through which LLMs "experience the world". Maybe interpolation or extrapolation would be acceptable too.



The process can work both ways.


Confabulate is definitely a more suitable and sensible term for this. It’s clearly semantically more descriptive and would do a better job communicating what is actually happening.

The thing is, “confabulating” will never stick. People like myself will enter discussions about it and insist that it won’t stick, and because of that, I certainly won’t start using that term and hopefully I’ll convince a few others not to either.


While I agree with you that it's a better fit semantically, it won't happen, but not for the reasons given in the thread. The real reason is not the ignorance of the general public, but -- IMO -- the fact that even highly intelligent AI/ML engineers themselves have a limited vocabulary* and generally do not know this word.

* this is the prevailing secular trend, at least in the US


This is what I've been calling it as well. Confabulation is much more appropriate as a term.


Why change something that is already a well-defined term in the context of generative AIs?


Because it isn't really as well defined as you believe, but it has been well defined in context of neuropsychology?


I just searched for "ai hallucination" on arxiv and got 43 hits. And terms get reused in new contexts all the time. I don't disagree confabulating would be a more correct term, though.


This exactly: psychology and psychiatry have fully-developed lexicons for describing various phenomena which AI researchers would do well to adapt.


Neither really seem appropriate; they're both far too anthropomorphising. The machine isn't making an error of _thought_; it's simply accidentally incorrect instead of accidentally correct.


A similar point was discussed 12 days ago (233 comments): https://news.ycombinator.com/item?id=37773918


Must’ve missed that one.


Thank you for the description. As a non-native English speaker, I haven't heard the word 'confabulate' before, while 'hallucinate' I learned from a young age.


A huge portion of the English-speaking world would also not know the meaning of confabulate.


Unfortunately, people are calling script kiddies "hackers" and LLMs "AI", so there's no likelihood of course correction.


Why not simply say "autocompleted"?


There’s a book called “Because Internet” you should probably read.

tl;dr: language changes over time.

The idea of "precise language", and that someone is "using words wrong", is not correct, and history shows, repeatedly, that obsessively trying to enforce "correctness" in language doesn't work.

So… this is no different from telling people not to use “like, someone told me…” in sentences.

It won’t stick, and pursuing it is probably meaningless.

If people call it hallucinations; that’s what they are.

That’s what language is.

There’s some deep irony in talking about this with language models.


You're right but too late. The die is cast. The world is not going to change to suit you.


lol, nerds.

seriously though, hallucinating makes sense to me because it genuinely feels like it's seeing things that don't exist.

for example, it comes up with non-existent postgresql functions.

that's hallucinations right there, i sometimes wonder what gpt-3.5 is smoking - i wanna try it.


Given that the "I" in "AI" is already a strong misnomer, why not just go with the lies people get told and make a fast buck?


Too late, the ship has sailed.


who asked?


Go back to Lesswrong.



