The problem is chatgpt always answers. It's never, i don't know. So when you ask...

jerf · on April 6, 2023

Even if you convince it to say "I don't know", which isn't that hard, it doesn't actually "know" that it "doesn't know". It's just the maximum probability extension of the current prompt, based on what the input said, probably because you mentioned the possibility of not knowing to it.

It in fact never knows. It is always guessing and BSing. It is also very good at it, better than any human, so the BS is quite frequently correct. But it doesn't know.

Which of course always prompts a lot of psuedo-intellectual chin stroking about "well, what is knowing anyhow?" I don't have to answer that question to point out that what we want and what GPT provides aren't the same thing, nor is what GPT provides and what we think we're getting. That is sufficient for there to be a problem.

I believe AIs will have a satisfactory concept of "knowing" someday. Some may even exist today. But such AIs will have to incorporate language models as a part of the system, not have language models be the system. Language models can't do the thing we really want, nor the thing we think we're getting.

sebzim4500 · on April 6, 2023

GPT-4 before the RLHF phase of training had a pretty good idea of what it "knows". The calibration graph was almost perfect, but after the RLHF the calibration is almost completely broken.

esjeon · on April 6, 2023

Nah, RLHF is what made GPT-4 outperform 3.5. The base model hasn't been much improved since 3.5. Also, the calibration curve is based on a subset of MMLU, so it doesn't really represent any of the actual user experience.

sebzim4500 · on April 7, 2023

I'm not saying that RLHF does more harm than good, just that it made this particular aspect of its performance worse. Basically there is still significant room for improvement, probably without changing the architecture.

int_19h · on April 7, 2023

Source?

esjeon · on April 7, 2023

The OpenAI GPT-4 paper itself.

brucethemoose2 · on April 6, 2023

Perhaps "one model to rule them all" isnt the best approach.

sebzim4500 · on April 6, 2023

There's probably a huge amount of room for improvement in the RLHF process. If there is still low hanging fruit, it would have to be there.

brucethemoose2 · on April 6, 2023

"I dunno" would have to be marked as a good or neutral response in the RLHF process, and that seems like a problematic training incentive.

sebzim4500 · on April 7, 2023

In an ideal world "I don't know" would be considered worse than a correct answer but much better than a wrong answer.

In the UK, there is a competition called the "junior maths challenge", or something, which is a multiple choice quiz where correct answers are +1 and incorrect answers are -6 (so guessing has negative EV). I think we need a similar scoring system here.

jiggywiggy · on April 6, 2023

Hmm didn't notice any difference yet, you are saying it got worse last weeks?

For kids story writing I've been getting better results with 3.5 at times.

Where 4 is way better af coding.

sebzim4500 · on April 6, 2023

No, we have no access to the original model, unfortunately.

The fact that RLHF broke the calibration comes from the GPT-4 paper, possibly the only interesting technical detail that they include.

rideontime · on April 6, 2023

What's with the obsession with children's stories and GPT? Is it just that children have low standards?

Agentlien · on April 7, 2023

As a parent my guess would be that people see it as a way to introduce welcome variety and whimsy into the daily routine of reading a bedtime story. While also feeling like you're using a hobby interest to help with a real practical issue.

I have a small library of children's books and we've read them all several times, the good ones many times.

That said, I wouldn't personally turn to these language models. From what I've seen they tend to generate rather bland and boring stories. I would rather make up my own or reread "Kackel i grönsakslandet" for the hundredth time.

jiggywiggy · on April 6, 2023

I know nothing off the internals, so this might be silly. But wouldn't it know a certain probility by which a token is correct?

garethrees · on April 6, 2023

Language models are based on probabilities of tokens appearing in a context. For illustration purposes, imagine a very simple model with just one token of context that has been trained on a corpus of three sentences, all of which are true, for example:

    clouds are white
    crows are black
    swans are white

After the model outputs "crows are", the single token of context is "are", and the probabilities are 2/3 for "white" and 1/3 for "black". So the model usually emits "crows are white", which is false, despite being trained on a corpus of true statements. Statistically "white" was more likely to follow "are" in the training data, so the same is the case of the model's output.

Of course LLMs have a much larger and more complex context than the single token in my example. But if the training data contains many news stories about professors being accused of sexual misconduct (which is newsworthy), and few news stories about professors behaving with propriety (which is not), then when querying the model for a story about a professor then it is likely to reproduce the statistical properties of its training data.

robocat · on April 6, 2023

Nitpick: looking out my window, clouds are grey. If I drive to the estuary, the swans are black (most are in New Zealand). Black & white examples always turn out to be grey examples.

cguess · on April 6, 2023

"correct" isn't the way to look at this. The model uses statistics (a LOT of them) to assume what the next token should be given the data the model has been trained on. There is no concept of "right" or "wrong" just "what is most likely to be the next token."

I'm 100% positive that if the guard rails that OpenAI put on were taken off of ChatGPT it, for instance, would not be a big fan of jews given the width and breadth of antisemitism online (and certainly in its training set).

jamincan · on April 8, 2023

Presumably, though, it could end up in a situation where the next most likely token is most likely with some low probability. Would it carry on then?

cguess · on April 9, 2023

Yes, because "I don't know" isn't an option. It just grabs the most likely

Quarrel · on April 6, 2023

Indeed.

It is a writer. It writes.

You can ask any writer to write such a thing, using examples, and they could write similar things.

The real problem is that this will get deployed to the internet- there will be people reading HN today that are writing SEO optimised websites, with thousands of articles on a topic, that will just spring out of GPT4's writing, fully formed.

GPT can find the topics, pick a domain name from available ones, churn out the articles. It can all go into templates, with different AI generated graphic design supporting it. Ad supported churned out rubbish.

The writing style can change site to site, long form, short form, informed by current events, whatever. GPT would happily provide the prompts for this variety changing the style based on the topic and target audience.

It seems inevitable to me that the web WILL be full of such sites, and worse, they'll fill the comment sections on "social" sites too.

But? Banning AI isn't the answer, if for no other reason than it wouldn't work.

cowl · on April 6, 2023

The real problem lies in the fact that those non existing citations will become real. Several years back there was a case of an article in Wikipedia that made some unsrourced claims->Journalist that doesn't do verification republishes those claims (without specifying wikipedia as srouce) -> Wikipedia article gets challanged for lack of citation -> the News story first based on wikipedia becomes the reference in the original Wikipedia article. Full circle

It's easy that something like this happens again. chatGPT confidently listes hallucinated sources->media rushes to publish the scoop -> now you have real "sources" for future reference

alpos · on April 6, 2023

Seems like there's a bug in that system, it was discovered on accident, and now there is a bot that exploits this flaw.

The fix will most likely have something to do with requiring that citations use primary sources, not just any article on the internet. In then end state, Wikipedia will be much better for it.

asdff · on April 7, 2023

At this point, all of this inevitability of our doom is making me want to invest some money into someone who is setting up these websites like this. If the internet is to burn from the inside out I might as well score a buck or two out of it by the time we end up back in the stone age reading hardcopy programming textbooks published before the great AI awakening.

lordnacho · on April 6, 2023

I think actually the problem is it always answers confidently.

Ask it about why World War II started, or how to make a cake, or where to go for dinner, or anything else, and it gives you a confident, reasonable answer. A lot of the answers are simply whatever it's already seen, mashed up. You can think of it as a search. But actually it doesn't think about what it's saying, it's stringing words together to make you think it's smart.

So then when it makes up something, it will sound to you, the reader who always sees it answer in perfect English with a decent answer, like it found an article about this professor in its dataset and is merely summarizing it.

ModernMech · on April 6, 2023

I was showing a colleague few instances where ChatGPT was confidently wrong, and he picked up on something I never had. He said "Oh, so it's doing improv!" He explained to me that the standard response in improv is to say "Yes, and..." and just run with whatever the audience suggests. He's completely right! ChatGPT constantly responds with "Yes, and..." It's just always doing improv!

And people are trying to replace doctors with LLMs. It's like "ER" meets "Who's Line?"

Applejinx · on April 6, 2023

ChatGPT is Mandela Effect, personified. It's going to go for what seems like it SHOULD be true. Sometimes that will go horribly wrong, except it will, by its very nature, seem like it's probably not wrong at all.

logifail · on April 6, 2023

> I think actually the problem is it always answers confidently

This isn't a problem restricted to ChatGPT, there are humans who display this trait too. This might be appealing at a superficial level, but if you start believing speakers with this trait it's a slippery slope. A very slippery slope.

I'm trying really hard to avoid Godwin's law, so let me suggest that Elizabeth Holmes could be one example of this.

toss1 · on April 6, 2023

Yup, it is just the most massive industrial-scale bullshit generator [0] ever invented.

It is capable of spewing excellent bullshit(TM) at incredible rates, and always with the greatest expression of confidence and good grammar.

Occasionally, when in the 'middle of the road' of it's training set, it can provide useful output. So, it's better than the broken clock that is correct twice every day.

But, wander off the bulk of the training, and it is far worse than nothing; it is dangerously misleading unless you are very skeptical and knowledgeable in your field.

(I have tested it in several fields with the same results, interesting and helpful, yet hallucinating facts everywhere.)

Getting LLMs to know the difference between a good fact and a hallucinated mashup of plausible BS is looking like almost as large a problem to solve as making the LLMs in the first place.

[0] Bullshit is defined as written or spoken without regard to the truth, only what sounds good in the context. It's not a deliberate lie, just a salad of truth and falsehood, delivered without doubt or stress of lying.

rootusrootus · on April 6, 2023

That's not quite true. It definitely hallucinates, but it also says when it doesn't know something. Here's an example I just did:

Prompt: What did Glenn Beck do in 1990?

Answer: I'm sorry, but I do not have access to information on Glenn Beck's activities in 1990. Glenn Beck is an American television personality, radio host, author, and political commentator who has been active in various roles since the 1990s. However, without specific details or context, it's impossible for me to determine what he did in 1990. If you have any additional information or context, please let me know and I'll try my best to assist you.

I followed up with a much more specific version of that question, and it tripped the "this might violate the content policy" warning, but it did give the correct answer (that it was a hoax).

lm28469 · on April 6, 2023

It can't know that it doesn't know because it straight up doesn't know anything

dataviz1000 · on April 6, 2023

This can be solved by having it play tic - tac - toe against itself.