This is alleged, and it is very likely that claimants like the New York Times accidentally prompt-injected their own material to produce the violation (not understanding how LLMs really work), clouded by the hope of a big payday rather than by any real concern for justice or fairness.
Anyway, the law is mature enough for everyone to work this out in court. Maybe it will turn out that they have a legitimate concern, but the evidence they have presented publicly so far has been seriously lacking.
Prompt-injecting their own article would indeed be an incredible show of incompetence by the New York Times. I'm confident they're not so dumb as to put their article in the prompt and then be astonished when the reply reproduced it.
Rather, the actual culprit is almost certainly overfitting. The articles in question were pasted many times on different websites, showing up in the training data repeatedly. Enough of this leads to memorization.
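To make the mechanism concrete, here is a toy sketch: not how a real LLM works, just a word-bigram counter over invented text, showing how duplicated training data can push a model toward reproducing the duplicated document verbatim under greedy decoding.

```python
from collections import defaultdict, Counter

# Toy illustration (invented text, not a real LLM): a bigram model built
# by counting adjacent word pairs. With one copy of the "article" in the
# corpus, greedy decoding wanders; with fifty copies, the article's
# transitions dominate the counts and decoding reproduces it verbatim.

article = "the senator denied all allegations in a statement on tuesday".split()
other_docs = [
    "the committee met on tuesday to review the budget".split(),
    "a statement from the agency denied any wrongdoing".split(),
]

def bigram_counts(docs):
    counts = defaultdict(Counter)
    for doc in docs:
        for a, b in zip(doc, doc[1:]):
            counts[a][b] += 1
    return counts

def greedy_continue(counts, start, steps=9):
    out = [start]
    for _ in range(steps):
        followers = counts.get(out[-1])
        if not followers:
            break
        out.append(followers.most_common(1)[0][0])  # most frequent next word
    return " ".join(out)

once = bigram_counts(other_docs + [article])        # article seen once
many = bigram_counts(other_docs + [article] * 50)   # article "scraped" onto 50 sites

print(greedy_continue(once, "the"))  # a muddle of the other documents
print(greedy_continue(many, "the"))  # the article, word for word
```

Real memorization in a transformer is obviously more complicated, but the pressure is the same: repeated verbatim text raises the probability of those exact continuations.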
They hired a third party to build the case, and we know nothing about that party except that they were lawyers. It is entirely possible, since this happened very early in the LLM era, that they didn't understand how the technology worked and fed the model enough of their own article for it to piece the rest back together. OpenAI itself discusses the challenge of overfitting and how it works to avoid it.
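That failure mode is easy to trigger by accident. A purely hypothetical sketch of such a test, assuming the standard openai Python client (the article text and model name are placeholders, and whether anything like this happened in the NYT case is speculation):

```python
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

# Hypothetical test: paste several opening paragraphs of an article
# verbatim, then ask for a continuation. A model that has memorized the
# text, or is simply completing the pattern, may echo the rest back.
article_prefix = "..."  # placeholder: the article's opening paragraphs

response = client.chat.completions.create(
    model="gpt-4",  # placeholder model name
    messages=[{"role": "user",
               "content": article_prefix + "\n\nContinue this article."}],
)
print(response.choices[0].message.content)
```

If the prompt already contains most of the article, "reproduction" in the output tells you very little by itself.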