Amazing that a system with billions of parameters, trained on millions of dollars of compute, performs worse at arithmetic than a 1970s 8-bit pocket calculator.
Not sure why people expect some sort of "intelligence" to emerge from a text-generation model trained on an Internet corpus. GPT-3 doesn't calculate; it pattern-matches.
On the other hand, I do get why people might be surprised that it performs as well as it does; that's the surprising upside. But since we know GPT is a transformer model, what it is doing is applying a probabilistic best fit to its training data. From that perspective I can see how best-fitting can produce these sorts of results, especially given the sheer volume of training data.
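To make the "pattern matching, not calculating" point concrete, here's a toy sketch of my own (nothing like GPT-3's actual architecture): a character-level n-gram model that predicts the next character purely by frequency in its training corpus. It can reproduce sums it has literally seen, but it has no arithmetic machinery, so an unseen sum gets it nowhere:

```python
from collections import defaultdict, Counter

def train(corpus, n=3):
    # Count which character follows each n-character context.
    counts = defaultdict(Counter)
    for line in corpus:
        padded = "^" * n + line + "$"  # ^ = start padding, $ = end marker
        for i in range(n, len(padded)):
            counts[padded[i - n:i]][padded[i]] += 1
    return counts

def complete(counts, prompt, n=3, max_len=10):
    # Greedy completion: repeatedly emit the most frequent next character.
    text = "^" * n + prompt
    for _ in range(max_len):
        ctx = text[-n:]
        if ctx not in counts:
            break  # context never seen in training: the model is stuck
        ch = counts[ctx].most_common(1)[0][0]
        if ch == "$":
            break
        text += ch
    return text[n:]

corpus = ["2+2=4", "3+3=6", "2+3=5"]
model = train(corpus)
print(complete(model, "2+2="))    # seen in training, completes to "2+2=4"
print(complete(model, "17+25="))  # unseen pattern, no completion at all
```

A transformer generalizes far better than this greedy n-gram lookup, of course, but the failure mode is the same in kind: interpolation over seen text, not computation.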