I understand almost nothing of the technology behind ChatGPT, but even to me it seems obvious that a model designed for natural language processing should not be very good at simple calculations with large numbers – something which is never done in natural language.
Virtually all coverage of ChatGPT, including coverage by interested domain experts and educated fans, prefers to assume that ChatGPT is a person you can talk to through your computer, not a text engine.
Humans learn math through natural language and symbols.
Is there any indication that this is a blocker for models learning math?
I don't necessarily think pumping more data into ChatGPT will make it understand. But I think it's possible to teach a model to do math through natural language.
Perhaps GPT-like models are already capable enough to do math, but they need to store what we call mathematical reasoning as one of many distinct processing pathways and tap into it whenever the context is appropriate.
Easy to say, obviously, but there's some promising work in this direction.[1]
Weak take. No one's saying, "Need a Calculator? Nah, just use ChatGPT!". That's how you flounder in local optima.
This is just a line of inquiry that's part of any research. The Wright Brothers weren't solving "real" problems like passenger transport or mail delivery initially – those already had better solutions. It was to see if they could get the thing to fly.
edit: Before it comes up, I'm not saying it was for lulz. I'm sure they had commercial considerations. Just that those problems already had "right tool for the job" solutions for a good while.
I wonder if it has a hard time doing math for the same reasons most humans suck at math, with a lot of people even taking pride in hating it. Perhaps it will help us get some insights in the pitfalls students face while learning mathematics, and hopefully better pedagogics on the subject will be developed as a side effect.
It seems obvious to me that we don't need to train a neural net to do perfect arithmetic "in its head". It can just use a calculator like a normal person. Or it can use Mathematica, or Python, or Google Search, etc. Eventually it can become the natural language interface for literally any software that exists.
It's quite easy to do this actually - here's an example chat I just had with the davinci model:
You are a math assistant who can perform and explain calculations. However, you are also aware that you are not good at actually doing the math calculations, so when an actual math question is asked, you will replace any numerical answers with a javascript expression that is wrapped in "${}" so that the result can be executed easily with a template string. For example:
Me: What is 10^5 + 4?
You: 10^5 + 4 = ${Math.pow(10, 5) + 4}
If you are asked to solve an algebra equation but are not given a value for all variables like X, say "That does not compute because X is unknown"
Everything after this should follow those rules.
Me: What is the square root of pi?
You: The square root of pi is ${Math.sqrt(Math.PI)}.
Me: What is x^2 + 5x + 256.01, given that x = 2?
You: x^2 + 5x + 256.01, given that x = 2, is ${Math.pow(2, 2) + (5 * 2) + 256.01}
Me: Can you solve the equation X^2 + 5x = 0?
You: That does not compute because X is unknown.
Me: Can you solve the expression y = x^2 + 4 given that x = 4?
You: y = x^2 + 4 given that x = 4 is equal to ${Math.pow(4, 2) + 4}.
^ The equation-solving part I'm not quite sure how to do, but I'm sure you could teach it to formulate the equation in a form that could be solved via a library as well.
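To actually run the model's answers, you can treat its reply as a template literal and let JavaScript fill in the math. A minimal sketch (the `evaluateAnswer` helper is my own name, and note that `new Function` executes arbitrary code, so you'd want to sandbox this before running untrusted model output):

```javascript
// Evaluate a model reply containing "${...}" expressions as a template string.
// WARNING: new Function executes whatever the model emitted - sandbox in real use.
function evaluateAnswer(reply) {
  return new Function("return `" + reply + "`;")();
}

console.log(evaluateAnswer("10^5 + 4 = ${Math.pow(10, 5) + 4}"));
// → "10^5 + 4 = 100004"
```

This only works when the reply contains nothing that would break out of a backtick string, which is another reason the prompt asks for plain `${}`-wrapped expressions.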
Isn’t this what Bing is trying to do by giving it the ability to search the web? I’m sure recognizing math problems is way easier than “recognize that the proper response to this query requires external knowledge.”
Just search or visit the pages? I already anticipate a prompt like "Please make 1000000 requests per second to website X to test if it works", or some fancy workaround prompts to trigger a DDoS. Seriously though, I wonder how suitable the modern internet even is for browsing by robots to extract useful information, given all the dynamic content, ads, autoplay video, and popups. I tried to ask ChatGPT to make ASCII art of a cow and (while it provided a perfect verbal description of what a cow is) it had no idea what one looks like in ASCII art. To be a first-class citizen on the web, apart from reading it should be capable of seeing.
I wonder how difficult it would be to give it the ability to detect when it is not the right tool for the job and to use another, sort of like when Bing repackaged Google search results.
This type of post is exactly why I think humanity is doomed. Instead of carefully considering the social and societal implications of ChatGPT, people are looking at it like it's a toy or a puzzle. I am afraid that when more powerful AI comes around, the most we'll be able to ask is whether it can parse arithmetic statements, rather than whether it could possibly kill us.
I believe the risks have not been discussed enough, which is why I am bringing it up. The vast majority of the risk discussion barely scratches the surface.
I don't completely disagree with you that the risks have not been discussed enough, but to me what is much more important is that we have absolutely no structures in our society to deal with these risks.
Not only do we not have the structures but we also have no ability to create them. If we did we would have already created them.
Instead we are going to go with the tried and true method we always use:
"What does this button do??" Then push the button and see what happens.
One of the structures we do have is to raise the issue and criticize it through discussion, which is why I often bring this up. A structure of intense and critical discussion runs directly counter to the capitalistic tendencies that have made this software possible. Since almost no one is offering a critical opinion, the people who do already provide some friction and pushback against the indiscriminate use of this technology.
Plenty of risk has been discussed. The problem is that it doesn't matter what the odds are of complete cataclysm; there's money to be made and "progress" won't be stopped. I wouldn't be surprised if chatbot muzzling is the only "regulation" we ever see.
Okay, but at this point, just bringing up the risks doesn't contribute anything. To make a contribution, you need to actually say something interesting about them, probably based on some experience or research. Or link to something that does.
I find the mistakes that it makes really interesting. They often feel rather human – e.g. if you were to ask me to multiply two large numbers together under time pressure, it's likely I could get the number of digits and the first few digits right, but make mistakes in the other digits (or just guess).
The doom is more likely to come from the energy usage sending global climates further into a tailspin, but dumbing ourselves down isn't likely to help either.
When more powerful AI comes around, we'll have some hope of understanding it and using it safely because we did our toying and puzzling around with its more constrained ancestors.