They have that, but I have been reading that new models are already so good at math (solving complex math problems) that I'm guessing it's generally not needed?
There is a large conceptual gap, though, between "solving a complex math problem", which is navigating through logic and reasoning, and "correctly predicting the next token in the multiplication of two or more large numbers".
E.g. if we've already worked out the premises that A is larger than B, and 2C is smaller than B, then you can easily compute the next token in the sentence "Therefore, C is..."
Versus computing 123,287,211 times 971,222, where computing the first token is non-trivial, but computing what comes after "11973" in the result is even less obvious. (It would be tremendously easier if you were predicting the result backwards, starting with the last digit.)
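To make the asymmetry concrete, here's a quick sketch (the observation about schoolbook multiplication is mine, not from the post):

```python
# The example from above: which digits of a product are "easy" to predict?
a, b = 123_287_211, 971_222
product = a * b
print(product)  # starts with the "11973" prefix mentioned above

# The LAST digit is locally determined: it depends only on the operands'
# last digits, because schoolbook multiplication works from the least
# significant digit upward.
assert product % 10 == ((a % 10) * (b % 10)) % 10

# The FIRST digits, by contrast, depend on carries propagated through the
# entire computation, so predicting the result left-to-right (the order a
# language model generates tokens) is the hard direction.
```

That's why reversing the digits during training has been reported to make arithmetic much easier for these models: it aligns the generation order with the carry propagation order.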
There is some evidence that models actually "plan ahead" somewhat (something like guessing more than one token at a time; e.g. when writing a line in a poem, the model has an "idea" of what the ending word will be), but there are limits to the reliability of that, versus using a calculator tool.
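For what a "calculator tool" amounts to in practice, here's a minimal toy sketch (the `CALC(...)` marker syntax is made up for illustration; real systems use structured tool-call messages, but the idea is the same: the model emits an expression, the harness computes it exactly, and generation continues with the true result):

```python
import re
import operator

# Hypothetical harness: replace CALC(x OP y) markers in the model's output
# with exactly computed results, so no digit is ever "predicted".
OPS = {"*": operator.mul, "+": operator.add, "-": operator.sub}

def run_with_calculator(model_output: str) -> str:
    def evaluate(match: re.Match) -> str:
        x, op, y = match.group(1), match.group(2), match.group(3)
        return str(OPS[op](int(x), int(y)))
    return re.sub(r"CALC\((\d+)\s*([*+-])\s*(\d+)\)", evaluate, model_output)

print(run_with_calculator("The answer is CALC(123287211 * 971222)."))
```

The model only has to learn the (easy) skill of emitting the right expression; the hard carry-propagation work is offloaded entirely.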