There's a perpetual quest in strength sports to create an equation that lets you compare the relative strength levels of athletes in different weight classes. In weightlifting, the Sinclair formula uses allometric scaling to roughly answer the question "if we scaled this lifter to the size of the overall world record holder, how much would we expect them to lift?" That's a decent approach. However, for a long time powerlifting used the Wilks formula, which was, you guessed it, built around a 5th-order polynomial: the score is the lifted total times 500 divided by a degree-5 polynomial in bodyweight. Hilariously, that denominator has real roots, so the formulas have discontinuities. If a man bulks up to a bodyweight approaching 283.034 kg, his Wilks score approaches infinity. Same for a woman at about 208.391 kg. But if that's your strategy, don't overshoot: the limit from the right is negative infinity.
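For the curious, here's a quick sketch of where that pole comes from. The coefficients below are the commonly published original Wilks constants (from memory, so treat them as illustrative and check an official source before relying on them); the point is just that the denominator changes sign near 283 kg.

```python
# Quick sketch: the Wilks coefficient is 500 divided by a degree-5 polynomial
# in bodyweight, so the score blows up wherever that polynomial crosses zero.
# Coefficients are the commonly published original Wilks constants; verify
# against an official source before relying on them.
MEN = (-216.0475144, 16.2606339, -0.002388645,
       -0.00113732, 7.01863e-06, -1.291e-08)
WOMEN = (594.31747775582, -27.23842536447, 0.82112226871,
         -0.00930733913, 4.731582e-05, -9.054e-08)

def wilks_denominator(bw_kg, coeffs):
    """Evaluate a + b*x + c*x^2 + d*x^3 + e*x^4 + f*x^5 at bodyweight x."""
    return sum(c * bw_kg ** i for i, c in enumerate(coeffs))

def wilks_coefficient(bw_kg, coeffs):
    return 500.0 / wilks_denominator(bw_kg, coeffs)

for bw in (282.0, 283.0, 283.034, 284.0):
    print(f"{bw:7.3f} kg -> coefficient {wilks_coefficient(bw, MEN):.1f}")
# The denominator changes sign near 283 kg: the coefficient shoots toward
# +infinity from the left and comes back from -infinity on the right.
```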
If you try hard enough, you can do anything computable with a single integer parameter - if your model is a Turing machine, and the parameter encodes a program.
It's perhaps less clever or cool than the hack above (by virtue of us being used to Turing machines), but it serves as a friendly reminder that you can encode a lot of information in a long enough number.
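As a toy illustration of the weaker point (a long enough number holds arbitrary data, no Turing machine required), here's a sketch that round-trips a byte string through a single Python integer; the encode/decode helpers are just made-up names for plain bit-packing.

```python
# Toy illustration: pack an arbitrary byte string into one Python integer
# and recover it exactly. Plain bit-packing, not the Turing-machine trick,
# but the "parameter" really is just one (very long) number.
def encode(data: bytes) -> int:
    # Prefix a 0x01 sentinel so leading zero bytes survive the round trip.
    return int.from_bytes(b"\x01" + data, "big")

def decode(n: int) -> bytes:
    raw = n.to_bytes((n.bit_length() + 7) // 8, "big")
    return raw[1:]  # drop the sentinel

msg = "all of this text fits in a single integer parameter".encode()
n = encode(msg)
assert decode(n) == msg
print(f"one integer, {n.bit_length()} bits")
```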
The real lesson here is that "number of parameters" is not a useful measure of information content. The only useful measure of information content is information entropy, which is the logarithm of the number of distinguishable states. The base of the logarithm is arbitrary, but by modern convention is invariably taken to be 2. The resulting unit is the bit.
Number of parameters is fine when you're not trying to be sneaky with ridiculously precise constants. And it's more expressive in certain ways than a raw bit count. For general use you can impose reasonable limits so that one parameter can't go overboard with bits: something like a precision cap of one part per thousand and a magnitude cap of a trillion.
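A rough sketch of the bit budget those caps imply, with my own assumptions filled in: I'm reading "one part per thousand" as relative precision, and assuming magnitudes run from 1 up to a trillion plus a sign.

```python
# Back-of-the-envelope bit budget per parameter under those caps.
# Assumptions (mine): "one part per thousand" = relative precision,
# magnitudes between 1 and a trillion, plus a sign.
import math

rel_precision = 1e-3
max_value = 1e12
min_value = 1.0  # assumed lower bound for this sketch

# With relative spacing d, distinguishable values form a geometric grid
# x_{k+1} = x_k * (1 + d), so the count is ln(max/min) / ln(1 + d).
n_magnitudes = math.log(max_value / min_value) / math.log1p(rel_precision)
bits = math.log2(2 * n_magnitudes)  # times 2 for the sign

print(f"~{n_magnitudes:,.0f} distinguishable magnitudes, ~{bits:.1f} bits per parameter")
# Around 28,000 values, i.e. roughly 16 bits per parameter - nowhere near
# what a "ridiculously precise constant" can smuggle in.
```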
Your example is vacuous. You cannot determine the information content of a sequence of numbers in isolation; you can only determine the information content of a system with respect to a model that tells you how the system's state distinguishes among a set of possible states. The information content is then the log of the number of possibilities from which the observed state selects one.
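A small illustration of that point, with made-up models: the same observed byte corresponds to a different number of distinguishable states, and hence a different bit count, depending on which model you bring to it.

```python
import math

observed = 0x2A  # the byte we happened to read

# Each model distinguishes a different number of possible states that the
# observed byte could have been selecting among. Models here are made up.
models = {
    "any 8-bit value":   256,
    "an even byte":      128,
    "zero vs. nonzero":    2,
}

for name, n_states in models.items():
    print(f"under '{name}', byte {observed:#04x} carries "
          f"log2({n_states}) = {math.log2(n_states):.0f} bits")
# Nothing in the byte itself tells you which count applies; the model does.
```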
Yes, algorithmic information theory is a thing. But neither it nor your example refute what I said (because what I said is in fact true).
On the other hand, how many parameters do we need to create text?
A simple calculation, dividing the number of parameters in the human brain by the seconds in a lifetime, yields 10^14 synapses / (72.6 years × 365 × 24 × 3600 s) ≈ 43,000 synapses per second, not all of them related to language and reasoning.
So we use tens of thousands of synapses for each second of our life, which is quite a contrast to the ~100 bits of information per second estimated for conscious processing.
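For what it's worth, reproducing that arithmetic with the same rough inputs:

```python
# Reproducing the back-of-the-envelope numbers from the comment above.
synapses = 1e14                      # rough synapse count for a human brain
lifetime_s = 72.6 * 365 * 24 * 3600  # ~72.6-year lifetime in seconds

print(f"{synapses / lifetime_s:,.0f} synapses per second of life")
# ~43,700 synapses/second, versus the ~100 bits/second cited above
# for conscious processing.
```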