> predict a few different events in a row, add up the errors, and see how off th...

dllthomas · on May 3, 2023

How about here? https://en.wikipedia.org/wiki/Scoring_rule

AlbertCory · on May 3, 2023

that's the definition of a rule.

> One could note the number of times that a 25% probability was quoted, over a long period, and compare this with the actual proportion of times that rain fell.

It still depends on many samples, or "over a long period," as your link puts it.

You can't escape the fact that there are only one or two samples, no matter how much math you throw around.

dllthomas · on May 3, 2023

> that's the definition of a rule.

And there are several example rules on the page.

> You can't escape the fact that there are only one or two samples, no matter how much math you throw around.

That depends on what question you're asking. "How well calibrated are the electoral predictions that FiveThirtyEight makes?" is a sensible question with a lot of data points. It seems to speak directly to the crowing about the one call being bad, and it seems well suited to a scoring rule, which lets you compare people making predictions about the same things.
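
To make that concrete, here is a minimal sketch of one scoring rule from that Wikipedia page, the Brier score, next to the "compare the quoted probability with the actual proportion" calibration check. The two forecasters and all the numbers are made up for illustration, and the function names `brier_score` and `calibration_table` are hypothetical, not from any library.

```python
from collections import defaultdict


def brier_score(forecasts, outcomes):
    """Mean squared difference between forecast probabilities and 0/1 outcomes.

    Lower is better; always saying 0.5 scores exactly 0.25.
    """
    if len(forecasts) != len(outcomes):
        raise ValueError("forecasts and outcomes must have the same length")
    return sum((p - o) ** 2 for p, o in zip(forecasts, outcomes)) / len(forecasts)


def calibration_table(forecasts, outcomes):
    """Map each quoted probability to (times quoted, observed frequency)."""
    buckets = defaultdict(list)
    for p, o in zip(forecasts, outcomes):
        buckets[p].append(o)
    return {p: (len(seen), sum(seen) / len(seen)) for p, seen in sorted(buckets.items())}


# Illustrative data only: 1 means the event happened, 0 means it did not.
outcomes = [1, 0, 0, 0, 1, 1, 1, 0]
forecaster_a = [0.25, 0.25, 0.25, 0.25, 0.75, 0.75, 0.75, 0.75]
forecaster_b = [0.5] * 8  # hedges on everything

print("A:", brier_score(forecaster_a, outcomes))
print("B:", brier_score(forecaster_b, outcomes))
print("A calibration:", calibration_table(forecaster_a, outcomes))
```

In this toy data forecaster A gets two individual calls "wrong" (a 25% event happened, a 75% event didn't) and still scores better than B, because the rule is applied across all the predictions rather than to one call.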