Is soccer anything more than Poisson noise?

lukev · on June 23, 2014

The article manages to be factually correct while still missing the point entirely, because the author conflates uncertainty (randomness) with pointlessness, and the ability to describe something probabilistically with reducibility to mere randomness.

When two chess grandmasters meet, the odds of the outcome may be sufficiently close to 50/50 that you could obtain a similar statistical distribution of wins/losses with a coin flip. But I think almost everyone would agree that the content of a chess match is profoundly more interesting than a coin toss.

Soccer, although I'm not a fan personally, is exactly the same. Sure, you can say "there's an X% chance that a goal shot will go in", and that's true as far as it goes. But for each particular shot, for each particular block, it was interesting because of specific human interactions. The stats were never intended to tell the whole story.

Similarly, regarding his conclusion - Sure, I think almost any fan would agree that the proxy of "winning a game" is a poor indicator for the overall skill level of a team, particularly in close matches. But that's not what matters - what matters to players and to fans is who played better, who was 2 inches more accurate on that day, at that time.

Amezarak · on June 23, 2014

The way I understood the author is that we don't really see who played better, or what the story is, in anything but very lop-sided games because the low-scoring nature of soccer means that we are constructing fictitious narratives to fit the events of the game, which are often better explained by chance.

I agree that who played better and the whole story are much more interesting, and I think the author agrees too. I believe his argument is that it's difficult to really see who played better. Maybe I'm misunderstanding, though.

lukev · on June 23, 2014

It's easy enough to see who played better if you actually watch the game. There is absolutely no reason to build a "fictitious narrative" when the real narrative is being streamed to every corner of the globe in real time.

I used to find soccer unutterably boring, until I had a roommate one time who was really into it. He showed me the proper way to watch a game... you're not sitting around waiting for goals, you're watching how they pass, how they handle the ball, the skill and athleticism.

I still find it boring, but at least I appreciate it now.

autokad · on June 23, 2014

I don't know if any league has achieved it (it seems to me none have), but leagues such as the NFL seek to have the teams completely equal. Maybe that's not their goal in the strictest sense, but I hear fans telling me they want all teams to have the same exact talent so their team has a chance to win.

if that is true in the truest sense, I do not see such a value in watching that sport. you're essentially watching coin tosses. yay the coin from my home city won, great!

programmernews · on June 23, 2014

A lot of people don't care about the content of games, just their result.

lukev · on June 23, 2014

Oh, people are infinitely varied, but I'd invite you to try taking away the average footballer's television, then consoling them that you'll tell them the score when it's over :)

programmernews · on July 2, 2014

I meant for example people who may not even know all the rules of the sport but care somewhat about how the home teams do, not the hardcore fans.

dasil003 · on June 23, 2014

The stats could be really interesting, but the whole opening salvo against sports fandom turned me off too much to continue reading. I don't know if the author believes the ridiculous stereotype that all geeks hate sports and therefore it's okay to make ridiculous unsubstantiated leaps of logic like "attention devoted to the World Cup is founded on flimsy numerology", but it certainly is not a smart thing to say for someone who prides themselves on their logic and critical thinking ability.

The randomness of a football match is no different from the randomness of every day life. People like sports because they identify with the players and their human capabilities. Every moment on the pitch there are 22 people taking action. A football match is not a sequence of random events, but rather the continual human response to a changing situation that each one can affect only in a limited way. When amazing "low-probability" events occur, it's often because of tremendous human skill that anyone who's ever tried to kick a ball can appreciate. If it was robots playing no one would care. The fact that there are upsets and freak occurrences is just another part of what keeps it interesting; absolute pinpointing of the objective "best" team is irrelevant.

timr · on June 23, 2014

That's not an "opening salvo" -- that's the thesis of the piece. His contention is that soccer, more than most other sports, is a poisson machine. He goes into the math.

"The randomness of a football match is no different from the randomness of every day life."

Well, no, actually. Some things in life are more random than others. And however skilled they may be, the large number of (fallible) people on the field playing this game certainly doesn't make it less random.

As games go, soccer doesn't have many structural impediments to randomness. We'd (hopefully) feel pretty silly if we got worked up every few years about a global coin-flipping tournament, but, here we are, getting worked up over teams of people engaged in an event where the outcome is dominated by chance.

(And oh, hey: it sort of pegs the irony-o-meter that you're accusing the author of being closed-minded about sports when you can't even be bothered to read an argument because you've decided that you disagree with it in advance. Well played.)

dasil003 · on June 23, 2014

Come on now, this entire comment is unfair, it's like you're willfully ignoring my post's actual content in order to grind your own axe:

> That's not an "opening salvo" -- that's the thesis of the piece. His contention is that soccer, more than most other sports, is a poisson machine. He goes into the math.

I don't object to the thesis, I object to the opening salvo.

> Well, no, actually. Some things in life are more random than others.

How does this refute my point? Everything in life is random to some degree or another. Football also is.

> And however skilled they may be, the large number of (fallible) people on the field playing this game certainly doesn't make it less random.

What does fallibility have to do with it? Look, my point is that it is not a roulette wheel, there are humans reacting, and that human endeavour is what's interesting to people, not the precise quantity of randomness in the result.

> As games go, soccer doesn't have many structural impediments to randomness.

It also doesn't have much impediment to strategy, tactics and individual skill drastically altering the probabilities of each individual event.

> We'd (hopefully) feel pretty silly if we got worked up every few years about a global coin-flipping tournament, but, here we are, getting worked up over teams of people engaged in an event where the outcome is dominated by chance.

Clearly football falls somewhere in between a coin-flip tournament and say 9-ball pool. But, again, randomness keeps it interesting. Remember, we are not just looking at results, we are watching players play. In individual situations players make decisions and physically control what happens. You can argue about the randomness of these events, but that just leads towards the tiresome free-will-is-an-illusion debate.

> (And oh, hey: it sort of pegs the irony-o-meter that you're accusing the author of being closed-minded about sports when you can't even be bothered to read an argument because you've decided that you disagree with it in advance. Well played.)

What an uncalled for and arrogant remark that can do nothing except derail reasonable debate; if you squint hard enough everyone is a hypocrite. I did read the argument, and I was addressing the opening presentation.

timr · on June 23, 2014

I have no axe to grind. I read the post; you said you didn't read it in the first line of your comment.

dasil003 · on June 24, 2014

Yes, and I argued against precisely the opening paragraph; I did not argue against the stats' legitimacy. It simply does not follow that because there is a great degree of randomness in sports results that sports fandom is based on numerology. Sports fandom is based on the fact that people like playing and watching sports, not reading the result in a paper having never seen the game and making grandiose conclusions about which team is better based on specious reasoning. Everyone knows the better team doesn't always win; it's self-evident. What matters are the plays that brought us to that point.

The whole thesis that sports viewing is pointless because it's random is an infuriating straw man. Why not just stick to the thesis that soccer is random and leave the value judgement aside? The exact same thing could be presented without pissing people off with an implicit value judgement of something which the author doesn't care to understand. This sort of innocent condescension is a big reason some youthfully exuberant geeks get picked on in school.

cwyers · on June 23, 2014

"His contention is that soccer, more than most other sports, is a poisson machine. He goes into the math."

No he doesn't. He shows that the final score of a soccer game can be modeled by a Poisson distribution. He doesn't look at the distribution of final scores in other sports and compare them.

icebraining · on June 23, 2014

> If it was robots playing no one would care.

Not true: http://en.wikipedia.org/wiki/RoboCup :)

dasil003 · on June 23, 2014

I rest my case.

tolmasky · on June 23, 2014

The math is interesting, but I'm not sure about the larger point he's trying to make, as all the "bad" aspects of the sport arguably make it a better sport:

1. If you want to prove that it's a waste of time to watch soccer, it's much easier to state how nothing relies on it. Trying to compare it to other time-wasters like basketball seems like a very strange exercise in subjectivity. Especially because I can easily make the competing argument: if the other sports more predictively give the expected results, then isn't it less useful to watch them? If I could predict the result of a game with probability 1.0, then it would be 100% useless to watch the game. What fun is it to watch a coin that always comes up heads?

2. It is separately well understood that games do not necessarily correlate with "expected" skill. Case in point, many tournaments switched from an everyone-plays-everyone model (that is far fairer) to a single elimination, precisely because it is noisier and thus more exciting.

Daishiman · on June 23, 2014

> 1. If you want to prove that it's a waste of time to watch soccer, it's much easier to state how nothing relies on it.

Which is not even a good point for the author, since football matches have had profound significance; they have helped prop up and destroy military regimes (such as in the history of football throughout the military dictatorships of Latin America in the 70s), led to race riots, political activism, etc. I'd say that football is much more important in that aspect than we may think at first glance.

cwyers · on June 23, 2014

What on Earth is the point of all this? Yes, there is a random element to soccer. No, the outcome is not purely random. The fact that there is a random component to the outcome does NOT mean that "the attention devoted to the World Cup is founded on flimsy numerology and might even be called a tremendous waste of time and money." And to the opening question, "is soccer anything more than Poisson noise?" Of COURSE it is.

yaeger · on June 24, 2014

>Of COURSE it is.

Yes, it is boring to watch. As outlined in contrast to other sports. Even a minute before a random goal happens, the audience has no idea it is about to happen. It is just a random back and forth across the field. With a lot of resets, as he puts it. Compared with Football where you slowly advance to the opponent's side and the audience knows when a goal becomes a possibility.

People say soccer didn't catch on in the US because you can't easily put commercials in it. I disagree. You could neatly fit entire infomercials in there and when you cut back to the action, chances are very high that the score is still the same as when they left things.

Also the double standard of being cool with games ending in a tie like a riveting 0-0 and on the other hand, at certain occasions, having to have a shootout to determine a winner is more than weird. They could have saved themselves 90 minutes by going to the shootout right away.

darksaints · on June 23, 2014

The unfortunately necessary preface: I'm a huge soccer fan, and have followed my favorite team (Benfica) since 1992 when I was introduced by my grandfather.

I think the author's mathematical assertions are correct. Soccer has an amazing amount of random noise, but is influenced in one direction or another by talent. However, my conclusion isn't that watching soccer is a waste of time, but rather that the cup format for soccer competition doesn't prove much. The season and point aggregation format makes much more sense (and it makes much more sense in any form of low scoring / high variability sport, such as baseball). Because of the variability, any sort of win-to-advance behavior can heavily skew the entire competition towards those with early luck.

metacorrector · on June 23, 2014

Thank you for having an open mind with regard to a sport you are a fan of; many fans can't do this. I am as much a soccer fan as the next avg American is.

I think I agree with the article, not because I read the article, but because I arrived at the same conclusion myself.

Basketball and soccer are very similar sports, to the extent that we can say that they are mathematically "the same" sport in the same sense that all op amp circuits are the same, simply with differing amounts of time delayed positive and negative feedback.

I think what the author is trying to prove is what I came up with intuitively: imagine soccer without goalkeepers, or soccer with a larger net; that game would be very similar to current soccer (in terms of gameplay), but it would have much higher scores. Along with the higher scores would come, in my belief, a greater sense on the part of the fan that "the best team won". There would still be excitement, there would still be upsets, but the upsets would be based more on a cinderella team summoning their (admirable cheerable) "will to win" (or the champions failing due to hubris, cf tortoise and hare) and less due to statistical noise.

The sense I get watching soccer is, this is a good game with skill and athleticism, but something could be done to ensure that the best team wins more often; not that the outcomes would be more predictable, but that the outcomes would be more satisfying. And also, with fewer tie breakers.

ninguem2 · on June 23, 2014

Looking only at one World Cup, you are looking at something that can be hugely influenced by chance, yes. But if you look at the history of World Cups, patterns emerge and it's very clear which countries are better. So think of the World Cup as one round of an extended competition.

metacorrector · on June 23, 2014

sure, but as an analogy, can you see tic-tac-toe not as an obvious tie but as one round in an epic game of attrition? Would it be fun to see who could win a continuous stream of tic tac toe games, games played end to end for 24, 36, 48 hrs, maybe with no breaks for food? Cage Tic Tac Toe, where someone actually does start losing due to their weaker constitution?

While that could be imagined to be a fun sport, just as the soccer you describe is a fun sport, perhaps there is a way to play with some other rules that achieve more fun for the fan, for the player, and which give a sense that you know who is the best team in this tournament, rather than who comes from the country with the better long term immigration policy with respect to the game of tic tac toe?

ninguem2 · on June 23, 2014

I've actually been enjoying following the World Cup for a few decades now.

3pt14159 · on June 23, 2014

By design games that are popular end up having a certain amount of random noise in them, otherwise they would be terrible to watch. Furthermore, teams are run by rational actors who have seen fit to put millions of dollars into getting slightly better players, which seems to suggest that there is a certain amount of talent involved in winning the game.

So his conclusion that a 3:2 win means that there is only a 5:8 chance that a team is better makes perfect sense to me. Popular sports are fun to watch because they are close.

PhasmaFelis · on June 23, 2014

The xkcd comic is funny, but it's not a very good insult, since it applies not just to sports commentary but to essentially all human experience.

ajuc · on June 23, 2014

Everything runs on quantum mechanics, which is a lot of weighted random number generators. So yeah.

mturmon · on June 23, 2014

It turns out that others have noticed the Poissonian characteristics of soccer scores, and taken the idea farther. See:

http://arxiv.org/pdf/1002.0797.pdf

http://www.plosone.org/article/info%3Adoi%2F10.1371%2Fjourna...

Someone · on June 23, 2014

By that logic, chess is a Poisson process, too.

However, I think there isn't really any logic here. A claim that something is a Poisson process should be followed by some statistical test. I wouldn't know which one, because we do not know the various probabilities, and because they change over time (for example, P(Spain:whoever) seems to have dropped quite a bit recently, and nobody knows when that happened and by how much), it will be hard to estimate them.

But a statistical test still is needed to make any kind of claim of something being a Poisson process. I guess that, if you posed a model for the way in which such probabilities change, you might be able to use some Elo-like system to estimate those probabilities and, from it, do some test for Poisson distribution. I fear such a model might have so many degrees of freedom that it is too weak to prove anything. At the very least, it would be hard math to wring anything out of such a model.

agarden · on June 23, 2014

Elo ratings are available: https://en.wikipedia.org/wiki/World_Football_Elo_Ratings

tinco · on June 23, 2014

He does a lot of math and fancy talk, only to conclude that when a match ends in 3:2, it's not very conclusive who the better team was? No shit sherlock, it doesn't take much of a genius to figure that out.

It would be more interesting if he'd actually take the time to do it proper and calculate what the chance is that there is a team outside the top 3 that is actually a better team (whatever that means) than the world cup winner. I'd be surprised if that number was very high.

Every soccer fan knows you need a healthy dose of luck to win a match, but every soccer fan also knows that you need a hell of a lot more luck when you play against a team that's known to be better than yours, and when you get lucky very often, maybe that's just a sign that your team is better than you thought it was.

bkcooper · on June 23, 2014

I agree with the overall tenor of the comments: this wasn't a very good article. Much of what he's saying is well known to quantitatively minded fans of sport, and the presumption that the game has failed if it doesn't identify the better team with near certainty is silly.

However, I've liked that blog for a while and think there's a lot of interesting stuff on there, for example this conversation between an economist and a physicist:

http://physics.ucsd.edu/do-the-math/2012/04/economist-meets-...

metacorrector · on June 23, 2014

he may have said it wrong, but you seem more wrong: it is not silly that games should be designed to identify who is the better player.

I think you are confusing "the best team winning" with everybody knowing who is going to win a priori.

Sprinting, marathoning, horse racing, downhill skiing, etc., they all determine pretty clearly who won, and they do a good job to varying degrees of seeming fair. That doesn't mean you know who will win. And when it happens, as for example with skiing under certain conditions, that later runs are disadvantaged compared to earlier runs, that can vary the outcome, but not in a satisfying way.

Soccer could be a better game, but I get it, you don't like change.

bkcooper · on June 23, 2014

> I think you are confusing "the best team winning" with everybody knowing who is going to win a priori.

If the game goes to the best player with near certainty, then unless there's a lot of noise in determining the best player beforehand, you will have comparable certainty about who will win.

> Sprinting, marathoning, horse racing, downhill skiing, etc., they all determine pretty clearly who won, and they do a good job to varying degrees of seeming fair.

Determining clearly who won is not the issue --- it is, after all, very easy to tell who won a soccer game. What is at issue is determining who is best. I don't find these examples convincing in that regard. We could do a better job at identifying the top sprinter by looking at performance over multiple races, instead of one race; this would make us less sensitive to things like stumbles, bad starts, etc. I don't think that doing things in this way (Olympic best of 15 100m contests!) would actually be more exciting, though.

> Soccer could be a better game, but I get it, you don't like change.

Um, ok?

metacorrector · on June 23, 2014

> If the game goes to the best player with near certainty, then unless there's a lot of noise ...

that's what the guy is saying, that in soccer, the best team does not win with near certainty. You understand now.

grayclhn · on June 23, 2014

Two thoughts:

1) this is a somewhat bizarre exercise. It seems like it's more relevant to work out the probability that the better team wins, for empirically relevant values of what the author's calling "expectation value" --- i.e., give team A an expected scoring rate of (say), 2.0 goals per 90 minutes and give team B a rate of 1.9 goals per 90 minutes, then see how likely each team is to win. (R code, based on 10,000 simulations because I'm lazy and don't feel like working out the probabilities by hand:

    > mean(rpois(10000, 2) - rpois(10000, 1.9) > 0)
    [1] 0.42
    > mean(rpois(10000, 2) - rpois(10000, 1.9) < 0)
    [1] 0.38
    > mean(rpois(10000, 2) - rpois(10000, 1.9) == 0)
    [1] 0.20

In words: the better team wins 42% of the time, the worse team 38%, and they tie 20%;)

The numbers 2.0 and 1.9 were made up, and the disadvantage of this approach is that you're required to get some data and estimate reasonable values of each team's scoring rate, or make them up. I should note here that OP has to do the same thing; the line

    "We can turn the Poisson distribution around, and ask: if a team scores N
    points, what is the probability (or more technically correct, the probability
    density) that the underlying expectation value is X?"

is nonsense without a Bayesian interpretation, which requires a prior density (which is the equivalent of making up numbers like the 2.0 and the 1.9 I used above, and weighting them by how likely you believe they are). Note that I could put a dogmatic prior on 2.0 and 1.9---a point mass for each team at one of the points---which would make me believe that the "expectation values" are 2.0 and 1.9 regardless of the actual score. Clearly that would be a bad prior, but it's not clear that the implicit prior used in the article is a good one.

2) More important: teams sit on leads, so the probabilities aren't constant. A team that outclasses another team probably won't win by 19-0, because at 8-0 they'll focus less on scoring and more on playing defense and avoiding injury. The same concept applies less dramatically with a 4-1 win, a 3-2 win, etc. Modeling strategy is much harder, so I'm not going to provide R code :). But one effect is that a Poisson process will probably predict too many extreme scores.

That said, the OP's thesis is self-evidently true: "My thesis is that soccer is an amalgam of random processes whose net effect produces rare events—those more-or-less unpredictable events spread more-or-less uniformly in time." It's a game with about 3-5 goals scored in 90 minutes. They're rare!
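Both halves of point 1) can also be worked out exactly rather than simulated. The following is a minimal Python sketch, not the article's method. The first function sums the joint Poisson probabilities for the made-up rates 2.0 and 1.9. The second assumes a flat (improper) prior on each team's scoring rate, which is only one possible choice of prior; under that assumption a team that scored n goals has a Gamma(n + 1, 1) posterior, and the chance that the higher-scoring team really has the higher rate reduces to a binomial tail. The function names and the `max_goals` cutoff are illustrative.

```python
from math import comb, exp, lgamma, log


def poisson_pmf(k, rate):
    """P(X = k) for X ~ Poisson(rate), evaluated in log space for stability."""
    return exp(-rate + k * log(rate) - lgamma(k + 1))


def match_outcome_probs(rate_a, rate_b, max_goals=60):
    """Return (P(A wins), P(B wins), P(tie)) for independent Poisson scores.

    max_goals truncates the infinite sums; for rates of a few goals per
    match the neglected tail is negligible.
    """
    pmf_a = [poisson_pmf(k, rate_a) for k in range(max_goals + 1)]
    pmf_b = [poisson_pmf(k, rate_b) for k in range(max_goals + 1)]
    a_wins = b_wins = tie = 0.0
    for goals_a, prob_a in enumerate(pmf_a):
        for goals_b, prob_b in enumerate(pmf_b):
            joint = prob_a * prob_b
            if goals_a > goals_b:
                a_wins += joint
            elif goals_a < goals_b:
                b_wins += joint
            else:
                tie += joint
    return a_wins, b_wins, tie


def prob_a_better_flat_prior(goals_a, goals_b):
    """P(rate_A > rate_B | scores), ASSUMING a flat prior on both rates.

    With posteriors Gamma(goals_a + 1) and Gamma(goals_b + 1), the event
    rate_A > rate_B is a Beta(goals_b + 1, goals_a + 1) variable falling
    below 1/2, which equals a fair-coin binomial tail probability.
    """
    n = goals_a + goals_b + 1
    favourable = sum(comb(n, j) for j in range(goals_b + 1, n + 1))
    return favourable / 2 ** n


a_wins, b_wins, tie = match_outcome_probs(2.0, 1.9)  # made-up rates from point 1)
print(f"A wins {a_wins:.3f}, B wins {b_wins:.3f}, tie {tie:.3f}")
print(prob_a_better_flat_prior(3, 2))  # 0.65625 for a 3-2 result
```

The exact outcome probabilities should land close to the simulated 42% / 38% / 20%. The second number depends entirely on the prior: a dogmatic point-mass prior like the one described above would return 1 (or 0) whatever the score was.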

shkkmo · on June 23, 2014

The author seems to have missed the point of the "Let's use them to build Narratives" line.

The World Cup is not popular DESPITE having a fairly random outcome. The World Cup is popular BECAUSE it has a fairly random outcome.

The point of the World Cup is not to find the best soccer team. The point of the World Cup is to build a strong shared narrative.

In this case, much of the strength of the narrative being built comes from the integration of unexpected events. More of the strength comes from the interaction between the narratives of the stadium audience, and the live team as random events occur.

Perhaps this is why introducing random events into tabletop RPGs is effective.

vdaniuk · on June 23, 2014

Many posters here criticize the anti-sports bias, but perhaps this bias is useful to the individual and to humanity?

Popular games became popular because of the path-dependent random development some time ago, should we continue on with the status quo?

Or should we try to engineer a new type of game with massive popularity that would be more beneficial to the society and the players and the watchers?

I guess the latter. After all, the popularity of football, soccer, and hockey is based on marketing, and they are successful in large part because of the huge switching costs.

jdmichal · on June 23, 2014

I was under the impression that soccer was popular because all you need is a ball and a pitch. Low cost of entry leads to high engagement.

gd1 · on June 23, 2014

Strange argument. How does modern soccer (quite attacking these days) compare to the number of goals per game in ice hockey, touchdowns per game in the NFL, or home runs in a baseball game?

mturmon · on June 23, 2014

This is addressed in the article, but I think it's worth expanding on, because the article is (I admit) unclear and imprecise.

NFL gameplay has a state variable (field position) that strongly affects score probability. This state variable accumulates over long periods of time in the game, and is thus influenced by skill.

On the other hand (the claim goes), scoring potential in soccer is mostly affected by possession of the ball, which changes frequently, and there isn't a persistent state.

The effect is that, with each possession, there is a small chance of Team A scoring. The ball passes to Team B, and then back to Team A. Team A then has another shot at scoring, which (due to lack of persistent state) is largely independent of its earlier chance.

To be more definite, the final score of Team A is:

  S = C1 + C2 + ... + CN

where the Ci's are almost statistically independent, 0/1 random variables, with P(Ci = 1) rather low. Each Ci indicates a score on a given ball possession. This is a situation where the Poisson limit (http://en.wikipedia.org/wiki/Poisson_limit_theorem) is applicable.
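As a rough illustration of that limit, the sketch below compares the exact distribution of S when the Ci's are fully independent (a binomial) with the Poisson distribution of the same mean. The values N = 50 possessions and P(Ci = 1) = 0.04 are made-up numbers chosen only to give about two goals per match; they are not taken from the article or from data.

```python
from math import comb, exp, factorial


def binomial_pmf(k, n, p):
    """P(S = k) when S is a sum of n independent 0/1 variables with P(1) = p."""
    return comb(n, k) * p ** k * (1 - p) ** (n - k)


def poisson_pmf(k, rate):
    """P(S = k) for S ~ Poisson(rate)."""
    return exp(-rate) * rate ** k / factorial(k)


n_possessions = 50  # hypothetical N
p_goal = 0.04       # hypothetical P(Ci = 1)
rate = n_possessions * p_goal

for k in range(7):
    print(k, round(binomial_pmf(k, n_possessions, p_goal), 4),
          round(poisson_pmf(k, rate), 4))

# Total variation distance between the two distributions. The binomial has
# no mass above n_possessions, so the Poisson tail beyond it counts in full.
body = sum(abs(binomial_pmf(k, n_possessions, p_goal) - poisson_pmf(k, rate))
           for k in range(n_possessions + 1))
poisson_tail = max(0.0, 1.0 - sum(poisson_pmf(k, rate)
                                  for k in range(n_possessions + 1)))
total_variation = 0.5 * (body + poisson_tail)

# Le Cam's inequality bounds this distance by the sum of the squared
# per-possession probabilities, here N * p^2.
print(total_variation, "<=", n_possessions * p_goal ** 2)
```

The bound shrinks as the per-possession probability gets smaller, which is the sense in which "P(Ci = 1) rather low" matters. The sketch says nothing about how close to independent real possessions are.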

The NFL situation does not decouple this way, because the Ci's are not independent, due to the field position issue. The corresponding state variables with baseball are balls-strikes and players-on-base.

If you believe in the Poisson model, then Poisson model + low counts is an unfavorable regime to determine if Team A's score-probability is less than Team B's. On the other hand, if it's high counts (i.e., S is large) then it's easy to tell. This validates a commenter nearby who says he thinks season-wide scores provide more insight than tournaments.
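To put rough numbers on the low-count versus high-count point, here is a small Python sketch under the Poisson model. It uses the fact that a sum of independent Poisson counts is again Poisson, so goals aggregated over many matches can be handled with the same formula as a single match. The per-match rates (2.0 and 1.9) and the 38-match season are hypothetical values picked for illustration.

```python
from math import exp, lgamma, log, sqrt


def poisson_pmf(k, rate):
    """P(X = k) for X ~ Poisson(rate), evaluated in log space for stability."""
    return exp(-rate + k * log(rate) - lgamma(k + 1))


def prob_a_outscores_b(rate_a, rate_b):
    """P(A > B) for independent A ~ Poisson(rate_a), B ~ Poisson(rate_b)."""
    top = max(rate_a, rate_b)
    max_count = int(top + 12 * sqrt(top)) + 30  # far into the upper tail
    total = 0.0
    cdf_b = 0.0  # running value of P(B <= count - 1)
    for count in range(max_count + 1):
        total += poisson_pmf(count, rate_a) * cdf_b
        cdf_b += poisson_pmf(count, rate_b)
    return total


rate_a, rate_b = 2.0, 1.9  # hypothetical goals per match
matches = 38               # hypothetical season length

single_match = prob_a_outscores_b(rate_a, rate_b)
whole_season = prob_a_outscores_b(matches * rate_a, matches * rate_b)
print(f"better team outscores in one match: {single_match:.3f}")
print(f"better team outscores over {matches} matches: {whole_season:.3f}")
```

The aggregate favours the stronger side more often than a single match does, but how decisive it gets depends on how far apart the two rates are; with rates this close, even a full season leaves plenty of room for the weaker side to come out ahead.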

As you mentioned, hockey would seem to be another good parallel to soccer (I think). It was smart to notice that.

Incidentally, I don't care one way or the other about any of these sports, but I think the probabilistic analysis is interesting.

kybernetikos · on June 23, 2014

The discussion of the 'state variable' was interesting. However, it seems as if your model would give a wrong result for a game where one side had near 100% possession (and so relatively few possession changes). In particular, being stronger at maintaining possession often leads to multiple attempts on the goal (because of the rules about corners). Perhaps that model would be more applicable for basketball?

The other strange thing about this whole discussion is that it seems to suggest that you could field a team of random people and have a nonzero (within normal human experience) chance of besting the top team in the world. This is so false it's laughable. Even in top tier play, where the teams are all closer together, there are predictable differences in skill and teamwork that ensure some teams would almost never win against particular other teams.

If the Poisson noise theory is correct, I would hope it could lead to specific predictions at odds with the way current professional bookmakers evaluate teams' chances, and could therefore lead to a quick and easy way for this author to put his money where his mouth is.

mturmon · on June 23, 2014

The notion of "possession" plays a very strong role. As noted by other people on this thread, there is strategy in sitting on the ball once you pull ahead. In the light of the simple model above, this can be seen as controlling "N", the number of possessions, rather than any single Ci (goal). The model above does not allow for such an "N" (i.e., where "N" is a function of the partial sum C1 + ... + CM). It basically assumes possession trades back and forth a significant number of times, independently of other stuff.

"team of random people...so false it's laughable..." -- Absolutely true. This fallacy is implicit in some of the language in the OP, which, incidentally, seems chosen to goad fans.

But I don't think it's a problem for the model. Basically, there are two per-possession goal probabilities when team A plays team B (call them G_AB and G_BA; they are between 0 and 1).

These numbers are a function of both A and B, because if Team A is really bad, and B is good, then G_AB ("chance A scores on B") will be really low. But if C is just as bad as A, then G_AC will be moderate.

The expected points scored by A is N * G_AB, and by B, N * G_BA. The model allows G_AB << G_BA, and indeed G_BA ~= 1, so then it predicts B will almost always win the game.

Note that the Poisson limit will not apply if G is not pretty low (say, less than 0.1), and you need N moderately high (say, bigger than 10) so that N * G will itself be moderate.
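A quick way to see what the model says about a mismatch is to plug in some numbers. The sketch below treats each team's score as Binomial(N, G) directly, so it does not rely on the Poisson limit at all. N = 20 possessions each, G_AB = 0.005 and G_BA = 0.25 are invented values for a very weak A against a very strong B; the function names are illustrative.

```python
from math import comb


def binomial_pmf(k, n, p):
    """P(score = k) from n independent possessions, each scoring with prob p."""
    return comb(n, k) * p ** k * (1 - p) ** (n - k)


def mismatch_outcome_probs(n, g_ab, g_ba):
    """Return (P(A wins), P(B wins), P(tie)) under the per-possession model."""
    pmf_a = [binomial_pmf(k, n, g_ab) for k in range(n + 1)]
    pmf_b = [binomial_pmf(k, n, g_ba) for k in range(n + 1)]
    a_wins = b_wins = tie = 0.0
    for goals_a, prob_a in enumerate(pmf_a):
        for goals_b, prob_b in enumerate(pmf_b):
            joint = prob_a * prob_b
            if goals_a > goals_b:
                a_wins += joint
            elif goals_a < goals_b:
                b_wins += joint
            else:
                tie += joint
    return a_wins, b_wins, tie


n_possessions = 20  # hypothetical N
g_ab = 0.005        # hypothetical chance A scores on B per possession
g_ba = 0.25         # hypothetical chance B scores on A per possession

print("expected goals:", n_possessions * g_ab, "vs", n_possessions * g_ba)
a_wins, b_wins, tie = mismatch_outcome_probs(n_possessions, g_ab, g_ba)
print(f"A wins {a_wins:.4f}, B wins {b_wins:.4f}, tie {tie:.4f}")
```

With probabilities this lopsided, B's win probability comes out very close to 1, so the model is consistent with random people essentially never beating a top side.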

*

The issue (not well articulated in TFA) is that when both Poisson variables are expected to have low counts, it's difficult to distinguish slightly-different per-possession probabilities.

kybernetikos · on June 23, 2014

When one team is dominating another, they get multiple attempts to score for a single possession event (powerful shots on target are commonly redirected by the keeper which allows the attacking team another chance from a corner or rebound), and their probability of scoring from any shot increases because their control of play allows them to take more certain shots. And of course, brief possession in a team's own half does not really translate into any kind of scoring chance for them.

Of course I'm not sure that those factors are enough to completely sink the model, but there certainly are such factors in play.

> The issue (not well articulated in TFA) is that when both Poisson variables are expected to have low counts, it's difficult to distinguish slightly-different per-possession probabilities.

I take this to mean that in a match up between broadly comparable teams, the outcome will be indistinguishable from chance, which most fans would accept as completely fine (and possibly even preferable). I still find it hard to reconcile this with the dominance of a few teams over quite long periods of time in international soccer, even in the presence of relatively low scores although I suppose I would have to do some actual number crunching to tell whether this was relevant or not.

gd1 · on June 23, 2014

Can't say I agree with that. Soccer has state too. Just because they don't pause the game after every completed pass for a few adverts, a timeout, and 12 replays doesn't mean it doesn't have state.

mturmon · on June 23, 2014

OK, if we define one interval as a ball possession (one of the Ci's above), then what factors from C(i) influence C(i+1), and how strong is that influence?

The question is not "does it have state", it's "how much does it have relative to other sports mentioned".

Incidentally, I notice you're digging at NFL ("a few adverts, ..."). This is off the mark for two reasons. First, this is not about whether NFL or soccer is better. But also, pauses in the game don't affect the argument. It's typically very hard to move the ball way downfield in NFL, but (seems) pretty easy in soccer. This gives rise to a state variable that persists across drives in NFL gameplay. Basically, every NFL yard is a pseudo-point, and they add up in a fine-grained way.

secstate · on June 24, 2014

Yes, but a team with tons of yardage has no promise of putting it in the endzone, or even field goals. It's the same issue as with possession. The Netherlands just dominated Chile in their last pool match today, but they had 34% possession.

As long as I'm commenting I might as well also complain about how silly and reductionist the whole OP is. I was a nerd long before I followed sports, so my allegiance is in data organization and mathematics. But anyone who has set foot on a field in an athletic endeavour where they, as part of a team, had something to win, doesn't really give a shit for your mathematical odds. Sometimes I fear for nerds who don't appreciate the spirit and gift of being human. And especially for the magic in fleeting moments of exhilaration that probably don't matter at all in the life cycle of your friendly neighbourhood quasar.

taliesinb · on June 23, 2014

It was linked in the comments of this post, but the blog post of my colleague uses machine learning to predict the outcome of knockout matches with some 70% accuracy: http://blog.wolfram.com/2014/06/20/predicting-who-will-win-t...

jedberg · on June 23, 2014

No sports have enough data to say that "the best team won". There just simply aren't enough matches.

But most people's enjoyment of sports comes from watching the execution, not the stats. The stats are just icing.

Yes, there are people who get enjoyment just from the stats (baseball is notorious for this), but for the most part the stats are just an interesting side show for the main event.

ownagefool · on June 23, 2014

Actually, all the sports I know of have a pretty simple method of dictating who the better team on the day was. Otherwise, how would we know who won?

Still, for many the debate is part of the fun. :)

taeric · on June 23, 2014

I think the assertion is that "who won the game" is not, strictly speaking, the same as "who is the better team."

That is, it could be akin to saying that "this particular coin shows heads today" because that is what the last flip did. Definitely true, but does not tell you anything about the nature of the coin.

ownagefool · on June 24, 2014

When the object of the game is to win, you can pretty much determine the better team by result. You might not always agree with how they achieve it, but you can't really argue results.

That said, I was careful to say 'On the day'. Typically, it's pretty poor sportsmanship that sees most of us arguing that the better team lost.

taeric · on June 24, 2014

But can you? Imagine if your goal was to find which coin is the biased one, but you could only flip once a week.

Though, I should be clear that I doubt the better team loses most of the time. More that I just think the static view where there is a clear "better" team is flawed anyway.
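For a sense of scale on the once-a-week coin, here is a small sketch. It assumes a coin with a 55% chance of heads (a made-up bias) and asks how often a simple majority of n flips actually points toward heads.

```python
from math import comb

def prob_majority_heads(n, p):
    # Exact probability that more than half of n independent flips are heads,
    # where each flip lands heads with probability p. Use odd n to avoid ties.
    return sum(comb(n, k) * p**k * (1 - p)**(n - k)
               for k in range(n // 2 + 1, n + 1))

# Hypothetical bias of 0.55, flipped once a week for various numbers of weeks.
for weeks in (1, 11, 51):
    print(weeks, round(prob_majority_heads(weeks, 0.55), 3))
```

Even with close to a year of weekly flips, the majority still points the wrong way a fair fraction of the time, so "you can only flip once a week" really does limit what you can conclude about the coin.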

taejo · on June 23, 2014

Consider a sport Chess--, which is very similar to chess, except that before making each move, the player rolls a 100-sided die, and if it comes up 100 they instantly win. If Magnus Carlsen plays Chess-- against me, and is one move away from checkmate when I throw the 100, did the best player win?

ownagefool · on June 24, 2014

Not so much Chess, poker would be more apt. We then veer into the territory of deciding what is and isn't a sport.

smackfu · on June 23, 2014

Baseball is probably closest, where you play 19 games per season against each of the teams in your own division.

Too bad they don't use the same pitchers in every game, since that adds two more significant variables.

pnathan · on June 23, 2014

I think that Cricket Test matches (which take place over 5 days) also seem to stress team capability fairly well. But I'm not a cricket expert (I would need to be introduced by someone knowledgeable), so I may be quite wrong!

cja · on June 23, 2014

I might be completely missing the point (haven't studied probability since 1998) but go on then, tell me who's going to win the World Cup. Or just Croatia vs Mexico, which is starting now.

Unless you can do that fairly accurately then I don't see why you're picking on football. Surely life is just random events, some executed better than others!

notahacker · on June 23, 2014

This has got to be parody, right?

Assuming that Team A can meaningfully be assigned a prior score for the expected number of goals against Team B (and, independently, vice versa), and assuming without reference to any evidence that the number of goals they actually score over a period is independent from the number of goals the other side score, and assuming the outcome of the match is determined by a random number generator along a probability distribution based on said statistical priors, and not by a bunch of quick-witted athletes and a ball... you get a set of results with quite a high variance.

Any fan who struggles with basic arithmetic will tell you it's an interesting sport precisely because teams with significant disadvantages have a non-trivial chance of achieving a result.

But no, that's not important because if you assume things that are palpably untrue, like there being no indications of one team being more likely to score next from general play, and scoring attempts being a matter of probability rather than ingenuity, skill and physical effort... then it would be a bit like watching a random number generator.

Admittedly, he doesn't follow soccer.

beachstartup · on June 23, 2014

the very clear anti-sports bias and rhetoric (how very original...) prevented me from finishing the article.

however, i'll just say this: a team of amateur soccer players will lose to professionals every single time. this isn't throwing dice or drawing cards. both teams have to be VERY good at soccer to reduce the outcome of a pro match to anywhere near "random".

like in anything else, it's only when skills are evenly matched that the outcome of a game can be influenced by small variables.

devindotcom · on June 23, 2014

"My thesis is that soccer is an amalgam of random processes whose net effect produces rare events—those more-or-less unpredictable events spread more-or-less uniformly in time."

What? What proportion of events in soccer is he proposing are random? This seems a very poor start for any kind of examination of sport whatsoever.

hderms · on June 23, 2014

It's disingenuous but he assumed it because he wouldn't be able to cram it into a statistical model as easily otherwise

dang · on June 23, 2014

The submitted title was "Attention devoted to the World Cup is founded on flimsy numerology", which does appear in the post. But to make it less linkbaity, we changed it to the question that appears in the first paragraph and describes the article more neutrally.

myrmidon · on June 23, 2014

Thanks! Transparent moderation is always great, but I feel that de-baiting links is really important and generally underappreciated... Keep up the good work!

Dewie · on June 23, 2014

> It’s a bit off-topic for the series, but I can’t even go to Google now without being reminded of the World Cup and soccer this, soccer that.

An American complaining about content not really geared towards his culture on the English-speaking Web? That's rich.

wmil · on June 23, 2014

I am surprised that after all of the years of tracking, Google still hasn't realized I don't care about soccer. It seems like big brother is phoning it in.

yaeger · on June 24, 2014

I have switched homepages to google.com away from my local google homepage as that one does not bore me with yet another damn soccer doodle.