Hacker News
A Math Model Is Predicting the Ebola Outbreak (vice.com)
62 points by mikeleeorg on Oct 15, 2014 | 21 comments


I think these models are extremely suspect, with nil predictive value. Where you have sustained exponential growth dynamics, as here, the results are fantastically sensitive to the parameters which go in the exponent, which in this case you can't predict with enough accuracy.

The simplest ODE is something like:

du/dt = k(t) * u(t)

Whose solution locally looks like (for slow-varying k(t))

u(t) ∝ e^{ k(t) * t }

One possibility is k(t) > 0 sustained for 10 effective doubling times. One possibility is k(t) < 0. In this epidemic, either is plausible (?): the difference is only a small difference in some infection control parameters. A small uncertainty that translates into a factor of a thousand uncertainty in the outcome, because it gets blown up by a gigantic e^x.
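To put rough numbers on that, here is a minimal sketch assuming a constant k over the whole window. The 20-day doubling time is an illustrative assumption, not a value taken from any of the models discussed; `growth_factor` is a hypothetical helper name.

```python
import math

def growth_factor(k, t):
    """Return u(t)/u(0) for constant rate k, i.e. the solution of du/dt = k*u."""
    return math.exp(k * t)

# Assumed illustration values (not from the CDC spreadsheet).
doubling_time = 20.0                   # days, hypothetical
k_grow = math.log(2) / doubling_time   # rate that doubles u every 20 days
horizon = 10 * doubling_time           # ten effective doubling times

factor_up = growth_factor(k_grow, horizon)     # sustained growth: about 2**10
factor_flat = growth_factor(0.0, horizon)      # critical case k = 0: no change
factor_down = growth_factor(-k_grow, horizon)  # sustained decline: about 2**-10

print(factor_up, factor_flat, factor_down)
```

Under these assumptions the three scenarios differ by a factor of about a thousand in each direction, even though k only moves by a few percent per day.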

And the estimates of k(t) seem to hover around the critical value of k(t) ~ 0. In the CDC model (their Excel spreadsheet is open source [0], and FYI won't import into LibreOffice), the shape of the epidemic is purely determined by the shape of their k(t) assumptions. Their default parameters exhibit, first, a fast growth phase, assuming k(t) > 0; then they assume a slight reduction to k(t) ~ 0, leading to slower growth; then slightly more reduction to k(t) < 0, causing the epidemic to halt. As far as I can tell k(t) is basically speculative, and completely determines the shape of u(t). So really you predict nothing.

[0] http://stacks.cdc.gov/view/cdc/24900
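A rough sketch of the three-phase picture described above, assuming a piecewise-constant k(t). This is not the CDC spreadsheet's actual calculation; the phase lengths and rates are hypothetical, and `simulate` is an illustrative helper.

```python
import math

def simulate(k_schedule, u0=1.0, dt=1.0):
    """Integrate du/dt = k(t)*u for piecewise-constant k.

    k_schedule is a list of (duration_days, k) pairs.
    Returns the list of u values, one per time step, starting with u0.
    """
    u = u0
    series = [u]
    for duration, k in k_schedule:
        steps = int(round(duration / dt))
        for _ in range(steps):
            u *= math.exp(k * dt)  # exact update while k is constant over dt
            series.append(u)
    return series

# Hypothetical schedule: fast growth, near-critical growth, then decline.
schedule = [(60, 0.05), (60, 0.005), (60, -0.03)]
cases = simulate(schedule)

peak_day = cases.index(max(cases))
print(peak_day, cases[peak_day], cases[-1])
```

The output curve rises, flattens, and falls exactly where the schedule says it should, which is the point: the shape of u(t) is an input here, not a prediction.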


Well put.

A 'predictive model' should say 'if you do X then you will end up with Y' - and the X cannot be adjusting some number. The X has to be stuff like 'building ETU's in West Africa', or 'canceling all flights',... A predictive model should be able to say 'don't bother canceling flights, it's no use - instead do this...'.

Note: I'm going to be teaching a course in Erlang programming next month where the homework assignment is epidemic modeling - 100K's of concurrent processes moving around, getting exposed to each other - that sort of stuff. Rather ambitious, but I feel it's a good use case for Erlang - a pandemic is a 'viral' chat application.


> A 'predictive model' should say 'if you do X then you will end up with Y' - and the X cannot be adjusting some number. The X has to be stuff like 'building ETU's in West Africa', or 'canceling all flights',... A predictive model should be able to say 'don't bother canceling flights, it's no use - instead do this...'.

This is just wrong. A predictive model does not necessarily have any "action" input. Example: weather forecast.


Is the course you're teaching going to be a MOOC, or will the assignments be posted publicly? I'd love to try that assignment myself.


Course is classroom format. I will post the assignment on GitHub.


I have not had a chance to review these models yet. Since you have, I ask:

How does the model account for isolation capacity constraints?

In other words, hospitals have a finite capacity to deal with these kinds of patients.

How does the model account for infection and mortality within the medical community?

How does the model account for handling and/or procedural errors?

How does the model account for the threshold beyond which N percent of medical professionals will refuse to treat patients?

Does that model estimate this threshold? This is pure conjecture on my part, but I am guessing that if four or five nurses and doctors fall ill or, worse, die, it could trigger a really difficult scenario to deal with.

Does the model account for some of the tens of thousands of afflicted in West Africa travelling outside of the hot zone while not showing symptoms and then infecting general populations during the initial phase of their sickness?

Does the model assume a tolerance range for various parameters due to mutation or other factors, for example rate of reproduction, ease of communication, etc.?


haha i can save you some time. there are two models. the first is hypersimplistic, the reciprocal of a decaying exponential with four parameters and zero inflection points (although there's one visible in the data so far)... it does not even come close to addressing your questions.

the second model at least accounts for decay - gee - but is still essentially two terms with three parameters- the result is two competing decay terms.

This is, essentially, a trivial joke. It has no delay parameters for incubation, etc. I've only skimmed the article, but I can't find an actual attribution of these numbers to the model (the model itself is from a journal). It seriously appears like an editor said, "Let's get some scary-mathy-lookin stuff up there on Ebola."

if it bleeds...


"Let's get some scary-mathy-lookin stuff up there on Ebola"

That was funny. Yup, math is scary to a lot of people. Sad statement.


The CDC spreadsheet is supposed to model a single West African country, and doesn't explicitly model any of those things.

It doesn't account for finite medical capacity. It assumes a specified fraction of patients are isolated in hospitals at a given time, a fraction that increases, regardless of the size of the epidemic. I think the burden is on users to check if the results make any sense (it does give you the # of beds in use at a time, so you get that feedback).


I wish we put more effort into gathering data on what is really happening on the ground rather than modelling almost certainly wrong data. I guess on the upside the chance of catching Ebola is much lower doing studies like this.


My comment was more about forward-prediction, but you're right.


This is really, really weak sauce. The graph shows the model following the trend line for reported cases, which already looks to be leveling off.

The problem is, as the WHO has made very, very clear: "Problems with data gathering in Liberia continue. It should be emphasized that the reported fall in the number of new cases in Liberia over the past three weeks is unlikely to be genuine. Rather, it reflects a deterioration in the ability of overwhelmed responders to record accurate epidemiological data. It is clear from field reports and first responders that EVD cases are being under-reported from several key locations, and laboratory data that have not yet been integrated into official estimates indicate an increase in the number of new cases in Liberia. There is no evidence that the EVD epidemic in West Africa is being brought under control ...

Evidence obtained from responders and laboratory staff in [Liberia] indicates beyond doubt that there is widespread under-reporting of new cases, and that the situation in Liberia, and in Monrovia in particular, continues to deteriorate from week to week. Approximately 200 new probable and suspected cases, but very few confirmed cases, have been reported in the capital Monrovia in each of the past three weeks. A substantial proportion of these suspected cases are most probably genuine cases of EVD, and the reported fall in confirmed cases over the past three weeks reflects delays in matching laboratory results with clinical surveillance data. Efforts continue to urgently address problems with data acquisition in what is an extremely challenging environment, and it is likely that the figures will be revised upwards in due course."

In other words, awesome job building a model that fits garbage data.


Incredible Accuracy? You mean retrospectively fitting a logistic model to something known to grow logistically? I'm not impressed, especially not by retrospective modelling.
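As a hedged illustration of why a retrospective logistic fit says little going forward, here is a sketch of the standard logistic curve with two very different plateaus. All parameter values are made up for the example, and `logistic` is a hypothetical helper, not the paper's model.

```python
import math

def logistic(t, plateau, rate, u0=1.0):
    """Standard logistic curve starting at u0 and saturating at plateau."""
    return plateau / (1.0 + (plateau / u0 - 1.0) * math.exp(-rate * t))

rate = 0.1                    # hypothetical growth rate per day
small, large = 1.0e4, 1.0e6   # hypothetical final sizes, 100x apart

# Early in the outbreak the two curves are nearly indistinguishable.
early_small = logistic(30, small, rate)
early_large = logistic(30, large, rate)
early_gap = abs(early_large - early_small) / early_large

# Much later they disagree enormously.
late_small = logistic(150, small, rate)
late_large = logistic(150, large, rate)
late_ratio = late_large / late_small

print(early_gap, late_ratio)
```

With these assumed numbers the curves differ by well under one percent at day 30 but by more than a factor of fifty at day 150, so matching the early data does not pin down the plateau.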


Here's the link for the referenced paper. I find it mostly accessible to the average science graduate.

http://www.plosone.org/article/fetchObject.action?uri=info%3...

Perhaps someone can help discuss the merits of this particular model? In my humble opinion, I suspect a lack of empirically gathered results and support for its justifications. On the other hand, what qualifies as a representational event if we're supposed to prevent human death?


I'm voting this up. People should be aware that it's out there. But this story is worse than bad. It's the epidemiological equivalent of startups with nice spreadsheets and graphs -- there are so many hidden assumptions and unknowns, I'm not sure it's possible to make it into anything useful.


One important factor that seems to be missing is the negative feedback loop: the more devastating the outbreak, the more resources will be thrown at it.


That's a positive feedback loop. A negative feedback loop would cause people to throw more resources at the outbreak until the derivative of the number of cases approaches zero.


Wow, a mainstream news source covering mathematics. This is news.


No offence, but a mathematical model can be made to “prove” anything, including the half-inch pink elephant that lives in my desk drawer.


Hari Seldon, is that you?


Not sure on the downvotes but obviously someone didn't appreciate my reference to Asimov's Foundation trilogy and the mathematical predictions of social behaviour.

It all falls over when The Mule appears! (Sorry if this is a spoiler for you)



