Hacker News

Don't throw the baby out with the bathwater. There are ways of calculating whether a hidden variable is more likely to have generated the observational data than the variables already in your model.

A simple example would be a wet lawn. We know rain causes a wet lawn, and our observations indeed show that rain and wet lawns are strongly associated. However, observing a case where a given lawn is wet and yet there's no associated rain is a clear signal that a latent cause hasn't been accounted for (namely, sprinklers). This principle still applies when the observational data are noisy or the causal relations are probabilistic rather than deterministic, though you do need a bigger sample to reach the same confidence.
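The rain/sprinkler check can be sketched with synthetic data. This is only an illustration: the generative process and all the probabilities below are made up, and the "detection" is just an empirical estimate of P(wet | no rain), which the rain-only model predicts should be zero.

```python
import random

random.seed(0)

# Hypothetical generative process: rain causes a wet lawn, but a hidden
# sprinkler (which our model omits) can also wet it.
def sample():
    rain = random.random() < 0.3
    sprinkler = random.random() < 0.2   # latent cause, not in the model
    wet = rain or sprinkler
    return rain, wet

data = [sample() for _ in range(10_000)]

# Under the rain-only model, P(wet | no rain) should be ~0.
# A clearly nonzero estimate flags an unaccounted-for latent cause.
no_rain_cases = [wet for rain, wet in data if not rain]
p_wet_given_no_rain = sum(no_rain_cases) / len(no_rain_cases)
print(f"P(wet | no rain) ~= {p_wet_given_no_rain:.2f}")  # roughly 0.2
```

With noisier data the same idea holds, but the estimate needs a larger sample before it is distinguishable from zero with confidence.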

We can also look at measures of model fit. If a variant of the model that hypothesizes a latent causal variable is more likely to have generated the observed data than the model without it, we know we've missed something. The general case of this is learning the model structure itself from the observed data.
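The model-comparison idea can be sketched as follows, again with made-up numbers: fit a model with a latent "other cause" parameter by maximum likelihood, then compare it against the rain-only model using a BIC-style score that penalizes the extra parameter.

```python
import math
import random

random.seed(1)

# Same hypothetical process: wet = rain OR latent sprinkler.
def sample():
    rain = random.random() < 0.3
    sprinkler = random.random() < 0.2
    return rain, (rain or sprinkler)

data = [sample() for _ in range(5_000)]

def log_likelihood(p_wet_given_rain, p_wet_given_no_rain):
    ll = 0.0
    for rain, wet in data:
        p = p_wet_given_rain if rain else p_wet_given_no_rain
        p = min(max(p, 1e-9), 1 - 1e-9)   # clamp to avoid log(0)
        ll += math.log(p if wet else 1 - p)
    return ll

# Model A: rain is the only cause, so P(wet | no rain) = 0.
ll_a = log_likelihood(1.0, 0.0)

# Model B: adds a latent cause; its strength q is fit by maximum
# likelihood (the empirical rate of wet lawns on rain-free days).
no_rain_cases = [wet for rain, wet in data if not rain]
q_hat = sum(no_rain_cases) / len(no_rain_cases)
ll_b = log_likelihood(1.0, q_hat)

# BIC = -2*log-likelihood + (num. parameters)*log(n); lower is better.
# Model B pays a penalty for its extra parameter but wins decisively.
bic_a = -2 * ll_a
bic_b = -2 * ll_b + math.log(len(data))
print("latent-cause model preferred:", bic_b < bic_a)
```

The penalty term matters: without it, a model with more free parameters can never score worse, so raw likelihood alone would always favor the bigger model.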

This page from Kevin Murphy is a reasonable survey of the methods: http://www.cs.ubc.ca/~murphyk/Bayes/bayes.html (I also recommend his textbook).


