
The flip side is that with messy real-world data you often just need a model that's good enough, rather than worrying about whether the p-value is this or that.


At that point, if you don't care about interpretable coefficients, you might as well use gradient-boosted trees or a full neural network instead.


It depends on the "severity" of the assumption violations and on the amount of data you are working with. You can also use GAMs to add flexible nonlinear relationships while keeping the model additive. Statistical modeling is a nuanced job.
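
To make the GAM point a bit more concrete, here's a rough sketch of the idea. It is not a real GAM: it fits a single unpenalized cubic regression spline with plain least squares, which is a simplified stand-in for one smooth term. The synthetic sine data, the knot positions, and the helper names `spline_basis` and `fit_rmse` are all made up for illustration.

```python
import numpy as np

def spline_basis(x, knots):
    """Cubic truncated-power basis: 1, x, x^2, x^3, plus (x - k)_+^3 per knot."""
    cols = [np.ones_like(x), x, x**2, x**3]
    cols += [np.clip(x - k, 0.0, None) ** 3 for k in knots]
    return np.column_stack(cols)

def fit_rmse(X, y):
    """Ordinary least squares fit; returns the in-sample RMSE."""
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ coef
    return float(np.sqrt(np.mean(resid**2)))

# Synthetic data (an assumption for this sketch): a clearly nonlinear signal plus noise.
rng = np.random.default_rng(0)
x = np.sort(rng.uniform(0.0, 6.0, 400))
y = np.sin(x) + rng.normal(0.0, 0.2, x.size)

# Straight-line model: intercept and slope only.
linear_X = np.column_stack([np.ones_like(x), x])

# Flexible model: same predictor, expanded into a spline basis.
knots = np.linspace(1.0, 5.0, 5)  # hypothetical, hand-picked knots
rmse_linear = fit_rmse(linear_X, y)
rmse_spline = fit_rmse(spline_basis(x, knots), y)

print(f"linear RMSE: {rmse_linear:.3f}, spline RMSE: {rmse_spline:.3f}")
```

The spline basis contains the straight line as a special case, so its in-sample error can't be worse. Whether the extra flexibility is worth it still comes down to how bad the violation is and how much data you have, and an actual GAM would add a smoothness penalty to keep the fit from chasing noise.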


I tried to argue that while at CMU and it didn't go well.


They may not know at CMU that the vast majority of applied, trained-on-data statistical models that help run the modern world seriously violate one or more of the model's assumptions.



