
Glad it helped.

I've used online quantile estimators (Greenwald-Khanna, or GK, in particular) very effectively to look for anomalies in streaming data.

It worked much better than the usual mean/stddev threshold approach (which was embedded in competing products), because it made no assumptions about the distribution of the data.
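
To make the difference concrete, here's a rough sketch comparing the two kinds of threshold. It is not GK: the quantile is computed exactly from a sorted copy of the history, and the function names, the data, and the `k=3.0` / `q=0.99` defaults are all made up for illustration.

```python
import statistics

def zscore_threshold(history, k=3.0):
    """Classic rule: flag anything above mean + k standard deviations."""
    return statistics.fmean(history) + k * statistics.pstdev(history)

def quantile_threshold(history, q=0.99):
    """Flag anything above the q-quantile (nearest rank, computed exactly here;
    a streaming estimator such as GK would approximate this value)."""
    ordered = sorted(history)
    idx = min(len(ordered) - 1, int(q * len(ordered)))
    return ordered[idx]

# Hypothetical history: the values 1..100 plus a single huge spike.
history = list(range(1, 101)) + [100_000]
candidate = 500

flag_by_zscore = candidate > zscore_threshold(history)
flag_by_quantile = candidate > quantile_threshold(history)
```

With this history the single spike inflates the mean and standard deviation so much that 500 slips under the mean/stddev threshold, while the 99th percentile barely moves and 500 is still flagged.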

One thing to note is that GK is online, but not windowed, so it looks back to the very first value.

However, this can be overcome by keeping multiple, possibly overlapping, summaries and retiring the oldest one periodically, so that old values eventually drop off.
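
One way that could look, as a minimal sketch rather than what I actually ran: start a fresh summary every `stride` values, feed each new value to all live summaries, answer queries from the oldest one, and retire it once it has seen `span` values. `ExactSummary` is a stand-in that stores everything; in practice you'd plug in a GK summary with the same `insert` / `quantile` shape. All class and parameter names here are hypothetical.

```python
import bisect
from collections import deque

class ExactSummary:
    """Stand-in for a GK summary: keeps every value and answers exactly."""
    def __init__(self):
        self.values = []

    def insert(self, x):
        bisect.insort(self.values, x)

    def __len__(self):
        return len(self.values)

    def quantile(self, q):
        idx = min(len(self.values) - 1, int(q * len(self.values)))
        return self.values[idx]

class RotatingQuantiles:
    """Approximate sliding window built from staggered, overlapping summaries."""
    def __init__(self, span, overlap=2, make_summary=ExactSummary):
        self.span = span                          # max values one summary covers
        self.stride = max(1, span // overlap)     # start a new summary this often
        self.make_summary = make_summary
        self.summaries = deque()
        self.seen = 0

    def insert(self, x):
        if self.seen % self.stride == 0:
            self.summaries.append(self.make_summary())
        self.seen += 1
        for s in self.summaries:
            s.insert(x)
        # Retire the oldest summary once it is full and a younger one exists.
        if len(self.summaries) > 1 and len(self.summaries[0]) >= self.span:
            self.summaries.popleft()

    def quantile(self, q):
        # The oldest live summary has seen the most recent history.
        # Raises IndexError if nothing has been inserted yet.
        return self.summaries[0].quantile(q)

# Usage: an old regime of large values followed by a new regime of small ones.
windowed = RotatingQuantiles(span=100, overlap=2)
unwindowed = ExactSummary()
for x in [1000] * 100 + [1] * 100:
    windowed.insert(x)
    unwindowed.insert(x)
```

After warm-up, the summary being queried covers somewhere between `span - stride` and `span` of the most recent values, so more overlap gives a smoother window at the cost of more summaries to update per value.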


