
You're absolutely right! A related lesson we learned later: don't just look at summary metrics (in the case I described above, timings of operations in a large system); look at their histograms. You should be able to explain why they have the shape they do. The distributions are so often multimodal, and understanding the modes tells you a lot more about the nuances of your system.
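To make the histogram point concrete, here is a minimal sketch in Python using only the standard library. Everything in it is an illustrative assumption rather than anything from the system described above: the timings are synthetic, the fast/slow split (labelled as a cache hit versus a cache miss) is hypothetical, and the helper names `synthetic_timings`, `log_histogram`, and `render` are made up for this example. It bins timings into logarithmically spaced buckets so that modes an order of magnitude apart both stay visible.

```python
import math
import random
import statistics
from collections import Counter


def synthetic_timings(seed=0):
    """Hypothetical data: 900 fast operations (~5 ms) and 100 slow ones (~80 ms)."""
    rng = random.Random(seed)
    fast = [max(0.1, rng.gauss(5, 0.5)) for _ in range(900)]   # assumed "cache hit" path
    slow = [max(0.1, rng.gauss(80, 8)) for _ in range(100)]    # assumed "cache miss" path
    return fast + slow


def log_histogram(samples_ms, buckets_per_decade=4):
    """Count samples into log-spaced buckets, keyed by each bucket's lower bound in ms.

    Empty buckets between the smallest and largest sample are kept (count 0)
    so that gaps between modes show up in the output.
    """
    idxs = [math.floor(math.log10(x) * buckets_per_decade) for x in samples_ms]
    counts = Counter(idxs)
    return {
        10 ** (i / buckets_per_decade): counts[i]
        for i in range(min(idxs), max(idxs) + 1)
    }


def render(hist, width=50):
    """Draw the histogram as text, one bucket per line."""
    peak = max(hist.values())
    lines = []
    for lower, n in hist.items():
        bar = "#" * round(n / peak * width)
        lines.append(f"{lower:8.1f} ms | {bar} {n}")
    return "\n".join(lines)


if __name__ == "__main__":
    samples = synthetic_timings()
    print(f"mean   = {statistics.mean(samples):.1f} ms")
    print(f"median = {statistics.median(samples):.1f} ms")
    print(render(log_histogram(samples)))
```

With data shaped like this, the mean falls in the empty gap between the two modes, so it describes a latency that no request actually experienced. The histogram shows both clusters, and the question "why are there two?" is the one worth answering.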

We were an infra-focused team building up a scalable data handling / computation platform in preparation to apply ML at a non-trivial scale. When we finally hired some people with deep stats knowledge 10 or 12 years ago, there was a lot of great learning for everyone about how each of our areas of expertise helped the others. I wish we'd found the right stats people earlier to help us understand things like this more deeply, more quickly.



Even regular engineers with experience running many servers will have built up an intuition you can exploit.

In fact, this is often where decisions on what metrics to have in the first place come from. Ask why. You can go far without deep stats knowledge!



