Hacker Newsnew | past | comments | ask | show | jobs | submitlogin

I think that's expected. SGD in such a high-dimensional space is exceedingly likely to find such a minimum.


I’m aware of both theoretical and empirical arguments that I find quite persuasive that you’re exactly right.

I think it’s extremely thought provoking why they would be symmetrically, locally Euclidean in such abundance.




Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: