
The vectors don't need to be orthogonal, because neural networks use non-linearities. The softmax in attention effectively lets you pack as many vectors into 1D as you want and still unambiguously pick them out.
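A minimal sketch of the idea (not anyone's actual attention implementation, and assuming a squared-distance score rather than a plain dot product): pack many items as distinct scalars on a 1D line, and let a sharp softmax pick out any one of them nearly one-hot.

```python
import numpy as np

# Hypothetical illustration: pack N "vectors" as distinct scalars in 1D.
keys = np.linspace(0.0, 1.0, 10)   # 10 items packed into the interval [0, 1]
values = np.arange(10.0)           # payload associated with each key

def attend(query, beta=1000.0):
    # Non-linear (negative squared-distance) score; a large beta makes
    # the softmax weights nearly one-hot on the closest key.
    scores = -beta * (query - keys) ** 2
    weights = np.exp(scores - scores.max())
    weights /= weights.sum()
    return weights @ values

attend(keys[3])  # ≈ 3.0: the 4th item is retrieved almost unambiguously
```

With a larger beta (a sharper softmax) the keys can be packed arbitrarily densely and the retrieved value still snaps to a single item, which is the sense in which the non-linearity removes the need for orthogonality.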

