
Yes, it could be stacked on quants. Quantized activations may already be more "dense", so they might not compress as much as bfloat16 does (from 16 -> ~11 bits), but it's certainly possible.


I read it similarly: this is a specific attribute of bfloat16, so the quants that folks tend to run on local hardware don't have the same inefficiency to exploit.
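For anyone who wants to poke at the "16 -> ~11 bits" figure, here is a rough sketch of where the slack in bfloat16 could come from. It assumes synthetic Gaussian "weights" (std 0.02, picked arbitrarily), treats the sign and mantissa bits as incompressible, and only entropy-codes the 8-bit exponent field. The function names are made up for illustration, and this is not the actual method being discussed, just a back-of-the-envelope estimate.

```python
import math
import random
import struct
from collections import Counter


def bf16_exponent(x: float) -> int:
    """Return the 8-bit exponent field of x after truncating float32 to bfloat16."""
    bits32 = struct.unpack(">I", struct.pack(">f", x))[0]
    bits16 = bits32 >> 16        # bfloat16 = top 16 bits of float32 (truncation, no rounding)
    return (bits16 >> 7) & 0xFF  # layout: 1 sign bit, 8 exponent bits, 7 mantissa bits


def shannon_entropy(counts: Counter) -> float:
    """Entropy in bits of the empirical distribution given by a histogram."""
    total = sum(counts.values())
    return -sum(c / total * math.log2(c / total) for c in counts.values())


def estimated_bits_per_value(values) -> float:
    """Assumed model: sign (1) + mantissa (7) stored raw, exponent entropy-coded."""
    exp_entropy = shannon_entropy(Counter(bf16_exponent(v) for v in values))
    return 1 + 7 + exp_entropy


random.seed(0)
# Hypothetical stand-in for real model weights.
synthetic = [random.gauss(0.0, 0.02) for _ in range(100_000)]
bits = estimated_bits_per_value(synthetic)
print(f"estimated bits per value: {bits:.2f} (vs. 16 stored)")
```

The same histogram-and-entropy check could be run on the integer codes of a quantized model. If those codes are already spread fairly evenly across their levels, the entropy would sit close to the stored bit width, which is the "no inefficiency to exploit" case described above.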



