
"Generalize" tends to imply you can extrapolate. In most cases it's actually the opposite that happens: neural nets tend to COMPRESS the data (which in turn is a good thing in many cases, because the data is noisy).


The point of compression is to decompress afterwards. That's what happens during inference, and that's when the extrapolation occurs.

Let's say I tell GPT "write foobar 8 times". Will it? If so, then it understands me and can extrapolate from the request to the proper response, without having the literal example "write foobar 8 times" in its model.

Most modern compression algorithms work by predicting the next token (byte, symbol, etc.), believe it or not. The more accurately they predict the next token, the less information you need to store to correct mispredictions.
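To make that concrete, here's a toy sketch of prediction-based compression (my own illustration, not any particular codec): an order-1 predictor guesses each next byte from the previous one, and the compressed form only stores a literal byte when the guess is wrong. The decompressor replays the identical predictor, so it only needs the corrections.

```python
from collections import defaultdict, Counter

def compress(data: bytes):
    """Guess the next byte as the most frequent byte seen so far
    after the current byte; store a hit/miss flag per position,
    plus a literal only on a miss (the correction)."""
    freq = defaultdict(Counter)
    flags, literals = [], []
    prev = 0  # arbitrary starting context
    for b in data:
        guess = freq[prev].most_common(1)[0][0] if freq[prev] else 0
        if guess == b:
            flags.append(1)      # predicted correctly: nothing to store
        else:
            flags.append(0)
            literals.append(b)   # misprediction: store the actual byte
        freq[prev][b] += 1       # update the model, same as decompressor
        prev = b
    return flags, bytes(literals)

def decompress(flags, literals):
    """Replays the same predictor; consumes a literal only on a miss."""
    freq = defaultdict(Counter)
    out, lit, prev = [], iter(literals), 0
    for hit in flags:
        guess = freq[prev].most_common(1)[0][0] if freq[prev] else 0
        b = guess if hit else next(lit)
        out.append(b)
        freq[prev][b] += 1
        prev = b
    return bytes(out)

data = b"abababababab"
flags, lits = compress(data)
assert decompress(flags, lits) == data
```

On the repetitive input above, almost every byte is predicted, so only a handful of literals survive; the better the predictor, the less correction data remains — which is the connection to next-token prediction in language models.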



