Hacker Newsnew | past | comments | ask | show | jobs | submitlogin

Actually, a counter-point: there might be situations where you want an anti-cryptographic hash function - where similar inputs produce generally similar hashes.

It could have some application as a preprocessing step in clustering (if it could be done efficiently).



Those are called fuzzy or locality-sensitive hashes; they’re used a lot in image processing for similarity testing, for example. I don’t consider them to be in the same category as normal hashes, though, because fuzzy hashes have vastly different properties.


The term for this is locality sensitive hashing, and it's very useful in some contexts.


Easy: for n-byte output, separate input into n equally-sized chunks (pad with zeroes if not enough). For each chunk, XOR all the bytes together.




Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: