Hacker Newsnew | past | comments | ask | show | jobs | submitlogin

I think its about the naive "rep_array[hashcode(obj) % bucket_size] = object" which is all too common.

If your rep_array is doing linear probing and/or robin-hood hasing, then incremental hashcodes (such as 1, 2, 3, 4, 5...) is a bad thing. Especially if you're doing both inserts and removals: this sort of incremental pattern would lead to many "runs" where linear probing would perform poorly.

Of course, it isn't very hard to do rep_array[(hashcode(obj) * large_constant_odd_number) % bucket_size] instead and get a good distribution. But the question is whether or not people know about those kinds of steps.



Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: