
>Any large HashTable in Java starts to yield the problem of duplicate keys, it's just a weird situation, like you can 99.999% trust something ... but can't ever fully trust it so that over time, you're guaranteed to have something wrong.

hashCode() is a prehash function: its outputs need to be mapped further to the (typically much smaller) number of buckets in a hash table of a given size (which depends on the number of objects currently in the table). Those "duplicate keys" are not a problem; they're how hash tables work in any language. An object's hash code is used to find the relevant bucket, and then that bucket is examined properly using equals(). HashMap and Hashtable are backed by arrays, whose maximum size in the JVM is Integer.MAX_VALUE (minus some change) anyway, so they would need to be indexed by an int. I hope this helps to overcome the trust issues you have with Java data structures.
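
To make that concrete, here is a rough sketch (the class and method names are made up for illustration). It relies on the fact that the strings "Aa" and "BB" have the same hashCode(). The bucketIndex helper is a simplification of the idea, not HashMap's exact internal formula.

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative names only; nothing here comes from the JDK or the thread.
class CollisionDemo {
    // Simplified bucket selection: NOT HashMap's exact internal formula,
    // just the general idea of reducing a 32-bit hash to a bucket index.
    static int bucketIndex(Object key, int bucketCount) {
        return Math.floorMod(key.hashCode(), bucketCount);
    }

    static Map<String, String> build() {
        Map<String, String> map = new HashMap<>();
        map.put("Aa", "first");
        map.put("BB", "second");   // same hashCode() as "Aa", but not equal: kept as its own entry
        map.put("Aa", "replaced"); // equal key: overwrites "first"
        return map;
    }

    public static void main(String[] args) {
        Map<String, String> map = build();
        System.out.println("Aa".hashCode() == "BB".hashCode()); // true
        System.out.println(map.size());                         // 2
        System.out.println(map.get("Aa"));                      // replaced
        System.out.println(map.get("BB"));                      // second
    }
}
```

Both keys collide on the hash, yet the map still holds two separate entries, because equals() has the final say inside the bucket.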



I think I miscommunicated.

I understand hashtables effectively work from 'hashes', which imply collisions etc.

I'm so used to using the term 'hashtable' that I forgot it implies a specific implementation. I should have used the term 'Map' or 'Key/Value' table; I'm resigned to having used the terms interchangeably too often.

The notion of 'hashes' which can produce 'collisions' creates a bunch of unnecessary concerns and complications given the ultimate objective of a hashtable, i.e. as a key-value store.

If every object had a guaranteed unique global id, which we could use as a key, then this would provide a lot of clarity and avoid problems. Of course the word 'hash' doesn't really even belong in the context of the higher level abstraction of key-value store as it's implementation specific.

Unfortunately, Java uses the word 'identity' in System.identityHashCode, which is really confusing. It's not really an 'identity'. It's misleading, and I bet tons of Java devs are unaware (or forget). There's actually a bug filed on it [1].
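
For what it's worth, a small sketch of the distinction (class and method names are hypothetical): System.identityHashCode returns an int hash that is not guaranteed to be unique, while IdentityHashMap decides key identity by reference comparison (==), so a shared identity hash would not merge two distinct objects.

```java
import java.util.IdentityHashMap;
import java.util.Map;

// Hypothetical demo class, for illustration only.
class IdentityDemo {
    static int distinctEntries() {
        String a = new String("key");
        String b = new String("key"); // a.equals(b) is true, but b is a different object

        // IdentityHashMap hashes with System.identityHashCode and compares keys with ==,
        // so these two equal-but-distinct strings are stored as separate entries.
        Map<String, String> map = new IdentityHashMap<>();
        map.put(a, "first");
        map.put(b, "second");
        return map.size();
    }

    public static void main(String[] args) {
        System.out.println(distinctEntries()); // 2
    }
}
```

The identity hash only picks the starting slot; the reference itself is what acts as the 'identity'.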

A few years ago, I had to spend a day down this rabbit hole, as many devs have to, and it's just unnecessary. Better naming would, I think, really help.

[1] https://bugs.openjdk.java.net/browse/JDK-6321873



