Hacker Newsnew | past | comments | ask | show | jobs | submitlogin

> Lookup in the ArrayMap is shockingly slow at large data sizes—again, I don’t have a good explanation for why this is.

Probably because it's doing a binary search over a large array? I wouldn't expect that to be particularly fast with large arrays, since large array + binary search == lots of cache misses.

There is a way (the name escapes me unfortunately) to order the items of the array such that for N items, the item that would be at index N/2 in a sorted array is at index 0, then the items that would be at N/4 and 3N/4 are at indexes 1 and 2, and so on which of course is much more cache-friendly when doing a binary search.



The issue is not just that it's ‘slow’, but that it's frequently slower even than std::map. My suspicion is that the issue is its use of power-of-2 sizes, which is a terrible idea for binary search because it causes cache line aliasing. This is discussed in a lot of detail by Paul Khuong.

https://www.pvk.ca/Blog/2012/07/30/binary-search-is-a-pathol...

> Even without that hurdle, the slowdown caused by aliasing between cache lines when executing binary searches on vectors of (nearly-)power-of-two sizes is alarming. The ratio of runtimes between the classic binary search and the offset quaternary search is on the order of two to ten, depending on the test case.

The approaches there are likely to give significant speedups.


Yeah, in my past experience if my sizes for my B-tree were too close to powers of 2 the performance would tank. Adjusting them to be no where near a power of two and also not take up too many cache lines was the ticket for me.


The name is Eytzinger layout.


Excellent, thanks!


Right... this result is one of those examples of why you don't guess, you benchmark, because my intuition is that binary search should be "really fast." Clearly though the cache misses hurt a lot more than my intuitions capture.


Any idea why is it not using B-trees?


Oh, the sorted array-backed "map" isn't using B-trees because I wanted it as a comparison point (to see how the LLVM containers fare against the old "just use a vector" advice). So the ArrayMap isn't production-ready or anything... it's intended to give a baseline against which to compare what would happen if you wrote your own dead-simple map replacement with the hope of improving perf.




Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: