I don't get it. The problem with binary search and branches is not the branches themselves; it's that until you have done the comparison, you don't know which memory location in the array to fetch next. It doesn't matter whether you use branches or anything else; the question is what you want the processor to do.
There is a data dependency: until I have read the middle element, I can't tell whether I need to search the upper half or the lower half. I could speculate and issue reads to both. That would break the dependency, but create more memory traffic. Is that the right trade-off?
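One way to express that speculation, as a sketch: prefetch the midpoints of both possible next halves before the current comparison resolves. This assumes GCC/Clang's `__builtin_prefetch`; the function name and semantics (index of the last element `<= key`, assuming `key >= a[0]`) are mine, not from the discussion, and whether the extra traffic pays off depends entirely on the machine.

```c
#include <stddef.h>

/* Sketch: binary search that prefetches the midpoints of BOTH possible
 * next halves before the comparison resolves. This hides the data
 * dependency at the cost of extra memory traffic; it may or may not win. */
size_t search_prefetch(const long *a, size_t n, long key)
{
    size_t lo = 0, hi = n;               /* half-open range [lo, hi) */
    while (hi - lo > 1) {
        size_t mid = lo + (hi - lo) / 2;
        /* Issue loads for both possible next midpoints now, so whichever
         * one we actually need is (hopefully) in cache when we get there. */
        __builtin_prefetch(&a[lo + (mid - lo) / 2]);   /* lower half */
        __builtin_prefetch(&a[mid + (hi - mid) / 2]);  /* upper half */
        if (a[mid] <= key)
            lo = mid;
        else
            hi = mid;
    }
    return lo;   /* index of last element <= key (given key >= a[0]) */
}
```

Note that `__builtin_prefetch` is only a hint, so this version is correct even when the prefetched addresses turn out to be the wrong guess; the cost is purely the wasted bandwidth the comment above worries about.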
Binary search shrinks the search space exponentially as it proceeds, so quite a lot of the total comparisons can actually hit L1d cache: the top levels of the search touch only a handful of distinct elements, which stay hot across repeated searches. (Maybe half of the comparisons for a ~250GB dataset.)
Of course, you could also keep a small, cacheable partial index of your huge dataset to accelerate the early part of the search.
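A minimal sketch of what such a partial index could look like: sample every STEP-th key into a tiny top-level array that fits in L1d, search that first, then binary-search only the narrowed slab of the big dataset. Everything here (the `partial_index` struct, the helper names, the toy STEP value) is a hypothetical illustration, not an established design.

```c
#include <stddef.h>

/* Toy value for the sketch; in practice pick STEP so that the
 * n/STEP samples fit comfortably in L1d. */
#define STEP 4

typedef struct {
    const long *samples;   /* samples[i] == data[i * STEP] */
    size_t nsamples;
} partial_index;

/* Last index in [0, n) with a[i] <= key, or 0 if none. */
static size_t bsearch_le(const long *a, size_t n, long key)
{
    size_t lo = 0, hi = n;
    while (hi - lo > 1) {
        size_t mid = lo + (hi - lo) / 2;
        if (a[mid] <= key) lo = mid; else hi = mid;
    }
    return lo;
}

size_t indexed_search(const partial_index *idx,
                      const long *data, size_t n, long key)
{
    /* Phase 1: cheap search over the cache-resident samples. */
    size_t s = bsearch_le(idx->samples, idx->nsamples, key);
    /* Phase 2: search only the slab that sample points into. */
    size_t lo = s * STEP;
    size_t hi = (lo + STEP + 1 < n) ? lo + STEP + 1 : n;
    return lo + bsearch_le(data + lo, hi - lo, key);
}
```

The point is that phase 1 replaces the coldest, most cache-hostile levels of the big search with lookups in an array that stays resident, while phase 2 does the remaining levels over a slab of only STEP elements.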
What do you want? Removing branches is not it.