
I don't get it. The problem with binary search and branches is not the branches themselves; it's that until you have done the comparison, you don't know which memory location in the array to fetch next. It doesn't matter whether you use branches or anything else; the question is what you want the processor to do.

There is a data dependency: until I read the middle index, I can't tell if I want to search the data in the upper section or the lower section. I can speculate and issue reads to both. That would fix the dependency, but create more memory traffic. Is this the right trade-off?

What do you want? Removing branches is not it.
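The dependency chain being described is visible in the plain version: each iteration's load address depends on the outcome of the previous comparison, so the loads serialize regardless of how the control flow is expressed. A minimal sketch:

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Classic lower_bound-style binary search. The address of the next load,
// a[mid], depends on the result of the previous comparison, so the CPU
// cannot issue the next read until the current one resolves -- a serial
// data dependency, whether or not branches are used.
std::size_t lower_bound_idx(const std::vector<int>& a, int x) {
    std::size_t lo = 0, hi = a.size();
    while (lo < hi) {
        std::size_t mid = lo + (hi - lo) / 2;
        if (a[mid] < x)
            lo = mid + 1;   // next probe falls in the upper half
        else
            hi = mid;       // next probe falls in the lower half
    }
    return lo;
}
```

Making the update a conditional move removes the branch misprediction, but the load-to-compare-to-next-load chain stays.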



Prefetching is the right tradeoff for large arrays. Addressed at the end of the article: https://mhdm.dev/posts/sb_lower_bound/#prefetching
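A sketch of the idea (my own, not the article's exact code): issue prefetches for both candidate midpoints of the next iteration, so a useful load is already in flight whichever way the current comparison resolves. `__builtin_prefetch` is a GCC/Clang extension and is only a hint.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Branchless binary search over a shrinking half-size, prefetching the two
// approximate locations the next probe can fall in. Prefetches never affect
// correctness; they just trade extra memory traffic for latency hiding.
std::size_t prefetch_lower_bound(const std::vector<int>& a, int x) {
    if (a.empty()) return 0;
    std::size_t lo = 0, n = a.size();
    while (n > 1) {
        std::size_t half = n / 2;
        __builtin_prefetch(&a[lo + half / 2]);         // ~next probe if we stay low
        __builtin_prefetch(&a[lo + half + half / 2]);  // ~next probe if we move up
        lo += (a[lo + half - 1] < x) ? half : 0;       // usually compiles to a cmov
        n -= half;
    }
    return lo + (a[lo] < x ? 1 : 0);
}
```

This is exactly the "speculate and issue reads to both" trade-off from the parent comment: roughly double the memory traffic in exchange for hiding the load latency.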


Right, this is why properly fast binary searches use the Eytzinger array layout:

https://algorithmica.org/en/eytzinger
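For reference, a sketch after that write-up: the Eytzinger (BFS) layout stores the root at index 1 and the children of index k at 2k and 2k+1, so each query descends through predictable indices instead of jumping around a sorted array.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Fill out[1..n] with the BFS (Eytzinger) layout of `sorted` via an
// in-order traversal of the implicit tree. Index 0 is left unused.
void build_eytzinger(const std::vector<int>& sorted, std::vector<int>& out,
                     std::size_t& i, std::size_t k) {
    if (k <= sorted.size()) {
        build_eytzinger(sorted, out, i, 2 * k);
        out[k] = sorted[i++];
        build_eytzinger(sorted, out, i, 2 * k + 1);
    }
}

// Returns the smallest element >= x, or -1 if there is none.
int eytzinger_lower_bound(const std::vector<int>& t, int x) {
    std::size_t k = 1;
    while (k < t.size())
        k = 2 * k + (t[k] < x);                    // branchless descent
    k >>= __builtin_ffsll(~static_cast<long long>(k));  // cancel the right turns
    return k ? static_cast<int>(t[k]) : -1;
}
```

Usage: allocate `t` of size `sorted.size() + 1`, set `i = 0`, and call `build_eytzinger(sorted, t, i, 1)` once up front.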


If the array is fully in L1 cache, isn't the cost of the branch mis-predict much greater than the memory fetches?


A full array in L1 is not a typical scenario for binary search. Binary search is usually for large data sets that live in DRAM.


Binary search reduces the search space exponentially as it proceeds, so actually quite a lot of the total comparisons can hit L1d cache. (Maybe half of them for a ~250GB dataset.)

Of course, you could keep a cacheable partial index of your huge dataset to accelerate the early part of your search as well.
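A rough back-of-the-envelope supporting this (my numbers and assumptions, not the parent's): with 8-byte keys, every query probes the same midpoints near the root, so the top levels of the implicit tree stay hot in cache, while deeper levels are query-specific.

```cpp
#include <cassert>
#include <cmath>

// Assumed model: 8-byte keys. A search over `bytes` of data takes about
// log2(bytes / 8) comparisons; a cache of `cache_bytes` can keep the first
// log2(cache_bytes / 8) levels' worth of shared midpoints resident.
int total_levels(double bytes)     { return (int)std::ceil(std::log2(bytes / 8.0)); }
int hot_levels(double cache_bytes) { return (int)std::floor(std::log2(cache_bytes / 8.0)); }
```

For a 250 GB dataset that gives about 35 levels total, of which a 32 KiB L1d alone covers roughly the top 12; counting L2/L3 and the final levels that land within a single cache line gets you into "maybe half" territory.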


Sounds like we're just reinventing B-trees.


Yep. I considered phrasing my answer that way.




