How much faster is the branchless version in practice? I make heavy use of binary search in some code I maintain, and I wonder if it would make much of a difference to switch to a more efficient version of the function.
It’s bounded precision, but Rust limits the max size of an object/array to isize’s max[1], not usize’s max. So adding two isize::MAX values using usize will never overflow.
Such an overflow could still be problematic for slices of zero-sized types, which can contain up to usize::MAX elements, since the total size in bytes is always 0. But (left + right) / 2 doesn't actually occur anywhere in the code, only left + size / 2, which clearly can't overflow as long as size is sane.
It's only size that is being divided by 2, and size is the length of the segment still under consideration (initially the whole array). left is the starting index of that segment (initially 0).
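To make that concrete, here is a minimal sketch of the `left + size / 2` formulation (assumed names, not any library's actual code). Since `size <= slice.len() <= isize::MAX` for non-zero-sized element types, the midpoint addition cannot overflow a usize:

```rust
// Binary search keeping `left` (segment start) and `size` (segment length),
// so the midpoint is `left + size / 2` rather than `(left + right) / 2`.
fn binary_search(slice: &[i32], target: i32) -> Result<usize, usize> {
    let mut left = 0usize;
    let mut size = slice.len();
    while size > 0 {
        let mid = left + size / 2; // can't overflow: size <= isize::MAX
        if slice[mid] < target {
            left = mid + 1;
            size -= size / 2 + 1; // drop the left half plus the probed element
        } else if slice[mid] > target {
            size /= 2; // keep only the left half
        } else {
            return Ok(mid);
        }
    }
    Err(left) // insertion point
}
```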
Left and right are both usizes, which are pointer-sized integers (64 bits on most modern targets). You would need to work on an array of 2^63 elements before you have to worry about integer overflow issues.
This array would not fit in any kind of memory for the foreseeable future :)
Ah yes, fair point. It makes it a bit more subtle: the cases where you have to worry about integer overflow on the pointer addition are cases where you have an array of 2^((64 or 32) - 1) bools... which seems rather silly to binary search?
No, it uses 2’s complement and is well defined in release mode. From [1]:
> When you’re compiling in release mode with the --release flag, Rust does not include checks for integer overflow that cause panics. Instead, if overflow occurs, Rust performs two’s complement wrapping.
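That wrapping behavior can be made explicit and build-mode-independent with the `wrapping_*` methods, which is what code that actually wants two's complement wrapping should use (a small illustration, not from the thread):

```rust
// Two's complement wrapping, made explicit with wrapping_add.
// (A plain `+` that overflows panics in debug builds; wrapping_add
// wraps in both debug and release.)
fn main() {
    let x: u8 = 255;
    assert_eq!(x.wrapping_add(1), 0); // 255 + 1 wraps to 0

    let y: i8 = 127;
    assert_eq!(y.wrapping_add(1), -128); // i8::MAX + 1 wraps to i8::MIN
}
```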
Zig’s binary search is here, it’s an unoptimized textbook version: https://github.com/ziglang/zig/blob/b835fd90cef1447904d3b009...
At TigerBeetle, we have our own branchless implementation here:
https://github.com/tigerbeetle/tigerbeetle/blob/e996abcf7154...
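For readers who haven't seen the technique, the general branchless shape looks something like this (a hypothetical sketch of the textbook approach, not TigerBeetle's actual code):

```rust
/// Lower bound: returns the first index whose element is >= target.
/// The loop body has no early exit and a data-independent trip count,
/// so the compiler can lower the comparison to a conditional move (cmov)
/// instead of an unpredictable branch.
fn branchless_lower_bound(slice: &[u32], target: u32) -> usize {
    if slice.is_empty() {
        return 0;
    }
    let mut base = 0usize;
    let mut size = slice.len();
    while size > 1 {
        let half = size / 2;
        // Advance `base` iff the probed element is below the target;
        // this conditional assignment is cmov-friendly.
        if slice[base + half] < target {
            base += half;
        }
        size -= half;
    }
    base + (slice[base] < target) as usize
}
```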