I took a stab at the same problem a while ago. Since the upper bound on the number of iterations depends only on the input length (roughly log2 of it), if you write your search so that extra iterations don't change the result, you can use switch fallthrough to "unroll" the loop and avoid the per-iteration branch.
https://github.com/ehrmann/branchless-binary-search/blob/mas...
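To make the idea concrete, here is a minimal sketch in C. It is not the code from the linked repo, just an illustration under a few assumptions: the array is sorted ascending, the length fits in a `uint16_t` (so 16 cases are enough), and the names `search`, `bit_width`, and `STEP` are made up for this example. The halving step is written so that once `len` reaches 1 it becomes a no-op, which is what makes falling through extra cases harmless.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: number of bits needed to represent x (0 for x == 0).
 * A portable loop is used here; real code would more likely use a
 * count-leading-zeros intrinsic. */
static unsigned bit_width(uint32_t x) {
    unsigned w = 0;
    while (x != 0) {
        ++w;
        x >>= 1;
    }
    return w;
}

/* One halving step, written as arithmetic instead of an if.
 * When len == 1, half is 0, so base and len are left unchanged. */
#define STEP                                            \
    do {                                                \
        size_t half = len / 2;                          \
        base += half * (size_t)(base[half] <= key);     \
        len -= half;                                    \
    } while (0)

/* Returns the index of key in the sorted array a[0..n), or -1 if absent.
 * With duplicates, the index of the last matching element is returned. */
long search(const int *a, uint16_t n, int key) {
    if (n == 0) {
        return -1;
    }
    const int *base = a;
    size_t len = n;

    /* ceil(log2(n)) steps shrink len to 1; enter the chain at that depth. */
    switch (bit_width((uint32_t)n - 1)) {
        case 16: STEP; /* fall through */
        case 15: STEP; /* fall through */
        case 14: STEP; /* fall through */
        case 13: STEP; /* fall through */
        case 12: STEP; /* fall through */
        case 11: STEP; /* fall through */
        case 10: STEP; /* fall through */
        case 9:  STEP; /* fall through */
        case 8:  STEP; /* fall through */
        case 7:  STEP; /* fall through */
        case 6:  STEP; /* fall through */
        case 5:  STEP; /* fall through */
        case 4:  STEP; /* fall through */
        case 3:  STEP; /* fall through */
        case 2:  STEP; /* fall through */
        case 1:  STEP; /* fall through */
        default: break;
    }
    return (*base == key) ? (long)(base - a) : -1;
}
```

The switch is still one computed jump, and whether the compiler emits branch-free code for the comparison is up to the compiler, so treat "branchless" here as a goal of the sketch rather than a guarantee.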