Hacker News
Finding a kernel regression in half an hour with git bisect run (2018) (ldpreload.com)
209 points by 10000truths on July 28, 2021 | hide | past | favorite | 39 comments


We used to have to do this a lot on my team at Google, except with Google tools (Piper and Blaze). The difference is that because Google is a monorepo, you often end up with 50K+ changes to sort through, at which point the log2 property of bisection becomes necessary, not just convenient.


I’m sure you already know, but can you restrict it to changes from one or more directories to avoid that? Worst case, it’ll point you to the general change even if it was due to another repo necessitating the migration.
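In plain git, this restriction is built in: `git bisect start` accepts pathspecs, so only commits touching those paths become bisection candidates. A minimal self-contained sketch (the repo, directories, and the "bug" are all made up for the demo):

```shell
#!/bin/bash
set -e
# Build a throwaway repo where odd commits touch other/ and even commits
# touch dep/, then bisect restricted to dep/ only.
repo=$(mktemp -d); cd "$repo"; git init -q
git config user.email demo@example.com
git config user.name demo
mkdir dep other
for i in 1 2 3 4 5 6; do
  if [ $((i % 2)) -eq 0 ]; then dir=dep; else dir=other; fi
  echo "$i" > "$dir/f"
  git add "$dir/f"
  git commit -qm "commit $i ($dir)"
done
# Pretend the bug appears once dep/f reaches 4; other/ commits are noise.
cat > check.sh <<'EOF'
#!/bin/sh
[ "$(cat dep/f 2>/dev/null || echo 0)" -lt 4 ]
EOF
chmod +x check.sh
# The trailing pathspec limits candidates to commits that touch dep/:
git bisect start HEAD "$(git rev-list --max-parents=0 HEAD)" -- dep/ >/dev/null
git bisect run ./check.sh >/dev/null
git log -1 --format=%s refs/bisect/bad   # prints: commit 4 (dep)
```

Commits 1, 3, and 5 are never tested at all, which is exactly the "restrict it to some directories" trick.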


I worked with code that could be broken by a wide range of things: anything from a change in TensorFlow, XLA, device drivers, firmware, or a support library, to cluster software upgrades that happen asynchronously (sometimes months after check-in) from changes in the repo.

I used to update a major numerical Python library periodically, and sometimes the global test run would show that some Java code broke. At first, I assumed it was just a false positive, but some sleuthing showed the Java code had embedded a Python script that ran numpy to compute something over an array (whyyyyyyyyy), and numpy just happened to start outputting slightly different values for that computation.

These are the perils and promises of the monorepo.


Out of curiosity, was there a process to raise that design "oddity" to someone? Is there a way in the Google mono-repo to even know who is responsible for which section?


There are 'OWNERS' files which list email addresses for directories, but anyone can contribute any piece of code. Generally, it is like a giant open source project to the eyes of a developer.


Yes, you can use the 'blame' feature to identify specific lines, or look in Piper to see who edited a directory recently, in addition to OWNERS files.

By the time I finished maintaining numpy, I could more or less find the right person with a couple minutes of poking. Again, it's one of the things that Google managed to do well: a large-scale codebase with many developers working more or less in concert, with efficiency at scale.


You can still do the sneaky thing of querying the build graph and only getting changes in the transitive dependencies. This is almost always far smaller than the entire range between good and bad. For fast tests, the cost of doing the dependency lookup may not be worth it, but if your bisection will take more than 10 minutes, it's probably worth the effort.

It's still a little more complicated because then you have to bisect over a list of changes and not a raw range, but still.


log2 bisection is pretty necessary for just hundreds or even dozens of commits. I'm not linearly searching through the history of something that takes minutes to build and requires manual installation and test steps.


If you can automate the installation and test steps, then pointing `git bisect run script` at them is pretty damn awesome.
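A fully self-contained sketch of that workflow: build a throwaway repo with a known-bad commit, automate the check as a script, and let `git bisect run` find the culprit (the repo, file names, and the "BUG" marker are all invented for the demo):

```shell
#!/bin/bash
set -e
# Sixteen commits; the regression lands at commit 11 and persists after.
repo=$(mktemp -d); cd "$repo"; git init -q
git config user.email demo@example.com
git config user.name demo
for i in $(seq 1 16); do
  echo "version $i" > app.txt
  if [ "$i" -ge 11 ]; then echo "BUG" >> app.txt; fi
  git add app.txt
  git commit -qm "commit $i"
done
# The automated "install and test" step: exit 0 when the tree is healthy.
cat > test.sh <<'EOF'
#!/bin/sh
! grep -q BUG app.txt
EOF
chmod +x test.sh
git bisect start HEAD "$(git rev-list --max-parents=0 HEAD)" >/dev/null
git bisect run ./test.sh | grep 'first bad commit' || true
git log -1 --format=%s refs/bisect/bad   # prints: commit 11
```

Four automated test runs instead of fourteen manual ones, and no human in the loop.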


This is one of my absolute favourite things to do. There’s something so beautiful about having software automatically finding the bug for you.


This is why keeping good, linear history is so important. The software isn't magic. It won't be able to tell you where the bug is in that giant commit that touched every module in the codebase. Good history is your input to git bisect. It's the only reason (but an extremely compelling reason) to care about your git history.


Here is a bisect scenario that seems to be happening before my eyes.

I was using git bisect to find a problem; call it issue A. During the bisect, I discovered that some builds are not testable at all; let's say all instances of that problem are issue B. So the verdict from the git bisect is that there are four possible consecutive commits that could be the start of A.

But those four include commits afflicted with B, which seems related. It seems related because the last of those four bad commits suddenly fixed B, so builds became testable, but introduced A.

Since I was bisecting for A, I don't have a complete list of commits afflicted with B; git bisect just happened to find three commits afflicted with B.

Thus I have to do a second git bisect, this time validating for issue B, to see where exactly the builds became untestable.

That will hopefully put the story together: at commit X, a serious problem showed up, rendering builds untestable (issue B); then at a later commit Y, B was fixed but A started happening, which wasn't a problem before X.


Well, find the commit where the behavior is introduced, in any case.

Or, in some cases, multiple possible commits. If the bad commit is preceded or followed by untestable commits (declared that way via "git bisect skip"), they are implicated also.
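When the bisect is automated, `git bisect run` handles this via its exit-code protocol: 0 means good, 1-124 means bad, and 125 means the commit can't be tested and should be skipped. The sketch below stubs out the build and test steps (build_ok and has_bug are made-up names) just to show the classification logic a real script would use:

```shell
#!/bin/bash
# Stand-ins for "try to build this commit" and "run the regression test".
build_ok() { [ "$1" -ne 3 ] && [ "$1" -ne 4 ]; }  # commits 3-4 don't build
has_bug()  { [ "$1" -ge 5 ]; }                    # the bug lands at commit 5

check_commit() {
  build_ok "$1" || return 125   # untestable: same effect as `git bisect skip`
  if has_bug "$1"; then return 1; fi   # bad
  return 0                             # good
}

check_commit 2; echo "commit 2 -> $?"   # prints: commit 2 -> 0
check_commit 3; echo "commit 3 -> $?"   # prints: commit 3 -> 125
check_commit 5; echo "commit 5 -> $?"   # prints: commit 5 -> 1
```

With 125 in play, bisect narrows the result to a set of candidates around the skipped commits rather than blaming one of them outright.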


A while back, I wrote a Bayesian version of git bisect which, at least in theory, means you can fully automatically find your regression even if the bug is a nondeterministic (intermittent) one: https://github.com/ealdwulf/bbchop

Never actually used it in anger though. Unlikely to work in half an hour either. But you could leave a machine churning away and eventually get an answer.
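A cruder alternative to the Bayesian approach is to amplify the flaky test by repetition inside the bisect script: if a bad commit fails, say, ~30% of the time, 20 clean runs leave only about 0.7^20 ≈ 0.08% odds of a false "good". A sketch, with a deterministic stand-in for the flaky test (flaky_test here just fails every third call so the demo is repeatable):

```shell
#!/bin/bash
# Stand-in for a nondeterministic test: fails on every 3rd invocation.
n=0
flaky_test() { n=$((n + 1)); [ $((n % 3)) -ne 0 ]; }

# Repeat up to 20 times; a single failure is enough to call the commit bad.
verdict=good
for _ in $(seq 20); do
  if ! flaky_test; then verdict=bad; break; fi
done
echo "$verdict"   # prints: bad
```

The cost is that every good commit now takes 20 test runs, which is exactly the "leave a machine churning away" tradeoff.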


I can't imagine working on a big project without `git bisect run` or an equivalent. Especially on projects with a big user surface and a QA department, there will be a constant stream of bugs that have nothing to do with the most recent commits.


Fond memories of the time I used "git bisect" to find a bug, and the commit that introduced it was some breathtakingly-large codebase-reorganisation-but-shouldn't-have-changed-anything-functionally.

Thankfully the commit didn't have my name at the top. I made him fix it.


> Thankfully the commit didn't have my name at the top. I made him fix it.

EDIT: To make it more explicit than the passive-aggressive comment I initially wrote (below): I think we need to stop worrying about writing perfect, bug-free code, and instead acknowledge that anyone can make mistakes during refactoring, and that no one should carry that responsibility alone.

What happened to collaboration, support, and blaming the lack of automated tests instead of the person doing the hard work of a zero-velocity (but highly valued) refactoring?


This can very much depend on your company 'culture' (for lack of a better word) if you ask me.

Depending on the company you might be in one of a few situations and not in all of them should you blame something else.

For example, this person might have actually put this PR up with a description akin to: "I finally did the big xyz refactoring we've all been wanting to do for a long time but never dared. I checked a, b, c and d, had Peter from QA do some exploratory tests around the areas we were most afraid of, and I _think_ we should be OK. I know it's a huge change set, but please take an extra careful look. It _should_ be OK, but ya know, now I've alerted Murphy." Senior people actually gave it a good look in code review and it was finally merged. Shit happens. This guy totally deserved collaboration and help, not blame. Agreed.

Now second scenario: the guy that's known as 'the cowboy' around the company merged yet another ultimately objectively useless and simply opinionated refactoring that 'should change nothing' and got his buddies to OK the PR without even looking. Lo and behold this refactoring also went up in flames causing a dumpster fire yet again. This guy deserves nothing but to be made to fix this up all by himself. Maybe he will finally learn. If he doesn't it might be a good idea to part ways with him. If the company can't do that and protects him maybe it's time for the good guys to go to a company that deserves them instead.


The person who knows the change and thinks that it shouldn't have changed anything is probably in the best position to figure out why it broke something after all.

They were probably refactoring according to a pattern and missed something. Still, the person with the context gets the bug.


Eh, depends on what you’re optimizing for. For fixing the bug as quickly as possible, yes. For spreading that context throughout the team or getting other features out that the person with context is best equipped to ship, no.


If it is just reorganisation of project files then the fastest way to fix the bug would likely be to just revert the commits.


Consequently, this highlights why the first person has an incentive to fix it: if their code gets reverted to fix the issue, they will by definition be forced to rework it before they can merge again.


I was vaguely flippant in my post and it lacked nuance. Actually this project [and my workplace in general] is very friendly and collegial, he was happy to do it, he even already knew where to look since I had done the bisect and otherwise triaged the rest of the issue. We collaborated on an even larger-scale refactoring earlier this year.

Mainly my point was the first paragraph: that bisect is awesome right up until the issue is in a ten-thousand-line commit.


This isn't blame. You broke it, you fix it. That's how people learn from mistakes. Do you ridicule the person? Absolutely not. Does it belong to the team? Absolutely. But: you broke it, you fix it. See also the "Pottery Barn Rule".


On large codebases that doesn't hold, because you can't expect everybody to know all the code and how it should behave. If you broke it but there was no test, it's not your fault.

"if you liked it, then you should have put a test on it".


git bisect works remarkably well on monorepos like Nixpkgs, where a commit hash lets you build software exactly as it was at a given point in time with complete dependencies. I bisected[0] 15K commits overnight to pinpoint when exactly a Haskell package failed, and sure enough the culprit was a glibc update from 2.31 to 2.32.

[0] https://github.com/NixOS/nixpkgs/issues/107358


Git bisect run still doesn’t (?) support `--first-parent`, so unless people keep every commit in merged branches passing CI, the tool is very difficult to use. I wish that feature would make it into git. Another useful feature would be the ability to specify additional code (not in the repo) to leave alone during the bisect. Usually (in my experience, almost invariably) you need a new test to bisect in the past, which requires some scripting gymnastics to patch the tree after each checkout in the bisect.
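A sketch of those gymnastics: since the regression test doesn't exist anywhere in old history, a bisect wrapper can apply it as a patch after each checkout and reverse it before the next step. Demonstrated below on a throwaway single-commit repo; every file name is made up:

```shell
#!/bin/bash
set -e
# A "historical" commit that predates the test we wish had existed.
repo=$(mktemp -d); cd "$repo"; git init -q
git config user.email demo@example.com
git config user.name demo
echo 'hello' > lib.txt
git add lib.txt; git commit -qm 'historical commit'

# The new test, kept outside history as a plain patch file.
cat > new-test.patch <<'EOF'
--- /dev/null
+++ b/check.sh
@@ -0,0 +1,2 @@
+#!/bin/sh
+grep -q hello lib.txt
EOF

# What a `git bisect run` wrapper would do at every step:
git apply new-test.patch               # graft the new test onto the old tree
sh check.sh && result=good || result=bad
git apply -R new-test.patch            # leave the tree clean for the next step
echo "$result"                         # prints: good
```

In the wrapper, a failed `git apply` would `exit 125` so that commits where the patch doesn't apply get skipped rather than blamed.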


I can't remember when exactly I found out about `git bisect` but it was life-changing and I never looked at git repos the same way.


git bisect is life-changing, but make sure you know about git rerere also. I don't use either of them on a daily basis, but when you need them ... wow, just wow.


I hadn't used `git bisect` before, but a month ago I used it to find a bug in the RTSP part of VLC: https://code.videolan.org/videolan/vlc/-/issues/25812.

It's one of those utilities that makes your life easier on so many levels.

I was wondering, if there were no `git bisect`, how difficult and time-consuming the bug hunting would be.


A hacky script to emulate git bisect should be fairly easy to set up. It's not much more complicated than writing a standard binary search.
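A sketch of such a script's core: plain binary search over an ordered list of revisions, with `is_bad` standing in for "check out and test" (the revision names and failure point are invented for the demo):

```shell
#!/bin/bash
# Invariant: everything before revs[lo] is known good, revs[hi] is known bad.
revs=(r1 r2 r3 r4 r5 r6 r7 r8)
first_bad=5                          # pretend r6 onward is broken (0-indexed)
is_bad() { [ "$1" -ge "$first_bad" ]; }   # stand-in for checkout + test

lo=0
hi=$(( ${#revs[@]} - 1 ))            # assume revs[hi] was already verified bad
while [ "$lo" -lt "$hi" ]; do
  mid=$(( (lo + hi) / 2 ))
  if is_bad "$mid"; then hi=$mid; else lo=$((mid + 1)); fi
done
echo "first bad revision: ${revs[$lo]}"   # prints: first bad revision: r6
```

The part git bisect adds on top is the bookkeeping: handling nonlinear history, skips, and restoring your checkout when done.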


Cool. Now I'm (benignly) curious what's keeping you on VLC 3.0.x...


It took me a moment to realise that the title refers to the Linux kernel and not to kernel regression from statistics - https://en.wikipedia.org/wiki/Kernel_regression


In addition to that, I find it confusing that the term kernel is often implicitly defined as kernel := Linux kernel, although there are many other kernels where `git bisect run` is useful.


It might be a personal thing but I find the word kernel confusing both in the OS context and in linear algebra, statistics, ML. I find it very nondescriptive in each of those cases. It's almost like "kernel" doesn't invoke anything in my brain and I have to pause and think about it every time.


I swear the first thing that pops into my head, every time, is "corn". And then I go through the exercise you describe after convincing my brain this is definitely not about corn.


`git bisect run` and fuzzing are two tools in my toolbox that I don't use that often, but when I do, I'm always amazed by results.


I recently used `git bisect` to find the latest compatible code revision for a DB snapshot by checking the db migrations applied in that snapshot. Bisecting 80K commits takes a few seconds. Magical stuff.


Reading comprehension; I’ve heard of it.

I read “biscuit run.”

Good technique.



