
I will corroborate one main point of the article: debugging with rr is so much better than a traditional debugger that I will only buy Intel hardware, even with all of the security flaws and corresponding performance hits, even though recent AMD CPUs would halve my half-hour compile time.

It really is that much better. Once you start using it, it's really hard to go back to a situation where you don't have it. It's like driving a car that can only turn left -- sure, if you need to go right you can always turn 270 degrees to the left (by rerunning over and over again to get closer to what you want to examine, assuming your problem is reproducible enough), but it feels burdensome and ridiculous.

If AMD fixed their performance counters so that rr could work, I would switch immediately.

I am also a fan of Pernosco. It is especially useful when hooked up to a continuous integration system that can hand you a ready-to-go recording of all failures. You can get assigned a bug with a recording to go along with it. Even if it doesn't look important or tractable to look into the ordinary way, having a recording makes it easy enough to take a glance and check it out that you'll actually do it. At shops that do a lot of system-level programming (as opposed to scripting), the productivity boost ought to be worth a serious amount to the bean counters.



FWIW, rr support for AMD CPUs is getting close. See https://github.com/mozilla/rr/issues/2034#issuecomment-55649...


I don't see any signs here that it is close, just that it is still broken in the same way.


Also from the issue: "the bug, whatever it is, could be in the kernel and not the hardware"


What prevents you from having a build server that uses AMD cpus? Budget or physical space limitations? Power consumption concerns? Too complicated to have different build and test boxes? Large binary and debug symbols that would result in no effective gains after network transfer time was accounted for?


In theory, nothing, and that's a good idea that I've considered. My past experience with distributed builds (distcc, icecc) hasn't been that great. They've worked, but they tended to break down or slow down, and I ended up spending more time maintaining my setup than I gained in compile times. Perhaps things have improved. I still have bad memories of trying to diagnose why the network transfers had slowed to a crawl again (on a local gigabit network.)

The other demotivator is noise. I work from home, and the other family members who share my office aren't keen on the sound of a fully spun up build server. I could run ethernet cable under the house to another room, maybe. (Relying on wifi bandwidth hasn't worked out very well. If only debuginfo weren't so enormous...)


> network transfers had slowed to a crawl again (on a local gigabit network.)

10 gigabit is pretty cheap nowadays. Just in case your problem could simply be solved with higher bandwidth...


It very rarely hit the actual bandwidth limit. It would start out close to it for a while, then drop down. And down. And down. Until it was using like 2% of the full bandwidth, but never completely stalling.


I have used gdb and various debuggers for a long time. They are very useful, but there are limitations.

"gdb rr" - tried it with demo helloworld. That works. But when tried it with a more complex program (FBOSS - Faceboss' Switch/Router open source program), gdb rr function core dump immediately.

Also, there are certain categories of bugs that are very difficult to use gdb with:

   * Multi-threaded / race condition bugs - gdb breakpoints affect the flow of the program and cause different behavior.

   * Bugs that can not be reliably reproduced.
 
   * Server programs or embedded system programs where the code can not be stopped.

   * This problem gets much worse when combined with performance-related issues - I worked on one system whose design required switching over to the backup system whenever it missed a heartbeat for 50 milliseconds. Any gdb breakpoint in that app would automatically trigger a hard failover to the backup.


BTW, I do use gdb a lot and can script gdb to do conditional breakpoints with script that auto dump variables, info etc.
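For example, something roughly along these lines (the function and variable names here are just made up for illustration):

  # dump.gdb -- load with: gdb -x dump.gdb --args ./myprog
  break process_packet if pkt->len > 1500
  commands
    silent
    printf "oversized packet: len=%d\n", pkt->len
    backtrace 5
    continue
  end
  run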


I'm not sure what you mean by "gdb rr". "rr" is a separate debugger that uses the gdb frontend but a different backend. Importantly, it does not utilize breakpoints during the initial program execution, precisely to address your first, second, and fourth points. The third point is still a problem, since it requires the program being debugged to run under rr from the start.


I would highly recommend looking into dtrace or bpftrace (depending on your platform).

They're not as powerful as a debugger, but you can use them when nothing else will work. They're totally safe to use in production, because they can't modify memory, and they have a very small performance impact when in use.
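For example, a bpftrace one-liner (the process name here is made up) that shows which files a live server is opening, without pausing it:

  sudo bpftrace -e 'tracepoint:syscalls:sys_enter_openat /comm == "myserver"/ { printf("%s opened %s\n", comm, str(args->filename)); }'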


If you really want to use rr on FBOSS, please file an rr issue and we'll look into it.


The wild thing is that scripting languages don't have that kind of a debugger.



Is there any equivalent on chromium? I believe Firefox devtools have been ported to chromium.


As far as I know, there isn't. This needs substantial support in the engine, so "devtools have been ported to chromium" is not enough.


Python had one:

https://morepypy.blogspot.com/2016/07/reverse-debugging-for-...

Never got popular enough to be ported to cpython, let alone python 3.

It's no surprise though: my experience is that most python devs don't know how to use a debugger. In fact, the huge majority of tutorials for learning python don't include one, nor do they explain python -m, venv, pip, the python path and other very important basics.

Even the people using pdb don't know very important commands like until, jump, up or tbreak.
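For instance (file and line numbers made up):

  (Pdb) tbreak app.py:120    # temporary breakpoint, removed after the first hit
  (Pdb) continue
  (Pdb) until 140            # run forward until a line past 140 is reached
  (Pdb) jump 120             # set the next line to execute (bottom frame only)
  (Pdb) up                   # move up one frame to inspect the caller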


Jupyter notebooks are also hilariously debug-unfriendly.


Wait what? Do you know about the %debug magic? It is the best thing ever, particularly when prototyping in the notebook which keeps most of the state.

I saw Fernando Pérez use it in a YouTube video. Shame I cannot find it anymore. He was saying he debugs his code only by placing 1/0 in the part of the code he wants to inspect. After the exception is thrown, %debug allows you to step into the stack and explore it via a repl. I found that crazy and tried it. It is life changing. That's how I've debugged my Python code for years now.
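If anyone wants to try it, it's roughly this (the function here is just an example):

  # cell 1: force an exception exactly where you want to look around
  def transform(rows):
      cleaned = [r.strip() for r in rows]
      1/0                    # deliberate ZeroDivisionError
      return cleaned

  transform(["a ", " b"])

  # cell 2: open a post-mortem pdb session on the last exception
  %debug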


Expecting users to sprinkle 1/0 through their code is arguably what I mean by "debug unfriendly".


Pdb works fine in notebooks though, and that's one of the numerous reasons students should learn it first, before any GUI debugger.


That implies that students don't stop listening as soon as anyone mentions the command line.

Those who most desperately need a debugger in order to learn are the first ones not to use it because there's no GUI.


It's sad really. I gave a talk a year ago on the basics of debugging with pdb. It turned out I was the only developer there who had ever actually used pdb, as opposed to keeping it on their bucket list.


Same with Arduino. Taking a break from professional embedded work to do some just-for-fun Arduino projects, I'm flummoxed without my debuggers and astonished at how unfamiliar the community is with them!


User name checked out ;)


Maybe it's because of the different kind of work I do, but while I have wrestled with gdb, frustrated at some bugs that were hard to spot, I never felt restrained by pudb despite it being, I guess, more limited.

All the bugs I had in python were easier to reproduce, probably because it provides less rope than C or C++ to hang yourself. Usually, stopping a bit before the exception and exploring the variable state was enough to spot the bug.

Isn't there something about garbage-collected, dynamically typed, GIL-constrained languages that makes their typical bugs easier to spot with a more limited debugger?


Depends. Image-based runtime languages such as Smalltalk and various Lisps will, crudely put, pause at the site of the error and give you options to fix it and continue. That already covers 99% of the OP's issues with gdb.

edit: Implied is that image-based runtimes by definition produce reproducible snapshots of the error, with the full power of the Lisp or Smalltalk introspection tools. Edit-continue is the icing.


OP here. Edit-and-continue addresses none of my issues with gdb. Read the paragraph "If you use a traditional interactive debugger..." again?


I think you are missing the implications of working with image-based language runtimes.

The CI or even the client can attach the live image of the error, the programmer opens up the image when he opens the ticket, and has the full venerable introspection tools of CL or Smalltalk. This directly addresses the reproducibility issue you raised in said paragraph. There are indeed occasions where you need a proper rr-like trace, but what fraction of bugs fall into that category.

For illustration purposes: Grammarly has an example of fixing an intermittent network bug on a production server under load on the much-HN-linked blog post: https://tech.grammarly.com/blog/running-lisp-in-production


> There are indeed occasions where you need a proper rr-like trace, but what fraction of bugs fall into that category.

Good question. I guess it depends on what you mean by "need".

If you mean "for what fraction of bugs will a developer be unable to figure out the bug, given only the state where the error surfaced, but given unlimited time", I don't know. Most developers I've worked with either have rr recordings available or they do the work to reproduce the bug locally so they can work backwards to the root cause by rerunning the test, so their experiences don't reflect those constraints.

If you mean "for what fraction of bugs is it valuable to have an rr recording", I think the answer is very clearly "almost all". Many rr users, some on this very HN thread, will testify that debugging with rr is almost always better than debugging with gdb alone, whether or not they are able to reproduce the bug on-demand for debugging with gdb.


To restate this: developers almost always want to work backwards in time, by rerunning the program under test if necessary, even if you argue that in some sense they don't have to in some cases. So I think providing a detailed snapshot of the state when the error surfaces, while useful, is definitely going to be seen as inferior to the experience of a full record-and-replay system (all other things being equal).


I agree that debugging in Common Lisp is much more powerful than debugging in GDB. For one thing, when you get a condition and your repl enters the debugger, you immediately learn what condition handler you should add to your code so that it can automatically handle this case gracefully in the future. And you have the full compiler available to you in that repl, so you can redefine functions, change variables, and so on. This lets you recover from the error and leave the program running and in a better state than it was before you started. This is actually a super power that you just don't have when you're debugging a program written in C. (Technically I suppose some IDEs have an edit-and-continue feature, but I've never heard of anyone using it to save a program running in production.)

On the other hand, rr gives you a different set of features. You can rewind the full state of the program to any previous point in time. I've used rr a lot, and this capability is extremely useful. It makes debugging heap corruption, buffer overflows, use-after-free, and many other complicated bugs much, much easier. You just set a watchpoint on the data that was overwritten, then run the program in reverse (using the reverse-continue command) until it gets back to the point where the write happened. It really is like having a super power, but it's a different kind of super power than the Common Lisp repl.
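Concretely, that workflow looks something like this (the expression is made up):

  $ rr record ./myprog       # reproduce the bug once under recording
  $ rr replay                # replay it; you get a gdb session at the start
  (rr) continue              # run forward to where the corruption is visible
  (rr) watch -l obj->field   # watchpoint on the clobbered memory location
  (rr) reverse-continue      # run backwards to the write that clobbered it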

Pernosco is another level above that. Instead of presenting you with the state of the program at a single moment in time and letting you run it backwards and forwards, it lets you query the full state of the program. Where before you would set a watchpoint and run backwards until you got to the previous write, with Pernosco you click on the value and it presents you with a list of _all_ writes to _all_ locations in the chain that leads to that value. For example, you can see that the value was written to the memory location from a register, and that before that it was read from some other memory location into the register, etc. It's really very cool.

There's no reason why a Common Lisp or Smalltalk runtime couldn't have these features in addition to a proper repl.


I have a dumb question.

> You can rewind the full state of the program to any previous point in time

Where does all this saved state go? I'm running a game at 60fps, and every frame tens of megs of data change. It wouldn't take more than a few seconds to run out of memory saving all that state. Further, if the game doesn't run at 60fps (or 90fps in VR) it's not usable, and if saving this state slows the game down to 20fps or 15fps I can't tell if my code is working, since the entire goal is to run at speed and check that it all feels good.

Is there some technique these kinds of tools use to make it possible to keep all changed state in this type of app, or do they only work for some other type of app?


By capturing all external inputs to the application, it can replay them many times and the results will always be identical.

https://arxiv.org/pdf/1705.05937


It's a good question. rr writes all the state to disk. Every syscall of any kind gets saved and replayed back the same way when you replay the recording. This means that if your program reads a network packet in the recording, during the replay it will read exactly the same network packet at exactly the same point in its execution.

On the other hand, outputs are not really recorded. During the replay your program will generate exactly the same output, so saving it to disk isn't necessary.

See http://rr-project.org/ for all the details.

There's no such thing as a free lunch, so it'll certainly slow your game down a bit. I recommend saving the trace to the fastest disk you can find. Also, you would want to run your game during development both with and without rr, precisely so that you know that you're not exceeding your time budget.
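If you want to measure the cost for your particular game, it's quick to try (binary name made up):

  $ rr record ./mygame       # play the session you want to capture
  $ du -sh ~/.local/share/rr # traces are stored here by default
  $ rr replay                # deterministic replay with a debugger attached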


It's just a pity that it's on the "If you have to ask, you can't afford it" level of expense.

I'd like to try Pernosco, but we don't use it at work. There's no way I could afford it for my open-source projects, and I can't lobby for it without first trying it.


I'm fairly certain that they're working towards letting anyone sign up to use it. Of course I've no idea what the pricing will actually look like once they get there, but I doubt it will be that bad.

The real expense is in integrating with lots of different environments; Roc mentioned that integrations with Github and Travis were already working, but not Gitlab yet. (On the other hand you can do without any integration at all. Take a look at https://github.com/Pernosco/pernosco-submit/; it will upload the source from your computer to your debugging session if it's not cloned from a recognized git/hg host.)


You can try it out right now here: https://pernos.co/debug/e9UFXkGq__8KQB-N6n59lA/index.html https://www.youtube.com/watch?v=LR0pms_mQuY

We definitely are not aiming for "If you have to ask, you can't afford it".


Good to hear.

I could pay a bit for personal use, and hopefully it'd still be useful without integrations etc; I don't think I could ask anyone else I'm working with (open-source wise) to do that.


That's what I like about Lisp environments, and sadly, after trying out Clojure, it can do no such thing (as of the time I tried it, about a year and a half ago): continuing at the point of the exception after allowing the user to change state interactively.


Clojure is heavily dependent on the host (JVM, JS) for such things, vs. a Scheme like Racket which is ground-up Lisp, or even a naive s-expr interpreter, which allow greater flexibility in this area.


The image contains the state at the point where the error was raised, but AFAIK it doesn't contain the history leading up to that. It's more like a coredump than an rr recording, right?


Not at all. Edit and continue is only a small part of what debuggers can be.


Doesn't need an image based runtime either. Ruby's "pry" is inspired by Smalltalk, in that it's providing functionality to manipulate the running environment, edit and reload code etc., and continue. During development I almost always have a make target to drop me in a pry session with all the app code loaded and initialized so I can play with the code as it is actually used.


My experience is that this only works if it's your own code and you have a full mental model of the state of the program. I worked on a lisp project and the lead would see a crash, type some lisp, continue, and the app would run. The rest of the team, not so much. Maybe we were all just bad lisp programmers.


Node-ChakraCore had Time Travel Debugging with editor support in VS Code


Python has several


"There's more than one way to do it"


I will concede that rr is likely extremely useful, but as a counterpoint to this and the entire article, catching bugs earlier and earlier in the development cycle beats better debugging any day. I will definitely try to add rr to my toolbelt if Ryzen support ever lands, but I still prefer catching problems by writing fast test cases instead. It's like debugging that happens automatically.


It's nice to say that catching bugs earlier is better. But what if that time is long gone?

Lately, I've used Pernosco to debug my changes to a multi-process-relevant, decade-old part of a two-decade-old codebase, which was single-process when written, only later became multi-process-relevant, and recently became even more multi-process. (And there are test cases, but there's a huge mass of them and they depend on decade-old comments that aren't actually true in the interesting situations.)

Pernosco is very useful for debugging multi-process interaction in cases that would be very hard to debug even with rr.


rr is exactly the harness to debug tests.

We have a system where, when a test fails, you get an rr replay of it to debug. That's the primary use for it.


OK, I'll just be honest: I had no idea that's how people used rr. I saw demos some years ago and assumed one used it similarly to gdb but with rewind.

I think this is an awesome use case for rewind debugging, that I had never thought of, and look forward to hopefully seeing it run on AMD Ryzen processors in the future, since I've bought into that ecosystem too much to go back already.


"I saw demos some years ago and assumed one used it similarly to gdb but with rewind."

You can do reverse debugging in gdb.

Here's how:

  Run: gdb --args /usr/bin/foo --bar baz
  Type "start", then "record", then "continue"
  Now you can try a number of reverse-debugging commands, for example:
    * reverse-stepi
    * reverse-next
    * etc


gdb's built-in recording doesn't scale. You can't record across system calls, thread context switches, etc, and it's about a 1000x slowdown during record and replay. (rr is more like 2x.) It is, unfortunately, an attractive nuisance, because a lot of people have tried it out and concluded that record-and-replay debugging is useless.


Ooh, have you blogged about that anywhere?


I don't see these as separate solutions. Running your tests in a debugger usually makes it so much faster and easier to find where it went wrong.

I would argue that finding bugs at compilation time would be a separate solution from this, and even earlier in the dev cycle.


Indeed, finding bugs at compilation is better, which is why I prefer languages with strong typing.

I reread my comment to make sure, but I am being painfully explicit that good debug tools are still useful. But I prefer good tests over even good debug tools. Thanks to automated testing, I very rarely even resort to printf debugging.

This isn’t really just lip service; I have a project right now where I literally can’t attach a debugger or run locally. Of course, that’s a flaw and unreasonable. But, with good test hygiene, you can still be pretty productive in the face of this. Not only that, pushing code feels safe, even without the ability to actually run the code locally.

Do I think debugging is bad? No. I think it’s worse than testing, which when done right can save you from a hell of a lot of manual debugging, even in the face of complex software with networked dependencies.

I get ire on HN every single time I mention the virtues of testing instead of debugging. I suspect people literally don’t believe you can exchange quantities of one for the other, but where I’m standing it’s obviously possible. Same thing comes up with strong typing frequently, where you have people making the claim that typings don’t prevent bugs - but anyone who’s transitioned a medium sized code base to TypeScript can attest to the improvements it can bring.

In my opinion, automated tests are almost literally just debugging done by a script instead of a human.

edit: It's also being pointed out to me that rr is being used to debug failing tests, which is not something I thought of and my ignorance may be contributing to some of the reason my comment was so poorly received here rather than the testing vs debugging comparison. (In many cases, for unit tests, I had never thought of using a debugger to try to find the problem, because the actual problem is often made obvious enough by the failed assertion output.)


I agree on the value of tests, but it somewhat depends on what you're working on as to how hard it is to get a certain level of test coverage. I tend to work on a garbage collector, which is notoriously good at collecting the flaws in all memory manipulation code anywhere, and funneling them down to a couple of crash sites where we are traversing memory. We use assertions very heavily to get mileage out of everyone else's test cases, and we add regression tests studiously, but we still end up in the debugger a lot.

Also, one surprisingly useful feature of rr is that just about everything is debuggable without needing to plumb debugger invocations through your whole system. It will record an entire process tree, so you can go back after the fact and select which process to replay and debug. It's incredibly useful. I've been so happy to ditch my weird wrapper scripts for running cc1plus under a debugger while compiling with a buggy static analysis plugin I'm working on. Now I just run `rr record make` and wait for it to crash. It's magical.
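In case that sounds abstract, the whole flow is just (flags from memory, so double-check them):

  $ rr record make -j8       # records make and every process it spawns
  $ rr ps                    # list the recorded process tree and pids
  $ rr replay -p cc1plus     # attach the debugger at the exec of cc1plus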


"a project right now where I literally can’t attach a debugger or run locally" ... this is exactly where record-and-replay tools like rr can change the game. You can (potentially) run that code under rr on the remote machine --- without interrupting it, which is often problematic --- then upload the recording somewhere (e.g. Pernosco) for debugging.


It matters how you say something. This doesn't leave much room for discussion, and that's why it attracts criticism. Even though your points are valid, they are not beyond discussion. Personally, I prefer interpreted languages because of the faster edit-run cycle; compilation time can be excessive. I prefer debugging over extensive testing, because testing takes time all the time and debugging only sometimes. Now, testing is still necessary for production-level applications pre-release, because you don't want to make people angry. But otherwise... I don't like obligatory typing because I find the extra effort of specifying the types usually isn't worth it.


OP here. I am obsessed with improving debugging, but I am also obsessed with Rust. Pernosco is mostly written in Rust. I am a huge fan of strong typing and tests, but I know from experience that they don't eliminate the need for debugging in complex systems.


Yes, automated tests are almost debugging by a script.

If you've ever used GDB in command-line mode then it's evident to you that it's the same thing. You can write tests of any level.

With a bit of funky GDB scripting, it's completely viable to solve the hard task of writing unit tests for old, untestable code without refactoring it, for example.
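A rough sketch of the idea (the function, argument, and expected result are invented):

  # test_legacy.gdb -- run with: gdb -batch -x test_legacy.gdb ./legacy_binary
  start
  print parse_header("GET / HTTP/1.1")
  if $ == 0
    echo FAIL: parse_header rejected a valid request\n
  else
    echo PASS\n
  end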


I guess if it doesn't even work on AMD, ARM must be out of the question...


https://github.com/mozilla/rr/issues/1373 is tracking that, but the short story is that ARM seems to not provide enough performance counters to make the current rr approach work at the moment. So it would take either CPU-side changes or a somewhat different approach.


Technically possible, if someone wants to sponsor: https://twitter.com/rocallahan/status/1199477451602067456



