I’m here to let you know about a new tutorial I’ve written to help new ROS/ROS2 users get started with gdb. All too often, we as maintainers get tickets (and emails… please don’t email us) with something along the lines of:
help, my thing failed, I don't know why. Please?!
[ERROR] [planner_server-6]:
process has died [pid 73156, exit code -11, cmd '/home/<install_path>/install/nav2_planner/lib/nav2_planner/planner_server --ros-args -r __node:=planner_server --params-file /tmp/tmpsiwxomfg -r /tf:=tf -r /tf_static:=tf_static'].
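For what it’s worth, the `exit code -11` in a log like this one is already a small hint. A minimal sketch, assuming the launch system follows the Python subprocess convention in which a negative return code -N means the child was terminated by signal N (the helper name `describe_exit_code` is made up for illustration):

```python
import signal


def describe_exit_code(exit_code: int) -> str:
    """Describe an exit code as reported in a 'process has died' log line."""
    if exit_code < 0:
        # Assumption: a negative code -N means "terminated by signal N".
        try:
            name = signal.Signals(-exit_code).name
        except ValueError:
            name = f"unknown signal {-exit_code}"
        return f"killed by {name}"
    return f"exited with status {exit_code}"


# On Linux, signal 11 is SIGSEGV, i.e. a segmentation fault.
print(describe_exit_code(-11))
```

That tells you the node segfaulted, but not where, which is why a backtrace is still needed.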
Look familiar? Absolutely. Looks like something I can help with? Absolutely not. My responses usually go something like this:
Give me your versions / OS / install type / other metadata
Get me a backtrace or print some logs to isolate the area of failure
If you’re so compelled, try fixing the issue you find.
A typical response is “What’s GDB? How do I get a backtrace?”.
This is where the new tutorial comes in. Many junior developers and new C++ folks haven’t run into GDB and Valgrind yet. There are also a number of ROS developers learning C++ and ROS together, so they might not have the experience to know how to react in these situations. This tutorial tries to standardize a workflow in ROS2 for getting backtraces that we can work with as maintainers. The goal is to be able to point users who file tickets with crash reports to a workflow that gets us useful information we can act upon.
As such, I’m very open to more details, guidance, etc. being added to this tutorial so that we can use it across the ecosystem as a tool for education and, more importantly, for making progress on issues reported by users who lack this knowledge. I encourage you to look over this and submit PRs with any additional information or context you think might be necessary to use for this purpose in your own projects. As always, these are living documents and the hardest part is just getting started. Help me make it awesome!
Also see https://github.com/ros2/launch_ros/issues/165, which I filed while writing this tutorial. I found that launch file prefixes aren’t giving back gdb session prompts after a crash, which is a problem. It was confirmed by other users on Foxy and master builds.
Edit: The tuto looks great.
It does not cover (possibly rightfully) launching a multi-node launch file all at once with --prefix. As far as I know that’s not possible at the moment - there is no --launch-prefix, but there are workarounds.
Awesome! It looks to be only available in ROS1 for the moment so I couldn’t update it to include that, but I’d love a PR to add a subsection about that should it be ported
Many IDEs will have some kind of debugger or profiler built in, but with ROS, there are few IDEs to choose from. Therefore it’s important to understand how to use these raw tools you have available rather than relying on an IDE to provide them.
While the gdb CLI is great when you’re in a tight spot or just want to quickly inspect a crash, I still find IDE debuggers advantageous for deeper debugging sessions, where code model navigation goes well with interactive breakpoints, and GUI visualizations help me inspect the call stack and memory contents at a glance. That said, it’s still not as straightforward to use for the same reasons stated in the tutorial, given ROS’s extensive use of launch orchestration tooling and dynamic runtime parameters.
Perhaps it might be worth adding a section to demonstrate attaching a debugger to an already running ROS process. This alleviates the need to alter the launch sequence, though isn’t as helpful if the node in question crashes before you can attach or when it is the startup of the process you wish to debug.
In such circumstances, if the node itself is relatively stateless with the rest of the subsystem, I can get away with letting that one node among many in the launch file crash, then read from the ros launch traceback over stdout to copy the process commands invoked by roslaunch itself and paste them into the program arguments for a new debugger session. A caveat is that roslaunch passes parameters via temporary yaml files which also go out of sync if you happen to alter the deriving launch files.
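The copy-and-paste step described above can be sketched in a few lines. This is only an illustration, assuming the crash line has the `cmd '...'` shape shown in the opening post; the function name `gdb_command_from_log` is hypothetical, and `gdb -ex run --args` is the usual way to hand a program and its arguments to gdb:

```python
import re
import shlex
from typing import List

# Crash line copied from the opening post (the <install_path> placeholder is kept as-is).
LOG = (
    "process has died [pid 73156, exit code -11, cmd "
    "'/home/<install_path>/install/nav2_planner/lib/nav2_planner/planner_server "
    "--ros-args -r __node:=planner_server --params-file /tmp/tmpsiwxomfg "
    "-r /tf:=tf -r /tf_static:=tf_static']."
)


def gdb_command_from_log(log_line: str) -> List[str]:
    """Extract the node's command line from a crash log and wrap it in gdb."""
    match = re.search(r"cmd '(.*)'\]", log_line)
    if match is None:
        raise ValueError("no cmd '...' field found in log line")
    node_argv = shlex.split(match.group(1))
    # 'gdb -ex run --args <program> <args...>' starts the program under gdb right away.
    return ["gdb", "-ex", "run", "--args", *node_argv]


print(shlex.join(gdb_command_from_log(LOG)))
```

The caveat from above still applies: the temporary parameter file referenced by `--params-file` may be gone or out of date by the time you rerun the command.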
It would be nice to have some sort of IDE integration that could hint to roslaunch which processes are marked for debugging, as having to single out a particular node from a nested set of launch files and reproduce the same startup environment from a separate shell session is a bit tedious.
P.S. if anyone knows a way to force roslaunch to always show the underlying commands used to spawn orchestrated processes without needing the respective node to crash, that’d be awesome.
I welcome PRs if you wanted to add a subsection about working with specific IDEs. I didn’t want to tie this tutorial to a new editing tool, since I think that would be even more confusing for most people if they weren’t already using <whatever IDE you write about>. A tutorial or introduction to a topic should never expose the reader to multiple new ideas or tools in order to explain or introduce another concept. However, we could add an “Advanced” subsection including this information, with the expectation that the reader has already read how to use the CLI tools.
Plus, as I explain, sometimes you need to do remote debugging sessions. Knowing how to use GDB on the command line is an important tool in your toolbox, and you shouldn’t rely solely on your IDE for it. Knowing the base tools is important; then you can use the convenient tools.
I thought about this over the evening, and a middle ground, if you didn’t want to write a full-blown Qt Creator tutorial, would be adding a note in the introduction with a list of IDEs that support colcon and have a debugger built in. If they have their own documentation on using their debuggers, linking to it could substitute for writing a new section.
To circumvent the issue of not being able to send input to GDB when using ros2 launch, opening the executable in a new terminal works pretty well. For instance, when trying to debug an instance of Ignition Gazebo while using a launch file, adding gnome-terminal (or any other terminal emulator) to the ‘prefix’ option allows you to continue using GDB.
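A rough sketch of what such a prefix might look like. The helper `gdb_terminal_prefix` and the flag table are made up for illustration, and the package and executable names are borrowed from the crash log in the opening post rather than from the Ignition Gazebo case:

```python
# Flag each terminal emulator uses to introduce the command it should run.
TERMINAL_EXEC_FLAGS = {
    "gnome-terminal": "--",
    "xterm": "-e",
}


def gdb_terminal_prefix(terminal: str = "gnome-terminal") -> str:
    """Build a launch prefix that runs the node under gdb in a new terminal."""
    try:
        flag = TERMINAL_EXEC_FLAGS[terminal]
    except KeyError:
        raise ValueError(f"unknown terminal emulator: {terminal}") from None
    return f"{terminal} {flag} gdb -ex run --args"


# Keyword arguments as they might be passed to launch_ros.actions.Node(...)
# in a ROS2 launch file; kept as a plain dict so this sketch runs without ROS.
node_kwargs = {
    "package": "nav2_planner",
    "executable": "planner_server",
    "prefix": [gdb_terminal_prefix()],
}

print(node_kwargs["prefix"])
```

Because gdb then runs in its own terminal window, its prompt gets keyboard input that the launch process would otherwise not pass through.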
@v-lopez I’d be more than happy to merge a PR adding documentation to this tutorial about using this tool! It would only take you 10 minutes but could really help other people thrashing for hours on this topic.
Oh my god, just discovering this. So useful and convenient. Confirmed working with ROS2 current rolling / Galactic.
This deserves to have more publicity!
I’m happy this all took off so well! This is one of the most successful tutorials in the Nav2 documentation. If there are any other common skills that are worth an explicit walk-through page, let me know!
18-ish months later, this has helped me guide users on Nav2, random packages, and ROS Answers to get the data required to dissect an issue, and all these projects are better for it!