Understanding Misconfigurations in ROS: An Empirical Study and Current Approaches

Hi all,

Ever spent hours trying to understand why your system is not working, only to find that the culprit was a tiny bug in a configuration? If so, you’re not alone!

We’re a team of researchers from Carnegie Mellon University and the University of Lisbon working on detecting misconfigurations in ROS! In our recent work, we studied the types of misconfigurations developers encounter by manually inspecting thousands of ROS Answers questions. Our goal? To build a comprehensive understanding of these issues so we can develop better tools and techniques to prevent them. You can find more details in our paper: Understanding Misconfigurations in ROS: An Empirical Study and Current Approaches!

But we can’t do it alone — we need your help! We are eager to hear your thoughts on this subject:

  • Have you encountered misconfigurations in your ROS projects, and how difficult was it to detect them?
  • What strategies or tools have you found most effective in detecting configuration issues?
  • Are there specific areas within ROS development that you believe require more focused research on misconfiguration prevention?

We’d love for you to check out our paper on ROS misconfigurations and its dataset, and share your thoughts on how we can work together to make ROS systems misconfiguration-free.

Thank you for your time, and I look forward to your thoughts!

Best,
Paulo
(https://pcanelas.com)


Hi, Paulo!

This is an interesting paper. Thanks for sharing!

Indeed, misconfigurations tend to be common in ROS systems, given their heterogeneity and the integration of multiple packages.

I see that some misconfigurations are strictly tied to the state of the hardware or environment. However, many of them could be caught by static analysis, and a tool for that could be a contribution derived from your study.

Regards,
Michel
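To make the static-analysis suggestion above a bit more concrete, here is a minimal sketch of one such check: flagging subscriptions that have no publisher or whose message type differs from the publisher's. The dictionary-based system description and the `find_topic_mismatches` helper are hypothetical and only for illustration; they are not taken from the paper or from any ROS tool, and a real analysis would first have to extract this information from launch files and source code.

```python
from typing import Dict, List

# Hypothetical, simplified system description:
# node name -> {topic name: message type}
Endpoints = Dict[str, Dict[str, str]]


def find_topic_mismatches(publishers: Endpoints, subscribers: Endpoints) -> List[str]:
    """Report subscriptions with no publisher or with a mismatched message type."""
    # Collect every published topic together with its message type.
    published: Dict[str, str] = {}
    for topics in publishers.values():
        published.update(topics)

    problems: List[str] = []
    # Sort for a deterministic report order.
    for node, topics in sorted(subscribers.items()):
        for topic, msg_type in sorted(topics.items()):
            if topic not in published:
                problems.append(f"{node}: no publisher for {topic}")
            elif published[topic] != msg_type:
                problems.append(
                    f"{node}: {topic} expects {msg_type}, "
                    f"publisher sends {published[topic]}"
                )
    return problems


if __name__ == "__main__":
    pubs = {"lidar": {"/scan": "sensor_msgs/msg/LaserScan"}}
    subs = {
        "planner": {"/scan": "sensor_msgs/msg/LaserScan"},
        "mapper": {"/laser_scan": "sensor_msgs/msg/LaserScan"},  # wrong topic name
    }
    for problem in find_topic_mismatches(pubs, subs):
        print(problem)
```

A check like this only covers what is visible before the system runs, so it would not help with the hardware- or environment-dependent cases mentioned above.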

We’ve encountered misconfigurations many times, typically:

  1. Misconfigurations that resulted in obvious symptoms without really pointing to the cause; or
  2. Misconfigurations that did not have (obvious) symptoms, so we didn’t know there was an issue and thus weren’t looking for it.

To identify and fix these misconfigurations, we’ve found visualizations of system execution to be invaluable. In particular, we’ve been using ros2_tracing to collect runtime/execution data and Eclipse Trace Compass to visualize it. I talked about this at ROSCon 2023: Improving Your Application’s Algorithms and Optimizing Performance Using Trace Data (slides, video). This kind of tool makes some issues really obvious.

I think everything around the executor(s) deserves more research. Specifically, given a system/application made up of multiple nodes, how can everything be orchestrated (e.g., split into N processes using executor X and executor Y) for optimal performance?


Excuse the slightly off-topic question, but I find your mention of runtime traces interesting.

I have been working on an easy approach to generate runtime monitors that verify certain properties over a trace of messages (online and offline). Do you feel that such a tool would have been helpful in your case (writing down properties)? Or do you feel that the visualization you used is fit for the job?

For example, verifying that topics are correctly remapped should be easy enough with both methods, but perhaps you ran into more complex issues where property verification would have been the go-to solution.
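As a rough illustration of the remapping example, a property over a message trace could be checked along these lines. This is only a sketch under assumptions: the event format (timestamp, topic name), the `RemapMonitor` class, and the topic names are hypothetical, and it does not use the API of any existing monitoring or tracing tool.

```python
from dataclasses import dataclass, field
from typing import Iterable, List, Tuple

# Hypothetical trace event: (timestamp in seconds, topic name).
Event = Tuple[float, str]


@dataclass
class RemapMonitor:
    """Property: messages appear on the remapped topic and never on the old name."""

    old_topic: str
    new_topic: str
    seen_new: bool = False
    violations: List[Event] = field(default_factory=list)

    def observe(self, event: Event) -> None:
        """Feed one event; works online (per message) or offline (replay)."""
        _, topic = event
        if topic == self.old_topic:
            self.violations.append(event)
        elif topic == self.new_topic:
            self.seen_new = True

    def verdict(self) -> bool:
        """True only if the new topic was seen and the old one never was."""
        return self.seen_new and not self.violations


def check_trace(trace: Iterable[Event], old_topic: str, new_topic: str) -> RemapMonitor:
    """Offline use: replay a recorded trace through a fresh monitor."""
    monitor = RemapMonitor(old_topic, new_topic)
    for event in trace:
        monitor.observe(event)
    return monitor


if __name__ == "__main__":
    trace = [(0.0, "/scan"), (0.1, "/robot1/scan")]
    result = check_trace(trace, old_topic="/scan", new_topic="/robot1/scan")
    print("property holds:", result.verdict())
    print("violations:", result.violations)
```

The same `observe` method could be called from a subscriber callback for online monitoring, while `check_trace` covers the offline case of replaying recorded data.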

Both can be useful of course, but I don’t think that online/offline monitoring of some properties/expectations (there have been papers on this) would completely replace this kind of visualization tool and all the higher-level information/metrics it can provide. It would definitely be useful as a first layer for validation or as a complementary tool, though!