Better Console Logging

I have used ROS (1+2) for many years and also teached courses with it in university. Most students or people who used ROS for the first time have difficulties to find out what the problem is if their robot is not doing what they expected. When I was asked for help, I just scrolled through the terminal where they have executed their launch and in the vast majority of cases found the problem in a few seconds, because it was something simple like a node crashing on startup. Of course, there was an error message saying what the problem is, but somehow people new to ROS (even those with a lot of coding experience) had trouble seeing it. I also remember many times where it took me a lot of time to actually find a relevant output.
So what is the problem? I think there is just too much spam in the terminal output which makes it really hard to get an overview. In this post I would like to discuss different ways on how to improve this.
I know, that it is possible to use tools like rqt_console to better structure the logging output. But I think there are some good reasons why people sometimes rely on the terminal output:

  • See general error messages that are not ROS logging (e.g. exceptions from Python)
  • Get a quick feedback after “ros2 run” or “ros2 launch” without switching windows
  • Working without GUI, e.g. via SSH

To better visualize the problem look at this screenshot.


The first is the output of starting the logging_demo which just starts a single node that sends some logging messages. Half of the output does not provide information:

  • The log level is already reflected in the color, writing [Info] is redundant.
  • I have never used the timestamp in the terminal log for anything but it takes away a lot of space. I is also very hard to read as a human.
  • The node name can be important in some cases, to know where the message is coming from. However, in my experience, most of the time I already know based on the error message which node it is. Even if I am interested in it, I would like to read the messages first to see which are interesting.

It gets worse when we use launch, as we can see in the next screenshot.


Now launch adds another tag. This one is basically the same as the name of the node and does not provide any information. There are some cases where it does, but again in the majority of cases it is just spam. Another problem, in my oppinion, is that due to the different length of the node names, the start of the actual message on the horizontal axis does not align. This makes it already harder to read. In this example the length difference is not so large, it can be much worse.

So how can we improve this?

You can influence the behavior of the logging using the environment variable RCUTILS_CONSOLE_OUTPUT_FORMAT (see Logging and logger configuration — ROS 2 Documentation: Rolling documentation).
The default is “[{severity}] [{time}] [{name}]: {message}” and leads to outputs like above. We can easily change this to tackle the problems mentioned above. The output is much more readable in my opinion.

If we really need additional information, we can also put it after our message and the node name.

However, if we start something with launch, we will still end up with the extra tag that is introduced by launch.


Now we will have to get a bit more creative and use ANSI control sequences. (I also proposed to have an environment variable which lets you control the launch output but it has not been merged yet Add option to set output_format via environment variable by SammyRamone · Pull Request #713 · ros2/launch · GitHub. So in the future the following might not be necessary.)
ANSI control sequences allow you to manipulate the position of the cursor and how text is displayed. See ANSI escape code - Wikipedia for details on this. The short version is that we can use the character sequence “\x1b[” to initiate an ANSI control sequence. For example, we can use “\x1b[1k” to perform the “erase in line (EL)” command. Then we can do “\r” to move the cursor on the first position. Now we can just erase the launch tag infront of us. Now we get an output, which I find much easier to read.

We can also use this method to change how parts of the messages are displayed. This is especially interesting if we want to show additional information but still highlight the main message. For this we can use the ANSI “Select Graphic Rendition (SGR)” code. In this example, I made the message itself bold, the node name italic, and the additional information darker. Furthermore, the color based on the message type is only used for the message itself and not the additonal information. I added a dash infront of the message, which is helpful when using launch to differentiate messages from the nodes with messages from launch itself.


It is also possible to split one log into multiple lines. This leads to a clean look, but takes up more space.

By the way, you can also use these codes in the message itself. So if you want to send a message that blinks in the terminal, you can using “\x1b[6m”. Just beware that your message will look not so nice in rqt_consol or in a log file.

Here are some interesting codes
Newline: \n
Carriage return: \r
Delete everything until this point: \x1b[1k
Reset font to normal: \x1b[0m
Bold: \x1b[1m
Faint: \x1b[2m
Italic: \x1b[3m
Underline: \x1b[4m
Slow blink: \x1b[5m
Rapid blink: \x1b[6m

See also ANSI escape code - Wikipedia for more information and the color codes.

Some Tips:

  • Beware when stetting a foreground color, as it will override the colors for the different log levels. However, if you only set the color after displaying the message itself (e.g. to put the name where it came from in a different color), then it will work fine.
  • Don’t forget to reset any formatting changes with \x1b[0m or you might mess up any output that follows in the terminal
  • Put the settings for logging in your .bashrc or .zshrc so that they are automatically set for each terminal.

If you have any further tips or ideas on how to get a nicer terminal output, I would be interested in hearing it.

11 Likes

Thanks for sharing @SammyRamone !
My tip would be to have {file_name}:{line_number} somewhere in the RCUTILS_CONSOLE_OUTPUT_FORMAT.

This will log as something like:

/home/ubuntu/ws/src/example/src/main.cpp:5

This tells you exactly which file and line the log is comming from. In addition, if you launch your nodes from a terminal inside an IDE (eg. in VS Code - commonly done if you’re using dev containers), it will create a link to the exact file / line where the logging happens.

2 Likes

Totally agree with this post - the default logging output has always bothered me, especially the unnecessary timestamp! I wish the default in ROS would be updated to be more readable, as suggested here. Thanks for sharing!

I definitely agree that nodes crashing is not clearly visible in the logs. Those logs should probably have FATAL severity instead of ERROR.

However, I have to disagree on timestamps. To me, it’s an essential part of the log. When I have to diagnose an issue on a robot, the time when it occurred is sometimes the only information I have. But I would prefer if the default used ISO format instead of seconds since the epoch.

The node names are useful to quickly filter logs from a specific node to see the chain of events that might have led to the issue.

I never thought of the severity as redundant with the color, but I guess you’re right (unless you’re writing the logs to a file and losing the coloring).

Having a way to align sections would be great. It would make it easier to tune out useless fields (while keeping them for when they are useful). Maybe something like Python format, which supports formatting types that could be used to achieve this.

It’s a hard balance to find between having too much information and not enough. The current logs might have too much information (except maybe the process name, I’ve never found it useful), but I think it’s better than not enough. I have this issue with colcon that, by default, hides all interesting logs and is pretty much unusable without specifying an event handler.

3 Likes

Are you aware of the {date_time_with_ms} format specifier? It ouputs the time in ISO format, e.g. 2024-06-11 09:29:19.304

See here.

3 Likes

Great job sharing this. I also find the node name and timestamp useful when I’m trying to determine relative times.
This is probably unrelated to your post but I would like a couple more levels of verbosity. I feel the existing ones are too limited.
Also, I think the main problem with console logging is that it runs in the context of the thread that posts the message. In a real-time system, the log messages should really be marshalled to a lower-priority thread.

There’s also this useful tool: log_view - ROS Wiki .

Another tip I have in ROS 2 is to create a logger object in each class, using a consistent naming convention.

By creating the logger object locally in each file or class, you avoid the need to pass around a shared pointer to the node (or its logger) just for logging purposes.

Using a naming convention for logger names also enables more granular control over log levels. For example, at my workplace, we follow the convention <package_name>.<file_name>.

So, for a file called bar.cpp in a package called foo, we would have:

// At the top of bar.cpp
static const auto LOGGER = rclcpp::get_logger("foo.bar");

// When logging:
RCLCPP_INFO(LOGGER, "Hello World");

This allows you to set logging level at a file level:

ros2 run ... --ros-args --log-level foo.bar:=DEBUG

or at a package level:

ros2 run ... --ros-args --log-level foo:=DEBUG
5 Likes

I do think that the named logger, that you are creating, instead of the node logger, does not publish on /rosout. If that’s important.

Great post, but getting rid of the severity of the log such as INFO, DEBUG, etc. just because YOU can see the color isn’t a good option in my opinion. What about colour-blind people? Also, you spoke about people new to ROS but you adviced to remove node names just because you already knew the origin of the problem just reading it.

Despite all this, I didn’t know about that EV which can be really interesting in case there are a lot of nodes running in the same terminal. Great post!

It might be a bit off-topic as it isn’t exactly “console” logging, but back in ROS1 I hacked a thing to forward all logging to the systemd-journal. This way, you could search the logs, filter by priority, by node, etc… like this:

journalctl --user -t move_group -p warning -S today -U '13:30' --grep "CONTROL_FAILED"

In ROS2 last time I looked, you could create a custom rcl_logging implementation, but had to use LD_PRELOAD to force its use as the packaged rcl_logging plugin was hard-linked to spdlog.

I understood the post as “hey people, these all options are available” rather than “hey, everyone should use this”. Thus colorblind people would not remove severities etc.

You might be right, perhaps I misunderstood the focus of the topic hehe. My apologies in that case!

Thank you everybody for the great ideas and tips! I am also glad to see that I am not the only one that is bothered by how the logging output looks.

I want to expand a bit on the following points:

Thanks for the input. Your post made me realize that I have not described the whole picture in my original post. I only talked about the logging output that is shown in the screen window. However, there are of course also the log files which (basically) save the same data. However, I have different expectations from them:

  • The screen should give me quick me a quick overview of what is happening. I want to see if a node crashes or gives warnings. This can either be while I launch it (when I am naturally looking at the window) or when I have the window still open somewhere on the screen and I perform some action (e.g. launch a different node in a different terminal).
  • The log should give more detailed information to find a non-trivial problem. If it is trivial (e.g. typo in launch file), I should already have found it from the output on the screen. If it is non-trivial, I will also not find it without additional info (like timestamp or line of code).
    So basically I would need to have two different types of logging configuration. Unfortunately, the RCUTILS_CONSOLE_OUTPUT_FORMAT will affect both. However, I think I found a nice hack to get around this. I already mentioned above that we can use a ANSI control code to remove everything in the line before us to get rid of the launch output. This ANSI control code is not affecting the log files, since these are just plain text files that will not interpret this code. So we can, for example, use the following configuration:
    RCUTILS_CONSOLE_OUTPUT_FORMAT="{date_time_with_ms} {name} {function_name} {file_name}:{line_number} \x1b[1K\r{message}" This will give us very detailed information in the log file. However, on the screen we will just the message. We can even add a delimiter between the types of data and we can easily make a .csv file if we want. The only downside is that we will have the control codes in the text, but they can be removed by a simple find-replace operation.

Yes, that is a very valid point. While I am happy to be able to remove this from my personal log settings, I would also agree that changing the standard to something that relies on color is not good. If we are talking about a better default solution, we could still reduce it from the whole words to just a letter. Luckily DEBUG, INFO, WARN, ERROR, FATAL all start with different letters, so the first one carries enough information in my opinion. This would save already a bit of space.

I think I was not clear on my intentions. I think it was a bit of both. Definitely a heads-up that these options are available, but also a bit of a push to reassess the default values.

1 Like

There is a wide variety of ROS users, from students and researchers to companies running in production, and different users have different priorities.

There’s always somewhat of a tradeoff between human-readability and machine parseability. In general, I think human-readability is more valuable for development or simple use-cases, and parseability is more important when building into a production ecosystem that can manage large amounts of logs from many systems. I think that the default log format does an ok job of being somewhat human readable and somewhat parseable, but there could be some improvements in both areas.

For parseability, I think that multi-line logs and relying on color or control sequences is a big no-no. But it’s great that the system allows for configuration so people can customize. And I 100% agree that it would be awesome if there was a no-code way to configure the logging for launch! The default launch format is fairly problematic for parseability because of how it treats its own logs differently from node logs.

One example of a common format that tries to balance human-readability and machine readability is logfmt. It’s a little more human-friendly than JSON for example, but has a strong enough structure to make it very easy to parse. However, a downside is that putting in key names can make it less compact. It is supported out-of-the-box by some observability engines, and its simple and consistent structure makes it easy to write parsers or regex for, unlike JSON, or the current default ROS format which is very order-dependent.

Another approach is to use a machine-readable format like logfmt or JSON, but then to use some sort of parser even for simple use-cases. For example hl can convert logfmt or JSON logs into a human readable format on the fly. So then you can get the best of both worlds.

1 Like

My main application has both ROS and non-ros logging, so I’ve been looking at tooling that ingest both nicely. So far I have found https://lnav.org/ to be a really nice UI for viewing log files, but have not yet configured it successfully as a sink for all my processes. So in the meantime, I’ve been doing

./process_that_starts_multiple_sink_sources > aggregate_log.txt
less +F aggreagate_log.txt

You can get fancier with the filename via something like "$(date +%Y-%m-%d_%H-%M-%S.txt)" as well.

So that I can still stream the output to the terminal for really obvious errors (unfortunately this appears to strip the color coding). But I can also bring up lnav to filter for more specific messages I am looking for.

Also, for people still on Humble, it is good to be aware of Logs from child loggers are not sent to /rosout · Issue #1694 · ros2/rclcpp · GitHub (which is fixed but is not backported) - tools that involve /rosout will be missing log messages.

Try less -r +F. This enables raw control characters which your terminal should then translate back to colors.

1 Like

We’ve also written a simple application to help us filter through large systems that are outputting a lot of messages: swri_console.

Wanted to share my findings with trying to set up lnav here. TL;DR: didn’t get it to work the way I wanted

While rqt_console and swri_console have a few nice features, I wanted something that launched inside the terminal otherwise I wouldn’t be able to get my team to use it. It’s somehow really hard to get people to look at logs from a different window, even if you create aliases and so on. So what I really needed is something where I can do ros2 launch/run xyz | lnav and it would show up in a similar format, but with some nice extras such as filtering. I didn’t want to do a bunch of grep/regex fu since I’d burned my fingers on dropping important logs before which is to be avoided at all cost. I tried the following two approaches with lnav:

  1. piping logs directly to lnav: this doesn’t work with launch files since the launch process logs information in a different format than the nodes and which cannot be changed (rcl_logging/#92, TODO in the code) and this isn’t an issue that will likely get fixed soon and lnav doesn’t support multiple formats in one stream
  2. So I tried instead using the logs written to file which are naturally split up based on the node vs launch process. lnav can handle multiple files in a directory with different formats, but now the issue was that those log writes are buffered every 5s to reduce CPU usage. That unfortunately can’t be configured either (same TODO essentially), so while we now have a beatifully configured logging system, the logs don’t roll in nicely, which is quite essential for daily debugging.

configuring lnav in the first place

Perhaps useful for someone else who wants to make progress on this.
warning: works but with the caveats of above section!

  1. install lnav (I believe a beta version is needed here due to recently fixed issues)
  2. set the log format for the console messages to a json format that lnav can easily deal with: export RCUTILS_CONSOLE_OUTPUT_FORMAT='{"level": "{severity}", "time": "{time}", "name": "{name}", "message": "{message}", "function": "{function_name}", "file": "{file_name}", "line": "{line_number}"}' and turn off coloring (lnav has its own coloring and it adds color chars): export RCUTILS_COLORIZED_OUTPUT=0
  3. create ~/.lnav/formats/installed/ros2-logs.json and add the following config to show lnav how to recognize ros logs and how to display them:
{
    "$schema": "https://lnav.org/schemas/format-v1.schema.json",
    "ros2_console_log": {
      "title": "ROS 2 Console Log",
      "description": "Log format for ROS 2 logs emitted using RCUTILS_CONSOLE_OUTPUT_FORMAT in JSON.",
      "file-type": "json",
      "timestamp-field": "time",
      "level-field": "level",
      "body-field": "message",
      "level": {
        "fatal": "FATAL",
        "error": "ERROR",
        "warning": "WARN",
        "info": "INFO",
        "debug": "DEBUG"
      },
      "line-format": [ {"field": "__timestamp__"}, " ", {"field": "__level__"}, " BONK ", {"field": "message"} ],
      "value": {
        "time": {
          "kind": "string"
        },
        "level": {
          "kind": "string",
          "identifier": true
        },
        "name": {
          "kind": "string"
        },
        "message": {
          "kind": "string"
        },
        "function": {
          "kind": "string",
          "hidden": true
        },
        "file": {
          "kind": "string",
          "hidden": true
        },
        "line": {
          "kind": "integer",
          "hidden": true
        }
      }
    }
  }
  1. you should now be able to pipe individual nodes’ output to lnav, e.g. ros2 run mypkg mynode 2>&1 | lnav -q. This doesn’t quite work for ros2 launch without string editing with sed -E 's/^\[[^]]+\] //'

If you want to try the log approach, do something like

export RCUTILS_LOGGING_BUFFERED_STREAM=0
LOG_DIR=$(mktemp -d /var/log/.ros/roslog-XXXXXX); ROS_LOG_DIR="$LOG_DIR" ros2 launch my_pkg my.launch.py 2>&1 > /dev/null | lnav -r $LOG_DIR

though as mentioned this will be buffered.

1 Like

In the production-scale ROS systems I’ve been involved with we have mostly ignored the ROS logging system as much as possible. Instead we redirect our logs to cloud logging providers like Mezmo or Betterstack.

The reason for this is that being able to log in JSON format has been invaluable, and it allows us to include debugging and metadata information beyond a simple log string.

Cloud logging providers make it easy to gather statistics, create errors for alerts and review history. We can also share links to log lines and collect time stamps, filter by system and process, it’s very handy.

For our own nodes and processes we use rich logging libraries to log with metadata directly to a logging agent.

For ROS nodes we don’t write, we use the RCUTILS_CONSOLE_OUTPUT_FORMAT to format to an intermediate format which then gets translated to Json by our logging agent (we use fluentbit currently)* And sent to the cloud.

The one downside to this is we don’t have rich logs in our rosbags, which is frustrating, but we have found the tools for reviewing logs from bags so inferior to the web services it just doesn’t matter.

* We did this because we ran into issues having RCUTILS_CONSOLE_OUTPUT_FORMAT log Json directly, but I see Achille above got it working, I’ll have to review what issue we ran into.