Our (and your) plan for the ROS1 -> ROS2 migration

As a company that relies heavily on ROS for its robotics products, we are facing the questions of “if”, “when” and, above all, “how” to switch to ROS2, and we are trying to understand the best path forward. Have any of you already thought about it? Have you already designed and prepared the migration, or are you in the middle of migrating? Are you able to do it incrementally, without stopping normal development? Would you like to share your thoughts here?

Our current idea

Our system is composed of several (100+) ROS nodes, developed in C++ or Python, deeply interconnected through ROS topics, services and actions.

We studied different possible paths to a ROS2 system, taking into account that we cannot stop development and dedicate all our development resources to the migration. This is particularly true because the migration has little to no visible short-term effect on the product side. In fact, we expect a reduction of performance during the migration, due to bugs or unforeseen effects of the changes in the code as well as in the middleware.

We understood that the migration from ROS1 to ROS2 is not just a matter of changing the API, since some pieces of software work in completely (or subtly) different ways: actions (built on five topics in ROS1, and on three services and two topics in ROS2), dynamic parameter reconfiguration (unneeded in ROS2), nodelets (unneeded in ROS2), the concurrency model (meaning how the asynchronous subscribing / publishing threads are structured in ROS1/ROS2 and in C++/Python), etc.

We have designed two possible paths for our migration. Both share the following principles:

  • we will migrate node by node, to make it easier to debug possible issues; this requires the ros1_bridge, which our initial tests show to be a potential bottleneck. We can possibly alleviate this by carefully choosing the order in which nodes are migrated (always keeping as few topics as possible shared between ROS1 and ROS2, although it is not completely clear if, how and to what extent this is possible); actions are still not supported by the bridge (although we know there is active development for them)
  • the final goal is a working system whose whole codebase uses purely ROS2 (without additional interfaces, ROS bridges, etc.); however, while removing the ROS1 middleware (and thus the ros1_bridge) from our robots is urgent, the use of additional interface APIs or other temporary tricks can be acceptable for a while
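
The idea of choosing a migration order that keeps few topics on the bridge can be explored offline, before touching any node. The sketch below is only an illustration under stated assumptions: the node graph is given as a plain dictionary (node and topic names are made up), and the next node to migrate is picked greedily as the one that leaves the fewest topics shared between the ROS1 and ROS2 sides. A greedy choice is not guaranteed to be optimal, and services and actions are ignored.

```python
from typing import Dict, List, Set

# Hypothetical node graph: node name -> topics it publishes or subscribes to.
GRAPH: Dict[str, Set[str]] = {
    "camera": {"/image"},
    "detector": {"/image", "/objects"},
    "planner": {"/objects", "/odom", "/cmd_vel"},
    "base": {"/cmd_vel", "/odom"},
}


def bridged_topics(graph: Dict[str, Set[str]], migrated: Set[str]) -> Set[str]:
    """Topics used by at least one migrated node and one non-migrated node."""
    ros2_side: Set[str] = set()
    ros1_side: Set[str] = set()
    for node, topics in graph.items():
        if node in migrated:
            ros2_side |= topics
        else:
            ros1_side |= topics
    return ros2_side & ros1_side


def greedy_order(graph: Dict[str, Set[str]]) -> List[str]:
    """Migrate, at each step, the node that keeps the bridge load lowest."""
    migrated: Set[str] = set()
    order: List[str] = []
    while len(migrated) < len(graph):
        # Sorting makes tie-breaking deterministic (alphabetical).
        candidates = sorted(set(graph) - migrated)
        best = min(
            candidates,
            key=lambda n: len(bridged_topics(graph, migrated | {n})),
        )
        migrated.add(best)
        order.append(best)
    return order


if __name__ == "__main__":
    print(greedy_order(GRAPH))
```

In a real system the dictionary would have to be filled from the actual node graph, and the count could be weighted by message rate or size rather than treating every topic equally.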

The two paths differ as described below:

  • rospy2/roscpp2 path. We found a couple of interface libraries that allow us to keep our ROS1 code while using a ROS2 middleware: dheera/rospy2 and dheera/roscpp2 on GitHub. Pros: in theory, this allows us to reuse our current codebase as it is or with limited changes. Cons: these two libraries (roscpp2 in particular) are at a very early stage of development, so we would need to extend or finish them ourselves. If we follow this path, we would first need to bring these two libraries to a usable state and then start migrating our nodes to ROS2 (i.e., compile them in a ROS2 workspace, potentially modifying the package structure, and run them using a ROS2 launcher), using the bridge to let them communicate with the rest of the system.
  • ROS1as2 path. We can follow the opposite policy and develop a facade (called ROS1as2 from now on) that allows us to gradually change the API used by our code while still running on the ROS1 middleware and workspace. Pros: this allows for a slower migration of the code (which could also be a con), since some parts can keep using the ROS1 API directly. Cons: we have to develop this facade from scratch.
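
To make the facade idea more concrete, here is a minimal sketch of what a ROS1as2-style wrapper could look like in Python. Everything in it is hypothetical: the class name Ros1As2Node, the reduction of QoS to a plain queue depth, and the fake backend are assumptions for illustration. The method names and argument order mirror rclpy's create_publisher / create_subscription, and the calls underneath mirror rospy's init_node, Publisher and Subscriber. The ROS1 module is injected so that the sketch runs without a ROS installation; on a robot one would pass the real rospy module instead.

```python
from types import SimpleNamespace


class Ros1As2Node:
    """Hypothetical facade: ROS2-style method names, ROS1 calls underneath."""

    def __init__(self, name, ros1):
        # `ros1` is expected to behave like the rospy module.
        self._ros1 = ros1
        self._ros1.init_node(name)

    def create_publisher(self, msg_type, topic, qos_depth):
        # rclpy order is (msg_type, topic, qos); rospy order is (topic, msg_type).
        return self._ros1.Publisher(topic, msg_type, queue_size=qos_depth)

    def create_subscription(self, msg_type, topic, callback, qos_depth):
        return self._ros1.Subscriber(
            topic, msg_type, callback, queue_size=qos_depth
        )


def make_fake_ros1(log):
    """Stand-in for rospy that only records calls (illustration only)."""

    def init_node(name):
        log.append(("init_node", name))

    def Publisher(topic, msg_type, queue_size=None):
        log.append(("Publisher", topic, msg_type, queue_size))
        return SimpleNamespace(
            publish=lambda msg: log.append(("publish", topic, msg))
        )

    def Subscriber(topic, msg_type, callback, queue_size=None):
        log.append(("Subscriber", topic, msg_type, queue_size))
        return SimpleNamespace(callback=callback)

    return SimpleNamespace(
        init_node=init_node, Publisher=Publisher, Subscriber=Subscriber
    )


# Node code written against the facade already looks like ROS2 code.
calls = []
node = Ros1As2Node("talker", make_fake_ros1(calls))
pub = node.create_publisher(str, "/chatter", 10)
pub.publish("hello")
```

This covers only the easy part. Services, actions, parameters and the executor / callback model are where the two frameworks really differ, and a facade would have to emulate those behaviours, not just rename calls.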

In the following I describe the two paths in detail. Please keep in mind that each path is to be followed node by node: every single node goes through these steps, and the system slowly migrates to ROS2 one node at a time, relying on the ros1_bridge during the months in which the system is in a hybrid ROS1/ROS2 state.

Decisions / actions to be taken beforehand

| rospy2/roscpp2 | ROS1as2 |
| --- | --- |
| :gear: Install a ROS2 middleware on the robots, so that ROS2 nodes can run. The system should also contain ros1_bridge. | The deployment of a ROS2 middleware can be delayed. |
| Carefully choose an order of nodes / subsystems that, at every step, minimizes the topics/services/actions that must be shared between ROS1 and ROS2 nodes. | Deciding the order in which nodes are migrated is neither urgent nor important, but it should not be postponed too long. |
| Implement the action interfaces and all the missing features in roscpp2, make it possible to include the library without copying it into every package, etc. | Implement the ROS1as2 facade from scratch, including the API for managing topics, services and actions. |
| Decide how to deal with features that differ between ROS1 and ROS2 (1. parameter server and node access to global parameters, 2. nodes with dynamic parameter configuration, 3. nodes with nodelets). Code relying on these features must be migrated anyway, even when using rospy2/roscpp2. | Decide how to deal with features that differ between ROS1 and ROS2 (1. parameter server and node access to global parameters, 2. nodes with dynamic parameter configuration, 3. nodes with nodelets). Code relying on these features must be migrated anyway, even when using ROS1as2. |

For each node - part 1

| rospy2/roscpp2 | ROS1as2 |
| --- | --- |
| :file_folder: Modify the workspace / build system (i.e. CMakeLists.txt, package.xml, etc.) so that it can use the ROS2 facilities (e.g. colcon / ament). | The workspace and the build system remain tied to ROS1. |
| :pen_ballpoint: Modify the code of the node so that it imports the rospy2 libraries or #includes the roscpp2 headers. Deep changes are required for features that are not compatible between the two frameworks (e.g. nodelets, dynamic parameters, etc.). | :pen_ballpoint: Modify the code so that it pretends to use the ROS2 API while having ROS1 underneath. Deep changes are required for features that are not compatible between the two frameworks (e.g. nodelets, dynamic parameters, etc.), but they can be postponed, since the node still uses ROS1 underneath. |
| :two: Start running the node in a ROS2 environment, connecting it to already migrated nodes through the ROS2 middleware and to ROS1 nodes through the ros1_bridge. | :one: Keep running the node in a ROS1 environment, connecting it to other nodes through the ROS1 middleware. |
| :lady_beetle: In case of unexpected behavior, correct/debug the code (either by fixing the rospy2/roscpp2 interface or, if that is not possible, by rewriting the code using the ROS2 API directly; bugs in this case are more probable since the code is just pretending to be using the ROS2 API). | |

Actions required before starting part 2 below (ROS1as2 path only)

| rospy2/roscpp2 | ROS1as2 |
| --- | --- |
| | :gear: Install a ROS2 middleware on the robots, so that ROS2 nodes can run. The system should also contain ros1_bridge. |

For each node - part 2

| rospy2/roscpp2 | ROS1as2 |
| --- | --- |
| :pen_ballpoint: Modify the code to use the ROS2 API. If the node works as expected in a ROS2 environment, this step can be postponed. | :file_folder: Modify the workspace / build system (i.e. CMakeLists.txt, package.xml, etc.) so that it can use the ROS2 facilities (e.g. colcon / ament). |
| | :pen_ballpoint: Correct / improve the code if it was not completely using the ROS2 API. Change the parts of the code that relate to features that are not compatible between the two frameworks (e.g. nodelets, dynamic parameters, etc.). |
| | :two: Start running the node in a ROS2 environment, connecting it to already migrated nodes through the ROS2 middleware and to ROS1 nodes through the ros1_bridge. |
| | :lady_beetle: Check for bugs / unexpected behaviors due to the use of ROS2 underneath (since the code uses the ROS2 API directly, this is less likely to happen than in the other path). |

There was a very relevant ROSCon talk this year: “Migrating from ROS1 to ROS2 - choosing the right bridge”, available on Vimeo.

Thanks to the US Thanksgiving holiday slowing down the continual flood of PRs, I finally got the time to review the actions support PR recently. I’m confident that we can have it merged by the end of the year. It will certainly happen in time for Iron Irwini.

My recommendation is to migrate subsystem by subsystem rather than individual nodes. Divide your system into groups of nodes that have minimal connections between them, and port one chunk at a time. You can run bridges between the chunks, including running multiple bridges to avoid bottlenecks.
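
If running several bridge instances does turn out to help, the bridged topics have to be divided among them somehow. The sketch below shows one possible way, under assumptions: the topic names and bandwidth figures are invented, and the split is a simple greedy balancing (heaviest topic first, always assigned to the least loaded bridge). Each resulting group could then be given to a separate bridge instance configured with an explicit topic list; how that is done depends on the bridge setup and is not shown here.

```python
import heapq
from typing import Dict, List


def split_topics(load: Dict[str, float], n_bridges: int) -> List[List[str]]:
    """Greedy balancing: heaviest topic first, onto the lightest bridge."""
    if n_bridges < 1:
        raise ValueError("n_bridges must be at least 1")
    # Heap entries are (total load so far, bridge index).
    heap = [(0.0, i) for i in range(n_bridges)]
    groups: List[List[str]] = [[] for _ in range(n_bridges)]
    # Sort by descending load; the topic name breaks ties deterministically.
    for topic in sorted(load, key=lambda t: (-load[t], t)):
        total, idx = heapq.heappop(heap)
        groups[idx].append(topic)
        heapq.heappush(heap, (total + load[topic], idx))
    return groups


# Hypothetical bridged topics with estimated bandwidth in MB/s.
LOAD = {
    "/image": 30.0,
    "/points": 20.0,
    "/objects": 1.0,
    "/odom": 0.1,
    "/cmd_vel": 0.05,
}
groups = split_topics(LOAD, 2)
for i, topics in enumerate(groups):
    print(f"bridge {i}: {topics}")
```

With these made-up numbers the image topic ends up alone on one bridge and everything else on the other, which is the kind of isolation of heavy topics the recommendation seems to aim for.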

Thanks for the recommendation. Since each of our subsystems is the responsibility of a single team, it would also be easier from an organizational point of view. I guess you also suggest not starting on another subsystem before the migration of the previous one has finished (or mostly finished), correct?

Regarding your suggestion about using multiple bridges, I am not sure I understand correctly: as far as I understood, the bridge already spawns different threads for the different topics to be bridged, so how would “multiple bridges” (I read it as multiple instances of the ROS bridge node, am I correct?) help with the bottleneck?

I have a few small tips for once you start making code changes:

  • Make a list of your dependencies and check if they’re ported to ROS 2. You’ll want to resolve those before starting porting your own packages.
  • Assuming your code base is divided into packages that depend on each other, I’ve found it easiest to port each package in its entirety from the bottom up.
  • Unit tests make porting a lot faster because they give quick feedback. Spending just a little time writing tests before starting helps a lot. Even tests as simple as “make sure the class doesn’t throw when it’s constructed” are helpful.
  • When porting an individual package, I’ve found it helps to port the build system code first. It doesn’t really matter that it won’t compile yet. Being able to try compiling allows me to use compiler errors as a tool to find the next thing to convert.
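
As an illustration of the unit-test tip above, here is what such a minimal smoke test could look like in Python. SpeedLimiter is a made-up class standing in for something from your own package; the point is only that the tests touch no ROS API, so they keep passing (or start failing visibly) while the code around them is ported.

```python
import unittest


class SpeedLimiter:
    """Hypothetical class under test; stands in for a class you are porting."""

    def __init__(self, max_speed: float):
        if max_speed <= 0:
            raise ValueError("max_speed must be positive")
        self.max_speed = max_speed

    def clamp(self, speed: float) -> float:
        # Limit the speed to the interval [-max_speed, max_speed].
        return max(-self.max_speed, min(self.max_speed, speed))


class SpeedLimiterSmokeTest(unittest.TestCase):
    def test_constructs_without_throwing(self):
        # The simplest useful test: construction must not raise.
        SpeedLimiter(max_speed=1.5)

    def test_clamp_limits_speed(self):
        limiter = SpeedLimiter(max_speed=1.5)
        self.assertEqual(limiter.clamp(3.0), 1.5)
        self.assertEqual(limiter.clamp(-3.0), -1.5)
        self.assertEqual(limiter.clamp(0.5), 0.5)
```

Saved as a test file, this can be run with python -m unittest before and after each porting step.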

Hi there,

I’m not certain whether it’s applicable to your use case, but we, a RoboCup Humanoid League team (Hamburg Bit-Bots), recently went through the migration process, or rather are in the final phase of it. We run ~45 nodes with a good mix of C++ and Python, some of them with relatively high-frequency communication (~500 Hz). We already published a blog post about our “Experiences with ROS 2 on our robots”. You can read the thread here and the blog post here.

As for your ROS1as2 approach:
During our migration process, we repeatedly discovered various communication and performance problems where ROS2 did not work as expected or designed. A lot of these issues have already been addressed, but writing a “fake” ROS2 interface seems dangerous. Your migration using the facade might work well, but flipping the switch to real ROS2 will almost certainly introduce numerous unforeseen problems, which you would already have resolved one by one using the other approach.

In our case the challenge involves a different base middleware (Isaac SDK rather than ROS 1), but is quite similar. We are adopting a hybrid approach:

  • In a first step, some packages are split into a middleware-agnostic and a middleware-specific implementation, allowing the middleware-agnostic part to be shared and maintained during the transition.
  • In a second step, a small number of subsystems are ported to ROS 2 and bridged (with a customized bridge in our case). High-throughput, low-latency information is ideally kept off the bridge.
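
The split described in the first step can be sketched roughly as follows. This is not our actual code: the class names, the obstacle-stop behaviour and the plain Twist2D data type are invented for illustration. The logic class imports nothing from any middleware, and a thin adapter (one per middleware) converts incoming data and forwards the result through whatever publish function that middleware offers.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Twist2D:
    """Plain data type owned by the agnostic layer (hypothetical)."""
    linear: float
    angular: float


class ObstacleStop:
    """Middleware-agnostic logic: no ROS or Isaac imports in this class."""

    def __init__(self, stop_distance: float):
        self.stop_distance = stop_distance

    def filter(self, cmd: Twist2D, nearest_obstacle: float) -> Twist2D:
        # Block forward motion when an obstacle is too close; keep rotation.
        if nearest_obstacle < self.stop_distance and cmd.linear > 0:
            return Twist2D(0.0, cmd.angular)
        return cmd


class Adapter:
    """Middleware-specific shell around the shared logic.

    `publish` is whatever the middleware offers for sending a command;
    message conversion to and from Twist2D would also live here.
    """

    def __init__(self, logic: ObstacleStop, publish: Callable[[Twist2D], None]):
        self._logic = logic
        self._publish = publish
        self._nearest = float("inf")  # no obstacle seen yet

    def on_range(self, distance: float) -> None:
        self._nearest = distance

    def on_command(self, cmd: Twist2D) -> None:
        self._publish(self._logic.filter(cmd, self._nearest))


# Stand-in for a middleware: published commands are collected in a list.
sent = []
adapter = Adapter(ObstacleStop(stop_distance=0.5), sent.append)
adapter.on_command(Twist2D(1.0, 0.2))  # no obstacle seen yet: passes through
adapter.on_range(0.3)
adapter.on_command(Twist2D(1.0, 0.2))  # obstacle within 0.5: linear zeroed
```

Because the logic class has no middleware dependency, it can be unit-tested on its own and reused unchanged on both sides of the bridge during the transition.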