
Discussion on ROS to ROS2 transition plan

(I’d like to prefix this by saying I haven’t looked into the ROS bridge recently as we’ve been focused on ROS1 on Windows bring up. If my comments are invalid or otherwise cringeworthy, please disregard.)

One of the lessons from surviving multiple API transitions in Windows: you will fragment your developer ecosystem if the API set differences are too large. The best API transitions I’ve been through are the ones where the transition is gradual. (The Ship of Theseus comes to mind.)

Is there opposition to making changes to ROS1 to make it ROS2 aware, in order to make the transition gradual?

(Admittedly trivial) Examples of these changes:

Making it so a single workspace can host both ROS1 & ROS2 could help ease the pain. What is the feasibility of making Catkin “Ament aware”, and ament “Catkin aware”, so that you can drop a ros2 node into a catkin build and have it Just Work™?

Would it be feasible to make roslaunch auto-start the ros bridge when launching a ROS2 node?

Well, maybe not now, but in 2019/2020 (when the next normal release / LTS release is due). We need to continue with the regular release schedule past that point for the reasons I outlined in my previous post.

I fully agree with @ooeygui’s suggestion to make the API transition gradual in order not to split the developer base. Not out of charity for ROS1, but to give ROS2 a shot at survival; I don’t think it’s a foregone conclusion that ROS2 would win if the developer base is split at this point.

I’ve used ROS since 2010, and I’ve seen many examples of new APIs being introduced (rosbuild to catkin, tf to tf2, C++98 to C++11, …), and it always took many years until the majority of packages were updated. For example, tf2 has been available since at least 2013, and navigation switched to tf2 only last month. Since ROS1 to ROS2 is a much larger step than any of those examples, I’m afraid it will take much longer than that for the majority of packages to migrate to ROS2.

I fully agree with @ooeygui’s suggestion to make the API transition gradual in order not to
split the developer base. Not out of charity for ROS1, but to give ROS2 a shot at survival;

If OSRF is just going to stop the buildfarm for Melodic, then how will the ROS1 community survive? There would have to be a third alternative to an OSRF-supported ROS1 and an OSRF-supported ROS2.

I fully agree with @ooeygui’s suggestion to make the API transition gradual in order not to
split the developer base.

That was a necessary discussion to lead 3 years ago. Different people pushed for it at the time. See e.g. this discussion: https://groups.google.com/d/msg/ros-sig-ng-ros/coG7Wdkbb4E/cm5SYVe4AwAJ

And one reaction was this statement at ROSCon 2015: https://vimeo.com/142151734, starting at minute 46, where Brian Gerkey took the mic:
“I’ll just add that one of the things that […] we talked about is what we colloquially refer to as a library shim. So this is something that you can imagine in any language […]. It would present a ROS1 API, but under the hood it would call into the ROS2 libraries. […] There are going to be different migration paths for different use-cases. We’re not a gang of super-villains out to give you a really bad day. We want to make this as useful as possible and make it as easy as possible to migrate.”

Then at ROSCon 2016, we got this talk:
https://vimeo.com/187696091 (minute 29:46), with William basically saying they could not design a shim that works beyond simple cases. He also talks about experiments to unify the build system (minute 32).
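
To make the “library shim” idea concrete, here is a toy Python sketch. Everything in it is hypothetical: `NewStyleNode` and `OldStyleShim` are stand-ins invented for illustration, not the rospy, rclpy, or rclcpp APIs, and it does not reflect the design explored in the talks. It only shows the shape of the idea: the old-style calls stay, and each one forwards to a new-style object under the hood.

```python
from typing import Callable, Dict, List, Optional


class NewStyleNode:
    """Hypothetical 'new' API: topics are managed through an explicit node object."""

    def __init__(self, name: str) -> None:
        self.name = name
        self._subs: Dict[str, List[Callable[[str], None]]] = {}

    def create_subscription(self, topic: str, callback: Callable[[str], None]) -> None:
        self._subs.setdefault(topic, []).append(callback)

    def publish(self, topic: str, msg: str) -> None:
        # Deliver synchronously to every callback registered on this topic.
        for callback in self._subs.get(topic, []):
            callback(msg)


class OldStyleShim:
    """Hypothetical 'old' API: one implicit node, created by init_node()."""

    def __init__(self) -> None:
        self._node: Optional[NewStyleNode] = None

    def init_node(self, name: str) -> None:
        self._node = NewStyleNode(name)

    def subscribe(self, topic: str, callback: Callable[[str], None]) -> None:
        self._require_node().create_subscription(topic, callback)

    def publish(self, topic: str, msg: str) -> None:
        self._require_node().publish(topic, msg)

    def _require_node(self) -> NewStyleNode:
        if self._node is None:
            raise RuntimeError("init_node() must be called first")
        return self._node


# Old-style user code keeps working; the calls end up in NewStyleNode.
received: List[str] = []
shim = OldStyleShim()
shim.init_node("talker")
shim.subscribe("chatter", received.append)
shim.publish("chatter", "hello")
print(received)
```

The toy is easy precisely because it has no timing, threading, or discovery behavior to preserve; that is presumably where a real shim stops working beyond simple cases.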

I don’t think there has been any encouraging development since. The TSC notes (ROS 2 TSC Meeting Minutes: September 6th, 2018) say: “When will ROS1 be EOL’d? […] It would be good to have a tentative plan for migration. The longer people can use ROS1 the less motivated they will be to move to ROS2.” Given that, it seems that instead of providing a smooth migration path, OSRF and the TSC favour scaring the ROS1 community into trashing their existing work and investing in rewriting their systems for ROS2.0. Basically, make the community pay for OSRF’s decision to create ROS2 fast and backwards-incompatible, and worry about migration later.

So not sure if Brian today would repeat his above joke “We’re not a gang of super-villains out to give you a really bad day.” in light of all the talk of making Melodic the last release.

If OSRF were to do that, there would be a fork. All those companies and research labs deeply invested in ROS wouldn’t just say “well, it’s been fun, I guess we’re just going to do something else now”. Everything necessary to run one’s own build farm is open source, so it’s not like OSRF have a kill switch for ROS in their hands that they can push at any time. But probably there wouldn’t be just one, but multiple forks; also, without a source of funding, it’s unclear to me how the fork would be able to keep up the quality of support that OSRF is providing at the moment. Thus the “chaos, confusion and pain” I mentioned in my earlier post.

That said, I have confidence in OSRF to do The Right Thing™ (i.e., not trying to actively kill ROS), and I really believe Brian when he’s saying they’re not a gang of super villains. :smiley:

Your points about the shim are very good. I get the same vibe that it’s not going to happen, but I still believe we would really need one for the reasons that @ooeygui listed. :cry:

In the absence of a shim (I don’t know whether that is possible or not, but for the sake of this question assume it isn’t), what about a migration tool or script to do 80+% of the migration effort? Would that help ease the transition?

I agree with Martin. As a company building solutions based on ROS 1, porting all of our code base to a new middleware would be a major effort. Our robots perform mobile picking in warehouses, where performance and reliability demands require a well-understood system. Exchanging the foundation of this very complex system and getting to a similar performance level again will not be an easy step.

And even if there were a script for doing many of the basic steps, the real work begins afterwards, when you discover and fix all the bugs related to a different communication model, bugs in the still fresh and lightly tested libraries, bugs in core components, etc., which often only appear after long-term operation (cf. some of the bugs in roscore my colleagues have reported and fixed).

Since many of the new features of ROS 2 are not really required for our use case, we would have to decide whether it makes more sense for us to migrate, or to fork and build, in addition to our own packages, also the required ROS core packages. The latter would definitely not be my preferred outcome, and I hope that we can find either a good migration path or ways for maintaining both versions for a longer time.

I’m coming late to the discussion here, but it seems like there are a couple of points that are being conflated. I totally agree with @Martin_Guenther that ROS API changes are hard things to live through. I’ve been using ROS since before Mango Tango, and I shudder when I think of some of the transitions we’ve had to go through. However, I also agree with @mkhansen that not releasing another LTS after Melodic doesn’t kill ROS next year. Much of my lab is still on Indigo because some of our robots are locked to a particular ROS version, and the pain of upgrading outweighs the lack of recency of our version. If there were no more releases after Indigo, I wouldn’t have cared. Of course, everyone’s use cases are different, but I don’t think it’s accurate to say that ROS1 is over if there’s no N-Turtle. It does, however, start the clock ticking.

I guess that, for me, it breaks down to the question of “Are the improvements in ROS2 worth the pain of migrating all of my code?”. We’ve got a lot of code, and that would be a significant pain. Personally, I think that ROS2 will be better (for me and my students) than ROS1 is, once we get some features in place (looking at you, actions and the new nav stack). Given that we have to upgrade from Indigo next April in any case, I might just bite the bullet and move to ROS2.

In the end, I think that the question of limited resources might be the most powerful one. If Open Robotics has N hours to work on ROS, how many should they spend on ROS1 and how many on ROS2? If we need M hours of work to make ROS2 viable, then the math is simple: more time on ROS1 means a longer time until ROS2 is ready.

The worst thing we can do is fork. If this happens (and it’s a non-zero probability event, for the reasons that @Martin_Guenther lists), that’s bad for everyone.

I think that the issue of emergent bugs that @moritz mentions is the thing that worries me most about the transition, since this sort of thing is inevitable. However, my hunch is that there are going to be fewer of them, since we’re moving to a more robust communication system, and because Real Companies ™ with Real Software Engineers ™ are now starting to contribute to ROS (hat tip to @mkhansen and his crew, among others). How many of the emergent bugs in ROS1 were because of code written by grad students that wasn’t properly tested?

The big question, of course, is whether the new (hopefully) more mature and better-developed parts of ROS2 will dominate the inevitable emergent bugs. I have no idea, but I’m at least a little optimistic.

True, a number of components seem better designed, and software engineering practices are being used more from the start. The problem when porting a complex system built on top is that the application code, often silently, relies on properties of the communication layers (think, for example, about when exactly to ask tf for transforms and what to consider beforehand). If such behavior changes (even if it changes for the better), problems will appear.

That’s probably the crux of it: how many of these implicit assumptions will surface, and how long will they take to find and fix? I guess that’s an argument for both never moving to ROS2 (so we don’t have to deal with it) and moving to ROS2 immediately (so we don’t write any more code that relies on ROS1 idiosyncrasies). :-/

I’m not sure. To be honest, I haven’t had time to use ROS2 yet, so I cannot say how similar the APIs are, and therefore how viable it is to write such a script. (BTW, the reason I haven’t used ROS2 yet is because it’s still missing some features (actions, nav) that we require for our paid projects, so that ruled out ROS2 without even trying, even when starting a new project.)

In the past, I found conversion scripts pretty useful for small upstream API changes that a lot of dependent projects have to do. Some examples are here:

My hunch is that the ROS1 -> ROS2 conversion is more than just a few renames, so it’s hard to write a script that gets it right (but I could be wrong here). More importantly, when starting to switch over to ROS2, I would be willing to invest some effort into learning ROS2, and after that doing the conversion manually probably wouldn’t be more effort than running an imperfect script and fixing it up afterwards; especially since doing the renames is only 10% of the effort, the other 90% is debugging the subtle changes in behavior that @moritz mentioned. In contrast, any effort put into learning the pluginlib conversions is lost, because you only need that knowledge once. But if it’s easy to do, why not write a conversion script and see how well it works.
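
A minimal sketch of what the “renames only” part of such a script could look like, assuming a C++ package. The rename table is illustrative and far from complete, and `convert` and `ROS1_SNIPPET` are names made up for this example. The two header mappings are the usual ROS 2 counterparts as far as I know, but treat them as assumptions to check against the ROS 2 documentation. Lines that call into `ros::` are only flagged, not rewritten, because those calls are not one-to-one renames.

```python
import re
from typing import List, Tuple

# Illustrative, incomplete table of header renames (assumed mappings).
HEADER_RENAMES = {
    "ros/ros.h": "rclcpp/rclcpp.hpp",
    "std_msgs/String.h": "std_msgs/msg/string.hpp",
}

INCLUDE_RE = re.compile(r"#include\s*<([^>]+)>")


def convert(source: str) -> Tuple[str, List[int]]:
    """Rename known headers; return the new source and lines needing manual review."""
    out_lines: List[str] = []
    needs_review: List[int] = []
    for number, line in enumerate(source.splitlines(), start=1):
        match = INCLUDE_RE.search(line)
        if match and match.group(1) in HEADER_RENAMES:
            # Mechanical part: swap the header path.
            line = line.replace(match.group(1), HEADER_RENAMES[match.group(1)])
        elif "ros::" in line:
            # Non-mechanical part: leave the code alone and flag it.
            needs_review.append(number)
        out_lines.append(line)
    return "\n".join(out_lines), needs_review


ROS1_SNIPPET = """#include <ros/ros.h>
#include <std_msgs/String.h>
int main(int argc, char** argv) {
  ros::init(argc, argv, "talker");
  ros::NodeHandle nh;
  ros::spin();
}"""

converted, review = convert(ROS1_SNIPPET)
print(converted)
print("lines needing manual review:", review)
```

Even in this tiny snippet, the script handles two lines and leaves three for a human, which matches the hunch that the renames are the small part of the job.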

BTW, here is an example that’s probably closer to what a ROS1 -> ROS2 script would look like:

It’s a collection of scripts that was used to switch from the old rosbuild system to the new catkin system 5 years ago. I’ve never found it terribly useful for the reasons I outlined above: I had to learn catkin anyway, and the scripts didn’t get it 100% right, so it was easier for me to start off with a clean catkin template and manually move stuff over. It’s mostly mechanical work, but you end up with a clean and correct result.

I’ve had second thoughts about whether we really need a shim. In the end, what I want is to mix ROS1 and ROS2 packages in the same system. If the ros1_bridge works well enough, maybe that’s all I need (I haven’t tried it yet).

My reasoning goes like this: What’s great about ROS1 is the wealth of community-contributed packages. In recent years, I was forced to use a different robotics framework on some projects. The workflow was usually like this:

  • week 1: figure out what components we need for this project, discover that 90% already exist as ROS packages
  • month 1-4: port all those ROS packages over to Different Robotics Framework
  • month 5-6: actually implement the missing 10% from scratch

I often banged my head screaming “It would be so easy if we just used ROS”, and I would very much like to avoid repeating this experience, where Different Robotics Framework == ROS2. In the foreseeable future, this problem isn’t going to go away, because there are tons of ROS1 packages out there that are still useful, but no longer actively developed, so there is little chance of them being converted to ROS2 any time soon. If the ros1_bridge works perfectly, that would allow us to use the wealth of packages from ROS1, while porting everything over to ROS2 piecemeal. I’ll have to invest some time trying the ros1_bridge soon.

In the end, I think that the question of limited resources might be the most powerful one. If
Open Robotics has N hours to work on ROS, how many should they spend on ROS1 and
how many on ROS2. If we need M hours of work to make ROS2 viable, then the math is
simple; more time on ROS1 means a longer time until ROS2 is ready.

That simple math was true from the first day of work on ROS2 four years ago. It is not an argument for or against any strategy.

How about this question: Suppose that, under ideal circumstances, ROS2 could be ‘feature-complete, production-quality, viable as a complete replacement of ROS1’ in X years, and that providing further ROS1 releases added Y years to that. Who would actually be hurt by those additional Y years of delay? Amazon? Will they go bust over a delay of ROS2? Microsoft? IBM? Intel? What’s the worst that could happen, and to whom, for additional delays to that state of ROS2?

We can see clear and obvious harm in even talking about stopping ROS1 releases (though not harm to the financial sponsors of Open Robotics), but I fail to see any harm in delaying a “really ready” ROS2 further. The feature-complete DDS robotic middlewares that ROS2 is based on are available to anyone desperate enough; no team in the world is blocked waiting for ROS2 to become “really ready”.

So if Open Robotics wants to be a foundation serving the whole open robotics community (not just their financial sponsors), what is the morally best decision?

I’d like to echo that point: the underlying package code is open, as is the build farm code. There are multiple organizations that run their own copies of the build farm now, usually to produce custom distros for internal use. So if Open Robotics were to stop doing ROS 1 releases, anyone else can pick it up. You can even do it now. I’m of course not encouraging the creation of forks, which I think that most of us will agree would be bad, but it’s not crazy to have a backup plan if you’re inclined to worry that we’ll abruptly walk away.

Of course, as you also point out, you probably don’t have funding to support that kind of effort. But then, neither do we! Amazon is now generously funding our use of AWS resources to host our build farms, but nobody is paying us to spend time releasing or even maintaining ROS 1. For reference, from our internal staffing plans I estimate that preparing and releasing a ROS 1 distro requires 8-9 person-months of effort from our team. Since leaving Willow Garage in 2012, we’ve received approximately $0 directed at maintenance or improvement of ROS 1. And even if we had such funding, we’d still be limited by the number of people on our team. If you’re currently trying to hire software engineers, especially in the Bay Area, you can likely sympathize.

I’m not complaining. We’ve made it work because we believe that it’s important. But the trade-offs laid out by @mkhansen and @wdsmart are real: time that we spend on ROS 1 is time that we’re not spending on ROS 2, thereby (further) delaying the development of the latter. And the risk in that delay is also real: at some point organizations that eagerly want ROS 2 because ROS 1 doesn’t meet their needs will decide to stop waiting and instead build or buy something else, likely a proprietary solution.

For reference, from our internal staffing plans I estimate that preparing and releasing a ROS 1 distro requires 8-9 person-months of effort from our team.
Since leaving Willow Garage in 2012, we’ve received approximately $0 directed at maintenance or improvement of ROS 1.

So for each future release of ROS1, how much money would you need? Just enough for 8-9 person-months every year? Could it be raised via Kickstarter?

If the problem is finding developers, what other options exist? Does the releaser have to be in the Bay Area? Could it be a freelancer or an outside organisation, similar to what I guess is still happening with ROS Answers?

time that we spend on ROS 1 is time that we’re not spending on ROS 2, thereby (further) delaying the development of the latter. And the risk in that delay is also real: at some point organizations that eagerly want ROS 2 because ROS 1 doesn’t meet their needs will decide to stop waiting and instead build or buy something else, likely a proprietary solution.

Let’s check this risk. We assume there is a company X that will wait for ROS2 if no more ROS1 releases are done (8-9 person-months gained every year), but not wait if ROS1 continues to have releases. So company X has at least 3 options:

A: continue waiting for ROS2
B: build or buy something else
C: Provide a dev-seat equivalent of 8-9 person-months every 2 years

It seems to me that both A and C will always remain more economically viable than B. So how realistic is scenario B as a risk, really, if the crucial difference can be made by just dropping ROS1 releases?

A: continue waiting for ROS2
B: build or buy something else
C: Provide a dev-seat equivalent of 8-9 person-months every 2 years

Just throwing in my perspective here, having worked for a variety of robotics startups over the last few years. Option A is not an option: if ROS1 can’t be used because of its fundamental deficiencies, then they need either ROS2 or an alternative to ROS. Option C is not an easy one either: most developers who actually write code do not control the purse strings, and it’s not always easy to convince management to opt for C when that money could just as easily buy a license or support contract for an alternative solution that gives them a chance of delivering a product now.

I am one of those roboticists eagerly waiting for ROS2 to mature. I still have to deliver to my employer, so if ROS2 is delayed, I will go with option B. It’s a no-brainer, really.

To give a sense of scale for 9 person-months in the Bay Area, here are some numbers that I’m going to make up. I don’t know how much Open Robotics pays its people, but I’m guessing you can’t get a good software engineer for less than about $150,000 a year in the Bay Area. My rule of thumb (as someone who hires people and students at a university) is that people cost about twice their salary, once you add in benefits, insurance, and all the other stuff you have to pay for. So, $300,000. I don’t know what Open Robotics’s overhead rate is (overhead is the money you have to dedicate to keeping the lights on, and stuff like that), but here at OSU, it’s 53%. So, if I spend $100, I have to give an additional $53 to the University. This is probably high for a business, but I really have no idea. Call it 50% for easy math. So, your employee costs $450,000 a year. 9 months of this is about $337,000. Assume that I’m way off on all my calculations, and cut that in half (I think that this is unrealistically low, but let’s do it anyway). That’s roughly $170,000 a year, which is a heavy lift on Kickstarter.
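
The back-of-the-envelope estimate above can be written out step by step. All inputs are the made-up figures from the post, not actual Open Robotics numbers, and the variable names are just labels chosen for this sketch.

```python
# Made-up inputs taken from the estimate above (not real Open Robotics figures).
salary = 150_000           # guessed Bay Area salary, USD per year
loaded_multiplier = 2.0    # rule of thumb: people cost about twice their salary
overhead_rate = 0.50       # rounded down from the 53% university rate
months_per_release = 9     # upper end of the 8-9 person-month estimate

loaded_cost = salary * loaded_multiplier                 # salary plus benefits etc.
cost_per_year = loaded_cost * (1 + overhead_rate)        # add overhead on top
cost_of_release = cost_per_year * months_per_release / 12
pessimistic_half = cost_of_release / 2                   # "assume I'm way off"

print(f"loaded cost per year:   ${loaded_cost:,.0f}")
print(f"with overhead per year: ${cost_per_year:,.0f}")
print(f"9 person-months:        ${cost_of_release:,.0f}")
print(f"cut in half:            ${pessimistic_half:,.0f}")
```

With these inputs, 9 person-months come to $337,500 and half of that to $168,750, in line with the rough figures quoted in the post.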

Asking Open Robotics to make the “morally best decision” is a passive-aggressive way of claiming that they’re choosing to do the morally wrong thing. Passive-aggressively suggesting that there has been embezzlement at Open Robotics is just an ad hominem attack (at worst) or simply shows an ignorance of how companies operate (at best). Neither helps your argument.

From my point of view, as someone who has to raise money in a similar way to try to do similar things (albeit at a university in a research context), the explanation @gerkey gave makes sense. It’s not the optimal situation, and it sucks, but that’s the way it is. Unfortunately, the days of Willow Garage pumping millions of dollars into ROS development from a magical pot of money are gone.

Of course, all this arguing doesn’t help solve the problem.

Jumping ship and burning bridges has never been a great recipe when you are dragging a lot of weight (community) with you.

@bmagyar why do you talk about burning bridges? Who ever mentioned this? It is a bit sad to see this being brought up out of the blue. This post and similar events exist precisely to avoid that: to get feedback from the community and prepare it for the transition.

If you already have an in-flight or working ROS1 application, it can, and probably should, stay on ROS1 until that application has exhausted its useful lifecycle. Any new or next generation of that application should target ROS2, and effort should go into getting the libraries it needs ported there.

Running ROS 1 and ROS 2 side by side via https://github.com/ros2/ros1_bridge works perfectly, as we detailed in our ROSCon talk: https://roscon.ros.org/2018/presentations/ROSCon2018_ROS2onAutonomousDrivingVehicles.pdf.

So a) you do not need to make the transition overnight and b) you could even keep running your system as a ROS 1 and ROS 2 hybrid.

@tkruse no one is stating that Melodic will be the last release. We are polling the community and will decide based on the feedback.