During today’s navigation WG meeting, we had some more discussion about the use of life cycle nodes and actions in the ROS 2 navigation code. The discussion came back to issues that several people have with the design of the APIs for these features, and from there to the fact that there is not much discussion about the detailed design and implementation of many features. Several of us feel that time pressure is leading to designs that are driven by implementation, to features that are not fully fleshed out before being implemented, and to what were assumed to be prototype implementations never changing and then being picked up by an increasing number of users, which makes them hard to change. We would prefer to see long-term design that aims for “we will eventually get to here”. Even if the implementation does not get there immediately, at least everyone will know where we are going and that any current design or implementation is not the final result.
The design repository is good and has allowed many people outside the OSRF, myself included, to contribute to designing concepts for ROS 2, and in some cases to put quite a lot of detail into specific parts (the launch facilities come to mind). But the documents in this repository more often than not do not go into detail on things like APIs or library organisation, which are important details that affect how developers can use the feature being discussed. I have seen two particularly prominent examples. First, life cycle nodes are implemented as a separate object type, which requires duplicating much of the API and means life cycle nodes can’t be used in some parts of the ROS 2 API. Second, actions are implemented external to the Node class while topics and services are included in the Node class, which is not what many were expecting before implementation began. I have also seen the design of topic names often come up in security-related discussions, such as this issue.
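To make the two examples concrete, here is a minimal Python sketch of the shape of the problem. Everything in it (`Node`, `LifecycleNode`, `attach_diagnostics`, `create_action_client`) is a hypothetical stand-in written for illustration, not the real ROS 2 client library API, and the real classes are considerably larger.

```python
# Hypothetical stand-ins for illustration only; not the real ROS 2 classes.

class Node:
    def __init__(self, name):
        self.name = name
        self.publishers = []

    def create_publisher(self, topic):
        # Topics are created through a method on the node itself.
        self.publishers.append(topic)
        return topic


class LifecycleNode:
    """A separate type that does not inherit from Node."""

    def __init__(self, name):
        self.name = name
        self.state = "unconfigured"
        self.publishers = []

    def create_publisher(self, topic):
        # Same body as Node.create_publisher: the API has to be duplicated.
        self.publishers.append(topic)
        return topic


def attach_diagnostics(node):
    """Hypothetical library helper that was written against Node only."""
    if not isinstance(node, Node):
        raise TypeError("expected a Node, got " + type(node).__name__)
    return node.create_publisher("/diagnostics")


def create_action_client(node, action_name):
    """Actions live outside the node type, as a free function taking a node."""
    return (node.name, action_name)


plain = Node("planner")
managed = LifecycleNode("controller")

attach_diagnostics(plain)             # accepted
try:
    attach_diagnostics(managed)       # rejected: not a Node
except TypeError as err:
    print(err)

# Topics use a method on the node, actions use a free function.
plain.create_publisher("/plan")
create_action_client(plain, "navigate")
```

In this sketch, any helper written against `Node` has to be widened or duplicated before a life cycle node can use it, and users have to learn two calling conventions, one for topics and another for actions. These are the kinds of API-level consequences that a design document could surface before an implementation exists.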
We recognise that there are limited resources being put into ROS 2 development, and that those resources are also being asked to maintain and develop ROS 1, create new ROS 1 releases, maintain a build farm, and so on, in addition to developing ROS 2 under huge time pressure. However, the main problem is not so much the lack of time being put into detailed design of core ROS 2 features as it is the lack of an opportunity for people to comment on the details until a pull request with a completed implementation comes along - by which time it is difficult to say “change the whole concept of how this is implemented”. When detailed discussions do happen in GitHub issues, it is usually after the fact, and frequently the discussion thread just stops without a resolution - probably because the relevant people have so much on their plates. It was also mentioned today that these discussions often end up going in circles due to the limitations of talking in text all the time. This is in contrast to the working groups for non-core features, which meet regularly (weekly, in the case of the navigation WG) and talk about the requirements, goals, etc. that are driving how many things will work. These weekly meetings help push things along and help prevent discussions from just stopping.
The purpose of this thread is not to accuse anyone of ignoring detailed design. The conversations on the issues linked above show that the Open Robotics developers put a lot of thought into how they design APIs and implement features. I do feel that in general the core ROS 2 libraries are well designed, but there are a growing number of places where I have concerns, and I know that others have places that concern them, too. What I think we need is:
- A venue for regular discussion of detailed design and implementation issues before we get presented with an implementation, where that discussion is not buried amongst the dozens of other issues that come across our GitHub notification lists every day.
- Active participation in this venue by anyone working on a core ROS 2 feature.