On the `rclcpp::experimental` namespace

Hi,
I would like to bring up a discussion about the meaning, purpose and guidelines around the rclcpp::experimental namespace in the rclcpp repository.
This topic was started in last week meeting of the ROS 2 Client Library Working Group.

The rclcpp repository has a very short README describing this namespace:

Notice that headers in this folder should only provide symbols in the rclcpp::experimental namespace.
Also notice that these headers are not considered part of the public API as they have not yet been stabilized.
And therefore they are subject to change without notice.

Having some parts of the client library that are not subject to strict API/ABI requirements is IMO very important to ensure a fast development cycle.

From this description however, it’s not clear when/why a new feature should be placed in this namespace and, most importantly, when/how a feature should be promoted out of it and considered stable.

This namespace currently contains only two features: large parts of the C++ intra-process optimizations and the events-executor.
(fun fact: I was the author of both features, so maybe this is my personal namespace).

I would like to propose the following approach:

  • New client library user-facing features should always be introduced as part of the experimental namespace.
  • After spending 1 full-release in the experimental namespace, features can be put up for promotion (e.g. if something is developed today it will first go to rolling, and K-turtle will be its first full-release; so it can be up for promotion as part of L-turtle).
  • The promotion evaluation process can be started by anyone, not necessarily the original author.

Proposal for the promotion evaluation process:

  1. API and roadmap review: do we expect large / breaking changes to the APIs of the feature in the short future? If yes, it may be better to keep it as experimental.
  2. Known issues review: are there known bugs / limitations affecting the feature? If yes, fixing them can be indicated as a requirement before promotion. Note that this item is discretionary, and up to the repository maintainers, depending on the severity of the issues.

The idea is that 1 full-release cycle should allow maintainers and community to get a more clear opinion about the stability of the new feature, and for the main critical bugs and limitations to be discovered.

The possible outcomes of the promotion evaluation process should be that: either the feature is promoted or a clear list of requirements is presented, after which the feature will be automatically promoted.
Once the feature is promoted, a dedicated PR should be created to move it out of the experimental namespace.

FYI @clalancette @wjwwood @tomoyafujita

6 Likes

I understand where this desire comes from, but this seems really restrictive. This means that it will take ~ 2 years for any new feature to make it to stable. We are already slow in integrating new features; adding an additional 2 year delay to that doesn’t seem like a great idea, in my opinion.

At the end of the day, my feeling is that this needs to be on case-by-case basis. Some things are very large and complicated (like intra-process comms and new executors), so it makes sense to put them in experimental for a while. And some things are pretty small, and can be done directly in the main API.

IMO placing things in experimental is actually a way to go faster.

As a developer, it means that I have 1 year to refine the API and address bugs without having to worry that much about API/ABI issues.

People can, and should, use things from this namespace if they want them.
IMO being able to break APIs/ABIs for new features wouldn’t be as disruptive for the users.

This brings me an another question.

before the discussion on how to promote (though i know the history about EventExecutor and that has been developed and supported very good), the question here is why we should have experimental namespace and what is the guideline to get something under that namespace if we have that?

Without clear guideline, it would be chaotic under experimental namespace? cz people could try to push the experimental things breaking ABIs anytime in the mainline.

Do we have guideline that explains how to put things into experimental namespace? I think at least REP is required to approve?

and about the promotion, i think experimental is already beta (not alpha) feature since it is enabled by default and user can use that.

there could be user application rely on this beta feature even though API/ABI would be updated during development.

that makes me think when we get this new feature REP in the 1st place, it should also be able to describe how to graduate from experimental? and i believe that would be really feature dependent.

If we cannot see the clear graduation path or process, maybe we should not take it under experimental at all.

Another option is to reconsider how experimental is used altogether. Generally speaking, experimental features are typically not considered ready for production use, so it would seem that a used of an experimental feature should have to do something explicit to enable that feature. Many times, this involves setting a feature flag at build time or at runtime.

While the EventsExecutor is not a default executor and has to be included via the experimental include path, the intra-process optimizations are a little more convoluted. They do have to be specifically enabled, but it’s not clear that it’s an experimental feature from the docs or the code where it is enabled.

Thinking on this, here are a few options that might be worth considering:

  1. Eliminate the experimental namespace and define a compile-time option for enabling experimental features.
    • This would require manual compilation of rclcpp in order to use experimental features, which seems prudent if they truly aren’t considered ready for production use.
    • Promotion from experimental would simply require removing the pre-processor blocking and the user code would remain unmodified (e.g. no having to change include paths).
  2. Incorporate a “feature-freeze” step in the release process.
    • The rolling branch is already considered unstable, and new releases are created yearly. Perhaps non-trivial new features shouldn’t be included in the last several months of the release cycle in order to allow for stabilization prior to release.
    • If option #1 is used, then new features could be incorporated late in the release process, but locked behind the experimental compile-time flag.

Rather than a time-based graduation criteria for new features, I think a more flexible approach is to have the ROS PMC decide. Possibly this could involve a PMC Member or Committer (depending on the impact of the feature) acting as the shepherd of that feature, and so being the person to bring its graduation to “stable” (for want of a better label) to the PMC for a decision.

That’s very true, and definitely indicates that we need guidelines for this process.

I like this proposal, as I think that most people interested in using the cutting-edge features are also expert enough to build parts of the ROS 2 stack from sources.
However, would the ROS 2 CI run with experimental enabled? And what about the nightly builds?
Could this approach make testing/validation more complex?

I think that the guideline we come up with should be as inclusive as possible towards the community and telling someone that opens a PR: “Hey, you will need to wait X weeks/months before proceeding”, may discourage them to get the PR to the finish line or continue contributing (this combined with the fact that we already have a lot of PRs that remain open for months because it takes a long time to get a review and then the contributor has already moved elsewhere).

I agree that it won’t be one size fits all, and we will need flexibility.
However, I also think that the time-component is important.

IMO part of the purpose of the experimental namespace is to allow discovering issues/bugs/requirements that were not evident during the PR review process.

Maybe the time-component could be the default and the PMC can decide to handle some features differently?

Being expert enough to compile code from source does not mean this would be practical. Exactly for this reason. CI build times would go through the roof. Quickly verifying if an experimental feature actually works in production would be really hard.

Especially the latter: quickly testing experimental/beta features on a ‘production’ machine is quite valuable. We’ve found issues there which do not pop up in a simple talker/listener setup.
However creating significant changes in our deployment pipeline, just to test something out of curiosity, is a large hurdle.

Then I’d favor an ::experimental:: line here and there which is easy to track.

I don’t think involving the PMC here is a good approach. If this would be the default, the PMC would become the choke point for any feature moving out of experimental. Involving the PMC should be a last resort of thing, if there is disagreement about the move.

In general, I see a time based rule as a good idea, but I would redefine it:

If some feature has been more than X in experimental and no work has been done to it

  • Promote it to stable
  • Remove it altogether

This would be just to catch stale stuff.
Additionally to this, there should be a ‘fast track’. If everbody agrees something is ready and stable, it should be promotable at any time, its basically like a feature pull request…

I would leave the decision to promote stuff out of experimental to the package maintainers, as they have the best knowledge of the ramification that may be caused by it.

It looks like there’s a few discussion items here:

  • What does experimental mean, why do we want it and when should it be used?
  • How is experimental implemented?
  • What’s the promotion process?

There’s already a lot of good ideas/thoughts in the thread, but I think that it would be better to proceed with order and try to answer the first question.
No need for a fully formal and exhaustive definition here and now, but at least to agree on the idea.

Right now the experimental concept applies only to the rclcpp repository; IMO we could try first to get something functional for this specific use-case and later discuss whether it could be applied across all the ROS 2 client libraries (and make it a REP).
@clalancette what do you think?

My interpretation of this first question:

The experimental concept is not an hard definition, but it’s more of a guideline that the maintainers / PMC of the repo can decide to follow when necessary.

experimental contains user-controllable features that are added to the client library, but are not in a stable state yet.
They may present known and unknown limitations, be subject to less rigorous testing and be subject to API changes within a release.

The experimental concept has several purposes:

  • Signal the users that a feature is not fully mature yet, so special consideration should be taken when using it. We should try to maintain an up-to-date README/comment indicating all the known limitations of an experimental feature and other useful info.
  • Allow developers to move faster: both in the first stage, avoiding keeping something in a PR for years and allowing to merge even if we may expect API changes or further testing, as well as in the later stages, as API-breaking changes will be easier to push.

Maintainers / PMC take the decision on what should be marked as experimental, reviewing PRs that introduce new user-facing classes and utilities, APIs in existing classes or runtime configurable changes to the internal behavior (e.g. intra-process).
They can take a decision for example taking into account the complexity of a feature, its history, testing, etc.

Marking a feature as experimental should be accompanied by a clear motivation and tentative roadmap for promotion.

1 Like