Tagging ros/rosdistro.git

Would it be possible to tag the commit in https://github.com/ros/rosdistro from which new ROS distro releases are made? This way, we can eventually have superflore read from it instead of ROS_DISTRO-cache.yaml and thereby have the versions of the ROS packages in meta-ros exactly match those listed in the release announcements (e.g., Dashing Update 1: Patch release and package sync for ROS 2 Dashing Diademata).

It’s possible to do, but it is another step in the process of doing a release. It’s also somewhat removed from where the actual packages are, which is why I sort of think ROS_DISTRO-cache.yaml is a better place to get the data from. What’s the actual problem you’ve run into with using ROS_DISTRO-cache.yaml?

@tfoote @nuclearsandwich any thoughts here?

Edit accidentally posted before finished.
Edit finished my post.

I’ve wanted to do this for a while and the main thing that has prevented me from doing so is that it’s not on my release checklist so by the time a sync comes around it slips my mind. Second to that is the fact that I’ve never formally brought it up, just casually. So thanks for taking the time to post.

I think it would be reasonably good to tag the sync state of the ros/rosdistro repo like we have been doing for the ros2/ros2 repository containing ros2.repos: release-$ROSDISTRO-${YYYY}${MM}${DD}.
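To make the naming scheme concrete, here is a minimal sketch of how such a tag name could be built. The function name and the example date are made up for illustration; only the release-$ROSDISTRO-${YYYY}${MM}${DD} pattern comes from the convention above.

```python
from datetime import date


def sync_tag_name(rosdistro: str, sync_date: date) -> str:
    """Build a tag name of the form release-$ROSDISTRO-${YYYY}${MM}${DD}."""
    # Hypothetical helper: only the naming pattern is taken from the thread.
    return f"release-{rosdistro}-{sync_date:%Y%m%d}"


# Example with an arbitrary, illustrative date.
print(sync_tag_name("dashing", date(2019, 7, 15)))
```

The resulting string could then be passed to `git tag` in a checkout of ros/rosdistro just before the sync job is started.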

This would be the responsibility of the ROS release manager (colloquially, the “ROSBoss”) for a given release and would occur just before starting the “sync to main” job. Some care may be needed when concurrent changes are being made to different rosdistros so that the release state doesn’t become muddled, but no guarantees should be made about the state of other rosdistros in a tag made for one specific rosdistro.

Another minor advantage to tagging ros/rosdistro is that this also provides a reasonable snapshot of the rosdep db at the time of the release. This snapshot is not reliable for reproducing a rosdistro as bloom releases will be run on earlier iterations of the db and can even be run with non-canonical rosdep sources. But it does provide downstream packagers with a reasonable set of waypoints for forensic investigation of rosdep if a superflore run fails due to rosdep resolvers.

We don’t have separate release copies of the rosdistro cache either. So the moment after a sync is performed, if new releases are then merged into the rosdistro repo the rosdistro caches are potentially also divergent from the rosdistro as just synced.

The rosdistro cache also includes package.xml data for packages in the rosdistro whereas the distribution.yaml does not. Does superflore make use of those cached package.xmls or does it always pull from release repositories?

We certainly could set this up. It’s extra information, but I’d like to understand a little bit more about your use case to know if I think this is what I’d recommend.

In our current models the majority of users are using the debian packages, and as such the syncs end up being effectively synchronized batch releases to the community. However, the mechanics of it are actually mostly artifacts of the debian build pipeline. The other thing is that the syncs don’t actually represent the state of the rosdistro at the time of the sync. They represent the aggregate results of the buildfarm’s attempted builds. For example, if Package Foo has been updated but the updated package fails to build, the sync will have the old package Foo until such time as a package Bar that Foo depends on invalidates it; then Foo will disappear. The current release announcements actually come from the debian repository state and not from the rosdistro, so they reflect the above sort of artifacts and consequently don’t achieve your goal of having the same state.

Overall the reason that we have the syncs is that we don’t have a good way to test things prior to a release. We have the prerelease tool, which does catch a lot. But with newer source based builds coming online such as Gentoo and OpenEmbedded they could relatively easily consider running a test build in the pull request for a new release. We’ve also wanted to turn on more comprehensive automated prerelease tests in the rosdistro PRs, but we haven’t had the time or resources to do that and we’ve relied on the maintainers to run the appropriate prerelease tests.

I think it’s a great idea to work towards having consistency across the different platforms, but I’d rather see us work toward having the rosdistro become more authoritative and deemphasize the results of the syncs. If we can improve our QA process for releases the syncs could even be deprecated.


Sorry for the delay in responding – I’ve been on vacation.

Does superflore make use of those cached package.xmls …?

Yes.

The other thing is that the syncs don’t actually represent the state of the rosdistro at the time of the sync.

By “syncs”, you mean what generates ROS_DISTRO-cache.yaml, correct? Is there some commit of rosdistro that exactly matches what’s in a release announcement, even if it’s not HEAD at the time that it’s made?

The current release announcements actually come from the debian repository state …

Could a ROS_DISTRO-cache.yaml be generated from the debian repository state at the time of the release announcement and preserved? If so, then we could point superflore at it and thereby generate recipes for exactly the same package versions.

Could a ROS_DISTRO-cache.yaml be generated from the debian repository state at the time of the release announcement and preserved?

@tfoote Could this be done for the recent releases of crystal, dashing and melodic? Or are there copies of ROS_DISTRO-cache.yaml squirreled away that match the release announcements?

We currently do not have the ability to generate a rosdistro cache from the state of the debian repository. It would require a script to scrape the apt repositories for the versions, then take the state of the rosdistro file and munge it with all the versions detected in the apt repo.
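To illustrate roughly what such a script would have to do, here is a sketch under several assumptions, not an existing tool. It assumes the apt data is available as the text of a Debian Packages index, that binary packages are named ros-<distro>-<name> with underscores turned into hyphens, and that the Debian version starts with the rosdistro release version (for example 0.7.2-1 followed by a distro and timestamp suffix). The rosdistro side is simplified to a plain mapping of package name to version, and all function names are hypothetical.

```python
import re


def parse_packages_index(text):
    """Return {debian_package_name: version} from apt 'Packages' index text."""
    versions = {}
    for stanza in text.strip().split("\n\n"):
        fields = {}
        for line in stanza.splitlines():
            # Skip continuation lines, which start with whitespace.
            if line and not line[0].isspace() and ":" in line:
                key, _, value = line.partition(":")
                fields[key.strip()] = value.strip()
        if "Package" in fields and "Version" in fields:
            versions[fields["Package"]] = fields["Version"]
    return versions


def debian_name(rosdistro, ros_package):
    # Assumed naming convention for the binary packages.
    return f"ros-{rosdistro}-{ros_package.replace('_', '-')}"


def release_version(debian_version):
    # Assumed layout: <upstream>-<increment><distro>.<timestamp>...
    match = re.match(r"(\d+(?:\.\d+)*-\d+)", debian_version)
    return match.group(1) if match else None


def munge(rosdistro, declared, apt_versions):
    """Replace declared versions with the ones actually found in apt.

    Packages that are missing from the apt repo are dropped.
    """
    merged = {}
    for pkg in declared:
        found = apt_versions.get(debian_name(rosdistro, pkg))
        synced = release_version(found) if found else None
        if synced is not None:
            merged[pkg] = synced
    return merged


SAMPLE_INDEX = """\
Package: ros-dashing-foo
Version: 0.7.2-1bionic.20190601.000000

Package: ros-dashing-bar-msgs
Version: 1.0.0-2bionic.20190601.000000
"""

declared = {"foo": "0.7.3-1", "bar_msgs": "1.0.0-2", "baz": "0.1.0-1"}
print(munge("dashing", declared, parse_packages_index(SAMPLE_INDEX)))
```

In this made-up sample, foo falls back to the older version that actually built and baz drops out entirely, which is the kind of artifact described earlier in the thread.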

As in my above comment, I believe that the actual result is not a meaningful state. I would suggest it might be better to simply snapshot the rosdistro at the time of the release. Until we start tagging, this can be found by just using the closest commit to the sync. The state of the apt repos should be a very close approximation of the rosdistro except for things that aren’t building.
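One way to approximate "the closest commit to the sync" is to ask git for the most recent commit on the branch at or before the sync time. The sketch below assumes a local clone of ros/rosdistro and that the sync time is known (for example from the release announcement); the function names and the example timestamp are hypothetical.

```python
import subprocess
from datetime import datetime, timezone


def rev_list_command(sync_time, branch="master"):
    """Build a git command that prints the last commit at or before sync_time."""
    return [
        "git", "rev-list", "-n", "1",
        f"--before={sync_time.isoformat()}",
        branch,
    ]


def commit_at_sync(repo_dir, sync_time, branch="master"):
    """Run the command in a local clone and return the commit hash."""
    result = subprocess.run(
        rev_list_command(sync_time, branch),
        cwd=repo_dir, check=True, capture_output=True, text=True,
    )
    return result.stdout.strip()


# Illustrative timestamp only; use the real sync time in practice.
example_time = datetime(2019, 7, 15, 12, 0, tzinfo=timezone.utc)
print(" ".join(rev_list_command(example_time)))
```

Calling commit_at_sync("/path/to/rosdistro", example_time) in a real clone would return the hash to check out. Note that this picks the last commit before the timestamp, which is only an approximation of the state the sync was built from.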

Related, we don’t store any history of the cache. It’s literally a cache, designed to optimize querying of the rosdistro content. Everything inside of it is just a cache of existing content. However it’s relatively costly to collect that information from the federated package ecosystem compared to how often you want to query it which is why we host the cache publicly and have the tools accelerate their queries using the cache.

If you have any checkout of the rosdistro you can generate the cache for that exact configuration using the command rosdistro_build_cache. That’s all we do on the buildfarm with the HEAD of master on ros/rosdistro and then push it to a public host.
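As a rough sketch of scripting that step against a local checkout: the argument form below (an index URL followed by a distro name) is my assumption about the rosdistro_build_cache command line, so please check it against rosdistro_build_cache --help before relying on it. The helper name and paths are hypothetical, and the index.yaml in the checkout may need to point at local distribution files for the result to reflect that checkout.

```python
from pathlib import Path


def build_cache_command(checkout_dir, dist_name):
    """Build a rosdistro_build_cache invocation for a local rosdistro checkout.

    Assumption: the tool takes the index location as a URL, then distro names.
    """
    index_url = (Path(checkout_dir) / "index.yaml").resolve().as_uri()
    return ["rosdistro_build_cache", index_url, dist_name]


# Hypothetical checkout location.
print(" ".join(build_cache_command("/tmp/rosdistro", "dashing")))
```

The returned list can be handed to subprocess.run once the arguments have been confirmed.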

Somewhat related: this is actually what we do in rosinstall_generator_time_machine, where we use it to generate .rosinstall files “from the past” to recreate versions of ROS packages as they were at a certain point in time (or actually: close to a certain point in time).

Exactly because of this we do have a repository with a number of *-cache.yamls (rosin-project/rgtm_rosdistro_caches), which will either be used directly or serve as a starting point to create a new cache.
