Dear ROS 2 build farmers,
I really like the infrastructure provided with the ROS 2 build farm, https://build.ros2.org, and think that it’s a great service to the community. However, as a package maintainer, I often struggle when builds fail. I learned my way around the build farm mainly via observation and trial and error, so I propose creating documentation so that package maintainers don’t have to learn it on their own. I suggest linking this documentation prominently on https://build.ros2.org and in the Jenkins mails. Otherwise, mails from the build farm may lead to frustration and may often just be ignored (my own experience).
I suggest that the documentation cover:
- When are jobs spawned?
  - Presumably when you bloom your packages? → link to bloom docs
- Which jobs are spawned?
  - What are their different purposes?
    - ‘My’ ROS 2 packages, consisting of a main package, an interface package, and an example package, built for 4 ROS 2 distros, yield 63(!) build jobs on the build farm [1]. Finding the particular build job that helps with debugging is often hard, especially since the job names are quite cryptic and their purpose is not immediately clear.
  - How do they differ from ros-tooling GitHub Actions?
    - While the dev jobs and PR jobs pull directly from the upstream repository, most of the other jobs seem to pull from the upstream branches of the release repository instead. This leads to situations where your local builds, your GitHub Actions, and the dev and PR jobs on the build farm all work, but some of the other jobs on the build farm still fail although they seemingly do the same thing.
- FAQ: What are commonly observed effects / common failures?
  - In my personal experience, improperly declared dependencies are the cause more than 90% of the time.
    - This could benefit from an explanation of why such errors do not show up in your local builds/tests, GitHub Actions, and dev jobs.
- Troubleshooting
  - How to debug? Is blooming + waiting for merge to rosdistro + getting feedback from the bin jobs the best way?
  - In my experience, typical build failure mails read "apt-src build ros-… failed. This is usually because of an error building the package". However, the root cause, usually cmake missing a build dependency, is not visible in the snippet provided with the email. Instead, it can be found by navigating to the particular job and looking at the full log.
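To illustrate the kind of helper a troubleshooting section could include, here is a minimal Python sketch that scans the text of a full job log for CMake errors, since the mail snippet usually hides them. It assumes that the relevant lines contain the literal string "CMake Error"; the function name `find_cmake_errors`, the `context` parameter, and the sample log excerpt are made up for illustration and are not part of the build farm tooling.

```python
def find_cmake_errors(log_text, context=4):
    """Return each line containing 'CMake Error' plus a few following lines.

    Assumption: the root cause appears in the log as a 'CMake Error' line,
    possibly prefixed by a timestamp, followed by explanatory lines.
    """
    lines = log_text.splitlines()
    hits = []
    for i, line in enumerate(lines):
        if "CMake Error" in line:
            # Keep the error line and up to `context` lines after it.
            hits.append("\n".join(lines[i:i + 1 + context]))
    return hits


if __name__ == "__main__":
    # Hypothetical log excerpt, invented for this example only.
    sample = (
        "-- Found ament_cmake\n"
        "CMake Error at CMakeLists.txt:12 (find_package):\n"
        '  By not providing "Findexample_dep.cmake" in CMAKE_MODULE_PATH this\n'
        "  project has asked CMake to find a package configuration file\n"
        '  provided by "example_dep", but CMake did not find one.\n'
        "-- Configuring incomplete, errors occurred!\n"
    )
    for hit in find_cmake_errors(sample):
        print(hit)
```

In practice one would paste or download the full console log of the failing job and pass its text to the function; a package named in the error is then a candidate for a missing dependency declaration.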
While I can’t provide the documentation on my own, because I rely mainly on observation and speculation, I am happy to get involved, e.g., by providing suggestions and feedback.
Cheers,
Arne