For the past several months, the example buildfarm_deployment_config has pinned the Jenkins version to keep versions that comply with JEP-200 from being installed, due to issues with the ros_buildfarm scripts and the plugin versions they use. (First reported by @gavanderhoorn in https://github.com/ros-infrastructure/buildfarm_deployment/issues/193.)
I’ve been working on upgrading the buildfarm deployment and ros_buildfarm scripts to work on post-JEP-200 deployments and am happy to report that we’re now using the latest Jenkins LTS on both build.ros.org and build.ros2.org.
That work is being prepared for merging across three pull requests:
- https://github.com/ros-infrastructure/buildfarm_deployment/pull/207
- https://github.com/ros-infrastructure/buildfarm_deployment_config/pull/37
- https://github.com/ros-infrastructure/ros_buildfarm/pull/587
To allow time for community members using the master branch of buildfarm_deployment to make the transition, we won’t be merging these changes right away. The important dates are here:
- 2018-12-17 buildfarm_deployment `master` branch will require Jenkins >= 2.138.3 and updated plugins.
- 2019-01-17 buildfarm_deployment transitional branch `jenkins-lts-upgrade` will be removed from buildfarm_deployment.
For those who aren’t able to schedule the upgrade before 2018-12-17, the pre-jep-200 branch is a snapshot of the current master which won’t be receiving further updates. Moving your existing deployments to this branch will keep you in the current state as long as you need.
Additionally, I’ve written the guide below based on my experience updating build.ros.org and build.ros2.org, with review and testing from some intrepid community members. I’ve done my best to communicate everything I encountered during the upgrade process but I can’t make any guarantees that the guide below will cover every contingency. If you do run into trouble, feel free to open an issue or ask a question with the buildfarm_deployment tag on https://answers.ros.org.
ROS Buildfarm Deployment JEP-200 Upgrade Guide
The last LTS version prior to JEP-200 was 2.89.4 which has been the pinned version for new deployments since https://github.com/ros-infrastructure/buildfarm_deployment_config/pull/31.
The current Jenkins LTS as of this writing is 2.138.3.
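To confirm which side of the required version a host is on, a small comparison helper can be useful. This is a minimal sketch and not part of buildfarm_deployment: `version_at_least` is a hypothetical name, and it assumes plain dotted numeric versions such as 2.89.4 or 2.138.3. It compares field by field as numbers, since a plain string comparison would wrongly rank 2.89.4 above 2.138.3.

```shell
# version_at_least CURRENT REQUIRED
# Succeeds (exit 0) when CURRENT >= REQUIRED, comparing each dotted
# field numerically. Missing trailing fields are treated as 0.
# Hypothetical helper; assumes purely numeric dotted versions.
version_at_least() {
    awk -v cur="$1" -v req="$2" 'BEGIN {
        n = split(cur, c, ".")
        m = split(req, r, ".")
        max = (n > m) ? n : m
        for (i = 1; i <= max; i++) {
            a = (i <= n) ? c[i] + 0 : 0
            b = (i <= m) ? r[i] + 0 : 0
            if (a > b) exit 0
            if (a < b) exit 1
        }
        exit 0
    }'
}
```

On a Debian or Ubuntu host the installed package version can be read with `dpkg-query -W -f='${Version}' jenkins` and passed as the first argument, for example `version_at_least "$(dpkg-query -W -f='${Version}' jenkins)" 2.138.3 && echo ok`.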
Overview
Due to the way the buildfarm_deployment is put together, it is not recommended to re-run puppet or the reconfigure.bash script on the Jenkins master after initial configuration (see #160).
Operators wishing to update existing deployments will need to take manual steps on their Jenkins hosts to migrate successfully to the latest version of ros_buildfarm (forthcoming).
Notable changes
User-facing changes:
Aside from some minor changes to the Jenkins sign in page and other visual updates, most users of ROS Buildfarm deployments will not notice any significant differences.
The minor issue #166 has been resolved with updates to the GitHub API and GitHub Pull Request Builder plugins.
Users with Jenkins API tokens may need to recreate them after coordinating with operators.
Operator-facing changes
JEP-200 mandated that a large number of plugins used by the buildfarm be updated, so I took the opportunity to update all of them and test comprehensively rather than trying to work out only those updates that are necessary.
Jenkins upstream recommends revoking existing “Legacy” API tokens and replacing them with the new token system.
Legacy tokens are not revoked on upgrade, and there’s a dedicated UI for reviewing and revoking tokens as you get the opportunity to reprovision them. See https://jenkins.io/blog/2018/07/02/new-api-token-system/ for details.
ROS Buildfarm aspects which can make use of API tokens:
- Swarm agent client authentication
  If your organization’s clone of the buildfarm_deployment_config repository uses GitHub for authentication, switching to use of Jenkins API tokens from GitHub API tokens will reduce the number of GitHub API calls necessary to manage authentication to your buildfarm deployment.
  The token can be used for this config value.
- `~/.buildfarm/jenkins.ini` may contain a Jenkins token rather than a password
  Especially if your buildfarm deployment uses GitHub for authentication, it’s recommended that operators create and use Jenkins API tokens rather than GitHub API tokens for authenticating to your buildfarm when running ros_buildfarm scripts.
  A GitHub API token requires your buildfarm to authenticate the token with GitHub on each request, which can exhaust your Jenkins instance’s GitHub API rate limit for large deployments.
  Using a Jenkins API token to avoid a GitHub API request for each authentication is also anecdotally faster, resulting in faster reconfigurations.
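As an illustration of the second item, here is a minimal sketch of writing a `jenkins.ini` that carries an API token in place of a password. It assumes the file uses one section per Jenkins URL with `username` and `password` keys; please check that against the ros_buildfarm documentation for your version. `write_jenkins_ini`, the example URL, the username and the token value are all hypothetical.

```shell
# write_jenkins_ini FILE URL USERNAME TOKEN
# Writes a single-section ini file, replacing FILE if it already exists.
# Assumed layout (verify for your ros_buildfarm version):
#   [<jenkins url>]
#   username=<user>
#   password=<API token or password>
write_jenkins_ini() {
    ini_file=$1 jenkins_url=$2 username=$3 api_token=$4
    mkdir -p "$(dirname "$ini_file")"
    # Create the file with a restrictive umask since it holds a secret.
    ( umask 077
      printf '[%s]\nusername=%s\npassword=%s\n' \
          "$jenkins_url" "$username" "$api_token" > "$ini_file" )
    # Also tighten permissions if the file already existed.
    chmod 600 "$ini_file"
}
```

A call might look like `write_jenkins_ini ~/.buildfarm/jenkins.ini https://jenkins.example.org admin <token>`. Since the sketch overwrites the whole file, an existing file with several sections is better edited by hand.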
Actions Required
Before you start
Make backups and test them!
Unless you’ve made a reliable backup before you start I cannot recommend proceeding with this guide.
Backups aren’t real until you’ve validated the restore procedure.
Review this guide in full.
This guide was written after upgrading build.ros2.org, revised after upgrading build.ros.org, and should be used as a starting point.
There may be other necessary actions for your setup that this guide does not capture.
System requirements: These upgrades are only recommended if you’re running a relatively recent Ubuntu 16.04 based buildfarm that was set up after the buildfarm_deployment migration to xenial or has undergone the procedure described in ROS Buildfarm October 2017 Guide to new changes.
Updating buildfarm_deployment_config
Your buildfarm deployment config is likely a duplicate of the `master` branch from https://github.com/ros-infrastructure/buildfarm_deployment_config. This repository, like https://github.com/ros-infrastructure/buildfarm_deployment, has a master branch which roughly tracks the production deployments build.ros.org and build.ros2.org, and updates to the buildfarm_deployment puppet logic may not be compatible with your existing configuration.
Some changes to deployment behavior are also implemented directly via configuration directive.
To review your buildfarm_deployment_config against the changes made upstream since you copied it, I’d suggest creating a local clone and using git-diff to compare the state of your config against the upstream one.
Upstream configuration url
Before merge of https://github.com/ros-infrastructure/
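The git-diff comparison suggested above could be scripted roughly as follows. This is a sketch under assumptions: `diff_against_upstream` is a hypothetical helper, the remote name `upstream` is arbitrary, and the URL and branch to compare against are left as arguments for you to fill in.

```shell
# diff_against_upstream REPO_DIR UPSTREAM_URL [BRANCH]
# Fetch BRANCH (default: master) from UPSTREAM_URL into the local clone
# at REPO_DIR and print the diff from your checked-out config to it.
# In the output, "+" lines exist upstream only, "-" lines in yours only.
diff_against_upstream() {
    repo_dir=$1 upstream_url=$2 upstream_branch=${3:-master}
    # Register the remote, or repoint it if "upstream" already exists.
    git -C "$repo_dir" remote add upstream "$upstream_url" 2>/dev/null ||
        git -C "$repo_dir" remote set-url upstream "$upstream_url"
    git -C "$repo_dir" fetch --quiet upstream "$upstream_branch" &&
        git -C "$repo_dir" --no-pager diff HEAD "upstream/$upstream_branch"
}
```

For example, `diff_against_upstream ~/my_buildfarm_config https://github.com/ros-infrastructure/buildfarm_deployment_config master`, where `~/my_buildfarm_config` stands in for your own clone. Only committed state is compared, so commit or stash local edits first.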
Updating Jenkins
1. Put jenkins into shutdown mode and let any outstanding jobs complete.
2. Take jenkins offline with `systemctl stop jenkins`. If there are many jobs queued, Jenkins may take a while to shut down. Waiting until the jenkins java process stops is recommended.
3. Stop the jenkins agent on master with `systemctl stop jenkins-slave`.
4. (Optional) Run `apt-get update && apt-get upgrade`. Review the list of upgraded packages.
5. Remove any holds placed on jenkins and run `apt-get update && apt-get install jenkins`.
6. If prompted to update the `/etc/default/jenkins` file, you may either reject the changes (what I did) or review them to be integrated with your setup.
7. Install updated plugins. I’ve prepared a script which will use bash and curl to install plugins by dropping them in the correct location. You could also modify this script to use the jenkins-cli.jar, or just use it as a list of plugin updates to perform with an online jenkins instance, although I experienced filesystem permissions issues when doing so and the resulting state left my Jenkins un-bootable. However, if you check filesystem permissions beforehand to make sure all plugins are owned by `jenkins` rather than `root`, it may work for you.
8. Remove any .pinned files from your jenkins plugin directory with `rm /var/lib/jenkins/plugins/*.pinned`. You may save the list of pinned files if you wish to restore them later.
9. Start jenkins and watch the logs for any plugins failing to initialize.
10. From the Jenkins UI, review any administrator warnings.
11. (Optional, but recommended) Disable JNLPv2 and JNLPv3 on the Jenkins Global Security Configuration screen.
12. Sign into Jenkins as the user your agent hosts authenticate as and generate an API token for use as the `Buildfarm Swarm Agent`. Copy this token into your buildfarm_deployment_config as the `jenkins::slave::ui_pass` value.
13. SSH Credentials can no longer be read from a file on disk due to this vulnerability. Credentials using the key from disk get migrated to “Directly entered” keys without the key being read from disk. Very early or modified deployments of the buildfarm may have been using an ssh key read from a file. This can be fixed with the following steps:
    - Copy the text of your ssh key from the buildfarm_deployment_config master.yaml field `jenkins::private_ssh_key`. Take care to strip the leading whitespace if you copy it from the YAML block text format.
    - Open the Credentials UI and select the global credential matching your ssh username (jenkins-agent by default). Select “Update”.
    - Paste the ssh key text into the text area and select “Save”. No other field should need to be changed.
14. Reconfigure Jenkins using a compatible branch of ros_buildfarm.
    As of 2018-11-16 the Jenkins LTS work for ros_buildfarm is not yet merged into master or released. The pull request https://github.com/ros-infrastructure/ros_buildfarm/pull/587 has a branch with some workarounds for plugin issues and template updates for plugin versions in configuration structures.
15. Update the Script Approval whitelist. If you have a highly customized ROS buildfarm deployment, your scriptApproval.xml whitelist of allowed scripts and Groovy signatures might have diverged. To proceed you have three options:
    A. Let jobs fail due to sandbox violations and whitelist them as needed.
    B. Manually apply the additions to the scriptApproval.xml file in buildfarm_deployment to your deployment’s `/var/lib/jenkins/scriptApproval.xml` and restart Jenkins.
    C. Copy the scriptApproval.xml file to your instance, replacing your existing `/var/lib/jenkins/scriptApproval.xml`, and restart Jenkins.
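Two of the plugin directory steps above, checking ownership before installing plugins and saving the list of .pinned files before removing them, can be sketched as small shell helpers. These are illustrative only: both function names are hypothetical, and the directory, owner and list file are passed in as arguments so that you can try them on a copy first.

```shell
# list_plugins_not_owned_by PLUGIN_DIR OWNER
# Print every entry under PLUGIN_DIR that is NOT owned by OWNER
# (a user name or numeric uid). Empty output means ownership is consistent.
list_plugins_not_owned_by() {
    find "$1" ! -user "$2"
}

# remove_pinned_markers PLUGIN_DIR LIST_FILE
# Record the path of each *.pinned file in LIST_FILE (overwriting it),
# then delete the file, so the list survives for later reference.
remove_pinned_markers() {
    plugin_dir=$1 list_file=$2
    : > "$list_file"
    for marker in "$plugin_dir"/*.pinned; do
        [ -e "$marker" ] || continue   # the glob matched nothing
        printf '%s\n' "$marker" >> "$list_file"
        rm -- "$marker"
    done
}
```

With Jenkins stopped, this might be run as `list_plugins_not_owned_by /var/lib/jenkins/plugins jenkins` followed by `remove_pinned_markers /var/lib/jenkins/plugins /root/pinned-plugins.txt`, where the list file path is just an example.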
Updating Agent on Master
1. Download the swarm client jar version 3.14 from this link and place it in `/home/jenkins-agent` on the master host. Make sure it’s owned by the jenkins-agent user.
2. Open `/etc/default/jenkins-slave` and make the following changes:
   - Update the line for `JENKINS_SLAVE_JAR` to `JENKINS_SLAVE_JAR="${JENKINS_SLAVE_HOME}/swarm-client-3.14.jar"`
   - Update the line for `JENKINS_PASSWORD` to match the API token used for step 12 when updating the Jenkins master.
3. Remove the old swarm client jar with `rm /home/jenkins-agent/swarm-client-2.0-jar-with-dependencies.jar`.
4. Start the Jenkins agent with `systemctl start jenkins-slave` and verify that it is able to connect to your master host.
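The two edits to the agent defaults file can also be made non-interactively. The sketch below is one possible way, assuming both variables are assigned on lines that begin with their names; `update_agent_defaults` is a hypothetical helper, and quoting the token value is my choice rather than something the deployment requires.

```shell
# update_agent_defaults FILE JAR_NAME API_TOKEN
# Rewrite the JENKINS_SLAVE_JAR and JENKINS_PASSWORD assignments in FILE,
# leaving every other line untouched. Assumes each assignment starts at
# the beginning of its line and that the token is plain alphanumeric.
update_agent_defaults() {
    defaults_file=$1 jar_name=$2 api_token=$3
    tmp_file=$(mktemp)
    awk -v jar="$jar_name" -v token="$api_token" '
        /^JENKINS_SLAVE_JAR=/ {
            # Keep ${JENKINS_SLAVE_HOME} literal; the init script expands it.
            print "JENKINS_SLAVE_JAR=\"${JENKINS_SLAVE_HOME}/" jar "\""
            next
        }
        /^JENKINS_PASSWORD=/ {
            print "JENKINS_PASSWORD=\"" token "\""
            next
        }
        { print }
    ' "$defaults_file" > "$tmp_file" &&
        cat "$tmp_file" > "$defaults_file"   # keeps owner and mode of FILE
    rm -f "$tmp_file"
}
```

For instance, `update_agent_defaults /etc/default/jenkins-slave swarm-client-3.14.jar <token>`. Keeping a copy of the original file before running it is a cheap safeguard.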
Updating Repo and Agent hosts
- Pull the latest version of your buildfarm_deployment_config and run `./reconfigure.bash` from the repository root.
- Verify that both the `building_repository-*` and `agent-*` nodes have successfully reconnected.
If you follow this guide, please report back your successes or any problems you encounter.