
Post-mortem: ROS GPG Key Expiration Incident

Authors: @Katherine_Scott @nuclearsandwich
Reviewed by: @cottsay @tfoote

Incident team: @cottsay @gbiggs @Katherine_Scott @nuclearsandwich @tfoote

Last week the published GPG public key used for the ROS and ROS 2 repositories expired, causing repository verification errors when updating or installing packages from the ROS apt and rpm repositories. We created the current signing keypair as part of the total redeployment of the ROS build farm in 2019, a precautionary measure following unauthorized access to the ROS build farm Jenkins hosts. In our haste to provision new keys in response to that event, we published a public key with the default expiration date of two years in the future. The expiration date of the new key was noticed and a question about it was raised internally; the team resolved to revisit our key use and rotation practices. However, we never did revisit them, and so the expiration date of the published public key came and went. While the expiration of the key caused inconvenience and halted some ROS services, the security of ROS services and the integrity of the ROS repositories were not affected.

The public key expired on 2021-05-29T00:00:00Z, and shortly afterward we began receiving the first reports of GPG verification errors, both from the community and via build farm alert emails.
Members of the ROS team at Open Robotics returned from their Friday evening activities and, upon seeing the Discourse and build farm traffic, went back to work on a resolution. We determined that publishing a new public key with an updated expiration date was the minimum necessary fix and began that process. We also considered rotating the private key but decided not to change more than strictly necessary. In tandem, we itemized the locations across ROS project infrastructure where the public key text would need to be updated and prepared to update them.
We set the expiration date for the current key to 2025-06-01, which means we expect the current signing key to remain valid for the entire support lifespan of Ubuntu 20.04, the primary platform for all of our currently active releases, including ROS Noetic, ROS 2 Foxy, and ROS 2 Galactic. Updating the public key alone created unforeseen challenges in some downstream processes, including our own build farm deployment, which did not automatically import the changed public key after it was updated on disk, and our Docker image building process.
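
For readers who want to watch for this kind of expiration on their own systems, here is a minimal sketch of a check that flags keys approaching their expiration date. It is not something the ROS team has described using. It assumes the colon-delimited listing printed by `gpg --list-keys --with-colons`, where field 5 is the key ID and field 7 is the expiration time, and it assumes that time is given in seconds since the epoch. The sample record, the function names, and the 90-day warning window are all illustrative choices.

```python
from datetime import datetime, timedelta, timezone

# Made-up sample record in gpg's colon-delimited listing format.
# The key ID and creation time are placeholders; only the expiration
# (field 7) mirrors the incident: 1622246400 is 2021-05-29T00:00:00Z.
SAMPLE_LISTING = (
    "pub:-:4096:1:0123456789ABCDEF:1559088000:1622246400::-:::scESC::::::23::0:"
)


def key_expirations(colon_listing):
    """Return [(key_id, expiry)] for pub/sub records; expiry is None if unset."""
    results = []
    for line in colon_listing.splitlines():
        fields = line.split(":")
        if len(fields) < 7 or fields[0] not in ("pub", "sub"):
            continue
        raw = fields[6]
        if raw and not raw.isdigit():
            # Assumption: expiry is epoch seconds. Skip anything else
            # rather than guess at its format.
            continue
        expiry = datetime.fromtimestamp(int(raw), tz=timezone.utc) if raw else None
        results.append((fields[4], expiry))
    return results


def expiring_soon(colon_listing, now, warn_window=timedelta(days=90)):
    """Return keys that expire within `warn_window` of `now` or have already expired."""
    return [
        (key_id, expiry)
        for key_id, expiry in key_expirations(colon_listing)
        if expiry is not None and expiry - now <= warn_window
    ]
```

Calling `expiring_soon` from a scheduled job with the real listing and `datetime.now(timezone.utc)` would surface a key like the one above roughly three months before it lapses. Keys without an expiration date are ignored.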

As we begin preparations for ROS 2 Humble on Ubuntu 22.04 we will be updating our repository and key management practices in the following ways:

  1. Migrate repository and key configuration to apt and rpm packages, distributed both as standalone downloads and via our existing apt and rpm repositories. This will allow us to deliver updates to the repository keys and related configuration when your system upgrades other system and ROS packages. It has the added benefit of simplifying the process of adding ROS repositories to a system.

  2. Create a physically secure root key and generate subkeys for specific deployments such as the ROS snapshots, ROS bootstrap, ROS build farm, and Gazebo repositories. This will start building a more robust network of trust around our GPG keys and let us rotate subkeys rather than frequently invalidate public keys.

  3. Devise a key rotation schedule for the deployment subkeys based on either calendar time or target platform in order to reduce the blast radius of a possible compromise of one of the deployed subkeys.

  4. Update our internal incident response procedures based on employee feedback from our internal debriefing of this incident.
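
As a rough illustration of the calendar-based option in item 3, the sketch below lays out validity windows for successive subkeys so that each new subkey becomes valid before its predecessor expires. The function name, the one-year period, and the 30-day overlap are hypothetical values chosen for the example. They are not a schedule the ROS team has announced.

```python
from datetime import date, timedelta


def rotation_windows(start, period_days, overlap_days, count):
    """Return `count` (valid_from, valid_until) pairs for successive subkeys.

    A new subkey starts every `period_days`. Each one stays valid for
    `overlap_days` past the start of its successor, so clients have time
    to pick up the new key before the old one expires.
    """
    windows = []
    for i in range(count):
        valid_from = start + timedelta(days=i * period_days)
        valid_until = valid_from + timedelta(days=period_days + overlap_days)
        windows.append((valid_from, valid_until))
    return windows


# Hypothetical schedule: yearly rotation with a 30-day overlap.
schedule = rotation_windows(date(2022, 1, 1), period_days=365, overlap_days=30, count=3)
```

A shorter period limits how long a compromised subkey remains useful, at the cost of more frequent key distribution. The overlap is what keeps a rotation from producing the kind of verification errors described above.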

We hope these measures will prevent all future key expiration issues. We would like to thank all of the ROS users who reported the issue and have worked with us to address the downstream consequences, particularly @marguedas and @ruffsl for handling the updates to the Docker image builds.

With a project as large as ROS, there are many different venues for reporting problems. When something is not working for you, we encourage you to check ROS Answers for questions related to your issue or to post one yourself. You can always check the posted status of ROS infrastructure on the ROS Status page. If you think there is an issue with ROS infrastructure, you can report it by opening an issue on the osrf/infrastructure repository, which was created to field infrastructure issue reports.
