Read-through caching container registry

Docker Hub rate limits have been in effect since 2020-11-02T08:00:00Z and have been causing periodic issues on build.ros.org and build.ros2.org.

I am completing the deployment of a caching registry mirror for build.ros.org and build.ros2.org following this recipe. Notably, my first attempt to deploy this on each build farm’s repo host failed to resolve the rate limiting issue: the docker-registry version in Ubuntu Xenial appears to pass data through without correctly caching the actual image files. Our production mirror is running on a Focal host using the docker-registry package from Ubuntu without issue. I expect that using the official registry container image from Docker would also work, but I haven’t tested it.
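
For context, the core piece of that recipe is the pull-through cache (`proxy`) section of the registry configuration. The sketch below is an illustration under a few assumptions (the config path and storage directory of a typical Ubuntu docker-registry install, and port 5000), not a copy of our production config:

```yaml
# /etc/docker/registry/config.yml (typical path for the Ubuntu docker-registry package)
version: 0.1
storage:
  filesystem:
    rootdirectory: /var/lib/docker-registry   # where cached manifests and layers are stored
http:
  addr: :5000                                 # address the mirror listens on
proxy:
  # Run as a read-through cache of Docker Hub: anything not already cached
  # is fetched from this upstream and stored locally.
  remoteurl: https://registry-1.docker.io
```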

Pointing each host at the registry mirror requires modifying its Docker daemon configuration. There is a draft PR here: https://github.com/ros-infrastructure/buildfarm_deployment/pull/244 to make this a configurable option in buildfarm_deployment_config.
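
For anyone configuring a host by hand in the meantime, the daemon-side change is just a `registry-mirrors` entry in `/etc/docker/daemon.json`, followed by a restart of the Docker daemon. The hostname and port below are placeholders:

```json
{
  "registry-mirrors": ["https://registry-mirror.example.org:5000"]
}
```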

A caching container registry mirror is going to be added to the new 20.04-based build farm deployment and will be enabled by default.

Does the read-through cache really help avoid running into the rate limit?
The limit applies to docker manifest pulls, not the actual image layer pulls…

As far as I understand the read-through caching registry, it will still pull the manifests to check if the image is still up-to-date and then cache the actual layers.

At least I could not find any option to control how often the manifests are read from the main Docker registry, which is what actually counts towards the rate limit.

Do you have any more information on how this is done in the registry?

The docker-registry doesn’t log its interactions with the upstream registry, but I expect that it’s doing HEAD requests rather than GET requests to check whether the manifests have changed, and per the documentation HEAD requests are not rate limited.

I wasn’t confident that this would be enough, but it withstood an attempt to force the rate limit by making 500 sequential docker pulls in a loop, and the hosts we’ve configured to use it have not hit the problem since. I won’t claim it’s guaranteed, but it has been working for us through the last 18 hours of heavy activity. (There are still a few failures on the ARM hosts of build.ros2.org, which use a custom deployment pipeline and haven’t been updated yet.)
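
For reference, the rate limit test was nothing more elaborate than a loop of sequential pulls; something along these lines (the image name here is just an example, not necessarily the one actually used):

```bash
#!/usr/bin/env bash
# Try to trip Docker Hub's rate limit by pulling through the mirror 500 times in a row.
for i in $(seq 1 500); do
    docker pull ubuntu:focal
done
```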