I just released a reinforcement learning framework for swarms of more than 100 robots. It uses a ROS 2 service for communication between the reinforcement learning algorithms and the simulation, so either side can be swapped out and the framework can easily be adapted to other needs.
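To give a rough idea of the request/response pattern, here is a minimal sketch in plain Python with no ROS 2 dependency. All names in it (`StepRequest`, `StepResponse`, `SwarmSimulation`, `handle_step`, `run_episode`) are hypothetical and chosen for illustration only; the framework's actual service definition and fields may look different. The simulation is a toy 1-D world where each robot is rewarded for being close to a goal position.

```python
import random
from dataclasses import dataclass
from typing import Callable, List


# Hypothetical message types; the real service definition may differ.
@dataclass
class StepRequest:
    actions: List[float]  # one displacement command per robot


@dataclass
class StepResponse:
    observations: List[float]  # one position per robot
    rewards: List[float]       # one reward per robot
    done: bool                 # True once the episode is over


class SwarmSimulation:
    """Stand-in for the simulation side of the service (toy 1-D world)."""

    def __init__(self, num_robots: int, goal: float = 0.0, max_steps: int = 50):
        self.goal = goal
        self.max_steps = max_steps
        self.steps = 0
        rng = random.Random(0)  # fixed seed so runs are repeatable
        self.positions = [rng.uniform(-1.0, 1.0) for _ in range(num_robots)]

    def handle_step(self, request: StepRequest) -> StepResponse:
        """Service callback: apply all actions, return the new swarm state."""
        if len(request.actions) != len(self.positions):
            raise ValueError("expected exactly one action per robot")
        self.positions = [p + a for p, a in zip(self.positions, request.actions)]
        self.steps += 1
        # Reward is the negative distance to the goal, per robot.
        rewards = [-abs(p - self.goal) for p in self.positions]
        return StepResponse(list(self.positions), rewards, self.steps >= self.max_steps)


def run_episode(sim: SwarmSimulation, policy: Callable[[float], float]) -> float:
    """Learning side: only talks to the simulation through handle_step."""
    observations = list(sim.positions)
    total_reward = 0.0
    done = False
    while not done:
        request = StepRequest(actions=[policy(o) for o in observations])
        response = sim.handle_step(request)
        observations = response.observations
        total_reward += sum(response.rewards)
        done = response.done
    return total_reward
```

Because the learning side only sees requests and responses, the toy simulation here could be replaced by a real one (or the policy by a different algorithm) without touching the other side, which is the point of putting a service between them.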
Video:
GitHub: