I will point out that the executors aren't actually involved when data is published; a call to publish goes straight through the rclcpp layer down to the DDS layer, and then out to the network.
However, if subscriptions are used to measure the rate (using ros2 topic hz or something similar), then the executors are involved on the receiving side.
What would be really interesting to see from someone is the rate on the publisher side, measured inside the publisher code itself. That would let us narrow the problem down to either the publisher or the subscription side.
The other thing to try here is different RMW implementations, to see whether there is any difference between, say, Fast-RTPS and CycloneDDS.