When I use large messages such as camera images or PointCloud2 from Python code on ROS2, the performance is terrible.
Problem:
Issue1: Execute “ros2 topic echo /topic_pointcloud”, no message output -> [Closed]
Description: Launch ros2_intel_realsense with realsense_ros2_camera, and subscribe to “/camera/depth/color/points” with “ros2 topic echo /camera/depth/color/points”. It takes dozens of seconds to print the PointCloud2 (640x480) data. On ROS 1, rostopic shows the data immediately.
Issue2: Python pub/sub performance for large messages is worse than C++ -> [Closed]
Description: Subscribe to “/camera/depth/color/points” (sensor_msgs::msg::PointCloud2) with both the C++ subscription API and Python subscription code; the frame rates differ. The C++ code receives at 4 Hz, but the Python code at only 0.4 Hz.
Debugging shows that the time is spent in the function convert_to_py, which converts the message to Python, at _rclpy.c:2261.
Any ideas on how to fix this performance issue with camera messages?
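For anyone who wants to reproduce the 4 Hz vs 0.4 Hz comparison, here is a minimal sketch of a rate meter that could be called from a subscription callback. The class name RateMeter and its methods are made up for illustration and are not part of rclpy; the ROS2 subscription itself is left out so the snippet runs on its own.

```python
import time
from collections import deque


class RateMeter:
    """Estimate a message rate (Hz) from the arrival times of recent messages."""

    def __init__(self, window=10):
        # Keep only the most recent `window` timestamps.
        self._stamps = deque(maxlen=window)

    def tick(self, stamp=None):
        """Record one arrival; intended to be called once per received message."""
        self._stamps.append(time.monotonic() if stamp is None else stamp)

    def hz(self):
        """Return the average rate over the window, or None if it is undetermined."""
        if len(self._stamps) < 2:
            return None
        elapsed = self._stamps[-1] - self._stamps[0]
        if elapsed <= 0:
            return None
        # N timestamps span N - 1 intervals.
        return (len(self._stamps) - 1) / elapsed
```

Calling tick() at the top of the Python callback and logging hz() every few messages should give a number comparable to what the C++ subscriber reports.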
UPDATE:
Issue1: Fixed by PR
Issue2: Workaround by “export PYTHONOPTIMIZE=0”
This is a common problem in ROS1. In ROS1 the ECL package helps to address it: http://wiki.ros.org/ecl_ipc/Tutorials/Shared%20Memory. In essence, the problem is solved by using shared memory to pass large data between processes instead of the standard ROS pub/sub transport.
I thought it should not be a problem in ROS2, because ROS2 was created specifically to solve this and other ROS1 problems. If the issue remains in ROS2, I’d try the same approach as in ROS1 and pass images and point clouds through shared memory.
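To illustrate the shared-memory idea in general terms, here is a rough sketch using Python's standard multiprocessing.shared_memory module (Python 3.8+). It is not the ECL API and not a ROS2 transport; the helper names write_payload and read_payload are invented for this example, and a real setup would still need some way to tell the reader the segment name and size (for instance a small ROS message).

```python
from multiprocessing import shared_memory


def write_payload(payload: bytes) -> shared_memory.SharedMemory:
    """Copy `payload` into a new shared-memory segment and return the segment."""
    shm = shared_memory.SharedMemory(create=True, size=len(payload))
    shm.buf[:len(payload)] = payload
    return shm


def read_payload(name: str, size: int) -> bytes:
    """Attach to an existing segment by name and copy out `size` bytes."""
    shm = shared_memory.SharedMemory(name=name)
    try:
        # The segment may be larger than requested (page rounding),
        # so the reader needs the real payload size.
        return bytes(shm.buf[:size])
    finally:
        shm.close()


if __name__ == "__main__":
    payload = bytes(640 * 480)  # stand-in for image or point cloud bytes
    segment = write_payload(payload)
    try:
        assert read_payload(segment.name, len(payload)) == payload
    finally:
        segment.close()
        segment.unlink()  # the creator frees the segment when done
```

Only the segment name and size would travel over the normal pub/sub channel, so the large buffer is not serialized per message.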
Update debugging status:
I have written test code, rttest_sample, to publish a full-length PointCloud2 on the topic /rttest_sample. The results are:
Set data size as 64x48, “ros2 topic echo /rttest_sample” works.
Set data size as 640x48, “ros2 topic echo /rttest_sample” takes ~10s.
Set data size as 640x480, “ros2 topic echo /rttest_sample” takes more than 60s.
When I use ros1_bridge and echo the same (ROS2) topic in ROS1, it works well at 640x480.
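To put the three test sizes in perspective, here is a small calculation of how many points and bytes each cloud carries. The data field of PointCloud2 is a byte array of length point_step * width * height; the point_step of 16 bytes used below is an assumption for illustration, since the thread does not state the actual value.

```python
POINT_STEP = 16  # assumed bytes per point; not stated in the thread


def cloud_size(width, height, point_step=POINT_STEP):
    """Return (number of points, size of the data field in bytes)."""
    points = width * height
    return points, points * point_step


for width, height in [(64, 48), (640, 48), (640, 480)]:
    points, nbytes = cloud_size(width, height)
    print(f"{width}x{height}: {points} points, {nbytes} bytes")

# 64x48: 3072 points, 49152 bytes
# 640x48: 30720 points, 491520 bytes
# 640x480: 307200 points, 4915200 bytes
```

Under that assumption the full-size cloud is several megabytes of individual byte values, so any per-element work in Python (printing, converting, checking) is repeated millions of times per message.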
Debugging points to this code:
rcl/src/rcl/wait.c::rcl_wait
[INFO] [rcl]: Initializing wait set with ‘0’ subscriptions, ‘2’ guard conditions, ‘0’ timers, ‘0’ clients, ‘0’ services
timeout = -1
[INFO] [rcl]: Waiting without timeout
[INFO] [rcl]: Timeout calculated based on next scheduled timer: false // the wait blocks here
rmw_ret_t ret = rmw_wait(
    &wait_set->impl->rmw_subscriptions,
    &wait_set->impl->rmw_guard_conditions,
    &wait_set->impl->rmw_services,
    &wait_set->impl->rmw_clients,
    wait_set->impl->rmw_wait_set,
    timeout_argument);
It looks like there were several different issues discussed here.
The ros2 topic issue seems to be a bug in the truncation of the output, which makes printing to the console take a very long time. This should be addressed by https://github.com/ros2/ros2cli/pull/126
The difference in FPS between C++ and Python can be addressed by running the Python interpreter in optimized mode: setting the environment variable PYTHONOPTIMIZE=0 or passing -O to the python invocation. (related ROS answers post here)
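As a rough illustration of what optimized mode changes: -O makes the interpreter skip assert statements. The sketch below assumes the slowdown comes from per-element assert checks when message fields are filled in (that is my assumption, not something confirmed in this thread), and the set_data function is a hypothetical stand-in, not the generated message code.

```python
import os
import subprocess
import sys

# Child script: a setter-like function that validates every element with assert.
SNIPPET = """
import sys

def set_data(values):
    # Per-element check, similar in spirit to validation in a message setter.
    assert all(isinstance(v, int) and 0 <= v < 256 for v in values), 'bad byte'
    return list(values)

try:
    set_data([0, 255, 999])
    print('optimize=%d accepted' % sys.flags.optimize)
except AssertionError:
    print('optimize=%d rejected' % sys.flags.optimize)
"""


def run(*flags):
    """Run SNIPPET in a fresh interpreter with the given flags; return its output."""
    # Drop PYTHONOPTIMIZE so that only the explicit flags decide the mode.
    env = {k: v for k, v in os.environ.items() if k != "PYTHONOPTIMIZE"}
    result = subprocess.run(
        [sys.executable, *flags, "-c", SNIPPET],
        capture_output=True, text=True, check=True, env=env,
    )
    return result.stdout.strip()


if __name__ == "__main__":
    print(run())      # optimize=0 rejected
    print(run("-O"))  # optimize=1 accepted
```

With -O the check is never executed, so the invalid value slips through, but the per-element cost disappears as well. That is the trade-off of this workaround.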
@marguedas
Thanks for your comments. I have just updated the issue status; only one issue is left: why Python is much slower than C++ when subscribing to PointCloud2 messages. I will try PYTHONOPTIMIZE to verify.