This is a post asking for advice on the best way forward. We are working on robots that use two different architectures:
1: One or several on-robot single-board computers (SBCs) for navigation, image processing, etc., and one or several microcontrollers (usually some ESP32 variety) for low-level motor control, safety, etc., connected through serial.
2: A purely microcontroller-powered robot that is managed through WiFi from a host.
In both cases, we have ROS 2 on the SBCs / hosts and micro-ROS on the microcontrollers. We developed a framework to modularize our micro-ROS development.
All this works over DDS/XRCE-DDS in the classic way. Now we are trying to migrate to Zenoh as our RMW, as it would help a lot with our use cases (control over WiFi, remote cloud access…). The ROS 2 side of the change is straightforward enough, but I am a bit stumped on the best way to migrate our microcontroller code.
From the discussion in the linked topic, it seems there are only two feasible and “clean” methods to link an ESP32 to ROS 2 infrastructure when using Zenoh:
1. Integrate Zenoh-pico as the middleware inside micro-ROS. PRO: allows keeping (some of?) the micro-ROS API and thus user code. CON: sounds scary, high risk, and probably too complex an endeavour for us.
2. Integrate the ESP32 code directly against Zenoh-pico. PRO: few moving parts. CON: unclear how complex it is to translate raw Zenoh-pico into RMW concepts (if too complex, perhaps the right place for that is inside micro-ROS?).
An alternative is to just keep micro-ROS as is and link its agent to the rest of the ROS 2 application through the DDS-Zenoh gateway. PRO: no rewriting is needed. CON: ugly, and it does not help with the microcontroller-WiFi-host use case.
We’ve just run into the same issue, only from the other side. We started in Gazebo on Zenoh, and then my colleague finished the MCU code and we realized we have to give up Zenoh. What a pain!
Integrate ESP32 code directly against Zenoh-pico. PRO: few moving parts. CON: unclear how complex it is to translate raw Zenoh-pico to RMW concepts (if too complex, perhaps the right place is inside Micro-ROS?)
The system configuration described by @xopxe is another example where we – as implementors/designers of (sub) systems – are not always in total control and can’t always dictate which specific RMW a (sub) set of nodes in a system should use.
Integrate Zenoh-pico as the middleware inside micro-ROS.
This is probably the easiest method to implement, except for optimizing memory management.
The zenoh-pico implementation dynamically allocates its working memory internally, and frequently allocates small amounts of memory. In contrast, the micro-ROS implementation uses static memory management in the RMW layer (XRCE-DDS), which matches the memory management model of an RTOS.
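As a rough illustration of that difference (generic C, not the actual zenoh-pico or micro-ROS internals), compare a per-call heap allocation with a buffer whose size is fixed at build time:

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Dynamic style (as described above for zenoh-pico): many small, short-lived
 * heap allocations, which can fragment a small RTOS heap. */
char *make_keyexpr_dynamic(const char *prefix, const char *topic) {
    size_t len = strlen(prefix) + 1 + strlen(topic) + 1;
    char *ke = malloc(len);          /* heap allocation on every call */
    if (ke == NULL) return NULL;
    snprintf(ke, len, "%s/%s", prefix, topic);
    return ke;                       /* caller must free() */
}

/* Static style (as described above for the XRCE-DDS based micro-ROS RMW):
 * the buffer is sized at build time, so the memory footprint is fixed and
 * known before the scheduler even starts. */
#define MAX_KEYEXPR_LEN 128
static char keyexpr_buf[MAX_KEYEXPR_LEN];

const char *make_keyexpr_static(const char *prefix, const char *topic) {
    int n = snprintf(keyexpr_buf, sizeof(keyexpr_buf), "%s/%s", prefix, topic);
    return (n > 0 && (size_t)n < sizeof(keyexpr_buf)) ? keyexpr_buf : NULL;
}
```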
The other issue is political.
Since the micro-ROS products are eProsima products, I think it would be difficult to officially adopt ZettaScale products.
Integrate ESP32 code directly against Zenoh-pico.
This requires solving several technical issues beyond memory management.
How does zenoh-pico generate a hash value when generating a Zenoh key?
How does zenoh-pico serialize/deserialize ROS messages?
This is not a limitation of micro-ROS; it is a constraint that comes from the rmw_zenoh specification.
This is why rmw_zenoh_pico was first implemented for micro-ROS.
The implementation of rmw_zenoh_pico makes use of these micro-ROS functions.
I think this is the ideal implementation for resource-constrained targets.
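To make those two questions concrete: with rmw_zenoh, a topic is mapped onto a Zenoh key expression that embeds the ROS type name and its type hash (roughly of the form `<domain_id>/<topic>/<type_name>/RIHS01_<hash>`; the exact layout has changed between rmw_zenoh releases, so check the sources), and the payload is plain CDR, just as a DDS-based RMW would produce it. A bare zenoh-pico client has to reproduce both. A minimal, hedged sketch of the serialization half, using a hand-written encoder for geometry_msgs/msg/Twist (struct and function names are illustrative):

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* geometry_msgs/msg/Twist laid out as plain C: two Vector3, six doubles. */
typedef struct { double x, y, z; } vector3_t;
typedef struct { vector3_t linear, angular; } twist_t;

/* Classic CDR (XCDR1) serialization: a 4-byte encapsulation header
 * (0x00 0x01 = CDR little-endian, then two option bytes) followed by the
 * six doubles. All fields here are naturally 8-byte aligned, so no padding
 * is required. Assumes a little-endian MCU (true for ESP32 and STM32). */
size_t twist_to_cdr(const twist_t *msg, uint8_t *buf, size_t cap) {
    const size_t needed = 4 + 6 * sizeof(double);
    if (cap < needed) return 0;
    buf[0] = 0x00; buf[1] = 0x01;   /* CDR_LE */
    buf[2] = 0x00; buf[3] = 0x00;   /* options */
    memcpy(buf + 4, msg, 6 * sizeof(double));
    return needed;                  /* 52 bytes ready to hand to zenoh-pico */
}
```

A desktop rclcpp/rclpy subscriber running rmw_zenoh should then be able to decode the payload, provided the key expression and type hash also match what its own RMW advertises.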
link its agent to the rest of the ROS 2 application through the DDS-Zenoh gateway.
This method probably isn’t possible with the current implementation of zenoh_plugin_ros2dds.
The zenoh_plugin_ros2dds provides the ability to connect remote (Cyclone) DDS systems using the Zenoh protocol.
However, its Zenoh keys use their own format and are not compatible with rmw_zenoh.
Therefore, it is not possible to communicate with rmw_zenoh this way.
The zenoh_plugin_xxx framework is intended to be used as a point-to-point bridge.
For this reason, in order to connect to rmw_zenoh, the current Zenoh key handling would have to be replaced.
Thanks for the detailed analysis. I think I’ll sketch a “2. Integrate ESP32 code directly against Zenoh-pico” solution and see what code is missing for this to work.
The more I read things in this thread, the more I like the idea of just rewriting rosserial. That would be the true rmw-agnostic solution. Because I don’t want to lock us to a specific rmw just because of the microcontrollers. Is there already something like that?
The more I read things in this thread, the more I like the idea of just rewriting rosserial
If you want to achieve something similar to rosserial using zenoh…
I think it would be easy to repurpose the protocol currently used by rmw_zenoh.
On the ROS network side, in order to connect to an existing ROS environment, you would need to switch every node’s RMW to rmw_zenoh, or extend zenoh_plugin_ros2dds to translate to the existing DDS network.
Currently, zenoh_plugin_ros2dds does not support connecting to rmw_zenoh.
Because I don’t want to lock us to a specific rmw just because of the microcontrollers. Is there already something like that?
I haven’t found such a project yet.
I think mros2 is probably the closest project in concept.
However, I think you would need to review how it handles the IDL/type-hash issues of ROS messages, as well as its development environment.
The original zenoh implementation supports some of the ROS rmw and rcl functions, which means it can implement basic communication functions such as pub/sub on its own.
I think it would be interesting to have a minimalist client on a microcontroller that implements the main ROS application APIs in zenoh-pico and replaces rmw/rcl with it.
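As a hedged sketch (the API below is hypothetical and does not exist today), such a minimalist, client-mode-only layer over zenoh-pico might expose something like this to the application:

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical facade for a minimalist "RMW for microcontrollers": pub/sub
 * only, fixed-size tables, no graph introspection, no parameters, no
 * services. Each call would map directly onto zenoh-pico: open a client
 * session, declare publishers/subscribers on rmw_zenoh-compatible key
 * expressions, and move CDR-encoded bytes. None of these names exist today. */
typedef void (*mini_sub_cb_t)(const uint8_t *cdr, size_t len, void *arg);

int  mini_ros_init(const char *locator, uint8_t domain_id);
int  mini_ros_advertise(const char *topic, const char *type_name,
                        const char *type_hash);
int  mini_ros_publish(int pub_handle, const uint8_t *cdr, size_t len);
int  mini_ros_subscribe(const char *topic, const char *type_name,
                        const char *type_hash, mini_sub_cb_t cb, void *arg);
void mini_ros_spin_some(uint32_t timeout_ms);

/* Example of what application code could look like on the MCU: */
static void on_cmd_vel(const uint8_t *cdr, size_t len, void *arg) {
    (void)arg;
    /* deserialize the CDR payload into a Twist and update motor setpoints */
    (void)cdr; (void)len;
}

void app_example(void) {
    mini_ros_init("tcp/192.168.1.10:7447", 0);   /* Zenoh router locator (example) */
    int odom_pub = mini_ros_advertise("odom", "nav_msgs::msg::dds_::Odometry_",
                                      "RIHS01_...");          /* hash elided */
    mini_ros_subscribe("cmd_vel", "geometry_msgs::msg::dds_::Twist_",
                       "RIHS01_...", on_cmd_vel, NULL);
    for (;;) {
        /* build a CDR payload, then: mini_ros_publish(odom_pub, buf, len); */
        (void)odom_pub;
        mini_ros_spin_some(10);
    }
}
```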
A new RMW specific to microcontrollers? Client mode only, revised QoS…
Sorry, I don’t know of a specification that targets only microcontrollers.
rmw_zenoh and Zenoh frequently change their specifications, but I believe these are based on the specifications of the existing DDS RMWs.
Although there are too many changes…
Is your question about the implementation of rmw_zenoh_pico?
We’ve implemented Zenoh-Pico on our dual-motor, field-oriented-control (FOC) motor controllers running on STM32. What we did:
Worked with ZettaScale to fix a few bugs with the Zenoh router on serial ports for our target setup (Raspberry Pi host, STM32 motor controller)
Rewrote Zenoh-Pico so that it runs correctly on STM32 with ThreadX (this includes changes to the system calls and the UART interface)
Directly implemented the messages we were interested in as C structs. Essentially we prepared a payload in a struct, then simply passed a pointer to it to Zenoh as an argument when we called the publish function (see the sketch after this list)
Hooked this up to our motor controller / FOC implementation
and voilà - we have robots driving around, accepting cmd_vel messages directly from the Zenoh router and publishing odom messages. We are in the process of implementing all the other message types that we need for our motor controller. The rest of the robot operates through the Zenoh router, so there is no special code on the host.
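For readers who have not used zenoh-pico directly, the "struct in, pointer to Zenoh out" pattern described above looks roughly like the following. This is a hedged sketch based on the pre-1.0 zenoh-pico C API as shown in its public examples; signatures differ between releases, and the payload struct, key expression and locator are illustrative, not the poster’s actual code:

```c
#include <stdint.h>
#include <zenoh-pico.h>

/* Illustrative payload only; the real message layout is not shown in the thread. */
typedef struct {
    float x, y, theta;   /* pose estimate */
    float v, w;          /* linear / angular velocity */
} odom2d_t;

int publish_odom_once(const odom2d_t *odom) {
    /* Open a client session towards a Zenoh router (example locator). */
    z_owned_config_t config = z_config_default();
    zp_config_insert(z_loan(config), Z_CONFIG_CONNECT_KEY,
                     z_string_make("tcp/192.168.1.10:7447"));
    z_owned_session_t s = z_open(z_move(config));
    if (!z_check(s)) return -1;

    /* Background tasks that service the network link and the session lease. */
    zp_start_read_task(z_loan(s), NULL);
    zp_start_lease_task(z_loan(s), NULL);

    z_owned_publisher_t pub =
        z_declare_publisher(z_loan(s), z_keyexpr("robot1/odom2d"), NULL);
    if (!z_check(pub)) { z_close(z_move(s)); return -1; }

    /* "Prepare a payload using a struct, then pass a pointer to it": */
    z_publisher_put(z_loan(pub), (const uint8_t *)odom, sizeof(*odom), NULL);

    z_undeclare_publisher(z_move(pub));
    z_close(z_move(s));
    return 0;
}
```

Note that a raw struct dump like this is only decodable by peers that know the same layout; to interoperate with stock rmw_zenoh subscribers, the payload would instead have to be CDR-serialized ROS messages on rmw_zenoh-compatible key expressions, as discussed earlier in the thread.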
If you want us to open-source the ThreadX / Zenoh-Pico / STM32 bits of this, we would be happy to do so - it may be useful if you are working with STM32 on any Zenoh implementation. Please just respond in the thread that you’d like this. We just finished doing this 5 days ago, so the code probably needs a little bit of cleanup and organization before it would be ready to put out there.
If there is interest in the motor controllers themselves, I can make a few more and ship them to interested parties who want to experiment. We currently have a couple of different hub motor types running on them, and we’ve set them up so that we can support more types. Because they hook up directly to a Raspberry Pi, you can essentially use the Raspberry Pi to expose them over an Ethernet port - which, through the magic of publish and subscribe, means that you could essentially talk to your motor controllers over Ethernet.
Very interesting - I’d be delighted to see your code, especially how you derive ROS concepts from the zenoh-pico API.
But I have another question for you, related to this other thread. As you describe it, your controller listens to cmd_vel topics and publishes odom, which means that you implement the forward and inverse kinematics, e.g. differential drive, in your STM32 code. This is exactly the approach I took for a tracked robot, only with an ESP32 and a simple motor driver instead of an STM32 and your driver.
Though it works very well, another way looks more “canonical”: do the kinematics on the host (the RPi in your case) using, for example, ros2_control, and have it drive each motor through a separate topic or communication channel. It would be akin to using Dynamixel smart servos, but with some standard topic instead of a proprietary control protocol. In the reverse direction it would be the same: the STM32, instead of publishing /odom, would only publish the /joint_state from the encoders, and the odometry would be computed on the ROS 2 host.
Our current approach is to do simple diff-drive on the MCU and offer it as cmd_vel and odom_2d, but also to provide direct control of each motor in case someone wants to do more precise/sophisticated things.
This has the added benefit that we can even connect the remote controller directly to the MCU, so the robot can be controlled via standard diff-drive gamepad controls even when the main PC is off or dead. For more advanced remote control, we have a keepalive topic from the PC to the MCU that tells the MCU to ignore the remote commands and just pass them up to the higher level.
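A minimal sketch of that arbitration (hypothetical names, not our actual firmware): the MCU simply times out the PC’s keepalive and falls back to driving the wheels from the gamepad directly:

```c
#include <stdbool.h>
#include <stdint.h>

#define KEEPALIVE_TIMEOUT_MS 500u   /* assumed value; tune per robot */

static uint32_t last_keepalive_ms;  /* updated from the keepalive subscriber */

/* Called from the keepalive topic callback. */
void on_pc_keepalive(uint32_t now_ms) { last_keepalive_ms = now_ms; }

/* While the PC keepalive is fresh, the PC is in charge and gamepad input is
 * forwarded upstream instead of being applied locally. */
bool pc_is_in_charge(uint32_t now_ms) {
    return (now_ms - last_keepalive_ms) < KEEPALIVE_TIMEOUT_MS;
}

void handle_gamepad_cmd(uint32_t now_ms, float v, float w) {
    if (pc_is_in_charge(now_ms)) {
        /* forward the raw command to the host on a separate topic, e.g.
         * publish_remote_input(v, w);  (hypothetical helper) */
        (void)v; (void)w;
    } else {
        /* PC is off or dead: apply the diff-drive kinematics locally, e.g.
         * set_wheel_speeds_from_twist(v, w);  (hypothetical helper) */
        (void)v; (void)w;
    }
}
```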
That is of course a good approach and definitely more canonical from a ROS 2 point of view. It is also an approach we may support in the future. There are a couple of reasons why we don’t do it that way at the moment:
Timing: To get good, high-performance results from a differential drive robot, the timing for the left and right wheels must be closely synchronized. We put both the left and the right wheel in the same control loop so that they get their impulses at almost exactly the same time. This means that you get exactly what you ask for from your robot: if you want the robot to go exactly straight, that’s what it does.
More on timing: The original architecture of our system is that the host computer isn’t necessarily deterministic, while the motor controllers are. The result is that if you send separate messages from the host computer to control the left and right wheels, there is no guarantee that the messages will arrive with consistent timing. See the remarks above about lack of synchronization in a differential drive robot.
Bandwidth: Our design is that the microcontroller-driven elements operate in a deterministic way with extremely fast control loops, while the host computer (at the moment) is not running an RTOS, so it is non-deterministic and has relatively slow control loops. This means we have extremely high bandwidth in our motor-control loop. To take advantage of this, you want your kinematics down on the controller.
Fusion with other elements: Because you can set up such high-speed interactions on the motor controller, the basic concept is that we keep relative measures of position on the motor controller and absolute measures on the host computer. This means you have an IMU on the motor controller and multiple sources of high-frequency odometry data (encoders, electrical angle of the motor, magnetic ticks), all of which can be sampled in the kHz range. On the host computer you have camera, LIDAR and GNSS, which can be sampled much more slowly (at most 20 Hz). Absolute position determination happens slowly on the host computer, while high-frequency relative determinations happen on the motor controller. Because you have such high bandwidth to the relative measures, you can integrate over thousands of samples a second and get much better results - something that would be impossible if we tried to feed all that data up to the host computer. The consequence is that you need to feed back a fused relative measure of position; the good point, though, is that it will be much more accurate than if you tried to do it all on the host computer.
The last and least good reason: We’ve been designing these things for a while, and this design, while advanced, is descended from a design that pre-dates ROS 2 completely - part of this is historical.
In the end the motor controller should just be an appliance. You shouldn’t have to worry about what it does. You should send it messages and you should get back precise odometry.
Of course the MCU has to offer a single topic to control any number of joints simultaneously; that’s a must for good control. However, if it provides a velocity- or position-level interface to the joints, there is no longer a need for kHz control loops on the PC. Controlling the velocity setpoints at 20-50 Hz from a nondeterministic OS was always OK for us.
Regarding the odometry fusion, you are right: the closer to the source you compute it, the better. However, that also means you can’t reuse much of the ROS stack, like IMU filters, which is a pity.
Yes, these advantages are real for both approaches, as are the drawbacks. Perhaps having the controller do everything would show more of its benefits in an omnidirectional robot, which is more sensitive to fine motor control than a differential one. Conversely, if your differential robot is actually tracked or skid-steered, then no matter how fast you do the motor sampling, the odometry will always be horrible, and it could be more advantageous to run the kinematics on the host, where it is easier to fiddle with parameters and coefficients.
Also, in the category of bad good reasons: having your own kinematics in the MCU means you are responsible for the simulated version.
Hi,
This sounds very interesting. I have a very similar setup: a motor controller with an STM32 (Core2 board from Husarion), which is connected via micro-ROS over serial to an RPi 4, so I am interested in your implementation.