IPC in ros2

RTI’s DDS does use shared memory internally when possible, I think.

Hi sagniknitr,

Fast RTPS will support shared memory in a future release; it is on our roadmap. Regarding microcontrollers, we are developing a lightweight version of Fast RTPS based on a future OMG standard for resource-constrained devices. Of course, it is not a full DDS/RTPS implementation, but a reduced API and a different protocol.

You can see the first prototypes in the Dronecode.org use case, and we are working on a release now. This will let you use ROS2 on microcontrollers.

If you want more details, please contact me.

Jaime,

Thanks for the update about how Fast RTPS will be evolving. A few questions: is the prototype with Dronecode you’re referring to the work you did on the PX4 project? Additionally, can you point to any public links regarding this new OMG standard for constrained resources? I know that the RTPS standard is publicly available; is a draft of the document you’re referring to available?

Thanks
- Eddy

The DDS XRCE (“eXtremely Resource Constrained Environments” because all OMG specifications need a cool acronym) specification isn’t even close to finished yet. The RFP is available but you have to be an OMG member to get the working documents, unless someone involved is willing to provide them. When the specification is adopted, that becomes publicly available.

As seems to happen a lot with the DDS specs these days, the RTI/eProsima/Twin Oaks camp and the PrismTech camp have produced their own versions of the XRCE specification and are struggling to reconcile them into a single specification that the OMG can adopt. I won’t wade into the debate over which is better, but I’ll let you draw your own conclusions from the fact that one of the two camps only has one company in it. :wink: I hope it’s not going to be the RPC for DDS spec all over again; the arguments over that could be heard in other meeting rooms. But it seems likely that it will be a year and a half or more before the final specification is released.

Thanks for the informative reply @gbiggs, I had no idea the work was under way.

I attended the OMG meeting last week, so I took the opportunity to hear what PrismTech and RTI/TwinOaks/eProsima had to say about their competing submissions. The following are the notes I took during their presentations. Remember while reading this that it is based solely on the presentations. I haven’t read the submissions themselves in any detail yet.

PrismTech presentation:

  • They believe that their proposal is more efficient in the wire protocol (trying to save every single byte possible), and supports brokered as well as peer-to-peer communication.
  • They have invited the competing submitters to join the PrismTech submission.
  • PrismTech believes that they have presented their final submission in Brussels (June) and so were expecting to vote for acceptance in New Orleans (September), but because there are still two competing submissions, and no final submission document was received (only a presentation), this is not possible.
  • PrismTech’s reasons for going forward with their own submission:
    • XRCE tries to target the most wire/power/memory efficient protocol, targeting not just IP infrastructures but a whole range of infrastructures.
    • Their submission provides reliability and fragmentation.
    • Their current prototype runs on an 8-bit microprocessor with 1 KB of RAM and has a wire overhead of 4 bytes for data samples.
    • They got a review from the AB which was favourable (only editorial comments), and they believe that the AB review shows that they satisfy all the mandatory requirements.
  • XRCE applications can be brokered into a DDS data space, or they can discover one another and communicate P2P.
  • Their submission is only about the protocol, it does not say anything about the API.
  • A DDS-XRCE Agent running on permanently connected hardware provides the access to the rest of the (non-XRCE) DDS domain.
  • XRCE provides a data space abstraction in which applications can read and write data autonomously and asynchronously.
  • Data read and written by XRCE applications is associated with one or more resources identified by a URI.
    • An XRCE resource is a closed description for a set of named values.
    • Resource URIs allow for wildcards, which means that more than one resource can be targeted in one definition, which is useful for capturing collections of sub-namespaces.
    • A resource with a cardinality of one is called a Trivial Resource.
    • Resources also have properties, which are used to attach QoS settings to them, such as “this resource is transient” or “this resource is reliable”.
    • An XRCE selection is the conjunction of a resource and a predicate over the resource content and properties, used to filter a selection. For example, finding all lights with a luminosity greater than zero. (It seems to provide functionality similar to a very basic subset of SQL.)
  • There is a mapping from XRCE resources to DDS topics.
  • Resource serialisation uses the same format as DDS.
  • All XRCE messages are a single-byte header and a body.
    • There are flags in the header for reliability and synchronicity.
    • Messages can be “decorated” by prefixing them with a one-byte pre-header, which determines some of the properties for the following header byte, such as fragmentation.
    • The protocol is little-endian.
    • Variable-length encoding is used to save space.
  • For addressing, the source address for each message is assumed to be a unique address of the sender.
  • The protocol is modular, with a core profile (required for any communication), an optional query profile, and an optional discovery profile.
    • e.g. If using a communications system like Bluetooth, which already has discovery, the XRCE discovery profile can be removed, saving some space.
    • Discovery happens by scouting (sending a scout message to ask for types of nodes to reply, i.e. broker nodes or durability services or peers or clients).
    • Other nodes reply to a scout with a HELLO message.
  • After discovery, two nodes need to open a session, which enables publishing and subscribing between those two nodes.
    • Authentication data is included in the session open message.
    • Locator information, telling the other node how it can be reached, is included in the open message.
    • The receiving node replies with an accept message or a reject message.
    • There is an FSM describing the states a session goes through.
  • Resources and selections are uniquely identified by numerical IDs to save space on the wire.
  • A declare message is sent to declare what resources, selections, etc. a node has or will publish or subscribe to.
  • Subscriptions can be push, pull, periodic pull or periodic push.
  • All data messages and declarations are transported over a conduit (for the session), which is a pair of a reliable and a best-effort channel. Multiple conduits may be used in parallel to avoid head-of-line blocking and allow multicasting.
  • One decorator is used to select the conduit for the next message.
    • It is possible to make this decorator one byte instead of two if the number of conduits is less than five.
  • There is a sync message available to set the next sequence number to expect (e.g. for when not starting at zero).
  • The ACKNACK message can acknowledge multiple messages at once, usually up to the given sequence number. It can optionally request retransmission of one or more messages after that point.
  • There is a one-shot write function in the protocol, which can do a write of a resource without needing to do any prior registration or discovery.
  • This proposal appears to be a very complex protocol with a lot of options in message header structures. An implementation would have a lot of choice points during the decoding of a stream of messages. The “decorator” idea especially may save bytes on the wire (and in message construction buffers) in some cases, but it complicates the protocol implementation.
  • PrismTech are already trying to bring their submission to market as a product.
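The notes above mention variable-length encoding and small numerical IDs as ways of saving bytes on the wire. The presentation did not spell out the exact scheme, so the following is only a minimal sketch of one common approach (7 data bits per byte, least-significant group first, with the top bit as a continuation marker). The function names `encode_varint` and `decode_varint` are hypothetical and are not taken from either submission.

```python
from typing import Tuple


def encode_varint(value: int) -> bytes:
    """Encode a non-negative integer using 7 data bits per byte.

    Assumed scheme (not confirmed by the submission): least-significant
    group first, top bit set on every byte except the last.
    """
    if value < 0:
        raise ValueError("only non-negative integers are supported")
    out = bytearray()
    while True:
        group = value & 0x7F
        value >>= 7
        if value:
            out.append(group | 0x80)  # continuation bit: more bytes follow
        else:
            out.append(group)         # final byte: continuation bit clear
            return bytes(out)


def decode_varint(data: bytes, offset: int = 0) -> Tuple[int, int]:
    """Decode one integer starting at offset; return (value, next_offset)."""
    value = 0
    shift = 0
    while True:
        if offset >= len(data):
            raise ValueError("truncated varint")
        byte = data[offset]
        offset += 1
        value |= (byte & 0x7F) << shift
        if not byte & 0x80:
            return value, offset
        shift += 7
```

With a scheme like this, any ID below 128 costs a single byte, which is the kind of saving the notes describe. The price is a loop and a branch per field in the decoder, which is one instance of the size-versus-complexity trade-off raised in the last bullets.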

RTI/eProsima/TwinOaks presentation:

  • This proposal also uses an XRCE agent to provide access between DDS-XRCE nodes and the DDS domain.
  • An important use case for them is that devices will tend to mostly sleep, and only wake up occasionally to do some processing and to send and receive data. This means that two devices may never be awake at the same time. This is why they have the agent, which is permanently present and provides a way for XRCE nodes to communicate with each other.
  • They say that their proposal focuses not just on the XRCE protocol, but also on the interaction between the XRCE protocol and the DDS domain.
  • It is possible for an XRCE agent to behave as an XRCE client to another agent, allowing for a hierarchical structure of agents and clients.
  • Their proposal is based on the web-enabled DDS specification, with a DDS-XRCE object model in the agent that has a one-to-one mapping to the DDS data model.
  • eProsima have put up a demo on Youtube: https://www.youtube.com/watch?v=HJ5eBQ2tZNQ
  • Their demo uses 43 KB of RAM (much bigger than the PrismTech proposal, which can fit in 1 KB).
  • The XRCE Agent object model is very similar to the DDS object model, making the mapping very simple.
  • XRCE objects are modeled as resources that are addressable by their name and have CRUD operations.
    • Each resource has a name to address it within the agent and a context; a representation describing the resource; and an ID.
    • Resources can be represented as a name, an XML description, or a binary representation.
  • Authentication capability is built into the proposal.
  • Types are typically pre-defined profiles in the XRCE Agent.
    • It is possible to transmit types as binary or XML representations.
  • Message structure:
    • A message header is either 4 or 8 bytes, depending on if a clientKey is used.
    • There is a sub-message header, which is another 8 bytes.
    • It is possible to send data in a sequence, meaning the header only needs to be sent once for a bunch of samples.
  • The transport can be message-oriented or packet-oriented.
  • CDR and DDS-XML representations are reused.
  • Message overhead is typically 12 bytes.
    • They consider that further reduction in overhead would increase complexity and reduce robustness. e.g. They do not need different code paths for multiple sessions, re-connections, handling variable-length encoding.
    • They say it compares favourably to TCP/IP overhead (40 bytes).
    • They think they could save 6 bytes from their message header, but the reduction is only 6% in the context of total overhead (when considering the use case of using TCP/IP as the underlying transport protocol) and so is not worth it in the face of increased complexity.
  • They also received 3 AB reviews, which they claim were supportive and did not find any non-editorial problems.
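The header sizes in the notes above can be added up directly. The sketch below only uses the figures quoted in the presentation (4 or 8 bytes of message header, 8 bytes of sub-message header, 40 bytes of TCP/IP headers). Dividing the whole overhead evenly across a sequence of samples is an assumption, since the notes only say the header "needs to be sent once for a bunch of samples". All names are hypothetical.

```python
HEADER_BYTES = 4              # message header without a clientKey
HEADER_WITH_KEY_BYTES = 8     # message header with a clientKey
SUBMESSAGE_HEADER_BYTES = 8   # sub-message header
TCP_IP_HEADER_BYTES = 40      # figure quoted in the presentation


def message_overhead(with_client_key: bool) -> int:
    """Protocol overhead in bytes for one message carrying one sub-message."""
    header = HEADER_WITH_KEY_BYTES if with_client_key else HEADER_BYTES
    return header + SUBMESSAGE_HEADER_BYTES


def overhead_per_sample(with_client_key: bool, samples: int) -> float:
    """Overhead per sample when several samples share one set of headers.

    Assumption: both headers are sent once for the whole sequence.
    """
    if samples < 1:
        raise ValueError("need at least one sample")
    return message_overhead(with_client_key) / samples


def overhead_with_tcp_ip(with_client_key: bool) -> int:
    """Protocol overhead plus the quoted TCP/IP header size."""
    return message_overhead(with_client_key) + TCP_IP_HEADER_BYTES
```

Under these figures the overhead is 12 bytes without a clientKey and 16 bytes with one, before any transport headers are counted.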

My summary:

  • The submissions are very different. PrismTech is aiming for cutting down the bytes used by the protocol to the absolute minimum at the expense of all else. RTI/eProsima/TwinOaks are favouring some overhead in order to achieve a simpler and more robust protocol and implementation.
  • PrismTech’s protocol is overly complex with too many choices during the decoding of a message.
  • The extra overhead of the RTI/eProsima/TwinOaks submission is not likely to be a problem in the majority of embedded micro-processors used these days (although I’m not sure how many 8-bit, 1KB-of-RAM micros there are in use in new products). However, their consideration that TCP/IP is going to be the most common transport may not be accurate.

Thanks for the summary @gbiggs! I’m very interested in how the work progresses

Thanks Geoff, great stuff!

Hello @gbiggs,

First of all, I’d like to point out that I am one of the authors of the PrismTech/ADLINK XRCE proposal, so you may think I am only giving you my perspective; in that case, I urge you to read the original documents to see that what I am providing here are facts. That said, let me make a few rectifications to some of the points above.

You say:

* They [PrismTech] believe that their proposal is more efficient in the wire protocol (trying to save every single byte possible), and supports brokered as well as peer-to-peer communication.

We are engineers and mathematicians, not priests. Thus we don’t believe, we measure :wink: You can find the results of our analysis here http://bit.ly/2yjUZMy but you are welcome to derive them by reading both specs.

* PrismTech believes that they have presented their final submission in Brussels (June) and so were expecting to vote for acceptance in New Orleans (September), but because there are still two competing submissions, and no final submission document was received (only a presentation), this is not possible.

This is also partially correct. We had asked to present what would have been our final proposal in Bruxelles to give the task force a chance to digest it and the competing team one more opportunity to join. This has nothing to do with the Vote-to-Vote and the fact that the task force did not decide to allow the Vote-to-Vote procedure. The OMG has very complex rules… I know that can seem strange, and they still surprise me after having spent more than 10 years dealing with them.

* Their submission is only about the protocol, it does not say anything about the API.

The RFP (which I wrote) asks for a protocol not for an API.

* This proposal appears to be a very complex protocol with a lot of options in message header structures. An implementation would have a lot of choice points during the decoding of a stream of messages. The “decorator” idea especially may save bytes on the wire (and in message construction buffers) in some cases, but it complicates the protocol implementation.

First off, the protocol has a single-byte header, of which only 3 bits are used for flags and the remaining 5 bits for the message ID. There are two flags that are used consistently to identify Reliable and Synchronous messages, and a few other flags used to mark the presence or absence of some information in the message. Personally, I don’t find that dauntingly complex; it is quite normal in protocols to use flags to this end.
Additionally, you may argue that this may add some complexity in the message parsing, but again, I think that checking a flag and deciding whether some field is going to be present or not does not belong to the realm of hard. It is also worth pointing out that some flags are just informative and don’t require any kind of branching in the decoding. Finally, what I can tell you is that with this protocol we have measured performances that literally blow away RTPS! If you are curious we’ll share the numbers… Or better, make the code available for you to see with your own eyes :wink: And BTW, if our complex specification can fit in 1KB and RTI&Co’s simple protocol fits in 43KB… There is something wrong… Thus either ours is not so complex or theirs is not so simple :slight_smile:
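To make the header description concrete, here is a minimal sketch of packing and unpacking a one-byte header with 3 flag bits and a 5-bit message ID. Only the 3/5 split and the existence of Reliable and Synchronous flags come from the post above. The bit positions, the name of the third flag, and the function names are assumptions made for illustration.

```python
# Assumed layout (not confirmed by the submission): flags in the top
# three bits, message ID in the low five bits.
FLAG_RELIABLE = 0x80
FLAG_SYNCHRONOUS = 0x40
FLAG_OPTIONAL_FIELD = 0x20   # hypothetical "some field is present" marker
FLAGS_MASK = 0xE0
MESSAGE_ID_MASK = 0x1F


def pack_header(message_id: int, flags: int = 0) -> int:
    """Combine a 5-bit message ID and 3 flag bits into one header byte."""
    if not 0 <= message_id <= MESSAGE_ID_MASK:
        raise ValueError("message ID must fit in 5 bits")
    if flags & ~FLAGS_MASK:
        raise ValueError("flags must only use the top three bits")
    return flags | message_id


def unpack_header(header: int) -> tuple:
    """Split one header byte into (message_id, flags)."""
    if not 0 <= header <= 0xFF:
        raise ValueError("header must be a single byte")
    return header & MESSAGE_ID_MASK, header & FLAGS_MASK


# Example: a reliable message with ID 5.
header = pack_header(5, FLAG_RELIABLE)
message_id, flags = unpack_header(header)
is_reliable = bool(flags & FLAG_RELIABLE)
```

Decoding is a mask and a comparison per flag. Whether that counts as complex depends mostly on how many optional fields hang off each flag, which this sketch does not model.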

As I’ve explained several times during previous OMG meetings, we have customers who count bytes, and in some use cases they are not willing to spend more than 7 bytes of wire overhead on data samples. Our proposal currently has 4 bytes, compared to that of RTI/TwinOaks/eProsima, which has 16 bytes. We have had our implementation running on a Makeblock robot using BLE with an MTU of 20 bytes. You can check the comparison in this deck http://bit.ly/2yjUZMy and should wonder why the competing proposal did not provide a proper analysis of the wire overhead. Additionally, when leveraging batching we have a wire overhead of (3+n)/n, where n is the number of samples being batched.
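The numbers in this post can be worked through in a few lines. The sketch below reads (3+n)/n as overhead in bytes per sample, which is an assumption, although it is consistent with the 4-byte figure for a single sample. The 20-byte MTU and the 4-byte and 16-byte overheads are the figures quoted above. The function names are hypothetical.

```python
BLE_MTU_BYTES = 20  # MTU quoted for the BLE setup


def batched_overhead_per_sample(samples: int) -> float:
    """Wire overhead per sample for a batch, using the quoted (3 + n) / n.

    Assumption: the result is in bytes per sample.
    """
    if samples < 1:
        raise ValueError("need at least one sample")
    return (3 + samples) / samples


def payload_room(mtu_bytes: int, overhead_bytes: int) -> int:
    """Bytes left for sample data in one transfer unit after the overhead."""
    return max(mtu_bytes - overhead_bytes, 0)


for n in (1, 3, 10):
    print(n, batched_overhead_per_sample(n))

print(payload_room(BLE_MTU_BYTES, 4))   # with the quoted 4-byte overhead
print(payload_room(BLE_MTU_BYTES, 16))  # with the quoted 16-byte overhead
```

Read this way, the per-sample overhead approaches 1 byte as batches grow, and on a 20-byte MTU the two quoted overheads leave 16 and 4 bytes of payload respectively.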

Another thing worth pointing out is that the protocol allows for data to be pushed, pulled, or periodically pushed or pulled. The write protocol also allows pacing the streaming of data that results from a remote query. This is extremely important when dealing with resource-constrained nodes that need to consume data little by little.

I did not hear RTI saying:

* They also received 3 AB reviews, which they claim were supportive and did not find any non-editorial problems.

But this is far from being true. The AB reviews should be public, and I can ask permission from the AB to post them. The truth is that the two AB reviews raised questions concerning whether the submission was actually answering the RFP. The reason why RTI did not want to vote is that their submission, if selected, would have been killed by the AB. It is as simple as that.

I’d like to understand why you say this:

My summary:

* The submissions are very different. PrismTech is aiming for cutting down the bytes used by the protocol to the absolute minimum at the expense of all else. RTI/eProsima/TwinOaks are favouring some overhead in order to achieve a simpler and more robust protocol and implementation.

What do you think our submission is cutting out? We are wire efficient, yes, extremely wire efficient, but at the same time we support:

  • Dynamic Discovery (RTI does not, please read their spec!)
  • Peer to Peer Communication (RTI does not)
  • Client to Broker Communication
  • Non IP Transports (RTI does not)
  • Generalised Queries (RTI only supports DDS-like queries)
  • Push/Pull/Periodic-Push and Periodic-Pull Readers
  • Unicast and Multicast communication – for both client to broker and peer-to-peer
  • Durability

At this point my question is: have you read both specifications? If not, I suggest you do; ours is available here http://bit.ly/2wtJL3m

I hope this was useful in clarifying a few aspects, and I am looking forward to getting some feedback once you have read both submissions. I’ll also be more than happy to answer any questions you may have about our protocol.

A+

Dear @gbiggs,

You are being too nice to PrismTech, as everyone knows that consistency in design and elegance is seldom achieved through multi-vendor compromises. I think it is best for people to look with their own eyes at what the two submissions can do and make their own decisions.

On my side, I like debates, I like inquisitive minds and I like hard questions. Thus I’ll be more than happy to answer any question on the XRCE protocol proposed by the PrismTech/ADLINK team and explain why it is better than RTI’s proposal… And actually it is better than RTPS.

Thus, please let’s start the open debate to dissect the reasons why our proposal is the one which should be voted for and is by far the better one.

I’ll give you another small hint of why our XRCE proposal is an improvement over RTPS… Do you know how the RTPS protocol deals with discovery? What happens when you have loads of topics, readers and writers in your system and very asymmetric nodes?

Have you ever tried to do a one shot write in DDS? How much protocol traffic are you going to generate to make that happen… And how many entities do you have to create?

I’ll stop here… for the time being :wink:

A+

Thanks @kydos for jumping into the community in such an energetic manner. I think we all appreciate having one of the authors of the proposals providing feedback. That said:

  • The view @gbiggs provides is a) unbiased, b) the view of a roboticist (that’s what this community is about :wink: ! ) and c) based on the information someone got from hearing “your presentation”. Note the following:

which answers:

  • I believe we all appreciate technical argumentation and slides. I love slides. But what I love even more is code and things that I can reproduce. How can I verify your arguments through experimental results? Is there any open code that supports your arguments? Rather than bashing each other, I think it will do a lot of good to facilitate implementations that others can reproduce on common platforms. Even early stages will do. You might find that you could even get some support (and feedback!) before launching it officially and, furthermore, that’ll definitely convince a lot of people why your approach is better.
  • Last but not least, I really hope that the passion you’ve shown answering this thread is shown by supporting and making OpenSplice better.

I think I’ll stop here… for the time being :wink: .

Hi Angelo (@kydos),

Coordinating different companies to reach a common view on a complex matter is always hard, and in this case there are multiple design options leading to different tradeoffs.

Our submission (RTI, Twin Oaks & eProsima) already aligns the views of three different DDS vendors, and we will certainly try to incorporate ideas from your submission.

Our submission tries to accomplish the following:

  • Propose a familiar model to the final user, making use of the DDS object model and specifications: Serialization (CDR), representation (XML-DDS), and some ideas of WS-DDS mapping
  • Neither the protocol nor the API is designed to save every possible bit, but to have something robust, flexible and easy to use.

We coded a PoC of our submission, and during the presentation you asked about numbers. At that point, with no optimizations at all and in debug mode, we answered 43Kb. I asked my team to optimize the code a little, and here are the numbers we have now for the client:

Total Memory Use: 8 Kb

But we could squeeze that even a little more. We are releasing this as Open Source (Apache 2) so anyone can review the results.

But again, we are not aiming to be as small as possible. We are covering all the requirements of the DDS-XRCE RFP, and testing our solution on what we consider typical microcontrollers today.

Regarding the process at the OMG meeting, we (RTI, Twin Oaks & eProsima) didn’t want to pit the two specifications against each other and choose one of them, but rather to have the time to incorporate ideas from your submission, and that is why we now have an extended deadline.

Let me join in the fray. I’m the other author of the PrismTech submission and the one who built our tiny prototype. I wasn’t present at the OMG meetings, and I won’t waste any words on what may or may not have happened there.

Firstly, I don’t think a contest of bytes of RAM adds real value to the discussion, although of course it is an honourable contest in itself :slight_smile: I am surprised that you, @Jaime_Martin_Losa, had never even had a proper look at the memory use of your implementation given the purpose of the exercise in the first place, but if it is 8kB now then it is much better already — if still 7kB overweight :wink: In any case, memory use is determined more by the implementation than by the protocol messages.

The precise overhead on the wire is of more interest, as this is fundamental to the protocol. BLE gives you 20 bytes to play with, and a difference of a few bytes of header adds up in that context. Furthermore, as Angelo pointed out, we have customers to whom 8 bytes is too many already. Yet even that is not of such great interest to me in this discussion.

What really matters in my opinion is a difference in philosophy. The two proposals suggest very different views of what one would ideally want to accomplish.

The RTI/TwinOaks/eProsima proposal is limited to providing a means for performing DDS operations remotely, and I don’t see how it can do anything other than that. In a sense, it is just a hand-crafted alternative to CORBA with a lower overhead. (Simply using CORBA actually “just works” if the DDS implementation is faithful to the IDL interface mappings, even if it is ugly.) To me, this route is a pragmatic way of going about satisfying the RFP, but at the same time, an uninteresting one. (Sorry @Jaime_Martin_Losa and others …)

We chose to design a compact protocol that can support what amounts to performing DDS operations remotely, but doesn’t limit itself to it. Instead, it also supports a DDS-like peer-to-peer network with vastly lower overhead, and, in many ways a level of flexibility in specifying what data is of interest (through URIs and selections) that DDS doesn’t natively support.
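As a rough illustration of what "URIs and selections" could look like in practice, the sketch below models resources as named values keyed by URI and a selection as a wildcard URI pattern plus a predicate, following the description in the presentation notes earlier in the thread. It is not the submission's API or wire format: the URIs, the `select` function, and the use of shell-style wildcards via Python's `fnmatch` are all assumptions made for illustration.

```python
from fnmatch import fnmatchcase
from typing import Callable, Dict

# A resource is modelled here as a set of named values (hypothetical shape).
Resource = Dict[str, float]


def select(
    resources: Dict[str, Resource],
    uri_pattern: str,
    predicate: Callable[[Resource], bool],
) -> Dict[str, Resource]:
    """Return resources whose URI matches the pattern and whose content
    satisfies the predicate. The predicate only runs on matching URIs."""
    return {
        uri: values
        for uri, values in resources.items()
        if fnmatchcase(uri, uri_pattern) and predicate(values)
    }


# Hypothetical data space with three resources.
resources = {
    "home/kitchen/light": {"luminosity": 0.8},
    "home/hall/light": {"luminosity": 0.0},
    "home/kitchen/thermostat": {"temperature": 21.5},
}

# "All lights with a luminosity greater than zero."
lit = select(resources, "home/*/light", lambda v: v["luminosity"] > 0)
print(lit)
```

The point of the sketch is only that one declaration can cover a whole family of resources and filter on their content, something a fixed topic name cannot express on its own.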

All of that would be of little value if it doesn’t perform well or doesn’t scale well. Just like the code size and memory use are mostly determined by the implementation, so is maximum sustainable performance more determined by implementation than by the details of the protocol headers. Size-wise, my prototype can run as a client on an Arduino Uno (8-bit CPU, 2kB RAM). A small test application using the same implementation configured as a peer easily sends ~700k 8-byte msgs/s from one RPi3 to 3 others (CPU is << 100%, network load ~75% of Fast Ethernet, so I really should investigate why it isn’t faster), and goes another order of magnitude faster when run over local loopback on my MBP. That’s better than your typical DDSI implementation. Is this relevant? That depends on whether you have high rate, tiny messages …
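The throughput figures quoted above can be sanity-checked with some arithmetic. The sketch assumes that Fast Ethernet means the nominal 100 Mbit/s, that the ~75% load is measured on the sender's link, and that the rates are exactly 700k messages per second and 75%. None of this is stated precisely in the post, so the results are rough estimates only.

```python
FAST_ETHERNET_BITS_PER_SECOND = 100_000_000  # nominal Fast Ethernet rate


def payload_megabits_per_second(messages_per_second: int, payload_bytes: int) -> float:
    """Raw payload rate in Mbit/s, ignoring every header."""
    return messages_per_second * payload_bytes * 8 / 1_000_000


def wire_bytes_per_message(messages_per_second: int, link_utilisation: float) -> float:
    """Average bytes on the wire per message at the given link utilisation.

    Assumption: the utilisation figure refers to the sender's link.
    """
    wire_bytes_per_second = FAST_ETHERNET_BITS_PER_SECOND * link_utilisation / 8
    return wire_bytes_per_second / messages_per_second


payload_rate = payload_megabits_per_second(700_000, 8)
per_message = wire_bytes_per_message(700_000, 0.75)
print(f"payload rate: {payload_rate:.1f} Mbit/s")
print(f"wire bytes per message: {per_message:.1f}")
```

Under those assumptions the payload alone is 44.8 Mbit/s, and each 8-byte message costs about 13.4 bytes on the wire including its share of all framing. Sending three separate unicast copies of the payload would need more than 100 Mbit/s, so the figures only fit on one Fast Ethernet link if the data goes out once (for example by multicast) and many messages share each frame.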

Now my test application doesn’t implement all of DDS — not even close — and this is another significant reason why it can do this with only a few kB of code and RAM. At the same time, this is, I believe, where it gets interesting for ROS2.

As ROS2 has its own middleware abstraction layer that uses only a fraction of the DDS feature set, putting ROS2 directly on our protocol would get you a smaller and a faster system. Smaller and faster usually allows doing more interesting things, even if I can’t say what exactly those interesting things will be.

Disclaimer: I can’t run ROS2 over it today; there’s more work to be done on my prototype before it supports all that is required. And I wish none of you would have to take my word for the data I mentioned, but that is not something that is in my power to solve today.

It looks like I started something of a minor storm moments before starting a holiday…

I’m thankful that the DDS vendors, PrismTech, RTI, Twin Oaks and eProsima, are all engaged enough in the ROS community to be present on the Discourse board. It is encouraging to future adopters of ROS 2.

I wasn’t implying religious belief. It’s simply an expression to describe someone making an assertion. :slight_smile: I fully agree that PrismTech’s submission has much smaller messages than the RTI/Twin Oaks/eProsima submission based on the two presentations alone.

While this is true, the other submission has managed to define an object model as well, and in addition kept it close to the existing DDS one.

It is quite common to use flags. TCP is full of them. My concern is that the protocol is, in my opinion, undoubtedly complex and, based solely on the presentations, more complex than the other submission. Complexity and size are often a balance and in this situation we appear to have one submission at each end of the balance.

With the sorts of overhead you are achieving, I’m not surprised performance is amazing. I’d still like to see numbers, though. :slight_smile:

As was stated elsewhere in this thread, eProsima’s implementation wasn’t optimised. Since you said you are trying to bring yours to market already and eProsima claimed theirs was a tech demo, I’m not surprised yours is more optimised and thus smaller. Of course, the numbers are definitely in your favour for RAM usage. But I’m curious how much program memory each implementation requires, too. This is often the limiting factor on embedded microprocessors rather than the RAM usage.

I didn’t catch that even once during the presentation. Next time, put such an important motivating factor in your slides. :wink: RTI and co were much better at motivating their design decisions, and that put a positive spin on their submission.

You probably also should have put that requirement in the RFP, if it’s that important. The other submission cannot aim for a requirement they are not aware of.

Since the submissions have not gone to the AB yet, as far as I know, then there should not be any official AB reviews, which suggests to me that RTI asked for unofficial reviews from AB members. This may be why they are not public?

In that case I am very interested in seeing what these AB members wrote.

That may have been RTI’s reason, but the reason the rest of us present voted no is because we still have two vastly different submissions with no apparent readiness to work towards a single one. PrismTech even behaved in their presentation as if they are expecting RTI, Twin Oaks and eProsima to throw away their submission and go with PrismTech’s.

“Cutting down”, not “cutting out”. I meant that you are trying to reduce the size of the messages on the wire as much as possible, at the expense of possibly needing more complex parsing code. I didn’t mean to say that you are cutting out features. It was clear from the presentation that PrismTech supports more features and has more flexibility than the other submission. But there are trade-offs involved.

Neither of these are required by the RFP. The RFP heavily directs the submitter towards the style of architecture that RTI/Twin Oaks/eProsima provided.

Yes, this is something that I was disappointed about. Hearing RTI’s presentation talk about TCP/IP only seemed to rule out using it on things like Zigbee. But on the other hand, perhaps it’s readily adaptable?

Well, it is DDS-XRCE, is it not?

It was very useful. I wish I had had this information during the presentation. I still have not had time to read the submissions in detail and unfortunately will not be able to do so before November, but fortunately we now have until February next year to try and resolve this situation.

And, as @vmayoral said, having code available would make a difference to how well we can judge things like implementation complexity. :wink:

While this is true, we are operating at a standardisation organisation, not a rubber stamp provider. There are interested parties beyond just the implementers. We would prefer not to just hold a vote on which submission to go forward with. We would prefer the submitters to actually work together and produce a single submission that combines the best of both without any technological compromises (yes, I’m aware how hard that is).

Or, you could provide the answers to those questions, along with the equivalent answers for your submission, so we can see and compare the data to back up your claims.

This is the strongest impression I got from the presentation, as I said in my own notes. The data model is similar to DDS, which makes adoption by existing DDS users easier, and the ability to implement the protocol in a relatively simple piece of code (which makes it easier to verify and certify) was considered as important as saving bytes on the wire. I’m not sure where the correct balance is between these two requirements, but the PrismTech implementation really gave the impression of being at one extreme.

This is something that I think is really relevant but that PrismTech have not addressed at all, and RTI/Twin Oaks/eProsima have not addressed enough. What are the typical microcontrollers in use today? What are the target environments for this protocol to be used in?

The RFP explicitly says this:

Both submissions fit within both the RAM usage (with much room to spare) and the protocol overhead.

More specifically, the actual mandatory requirement is:

Again, this should have been in the RFP if it is so important. That would have saved a lot of trouble. All we got was an evaluation criterion:

This is a miserably small set of criteria for a complex design space. Even you, @eboasson, say that protocol overhead is not as important as the design philosophy.

Which I agree is fundamentally different between the two proposals, and that this is where the root cause lies in the failure to reconcile them.

The RFP not-so-subtly pushes submitters in this direction. You cannot fault them for taking it at face value.

I will read both submissions and when I do, I will report back with more technically-informed comments.

Hello @gbiggs,

And you see this as a positive aspect? Our model is simpler and more user friendly. For instance, how many people can digest DDS partitions? That said, we have a well defined mapping between XRCE resources and DDS topics.

Are you an OMG member? If so I’ll forward you the reviews. Both submissions went to the AB and the reviews were posted on both ab@omg.org and mars@omg.org. If you have access to those mailing lists you’ll be able to see them. I also recommend you take a look at those.

It was impossible as the other vendors did not want to agree on such a low bound. 24 bytes was the lowest we could agree on. This is why there is an evaluation on wire-efficiency. These matters were discussed at length, but again I don’t think you attended those meetings, so you are missing part of the history and the context. In any case, all of those documents are in the OMG archives, so you can reconstruct it if you are interested. Just search for the presentations I gave on XRCE over almost a year, starting from 2015!

Yes, that is correct: we have been trying to do a joint submission since the very beginning. They’ve refused with futile arguments, if you ask me. We have put a lot of effort into trying to join forces, but that has not been reciprocated. A pity that you were not at the Brussels meeting, otherwise you would have had a taste of it.

Again, you did not attend the endless arguments we had during the RFP drafting. RTI does not want peer-to-peer in XRCE because they fear it could become a substitute for DDSI-RTPS. Again, this is not something I am inferring, but something that was openly debated during the RFP drafting. We don’t have any issue with that, as we think that having a more efficient protocol than DDSI-RTPS for some use cases would be extremely useful.

For me that completely disqualifies the submission, as in LoWPAN environments nobody can afford to use TCP/IP…

I am glad that this helped clarify the situation; please don’t hesitate to ask any other questions. Concerning code availability, we are working on that. I’ll keep you posted.

A+

Dear All,

I wanted to let you know that we have just released under Apache 2 a peer-to-peer implementation of our zenoh protocol called Zeno-He (Zenoh Helium). This implementation fits in about 1KByte of RAM and has 4 bytes wire overhead. This implementation not only is incredibly resource efficient but it is also blazing fast as it delivers incredible point-to-point throughput and low latency.

The project website is available at http://zenoh.io and the source code at https://github.com/atolab/Zeno-He.

We will be releasing a brokering system by the end of the year. Likewise, we would be glad to help out with integrating zenoh as one of the protocols supported by ROS2. This could make it possible to bring ROS2 to micro-controllers!

N.B. For those of you who are familiar with XRCE, zenoh is the protocol we are proposing for standardisation. But as the standard is not finalised yet, we will keep referring to it as zenoh.

A+

Kydos,

Thank you for publishing the Zeno-He library so we can all begin to interact with it. It is especially useful for the ROS 2 user community to be aware of the effort since it implements the ATLab XRCE proposal.

I know I would be extremely interested in someone benchmarking Zeno and the proposed eProsima XRCE implementation, and perhaps I can find time to do that.

This is really appreciated @kydos, thanks for making this available.

At present it’s unlicensed code though (atolab/Zeno-He#1).

@gavanderhoorn thanks for catching that. I’ve just committed the Apache 2 LICENSE file.

Please let us know if you have any questions.

Ciao!

I’ve finally found the time to read through both submissions (and just in time, too, with the OMG meeting next week!). Here are the notes I took for myself as I was reading through them, along with some thoughts at the end.

RTIandCo submission

  • Client-server protocol to allow a resource-constrained device to interact with a DDS domain via a gateway (the “Agent” server).
  • Use of the client-server (broker) architecture is what allows the low resource usage.
  • The specification defines a simplified object model that acts as a facade to the standard DDS object model, enabling lower resource use to access a DDS domain.
  • Most DDS configuration is assumed to be doable on the agent (DDS side), so configuration options on the XRCE side are limited. This contributes to the simplified object model.
  • Access control, access rights, and managing disconnected clients are new features (over base DDS?) included in the facade object model.
  • Management of disconnected devices is handled using a session concept that persists across connections between the client and the server (e.g. when the client goes to sleep).
  • A pull mode is available for clients that do not want data coming in randomly. The client can query the object model on the server rather than changes being updated and pushed out in real-time.
  • The specification can be used for anything from extremely simple, pre-configured clients up to fully capable DDS devices (why these cannot just use DDS is not made clear).
  • The object model is resource-based: DDS-XRCE types, DataWriters, DataReaders and so on are represented as resources with a name, properties and behaviour.
    • Resource implementation is outside the scope of this document.
    • Resources may be shared or dedicated. e.g. Multiple clients might share a single DataWriter on the server.
  • Clients can only talk to each other via the DDS domain. i.e. Client 1 -> Server -> DDS domain -> Server -> Client 2. Multiple servers may also be involved.
  • Data can be sent as a single sample, a sequence of samples, either of these with metadata, or packaged data.
  • References to objects on the server can be made using a name (but it must be pre-defined?), an XML string, or a binary XCDR-serialised reference (although this is not available for all object types).
  • Clients can choose a QoS profile that is pre-defined on the server using a named reference. Or they can provide a QoS profile via DDS-XML that they wish the server to use for them. A combination of these is also possible.
  • All operations on the server are authenticated, and require a ClientKey. This is also used to identify clients.
    • Obviously authentication could be as broad as “anyone welcome”.
    • Creation and configuration of the ClientKey is out of scope (not great for interoperability).
  • Although the specification calls for authentication, it may be easy for a developer who is not careful and uses credentials widely to create clients that step on each other, messing with each other’s objects on the server.
  • In many ways this specification feels like a remote control for DDS, rather than a low-resource protocol and middleware in its own right.
  • The protocol is targeted at networks with a minimum of 40 Kbps of bandwidth, so you can give up on your 14.4 Kbps modem now.
  • A design goal of the protocol was that a complete implementation require “less than 100 KB of code”.
  • Clients absolutely cannot operate on their own; they must have the server available to function. No peer-to-peer communication is possible.
  • No vendor-neutral API is proposed.
  • The transport requirements are fairly strict. Fortunately most transports these days provide them. The requirements include:
    • Must be able to deliver messages of 64 bytes.
    • Message integrity must be guaranteed (but not reliability; messages may be dropped).
    • Must provide transport level security.
  • The protocol consists of a session, which carries one or more message streams with independent reliability settings. Each stream consists of ordered messages with sequence numbers so dropped messages can be detected and message order can be restored if the transport changes it.
  • The reliability setting of a stream is determined by the stream ID, rather than being a separate flag header or something like that. Streams with an ID in a certain range have a certain type of reliability. (Effectively the first bit of the stream ID is a flag for reliable or not.)
  • Each message contains one or more sub-messages.
    • This structure reduces some resource usage, e.g. a single header can apply to many sub-messages, or a single message can operate on multiple resources on the server.
  • The payloads of most submessages are XCDR-encoded binary data.
    • The payload can be up to 32 KB.
  • Message overhead is between 8 and 12 bytes, with an additional 4 bytes for every additional sub-message.
  • The interaction model is purposely simple, allowing for pre-configuration to replace DDS’s discovery, etc. It is possible to rapidly initiate a session and begin writing data, assuming the server is available, configured correctly and connected to the DDS domain.
  • A fairly well-thought-out heartbeat system is available to maintain reliable communication.
  • The discussion of overhead should have also considered low-overhead transports such as IEEE 802.15.4-based transports. TCP may be an average case, a good case, or a bad case for relative overhead but because no data is provided it is hard to say. (My own brief research suggests that TCP is not a good choice for evaluation.) Message overhead should be compared to the commonly expected payload size rather than the transport size, since the transport used is up to the implementer.
    • Some of the arguments against reducing overhead are not strong. Reducing the number of possible stream IDs (and thus the number of possible streams) is arguably not a problem; how many streams is a small device likely to need in the common use cases? 256 seems like a lot of data for a device when the common example of a DDS-XRCE device given is “a temperature sensor”. Needing 8 bits for the sub-message type to allow future evolution of the protocol smells like aligning things on an 8 bit boundary; dropping 4 bits would certainly leave only two slots for new sub-message types, but dropping 3 would leave 18 and dropping 2 would leave 50.
    • Ultimately the message overhead discussion comes down to knowing what the use case is. Does an extra byte here or there matter that much? For ROS, possibly not.
  • Sample message sizes:
    • 30 bytes to initiate a session.
    • 13 bytes to request to read a single sample of data, followed by 15 bytes reply for the (4 byte) sample.
    • 23 bytes to request multiple samples.
    • 47 bytes to receive a sequence of two 4-byte samples with meta-data (12 bytes per sample).
  • Although XML is syntactically more exact, a more compact and easier to process representation such as JSON could have been used instead. But, as noted, there is an existing DDS-XML specification so reusing it makes sense.
  • The demonstration implementation requires a microcontroller with 256 KB of RAM and running an operating system (NuttX). No demo with an OS-less microcontroller is mentioned. You won’t be running this on an Arduino.
  • The protocol is small and simple. It would be easy to implement (they state less than 2000 lines of code). It provides access to the entirety of DDS capabilities, which may be important for ROS, but it does so at the expense (in hardware and run-time costs) of needing a gateway server.
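
To put the overhead figures in the notes above into perspective, here is a rough Python sketch of the arithmetic. It assumes that the 8 to 12 byte figure already covers the first sub-message, and that 14 sub-message types are currently defined. That number is not stated in the notes; it is only what the "two slots", "18" and "50" figures imply. The function names are made up for illustration and are not part of either submission.

```python
def message_overhead(n_submessages, base=12):
    """Overhead in bytes of one message carrying n_submessages sub-messages.

    base is the 8 to 12 byte figure from the notes; each sub-message after
    the first is assumed to add 4 bytes.
    """
    if n_submessages < 1:
        raise ValueError("a message carries at least one sub-message")
    return base + 4 * (n_submessages - 1)


def overhead_fraction(payload_bytes, overhead_bytes):
    """Fraction of the bytes on the wire that are not user data."""
    return overhead_bytes / (payload_bytes + overhead_bytes)


def free_type_slots(bits_dropped, defined_types=14):
    """Sub-message type codes left over after shrinking the 8-bit type field.

    defined_types=14 is inferred from the figures in the notes, not quoted
    from the specification.
    """
    return 2 ** (8 - bits_dropped) - defined_types


if __name__ == "__main__":
    for n in (1, 2, 4):
        low, high = message_overhead(n, base=8), message_overhead(n)
        print(n, "sub-messages:", low, "to", high, "bytes of overhead")
    # The 15-byte reply carrying a 4-byte sample: 11 bytes are overhead.
    print("reply overhead fraction:", round(overhead_fraction(4, 15 - 4), 2))
    for dropped in (4, 3, 2):
        print("drop", dropped, "bits:", free_type_slots(dropped), "free slots")
```

For a 4-byte sample, roughly three quarters of the reply is overhead, which is why comparing overhead against the expected payload size says more than comparing it against the transport's own headers.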

PrismTech submission

  • Despite appearing to be a more complex protocol during the presentations in September, the specification itself is half the length. Fewer diagrams?
  • This submission is much more formalised than the other.
  • The three main goals of this submission are extremely low footprint (an Arduino Uno is cited), extremely efficient wire protocol (overhead of just a few bytes), and supporting devices that regularly sleep.
  • This submission pays no attention to the API. It is only interested in the wire protocol.
  • Discovery is supported, and is also a separate compliance point so vendors don’t have to implement it if their target platform is too small.
    • Static configuration is possible.
  • Resources are used to represent information to be exchanged, with properties of these available. Resources are identified by a URI; the properties are always accessed via a /property postfix to the URI.
    • Reliable is the default setting.
    • Durable and transient resources are also available.
    • A query syntax that allows filtering resources is provided. For example, all resources where a data member(?) is above a given value. This is equivalent to the DDS filter expression topic subscription.
  • This submission uses an interaction model fundamentally similar to DDS, with DDS-XRCE participants reading and writing data in a data space.
    • An implementation can use a set of brokers, or a pure peer-to-peer infrastructure, or a mixture. XRCE clients can exist and function without any kind of special server.
  • The message header is a single byte, with 5 bits for message ID. This allows up to 32 message types.
    • Messages may be decorated with additional markers.
    • Variable length encoding is used for things like message length and integers.
    • Sequences and strings also have an encoding specified; XCDR is apparently not used.
  • Message payload may be any size within the limits of the transport.
  • Following discovery (or startup for a static configuration), a session is established between every pair of XRCE applications talking to each other. Part of opening a session includes ensuring that both sides can handle the same range of sequence sizes to avoid sequence number roll-over problems.
    • Sessions are kept alive as long as a message is exchanged during the specified lease period. There is a keepalive message that can be used when nothing else is sent. Both sides must actively maintain the session.
    • Sessions can exist across multiple transports, so it is possible to have multiple connections at the transport level using different transports and merge them into a single session, allowing the best transport at the time to be used (e.g. UDP for best-effort data and TCP for reliable data).
    • Multiple sessions cannot exist on the same connection because sessions are uniquely identified by the locator (i.e. address of the client). However since multiple readers and writers can exist within a single session this is not a significant limitation.
  • Authentication is included in the protocol, but the details are left up to the implementation.
  • After establishing a session, resources can be created using special messages. An atomic approach is supported, with all resources being requested and then a final commit message being sent to actually trigger their creation.
  • Data samples can be sent singly, in a stream, or in batches.
  • Data can be pulled or pushed.
  • Data fragmentation is supported allowing samples of arbitrary size.
  • There is a message available for round-trip latency estimation.
  • It is not clear how sleep cycles combine with the peer-to-peer operation mode. If one client sleeps, then wakes up and asks for data from another client (which it couldn’t receive earlier due to being asleep) but the publisher of that data is asleep, the system will deadlock.
  • Sample message sizes:
    • 3 bytes for discovery probe.
    • 4 bytes plus data size for a data sample.
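
The one-byte header and the variable-length encoding described in the notes above can be sketched in Python as follows. The bit layout (message ID in the low 5 bits, the remaining 3 bits as flags) and the 7-bits-per-byte integer encoding are assumptions made for illustration. The notes only say that 5 bits are used for the message ID and that some variable-length encoding exists, so the submission's actual layout may well differ.

```python
MID_BITS = 5
MID_MASK = (1 << MID_BITS) - 1  # 0b00011111, room for 32 message types


def pack_header(message_id, flags=0):
    """Pack a 5-bit message ID and 3 flag bits into one byte (assumed layout)."""
    if not 0 <= message_id <= MID_MASK:
        raise ValueError("message ID must fit in 5 bits")
    if not 0 <= flags < (1 << (8 - MID_BITS)):
        raise ValueError("flags must fit in 3 bits")
    return (flags << MID_BITS) | message_id


def unpack_header(byte):
    """Return (message_id, flags) from a header byte."""
    return byte & MID_MASK, byte >> MID_BITS


def encode_vle(value):
    """Encode a non-negative integer using 7 data bits per byte.

    The high bit of each byte marks that another byte follows, so small
    values such as short lengths cost a single byte.
    """
    if value < 0:
        raise ValueError("only non-negative integers are supported")
    out = bytearray()
    while True:
        byte = value & 0x7F
        value >>= 7
        if value:
            out.append(byte | 0x80)
        else:
            out.append(byte)
            return bytes(out)


def decode_vle(data):
    """Decode one integer from the start of data; return (value, bytes_used)."""
    value = 0
    for i, byte in enumerate(data):
        value |= (byte & 0x7F) << (7 * i)
        if not byte & 0x80:
            return value, i + 1
    raise ValueError("truncated variable-length integer")


if __name__ == "__main__":
    header = pack_header(message_id=9, flags=0b101)
    print(unpack_header(header))
    for n in (4, 127, 128, 300):
        print(n, "->", encode_vle(n).hex())
```

With an encoding along these lines, any length below 128 costs one byte, which is the kind of saving that makes a total overhead of a few bytes per sample plausible.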

My thoughts

The PrismTech submission is undoubtedly more complex, but it is also undoubtedly more powerful - although how much more depends on your use case. Most significantly, it supports discovery and DDS-XRCE applications do not need a server running to communicate even amongst themselves. The RTIandCo submission, on the other hand, is simpler but does not support any form of P2P communication, requiring a server to always exist even if you only have DDS-XRCE applications. Both would require some kind of gateway (which is explicitly present in the RTIandCo submission) to talk to DDS-RTPS, but while the PrismTech one would require the data to be unpacked and repacked, the RTIandCo one probably would not because it uses XCDR for DDS-XRCE.

The PrismTech submission is superior for tiny-scale devices. There are many examples of these in use today, such as sensor motes. But for the ROS use case, are such tiny devices relevant? Which is more suitable for ROS is not a straightforward question. PrismTech’s submission is better suited as the basis for a standalone rmw implementation because it would not require that a server always be present. On the other hand, it lacks a lot of the QoS capability of DDS, which the RTIandCo submission supports. But the RTIandCo submission is more like rosserial, rather than the fully decentralised communications middleware that the PrismTech submission is. This doesn’t mean that an rmw couldn’t be built on top, but it would not be as straightforward to use, requiring additional functionality in roslaunch.

Based on the presentation, I got the impression that the PrismTech submission was very complex with many branching paths in processing a message, and the RTIandCo submission is relatively simple. Reading the specifications made clear that the RTIandCo submission is simple: it’s a simple protocol for a single task (proxying data between a DDS domain and a device). It would be easy to implement, but has drawbacks like needing a server for it to work at all. On the other hand, reading the PrismTech submission made clear that their protocol is not that complex. It’s not as simple as RTIandCo’s, but it’s straightforward, well thought out, and clearly designed for very small scale devices. Its decentralised nature would make it easier to use in a system where it is the only protocol in use, but if you want to mix RTPS and XRCE then you would need a gateway, and the gateway would necessarily be less efficient than that in the RTIandCo proposal. However, it would also be much less of a single point of failure.

A relevant question is, given that the PrismTech submission doesn’t support aspects of DDS like QoS (except for reliability), what is the benefit (aside from overhead) compared with using a subset of RTPS?
