I’ve finally found the time to read through both submissions (and just in time, too, with the OMG meeting next week!). Here are the notes I took as I was reading through them, along with some thoughts at the end.
- The RTIandCo submission is a client-server protocol that allows a resource-constrained device to interact with a DDS domain via a gateway (the “Agent” server).
- Use of the client-server (broker) architecture is what allows the low resource usage.
- The specification defines a simplified object model that acts as a facade to the standard DDS object model, enabling lower resource use to access a DDS domain.
- Most DDS configuration is assumed to be doable on the agent (DDS side), so configuration options on the XRCE side are limited. This contributes to the simplified object model.
- Access control, access rights, and managing disconnected clients are new features (over base DDS?) included in the facade object model.
- Management of disconnected devices is handled using a session concept that persists across connections between the client and the server (e.g. when the client goes to sleep).
- A pull mode is available for clients that do not want data coming in randomly. The client can query the object model on the server rather than changes being updated and pushed out in real-time.
- The specification can be used for anything from extremely simple, pre-configured clients up to fully capable DDS devices (why these cannot just use DDS is not made clear).
- The object model is resource-based: DDS-XRCE types, DataWriters, DataReaders and so on are represented as resources with a name, properties and behaviour.
- Resource implementation is outside the scope of this document.
- Resources may be shared or dedicated, e.g. multiple clients might share a single DataWriter on the server.
- Clients can only talk to each other via the DDS domain, i.e. Client 1 -> Server -> DDS domain -> Server -> Client 2. Multiple servers may also be involved.
- Data can be sent as a single sample, a sequence of samples, either of these with metadata, or packaged data.
- References to objects on the server can be made using a name (but it must be pre-defined?), an XML string, or a binary XCDR-serialised reference (although this is not available for all object types).
- Clients can choose a QoS profile that is pre-defined on the server using a named reference. Or they can provide a QoS profile via DDS-XML that they wish the server to use for them. A combination of these is also possible.
- All operations on the server are authenticated, and require a ClientKey. This is also used to identify clients.
- Obviously authentication could be as broad as “anyone welcome”.
- Creation and configuration of the ClientKey is out of scope (not great for interoperability).
- Although the specification calls for authentication, a careless developer who reuses credentials widely could easily create clients that step on each other, messing with each other’s objects on the server.
- In many ways this specification feels like a remote control for DDS, rather than a low-resource protocol and middleware in its own right.
- The protocol is targeted at networks with a minimum of 40 Kbps of bandwidth, so you can give up on your 14.4 Kbps modem now.
- A design goal of the protocol was that a complete implementation require “less than 100 KB of code”.
- Clients absolutely cannot operate on their own; they must have the server available to function. No peer-to-peer communication is possible.
- No vendor-neutral API is proposed.
- The transport requirements are fairly strict. Fortunately most transports these days provide them. The requirements include:
- Must be able to deliver messages of 64 bytes.
- Message integrity must be guaranteed (but not reliability; messages may be dropped).
- Must provide transport level security.
- The protocol consists of a session, which carries one or more message streams with independent reliability settings. Each stream consists of ordered messages with sequence numbers so dropped messages can be detected and message order can be restored if the transport changes it.
- The reliability setting of a stream is determined by the stream ID, rather than by a separate flag in the header or something like that. Streams with an ID in a certain range have a certain type of reliability. (Effectively the first bit of the stream ID is a flag for reliable or not.)
- Each message contains one or more sub-messages.
- This structure reduces some resource usage, e.g. a single header can apply to many sub-messages, or a single message can operate on multiple resources on the server.
- The payloads of most submessages are XCDR-encoded binary data.
- The payload can be up to 32 KB.
- Message overhead is between 8 and 12 bytes, with an additional 4 bytes for every additional sub-message.
- The interaction model is purposely simple, allowing for pre-configuration to replace DDS’s discovery, etc. It is possible to rapidly initiate a session and begin writing data, assuming the server is available, configured correctly and connected to the DDS domain.
- A fairly well-thought-out heartbeat system is available to maintain reliable communication.
- The discussion of overhead should have also considered low-overhead transports such as IEEE 802.15.4-based transports. TCP may be an average case, a good case, or a bad case for relative overhead but because no data is provided it is hard to say. (My own brief research suggests that TCP is not a good choice for evaluation.) Message overhead should be compared to the commonly expected payload size rather than the transport size, since the transport used is up to the implementer.
- Some of the arguments against reducing overhead are not strong. Reducing the number of possible stream IDs (and thus the number of possible streams) is arguably not a problem; how many streams is a small device likely to need in the common use cases? 256 streams seems like a lot for a device when the common example of a DDS-XRCE device given is “a temperature sensor”. Needing 8 bits for the sub-message type to allow future evolution of the protocol smells like aligning things on an 8-bit boundary; dropping 4 bits would certainly leave only two slots for new sub-message types, but dropping 3 would leave 18 and dropping 2 would leave 50.
- Ultimately the message overhead discussion comes down to knowing what the use case is. Does an extra byte here or there matter that much? For ROS, possibly not.
- Sample message sizes:
- 30 bytes to initiate a session.
- 13 bytes to request to read a single sample of data, followed by 15 bytes reply for the (4 byte) sample.
- 23 bytes to request multiple samples.
- 47 bytes to receive a sequence of two 4-byte samples with meta-data (12 bytes per sample).
- Although XML is syntactically more exact, a more compact and easier-to-process representation such as JSON could have been used instead. But, as noted, there is an existing DDS-XML specification so reusing it makes sense.
- The demonstration implementation requires a microcontroller with 256 KB of RAM and running an operating system (NuttX). No demo with an OS-less microcontroller is mentioned. You won’t be running this on an Arduino.
- The protocol is small and simple. It would be easy to implement (they state less than 2000 lines of code). It provides access to the entirety of DDS capabilities, which may be important for ROS, but it does so at the expense (in hardware and run-time costs) of needing a gateway server.
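Two of the wire-level points noted above for this first submission (reliability being encoded in the stream ID, and the per-message overhead figures) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names are hypothetical, and the choice that a *set* high bit means "reliable" is my assumption, since the notes only say that the first bit acts as the flag. The byte counts are the ones quoted in the notes.

```python
from __future__ import annotations

# Overhead figures quoted in the notes above.
BASE_OVERHEAD_MIN = 8           # bytes, message carrying one sub-message
BASE_OVERHEAD_MAX = 12          # bytes, message carrying one sub-message
EXTRA_SUBMESSAGE_OVERHEAD = 4   # bytes per additional sub-message


def is_reliable_stream(stream_id: int) -> bool:
    """Classify a stream by its 8-bit ID.

    ASSUMPTION: a set high bit means "reliable". The notes only say the
    first bit is effectively a reliability flag, not which polarity it has.
    """
    if not 0 <= stream_id <= 0xFF:
        raise ValueError("stream ID must fit in one byte")
    return bool(stream_id & 0x80)


def overhead_range(submessages: int) -> tuple[int, int]:
    """Return (min, max) overhead in bytes for a message with N sub-messages."""
    if submessages < 1:
        raise ValueError("a message carries at least one sub-message")
    extra = (submessages - 1) * EXTRA_SUBMESSAGE_OVERHEAD
    return BASE_OVERHEAD_MIN + extra, BASE_OVERHEAD_MAX + extra


def overhead_fraction(total_bytes: int, payload_bytes: int) -> float:
    """Fraction of a message that is not user payload."""
    if not 0 <= payload_bytes <= total_bytes or total_bytes == 0:
        raise ValueError("payload must fit inside a non-empty message")
    return (total_bytes - payload_bytes) / total_bytes


if __name__ == "__main__":
    # The 15-byte reply carrying a 4-byte sample, from the sample sizes above.
    print(f"{overhead_fraction(15, 4):.0%} of the reply is overhead")
    print("overhead for 3 sub-messages:", overhead_range(3))
```

Comparing overhead against the payload size in this way, rather than against the transport frame size, is the comparison argued for in the overhead notes above.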
- Despite the PrismTech submission appearing to be the more complex protocol during the presentations in September, the specification itself is half the length of the RTIandCo one. Fewer diagrams?
- This submission is much more formalised than the other.
- The three main goals of this submission are extremely low footprint (an Arduino Uno is cited), extremely efficient wire protocol (overhead of just a few bytes), and supporting devices that regularly sleep.
- This submission pays no attention to the API. It is only interested in the wire protocol.
- Discovery is supported, and is also a separate compliance point so vendors don’t have to implement it if their target platform is too small.
- Static configuration is possible.
- Resources are used to represent information to be exchanged, with properties of these available. Resources are identified by a URI; the properties are always accessed via a /property postfix to the URI.
- Reliable is the default setting.
- Durable and transient resources are also available.
- A query syntax that allows filtering resources is provided. For example, all resources where a data member(?) is above a given value. This is equivalent to the DDS filter expression topic subscription.
- This submission uses an interaction model fundamentally similar to DDS, with DDS-XRCE participants reading and writing data in a data space.
- An implementation can use a set of brokers, or a pure peer-to-peer infrastructure, or a mixture. XRCE clients can exist and function without any kind of special server.
- The message header is a single byte, with 5 bits for message ID. This allows up to 32 message types.
- Messages may be decorated with additional markers.
- Variable length encoding is used for things like message length and integers.
- Sequences and strings also have an encoding specified; XCDR is apparently not used.
- Message payload may be any size within the limits of the transport.
- Following discovery (or startup for a static configuration), a session is established between every pair of XRCE applications talking to each other. Part of opening a session includes ensuring that both sides can handle the same range of sequence number sizes to avoid sequence number roll-over problems.
- Sessions are kept alive as long as a message is exchanged during the specified lease period. There is a keepalive message that can be used when nothing else is sent. Both sides must actively maintain the session.
- Sessions can exist across multiple transports, so it is possible to have multiple connections at the transport level using different transports and merge them into a single session, allowing the best transport at the time to be used (e.g. UDP for best-effort data and TCP for reliable data).
- Multiple sessions cannot exist on the same connection because sessions are uniquely identified by the locator (i.e. address of the client). However since multiple readers and writers can exist within a single session this is not a significant limitation.
- Authentication is included in the protocol, but the details are left up to the implementation.
- After establishing a session, resources can be created using special messages. An atomic approach is supported, with all resources being requested and then a final commit message being sent to actually trigger their creation.
- Data samples can be sent singly, in a stream, or in batches.
- Data can be pulled or pushed.
- Data fragmentation is supported allowing samples of arbitrary size.
- There is a message available for round-trip latency estimation.
- It is not clear how sleep cycles combine with the peer-to-peer operation mode. If one client sleeps, then wakes up and asks for data from another client (which it couldn’t receive earlier due to being asleep) but the publisher of that data is asleep, the system will deadlock.
- Sample message sizes:
- 3 bytes for discovery probe.
- 4 bytes plus data size for a data sample.
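To make the compact framing in this second submission more concrete, here is a minimal Python sketch of a one-byte header with a 5-bit message ID, plus a variable-length integer encoding. Several things here are my assumptions rather than anything taken from the specification: that the message ID occupies the low 5 bits, that the remaining 3 bits act as flags, and that the variable-length encoding is a LEB128-style scheme (7 data bits per byte, high bit as a continuation marker). All function names are hypothetical.

```python
from __future__ import annotations

MESSAGE_ID_BITS = 5
MESSAGE_ID_MASK = (1 << MESSAGE_ID_BITS) - 1   # 0x1F, allows 32 message types
FLAGS_MAX = 0x07                               # ASSUMPTION: 3 leftover bits are flags


def pack_header(message_id: int, flags: int = 0) -> int:
    """Pack a message ID and flags into a single header byte (assumed layout)."""
    if not 0 <= message_id <= MESSAGE_ID_MASK:
        raise ValueError("message ID must fit in 5 bits")
    if not 0 <= flags <= FLAGS_MAX:
        raise ValueError("flags must fit in 3 bits")
    return (flags << MESSAGE_ID_BITS) | message_id


def unpack_header(header: int) -> tuple[int, int]:
    """Return (message_id, flags) from a header byte."""
    if not 0 <= header <= 0xFF:
        raise ValueError("header must be a single byte")
    return header & MESSAGE_ID_MASK, header >> MESSAGE_ID_BITS


def encode_varint(value: int) -> bytes:
    """LEB128-style encoding: 7 bits per byte, high bit set if more bytes follow.

    ASSUMPTION: the notes do not say which variable-length scheme is used.
    """
    if value < 0:
        raise ValueError("only non-negative integers are supported")
    out = bytearray()
    while True:
        byte = value & 0x7F
        value >>= 7
        if value:
            out.append(byte | 0x80)   # more bytes follow
        else:
            out.append(byte)          # final byte
            return bytes(out)


def decode_varint(data: bytes) -> tuple[int, int]:
    """Return (value, bytes_consumed) for a varint at the start of data."""
    value = 0
    shift = 0
    for index, byte in enumerate(data):
        value |= (byte & 0x7F) << shift
        if not byte & 0x80:
            return value, index + 1
        shift += 7
    raise ValueError("truncated varint")


if __name__ == "__main__":
    print(hex(pack_header(message_id=5, flags=2)))
    print(encode_varint(300).hex(), decode_varint(encode_varint(300)))
```

With an encoding of this kind, small lengths and sequence numbers cost a single byte, which is one plausible way a data sample could end up with only a few bytes of framing on top of the data itself.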
The PrismTech submission is undoubtedly more complex, but it is also undoubtedly more powerful, although how much more depends on your use case. Most significantly, it supports discovery, and DDS-XRCE applications do not need a server running to communicate amongst themselves. The RTIandCo submission, on the other hand, is simpler but does not support any form of P2P communication, requiring a server to always exist even if you only have DDS-XRCE applications. Both would require some kind of gateway (which is explicitly present in the RTIandCo submission) to talk to DDS-RTPS, but while the PrismTech one would require the data to be unpacked and repacked, the RTIandCo one probably would not because it uses XCDR for DDS-XRCE.
The PrismTech submission is superior for tiny-scale devices. There are many examples of these in use today, such as sensor motes. But for the ROS use case, are such tiny devices relevant? Which submission is more suitable for ROS is not a straightforward question. PrismTech’s submission is better suited to implementing ROS on top of it as a standalone rmw implementation, because it would not require that a server always be present. On the other hand, it lacks a lot of the QoS capability of DDS, which the RTIandCo submission supports. But the RTIandCo submission is more like rosserial, rather than the fully decentralised communications middleware that the PrismTech submission is. This doesn’t mean that an rmw couldn’t be built on top, but it would not be as straightforward to use, requiring additional functionality in
Based on the presentation, I got the impression that the PrismTech submission was very complex with many branching paths in processing a message, and the RTIandCo submission is relatively simple. Reading the specifications made clear that the RTIandCo submission is simple: it’s a simple protocol for a single task (proxying data between a DDS domain and a device). It would be easy to implement, but has drawbacks like needing a server for it to work at all. On the other hand, reading the PrismTech submission made clear that their protocol is not that complex. It’s not as simple as RTIandCo’s, but it’s straightforward, well thought out, and clearly designed for very small scale devices. Its decentralised nature would make it easier to use in a system where it is the only protocol in use, but if you want to mix RTPS and XRCE then you would need a gateway, and the gateway would necessarily be less efficient than that in the RTIandCo proposal. However, it would also be much less of a single point of failure.
A relevant question is, given that the PrismTech submission doesn’t support aspects of DDS like QoS (except for reliability), what is the benefit (aside from overhead) compared with using a subset of RTPS?