Design By Contract

I am interested in this topic as well, but I would prefer to discuss only DbC in this thread. If you can convince me to prioritize it higher than DbC, I will be with you :wink:

This does not fit into this thread either, but what is the benefit of Erlang VM integration?

For more information about the DDS DEADLINE QoS policy refer to page 95 in the DDS specification v1.4.

This relates to the way the message definition language is used in ROS. It defines data types, not node interfaces. Contracts are much more likely to be specific to node interfaces than to the messages, which are intended to be generic and highly reusable. Because ROS doesn’t currently have a node interface definition language, there is not yet a suitable place to specify contracts.

Furthermore, some contracts may be specific to a particular implementation of a node, and so wouldn’t fit in a node interface specification intended to be reused by many different implementations (although then I would argue that the nodes with different contracts should not be considered interchangeable and so should be using different interfaces).

The aim of contracts is to enforce complex dynamic properties of a system.
The aim of types is to enforce simple (usually) static properties of a system.

Therefore I share this point of view: “Comparing design by contract to type systems” (Stack Overflow, tagged haskell).
Notice how the complexity increases from top left to bottom right.

So, before trying to do something complex (which means heavy maintenance, and likely to be unused until it is perfectly optimized), I would focus on the doable, lighter side of things.

I also agree with @gbiggs and would first like to see more strictly enforced message types, before thinking about their combination in an IDL, how this would behave dynamically, and how to enforce some behaviors and prevent others…
Right now the message field type is too ambiguous (“node N can subscribe to a message M with a field int, but actually there will only ever be even numbers there…” except that the developer of N doesn’t know that, unless they go through the code of all nodes publishing M).
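To make the “only even numbers” example concrete, here is a minimal Python sketch of what a checkable field constraint could look like. It is not a ROS API: `EvenCount`, `ContractViolation` and `is_valid` are hypothetical names invented for illustration, and the message is modelled as a plain dataclass rather than a generated ROS message class.

```python
from dataclasses import dataclass


class ContractViolation(ValueError):
    """Raised when a message breaks the constraint stated by its publisher."""


@dataclass(frozen=True)
class EvenCount:
    """Hypothetical message M with a single int field that must be even."""

    value: int

    def __post_init__(self):
        # bool is a subclass of int in Python, so reject it explicitly.
        if not isinstance(self.value, int) or isinstance(self.value, bool):
            raise ContractViolation("value must be an int")
        if self.value % 2 != 0:
            raise ContractViolation(f"value must be even, got {self.value}")


def is_valid(raw):
    """Return True when raw satisfies the EvenCount constraint."""
    try:
        EvenCount(raw)
    except ContractViolation:
        return False
    return True
```

With something like this, the subscriber side no longer has to read every publisher's code: the constraint travels with the message definition and fails loudly at construction time.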

This does not fit into this thread either, but what is the benefit of Erlang VM integration?

Don’t try to reinvent the wheel, reuse 30 years of expertise in distributed system programming. There are already a bunch of people working on these questions in a distributed setting, and some tools available : Types (or lack thereof) | Learn You Some Erlang for Great Good!

By the way, isn’t this thread some kind of XY problem? Which problem exactly are you planning to solve with DbC?

Always nice to see another Erlang fan! That language is such a pleasure to program in.

Some would say that this is one thing contracts are meant to check… but I think that depends on where you draw the line between what is a contract and what is the type. But I think you are correct in saying that this sort of information really needs to be available and checkable. It is helpful to have in documentation but far more beneficial to developers (and safer) for it to be automatically checkable.

Types/Contracts: In my mind, ‘Design by Contract’ was an informal concept/practice introduced a few years ago because the type systems of most languages at that time were insufficient to guarantee correct program behavior. But it is fundamentally the same thing…
Except that we have researched type theory for a while now, whereas ‘contract theory’ is probably not what you would expect after learning about DbC…

These days I am following dependent types and experiments to bring them into distributed systems.


That summarizes the difference between types and contracts. And that is exactly why I didn’t propose types here: in my experience, hard-to-find defects tend to have their root cause in implicit, incomplete or missing definitions of dynamic characteristics; here in ROS, of node interactions.

Unfortunately that is exactly what I found out when looking into the ROS2 sources. One could add deadlines for topics that (a) do not change or (b) do change over node runtime; they could be (a) defined and/or (b) updated via the rmw C API, which wraps the DDS DynamicData API or the statically generated DDS functionality from the IDL definitions. However, as you said, the interface considers only the aspects of the message description language (IDL), not a node description language. And a node description language would be required to add the functionality that would be most beneficial.

That’s right. The question should be: “How can I avoid introducing defects into, or detect defects in, distributed ROS systems which have their root cause in the dynamic interaction of several ROS nodes?” I am biased and did not propose physical continuous integration because that seems hard to implement for distributed systems. DbC, or actually model checking based on a kind of node description language, seems cheaper to me.

I might be stating the obvious here, but still worth reminding everyone I think…

How can I avoid introducing defects into distributed ROS systems which have their root cause in the dynamic interaction of several ROS nodes?

  • Don’t build a distributed (==multiprocess) system if you don’t have to. Programming language elements (functions, classes, libraries, packages) are made for composing correctly in all sorts of ways, and there is usually theoretical background, tooling, conventions, processes, to help you satisfy the cognitive biases you didn’t know you had. No distributed software system that allows you to control the distribution graph has anything equivalent to that currently. ROS is no exception (actually Erlang might be the only exception).
    Example: A whole part of operating system design is about preventing process interactions, and most recent OSes are preemptive. This is opposed to the features suitable for a distributed system, which by definition needs process cooperation, and where controlling when each process can be interrupted, or not, is really useful. In one process, in one language, all these problems vanish.

  • If you have to build a distributed system, congratulations, you are doing distributed system research. This is not robotics and there is a different set of assumptions coming along in that context.
    Example: most existing and widely-used distributed software systems rely on the fact that a message, a unit of computation (“task”), is atomic and idempotent. That requirement usually cannot be met on a robotic platform, because of side effects on the real world, which are the whole point of it. A painful lesson after a year working on asmodehn/celeros (a Celery ROS Python interface, on GitHub).

For the rest of us having to do both distribution and real-world side effects, I feel the most promising way is still integrating side effects into the theory. But, as far as I know, it is still a software research topic on its own.

Regarding ROS, the best bet is likely to integrate/interface/implement ROS with an existing programming language that provides the feature you need, instead of trying to integrate “that awesome language feature” into ROS (because that implies re-implementation and proactive maintenance from the ROS community for something that is not purely robotics related).

For DbC, I’m thinking that if you get around to implementing an Eiffel-based ROS interface/integration/implementation, you might find some interesting changes needed in ROS itself, even in REPs, in order to make that possible without compromising Eiffel. I’m thinking these changes would likely be worth it for ROS, especially in the long run.
Disclaimer: I’m currently following the same path with Python, improving how ROS integrates with it along the way, and finding basic problems where I didn’t expect to…

But ultimately, writing a “solid” software project is a matter of computer science and software engineering expertise, so general software theory, knowledge and tools apply there. It’s not a problem specific to robotics, and therefore robotic science and tools (like ROS) are not focusing on it.

ROS is a multi-process system at the robot/(multi-)processor level itself. I guess you mean multi-device system instead of multiprocess system. Right, distributed systems are beyond robotics. However, robotics and distributed systems have already merged into distributed robotics (Kiva robots in an Amazon warehouse in 2011). Isn’t it time to ease the development of such systems (and single robots) by providing better framework capabilities?

As Amazon already did non-real-time distributed robotics in 2011, I think considerations w.r.t. preemption and inter-process interaction beyond the robot level are more critical in real-time (soft/firm/hard) applications.

I would argue that if a framework implementation lacks conceptual features, this cannot be compensated for by the choice of suitable programming languages, which address lower levels of abstraction only. But you are right, it is better to improve a framework w.r.t. its given capabilities instead of trying to integrate language features (at least in the short term).

Instead of waiting for more capabilities in ROS2, it is probably more valuable to add more capabilities to the current ROS1 framework. Even without DbC at the ROS node level, one can verify the ROS node interface with rostest. Currently the set of reusable test nodes is limited. What about adding more generic test nodes like fake topic publishers (draft state) to ros_comm? (The example tests can be run with catkin_make run_tests_rostest_rostest_test_faketopicpublisher.test and catkin_make run_tests_rostest_rostest_test_faketopicpublisher0.test in the catkin workspace. There is also a quick start guide about the other generic test nodes of rostest.)

Actually I do mean multiprocess. But I do not mean “We cannot make it work/do what we want” or even “We cannot make it do what we want all the time”. I mean “We cannot be sure that it will never do things that weren’t intended”.
And I know for a fact that there are more robots out in the world doing unexpected dangerous things, because they weren’t programmed with total safety in mind, than most people know about. All it takes is an unchecked integer wrapping around, and disaster strikes in the real world. Make it distributed, and the likelihood of disaster increases exponentially.
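As an illustration of the unchecked-integer case, the sketch below simulates 32-bit two's-complement wraparound in Python (whose own integers never wrap) and contrasts it with a checked addition. The helper names `wrap_int32` and `checked_add_int32` are made up for this example and do not come from ROS or any library.

```python
INT32_MIN = -2**31
INT32_MAX = 2**31 - 1


def wrap_int32(x):
    """Reduce x to the value a 32-bit two's-complement integer would hold."""
    return (x + 2**31) % 2**32 - 2**31


def checked_add_int32(a, b):
    """Add two values, raising instead of silently wrapping around."""
    result = a + b
    if not INT32_MIN <= result <= INT32_MAX:
        raise OverflowError(f"{a} + {b} does not fit into an int32")
    return result
```

Incrementing the largest int32 silently yields the smallest one with the wrapping helper, whereas the checked variant turns the same situation into an error a node can handle or at least log.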

So sure, we can build robots and distributed systems, but when it comes to a robot that can poke the eye of a child because it looks like a button to press, it’s different from a harmless backend database cluster in the basement, so you had better be sure of what you’re programming… and for distributed (multiprocess) systems the theory is quite new, so most languages/frameworks won’t help you there.

Definitely yes, but instead of adding potentially heavy features without being sure they will be used and maintained, I would first focus on doing something like https://jepsen.io/, that is, providing tools that show people working in robotics what and where the problems are in the systems they build. Actually, probably doing the same as what works for security hackers: tell people their system is broken/unsafe, and nobody cares. Make anyone (including their customers) able to break it, and then they react… and some might listen.

I personally fully agree with this statement, but I think most people are focusing on ROS2 these days, which means even less maintenance resource for ROS1, so we need to be careful that what we add is really worth it.

I would also agree there. You can always send a PR to add the test nodes you are missing to rostest, and discuss it with the maintainers :slight_smile:
And you can also write a package for the specific nodes you need. I started doing that for my own needs in pyros-dev/pyros-test (a test package for Pyros, on GitHub).
But these days I am thinking we need something more like a ROS Simian Army :

  • some package that randomly kills and restarts nodes, probably based on launch files…
  • some package that randomly sends messages around, like a ros-hypothesis that would generate any valid message based on a ROS definition, to test your nodes against. I have already implemented most of this one, as part of other projects, but I still need to make it a package on its own, whenever I get the time and motivation…
  • probably a few more…
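A rough idea of the second item, sketched with the Python standard library only. This is a stand-in, not hypothesis and not a ROS API: the message is generated as a plain dict, only lines of the form `type name` are understood (no constants, arrays or nested messages), and all function names are hypothetical.

```python
import random

# Value ranges of a few primitive integer types used in message definitions.
INT_RANGES = {
    "int8": (-2**7, 2**7 - 1),
    "uint8": (0, 2**8 - 1),
    "int32": (-2**31, 2**31 - 1),
    "uint32": (0, 2**32 - 1),
}


def parse_definition(text):
    """Return (type, name) pairs from lines such as 'int32 count'."""
    fields = []
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and blank lines
        if not line:
            continue
        type_name, field_name = line.split()
        fields.append((type_name, field_name))
    return fields


def random_value(type_name, rng):
    """Draw one random value that is valid for the given primitive type."""
    if type_name in INT_RANGES:
        low, high = INT_RANGES[type_name]
        return rng.randint(low, high)
    if type_name == "bool":
        return rng.choice([True, False])
    if type_name == "float64":
        return rng.uniform(-1e6, 1e6)
    if type_name == "string":
        letters = "abcdefghijklmnopqrstuvwxyz"
        return "".join(rng.choice(letters) for _ in range(rng.randint(0, 16)))
    raise ValueError(f"unsupported type: {type_name}")


def random_message(definition, rng):
    """Build a dict with one random valid value per field of the definition."""
    return {name: random_value(t, rng) for t, name in parse_definition(definition)}
```

Passing a seeded `random.Random` instance as `rng` keeps a failing run reproducible, which matters once such messages are sent to real nodes.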

I cannot visualize a possible bad-case situation better than you did (I have to remember that one for the future). However, even if you work in an environment where your work could potentially lead to disastrous situations, you should change your mindset from “prevent/find every bug” (a bug could certainly lead to something like a displaced child’s eye, but just as certainly this cannot be prevented with 100% probability) to “become better at preventing/finding the most important bugs”… better than suffering a depression.

Thanks for that hint.

Good to not provide a USB port…

I am not going to PR into ROS1 :wink: Right now I am fine with a fork of ros_comm/rostest for dummy, fake, spy and mock nodes. (In any case, I will consider your package as a template.)

Having something like that would be great.

I started a new ROS package, roschaos. The package is at an early stage, but it can already be used to kill local ROS node processes randomly using a command line interface. To get feedback and proposals for improvement right from the beginning, I decided to make the project public at this early stage. However, a lot of features are still missing (refer to the issues). Feel free to contribute to get more features implemented.

BTW: Thanks @gavanderhoorn for your answers on answers.ros.org like this one which helped a lot to get started.

I put generic test nodes which act as dummy or fake nodes for faking, not verifying (according to the general terminology of test doubles in software engineering), into a package rosfake. However, integrating custom verification nodes into rostest does not seem to be straightforward, according to the comment on this question on answers.ros.org. @asmodehn What is your approach to integrating custom verification nodes?

It makes sense to put spy and mock nodes for verifying into a package rosmock. However, making something like that generic is harder because, e.g., the package depends on the test framework used to assert.

What license do you use for pyros-test? What test framework do you use to assert? If one used Python unittest, which is widely supported for integration into other frameworks, one could think about merging generic verification nodes into a kind of rosmock. PRs into rostest take too long to be accepted (if they get accepted at all). I think the functionality of rostest (providing the test framework) and the verification part (generic fake nodes → rosfake, generic verification → rosmock) would better be separated anyway…

Quick replies :

  • pyros-test is MIT or BSD, along these lines, I just haven’t taken the time to put a license file there… I’ll try to do it soon.
  • Python makes things simpler than C++ as there are de facto standard test frameworks. So I am trying to support whatever basic Python and ROS support (unittest, doctest, and nose) and pytest. By the way, unittest already includes a mock library in Python 3 (unittest.mock), which is just the mock library from Python 2.
  • pyros-test is currently very very simple (probably too simple to need a separate package), and I’d like to eventually improve it when I get the chance…

But IMHO you’re probably better off extracting what I already started in https://github.com/pyros-dev/pyros-msgs and https://github.com/pyros-dev/pyros-schemas : have a look at property-based testing and hypothesis, it should be simple enough to automatically generate fake nodes based on an existing message definition and then send fake messages around :slight_smile: .

Some example of hypothesis use here : https://github.com/pyros-dev/pyros-schemas/blob/nested_merged/tests/test_pyros_schemas/hypothesis_example.py and an example of generating messages with it there https://github.com/pyros-dev/pyros-schemas/blob/nested_merged/tests/test_pyros_schemas/test_basic_fields.py

By the way, roschaos looks fun, and there is probably a clever way to integrate it with roslaunch or feed it launch/test files… I’ll play with it when I get some time.


Wow, a ros-hypothesis could be very powerful and effective. I knew about property-based testing (e.g. RapidCheck for C++) but had not thought about adapting it to ROS so far. Creating a framework for property-based ROS node testing could be a hard task, I guess. What use cases are you thinking about exactly? I would love to contribute to it :blush:

While I’m a big fan of automatic checking, I wonder which problem(s) this proposal tries to solve.

This is not to say there are no problems. I just think it would help the discussion a lot if we knew what people here are interested in, in terms of outward behavior of system.

The proposed contracts relate to a) rates and b) response times. From my own work, I know that rate information is necessary but not sufficient for determining whether a system can have sampling effects. I am also strongly of the opinion that rates should be a property of a system, not a component.

I don’t know of much utility, but a great deal of problems, in specifying response times for a distributed system, particularly one with very little real-time support, such as ROS or ROS2. If you really care about that, put your stuff into one process and ensure response times by the usual means, which have little to do with ROS.

Initially my intention in starting this thread was to discuss if and how formal specification and formal verification of ROS 2 node/nodelet interfaces by means of DbC could be applied to/implemented in ROS 2. I am interested in DbC because it has the big advantage that it could speed up the integration of ROS 2 nodes/nodelets within a system, because it avoids struggling with bugs related to the interface-based interaction between nodes/nodelets. However, I do not consider DbC a measure for formal verification in the testing context but in the debugging context instead, because it can hardly be accurate enough w.r.t. timing, as you said, especially in real-time systems. (I do not know of any tracing tools for ROS, which are the usual tools for real-time related verification.) In the debugging context a comparably rough estimate is often sufficient. Assuming “system” means the overall sum of ROS nodes/nodelets, this thread is not about “outward behavior of system” but its “internal” integration only. If your system exposes ROS interfaces which interact with another system, DbC could address the “outward behaviour” of the single sub-systems.

As DbC would be hard to implement in ROS 2 and its benefits are not considered relevant enough in comparison to other quality-improving measures, like developing and using tools such as a “ROS Simian Army”, the thread turned toward the question of what tools are missing and could be helpful to verify a ROS-based system. (Not in terms of verifying real-time behaviour.)

For me the proposed contracts are more about valid and invalid values/value ranges of the node/nodelet interfaces like topics.

However w.r.t. rates and response times I would consider different “classes”:

  1. the “incoming” rates one node/nodelet expects to receive from other nodes, the node’s/nodelet’s “outgoing” rates which are expected by other nodes, and the response time of one node
  2. the rates and response time of a component
  3. the rates and response time of a system

If 1) the nodes/nodelets do not satisfy “rough” rate or response time requirements, there is a pretty good chance that 2) the component or 3) the system will not behave the way you would like it to either.
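A “rough” rate requirement on the node level (class 1 above) could be checked after the fact from the receive timestamps of a topic, in the spirit of a deadline-style check. The following is only a sketch under that assumption: timestamps are plain floats in seconds, and `deadline_violations` and `satisfies_rate` are hypothetical helpers, not part of ROS or DDS.

```python
def deadline_violations(stamps, max_period):
    """Return (previous, current) timestamp pairs whose gap exceeds max_period."""
    return [(a, b) for a, b in zip(stamps, stamps[1:]) if b - a > max_period]


def satisfies_rate(stamps, min_hz):
    """Return True when no gap between consecutive stamps exceeds 1 / min_hz."""
    if min_hz <= 0:
        raise ValueError("min_hz must be positive")
    return not deadline_violations(stamps, 1.0 / min_hz)
```

For example, `deadline_violations([0.0, 0.5, 1.0, 2.5], 1.0)` reports the single gap between 1.0 and 2.5. Such a check says nothing about hard real-time guarantees, it only flags node-level gaps that are large enough to be worth debugging.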

“I am also strongly of the opinion that rates should be a property of a system, not a component.” → People having a background in safety critical, real-time embedded system development could disagree here.

If we care about rates and response times we are already putting nodelets into a single process if this is possible. (If nodes are distributed over different machines I do not know about any way to improve rates and response times anyway.)