Design By Contract

Yes, it’s specification and model checking. Whether the effort required from potential users is prohibitive depends heavily on the domain and the environmental conditions they are working in. I think one should give every possible user as many optional technical possibilities to work with as possible. Whether users make use of the concepts offered is their choice.

I agree that static strong typing is very important. However, from an integration point of view I wouldn’t consider “Design By Contract” less important. DbC helps to avoid “higher level” interaction issues in addition to typing issues. But as DbC would require many features to be most effective, static strong typing could probably be achieved faster w.r.t. effort.

Freedom-from-choice supporters would probably disagree with you here.

From a maintenance pov this is also not a very popular sentiment.


I am one of those “Freedom-from-choice” supporters :wink:

I am a freedom-from-choice supporter as well :slight_smile: . I would stick to freedom-from-choice w.r.t. all “internals” of a framework. However, from a framework user’s perspective it is probably not always possible or reasonable in practice to be forced to define contracts.

For everyone who is interested in getting their hands dirty: there is a short introductory overview of ROS2, plus tutorials, from Erle Robotics in their docs.

I am interested in this topic as well. But I would prefer to discuss only DbC in this thread. If you can convince me that I should prioritize it higher than DbC, I will be with you :wink:

This does not fit into this thread either, but what is the benefit of Erlang VM integration?

For more information about the DDS DEADLINE QoS policy refer to page 95 in the DDS specification v1.4.

This relates to the way the message definition language is used in ROS. It defines data types, not node interfaces. Contracts are much more likely to be specific to node interfaces than to the messages, which are intended to be generic and highly reusable. Because ROS doesn’t currently have a node interface definition language, there is not yet a suitable place to specify contracts.

Furthermore, some contracts may be specific to a particular implementation of a node, and so wouldn’t fit in a node interface specification intended to be reused by many different implementations (although then I would argue that the nodes with different contracts should not be considered interchangeable and so should be using different interfaces).

The aim of contracts is to enforce complex dynamic properties of a system.
The aim of types is to enforce simple, (usually) static properties of a system.
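
To make the distinction concrete, here is a minimal sketch in plain Python. The `contract` decorator, `ContractViolation` and `clamp_speed` are hypothetical names made up for illustration, not part of ROS or of any DbC library. The type hints describe the static side, the decorator checks conditions on the actual values at run time.

```python
import functools


class ContractViolation(AssertionError):
    """Raised when a precondition or postcondition does not hold."""


def contract(pre=None, post=None):
    """Attach an optional precondition and postcondition to a function."""
    def decorate(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            # Precondition: checked on the caller's arguments.
            if pre is not None and not pre(*args, **kwargs):
                raise ContractViolation(f"precondition of {func.__name__} failed")
            result = func(*args, **kwargs)
            # Postcondition: checked on the result and the same arguments.
            if post is not None and not post(result, *args, **kwargs):
                raise ContractViolation(f"postcondition of {func.__name__} failed")
            return result
        return wrapper
    return decorate


# The type hints say "float in, float out" (a static property).
# The contract adds run-time conditions on the values themselves.
@contract(
    pre=lambda speed, limit: limit > 0.0,
    post=lambda result, speed, limit: -limit <= result <= limit,
)
def clamp_speed(speed: float, limit: float) -> float:
    return max(-limit, min(limit, speed))
```

A call such as `clamp_speed(1.0, -1.0)` passes any type checker but fails the precondition, which is the kind of property types alone usually do not capture.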

Therefore I share this point of view: haskell - Comparing design by contract to type systems - Stack Overflow
Notice how the complexity increases from top left to bottom right.

So, before trying to do something complex (which means heavy maintenance, and likely to be unused until it is perfectly optimized), I would focus on the doable, lighter side of things.

I also agree with @gbiggs and would first like to see more strictly enforced message types, before thinking about their combination in an IDL, how this would behave dynamically, and how to enforce some behaviors and prevent others…
Right now the message field type is too ambiguous (“node N can subscribe to a message M with an int field, but actually there will only ever be even numbers there…”, except the developer of N doesn’t know that unless they go through the code of all nodes publishing M).
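
As a rough illustration of the “only even numbers” case, the check could at least be made explicit at the subscriber boundary. This is a sketch in plain Python: `Counter`, `check_even` and `on_message` are hypothetical stand-ins, not ROS message classes or callbacks.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Counter:
    """Hypothetical stand-in for a message M with a single integer field."""
    value: int


def check_even(msg: Counter) -> Counter:
    """Reject messages that break the implicit 'value is always even' rule."""
    if msg.value % 2 != 0:
        raise ValueError(f"expected an even value, got {msg.value}")
    return msg


def on_message(msg: Counter) -> int:
    """Subscriber callback of node N: validate first, then use the value."""
    checked = check_even(msg)
    return checked.value // 2
```

The rule still lives in the subscriber's code rather than in the message definition, which is exactly the ambiguity described above.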

This does not fit into this thread either, but what is the benefit of Erlang VM integration?

Don’t try to reinvent the wheel, reuse 30 years of expertise in distributed system programming. There are already a bunch of people working on these questions in a distributed setting, and some tools available : Types (or lack thereof) | Learn You Some Erlang for Great Good!

By the way, isn’t this thread some kind of X-Y problem? Which problem exactly are you planning to solve with DbC?

Always nice to see another Erlang fan! That language is such a pleasure to program in.

Some would say that this is one thing contracts are meant to check… but I think that depends on where you draw the line between what is a contract and what is the type. But I think you are correct in saying that this sort of information really needs to be available and checkable. It is helpful to have in documentation but far more beneficial to developers (and safer) for it to be automatically checkable.

Types/Contracts: In my mind, ‘Design by Contract’ was an informal concept/practice introduced a few years ago because the type systems of most languages at that time were insufficient to guarantee correct program behavior. But it is fundamentally the same thing…
Except that we have researched type theory for a while now, whereas ‘contract theory’ is probably not what you would expect after learning about DbC…

These days I am following dependent types and experiments to bring them into distributed systems.


That summarizes the difference between types and contracts. And that is exactly why I didn’t propose types here: in my experience, the hard-to-find defects tend to have their root cause in implicit, incomplete or missing definitions of dynamic characteristics, which here in ROS means node interactions.

Unfortunately, that is exactly what I found when looking into the ROS2 sources. One could add deadlines for topics that either (a) do not change or (b) do change over node runtime; they could be (a) defined and/or (b) updated via the rmw C API, which wraps the DDS DynamicData API or the DDS functionality statically generated from the IDL definitions. However, as you said, the interface considers only the aspects of the message description language (IDL), not a node description language. And a node description language would be required to add the functionality that would be most beneficial.
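
To illustrate what such a deadline contract checks, here is a minimal sketch in plain Python. It deliberately does not use the rmw or DDS APIs; `DeadlineMonitor` and its methods are hypothetical names, and the timestamps are passed in by the caller instead of being read from a clock.

```python
class DeadlineMonitor:
    """Tracks whether samples on one topic arrive within a deadline period."""

    def __init__(self, period_s: float):
        self.period_s = 0.0
        self.last_stamp = None  # time of the previous sample, if any
        self.missed = 0         # number of deadline violations seen so far
        self.update_period(period_s)

    def update_period(self, period_s: float) -> None:
        """Case (b): the deadline may be changed while the node is running."""
        if period_s <= 0.0:
            raise ValueError("period_s must be positive")
        self.period_s = period_s

    def on_sample(self, stamp_s: float) -> bool:
        """Record a sample; return True if it arrived within the deadline."""
        on_time = True
        if self.last_stamp is not None:
            on_time = (stamp_s - self.last_stamp) <= self.period_s
            if not on_time:
                self.missed += 1
        self.last_stamp = stamp_s
        return on_time
```

A monitor like this only notices a late sample when the next one arrives; a real implementation would also need a timer to detect a publisher that has stopped completely.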

That’s right. The question should be: “How can I avoid introducing defects into, or detect defects in, distributed ROS systems when those defects have their root cause in the dynamic interaction of several ROS nodes?” I am biased and did not propose physical continuous integration because that seems hard to implement for distributed systems. DbC, or actually model checking based on some kind of node description language, seems cheaper to me.

I might be stating the obvious here, but still worth reminding everyone I think…

How can I avoid introducing defects into distributed ROS systems when those defects have their root cause in the dynamic interaction of several ROS nodes?

  • Don’t build a distributed (== multiprocess) system if you don’t have to. Programming language elements (functions, classes, libraries, packages) are made to compose correctly in all sorts of ways, and there is usually theoretical background, tooling, conventions and processes to help you deal with the cognitive biases you didn’t know you had. No distributed software system that lets you control the distribution graph currently has anything equivalent. ROS is no exception (actually Erlang might be the only exception).
    Example: a large part of operating system design is about preventing interactions between processes, and most recent OSes are preemptive. This is the opposite of what suits a distributed system, which by definition needs process cooperation, and where controlling when each process can or cannot be interrupted is really useful. In one process, in one language, all these problems vanish.

  • If you have to build a distributed system, congratulations, you are doing distributed system research. This is not robotics and there is a different set of assumptions coming along in that context.
    Example: most existing and widely used distributed software systems rely on the fact that a message, a unit of computation (a “task”), is atomic and idempotent. That requirement usually cannot be met on a robotic platform, because of side effects on the real world, which are the whole point of it. That was a painful lesson after a year of working on asmodehn/celeros (Celery ROS python interface) on GitHub.
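
The idempotence point can be sketched in a few lines of plain Python. `Joint`, `move_to`, `move_by` and `deliver` are hypothetical names for illustration only; the “queue” is just a loop that delivers the same task more than once, as a retrying task system might.

```python
from dataclasses import dataclass


@dataclass
class Joint:
    """Hypothetical one-dimensional actuator state."""
    position: float = 0.0


def move_to(joint: Joint, target: float) -> None:
    """Absolute command: applying it twice leaves the same state (idempotent)."""
    joint.position = target


def move_by(joint: Joint, delta: float) -> None:
    """Relative command: every redelivery moves the joint again (not idempotent)."""
    joint.position += delta


def deliver(task, joint: Joint, arg: float, times: int) -> float:
    """Simulate a queue that delivers the same task `times` times."""
    for _ in range(times):
        task(joint, arg)
    return joint.position
```

With two deliveries, `move_to` ends at the target while `move_by` overshoots by one full step. On a real robot even the absolute command has physical side effects while it executes, so this sketch understates the problem.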

For the rest of us, who have to deal with both distribution and real-world side effects, I feel the most promising way is still to integrate side effects into the theory. But, as far as I know, that is still a software research topic on its own.

Regarding ROS, the best bet is likely to integrate/interface/implement ROS with an existing programming language that provides the feature you need, instead of trying to integrate “that awesome language feature” into ROS (because that implies re-implementation and proactive maintenance by the ROS community for something that is not purely robotics related).

For DbC, I’m thinking that if you get around to implementing an Eiffel-based ROS interface/integration/implementation, you might find some interesting changes needed in ROS itself, even in REPs, in order to make that possible without compromising Eiffel. I’m thinking these changes would likely be worth it for ROS, especially in the long run.
Disclaimer: I’m currently following the same path with Python, improving how ROS integrates with it along the way, and finding basic problems where I didn’t expect to…

But ultimately, writing a “solid” software project is a matter of computer science and software engineering expertise, so general software theory, knowledge and tools apply there. It’s not a problem specific to robotics, and therefore robotic science and tools (like ROS) are not focusing on it.

ROS is a multi-process system at the robot/(multi-)processor level itself. I guess you mean a multi-device system instead of a multiprocess system. Right, distributed systems are beyond robotics. However, robotics and distributed systems have already merged into distributed robotics (Kiva robots in an Amazon warehouse in 2011). Isn’t it time to ease the development of such systems (and single robots) by providing better framework capabilities?

As Amazon already did non-real-time distributed robotics in 2011, I think considerations w.r.t. preemption and inter-process interaction beyond the robot level are more critical in real-time (soft/firm/hard) applications.

I would argue that if a framework implementation lacks conceptual features, this cannot be compensated for by choosing suitable programming languages, which address lower levels of abstraction only. But you are right, it is better to improve a framework w.r.t. its given capabilities instead of trying to integrate language features (at least in the short term).

Instead of waiting for more capabilities in ROS2, it is probably more valuable to add more capabilities to the current ROS1 framework. Even without DbC at the ROS node level, one can verify the ROS node interface with rostest. Currently the set of reusable test nodes is limited. What about adding more generic test nodes, like fake topic publishers (draft state), to ros_comm? (The example tests can be run with catkin_make run_tests_rostest_rostest_test_faketopicpublisher.test and catkin_make run_tests_rostest_rostest_test_faketopicpublisher0.test in the catkin workspace. See also the quick start guide about the other generic test nodes of rostest.)

Actually I do mean multiprocess. But I do not mean “We cannot make it work/do what we want” or even “We cannot make it do what we want all the time”. I mean “We cannot be sure that it will never do things that weren’t intended”.
And I know for a fact that there are more robots out in the world doing unexpected, dangerous things, because they weren’t programmed with total safety in mind, than most people know about. All it takes is an unchecked integer wrapping around, and disaster strikes in the real world. Make it distributed, and the likelihood of disaster increases exponentially.
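
For readers who have not been bitten by it yet, here is what an unchecked wrap-around looks like, sketched in plain Python. Python integers do not overflow, so `wrap_int32` (a hypothetical helper) emulates what a 32-bit signed field would end up holding, and `checked_add_int32` shows the alternative of refusing the result.

```python
INT32_MIN = -2**31
INT32_MAX = 2**31 - 1


def wrap_int32(value: int) -> int:
    """Reduce a Python int to the value a 32-bit signed field would hold."""
    return (value - INT32_MIN) % 2**32 + INT32_MIN


def checked_add_int32(a: int, b: int) -> int:
    """Add two values and refuse results that do not fit in int32."""
    result = a + b
    if not INT32_MIN <= result <= INT32_MAX:
        raise OverflowError(f"{a} + {b} does not fit in int32")
    return result
```

Incrementing the largest int32 value by one wraps to the smallest one, so a counter or setpoint jumps from a large positive number to a large negative one without any error being raised.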

So sure, we can build robots and distributed systems, but when it comes to a robot that can poke the eye of a child because it looks like a button to press, it’s different from a harmless backend database cluster in the basement, so you had better be sure of what you’re programming… and for distributed (multiprocess) systems the theory is quite new, so most languages/frameworks won’t help you there.

Definitely yes, but instead of adding potentially heavy features without being sure they will be used and maintained, I would first focus on doing something like https://jepsen.io/, that is, providing tools that show people working in robotics what and where the problems are in the systems they build. Actually, probably doing the same as what works for security hackers: tell people their system is broken/unsafe, and nobody cares. Make anyone (including their customers) able to break it, and then they react… and some might listen.

I personally fully agree with this statement, but I think most people are focusing on ROS2 these days, which means even less maintenance resource for ROS1, so we need to be careful that what we add is really worth it.

I would also agree there. You can always send a PR to add the test nodes you are missing to rostest, and discuss it with the maintainers :slight_smile:
And you can also write a package for the specific nodes you need. I started doing that for my own needs in GitHub - pyros-dev/pyros-test: Test package for Pyros.
But these days I am thinking we need something more like a ROS Simian Army :

  • some package that randomly kills and restarts nodes, probably based on launch files…
  • some package that randomly sends messages around, like a ros-hypothesis that would generate any valid message based on a ROS definition, to test your nodes against. I have already implemented most of this one, as part of other projects, but I still need to make it a package on its own, whenever I get the time and motivation…
  • probably a few more…
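
To give an idea of the second item, here is a rough sketch of random message generation using only the Python standard library. It is not the implementation mentioned above and does not use the Hypothesis library. It handles only a small subset of ROS primitive field types, ignores constants, arrays and nested messages, and returns a plain dict instead of a real message object; all function names are hypothetical.

```python
import random

# Value ranges for a few ROS integer field types.
INT_RANGES = {
    "int8": (-2**7, 2**7 - 1),
    "uint8": (0, 2**8 - 1),
    "int32": (-2**31, 2**31 - 1),
}


def random_field(type_name: str, rng: random.Random):
    """Draw one random value that is valid for the given field type."""
    if type_name in INT_RANGES:
        low, high = INT_RANGES[type_name]
        return rng.randint(low, high)
    if type_name == "bool":
        return rng.random() < 0.5
    if type_name == "float64":
        # Arbitrary finite range chosen for this sketch.
        return rng.uniform(-1e6, 1e6)
    if type_name == "string":
        length = rng.randint(0, 16)
        return "".join(rng.choice("abcdefghijklmnopqrstuvwxyz") for _ in range(length))
    raise ValueError(f"unsupported field type: {type_name}")


def parse_definition(text: str):
    """Parse lines of the form '<type> <name>' into (type, name) pairs."""
    fields = []
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and whitespace
        if line:
            type_name, field_name = line.split()
            fields.append((type_name, field_name))
    return fields


def random_message(definition: str, rng: random.Random) -> dict:
    """Build a dict with one random valid value per field of the definition."""
    return {name: random_field(t, rng) for t, name in parse_definition(definition)}
```

Passing a seeded `random.Random` keeps a failing test reproducible, which matters once the generated messages are fed to real nodes.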

I could not have illustrated a possible worst-case situation better than you did (I have to remember that one for the future). However, even if you work in an environment where your work could potentially lead to disastrous situations, you should change your mindset from “prevent/find every bug” (missing one could certainly lead to something like an injured child’s eye, but that cannot be ruled out with 100% certainty either) to “become better at preventing/finding the most important bugs”. That is better than suffering from depression.

Thanks for that hint.

Good to not provide a USB port…

I am not going to PR into ROS1 :wink: Right now I am fine with a fork of ros_comm/rostest for dummy, fake, spy and mock nodes. (If needed, I will consider your package as a template.)

Having something like that would be great.

I have started a new ROS package, roschaos. The package is at an early stage, but it can already be used to kill local ROS node processes randomly using a command line interface. To get feedback and proposals for improvement right from the beginning, I decided to make the project public at this early stage. However, there are a lot of features missing (refer to the issues). Feel free to contribute to get more features implemented.
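
For readers curious about the basic idea, random node killing can be sketched as below. This is not the roschaos implementation. `pick_victims` and `unleash` are hypothetical names, the `kill` callable is injected so the selection logic can be tested without a running ROS system, and protecting `/rosout` by default is an assumption of this sketch.

```python
import random
from typing import Callable, Iterable, List


def pick_victims(nodes: Iterable[str], count: int, rng: random.Random,
                 protected: Iterable[str] = ("/rosout",)) -> List[str]:
    """Choose up to `count` distinct node names, skipping protected ones."""
    candidates = sorted(set(nodes) - set(protected))
    return rng.sample(candidates, max(0, min(count, len(candidates))))


def unleash(nodes: Iterable[str], count: int, rng: random.Random,
            kill: Callable[[str], None]) -> List[str]:
    """Kill randomly chosen nodes through the injected `kill` callable."""
    victims = pick_victims(nodes, count, rng)
    for name in victims:
        kill(name)
    return victims
```

In a real setup the `kill` callable could, for example, wrap a call to `rosnode kill <name>`; in a test it can simply append to a list.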

BTW: Thanks @gavanderhoorn for your answers on answers.ros.org like this one which helped a lot to get started.