
Integration testing in ROS (2)


#1

Hello QA Community. This is a two-fold post.

We are currently working on a library of algorithms for autonomous driving (https://gitlab.com/AutowareAuto/AutowareAuto) based on ROS 2.

For this library we have documented how we do unit testing and static analysis (https://autowareauto.gitlab.io/AutowareAuto/how-to-write-tests-and-measure-coverage.html), and we are quite happy with that setup.

What we would like to improve, though, is integration testing. As a first step, we consolidated (https://gitlab.com/AutowareAuto/AutowareAuto/tree/master/src/tools/integration_tests) and documented how integration testing is currently done for ROS 2: https://autowareauto.gitlab.io/AutowareAuto/integration-testing.html.

However, there are several problems with the integration_test framework described there:

  1. The tested components cannot be started in a deterministic sequence, since the legacy launch system is being used.
  2. The framework lacks the ability to orchestrate the startup and coordinate the different components.
  3. There are many flaky tests, and it is hard to attribute the flakiness to bad tests, an unreliable testing framework, or a CI system that does not provide guarantees.

To improve on the above, we would like to propose the following:

  1. Move the integration_test framework to roslaunch2. This should take care of deterministic startup and state transitions within the nodes.
  2. Determine whether a test needs certain guarantees (e.g. timing). If yes, it should probably run on dedicated hardware; if not, a cloud CI like ci.ros2.org is OK. This should eliminate the concept of flaky tests: a test should either pass or fail.
  3. Add more automated debugging tools to the framework, e.g. tshark for capturing network packets, perf, memory tools, valgrind and other profiling tools.

We would like to hear your opinion on the above points, and especially what other features you would like to see in a framework for integration testing.


Secondly, we would like to propose additional types of unit and integration tests and get your thoughts on them.

Additional Integration Tests

  1. Fault injection tests
    These tests aim to increase code coverage by introducing faults into the code path, in particular into error-handling code paths, which are rarely executed in normal tests.

    There are three potential places where faults can be injected at runtime:

    1. Data source, where the data is collected, e.g. simulated sensor failure, corrupted data or duplicate data.
    2. Communication, e.g. UDP packets that get lost, are duplicated, or arrive out of order.
    3. System hardware, e.g. memory data corruption, an unstable time source or memory allocation failure.

    If fault injection tests are adopted, the injection code must be removed from release builds. Otherwise it could be used to mount an attack.
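
As a rough illustration of the communication case, here is a minimal sketch in plain Python (no ROS 2 dependency). The names `inject_faults` and `SequenceTracker` are hypothetical and not part of the integration_test framework; the packet layout (a 4-byte sequence number followed by a payload) is also an assumption. The injector sits outside the component under test, so nothing has to be stripped from the release build.

```python
import random
from typing import Iterable, List


def inject_faults(packets: Iterable[bytes], drop: float, duplicate: float,
                  reorder: float, seed: int = 0) -> List[bytes]:
    """Return a copy of `packets` with losses, duplicates and swaps injected."""
    rng = random.Random(seed)  # seeded so that a failing run can be replayed
    out: List[bytes] = []
    for packet in packets:
        if rng.random() < drop:
            continue  # simulate a lost packet
        out.append(packet)
        if rng.random() < duplicate:
            out.append(packet)  # simulate a duplicated packet
        if len(out) >= 2 and rng.random() < reorder:
            out[-1], out[-2] = out[-2], out[-1]  # simulate out-of-order arrival
    return out


class SequenceTracker:
    """Hypothetical receiver that has to tolerate a faulty packet stream."""

    def __init__(self) -> None:
        self.seen = set()
        self.duplicates = 0

    def receive(self, packet: bytes) -> None:
        seq = int.from_bytes(packet[:4], "big")
        if seq in self.seen:
            self.duplicates += 1  # error-handling path, rarely hit in normal tests
            return
        self.seen.add(seq)

    def missing(self, expected: int) -> int:
        return expected - len(self.seen)


# Assumed packet layout: 4-byte big-endian sequence number + payload.
packets = [i.to_bytes(4, "big") + b"payload" for i in range(1000)]
faulty = inject_faults(packets, drop=0.05, duplicate=0.05, reorder=0.05, seed=42)

tracker = SequenceTracker()
for p in faulty:
    tracker.receive(p)

print(f"missing={tracker.missing(1000)} duplicates={tracker.duplicates}")
```

Seeding the injector matters for the flakiness problem described earlier: the same seed reproduces the same fault pattern, so a failure can be replayed instead of being written off as flaky.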

  2. Random input tests
    ROS 2 nodes are tested against independent random input data. The output, if there is any, doesn’t have to be meaningful as long as the program handles the input correctly. An unexpected exception means there is a fault in the program. Random input tests also help avoid bias in the choice of test inputs.
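
A minimal sketch of this idea, assuming a hypothetical decoder `parse_point` that documents `ValueError` as its way of rejecting bad input. Both `parse_point` and the `fuzz` helper are made up for illustration; a real test would target a node's message-handling code instead.

```python
import random
import struct


def parse_point(data: bytes):
    """Hypothetical decoder: three little-endian float32 values (x, y, z)."""
    if len(data) != 12:
        raise ValueError(f"expected 12 bytes, got {len(data)}")
    x, y, z = struct.unpack("<3f", data)
    return x, y, z


def fuzz(target, runs: int, seed: int):
    """Feed random byte strings to `target` and collect unexpected exceptions."""
    rng = random.Random(seed)
    failures = []
    for _ in range(runs):
        length = rng.randint(0, 24)
        data = bytes(rng.getrandbits(8) for _ in range(length))
        try:
            target(data)  # the result is ignored; it need not be meaningful
        except ValueError:
            pass  # documented rejection of malformed input
        except Exception as exc:  # anything else is a fault in the program
            failures.append((data, exc))
    return failures


failures = fuzz(parse_point, runs=2000, seed=1)
print(f"{len(failures)} unexpected exceptions")
```

The test only distinguishes between "rejected in the documented way" and "crashed in an undocumented way", which is why the output values themselves are never checked.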

  3. Chaos tests
    In distributed systems, chaos tests are used to test the system’s ability to withstand turbulent conditions in production. They can be hardware-based or software-based, and they work best on systems with redundancy.

    1. Start the whole stack and define a “steady state” as normal behavior.
    2. Introduce failures that can happen in the real world, such as a full disk, a power outage or the network going down.
    3. Test whether the services of the other components continue uninterrupted or switch over to a redundant component.
      The harder it is to disrupt the steady state, the more confident we are in the robustness of the system. If a weakness is uncovered, we have a concrete target to improve.
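
The three steps can be sketched as a toy simulation. Everything here is hypothetical: `Replica` stands in for one redundant component, `Service` for a client that fails over between replicas, and "failure" is just a flag rather than a real disk, power or network fault.

```python
import random


class Replica:
    """Hypothetical stand-in for one redundant component."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.alive = True

    def handle(self, request: int) -> int:
        if not self.alive:
            raise ConnectionError(f"{self.name} is down")
        return request * 2


class Service:
    """Fails over to the next replica when one is unreachable."""

    def __init__(self, replicas) -> None:
        self.replicas = list(replicas)

    def call(self, request: int) -> int:
        for replica in self.replicas:
            try:
                return replica.handle(request)
            except ConnectionError:
                continue
        raise RuntimeError("no replica available")


def steady_state(service: Service) -> bool:
    """Step 1: normal behavior means every probe gets the right answer."""
    try:
        return all(service.call(i) == 2 * i for i in range(10))
    except RuntimeError:
        return False


def chaos_run(service: Service, failures: int, seed: int) -> bool:
    """Return True if the steady state survives `failures` injected outages."""
    rng = random.Random(seed)
    if not steady_state(service):
        return False
    for victim in rng.sample(service.replicas, failures):
        victim.alive = False  # step 2: inject a failure
        if not steady_state(service):  # step 3: is the service uninterrupted?
            return False
    return True


survives_two = chaos_run(Service([Replica(f"r{i}") for i in range(3)]), 2, seed=7)
survives_three = chaos_run(Service([Replica(f"r{i}") for i in range(3)]), 3, seed=7)
print(survives_two, survives_three)
```

With three replicas the steady state holds until the last one is taken down, which gives a concrete number for how much disruption the system tolerates.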

Additional Unit Tests

  1. Property-based tests
    Also known as QuickCheck-style tests. A property is defined here as a high-level specification of behavior that should hold for a wide range of data (ScalaTest). With property-based tests, developers don’t specify the test samples. Instead, they write the rule that must hold, and tools generate the test samples automatically and randomly.

    Example:
    https://hypothesis.readthedocs.io/en/latest/quickstart.html#writing-tests

    Very good article about property based test in Go.
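
To make the idea concrete without depending on a particular library, here is a hand-rolled sketch using only the Python standard library. It is not how Hypothesis is implemented; `for_all`, `int_lists` and the run-length codec are hypothetical names chosen for illustration. The property under test is a round trip: decoding an encoded list must return the original list.

```python
import random
from typing import List, Tuple


def rle_encode(values: List[int]) -> List[Tuple[int, int]]:
    """Run-length encode a list into (value, count) pairs."""
    encoded: List[Tuple[int, int]] = []
    for v in values:
        if encoded and encoded[-1][0] == v:
            encoded[-1] = (v, encoded[-1][1] + 1)
        else:
            encoded.append((v, 1))
    return encoded


def rle_decode(pairs: List[Tuple[int, int]]) -> List[int]:
    return [v for v, n in pairs for _ in range(n)]


def for_all(gen, prop, runs: int = 500, seed: int = 0):
    """Check `prop` on `runs` generated samples; return a counterexample or None."""
    rng = random.Random(seed)
    for _ in range(runs):
        sample = gen(rng)
        if not prop(sample):
            return sample
    return None


def int_lists(rng: random.Random) -> List[int]:
    """Generator: short lists over a small alphabet, so that runs are common."""
    return [rng.randint(0, 3) for _ in range(rng.randint(0, 20))]


# The rule of the test: decode(encode(xs)) == xs for every generated xs.
counterexample = for_all(int_lists, lambda xs: rle_decode(rle_encode(xs)) == xs)
print("counterexample:", counterexample)
```

The developer writes only the rule and the generator. Libraries such as Hypothesis additionally shrink a failing sample to a minimal counterexample, which this sketch does not attempt.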

  2. Mutation tests
    Certain statements in the code are changed to check whether the tests detect the error. This simulates typical coding mistakes such as a wrong operator or variable name.

    Types of mutation tests:

    • Statement mutation:
      Lines of code are cut or pasted. Most likely the result wouldn’t compile. This is highly manual rather than automated.
    • Value mutation:
      Values of primary parameters are modified.
    • Decision mutation:
      Control flow is reversed, e.g. by negating a condition.

    Good resource: https://www.guru99.com/mutation-testing.html
    A C++ mutation framework: https://github.com/nlohmann/mutate_cpp
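
A minimal sketch of an automated "wrong operator" mutation, written in Python with the standard `ast` module rather than with mutate_cpp. The function `in_range`, the `RelaxBoundary` transformer and both test suites are hypothetical. The mutant replaces `<=` with `<`; a test suite without a boundary case lets the mutant survive, while one with a boundary case kills it.

```python
import ast

SOURCE = """
def in_range(distance, max_range):
    return distance <= max_range
"""


class RelaxBoundary(ast.NodeTransformer):
    """Mutation operator: replace every `<=` with `<`."""

    def visit_Compare(self, node):
        self.generic_visit(node)
        node.ops = [ast.Lt() if isinstance(op, ast.LtE) else op for op in node.ops]
        return node


def load(tree):
    """Compile a module AST and return its `in_range` function."""
    namespace = {}
    exec(compile(tree, "<generated>", "exec"), namespace)
    return namespace["in_range"]


def passes(tests, fn) -> bool:
    return all(fn(*args) == expected for args, expected in tests)


original = load(ast.parse(SOURCE))
mutant_tree = ast.fix_missing_locations(RelaxBoundary().visit(ast.parse(SOURCE)))
mutant = load(mutant_tree)

weak_tests = [((5.0, 10.0), True), ((15.0, 10.0), False)]
strong_tests = weak_tests + [((10.0, 10.0), True)]  # adds the boundary case

killed_by_weak = not passes(weak_tests, mutant)
killed_by_strong = not passes(strong_tests, mutant)
print(f"killed by weak suite: {killed_by_weak}, by strong suite: {killed_by_strong}")
```

A surviving mutant points at a gap in the tests rather than in the code, which is what makes the technique useful as a measure of test quality.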

We would like to start adding the above tests for a LiDAR perception stack in AutowareAuto as a proof of concept and then make them part of ci.ros2.org. We would like to hear your opinion:

  1. Do the above tests make sense?
  2. Are any particular flavors of tests missing?
  3. Is it possible to pack them all into one single framework (we struggle with this thought, honestly)?
  4. What other use cases do you have that need better integration and unit testing?
  5. Are there other integration testing frameworks from the non-robotics world worth inspecting?

#2

I do not have experience with ROS 2, but your points regarding integration testing sound valid.

Do the above tests make sense?

Yes, at least for the major part. I have been working on random testing and property-based testing myself, at the node/integration level, although for ROS 1.

I had this idea of expanding mutation testing in ROS, in which the ROS primitives could be mutated themselves in a meaningful way (i.e. redirecting topics, messing with queue sizes, changing callback functions to a simple skip or abort), although I am not sure how useful that would be in practice.

Is it possible to pack them all into one single framework?

Is there any advantage to packing all these different kinds of tests into a single framework? It could prove hard to do so in a way that is intuitive for the user.


#3

@mkhansen could you provide some input here, since you are currently doing integration tests for the navigation stack?