Adding clang thread safety analysis for ROS2 core packages

I’ve hit a wall enabling this, given the following constraint:

  • we do not want to make our own Mutex wrapper, in favor of using std::mutex

High level problem -

  1. To get the capability annotations on std::mutex, I need to use libc++ (LLVM)
  2. On Linux, by default, all libraries use libstdc++ (GNU)
  3. The build, using libc++ on an annotated package, can succeed just fine, BUT
  4. Any std objects passed across .so boundaries into this library now have runtime undefined behavior, because you call libc++ functions on libstdc++-created objects
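
For context, here is a minimal sketch of what the analysis looks like on a plain std::mutex. The Counter class and the TSA_GUARDED_BY macro are made up for illustration and are not taken from any ROS2 package. The build line in the comment is an assumed invocation. As far as I know, libc++ only exposes its annotations when _LIBCPP_ENABLE_THREAD_SAFETY_ANNOTATIONS is defined.

```cpp
// Assumed invocation, adjust to your toolchain:
//   clang++ -std=c++17 -stdlib=libc++ -Wthread-safety \
//     -D_LIBCPP_ENABLE_THREAD_SAFETY_ANNOTATIONS -c counter.cpp
#include <mutex>

// Hypothetical helper macro: expands to nothing on non-clang compilers.
#if defined(__clang__)
#define TSA_GUARDED_BY(x) __attribute__((guarded_by(x)))
#else
#define TSA_GUARDED_BY(x)
#endif

// Hypothetical example class, not part of any ROS2 package.
class Counter
{
public:
  void increment()
  {
    std::lock_guard<std::mutex> lock(mutex_);
    ++value_;
  }

  int read() const
  {
    std::lock_guard<std::mutex> lock(mutex_);
    return value_;
  }

  // Reads value_ without taking mutex_. With an annotated std::mutex this is
  // the kind of access the analysis is meant to flag at compile time.
  int unsafe_read() const
  {
    return value_;
  }

private:
  mutable std::mutex mutex_;
  int value_ TSA_GUARDED_BY(mutex_) = 0;
};
```

With libstdc++ the same code still compiles, but std::mutex carries no capability attribute there, so the guarded_by relationship cannot be checked.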

Specific example case that I ran into -

  • I add thread safety annotation to rmw_fastrtps_shared_cpp
    • link it to libc++ to have std::mutex be analyzable
  • upstream, an rclcpp test creates a Node, which eventually resolves to fastrtps_shared_cpp::rmw_node.cpp::__rmw_create_node, which:
    • creates a ParticipantAttributes from fastrtps (which has not been modified, so it uses libstdc++ std::vector creation)
    • calls vector::resize on the member vector (which jumps into the libc++ implementation)
    • program hangs
    • slight variations on operations can cause a crash
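
When debugging a hang like this, it can help to confirm which standard library each shared object was actually compiled against. A minimal sketch, assuming a hypothetical helper named stdlib_name that you compile into each library and call from a test. It relies only on the version macros that libc++ and libstdc++ define in their own headers.

```cpp
#include <string>  // any standard header pulls in the library's version macros

// Hypothetical helper: reports the standard library this translation unit
// was compiled against, based on the implementation's version macro.
inline std::string stdlib_name()
{
#if defined(_LIBCPP_VERSION)
  return "libc++";
#elif defined(__GLIBCXX__)
  return "libstdc++";
#else
  return "unknown";
#endif
}
```

If two libraries that exchange std:: objects report different names, they are in the mixed situation described above.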

Workarounds I’ve tried:

  • colcon build --cmake-args -DCMAKE_CXX_FLAGS=-stdlib=libc++ (mixin approach)
    • FAILS (on Linux) because poco_vendor is not built in the ROS2 workspace, and we get stdlib linker errors against it
  • rmw_shared_cpp-extras.cmake with the -stdlib=libc++ block
    • FAILS for same reason

In conclusion, here is a list of options I have come up with to move forward (I am also very open to other suggestions)

  1. When we run this analysis, the libc++ flag is turned on by the build caller rather than specified at the package level
    • Static analysis runs on compilation, giving useful warning messages
    • We cannot expect linking or running code to work
    • So, somehow disable linking?
  2. Build all code from source, allowing forcing the standard library for all code
    • looking at libpoco specifically here
  3. Run this analysis on Mac only, where all the code will be linked against LLVM libc++
  4. Remove the initially stated constraint and provide our own annotated mutex wrapper, removing the need for libc++
  5. Modify Fast-RTPS interfaces to remove the need to interact with std:: containers directly
    • This may not be the only place where std:: objects are used across API boundaries, but it’s the place I identified in my first annotations
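
To make option 4 concrete, here is a rough sketch of what an annotated wrapper could look like, following the pattern in the Clang thread safety documentation. The names AnnotatedMutex, AnnotatedLockGuard and the TSA macro are hypothetical and not an existing ROS2 API. The wrapper keeps using std::mutex internally, so it builds against libstdc++ as usual.

```cpp
#include <mutex>

// Hypothetical helper macro: expands to nothing on non-clang compilers.
#if defined(__clang__)
#define TSA(x) __attribute__((x))
#else
#define TSA(x)
#endif

// Hypothetical wrapper that carries the capability annotation itself.
class TSA(capability("mutex")) AnnotatedMutex
{
public:
  // The bodies forward to an unannotated std::mutex, so analysis of the
  // bodies themselves is switched off; callers are still checked.
  void lock() TSA(acquire_capability()) TSA(no_thread_safety_analysis)
  {
    impl_.lock();
  }

  void unlock() TSA(release_capability()) TSA(no_thread_safety_analysis)
  {
    impl_.unlock();
  }

  bool try_lock() TSA(try_acquire_capability(true)) TSA(no_thread_safety_analysis)
  {
    return impl_.try_lock();
  }

private:
  std::mutex impl_;
};

// Hypothetical RAII guard, playing the role of std::lock_guard.
class TSA(scoped_lockable) AnnotatedLockGuard
{
public:
  explicit AnnotatedLockGuard(AnnotatedMutex & m) TSA(acquire_capability(m))
  : mutex_(m)
  {
    mutex_.lock();
  }

  ~AnnotatedLockGuard() TSA(release_capability())
  {
    mutex_.unlock();
  }

  AnnotatedLockGuard(const AnnotatedLockGuard &) = delete;
  AnnotatedLockGuard & operator=(const AnnotatedLockGuard &) = delete;

private:
  AnnotatedMutex & mutex_;
};
```

The tradeoff is the one stated at the top: every annotated package would have to switch from std::mutex and std::lock_guard to these types.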

@tfoote thoughts?

I think this analysis is valuable, but it’s definitely got its technical hangups here

@Thomas_Moulard