Compilation Bloat Issue

Hi all,

We have noticed build time and space bloat with humble compared to galactic. In our internal CI/CD pipeline, we see build time increasing from 1.5 hr to 2 hr when trying to upgrade to humble. Developers have also noticed some targets taking 10~15 mins longer to build locally. This not only impacts our CI/CD cost but also dramatically slows down our development cadence.

While we understand this problem might be unique to our project setup, we have also observed similar trends in OSRF docker containers when building ros2/examples. When building galactic branch from ros2/example (plus cherry-picking Remove use of get_callback_groups(). (#320) commit for humble to account for API change), there is

  • 59% increase in overall build time (from 231.6s to 367.4s)
  • 73% increase in overall build folder size (from 147.3MB to 255.2MB)
  • 52% increase in clcpp::Node::create_subscription<$> instantiation time
  • 98% increase in rclcpp.hpp inclusion time
  • The list goes on…

In general, we can see that template instantiations and header inclusions have gotten a lot more expensive with ClangBuildAnalyzer across the board. The full build analysis reports can be found at the bottom.

There is currently one github issue in rclcpp repo related to this problem. However, there doesn’t seem to be any tracking or active strategies for the bloat. We think this issue requires both near-term actions for mitigating the bloat as well as long-term tracking strategies for future releases.

Although some areas of build time/space increase can be traced to new features added to humble such as Type Adaptation, general code and template bloat seem to be a widespread issue. We think some potential solutions might be compiler optimizations such as explicit template instantiation and code/packaging refactoring. We are testing out some suggested solutions such as directly including relevant headers rather than including rclcpp.hpp everywhere, explicit template instantiations, etc. Some of them show promising initial results.

We are interested in learning if any other projects have observed or been impacted by the compilation bloat as well as kick-starting the discussion around how to tackle this problem. We look forward to discussing potential solutions as well as ways to track this problem going forward with the community.

Thank you!

humble-build-analysis-ros2-example:

**** Time summary:
Compilation (64 times):
  Parsing (frontend):          336.4 s
  Codegen & opts (backend):     31.0 s

**** Files that took longest to parse (compiler frontend):
 13775 ms: /home/yhsu/Downloads/build/examples_rclcpp_minimal_subscriber/CMakeFiles/subscriber_member_function_with_topic_statistics.dir/member_function_with_topic_statistics.cpp.o
 13758 ms: /home/yhsu/Downloads/build/examples_rclcpp_minimal_subscriber/CMakeFiles/subscriber_not_composable.dir/not_composable.cpp.o
 13541 ms: /home/yhsu/Downloads/build/examples_rclcpp_minimal_subscriber/CMakeFiles/subscriber_member_function.dir/member_function.cpp.o
 13390 ms: /home/yhsu/Downloads/build/examples_rclcpp_cbg_executor/CMakeFiles/ping_pong.dir/src/examples_rclcpp_cbg_executor/pong_node.cpp.o
 13159 ms: /home/yhsu/Downloads/build/examples_rclcpp_minimal_subscriber/CMakeFiles/subscriber_member_function_with_unique_network_flow_endpoints.dir/member_function_with_unique_network_flow_endpoints.cpp.o
 13041 ms: /home/yhsu/Downloads/build/examples_rclcpp_cbg_executor/CMakeFiles/ping.dir/src/examples_rclcpp_cbg_executor/ping_node.cpp.o
 13026 ms: /home/yhsu/Downloads/build/examples_rclcpp_cbg_executor/CMakeFiles/ping_pong.dir/src/examples_rclcpp_cbg_executor/ping_node.cpp.o
 12750 ms: /home/yhsu/Downloads/build/examples_rclcpp_minimal_composition/CMakeFiles/composition_nodes.dir/src/subscriber_node.cpp.o
 12305 ms: /home/yhsu/Downloads/build/examples_rclcpp_cbg_executor/CMakeFiles/pong.dir/src/examples_rclcpp_cbg_executor/pong_node.cpp.o
 12041 ms: /home/yhsu/Downloads/build/examples_rclcpp_multithreaded_executor/CMakeFiles/multithreaded_executor.dir/multithreaded_executor.cpp.o

**** Files that took longest to codegen (compiler backend):
  2231 ms: /home/yhsu/Downloads/build/examples_rclcpp_multithreaded_executor/CMakeFiles/multithreaded_executor.dir/multithreaded_executor.cpp.o
  2150 ms: /home/yhsu/Downloads/build/examples_rclcpp_cbg_executor/CMakeFiles/pong.dir/src/examples_rclcpp_cbg_executor/pong_node.cpp.o
  2094 ms: /home/yhsu/Downloads/build/examples_rclcpp_cbg_executor/CMakeFiles/ping_pong.dir/src/examples_rclcpp_cbg_executor/ping_node.cpp.o
  2048 ms: /home/yhsu/Downloads/build/examples_rclcpp_minimal_composition/CMakeFiles/composition_nodes.dir/src/subscriber_node.cpp.o
  1981 ms: /home/yhsu/Downloads/build/examples_rclcpp_minimal_subscriber/CMakeFiles/subscriber_member_function_with_unique_network_flow_endpoints.dir/member_function_with_unique_network_flow_endpoints.cpp.o
  1782 ms: /home/yhsu/Downloads/build/examples_rclcpp_minimal_subscriber/CMakeFiles/subscriber_lambda.dir/lambda.cpp.o
  1746 ms: /home/yhsu/Downloads/build/examples_rclcpp_cbg_executor/CMakeFiles/ping.dir/src/examples_rclcpp_cbg_executor/ping_node.cpp.o
  1530 ms: /home/yhsu/Downloads/build/examples_rclcpp_minimal_subscriber/CMakeFiles/subscriber_not_composable.dir/not_composable.cpp.o
  1469 ms: /home/yhsu/Downloads/build/examples_rclcpp_minimal_composition/CMakeFiles/composition_nodes.dir/src/publisher_node.cpp.o
  1436 ms: /home/yhsu/Downloads/build/examples_rclcpp_cbg_executor/CMakeFiles/ping_pong.dir/src/examples_rclcpp_cbg_executor/pong_node.cpp.o

**** Templates that took longest to instantiate:
 10564 ms: rclcpp::Subscription<std_msgs::msg::String_<std::allocator<void>>, s... (7 times, avg 1509 ms)
 10557 ms: std::make_shared<rclcpp::Subscription<std_msgs::msg::String_<std::al... (7 times, avg 1508 ms)
 10547 ms: std::allocate_shared<rclcpp::Subscription<std_msgs::msg::String_<std... (7 times, avg 1506 ms)
 10543 ms: std::shared_ptr<rclcpp::Subscription<std_msgs::msg::String_<std::all... (7 times, avg 1506 ms)
 10539 ms: std::__shared_ptr<rclcpp::Subscription<std_msgs::msg::String_<std::a... (7 times, avg 1505 ms)
 10529 ms: std::__shared_count<__gnu_cxx::_S_atomic>::__shared_count<rclcpp::Su... (7 times, avg 1504 ms)
 10474 ms: std::_Sp_counted_ptr_inplace<rclcpp::Subscription<std_msgs::msg::Str... (7 times, avg 1496 ms)
 10438 ms: std::allocator_traits<std::allocator<rclcpp::Subscription<std_msgs::... (7 times, avg 1491 ms)
 10435 ms: __gnu_cxx::new_allocator<rclcpp::Subscription<std_msgs::msg::String_... (7 times, avg 1490 ms)
 10430 ms: rclcpp::Subscription<std_msgs::msg::String_<std::allocator<void>>, s... (7 times, avg 1490 ms)
  8432 ms: rclcpp::Subscription<rcl_interfaces::msg::ParameterEvent_<std::alloc... (32 times, avg 263 ms)
  8200 ms: rclcpp::Client<rcl_interfaces::srv::SetParametersAtomically> (32 times, avg 256 ms)
  8177 ms: std::unordered_map<long, std::pair<std::chrono::time_point<std::chro... (32 times, avg 255 ms)
  8120 ms: std::_Hashtable<long, std::pair<const long, std::pair<std::chrono::t... (32 times, avg 253 ms)
  7886 ms: rclcpp::Client<rcl_interfaces::srv::GetParameters> (32 times, avg 246 ms)
  7857 ms: std::unordered_map<long, std::pair<std::chrono::time_point<std::chro... (32 times, avg 245 ms)
  7786 ms: std::_Hashtable<long, std::pair<const long, std::pair<std::chrono::t... (32 times, avg 243 ms)
  7766 ms: std::__detail::_Insert<long, std::pair<const long, std::pair<std::ch... (32 times, avg 242 ms)
  7759 ms: std::__detail::_Insert_base<long, std::pair<const long, std::pair<st... (32 times, avg 242 ms)
  7667 ms: std::__detail::_Hashtable_alloc<std::allocator<std::__detail::_Hash_... (32 times, avg 239 ms)
  7580 ms: std::__detail::_Hash_node<std::pair<const long, std::pair<std::chron... (32 times, avg 236 ms)
  7562 ms: std::__detail::_Hash_node_value<std::pair<const long, std::pair<std:... (32 times, avg 236 ms)
  7560 ms: std::__detail::_Hash_node_value_base<std::pair<const long, std::pair... (32 times, avg 236 ms)
  7542 ms: __gnu_cxx::__aligned_buffer<std::pair<const long, std::pair<std::chr... (32 times, avg 235 ms)
  7529 ms: std::pair<const long, std::pair<std::chrono::time_point<std::chrono:... (32 times, avg 235 ms)
  7391 ms: rclcpp::AnySubscriptionCallback<rcl_interfaces::msg::ParameterEvent_... (32 times, avg 230 ms)
  7364 ms: rclcpp::Client<rcl_interfaces::srv::ListParameters> (32 times, avg 230 ms)
  7335 ms: std::pair<std::chrono::time_point<std::chrono::system_clock, std::ch... (32 times, avg 229 ms)
  7335 ms: std::unordered_map<long, std::pair<std::chrono::time_point<std::chro... (32 times, avg 229 ms)
  7301 ms: std::_Hashtable<long, std::pair<const long, std::pair<std::chrono::t... (32 times, avg 228 ms)

**** Template sets that took longest to instantiate:
 55573 ms: std::variant<$> (434 times, avg 128 ms)
 52791 ms: std::unordered_map<$> (495 times, avg 106 ms)
 52751 ms: std::_Hashtable<$> (623 times, avg 84 ms)
 49765 ms: std::__detail::__variant::_Variant_base<$> (434 times, avg 114 ms)
 45090 ms: std::__detail::_Insert<$> (560 times, avg 80 ms)
 44910 ms: std::__detail::_Insert_base<$> (560 times, avg 80 ms)
 44320 ms: rclcpp::Client<$> (193 times, avg 229 ms)
 44245 ms: std::pair<$> (902 times, avg 49 ms)
 44051 ms: std::__detail::_Hashtable_alloc<$> (559 times, avg 78 ms)
 42826 ms: std::__detail::_Hash_node<$> (527 times, avg 81 ms)
 42789 ms: std::__detail::_Hash_node_value<$> (529 times, avg 80 ms)
 42735 ms: std::__detail::_Hash_node_value_base<$> (527 times, avg 81 ms)
 42664 ms: __gnu_cxx::__aligned_buffer<$> (529 times, avg 80 ms)
 32399 ms: std::__and_<$> (15415 times, avg 2 ms)
 32326 ms: rclcpp::Node::create_subscription<$> (11 times, avg 2938 ms)
 32284 ms: rclcpp::create_subscription<$> (11 times, avg 2934 ms)
 32280 ms: rclcpp::detail::create_subscription<$> (11 times, avg 2934 ms)
 30751 ms: std::tuple<$> (1662 times, avg 18 ms)
 30530 ms: std::shared_ptr<$>::shared_ptr<$> (498 times, avg 61 ms)
 30354 ms: std::make_shared<$> (269 times, avg 112 ms)
 30346 ms: std::is_trivially_destructible<$> (2184 times, avg 13 ms)
 30282 ms: std::__shared_ptr<$>::__shared_ptr<$> (400 times, avg 75 ms)
 30101 ms: std::allocate_shared<$> (290 times, avg 103 ms)
 29883 ms: std::__shared_count<$>::__shared_count<$> (380 times, avg 78 ms)
 29455 ms: std::unique_ptr<$> (1467 times, avg 20 ms)
 27800 ms: std::_Sp_counted_ptr_inplace<$>::_Sp_counted_ptr_inplace<$> (290 times, avg 95 ms)
 26798 ms: std::allocator_traits<$>::construct<$> (556 times, avg 48 ms)
 26400 ms: __gnu_cxx::new_allocator<$>::construct<$> (400 times, avg 66 ms)
 25182 ms: std::__uniq_ptr_data<$> (1467 times, avg 17 ms)
 24897 ms: std::__uniq_ptr_impl<$> (1467 times, avg 16 ms)

**** Functions that took longest to compile:
    81 ms: main (/root/examples/rclcpp/actions/minimal_action_client/not_composable_with_cancel.cpp)
    38 ms: std::_Tuple_impl<0ul, MinimalTimer*>::_Tuple_impl<MinimalTimer*>(Min... (/root/examples/rclcpp/timers/minimal_timer/member_function.cpp)
    30 ms: rclcpp::QOSEventHandlerBase::set_on_ready_callback(std::function<voi... (/root/examples/rclcpp/topics/minimal_publisher/not_composable.cpp)
    28 ms: std::__cxx11::basic_string<char, std::char_traits<char>, std::alloca... (/root/examples/rclcpp/actions/minimal_action_client/not_composable_with_cancel.cpp)
    28 ms: std::allocator_traits<std::allocator<std::_Sp_counted_ptr_inplace<rm... (/root/examples/rclcpp/topics/minimal_publisher/member_function_with_unique_network_flow_endpoints.cpp)
    28 ms: std::enable_if<std::is_same<bool, bool>::value, bool const&>::type r... (/root/examples/rclcpp/topics/minimal_publisher/member_function.cpp)
    28 ms: void std::vector<unsigned long, std::allocator<unsigned long> >::_M_... (/root/examples/rclcpp/composition/minimal_composition/src/publisher_node.cpp)
    28 ms: std::__shared_ptr<rclcpp::node_interfaces::NodeTimersInterface, (__g... (/root/examples/rclcpp/topics/minimal_publisher/member_function_with_unique_network_flow_endpoints.cpp)
    28 ms: __gnu_cxx::new_allocator<rclcpp::QosPolicyKind>::new_allocator(__gnu... (/root/examples/rclcpp/topics/minimal_publisher/member_function_with_unique_network_flow_endpoints.cpp)
    28 ms: rclcpp::PublisherOptionsWithAllocator<std::allocator<void> >::get_al... (/root/examples/rclcpp/topics/minimal_publisher/not_composable.cpp)
    26 ms: std::allocator_traits<std::allocator<std::_Sp_counted_deleter<rclcpp... (/root/examples/rclcpp/topics/minimal_publisher/member_function_with_unique_network_flow_endpoints.cpp)
    26 ms: rclcpp::GenericTimer<std::_Bind<void (MinimalPublisherWithUniqueNetw... (/root/examples/rclcpp/topics/minimal_publisher/member_function_with_unique_network_flow_endpoints.cpp)
    25 ms: std::__shared_ptr<rclcpp::QOSEventHandler<std::function<void (rmw_qo... (/root/examples/rclcpp/topics/minimal_publisher/member_function_with_unique_network_flow_endpoints.cpp)
    25 ms: main (/root/examples/rclcpp/services/minimal_client/main.cpp)
    25 ms: std::shared_ptr<rclcpp::node_interfaces::NodeTopicsInterface> rclcpp... (/root/examples/rclcpp/topics/minimal_publisher/member_function.cpp)
    24 ms: std::_Sp_counted_ptr_inplace<rmw_offered_deadline_missed_status_s, s... (/root/examples/rclcpp/topics/minimal_publisher/not_composable.cpp)
    24 ms: std::shared_ptr<rclcpp::Publisher<std_msgs::msg::String_<std::alloca... (/root/examples/rclcpp/topics/minimal_publisher/not_composable.cpp)
    24 ms: std::_Sp_counted_ptr_inplace<rclcpp::QOSEventHandler<std::function<v... (/root/examples/rclcpp/composition/minimal_composition/src/publisher_node.cpp)
    24 ms: std::allocator_traits<std::allocator<std::_Sp_counted_ptr_inplace<rc... (/root/examples/rclcpp/composition/minimal_composition/src/publisher_node.cpp)
    24 ms: std::_Sp_ebo_helper<0, std::allocator<rclcpp::WallTimer<std::_Bind<v... (/root/examples/rclcpp/composition/minimal_composition/src/publisher_node.cpp)
    23 ms: std::enable_if<(rclcpp::node_interfaces::has_node_parameters_interfa... (/root/examples/rclcpp/composition/minimal_composition/src/publisher_node.cpp)
    23 ms: std::enable_if<(rclcpp::node_interfaces::has_node_parameters_interfa... (/root/examples/rclcpp/topics/minimal_publisher/lambda.cpp)
    23 ms: rclcpp::QOSEventHandlerBase::set_on_ready_callback(std::function<voi... (/root/examples/rclcpp/topics/minimal_publisher/member_function_with_unique_network_flow_endpoints.cpp)
    22 ms: main (/root/examples/rclcpp/actions/minimal_action_client/not_composable_with_feedback.cpp)
    22 ms: std::_Function_base::_Base_manager<std::_Bind<std::shared_ptr<rclcpp... (/root/examples/rclcpp/composition/minimal_composition/src/publisher_node.cpp)
    21 ms: rclcpp::detail::apply_qos_override(rclcpp::QosPolicyKind, rclcpp::Pa... (/root/examples/rclcpp/topics/minimal_publisher/member_function_with_unique_network_flow_endpoints.cpp)
    21 ms: std::pair<std::__detail::_Node_iterator<std::pair<rcl_publisher_even... (/root/examples/rclcpp/composition/minimal_composition/src/publisher_node.cpp)
    21 ms: __gnu_cxx::__aligned_buffer<example_interfaces::action::Fibonacci_Fe... (/root/examples/rclcpp/actions/minimal_action_server/member_functions.cpp)
    21 ms: std::_Sp_counted_ptr_inplace<rclcpp::QOSEventHandler<std::function<v... (/root/examples/rclcpp/topics/minimal_publisher/not_composable.cpp)
    21 ms: main (/root/examples/rclcpp/actions/minimal_action_client/not_composable.cpp)

**** Function sets that took longest to compile / optimize:
   262 ms: std::pair<$> std::_Hashtable<$>::_M_emplace<$>(std::integral_constan... (149 times, avg 1 ms)
   206 ms: std::__detail::_Hash_node<$>* std::__detail::_Hashtable_alloc<$>::_M... (149 times, avg 1 ms)
   178 ms: std::vector<$>::vector(std::vector<$> const&) (135 times, avg 1 ms)
   156 ms: std::enable_if<$>::type rclcpp::detail::declare_qos_parameters<$>(rc... (14 times, avg 11 ms)
   145 ms: rclcpp::QOSEventHandlerBase::set_on_ready_callback(std::function<voi... (14 times, avg 10 ms)
   139 ms: rclcpp::QOSEventHandler<$>::QOSEventHandler<$>(std::function<$> cons... (42 times, avg 3 ms)
   113 ms: std::function<$>::function(std::function<$> const&) (205 times, avg 0 ms)
   110 ms: void rclcpp::experimental::IntraProcessManager::do_intra_process_pub... (17 times, avg 6 ms)
    91 ms: std::vector<$>::~vector() (79 times, avg 1 ms)
    86 ms: void rclcpp::experimental::IntraProcessManager::add_owned_msg_to_buf... (17 times, avg 5 ms)
    85 ms: rclcpp::QOSEventHandler<$>::QOSEventHandler<$>(std::function<$> cons... (36 times, avg 2 ms)
    81 ms: std::__shared_count<$>::__shared_count<$>(rclcpp::QOSEventHandler<$>... (42 times, avg 1 ms)
    81 ms: std::allocator_traits<$>::allocate(std::allocator<$>&, unsigned long) (5 times, avg 16 ms)
    80 ms: std::shared_ptr<$> rclcpp::experimental::IntraProcessManager::do_int... (17 times, avg 4 ms)
    76 ms: rclcpp::QOSEventHandler<$>::execute(std::shared_ptr<$>&) (94 times, avg 0 ms)
    75 ms: std::_Function_base::_Base_manager<$>::_M_manager(std::_Any_data&, s... (132 times, avg 0 ms)
    73 ms: std::_Sp_counted_ptr_inplace<$>::_Impl::_Impl(std::allocator<$>) (6 times, avg 12 ms)
    73 ms: std::enable_if<$>::type rclcpp::detail::declare_qos_parameters<$>(rc... (9 times, avg 8 ms)
    72 ms: std::__detail::_Hash_node<$>* std::__detail::_Hashtable_alloc<$>::_M... (17 times, avg 4 ms)
    71 ms: std::_Vector_base<$>::~_Vector_base() (108 times, avg 0 ms)
    60 ms: std::__cxx11::basic_string<$>::basic_string<$>(char const*, std::all... (24 times, avg 2 ms)
    59 ms: rclcpp::experimental::SubscriptionIntraProcessBase::set_on_ready_cal... (9 times, avg 6 ms)
    59 ms: void std::vector<$>::_M_range_insert<$>(__gnu_cxx::__normal_iterator... (14 times, avg 4 ms)
    57 ms: std::__shared_count<$>::__shared_count<$>(std::allocator<$>*&, std::... (41 times, avg 1 ms)
    54 ms: rclcpp::Subscription<$>::Subscription(rclcpp::node_interfaces::NodeB... (9 times, avg 6 ms)
    53 ms: rclcpp::QOSEventHandler<$>::take_data() (76 times, avg 0 ms)
    53 ms: rclcpp::QOSEventHandler<$>::QOSEventHandler<$>(rclcpp::Publisher<$>:... (17 times, avg 3 ms)
    51 ms: std::enable_if<$>::type rclcpp::ParameterValue::get<$>() const (36 times, avg 1 ms)
    51 ms: std::_Function_handler<$>::_M_manager(std::_Any_data&, std::_Any_dat... (95 times, avg 0 ms)
    51 ms: void rclcpp::SubscriptionBase::add_event_handler<$>(std::function<$>... (36 times, avg 1 ms)

*** Expensive headers:
192821 ms: /opt/ros/humble/include/rclcpp/rclcpp/rclcpp.hpp (included 32 times, avg 6025 ms), included via:
  not_composable.cpp.o  (9132 ms)
  member_function.cpp.o  (8383 ms)
  subscriber_node.cpp.o subscriber_node.hpp  (8051 ms)
  member_function_with_topic_statistics.cpp.o  (7962 ms)
  pong_node.cpp.o pong_node.hpp  (7571 ms)
  member_function.cpp.o  (7494 ms)
  ...

70655 ms: /opt/ros/humble/include/rclcpp/rclcpp/executor.hpp (included 32 times, avg 2207 ms), included via:
  member_function.cpp.o rclcpp.hpp executors.hpp multi_threaded_executor.hpp  (3226 ms)
  subscriber_node.cpp.o subscriber_node.hpp rclcpp.hpp executors.hpp multi_threaded_executor.hpp  (3117 ms)
  pong_node.cpp.o pong_node.hpp rclcpp.hpp executors.hpp multi_threaded_executor.hpp  (3079 ms)
  member_function_with_topic_statistics.cpp.o rclcpp.hpp executors.hpp multi_threaded_executor.hpp  (3035 ms)
  not_composable.cpp.o rclcpp.hpp executors.hpp multi_threaded_executor.hpp  (2790 ms)
  pong_node.cpp.o pong_node.hpp rclcpp.hpp executors.hpp multi_threaded_executor.hpp  (2725 ms)
  ...

18151 ms: /root/examples/rclcpp/executors/cbg_executor/include/examples_rclcpp_cbg_executor/ping_node.hpp (included 4 times, avg 4537 ms), included via:
  ping_node.cpp.o  (9020 ms)
  ping_node.cpp.o  (8772 ms)
  ping.cpp.o  (184 ms)
  ping_pong.cpp.o  (174 ms)

16482 ms: /root/examples/rclcpp/executors/cbg_executor/include/examples_rclcpp_cbg_executor/pong_node.hpp (included 4 times, avg 4120 ms), included via:
  pong_node.cpp.o  (8654 ms)
  pong_node.cpp.o  (7694 ms)
  pong.cpp.o  (133 ms)
  ping_pong.cpp.o  (0 ms)

11173 ms: /usr/include/c++/11/future (included 32 times, avg 349 ms), included via:
  not_composable.cpp.o rclcpp.hpp executors.hpp  (866 ms)
  lambda.cpp.o rclcpp.hpp executors.hpp  (761 ms)
  member_functions.cpp.o  (697 ms)
  member_function_with_topic_statistics.cpp.o rclcpp.hpp executors.hpp  (679 ms)
  member_function.cpp.o rclcpp.hpp executors.hpp  (649 ms)
  subscriber_node.cpp.o subscriber_node.hpp rclcpp.hpp executors.hpp  (582 ms)
  ...

10888 ms: /root/examples/rclcpp/composition/minimal_composition/include/minimal_composition/subscriber_node.hpp (included 3 times, avg 3629 ms), included via:
  subscriber_node.cpp.o  (8236 ms)
  standalone_subscriber.cpp.o  (2606 ms)
  composed.cpp.o  (45 ms)

10452 ms: /root/examples/rclcpp/composition/minimal_composition/include/minimal_composition/publisher_node.hpp (included 3 times, avg 3484 ms), included via:
  publisher_node.cpp.o  (5270 ms)
  standalone_publisher.cpp.o  (2597 ms)
  composed.cpp.o  (2584 ms)

9424 ms: /usr/include/c++/11/memory (included 32 times, avg 294 ms), included via:
  not_composable.cpp.o rclcpp.hpp  (758 ms)
  lambda.cpp.o  (724 ms)
  ping_node.cpp.o ping_node.hpp  (637 ms)
  member_function_with_topic_statistics.cpp.o  (601 ms)
  ping_node.cpp.o ping_node.hpp  (599 ms)
  main.cpp.o  (542 ms)
  ...

8877 ms: /usr/include/c++/11/functional (included 32 times, avg 277 ms), included via:
  member_function.cpp.o  (1059 ms)
  member_function_with_unique_network_flow_endpoints.cpp.o  (976 ms)
  member_functions.cpp.o  (661 ms)
  member_functions.cpp.o  (604 ms)
  not_composable_with_cancel.cpp.o fibonacci.hpp fibonacci__struct.hpp algorithm glue_algorithm_defs.h  (538 ms)
  pong.cpp.o  (440 ms)
  ...

7111 ms: /usr/include/c++/11/string (included 32 times, avg 222 ms), included via:
  ping_pong.cpp.o condition_variable std_mutex.h system_error stdexcept  (435 ms)
  pong.cpp.o condition_variable std_mutex.h system_error stdexcept  (432 ms)
  ping.cpp.o condition_variable std_mutex.h system_error stdexcept  (428 ms)
  ping_node.cpp.o ping_node.hpp  (330 ms)
  not_composable.cpp.o rclcpp.hpp executors.hpp future mutex system_error stdexcept  (318 ms)
  member_function_with_topic_statistics.cpp.o rclcpp.hpp executors.hpp future mutex system_error stdexcept  (316 ms)
  ...

galactic-build-analysis-ros2-example:

**** Time summary:
Compilation (64 times):
  Parsing (frontend):          206.5 s
  Codegen & opts (backend):     25.1 s

**** Files that took longest to parse (compiler frontend):
  8333 ms: /home/yhsu/Downloads/build/examples_rclcpp_cbg_executor/CMakeFiles/ping_pong.dir/src/examples_rclcpp_cbg_executor/pong_node.cpp.o
  8105 ms: /home/yhsu/Downloads/build/examples_rclcpp_cbg_executor/CMakeFiles/ping.dir/src/examples_rclcpp_cbg_executor/ping_node.cpp.o
  8086 ms: /home/yhsu/Downloads/build/examples_rclcpp_minimal_subscriber/CMakeFiles/subscriber_member_function.dir/member_function.cpp.o
  8085 ms: /home/yhsu/Downloads/build/examples_rclcpp_minimal_subscriber/CMakeFiles/subscriber_lambda.dir/lambda.cpp.o
  7997 ms: /home/yhsu/Downloads/build/examples_rclcpp_cbg_executor/CMakeFiles/ping_pong.dir/src/examples_rclcpp_cbg_executor/ping_node.cpp.o
  7962 ms: /home/yhsu/Downloads/build/examples_rclcpp_multithreaded_executor/CMakeFiles/multithreaded_executor.dir/multithreaded_executor.cpp.o
  7680 ms: /home/yhsu/Downloads/build/examples_rclcpp_cbg_executor/CMakeFiles/pong.dir/src/examples_rclcpp_cbg_executor/pong_node.cpp.o
  7288 ms: /home/yhsu/Downloads/build/examples_rclcpp_minimal_composition/CMakeFiles/composition_nodes.dir/src/subscriber_node.cpp.o
  7195 ms: /home/yhsu/Downloads/build/examples_rclcpp_minimal_subscriber/CMakeFiles/subscriber_member_function_with_topic_statistics.dir/member_function_with_topic_statistics.cpp.o
  7009 ms: /home/yhsu/Downloads/build/examples_rclcpp_minimal_subscriber/CMakeFiles/subscriber_member_function_with_unique_network_flow_endpoints.dir/member_function_with_unique_network_flow_endpoints.cpp.o

**** Files that took longest to codegen (compiler backend):
  1659 ms: /home/yhsu/Downloads/build/examples_rclcpp_cbg_executor/CMakeFiles/ping_pong.dir/src/examples_rclcpp_cbg_executor/ping_node.cpp.o
  1578 ms: /home/yhsu/Downloads/build/examples_rclcpp_minimal_subscriber/CMakeFiles/subscriber_member_function_with_topic_statistics.dir/member_function_with_topic_statistics.cpp.o
  1535 ms: /home/yhsu/Downloads/build/examples_rclcpp_cbg_executor/CMakeFiles/pong.dir/src/examples_rclcpp_cbg_executor/pong_node.cpp.o
  1534 ms: /home/yhsu/Downloads/build/examples_rclcpp_minimal_subscriber/CMakeFiles/subscriber_member_function_with_unique_network_flow_endpoints.dir/member_function_with_unique_network_flow_endpoints.cpp.o
  1528 ms: /home/yhsu/Downloads/build/examples_rclcpp_minimal_subscriber/CMakeFiles/subscriber_not_composable.dir/not_composable.cpp.o
  1510 ms: /home/yhsu/Downloads/build/examples_rclcpp_cbg_executor/CMakeFiles/ping_pong.dir/src/examples_rclcpp_cbg_executor/pong_node.cpp.o
  1491 ms: /home/yhsu/Downloads/build/examples_rclcpp_multithreaded_executor/CMakeFiles/multithreaded_executor.dir/multithreaded_executor.cpp.o
  1461 ms: /home/yhsu/Downloads/build/examples_rclcpp_minimal_subscriber/CMakeFiles/subscriber_lambda.dir/lambda.cpp.o
  1444 ms: /home/yhsu/Downloads/build/examples_rclcpp_minimal_subscriber/CMakeFiles/subscriber_member_function.dir/member_function.cpp.o
  1414 ms: /home/yhsu/Downloads/build/examples_rclcpp_minimal_composition/CMakeFiles/composition_nodes.dir/src/subscriber_node.cpp.o

**** Templates that took longest to instantiate:
  6592 ms: rclcpp::Subscription<std_msgs::msg::String_<std::allocator<void> >, ... (7 times, avg 941 ms)
  6587 ms: std::make_shared<rclcpp::Subscription<std_msgs::msg::String_<std::al... (7 times, avg 941 ms)
  6567 ms: std::allocate_shared<rclcpp::Subscription<std_msgs::msg::String_<std... (7 times, avg 938 ms)
  6540 ms: std::shared_ptr<rclcpp::Subscription<std_msgs::msg::String_<std::all... (7 times, avg 934 ms)
  6537 ms: std::__shared_ptr<rclcpp::Subscription<std_msgs::msg::String_<std::a... (7 times, avg 933 ms)
  6529 ms: std::__shared_count<__gnu_cxx::_S_atomic>::__shared_count<rclcpp::Su... (7 times, avg 932 ms)
  6487 ms: std::_Sp_counted_ptr_inplace<rclcpp::Subscription<std_msgs::msg::Str... (7 times, avg 926 ms)
  6417 ms: std::allocator_traits<std::allocator<rclcpp::Subscription<std_msgs::... (7 times, avg 916 ms)
  6402 ms: __gnu_cxx::new_allocator<rclcpp::Subscription<std_msgs::msg::String_... (7 times, avg 914 ms)
  6398 ms: rclcpp::Subscription<std_msgs::msg::String_<std::allocator<void> >, ... (7 times, avg 914 ms)
  5080 ms: rclcpp::detail::create_publisher<statistics_msgs::msg::MetricsMessag... (11 times, avg 461 ms)
  5021 ms: rclcpp::Subscription<rcl_interfaces::msg::ParameterEvent_<std::alloc... (32 times, avg 156 ms)
  4965 ms: rclcpp::create_publisher_factory<statistics_msgs::msg::MetricsMessag... (11 times, avg 451 ms)
  4385 ms: std::make_shared<rclcpp::Publisher<statistics_msgs::msg::MetricsMess... (11 times, avg 398 ms)
  4374 ms: std::allocate_shared<rclcpp::Publisher<statistics_msgs::msg::Metrics... (11 times, avg 397 ms)
  4371 ms: std::shared_ptr<rclcpp::Publisher<statistics_msgs::msg::MetricsMessa... (11 times, avg 397 ms)
  4368 ms: std::__shared_ptr<rclcpp::Publisher<statistics_msgs::msg::MetricsMes... (11 times, avg 397 ms)
  4333 ms: std::__shared_count<__gnu_cxx::_S_atomic>::__shared_count<rclcpp::Pu... (11 times, avg 393 ms)
  4277 ms: std::_Sp_counted_ptr_inplace<rclcpp::Publisher<statistics_msgs::msg:... (11 times, avg 388 ms)
  4241 ms: std::allocator_traits<std::allocator<rclcpp::Publisher<statistics_ms... (11 times, avg 385 ms)
  4237 ms: __gnu_cxx::new_allocator<rclcpp::Publisher<statistics_msgs::msg::Met... (11 times, avg 385 ms)
  4235 ms: rclcpp::Publisher<statistics_msgs::msg::MetricsMessage_<std::allocat... (11 times, avg 385 ms)
  3363 ms: rclcpp::Node::create_publisher<std_msgs::msg::String_<std::allocator... (6 times, avg 560 ms)
  3362 ms: rclcpp::create_publisher<std_msgs::msg::String_<std::allocator<void>... (6 times, avg 560 ms)
  3361 ms: rclcpp::detail::create_publisher<std_msgs::msg::String_<std::allocat... (6 times, avg 560 ms)
  3254 ms: rclcpp::create_publisher_factory<std_msgs::msg::String_<std::allocat... (6 times, avg 542 ms)
  3130 ms: rclcpp::AnySubscriptionCallback<rcl_interfaces::msg::ParameterEvent_... (32 times, avg 97 ms)
  3053 ms: rclcpp::Node::create_subscription<std_msgs::msg::Int32_<std::allocat... (2 times, avg 1526 ms)
  3051 ms: rclcpp::create_subscription<std_msgs::msg::Int32_<std::allocator<voi... (2 times, avg 1525 ms)
  3051 ms: rclcpp::detail::create_subscription<std_msgs::msg::Int32_<std::alloc... (2 times, avg 1525 ms)

**** Template sets that took longest to instantiate:
 23894 ms: std::make_shared<$> (270 times, avg 88 ms)
 23823 ms: std::shared_ptr<$>::shared_ptr<$> (403 times, avg 59 ms)
 23652 ms: std::__shared_ptr<$>::__shared_ptr<$> (360 times, avg 65 ms)
 23486 ms: std::allocate_shared<$> (270 times, avg 86 ms)
 23251 ms: std::__shared_count<$>::__shared_count<$> (304 times, avg 76 ms)
 21498 ms: std::_Sp_counted_ptr_inplace<$>::_Sp_counted_ptr_inplace<$> (270 times, avg 79 ms)
 21249 ms: rclcpp::Node::create_subscription<$> (11 times, avg 1931 ms)
 21237 ms: rclcpp::create_subscription<$> (11 times, avg 1930 ms)
 21234 ms: rclcpp::detail::create_subscription<$> (11 times, avg 1930 ms)
 20500 ms: std::allocator_traits<$>::construct<$> (444 times, avg 46 ms)
 20256 ms: __gnu_cxx::new_allocator<$>::construct<$> (202 times, avg 100 ms)
 17417 ms: std::unique_ptr<$> (1097 times, avg 15 ms)
 15551 ms: std::shared_ptr<$> (3295 times, avg 4 ms)
 15296 ms: std::__uniq_ptr_impl<$> (1097 times, avg 13 ms)
 13566 ms: rclcpp::create_subscription_factory<$> (11 times, avg 1233 ms)
 11041 ms: std::__and_<$> (6155 times, avg 1 ms)
 10266 ms: rclcpp::detail::create_publisher<$> (21 times, avg 488 ms)
  9969 ms: rclcpp::create_publisher_factory<$> (21 times, avg 474 ms)
  9357 ms: rclcpp::Subscription<$>::make_shared<$> (11 times, avg 850 ms)
  9108 ms: rclcpp::Subscription<$>::Subscription (11 times, avg 828 ms)
  8279 ms: rclcpp::Publisher<$>::Publisher (21 times, avg 394 ms)
  7585 ms: std::tuple<$> (1196 times, avg 6 ms)
  7488 ms: std::__shared_ptr<$> (3295 times, avg 2 ms)
  7133 ms: rclcpp::Subscription<$> (48 times, avg 148 ms)
  6213 ms: rclcpp::AnySubscriptionCallback<$> (64 times, avg 97 ms)
  6042 ms: std::variant<$> (64 times, avg 94 ms)
  5682 ms: std::__or_<$> (2388 times, avg 2 ms)
  5188 ms: rclcpp::Node::create_publisher<$> (10 times, avg 518 ms)
  5186 ms: rclcpp::create_publisher<$> (10 times, avg 518 ms)
  5103 ms: std::__detail::__variant::_Variant_base<$> (64 times, avg 79 ms)

**** Functions that took longest to compile:
    35 ms: main (/root/examples/rclcpp/actions/minimal_action_client/not_composable_with_feedback.cpp)
    32 ms: main (/root/examples/rclcpp/services/minimal_client/main.cpp)
    32 ms: std::shared_ptr<rmw_qos_incompatible_event_status_t> std::static_poi... (/root/examples/rclcpp/topics/minimal_publisher/member_function.cpp)
    30 ms: rclcpp::QOSEventHandler<std::function<void (rmw_qos_incompatible_eve... (/root/examples/rclcpp/topics/minimal_publisher/member_function.cpp)
    28 ms: std::_Bit_iterator std::__niter_wrap<std::_Bit_iterator>(std::_Bit_i... (/root/examples/rclcpp/topics/minimal_publisher/member_function_with_unique_network_flow_endpoints.cpp)
    28 ms: std::shared_ptr<std_msgs::msg::String_<std::allocator<void> > > std:... (/root/examples/rclcpp/topics/minimal_publisher/not_composable.cpp)
    26 ms: main (/root/examples/rclcpp/executors/cbg_executor/src/ping_pong.cpp)
    26 ms: main (/root/examples/rclcpp/executors/cbg_executor/src/pong.cpp)
    25 ms: main (/root/examples/rclcpp/actions/minimal_action_client/not_composable_with_cancel.cpp)
    24 ms: std::__detail::_Hash_code_base<unsigned long, std::pair<unsigned lon... (/root/examples/rclcpp/topics/minimal_publisher/not_composable.cpp)
    24 ms: std::weak_ptr<rclcpp::experimental::SubscriptionIntraProcessBase>::l... (/root/examples/rclcpp/topics/minimal_publisher/not_composable.cpp)
    22 ms: main (/root/examples/rclcpp/topics/minimal_publisher/not_composable.cpp)
    22 ms: std::__shared_count<(__gnu_cxx::_Lock_policy)2>::__shared_count<rmw_... (/root/examples/rclcpp/topics/minimal_publisher/member_function.cpp)
    22 ms: std::exception::exception() (/root/examples/rclcpp/topics/minimal_publisher/member_function.cpp)
    21 ms: void std::vector<std::shared_ptr<rclcpp::QOSEventHandlerBase>, std::... (/root/examples/rclcpp/topics/minimal_publisher/member_function.cpp)
    21 ms: examples_rclcpp_cbg_executor::PingNode::print_statistics(std::chrono... (/root/examples/rclcpp/executors/cbg_executor/src/examples_rclcpp_cbg_executor/ping_node.cpp)
    21 ms: std::_Sp_ebo_helper<0, std::allocator<rclcpp::QOSEventHandler<std::f... (/root/examples/rclcpp/topics/minimal_publisher/member_function.cpp)
    20 ms: main (/root/examples/rclcpp/executors/cbg_executor/src/ping.cpp)
    20 ms: __gnu_cxx::__exchange_and_add(int volatile*, int) (/root/examples/rclcpp/executors/cbg_executor/src/ping_pong.cpp)
    20 ms: main (/root/examples/rclcpp/actions/minimal_action_client/not_composable.cpp)
    20 ms: std::function<void (rmw_qos_incompatible_event_status_t&)>::operator... (/root/examples/rclcpp/topics/minimal_publisher/lambda.cpp)
    20 ms: std::allocator<long>::~allocator() (/root/examples/rclcpp/actions/minimal_action_server/member_functions.cpp)
    18 ms: std::allocator_traits<std::allocator<std::_Sp_counted_ptr_inplace<rc... (/root/examples/rclcpp/topics/minimal_publisher/lambda.cpp)
    18 ms: MinimalPublisherWithUniqueNetworkFlowEndpoints::print_network_flow_e... (/root/examples/rclcpp/topics/minimal_publisher/member_function_with_unique_network_flow_endpoints.cpp)
    18 ms: std::_Sp_counted_ptr_inplace<rclcpp::Publisher<std_msgs::msg::String... (/root/examples/rclcpp/topics/minimal_publisher/member_function.cpp)
    18 ms: __gnu_cxx::new_allocator<std::_Sp_counted_ptr_inplace<std::allocator... (/root/examples/rclcpp/topics/minimal_publisher/not_composable.cpp)
    17 ms: std::enable_if<(rclcpp::node_interfaces::has_node_parameters_interfa... (/root/examples/rclcpp/topics/minimal_publisher/member_function.cpp)
    17 ms: rcl_publisher_options_t rclcpp::PublisherOptionsWithAllocator<std::a... (/root/examples/rclcpp/topics/minimal_publisher/not_composable.cpp)
    17 ms: main (/root/examples/rclcpp/timers/minimal_timer/lambda.cpp)
    16 ms: std::pair<std::type_index const, std::shared_ptr<void> > const& std:... (/root/examples/rclcpp/topics/minimal_publisher/lambda.cpp)

**** Function sets that took longest to compile / optimize:
   139 ms: rclcpp::QOSEventHandler<$>::QOSEventHandler<$>(std::function<$> cons... (42 times, avg 3 ms)
   136 ms: std::vector<$>::vector(std::vector<$> const&) (135 times, avg 1 ms)
   119 ms: std::enable_if<$>::type rclcpp::detail::declare_qos_parameters<$>(rc... (14 times, avg 8 ms)
   116 ms: void rclcpp::experimental::IntraProcessManager::do_intra_process_pub... (17 times, avg 6 ms)
    90 ms: std::shared_ptr<$> rclcpp::experimental::IntraProcessManager::do_int... (17 times, avg 5 ms)
    87 ms: rclcpp::QOSEventHandler<$>::QOSEventHandler<$>(std::function<$> cons... (36 times, avg 2 ms)
    78 ms: rclcpp::QOSEventHandler<$>::take_data() (95 times, avg 0 ms)
    77 ms: std::enable_if<$>::type rclcpp::detail::declare_qos_parameters<$>(rc... (9 times, avg 8 ms)
    72 ms: rclcpp::QOSEventHandler<$>::execute(std::shared_ptr<$>&) (95 times, avg 0 ms)
    66 ms: void rclcpp::experimental::IntraProcessManager::add_owned_msg_to_buf... (17 times, avg 3 ms)
    63 ms: rclcpp::experimental::buffers::RingBufferImplementation<$>::dequeue() (18 times, avg 3 ms)
    62 ms: std::__detail::_Hash_code_base<$>::_M_bucket_index(std::__detail::_H... (51 times, avg 1 ms)
    62 ms: std::unique_ptr<$>::~unique_ptr() (92 times, avg 0 ms)
    58 ms: std::pair<$> std::_Hashtable<$>::_M_emplace<$>(std::integral_constan... (45 times, avg 1 ms)
    55 ms: rclcpp::Subscription<$>::Subscription(rclcpp::node_interfaces::NodeB... (9 times, avg 6 ms)
    54 ms: rclcpp::Publisher<$>::Publisher(rclcpp::node_interfaces::NodeBaseInt... (17 times, avg 3 ms)
    54 ms: std::__detail::_Hash_node<$>* std::__detail::_Hashtable_alloc<$>::_M... (45 times, avg 1 ms)
    53 ms: std::__shared_count<$>::__shared_count<$>(rclcpp::QOSEventHandler<$>... (42 times, avg 1 ms)
    49 ms: std::_Sp_counted_ptr_inplace<$>::_Sp_counted_ptr_inplace<$>(std::all... (41 times, avg 1 ms)
    47 ms: rcl_publisher_options_t rclcpp::PublisherOptionsWithAllocator<$>::to... (17 times, avg 2 ms)
    45 ms: std::__shared_count<$>::__shared_count<$>(rclcpp::QOSEventHandler<$>... (36 times, avg 1 ms)
    44 ms: std::_Hashtable<$>::_M_insert_unique_node(unsigned long, unsigned lo... (25 times, avg 1 ms)
    43 ms: void std::vector<$>::_M_range_insert<$>(__gnu_cxx::__normal_iterator... (14 times, avg 3 ms)
    43 ms: void rclcpp::SubscriptionBase::add_event_handler<$>(std::function<$>... (36 times, avg 1 ms)
    43 ms: __gnu_cxx::new_allocator<$>::~new_allocator() (7 times, avg 6 ms)
    43 ms: rclcpp::Publisher<$>::post_init_setup(rclcpp::node_interfaces::NodeB... (17 times, avg 2 ms)
    42 ms: std::__shared_count<$>::__shared_count<$>(std::allocator<$>*&, std::... (41 times, avg 1 ms)
    42 ms: rclcpp::QOSEventHandler<$>::QOSEventHandler<$>(rclcpp::Publisher<$>:... (17 times, avg 2 ms)
    41 ms: std::shared_ptr<$> rclcpp::detail::create_subscription<$>(rclcpp::No... (5 times, avg 8 ms)
    37 ms: std::weak_ptr<$>::lock() const (2 times, avg 18 ms)

*** Expensive headers:
97469 ms: /opt/ros/galactic/include/rclcpp/rclcpp.hpp (included 32 times, avg 3045 ms), included via:
  publisher_node.cpp.o publisher_node.hpp  (4609 ms)
  pong_node.cpp.o pong_node.hpp  (4195 ms)
  pong_node.cpp.o pong_node.hpp  (4161 ms)
  ping_node.cpp.o ping_node.hpp  (4110 ms)
  lambda.cpp.o  (3959 ms)
  member_function.cpp.o  (3953 ms)
  ...

47112 ms: /opt/ros/galactic/include/rclcpp/executor.hpp (included 32 times, avg 1472 ms), included via:
  pong_node.cpp.o pong_node.hpp rclcpp.hpp executors.hpp multi_threaded_executor.hpp  (2087 ms)
  lambda.cpp.o rclcpp.hpp executors.hpp multi_threaded_executor.hpp  (2083 ms)
  lambda.cpp.o rclcpp.hpp executors.hpp multi_threaded_executor.hpp  (2051 ms)
  publisher_node.cpp.o publisher_node.hpp rclcpp.hpp executors.hpp multi_threaded_executor.hpp  (1959 ms)
  ping_node.cpp.o ping_node.hpp rclcpp.hpp executors.hpp multi_threaded_executor.hpp  (1924 ms)
  pong_node.cpp.o pong_node.hpp rclcpp.hpp executors.hpp multi_threaded_executor.hpp  (1911 ms)
  ...

10549 ms: /root/examples/rclcpp/executors/cbg_executor/include/examples_rclcpp_cbg_executor/ping_node.hpp (included 4 times, avg 2637 ms), included via:
  ping_node.cpp.o  (5133 ms)
  ping_node.cpp.o  (5040 ms)
  ping.cpp.o  (194 ms)
  ping_pong.cpp.o  (180 ms)

10379 ms: /usr/include/c++/9/memory (included 32 times, avg 324 ms), included via:
  main.cpp.o  (898 ms)
  not_composable.cpp.o  (803 ms)
  ping_node.cpp.o ping_node.hpp  (676 ms)
  not_composable_with_feedback.cpp.o  (602 ms)
  subscriber_node.cpp.o subscriber_node.hpp rclcpp.hpp  (595 ms)
  pong_node.cpp.o pong_node.hpp  (594 ms)
  ...

10254 ms: /root/examples/rclcpp/executors/cbg_executor/include/examples_rclcpp_cbg_executor/pong_node.hpp (included 3 times, avg 3418 ms), included via:
  pong_node.cpp.o  (5151 ms)
  pong_node.cpp.o  (4918 ms)
  pong.cpp.o  (185 ms)

9730 ms: /usr/include/c++/9/functional (included 32 times, avg 304 ms), included via:
  member_function.cpp.o  (1067 ms)
  multithreaded_executor.cpp.o  (994 ms)
  member_functions.cpp.o  (921 ms)
  member_functions.cpp.o  (812 ms)
  not_composable.cpp.o fibonacci.hpp fibonacci__struct.hpp bounded_vector.hpp algorithm glue_algorithm_defs.h  (743 ms)
  not_composable_with_cancel.cpp.o fibonacci.hpp fibonacci__struct.hpp bounded_vector.hpp algorithm glue_algorithm_defs.h  (659 ms)
  ...

7139 ms: /root/examples/rclcpp/composition/minimal_composition/include/minimal_composition/publisher_node.hpp (included 3 times, avg 2379 ms), included via:
  publisher_node.cpp.o  (4642 ms)
  standalone_publisher.cpp.o  (1266 ms)
  composed.cpp.o  (1230 ms)

6701 ms: /usr/include/c++/9/future (included 32 times, avg 209 ms), included via:
  member_function.cpp.o rclcpp.hpp executors.hpp  (380 ms)
  member_function.cpp.o rclcpp.hpp executors.hpp  (353 ms)
  member_functions.cpp.o  (322 ms)
  member_function_with_unique_network_flow_endpoints.cpp.o rclcpp.hpp executors.hpp  (305 ms)
  subscriber_node.cpp.o subscriber_node.hpp rclcpp.hpp executors.hpp  (300 ms)
  main.cpp.o rclcpp.hpp executors.hpp  (291 ms)
  ...

6585 ms: /usr/include/c++/9/string (included 32 times, avg 205 ms), included via:
  member_functions.cpp.o functional tuple array stdexcept  (426 ms)
  multithreaded_executor.cpp.o functional tuple array stdexcept  (341 ms)
  main.cpp.o memory unique_ptr.h tuple array stdexcept  (339 ms)
  ping.cpp.o condition_variable std_mutex.h system_error stdexcept  (324 ms)
  not_composable.cpp.o fibonacci.hpp fibonacci__struct.hpp bounded_vector.hpp algorithm glue_algorithm_defs.h functional tuple array stdexcept  (300 ms)
  member_functions.cpp.o functional tuple array stdexcept  (290 ms)
  ...

5424 ms: /opt/ros/galactic/include/example_interfaces/action/fibonacci.hpp (included 6 times, avg 904 ms), included via:
  not_composable.cpp.o  (1627 ms)
  not_composable_with_cancel.cpp.o  (1491 ms)
  not_composable.cpp.o  (820 ms)
  not_composable_with_feedback.cpp.o  (667 ms)
  member_functions.cpp.o  (542 ms)
  member_functions.cpp.o  (275 ms)
  ...

3 Likes

out of curiosity, to create this internal distro with CI/CD, basically what you do is something like this?

  1. fix certain ROS distro version. (this is basically not upgraded unless system update required for the product?)
  2. apply internal fixes to this internal repository. (developers need to build specific packages individually and test locally, this is the major problem?)
  3. Nightly build & Release for Application development.

Developers have also noticed some targets taking 10~15 mins longer to build locally.

this means that developers use actual target devices or boards to build the binary in local?

thanks,
tomoya

hi!

while i don’t have any benchmarks on hand to share, and our project was still considerably smaller when we were still on galactic, we did notice long build times.
We also used the clang build analyzer and arrived at create_subscription taking a long time to instantiate (especially since the same subscription is instantiated many times in separate nodes…).
I was already experimenting with wrapping the most common create_subscription calls in a library which all nodes link against, to at least only instantiate the call once, altough i had to postpone evaluating this idea in detail so far…

TLDR: We also suffer from long compile times, and found rclcpp to be a major contributor. Thanks for bringing this to everyones attention!

We build ROS 2 core packages from sources, fixing the version/hash for each repository, and then we deploy the built artifacts to an internal registry.
This is based on the ROS 2 Conan Integration we presented some time ago.
These packages are built very sporadically, let’s say less than once a week, as we rebuild them only when we cherry-pick new ROS PRs into our forks.
As said, these are the core packages, e.g. rclcpp, geometry2, etc. We treat these as any other third-party software that our applications use (such as boost, tensorflow, etc).

Then, we have our robot applications that depend on ROS 2, for which we are actively measuring the compile-time and trying to reduce it.
Differently from the ROS 2 core packages, the robot applications are built hundreds of times everyday (by all our developers, their PRs and our CI/CD jobs).
As mentioned in the OP by Jeffery, we notice 25%-50% increase in compile-time in most of our ROS 2 robotics applications if we replace ROS 2 Galactic with ROS 2 Humble.

Deploying the Humble update as-is today, would cost us a lot of extra-money spent on AWS and various infrastructures.
This is a deal-breaker and we will not be able to move to Humble until we get the compile-time back to reasonable numbers.

1 Like

@alsora thanks for the explanation, i see what the pain here.

i am not really sure how and where exactly to address this issue, but willing to help on this since we might have the same situation. in the mean time, i will keep my eyes on Compilation performance is rather slow · Issue #1949 · ros2/rclcpp · GitHub

thanks,

Side note: PickNik trimmed 5 minutes off a 45-minute Docker build time by disabling man-db in a Dockerfile

# Disable man-db to speed up installation (skip "Processing triggers for man-db")
# https://unix.stackexchange.com/a/53050
# We use a pipe in this stage and want to make sure pipe failures are caught
SHELL [ "/bin/bash", "-o", "pipefail", "-c" ]
RUN echo 'man-db man-db/auto-update boolean false' | debconf-set-selections

I believe this is most important when lots of debians are being installed.

1 Like

Apologize for the long delay.

We have been trying out a few ideas for reducing build time on our side, but unfortunately, none has produced noticeable improvements in our CI/CD environment.

We have also devoted energy to investigating the root cause of increasing build time. We add a few custom examples to ROS 2/example Galactic to amplify impacts related to subscription. The examples are then built against every single Rclcpp commit since Galactic while keeping the rest of the Packages Rolling.

We can easily make out 3 distinct jumps at, 2e4c97a, 679fb2b, and d8e1aed and a noticeable drop at 9c1c989 even with the noise in the data.

Among the three jumps, 2e4c97a’s impact is on an arguably different magnitude increasing the build time by ~18%. 2e4c97a happens to be the commit for Type Adaption which is one area we originally suspected in our original post. The drop at 9c1c989 corresponds to moving event callback binding to PublisherBase and SubscriptionBase. Both cases confirm that templates have a lot to do with build time changes. However, we can also see that build time trends upward over time with the rest of the commits. We have also tried building the examples against Galactic packages and the average build time is approximated 36s. It indicates that packages besides Rclcpp also contribute to the build time increase.

With our production CI/CD pipeline building internal projects, we see even more dramatic increases. In the worst-case scenario, we see build time increased by 40%. It shows that certain hierarchies can be impacted even more than the examples used above.

With the new data, we think

  1. Compilation metrics should be established and tracked for ROS-2 packages, especially the ones consuming ROS-2 core.
  2. Add build-time investigation and improvement to the roadmap for the next ROS-2 release. As demonstrated in the chart above, there are certainly some refactoring directions that can help reduce the increased build time.
9 Likes