Ran the code in composition mode (both depth_to_pointcloud and astra_driver as libs, using UniquePtr to pass the Image), and that didn’t seem to make much difference.
Switched back to 2-node mode and connected via Ethernet instead of Wi-Fi (disabled Wi-Fi, connected a dummy cable to another machine in the lab), and the performance was way better. So clearly that ~40% on wl was getting in the way.
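
For anyone wondering what the UniquePtr hand-off in composition mode actually buys: the idea is that the consumer takes ownership of the same Image instead of receiving a copy. Below is a minimal plain-C++ sketch of that idea, not the actual driver code. `Image`, `Consumer`, `on_image` and `handoff_reuses_buffer` are hypothetical names made up for illustration, and the 640x480 16-bit frame size is an assumption, not something measured here.

```cpp
#include <cstdint>
#include <memory>
#include <utility>
#include <vector>

// Hypothetical stand-in for the real image message; only the payload matters here.
struct Image {
  std::vector<std::uint8_t> data;
};

// Hypothetical consumer that takes ownership of each message it receives.
struct Consumer {
  std::unique_ptr<Image> last;

  void on_image(std::unique_ptr<Image> msg) {
    last = std::move(msg);  // ownership moves; the pixel buffer is not copied
  }
};

// Returns true if the consumer ends up with the very same buffer the producer filled.
bool handoff_reuses_buffer() {
  auto msg = std::make_unique<Image>();
  msg->data.assign(640 * 480 * 2, 0);  // assumed frame size: 640x480, 16-bit depth

  const std::uint8_t * before = msg->data.data();

  Consumer consumer;
  consumer.on_image(std::move(msg));

  // After the move the producer's pointer is null and the buffer address is unchanged.
  return msg == nullptr && consumer.last->data.data() == before;
}
```

Calling `handoff_reuses_buffer()` returns `true`, i.e. the hand-off itself moves no pixel data. That fits with composition not making much difference here: if the copy between the two nodes was never the expensive part, removing it would not be expected to help.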