We did our testing on the CPU side of things, but this may be related to "Reconsidering 1-to-1 mapping of ROS nodes to DDS participants". I can imagine that this mapping also influences the “size” of nodes. If the DDS part of the application is repeated for every single node (each node gets its own participant), I suppose this will increase the application's memory footprint as well.
Did you manage to conclude anything else from your research that might allow for optimization?