Is Data oriented programming relevant?

Hi, I just heard of data-oriented programming while watching a talk on Rust game dev, and I was wondering whether it is relevant here. We have been using OOP in our project, and as far as I understood, data-oriented programming avoids loading too much stuff into your CPU cache.
Do you guys also use OOP? Why? And if you have used data-oriented programming in the past, could you give a comparison?

Do you guys also use OOP

Well, of the core client libraries (roscpp, rospy, and roslisp) two out of three are object oriented, and those two are the two most widely used client libraries. Many of the non-core libraries are object-oriented, too, and the concept of messages in ROS is very object-oriented-friendly. So I would be confident in saying the vast majority of folks are using object-oriented programming for ROS.

Why and if you have been using Data oriented programming in the past could you give a comparison

I haven’t used data-oriented programming explicitly, but only because I haven’t felt a need to. It feels more like a closer-to-hardware optimization strategy to me, so while there are some interesting use cases in ROS (maybe the hardware acceleration or MicroROS folks would find it valuable?), I’m comfortable making the assumption that I have the computation resources I need and I can focus on solving the robotics problem.

Why and if you have been using Data oriented programming in the past could you give a comparison?

One place where data-oriented programming can be compared with object-oriented
programming is the waitset.

The waitset is a variation of the reactor pattern
in which one waits for a multitude of events in a single blocking call. When the
call returns, it provides a list of all events that have occurred.
Actually, the waitset is also state-based, but for fun and simplicity
let's assume it just listens to events.

A simple implementation could use an array inside the waitset where all
objects are stored. Whenever an event is signaled, one iterates over the whole
array, collects the objects that signaled the event, and returns them to the user.

The code could look like:

for(auto & object : arrayOfObjects) {
  if ( object.hasEventOccurred ) {
    addToEventArray(object);
  }
}

The problem is that we are only interested in one bool of the underlying
object (hasEventOccurred), but arrayOfObjects also stores all the other
object data. When we iterate over arrayOfObjects, we
load a lot of variables into the CPU cache that we do not actually need, and
this can be incredibly expensive.
Take a look at the Latency Numbers Every Programmer Should Know.

Just assume such an object has a size of several hundred bytes and the array
contains 1,000 objects. Then the CPU has to perform a lot of very expensive
loads into the CPU cache.

A smarter way would be to use a struct of arrays instead of an array of structs
(see AoS_and_SoA).
So we create an objectArray in which every member is stored in its own array inside
the struct, and our code looks like:

uint64_t objectArraySize = objectArray.size;
for(uint64_t objectId = 0; objectId < objectArraySize; ++objectId) {
  if ( objectArray.hasEventOccurred[objectId]) { // every member is an array not the objectArray itself
    addToEventArray(objectId);
  }
}

Now we only return a list of objectIds to the user and the user can decide on
which data they want to start working.
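
Spelled out a bit further, the struct-of-arrays layout might look like the sketch below. The `ObjectArray` type, its `payload` member, and the `addObject` and `collectSignaled` helpers are hypothetical names made up for illustration; only `hasEventOccurred` and the idea of returning object IDs come from the snippet above. The flags are stored as `unsigned char` because `std::vector<bool>` is bit-packed and behaves differently from a plain array of bools.

```cpp
#include <array>
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical struct of arrays: one array per member, indexed by objectId.
struct ObjectArray {
  std::vector<unsigned char> hasEventOccurred;          // hot: scanned on every wake-up
  std::vector<std::array<unsigned char, 255>> payload;  // cold: stand-in for all other members

  // Adds one object with a cleared flag and returns its objectId.
  std::uint64_t addObject() {
    hasEventOccurred.push_back(0);
    payload.emplace_back();
    return hasEventOccurred.size() - 1;
  }

  std::uint64_t size() const { return hasEventOccurred.size(); }
};

// Scans only the contiguous flag array and returns the IDs of signaled objects.
inline std::vector<std::uint64_t> collectSignaled(const ObjectArray& objectArray) {
  // Invariant of this layout: all member arrays have the same length.
  assert(objectArray.hasEventOccurred.size() == objectArray.payload.size());
  std::vector<std::uint64_t> eventArray;
  for (std::uint64_t objectId = 0; objectId < objectArray.size(); ++objectId) {
    if (objectArray.hasEventOccurred[objectId] != 0) {
      eventArray.push_back(objectId);
    }
  }
  return eventArray;
}
```

The scan never touches `payload`, so the cold data only gets pulled into the cache when the user looks up one of the returned IDs.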

With the same assumptions as above, we may have to load the hasEventOccurred
array into the CPU cache only once and can then work on it. The struct of arrays
approach can speed up such algorithms by a factor of 100 (yes, factor!), and this
can become important when you have a large system with many events.
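
As a rough back-of-the-envelope check of where a gain of that order can come from, the sketch below counts the cache lines each layout touches during one scan. The 64-byte cache line, the 256-byte object size, and the one-byte flag are assumptions picked for illustration (the text above only says "several hundred bytes"). The count ignores hardware prefetching and the cost of collecting the results, so measured speedups will differ.

```cpp
#include <cstddef>

constexpr std::size_t kCacheLineBytes = 64;  // assumption: common cache line size
constexpr std::size_t kObjectBytes = 256;    // assumption: "several hundred bytes"
constexpr std::size_t kFlagBytes = 1;        // assumption: one byte per flag
constexpr std::size_t kObjectCount = 1000;   // from the example above

// Number of cache lines needed to cover `bytes` contiguous bytes, rounded up.
constexpr std::size_t cacheLinesFor(std::size_t bytes) {
  return (bytes + kCacheLineBytes - 1) / kCacheLineBytes;
}

// Array of structs: objects are at least one cache line apart, so no two flags
// share a line and every flag check loads its own line.
static_assert(kObjectBytes >= kCacheLineBytes, "flags must not share a cache line");
constexpr std::size_t kAosLines = kObjectCount;

// Struct of arrays: the flags are contiguous, so 1000 bytes fit in a few lines.
constexpr std::size_t kSoaLines = cacheLinesFor(kObjectCount * kFlagBytes);

static_assert(kAosLines == 1000, "one cache line per object");
static_assert(kSoaLines == 16, "1000 bytes / 64 bytes per line, rounded up");
```

Under these assumptions the flag scan touches roughly 60 times fewer cache lines (1000 versus 16), and the ratio grows with larger objects or when the flags are packed into bits.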

But this also has its drawbacks. Such designs are usually harder to maintain, since they
partially violate the Single Responsibility Principle,
so one tries to mix object-oriented programming with data-oriented programming. For instance,
in the case of this WaitSet example, one could keep all the object members
together in the class and just move hasEventOccurred out into a separate
array, since this is the bottleneck.
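
A minimal sketch of that mixed approach, assuming objects are addressed by a stable index: the class keeps its members together, and only the flag moves into a separate contiguous array owned by the wait set. `Subscriber`, `MiniWaitSet`, `attach`, `signal`, `takeSignaled`, and `object` are hypothetical names and not the iceoryx API. The sketch is single-threaded and leaves out the blocking wait entirely.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical object: an ordinary class whose members stay together.
class Subscriber {
 public:
  explicit Subscriber(std::string topic) : topic_(std::move(topic)) {}
  const std::string& topic() const { return topic_; }

 private:
  std::string topic_;
};

// Hypothetical wait set: only the hot flag is pulled out into its own array.
class MiniWaitSet {
 public:
  // Registers an object and returns the index used to signal it later.
  std::size_t attach(Subscriber subscriber) {
    objects_.push_back(std::move(subscriber));
    hasEventOccurred_.push_back(0);
    return objects_.size() - 1;
  }

  // Marks the object with the given index as having an event pending.
  void signal(std::size_t objectId) {
    assert(objectId < hasEventOccurred_.size());
    hasEventOccurred_[objectId] = 1;
  }

  // Scans only the flag array, resets the flags, and returns the signaled IDs.
  std::vector<std::size_t> takeSignaled() {
    std::vector<std::size_t> signaled;
    for (std::size_t id = 0; id < hasEventOccurred_.size(); ++id) {
      if (hasEventOccurred_[id] != 0) {
        signaled.push_back(id);
        hasEventOccurred_[id] = 0;
      }
    }
    return signaled;
  }

  // Gives access to the full object once the caller knows which ID fired.
  const Subscriber& object(std::size_t objectId) const {
    assert(objectId < objects_.size());
    return objects_[objectId];
  }

 private:
  std::vector<Subscriber> objects_;              // cold: full objects, OOP style
  std::vector<unsigned char> hasEventOccurred_;  // hot: one byte per object
};
```

The object-oriented interface stays intact for everything except the flag, which limits the maintenance cost to the one member that actually sits on the hot path.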

If you would like to dig into a real life implementation, take a look at the iceoryx
waitset: https://github.com/eclipse-iceoryx/iceoryx/blob/master/iceoryx_posh/include/iceoryx_posh/popo/wait_set.hpp
We use a condition_listener.hpp and condition_notifier.hpp to signal events
and they share the condition_variable_data.hpp, see: https://github.com/eclipse-iceoryx/iceoryx/tree/master/iceoryx_posh/include/iceoryx_posh/internal/popo/building_blocks
