Handle unique parameters for robot instances

We have software with a lot of parameters and multiple versions of the same robot. Most of the parameters are “general”, meaning they are the same for all of the robots; some are “per series”, meaning all robots of a series share the same values (e.g. joint limits, feature flags, etc…); and some are “per instance”, meaning each physical robot has a unique value (e.g. calibration values).

I tried to architect something that gives us this flexibility without increasing the maintenance cost of changing a value.

We use ROS1 (for now). I created a hierarchy of YAML files and use YTT. YTT allows you to “code” inside the YAML files and, most importantly for us, to merge and override values.

It roughly looks like this:

./acme_config
  ./global
    ./domain.yaml
    ./limits.yaml
    ./controllers.yaml
  ./series
    ./1.0
      ./limits.yaml
    ./1.1
      ./limits.yaml
      ./domain.yaml
# On the robot itself
/etc/acme
  ./calibration.yaml
  ./limits.yaml

I run YTT on this to output a single YAML file that I load using the rosparam directive of roslaunch.

I’ve simplified a bit; for example, YTT allows two layers, one called “data-values” that can then be accessed in the more regular YAML files, which I try to use to keep the values separated from the “structure” of the parameters. Also, I can make it so the 1.1 series inherits the 1.0 series values.

What I don’t like about this architecture is that it adds distance between the parameters and their actual use in the rosnodes. Also, using the “data-values” I can handle backward compatibility, but it’s not always easy.

Has someone else encountered this problem? What are your findings? Any clue how ROS2 would make this easier/more difficult? I guess the Python launch files would add a bit of flexibility.

I am not certain, but don’t workspaces and namespaces solve this problem naturally?

1 Like

I don’t see how. What do you mean?
ROS namespaces allow multiple nodes to have separate parameters/topics/services and avoid conflicts. Catkin workspaces allow you to correctly handle the dependency chain between packages when building them.

What I want in the end is for each physical robot to have the proper parameters. Let’s say I have the following parameters: allowed_position_error, max_vel, odometry_calibration_factor.

I could create a .launch file for each physical robot and fill in the values. However, in the end allowed_position_error will probably be the same for 95% of my robots, max_vel will be the same for all robots of series 1.0 and another value for all robots of series 1.1, and only odometry_calibration_factor will be unique per physical robot. The system I described is meant to reduce the maintenance cost of this, so I can deploy the same software on all my robots and have the correct values.
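To make that concrete, here is a minimal Python sketch of the layering I have in mind (all names and values are made up):

# Hypothetical layers; later, more specific layers win.
global_params = {"allowed_position_error": 0.05, "max_vel": 1.0}
series_params = {
    "1.0": {"max_vel": 1.2},
    "1.1": {"max_vel": 1.5},
}
instance_params = {
    "robot-42": {"odometry_calibration_factor": 1.003},
}

def resolve(series, robot_id):
    # Start from the general values, then override per series and per instance.
    params = dict(global_params)
    params.update(series_params.get(series, {}))
    params.update(instance_params.get(robot_id, {}))
    return params

print(resolve("1.1", "robot-42"))
# -> {'allowed_position_error': 0.05, 'max_vel': 1.5, 'odometry_calibration_factor': 1.003}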

I understand your requirements better now; I thought this was a dependency-related issue, but it is not. I don’t have a ROS2 solution for this.

In ROS 1 it is easy. We have launch files with multiple rosparam load commands that load the config from the most general to the least general. It can even merge dicts naturally (but it can only add to them, not remove). The only problem is with lists - these get overwritten instead of appended.

The launch files have an arg which determines the robot id and generation from its hostname (but you could use e.g. env vars for that, too).
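Roughly, the merge behaves like this Python sketch (just an illustration of the semantics, not the actual rosparam code):

# Nested dicts are merged key by key; lists and scalars from the more
# specific file simply replace the earlier value.
def merge(base, override):
    result = dict(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(result.get(key), dict):
            result[key] = merge(result[key], value)
        else:
            result[key] = value
    return result

general = {"controller": {"gains": {"p": 1.0, "i": 0.1}, "joints": ["a", "b"]}}
specific = {"controller": {"gains": {"p": 2.0}, "joints": ["c"]}}
print(merge(general, specific))
# -> {'controller': {'gains': {'p': 2.0, 'i': 0.1}, 'joints': ['c']}}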

2 Likes

Your hierarchical approach is something I’ve seen used by several robotics companies, and I agree that it’s the right one, which is why I based our configuration management capability on it, too. We use a figure to explain it, which just illustrates part of the same hierarchy you already put in words.

Our capabilities all have their own web components you can embed anywhere on the web, and in this case that web component allows you to view and edit these layers of config files (yaml, json, or .env format). Whenever you make a change, it will propagate to all affected robots. So it’s similar to your approach but adds the ability to make changes remotely from the web. So far we only have two layers, fleet and robot, but we want to add the ability to specify additional ones soon.

Feel free to try it out. Would love to get your thoughts and hear what you think is still missing from it. DM me if you need help.

Thanks! I doubt this is compatible with our workflow (e.g. our robots are 100% offline) but I’ll have a look at the features and architecture.

EDIT: I misquoted peci1, I wanted to quote @chfritz

We use a similar hierarchy in ROS2. Our loading order is: default config for the node, robot-model-specific config, location-specific config, robot-instance-specific config. The last parameter loaded is the one used.

We use environment variables to define model, location, robot instance.

At the top of the ros2 launch file we define the files like this:

import os

location_specific_config = os.environ.get("LOCATION_SPECIFIC_CONFIG")
if not location_specific_config:
    error_message = "no location_specific_config defined so can not pick override file to launch"
    print(error_message)
    raise Exception(error_message)
location_override_config = os.path.join(
    "/etc/opt/acme/location",
    location_specific_config,
    "location_override.yaml",
)
print("Using location specific configs: " + location_override_config)

Then when we launch the nodes we pass all the config files:

acme_node_yaml = os.path.join(
    get_package_share_directory("acme_node"),
    "config",
    "config.yaml",
)
ld.add_action(
    Node(
        package="acme_node",
        executable="acme_node",
        name="acme_node",
        parameters=[
            acme_node_yaml,
            robot_model_override_config,
            location_override_config,
            robot_instance_override_config,
        ],
    )
)

It seems to work well for us but I’m sure it is not perfect! Would love feedback!

2 Likes

It would be nice if such hierarchical parameter loading were a built-in feature of ROS launch.

A new design and discussion for the parameter interface would be highly appreciated. My take is that Ansible’s inventory model is pretty good and should be considered in those discussions. Hierarchical host (robot) and group variables would fit robotics use cases very well, as seen in the cases above.

1 Like

I asked practically the same question here.

What I ended up going with (and have seen multiple companies use) is a form of the hierarchical parameter loading that the others have suggested. I’d be interested in hearing in more detail from someone who uses the cloud-based approach @smac mentioned in his answer (e.g. Ansible).

The built-in capability to pass overriding yaml files in the launch file (as mentioned by @johnjamesmiller) is very useful, but that’s only part of the puzzle.
The second part of the puzzle is how to store your config yaml files to satisfy the following requirements:

1- You can specify “Your robot has hardware version X, you are at customer Y, you are instance Z, etc… give me the right configuration for these constraints”.

2- You can easily deploy changes to config to all your robots, especially at the more general levels where multiple machines would need updating.

3- You can sync your config changes to your main code releases.

You could probably argue for more requirements but I would say these are the essentials. I’m interested in hearing from others what techniques they use to satisfy these requirements.

One solution could be to create different repositories for each config level, with one branch per e.g. robot instance. However, this solution in itself doesn’t really satisfy all the points above; e.g. for #2, you need some extra automation/tooling on top to efficiently deploy the config to a fleet.

1 Like

@tnajjar just thinking out loud - could your use case be solved by putting all your configs in one GitHub repo that Ansible would have access to? Use the same branch/tag release strategies as your code. The directory structure could be

  • models_specific_configs
    — a.yaml
    — t.yaml
  • location_specific_configs
    — n.yaml
    — e.yaml
  • robot_specific_configs
    — hostname-x.yaml
    — hostname-y.yaml

Using Ansible you could deploy as many or as few of the configs to each robot as you want. Robots would then only load the configs that match env variables set by Ansible.
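A rough Python sketch of what that robot-side selection could look like (the paths and env var names are just placeholders):

import os
import socket

# Hypothetical location Ansible deploys the configs to.
CONFIG_ROOT = "/etc/opt/myapp"

candidates = [
    os.path.join(CONFIG_ROOT, "models_specific_configs",
                 os.environ.get("ROBOT_MODEL", "") + ".yaml"),
    os.path.join(CONFIG_ROOT, "location_specific_configs",
                 os.environ.get("ROBOT_LOCATION", "") + ".yaml"),
    os.path.join(CONFIG_ROOT, "robot_specific_configs",
                 socket.gethostname() + ".yaml"),
]

# Only load the override files that were actually deployed to this robot.
override_configs = [path for path in candidates if os.path.isfile(path)]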

Thanks for the suggestion @johnjamesmiller, I don’t see any obvious shortcomings with this approach, but then again I have no experience with Ansible or similar, so it’s hard to tell. Maybe I’ll gather some experience, try that out in the next couple of months, and chime back in with my findings.

Nope, topics here are closed after 30 days of inactivity :frowning: So rather set a reminder to make a post every 25 days =)

1 Like

I’m hoping that the community will not let this topic die so easily :wink: . But seriously, I know it’s only been a couple of days, but I already expected more traction on this topic, since it’s an issue that any company quickly faces as soon as they have more than a couple of robots in the field at different customers.

1 Like

Loading the yaml files in sequence would already perform the parameter override as mentioned above. The location of the config file could be resolved relative to a ROS package (get_package_share_directory from ament_index_cpp/ament_index_python), or it could be a predefined location where the configurations are stored.

Whether each robot instance only has its own set of parameters or has access to the parameters of all the other robot instances, a predefined location or a ROS package would serve the purpose very well IMHO.

The rosparam loads that @peci1 uses and the ROS2 equivalent by @johnjamesmiller work, but I’d miss the flexibility of YTT data-values.

For example, I can set a given joint max velocity, and it will change this parameter for multiple nodes with different parameter names (e.g. MoveIt). I can also calculate some parameters from others (e.g. my max linear velocity given my max wheel velocity).

This increases maintainability. But maybe this is a bit off-topic?
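As a rough Python analogue of what the data-values layer gives me (the parameter names below are invented, and the real thing is done with YTT, not Python):

# Single source of truth for the raw values.
data_values = {"max_wheel_velocity": 10.0, "wheel_radius": 0.15}

# Derived parameter: computed from other values instead of hand-maintained.
max_linear_velocity = data_values["max_wheel_velocity"] * data_values["wheel_radius"]

# The same value fans out to several nodes under different parameter names.
params = {
    "move_base": {"max_vel_x": max_linear_velocity},
    "motion_planner": {"max_linear_velocity": max_linear_velocity},
}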

I advise against using Ansible for configuration management. Ansible is push-based, meaning all your robots need to be online at the moment you run the playbook. In practice, that is almost never the case once your fleet has grown past, say, 10 units. So then you spend a lot of time tracking down the ones you missed (not too much fun when you have 100+ robots).
The better approach is pull-based. That’s what we did in the Transitive capability, and it’s also the model Puppet uses. It’s declarative rather than procedural: you just specify the “should be” state and the agent running on the robot will check it and “make it so” (to quote Picard). This way, you can edit config whenever you want, and even robots that are offline will get it when they wake up.
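To illustrate the idea (this is not Transitive’s actual code; the URL and paths are made up), a pull-based agent is essentially a reconciliation loop:

import time
import urllib.request

# Hypothetical endpoint serving the desired ("should be") config state.
CONFIG_URL = "https://fleet.example.com/robot-42/config.yaml"
LOCAL_PATH = "/etc/opt/myapp/config.yaml"

def sync_once():
    desired = urllib.request.urlopen(CONFIG_URL).read()
    try:
        with open(LOCAL_PATH, "rb") as f:
            current = f.read()
    except FileNotFoundError:
        current = None
    if desired != current:
        with open(LOCAL_PATH, "wb") as f:
            f.write(desired)  # "make it so"

while True:
    try:
        sync_once()
    except OSError:
        pass  # offline right now; the robot catches up on the next cycle
    time.sleep(60)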

5 Likes

Ansible can be pull-based as well.
Generally speaking, it’s a nicer language than Puppet.

Thank you to @Hugal31, @tnajjar, @chfritz, and anyone I missed for laying out requirements!

Summarizing requirements

  1. Low maintenance - Configs that apply to one or many robots should only need to be changed once. Configs used in multiple places (joint max velocity applying to multiple joints) can be changed in one place. Configs can be calculated from other configs (max linear velocity given max wheel velocity).
  2. Get the correct config for Hardware X, Customer Y, instance Z
  3. Easily deploy changes to configs to all your robots. Not impacted by robots being offline. Requires either pull from robot or push with smart automatic retry.
  4. Sync config changes to your main code releases

Adding a few requirements of my own

  1. Versioned and traceable - Know what config was on what robot when
  2. Robust - auto rollback config and code if one fails.
  3. Scalable - deployment time does not increase linearly from 1 robot to X robots
  4. Testable - If it deploys in dev and test I know it will pass in production

What other requirements would you add?

4 Likes

What would people think about keeping the configs closer to the code during deployment?

Assuming you deploy using docker, I am imagining a process that, for each robot, generates a unique Dockerfile and pushes a unique image.

The Dockerfile would be pretty straightforward (placeholders like ${model} are filled in per robot by the generator):

FROM myapp:1.0
COPY model_specific_configs/${model}.yaml /etc/opt/myapp/model_specific_config.yaml
COPY customer_specific_configs/${customer}.yaml /etc/opt/myapp/customer_specific_config.yaml
COPY robot_specific_configs/${target_hostname}.yaml /etc/opt/myapp/robot_specific_config.yaml
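A hedged sketch of that per-robot generation step, assuming the Dockerfile above is stored as Dockerfile.template and you have some fleet inventory (both assumptions):

import subprocess
from string import Template

# Hypothetical fleet inventory; in practice this could come from Ansible, a DB, etc.
ROBOTS = [
    {"model": "a", "customer": "n", "target_hostname": "hostname-x"},
    {"model": "t", "customer": "e", "target_hostname": "hostname-y"},
]

with open("Dockerfile.template") as f:
    template = Template(f.read())

for robot in ROBOTS:
    tag = "myapp:1.0-" + robot["target_hostname"]
    # Fill in ${model}, ${customer}, ${target_hostname} for this robot.
    with open("Dockerfile", "w") as f:
        f.write(template.substitute(robot))
    subprocess.run(["docker", "build", "-t", tag, "."], check=True)
    subprocess.run(["docker", "push", tag], check=True)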

On the robot side - assuming you have some job or script to pull and apply latest images:

docker pull myapp:1.0

would turn into

docker pull myapp:1.0-${hostname}

Other management tools that deploy docker today could insert the target hostname.

Some challenges that come to mind:

  • pushing a bunch of big images - With the way I understand docker layers to work, I believe you would actually just be pushing the config layers.
  • number of tags - As of 2015 there was no documented limit to the number of tags - just some things that time out in docker and the Hub UI when they try to pull all tags for an image and you have >100 tags. I wonder if this is still an issue 9 years later?