Robotics Distributed System based on Kubernetes


where one node is in a container (inside k8s) and the other node is outside k8s.

Is this other node in the same cloud network? In other words, can the instances that run k8s and the instance that runs the other node connect to each other?

Not in the same network. Yes, the instance that runs the other node can connect. Here, a node means a ROS2 process. For example, I want to use the cloud to control a TurtleBot when the TurtleBot is connected to the cloud VPN.

I am not sure what exactly your network configuration is, but you could try deploying the pods with host networking enabled. The other node can then access the pod, since the pod is no longer running on the CNI network. A host network sample is here.
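To make the host network suggestion concrete, here is a minimal sketch that builds such a Pod manifest as a Python dict and prints it as JSON (which `kubectl apply -f` accepts as well as YAML). The pod name and image are placeholders chosen for illustration, not taken from the linked sample; `hostNetwork` and `dnsPolicy` are standard Pod spec fields.

```python
import json


def host_network_pod(name: str, image: str) -> dict:
    """Build a minimal Pod manifest that shares the node's network namespace."""
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {"name": name},
        "spec": {
            # The pod uses the host's network stack instead of the CNI overlay.
            "hostNetwork": True,
            # Keeps cluster DNS resolution working for host-network pods.
            "dnsPolicy": "ClusterFirstWithHostNet",
            "containers": [{"name": name, "image": image}],
        },
    }


# Placeholder name and image, for illustration only.
manifest = host_network_pod("ros2-talker", "ros:foxy")
print(json.dumps(manifest, indent=2))
```

With host networking, DDS traffic from the pod leaves through the node's own interfaces, so whether the outside node can reach it still depends on routing between the two networks.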

@tomoyafujita Hi Tomoya, we tried OpenVPN to connect ROS2 nodes in different networks but found that it is not working because OpenVPN doesn’t support multicast by default. Can I ask how you resolved this issue?


Can I ask how you resolved this issue?

Actually, I have not tried to connect edge and cloud sites via OpenVPN with a ROS2 application. Sorry, I do not have an answer right now.

Maybe this here would be handy…?

This is a newer FastRTPS release (currently not in any ros distro by default → so you need to build it yourself). It is designed for networks with no multicast support… So probably it would fit?

Please keep me updated on this. I’m very interested in this topic :slightly_smiling_face:

@flo @yjkim046

Yes, that probably works, but it is a server/client architecture (the parts cannot fail independently; of course, whether that matters depends on the use case).

I was thinking that,

  1. Fleet control is a single frontend server for administration. (This is confirmed by federated network via Kubernetes.)
  2. An edge cluster system connects devices directly within the edge site. (edge distributed system)
  3. The brain is in the cloud, and edge devices have access to the brain for intelligent processing.

2 and 3 are different architectures, but they co-exist. I wonder whether 2 and 3 could be mixed, with everything connected via the ROS2 application layer. We could do that, but we need to consider whether this is suitable for the application.

I get your point. With this you get a single point of failure again (basically ROS1)… If the server is missing the whole system would go bonkers…

I believe, that the (main) server from the DDS Discovery would need to be always inside the robot. So you still have an autonomous “basic” robot that can be extended with multiple resources but maintain integrity during offline phases…
You would then need a clever discovery algorithm that matches/combines these resources… which brings us back to multicast… damn… :smiley:

Maybe @Jaime_Martin_Losa or one of his colleagues from eProsima has a smart idea regarding this?


Hi @flo,

You can set up several redundant discovery servers.

See here:


Hi @Jaime_Martin_Losa,

Thanks for the fast heads up. But I’m unsure if this is a 100% fit. The use case would be that multiple discovery servers can be merged.

So we have, e.g., 3 servers like this:

Server A: nodeA, nodeB (inside local robot)
Server B: nodeC (inside factory network)
Server C: nodeD (cloud network)

Server A is inside the robot. But B might be in the onsite edge network and C in the cloud.
Ideally, I don’t want server C to include all nodes from multiple robots. It should just add its functionality to the DDS for the robot that requests access. It should even be reachable from multiple robots at the same time that do not see each other (e.g. fleet management).
The combinations A+B, A+C and A+B+C should also be possible.
At least, that is what I would wish for in my perfect networking world… :smiley:

Is it possible to have discovery servers with different nodes inside them and (re-)connect to them at once without switching between them?
Redundancy is typically a 1:1 mirror of the same functionality as you expect the system to work like before when one server instance fails.

Hi @flo,

Right, it is possible, and it is the regular behavior. When you connect to two discovery servers, you get the union of the information stored in both.
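As a rough sketch of what "connecting to two discovery servers" can look like from a client: this assumes a later ROS 2 release whose Fast DDS reads the `ROS_DISCOVERY_SERVER` environment variable (a semicolon-separated list of `host:port` entries); the FastRTPS release discussed in this thread may instead need XML configuration. The addresses for servers A, B and C are made up for illustration.

```python
import os

DEFAULT_PORT = 11811  # default Fast DDS discovery server port


def discovery_server_env(servers):
    """Join (host, port) pairs into a ROS_DISCOVERY_SERVER value.

    A port of None falls back to DEFAULT_PORT.
    """
    return ";".join(f"{host}:{port or DEFAULT_PORT}" for host, port in servers)


# Hypothetical addresses: A inside the robot, B in the factory network,
# C in the cloud (reached over the VPN).
servers = [("127.0.0.1", None), ("192.168.1.20", 11811), ("10.8.0.1", 11812)]

# Environment for a client process that should see the union of A, B and C.
env = dict(os.environ, ROS_DISCOVERY_SERVER=discovery_server_env(servers))
print(env["ROS_DISCOVERY_SERVER"])
```

The resulting `env` could be passed to a process launcher (for example `subprocess.run(..., env=env)`) when starting a node. A node that should only use A would get a list with the single entry for A.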



thank you for the follow up. This makes it much clearer to me now! :slight_smile:
Any chance you can provide us with an easy ROS2 example? My DDS knowledge is building up right now with the micro-ROS stack, but I’m far away from mastering it. So an example would speed things up for us without becoming a master in DDS.

The example would then probably answer the question of how I can manually register my application at my dedicated discovery server. If I do everything “manually”, I would avoid adding my local nodes to server C, and therefore other robots connecting to C would not see my nodes on server A…? :slight_smile:

I would highly appreciate if you can boost this development by supplying or pointing us into the right direction with a few small ROS2 examples! :star_struck:

Hi Tomoya,

close to you (in Tokyo) are my friends from Rapyuta Robotics that used k8s / Openshift to build a commercial robotic PaaS.
At ZHAW (in Zurich) we also build robotic applications using Kubernetes. We’ll be happy to exchange experiences.
Here’s a little demo video using orchestration to run a distributed navigation app:

@flo @Jaime_Martin_Losa

I would highly appreciate if you can boost this development by supplying or pointing us into the right direction with a few small ROS2 examples!

i would really appreciate it too!!!


Rapyuta Robotics that used k8s / Openshift to build a commercial robotic PaaS.

yes, i am aware of that.

they actually use OpenServiceBroker API to support Platform Broker.


I am willing to share experience too!! thanks,


I’ve played with ROS2 on k8s/OpenShift, including the discovery server.
SPDP makes assumptions about the Ethernet endpoint. I got around this using the downward API and got things working with a server. However, resilience is a point of concern: the underlying implementation assumes that IPs don’t change, and this assumption is invalid in container orchestration environments, where the same pod (think of it as a compute node) may come back to life (reboot) with a different IP address.
Additionally, the discovery server now behaves like a ROS1 master and needs its own resilience.
Another major show-stopper is the lack of support for DNS-based FQDNs. K8s does provide a stable network identity (IP and DNS) through the Service abstraction. However, this goes via DNSMASQ, and the pod-internal IP is not the same as the stable IP, which causes issues too.
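For readers unfamiliar with the downward API mentioned above, here is a minimal sketch of a container spec that exposes the pod's own IP as an environment variable. The variable name `POD_IP`, the container name and the image are placeholders; how the value was then fed into the DDS configuration is not described in the post and is left out here.

```python
import json


def container_with_pod_ip(name: str, image: str) -> dict:
    """Container spec that receives the pod IP through the downward API."""
    return {
        "name": name,
        "image": image,
        "env": [
            {
                # Placeholder variable name, chosen for illustration.
                "name": "POD_IP",
                # status.podIP is resolved by Kubernetes when the pod starts.
                "valueFrom": {"fieldRef": {"fieldPath": "status.podIP"}},
            }
        ],
    }


container = container_with_pod_ip("ros2-node", "ros:foxy")
print(json.dumps(container, indent=2))
```

Note that the value is fixed at container start, so it does not by itself address the concern about a pod returning with a different IP.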


Here is a minimal example using Fast-RTPS client-server discovery that I hope suits your needs:

ros2 client-server minimal example

To summarize, we introduce a fleet of two robots. Each robot has a sensor, actuator, and control nodes. The control node manages local behavior and receives commands from a fleet manager node.

In order to minimize network traffic, we want the sensor and actuator topics to be restricted to the robots. That is, only the sensors, actuators, and local control nodes exchange this type of message.

On the other hand, we want local control nodes to exchange command topics among them and the fleet server but never with sensors or actuators.
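The separation described above can be pictured with a toy model in plain Python (no DDS involved). It assumes one discovery server per robot plus one fleet server, and that two nodes can discover each other only if they are clients of a common server. The server and node names are invented for illustration, and the exact server assignment in the linked example may differ.

```python
# Which discovery servers each node is a client of (assumed layout).
CONNECTIONS = {
    "robot1/sensor": {"robot1_server"},
    "robot1/actuator": {"robot1_server"},
    "robot1/control": {"robot1_server", "fleet_server"},
    "robot2/sensor": {"robot2_server"},
    "robot2/actuator": {"robot2_server"},
    "robot2/control": {"robot2_server", "fleet_server"},
    "fleet_manager": {"fleet_server"},
}


def can_discover(a: str, b: str, connections=CONNECTIONS) -> bool:
    """Two nodes see each other only if they share at least one server."""
    return bool(connections[a] & connections[b])


def visible_from(node: str, connections=CONNECTIONS) -> set:
    """All other nodes that the given node can discover."""
    return {other for other in connections
            if other != node and can_discover(node, other, connections)}


print(sorted(visible_from("robot1/control")))
print(sorted(visible_from("fleet_manager")))
```

In this model the control nodes are the only bridge: they see their own robot's sensor and actuator as well as the fleet server's clients, while sensors and actuators never appear to the fleet manager or to the other robot.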


Hello @tomoyafujita, thanks for sharing this material. I wonder if there is an opportunity for us to collaborate on a POC based on this architecture.

W.r.t. deployment and maintenance (upgrade, rollback, monitoring, etc.) of distributed systems based on Kubernetes, KubeEdge is quite an interesting CNCF sandbox project.

KubeEdge is an open source system for extending native containerized application orchestration capabilities to hosts at Edge.

ATM it supports MQTT-based communication only. However, it could probably be a nice blueprint for ROS2-based distributed systems deployed to the edge as well.


happy to hear that, let’s have a quick talk about configuration and requirement! maybe there would be some area we could work together.