NASA JPL has open-sourced ROSA, an AI agent designed to interact with ROS-based systems using natural language queries.

Hi ROS Community! :wave:

I’m a Data Scientist, Software Engineer, and Robotics Integration Lead at NASA Jet Propulsion Laboratory. Over the past year, our team has been developing an AI agent that is fluent in ROS and can be used by developers of all skill levels.

ROSA (ROS Agent) comes equipped with all manner of tools, including ROS1 and ROS2 tools, math tools, and more. It can also be adapted to new robots with unique capabilities simply by extending the core agent with new tools and system prompts (see the Developer Docs and Custom Agents guides).
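To give a rough idea of what a custom agent looks like, here is a minimal sketch. The ROSA constructor arguments and the LangChain tool decorator reflect the docs at the time of writing (please verify against the Developer Docs), and the battery tool is purely hypothetical:

```python
# Minimal custom-agent sketch: extend ROSA with one robot-specific tool.
from langchain_core.tools import tool
from langchain_openai import ChatOpenAI  # any LangChain-compatible LLM works

from rosa import ROSA


@tool
def get_battery_level() -> str:
    """Report the robot's current battery level."""
    # Hypothetical robot-specific logic; replace with your own ROS calls.
    return "Battery at 87%"


llm = ChatOpenAI(model="gpt-4o", temperature=0)
agent = ROSA(ros_version=2, llm=llm, tools=[get_battery_level])
```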

Simply type in your query and ROSA will use its tools to satisfy it. Some example queries include:

  • “Give me a list of nodes, categorize them into navigation, perception, control, and other”
  • “Show me a list of topics that have publishers but no subscribers”
  • “Set the /velocity parameter to 1.5”
  • “Echo the /robot/status topic”
  • “What is the message type of the /robot/status topic?”
  • “Describe the turtlesim package and its dependencies.”
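
Each of these queries is a single call on the agent. For example, using the agent from the sketch above (invoke() is the method name shown in the README; verify it against the current docs):

```python
# Ask ROSA a question in plain English; it picks the right tools to answer.
response = agent.invoke("Show me a list of topics that have publishers but no subscribers")
print(response)
```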

Additionally, we added a quick and easy way to demo the agent’s capabilities by creating a custom agent for TurtleSim! This agent serves both as an easy-to-use demo (in Docker) and as a how-to guide for creating your own custom agents.

In the future, we plan to release agents for JPL’s Open Source Rover and the Boston Dynamics Spot robot, to name a few.

We would love to hear your opinions, and we are open to contributions from the community!

Check out the ROSA project on GitHub: https://github.com/nasa-jpl/rosa. ROSA helps robot developers inspect, diagnose, understand, and operate robots.

You can also follow me on X / Twitter for updates, the release schedule, and more!

15 Likes

I would be very excited to see this for the JPL Open-Source Rover and happy to help since I’m quite familiar with that project :smile:

What’s the major advantage of using a tool like this over just copy-pasting some files into, or asking such questions of, say, ChatGPT? Is it just easier because it can talk to the running ROS system directly, or do the custom plugins provide something you couldn’t achieve ‘manually’? I’m especially interested in this for more complicated robotic systems, for example manipulators with collision scenes. TurtleSim is, by design, quite simple and so might not demonstrate the power of using this tool.

1 Like

I’m familiar with your work :+1: I’m collaborating with Lan B. Dang to bring this functionality to OSR, and I think joining forces would be fantastic! :muscle:

ROSA offers several advantages over manual processes:

  • First, the agent can select from a wide range of tools to answer a query. It can invoke multiple tools for a single query and even execute them in parallel. The responses are then synthesized into a final result, or the agent can continue using additional tools if the query hasn’t been fully addressed. This is made possible by its “reasoning - action - observation” loop (a conceptual sketch of this loop appears near the end of this post).

  • You can also integrate custom tools for tasks beyond information retrieval. For example, we’ve developed a custom ROSA-Spot agent that can sit, stand, and even walk around. Essentially, we’ve given ROSA the same control capabilities you’d get with an Xbox controller (with plenty of safety measures in place to mitigate the risk of unwanted behavior). You can instruct it to “stand up, walk forward and to the left about 3 meters, then sit down,” and it will follow through. (short demo video attached to this post)

  • Additionally, ROSA can produce semi-structured output from fuzzy templates. For instance, the following system report was generated primarily by feeding diagnostics, environment variables, and a depth image into the model, along with system prompts and a very generic system report template (an illustrative template sketch follows below).


(By the way, I didn’t have to parse the response to create this report; the model’s output is rendered directly as Markdown.)
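
The actual template isn’t included here, but to illustrate the idea, a “fuzzy template” of this kind might look something like the following (entirely illustrative, not the real one; the placeholders tell the model what to fill in rather than prescribing an exact structure):

```python
# Illustrative fuzzy template (not the actual one used for this report).
# The model fills in the angle-bracket placeholders and may adapt the layout.
REPORT_TEMPLATE = """\
# System Report

## Overview
<one-sentence summary of overall system health>

## Diagnostics
<table of diagnostic statuses, most severe first>

## Environment
<relevant environment variables and their values>

## Perception
<description of the latest depth image, with estimated distances>
"""
```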

As you can see, ROSA is also multi-modal, capable of capturing depth images from a RealSense camera, describing what it observes, and even estimating distances.
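
For anyone curious what the “reasoning - action - observation” loop mentioned in the first bullet looks like, here is a conceptual sketch. This is not ROSA’s actual implementation (which builds on LangChain agents); the tool and field names are made up:

```python
# Conceptual ReAct-style loop: the LLM alternates between reasoning about the
# query, choosing a tool (action), and reading the result (observation),
# until it can produce a final answer.
def react_loop(llm, tools: dict, query: str, max_steps: int = 10) -> str:
    transcript = f"Question: {query}"
    for _ in range(max_steps):
        # Reasoning: ask the model what to do next, given everything so far.
        step = llm(transcript)  # e.g. {"action": "list_nodes", "input": "", "final_answer": None}
        if step.get("final_answer"):
            return step["final_answer"]
        # Action: run the tool the model chose, with the input it provided.
        observation = tools[step["action"]](step["input"])
        # Observation: append the result so the next reasoning step can see it.
        transcript += f"\nAction: {step['action']}\nObservation: {observation}"
    return "Step limit reached without a final answer."
```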

These are just a few examples of the more complex agents we’ve developed using ROSA. There’s so much more that can be done, and we’re just getting started :blush:

1 Like

Hi @RobRoyce!

Congratulations on your release! ROSA seems very similar to RAI, our open-source, multi-modal AI agent framework, which is the subject of an upcoming ROSCon talk this year.

Update:
We decided to go for a beta release (see https://github.com/RobotecAI/rai), with features such as a node for two-way voice communication with the robot, ROS 2 action calling, a real and a simulated robot demo, various ROS 2 tools, and log summaries. We are reaching out to collaborate with the ROSA team.

3 Likes