`ros2ai` next-gen command line interface

Hi ROS users,

I could allocate some spare time during thanks-giving holiday to come up with idea and implementation.
I would like to introduce ros2ai next-gen command line interface. (currently this is just a hobby and my personal project.)

please see demo, really easy to see what we can do

(https://github.com/fujitatomoya/ros2ai/assets/43395114/78a0799b-40e3-4dc8-99cb-488994e94769)

If you are interested, please reach out to me :smile:

I got some more ideas, so I will keep working on this :+1:

thanks,
Tomoya

6 Likes

This is a pretty good endeavour @tomoyafujita, thanks for sharing! This has potential in my opinion.

I find myself typing parts of my ROS graphs into foundational models’ chat-like UIs quite often lately. I find these great for troubleshooting through lots of ROS data and/or summarization (which I simply can’t digest). The extension you propose can bridge that gap very nicely and avoid having to copy and paste unneccesarily.

I don’t have much bandwidth these days for side projects but I’ll throw some ideas in case some in the community may want to grab and implement them on top of your disclosure:

  • I see there’s exec, query and status subverbs implemented for now. How about a init subverb which initiates a new conversation and packs the most representative (if not all) ROS graph data and passes it over for digestion, so that future queries take that into account.
  • Subverbs update and summarize come to mind as well, which could build upon the init one proposed above (and with a similar mindset).
  • Currently the proposed implementation makes use of the API, which is costly AFAIK. Something interesting would be to try out the hack we introduced back while developing PentestGPT. In a nutshell, instead of using the API, in earlier prototypes, we simulated a browser-interactions through an accepted/initiated COOKIE. We exposed this to the environmental variable CHATGPT_COOKIE instead (see here for a branch where this was prototyped). Extraction of the cookie could be automated (tested only in some browsers) easily.

You don’t suggest working around the Terms and Conditions of the service, do you, Victor? :slight_smile:

@vmayoral thank you so much for sharing your thoughts and ideas.

How about a init subverb which initiates a new conversation and packs the most representative (if not all) ROS graph data and passes it over for digestion, so that future queries take that into account.

yeah, that would be interesting. so that AI can help user more precisely and dedicated to the environment.

btw, i consider some ideas about session with context-aware mode, i gotta spend some more time to figure that out, instead of playing NintendoSwitch :computer: (should be PlayStation :sweat_smile: ) …

Subverbs update and summarize come to mind as well, which could build upon the init one proposed above (and with a similar mindset).

I see, this would be useful too, good idea.

Actually I am kinda inclined to have much less subcommands in the future but only ros2 ai, so that user just relies on calling simple commands without being bothered by sub-commands and options…

which is costly AFAIK

true, I personally use my own API key. I need funding :yen: :laughing:

it would be really nice if we could have community support for this, we can get more scaled data for ROS 2 and that could be really useful for ROS 2 community as well.

to be honest, the pain is not the cost right now, but latency. API is kinda lagging… i guess, i need to find some better fine tuning…

all that said, i just have some ideas but concrete, i will post the issues and slide to share more details :smile:

thanks.

AFAIK the terms don’t establish that you can’t input the data by other means-than-your-keyboard with their Free and Pro accounts, but I’d be happy to learn about it if that’s the case. Note that these accounts are limited to specific amount of interactions per hour, which is a significant restriction (as opposed to the unlimited APIs), but I believe fits this use case (I don’t see myself asking questions about my graph every minute after all). As I said, this is something I’m already doing manually and I’d be surprised other don’t.

What’s proposed is a reasonable use: A simple click-button tool that pre-fills the conversation for me with the right context sounds fair. Instead of dumping manually things, you’re just feeding the data programatically, but you can always turn to the UIs (e.g. your phone, browser, etc.) to continue with the querying. Also, given the competitive landscape of foundational language models being increasingly more capable and more accessible, I think it’s on their interest to facilitate use cases like this, specially with paid and more capable accounts.

Interesting. I’ve been paying attention to locally run smaller foundational models for various robotic use cases. This could overcome the limitation you’re observing. Maybe keeping plug-and-play (from a foundational models’ perspective) interface within ros2ai would be helpful?

1 Like

Special thanks to The Construct @ricardotellez about ros2ai podcast interview :exclamation:

It was really good opportunity to summarize thoughts, and expanding ideas with different perspective :+1:

I do need to study and learn more, but i will try to keep this up :rocket:

some updates (from last time)

thanks,