Generative AI in ROS 2 Codebases? Oh My!

I think it’s important to give maintainers the option to reject contributions outright as well. Not sure if that’s already a universal expectation across the repos in question. It would give a low-maintainer-effort solution to the problem of bad-faith submissions.

For these two, I see them as compromises from my original thoughts. While I think it’s nice to meet in the middle, I do wonder about the ramifications if we only have an approximate understanding of where the AI code exists. If we need to rip it out for legal reasons down the line, our only option would be to rip out entire PRs’ worth of new features, instead of being able to identify the specific functions and blocks that could be addressed more surgically.

Maybe that’s a worthwhile trade-off, but it’s something to note. Approximation makes it easier on the developer, but if we ever need to remove the code, we also remove a ton (potentially years’ worth) of features, fixes, and capabilities, which all go in full. Exactness is harder on the developer, but if the code needs to be removed, the exact sections to update can be worked through in a concentrated migration effort.

As long as that’s a tradeoff that we collectively think is worthwhile, then I don’t object to any of the specifics of these points.

While I agree in concept, I think setting bars like that is just asking for people to ignore the rule overall. Setting less specific requirements, like “high test coverage sufficient to promise correctness” or something along those lines, makes it more likely that it’ll actually be considered, I think. I would set the bar in Nav2 to be near-100%, but leaving it up to each repository’s policies might be good.

I do think, though, that an additional rule is worthwhile: AI can be used to write the tests for code or the code itself, but not both for the same lines being exercised. Otherwise, you’re testing code that you don’t even know does what you want it to do; you only know that it does whatever it does ‘correctly’.

I think that’s already a right that all maintainers have :slight_smile:

I added this.

If you can’t briefly describe in words the changes your PR is making to source code, then you almost certainly don’t understand the code itself. We’re not asking for an essay, just for a hand-written overview of the changes being proposed.

So, if necessary in the future, we can write a tool that goes through all PRs in a repository and finds those that mention a particular generative AI tool that the courts or new legislation have found to be impossible to use while still complying with the license of the repository. Those PRs can then be re-reviewed or rewritten.
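As a rough illustration of how small such a tool could be, here is a minimal sketch. It assumes the PR descriptions have already been fetched from wherever the repository is hosted; the `PullRequest` class, the `find_prs_mentioning` function, and the tool name `ExampleAssist` are all hypothetical, made up for this example.

```python
import re
from dataclasses import dataclass


@dataclass
class PullRequest:
    """Hypothetical, minimal stand-in for a PR record fetched from the hosting service."""
    number: int
    description: str


def find_prs_mentioning(prs, tool_name):
    """Return the numbers of the PRs whose description mentions tool_name."""
    # Match the name as a whole word, ignoring case, so "exampleassist" is also found.
    pattern = re.compile(r"(?<!\w)" + re.escape(tool_name) + r"(?!\w)", re.IGNORECASE)
    return [pr.number for pr in prs if pattern.search(pr.description)]


# Made-up PR descriptions; "ExampleAssist" is not a real tool.
prs = [
    PullRequest(1, "Adds a costmap filter. Generated with ExampleAssist, roughly half of the code."),
    PullRequest(2, "Fixes a typo in the README. Written by hand."),
    PullRequest(3, "Refactors the planner; exampleassist was used for the unit tests."),
]

flagged = find_prs_mentioning(prs, "ExampleAssist")
print(flagged)
```

A search like this only works if the tool name is actually written in the PR description, which is the reason for asking for it up front.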

Since not all repositories use squash-commits for merging PRs, I added “merged” to make it clear that it’s the commit messages that actually get merged into the main branch that need the note.

There is no difference, which is why I would expect a line of source written using auto-complete to be tested, just as I expect the same now for someone using plain old tag-based autocomplete tools.

Exactly what it says. If your code contains untestable source in it, that’s a red flag for future maintenance.

If we’re unhappy with 100%, we could specify “the level of coverage required by the project” to allow each project to define its own coverage criteria.

No, because manual testing cannot be automated in a CI system.

I added this exception to the documentation one.

If the PR submitter is clearly lying, then the maintainer is free to reject the PR out-of-hand. They’ve always had that freedom, and always will, and will not abuse it if they want to stay as a maintainer for a project they don’t own. If the PR submitter is lying because they can’t be bothered to follow the project’s procedures, then their commitment to following other procedures unrelated to generated code or documentation (such as writing good tests) also comes into doubt.

Doing in-person code walkthroughs is a common practice as part of reviewing large and complex PRs, such as major new features, especially for organisations with multiple developers working on a team. It’s a practice that should be more frequently adopted in open-source development now that we have all these amazing, free teleconference tools. I think it’s fair to make it known to submitters beforehand that they may be asked to meet with a maintainer to do a code walkthrough, especially if their PR is complex and contains generated code.

It’s clearly a mix of generated and hand-written code, and so the PR description would document that by saying what tool was used, what it was used to generate, and approximately how much of the code was not written by hand.

I agree that it wouldn’t be as easy to rip out legally-compromised code cleanly, and that worries me, too, but I think it’s important to keep the process smooth for submitters using these tools in the face of a hypothetical, rather than taking the extreme approach. It’s a trade-off, like you say.

This is an important point and worth consideration, I think.


Hmm, it seems JetBrains goes a bit against the effort of marking lines/PRs with the names of the LLMs used:

The JetBrains AI service provides significant flexibility in terms of the models we can offer as part of AI Assistant. As we are not locked into using any specific vendor, we will be able to evolve our use of models as technology advances in this rapidly changing area. This gives us the ability to choose the best model or approach to solve your problem.

Thanks for the changes, @gbiggs! I much prefer this version.

I’d very much prefer that. I love high test coverage, though sometimes reality just doesn’t like ideologies.

I’m fine with keeping the possibility, just saying that it might be a large burden for maintainers, especially if they’re not doing it as a full-time job. There’s also a difference between a member of your organization, with whom you can expect some common ground (and maybe a close timezone), and a random person on the internet.

Just thinking, if we use “natural” English, we might need an LLM-based tool to find the problematic commits if such a need for a rollback arises.
A standard syntax as proposed by @Katherine_Scott might be nice, but then we’ll have to keep a curated list of options to fill in, and also make tools that check for typos to prevent unparsable text.

Either way, we’ll need to invest quite some effort to make it “usable” in case such need arises.
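To get a feel for the effort involved in the standard-syntax route, here is a minimal sketch of a checker that validates against a curated list and suggests corrections for typos. Everything specific in it is an assumption made for illustration: the `Generated-by:` trailer format, the tool names in `KNOWN_TOOLS`, and the `check_trailers` function are hypothetical and not part of any proposal in this thread.

```python
import difflib
import re

# Hypothetical curated list of tool names; a real list would have to be maintained somewhere.
KNOWN_TOOLS = ["ExampleAssist", "SampleCoder", "DemoPilot"]

# Hypothetical trailer syntax: one "Generated-by: <tool>" line per tool used.
TRAILER = re.compile(r"^Generated-by:[ \t]*(?P<tool>\S.*?)[ \t]*$", re.MULTILINE)


def check_trailers(message):
    """Return (recognised_tools, problems) for one commit message or PR description."""
    recognised, problems = [], []
    for match in TRAILER.finditer(message):
        tool = match.group("tool")
        if tool in KNOWN_TOOLS:
            recognised.append(tool)
            continue
        # Not on the curated list: look for a close match to catch likely typos.
        guess = difflib.get_close_matches(tool, KNOWN_TOOLS, n=1, cutoff=0.8)
        if guess:
            problems.append(f"unknown tool '{tool}', did you mean '{guess[0]}'?")
        else:
            problems.append(f"unknown tool '{tool}'")
    return recognised, problems


ok_tools, ok_problems = check_trailers("Add keepout filter\n\nGenerated-by: ExampleAssist\n")
typo_tools, typo_problems = check_trailers("Fix planner\n\nGenerated-by: ExampleAsist\n")
print(ok_tools, ok_problems)
print(typo_tools, typo_problems)
```

A check like this could run in CI on the PR description or on the merged commit messages, so that typos are caught at submission time rather than years later when someone needs to search for them.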

Now that I think of it, maybe it’ll be nice to have a “requirements” section and a “recommendations” section, especially for project maintainers.

E.g. from the current proposal by @gbiggs:
Requirement: To cite where (and how) AIs are used with a specified format, to avoid potential lawsuits.
Recommendation: Everything else. Maintainer sanity, project management, or ISO certifications. But it’s up to the maintainer to choose what to apply to their project.

Also, there’s an elephant in the room: if a few years later we have to change or remove a great chunk of AI generated code, is that even possible?
Imagine that, 2 LTS releases later, we have to change or remove 20% of the code added since the adoption of this policy. (Basically ROS2 L → Iron) While this is a worst-case scenario (I think tech megacorps will defend against such a lawsuit heavily), do we even have the resources to do it?
BTW, in this scenario pretty much all other open source projects will suffer from the same lawsuit, so developers will be very busy.

And finally: is it possible to bring a lawyer specialized in source code copyright to give us some general directions, or at least a peer review? @smac mentioned Bosch and Samsung having a focus on licensing and compliance, maybe they can help?

This is exactly why it’s so important to have this policy. It could be so much worse without one, if we can’t trust anything and everything is polluted with contributions that carry lawsuit potential.

Yeah, it sucks and it would put us back… say… 2 years in your example as a community / industry. This is one of the main reasons I do not use generative AI in my programming, nor would I recommend that anyone else does, until this becomes settled law and we know that we’re not actually making open-source implode due to liability concerns. There’s a real possible future in which generative AI code is so embedded in major open-source projects like ROS that, for legal reasons, major companies can’t touch them with a 10-foot pole, resulting in them all dying due to licensing issues. That’s not so far-fetched. The best-case scenario, if all of this work turns out to be illegal, is that we’re only put back 2 years. I hope that’s not the case, but from what I’m reading of the current lawsuits, they definitely have a point.

It seems somewhat careless to think it’s too much of a burden to include some basic details about what they’re doing when the repercussions are this massive. Not all tools are the right tools for a particular job: Open-Source + Generative AI doesn’t seem to be there – yet.

I’m sure that’ll be part of the steps here, but keep in mind this isn’t settled law and it’s very complicated. It’s not something on which we as the ROS community are going to be able to get the definitive this-is-the-answer answer when there are large numbers of lawsuits still pending on this topic. If it was clear to anyone what the answer was, we wouldn’t be having this discussion :upside_down_face: The best we can do is come up with policies about how we’re going to handle it with an eye towards what we’d do in the worst-case scenario. If it becomes settled law that this is A-OK, then we can revisit and relax the requirements.

The idealist part of me agrees, but the other part feels that the genie is already out of the bottle. Meaning:

  1. It is already used too widely
  2. There will always be people or communities ignoring it (cough China cough)
  3. The discrepancy between those who are using it and those who aren’t will only become larger

But then again, that’s not the point of this thread, nor what I want to see as a conclusion from the ROS community.

Yes I know, I was just thinking what we realistically need in case of contingency. It might be better to bring a specialist early on, than to speculate. At the very least, I am not a lawyer and don’t qualify to be quoted in this matter :wink:.
