Hey folks, over the course of 2025 it’s become clear that we need policy tools to help maintainers push back against unwanted AI-generated content (a.k.a. slop) in our project. We have a 64-comment thread on that already.

I’ve created a draft policy that describes how we expect contributors to use AI tools and gives guidelines for how maintainers and moderators should handle unwanted contributions. I’ve pasted the markdown inline below as part of this Discourse post, but it’s also available rendered on GitHub or in the associated PR.

The policy text contains chunks of text from the Fedora proposal for AI policy, which was pretty darn good, plus my own recent addition of the guideline that new contributors should keep patches small. I settled on the number of 150 lines of non-test code additions to give us an objective standard, but that number is completely arbitrary, informed only by eyeballing recent diffstats on main.
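
For the curious, here is a rough sketch (not part of the policy text) of how one might count a branch’s non-test insertions against that guideline. It shells out to git diff --numstat against an assumed origin/main merge base and naively treats any path containing "test" as test code, which is only an approximation of what a reviewer would actually count:

```python
#!/usr/bin/env python3
"""Rough sketch: count non-test line insertions on the current branch.

Illustration only, not an official tool. "Non-test" here just means the
path does not contain "test", which is an approximation.
"""
import subprocess
import sys

GUIDELINE = 150  # suggested soft limit from the draft policy, not a hard rule


def non_test_insertions(base: str = "origin/main") -> int:
    """Sum added lines reported by `git diff --numstat`, skipping test paths."""
    out = subprocess.run(
        ["git", "diff", "--numstat", f"{base}...HEAD"],
        capture_output=True, text=True, check=True,
    ).stdout
    total = 0
    for line in out.splitlines():
        added, _deleted, path = line.split("\t", 2)
        if "test" in path.lower():
            continue  # skip test code, per the guideline
        if added != "-":  # "-" marks binary files in --numstat output
            total += int(added)
    return total


if __name__ == "__main__":
    added = non_test_insertions()
    print(f"{added} non-test lines added (guideline: {GUIDELINE})")
    sys.exit(0 if added <= GUIDELINE else 1)
```

Run from a branch based on main; a non-zero exit just means the patch exceeds the suggested guideline, not that it is unacceptable.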

I’d love to get your high-level feedback on the RFC, and any low-level text edits on the PR.


LLVM AI Tool Use Policy

LLVM’s policy on AI-assisted tooling is fundamentally liberal: we want to
enable contributors to use the latest and greatest tools available. However,
human oversight remains critical. The contributor is always the author and is
fully accountable for their contributions.

  • You are responsible for your contributions. AI-generated content must be
    treated as a suggestion, not as final code or text. It is your responsibility
    to review, test, and understand everything you submit. Submitting unverified
    or low-quality machine-generated content (sometimes called “AI slop”)
    creates an unfair review burden on the community and is not an acceptable
    contribution. Contributors should review and understand their own
    submissions before asking the community to review their code.

  • Start with small contributions: Open source communities operate on trust
    and reputation. Reviewing large contributions is expensive, and AI tools tend
    to generate large contributions. We encourage new contributors to keep their
    first contributions small, specifically below 150 additional lines of
    non-test code insertions, until they build personal expertise and maintainer
    trust before taking on larger changes.

  • Be transparent about your use of AI. When a contribution has been
    significantly generated by an AI tool, we encourage you to note this in your
    pull request description, commit message, or wherever authorship is normally
    indicated for the work. For instance, use an “Assisted-by:” commit message
    trailer. This transparency helps the community
    develop best practices and understand the role of these new tools.

  • LLVM values your voice. Clear, concise, and authentic communication is
    our goal. Using AI tools to translate your thoughts or overcome language
    barriers is a welcome and encouraged practice, but keep in mind, we value your
    unique voice and perspective.

  • Limit AI tools for reviewing. As with creating code, documentation, and
    other contributions, reviewers may use AI tools to assist in providing
    feedback, but not to wholly automate the review process. In particular, AI
    should not make the final determination on whether a contribution is accepted
    or not. The same principle of ownership applies to review comment
    contributions as it does to code contributions.

This policy extends beyond code contributions and includes, but is not limited
to, the following kinds of contributions:

  • Code, usually in the form of a pull request
  • RFCs or design proposals
  • Issues or security vulnerabilities
  • Comments and feedback on pull requests

Extractive Changes

Sending patches, PRs, RFCs, comments, etc. to LLVM is not free – it takes a
lot of maintainer time and energy to review those contributions! We see the act
of sending low-quality, un-self-reviewed contributions to the LLVM project as
“extractive.” It is an attempt to extract work from the LLVM project community
in the form of review comments and mentorship, without the contributor putting
in commensurate effort to make their contribution worth reviewing.

Our golden rule is that a contribution should be worth more to the project
than the time it takes to review it. These ideas are captured by this quote
from the book Working in Public by Nadia Eghbal:

"When attention is being appropriated, producers need to weigh the costs and
benefits of the transaction. To assess whether the appropriation of attention
is net-positive, it’s useful to distinguish between extractive and
non-extractive contributions. Extractive contributions are those where the
marginal cost of reviewing and merging that contribution is greater than the
marginal benefit to the project’s producers. In the case of a code
contribution, it might be a pull request that’s too complex or unwieldy to
review, given the potential upside." -- Nadia Eghbal

We encourage contributions that help sustain the project. We want the LLVM
project to be welcoming and open to aspiring compiler engineers who are willing
to invest time and effort to learn and grow, because growing our contributor
base and recruiting new maintainers helps sustain the project over the long
term. We therefore automatically post a greeting comment to pull requests from
new contributors and encourage maintainers to spend their time to help new
contributors learn.

Handling Violations

If a maintainer judges that a contribution is extractive (i.e. it is
generated with tool-assistance or simply requires significant revision), they
should copy-paste the following response, add the extractive label if
applicable, and refrain from further engagement:

This PR appears to be extractive, and requires additional justification for
why it is valuable enough to the project for us to review it. Please see
our developer policy on AI-generated contributions:
https://llvm.org/docs/AIToolPolicy.html

Other reviewers should use the label to prioritize their review time.

The best ways to make a change less extractive and more valuable are to reduce
its size or complexity or to increase its usefulness to the community. These
factors are impossible to weigh objectively, and our project policy leaves this
determination up to the maintainers of the project, i.e. those who are doing
the work of sustaining the project.

If a contributor responds but doesn’t make their change meaningfully less
extractive, maintainers should escalate to the relevant moderation or admin
team for the space (GitHub, Discourse, Discord, etc) to lock the conversation.
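
Purely as an illustration, the canned reply and label above could also be applied with a small script against the GitHub REST API; the repository, token handling, and the script itself are assumptions rather than prescribed tooling, since the policy only asks for a manual copy-paste:

```python
#!/usr/bin/env python3
"""Illustrative only: post the canned reply and add the label to a PR.

The repository name and token handling are assumptions; the label name
matches the draft policy above.
"""
import os
import sys

import requests  # third-party: pip install requests

REPO = "llvm/llvm-project"  # assumed target repository
LABEL = "extractive"        # label name used in the draft above
CANNED_REPLY = (
    "This PR appears to be extractive, and requires additional justification "
    "for why it is valuable enough to the project for us to review it. "
    "Please see our developer policy on AI-generated contributions: "
    "https://llvm.org/docs/AIToolPolicy.html"
)


def flag_extractive(pr_number: int) -> None:
    """Post the canned comment, then apply the label, via the GitHub REST API."""
    headers = {
        "Authorization": f"Bearer {os.environ['GITHUB_TOKEN']}",
        "Accept": "application/vnd.github+json",
    }
    # Pull requests share the issues endpoints for comments and labels.
    base = f"https://api.github.com/repos/{REPO}/issues/{pr_number}"
    requests.post(f"{base}/comments", json={"body": CANNED_REPLY},
                  headers=headers, timeout=30).raise_for_status()
    requests.post(f"{base}/labels", json={"labels": [LABEL]},
                  headers=headers, timeout=30).raise_for_status()


if __name__ == "__main__":
    flag_extractive(int(sys.argv[1]))
```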

Copyright

Artificial intelligence systems raise many questions around copyright that have
yet to be answered. Our policy on AI tools is similar to our copyright policy:
Contributors are responsible for ensuring that they have the right to
contribute code under the terms of our license, typically meaning that either
they, their employer, or their collaborators hold the copyright. Using AI tools
to regenerate copyrighted material does not remove the copyright, and
contributors are responsible for ensuring that such material does not appear in
their contributions. Contributions found to violate this policy will be removed
just like any other offending contribution.

Examples

Here are some examples of contributions that demonstrate how to apply
the principles of this policy:

  • This PR contains a proof from Alive2, which is a strong signal of
    value and correctness.
  • This generated documentation was reviewed for correctness by a
    human before being posted.

References

Our policy was informed by experiences in other communities:

20 Likes

Hi Reid,

I admit that I didn’t read all the posts on the megathread, but my 2c is that this issue is real. I think your draft nicely encapsulates a balanced set of feedback and expectations for the community. I’m generally +1 on it. Thank you for helping lift the conversation here,

-Chris

First of all, thank you so much for working on this. I have been struggling myself to keep up with the thread, so I appreciate your efforts here.

It is a lot to digest, but at least on my first run-through, I feel like this really hits all the main concerns while striking a balance between being firm and welcoming.

Repeating myself from the other thread.

I’ll go against the grain and say that I do not agree with the referenced Fedora policy, and would not support its adoption for LLVM (though I acknowledge that Reid’s rephrasing of it is better than the original). I strongly oppose any policy that frames machine-generated contributions as encouraged practices, as opposed to something that is tolerated under certain circumstances. The wording of that policy prioritizes welcoming AI-assisted contributions over duty-of-care to the users of our software, which to me seems unacceptable.

More broadly, it is not clear to me that there is actually a consensus in [the other] thread on what the right direction to go is. I saw strong opinions expressed on both sides, as well as a pretty significant rate of messages being hearted by non-participants.

Without a clear consensus on the desired policy, I don’t think we’re ready for word-smithing.

3 Likes
  • Start with small contributions: Open source communities operate on trust
    and reputation. Reviewing large contributions is expensive, and AI tools tend
    to generate large contributions. We encourage new contributors to keep their
    first contributions small, specifically below 150 additional lines of
    non-test code insertions, until they build personal expertise and maintainer
    trust before taking on larger changes.

One of the suggestions on the previous thread that I think deserves serious consideration is to discourage AI use for new contributors entirely. I personally have very little concern with a well-established contributor using AI assisted code generation, as they are fully capable of reviewing and understanding the produced code. The same is not true for a new contributor, which is what results in that aspect being offloaded to the PR reviewer instead. (There is some nuance here – nothing wrong with AI-assisted code base exploration. It’s code generation specifically that’s the problem.)

I think that this is trying to say the right thing, but the phrasing is very euphemistic, and thus leaves too much room for interpretation. I think this is actually a point we should be very clear on: while it’s fine to use AI for translation or minor stylistic editing, it should be explicitly forbidden to generate PR descriptions or review responses.

8 Likes

I broadly support the proposed policy. With that said, I do have two comments. Firstly, I think we should somehow clarify that the 150 lines is not some arbitrary hard limit, but more a general guideline. For example, it would not be unreasonable for a new-to-LLVM contributor, who recently started working for a company team, to want to upstream some of their stuff, which might in the end be quite a lot more than 150 lines. Depending on the downstream team’s processes, that might well have already gone through several rounds of review, so there would not be much benefit in requiring the user to contribute several small things just to conform to some fixed limit. I’m sure I’m also not the only one who doesn’t want to start counting lines of code contributed to determine whether the change conforms to this guideline. If we do include this block in the AI policy, I think we should also put it somewhere else, or just point to that other place from the AI policy, since it’s a good guideline even without AI tool usage.

Secondly, I think the “extractive contribution” topic should be split off into its own discussion. Whilst I understand it is related to AI tool usage, in that excessive AI tool reliance can lead to an extractive contribution, I think the policy itself is stand-alone and we should be able to point to it without suggesting that the contribution is AI-generated. This is already hinted at with the second half of the statement “i.e. it is generated with tool-assistance or simply requires significant revision”.

I loosely agree with the PR descriptions bit, but for review responses I’d say it should be allowed as long as it is clear that the response was AI-generated (which in turn implies it may be ignored/disagreed with, etc.). The reason I say that is that I’ve seen some benefit from Copilot-generated reviews, some of which are rubbish, but other bits of which are genuinely useful and may point out things that reviewers otherwise don’t spot. I’ve seen its usage even within LLVM already.

I also don’t like the phrasing of Fedora’s policy, and think we need a clean break: just the opening “Artificial Intelligence is a transformative technology” is an unnecessary value judgement that we, as non-experts in the field, should definitely avoid. If we’re going ahead with some sort of policy, I would start here:

… and phrase it as “The LLVM project discourages the use of AI-assisted tooling in the general case, although contributors are welcome to experiment with it”, which would be more in line with the discussion on the other thread.

That’s definitely not the impression I got from the discussion. I acknowledge I’m pro AI use and that this might put a slant on things, but I certainly didn’t get the impression that there was consensus on discouraging the use of AI tooling.

2 Likes

I’m supportive of the proposed policy overall. There are things I’d do differently (and I shared some thoughts in the previous thread), but I’m not the one writing it and I’m very grateful @rnk is taking that on! I’ve since discovered the Chromium AI Coding Policy, which addresses some of these things in a concise way, though it doesn’t have explicit guidance on using AI tools to generate commit/PR descriptions (like Nikita, I’m in favour of banning this for cases other than aiding translation or minor wordsmithing).

I do still worry a bit about introducing the broader idea of “extractive” contributions. In particular, I would feel uncomfortable were this policy to be adopted and the label started to be applied to contributions that weren’t clearly low quality / tool generated with insufficient human oversight.

I’ve left a comment in the PR, but because, as phrased, it seems to contradict the rest, I’ll highlight it here as well:

If a maintainer judges that a contribution is extractive (i.e. it is
generated with tool-assistance or simply requires significant revision), they […]

To me this reads as indicating that any contribution generated with tool assistance is extractive, which is clearly not the intent. Perhaps rephrase as “it is generated with tool-assistance in a way inconsistent with our AI tool usage policy”?

1 Like

I read “You are responsible for your contributions.” as strongly agreeing with your position. I would not be against making the wording more specific. If you are responsible for understanding what you submit, then most of these issues should not arise. Maybe I am giving more of a benefit of the doubt here than is warranted.

I agree with @nikic’s recommended changes, and would likely change my overall stance to weakly supporting if they were to be incorporated.

I strongly agree – if a person clearly put a lot of effort into making a high-quality contribution, it shouldn’t be rejected merely because it takes a lot of effort to review and probably won’t be widely used. A good example could be adding a new backend for a niche platform (e.g. m68k or s390x).

I could get on board with discouraging new contributors from using AI tools on the grounds that:

  1. we presume they don’t have the expertise to review their output for correctness
  2. asking maintainers to review AI generated code is not a good use of their time

This case comes to mind as a counterexample. Alex is arguably a known contributor, but if you suppose he was not, I think it’s a patch we’d want to accept.

What I see coming is that in both cases, developer tools are going to start giving developers template text as the basis for both PR comments and commit messages. I don’t see a problem with this so long as people take ownership of the result. Our policy emphasizes the importance of documenting why a change is good. LLMs cannot read minds; they can only guess based on context clues, and they’re usually wrong.

I agree it would be totally toxic for contributors to attempt to automate the review process with LLMs, so I’ll try to edit something into this point to capture that.

I think my ideal policy is one that is as neutral as possible. Maybe LLMs will plateau today and stop getting better, or maybe they’ll become amazing. I’ve been humbled in my attempts to predict the future over the last five years. My goal right now is to try to come up with a policy that is useful in the outcome that LLMs stay bad or get good, and to that end, I agree it makes sense to remove the value judgement.

I think the core problem here is basically this: I believe that our AI policy should be setting people up for success. Making the policy too liberal, by essentially saying “use AI however you like as long as you ‘own’ the result” is setting them up for failure. It is giving people enough rope to hang themselves.

Our AI policy should guide people towards using AI in ways that are unlikely to get labeled as AI slop, with the resulting rejected contributions and/or bans from further participation in the LLVM community. Possibly that doesn’t need to take the form of explicitly forbidding certain things, but rather just strongly discouraging them, along the lines of “doing these things increases the risk of the contribution being considered AI slop”, etc.

I mentioned AI generated PR descriptions and responses to review comments because that’s one of the easiest (and most unnecessary) ways to cross the line. If I see a PR description with the usual hallmarks of AI generation (multiple headings, bullet point lists, emojis, information density so small it can only be found with an electron microscope), I don’t even need to look at the code to reject it as AI slop.

I think we should optimize our AI policy for AI as it is right now, in late 2025. We can and should adjust the policy over time. If in two years AI contributions become indistinguishable from human ones, we’ll want a different policy – we shouldn’t even try to predict these things.

4 Likes

(We’re in agreement, I just wanted to add a complementary perspective)

Specifically for the idea of disallowing PR descriptions primarily authored by AI, the attraction to me is that this is potentially a very long-lasting approach where future improvement in LLMs is largely irrelevant. I’m sure LLMs are better today at generating commit messages than they were 6 months ago, and will be better 6 months in the future. But it doesn’t matter. We’re saying that to submit to the project you need to take full ownership of your submission, mark clearly to reviewers anything you’re not sure about, and otherwise ensure you fully understand your code. If you’re doing those things, writing the PR description yourself vs letting an LLM do it is a very small marginal cost (LLM help with translation or minor rewording isn’t a concern). If someone is not able to properly describe and motivate their changes, then there’s not a path forwards for a successful review anyway.

For me, this is a very fair “ticket price” for getting valuable time from other human beings and remains so for as long as we rely on human reviews, regardless of further advancement in LLMs.

3 Likes

I don’t want to weigh in on either side of the argument, just want to make a comment that, as always, any and every decision we make will have side effects. And, as usual, we’ll have to make an effort to filter the (good and bad) corner cases individually.

So the question isn’t “what can we do to catch all good cases?” or “how can I prune all bad cases?” but “which choice leaves out more bad cases than good cases?”.

Having used copilot to write code, I agree with @nikic that asking me if I’m competent enough to validate the AI code is a very valid question, and the answer is most likely “no”.

I use AI for various reasons: I’m lazy, I don’t know how to do it, I want to refactor some large code. In all these cases, my ability to review the tool’s output is diminished, even when I do know what I’m doing.

+1

Eventually consistent.

Hey Reid, thank you for continuing to work on this policy; I think the current policy isn’t working as well as hoped, so clarification is greatly appreciated!

My personal position, based on PRs, comments, and issues I’ve seen in our community already, is to have a blanket ban on use of AI. AI has already been extractive in terms of limited reviewer and triage resources in the community, and I don’t think it provides sufficient value to be worth encouraging its use. What’s more, at no point have my original concerns been addressed by our policies: there is no reasonable way for anyone involved in a PR or an issue to know whether or not there are licensing issues with AI-generated content, but we know for a fact that many AI tools are trained on inputs with licenses incompatible with our own. Yes, the same can be said of human-generated content, but we have decades of experience of humans not doing it with the trivial ease with which AI tools have been demonstrated to do it. IMO, it’s reasonable to trust a real human being to put thought into a contribution, and that same trust cannot be placed in an AI tool when it comes to verifying that the tool did not produce something with copyright or license issues; neither the reviewer nor the author has the information necessary to determine that.

If a maintainer judges that a contribution is extractive (i.e. it is
generated with tool-assistance or simply requires significant revision), they
should copy-paste the following response, add the extractive label if
applicable, and refrain from further engagement

I like the ease of this approach – it’s a simple copy/paste plus adding a tag. That’s a really nice property! However, I’m worried about the social aspects of this, in a few ways. This has potentially significant friction because it requires someone to make what amounts to an accusation. I think this leads to some contributors being comfortable with the guidance while other contributors won’t feel secure enough to make that accusation. Also, this is going to be very inconsistently applied and I think may lead to unintended outcomes. Consider this (IMO) reasonable scenario: a company is paying someone to implement a feature they want. They use an AI tool to generate a low-quality PR, it gets put up for review and three reviewers are added to it: a maintainer for that area, the lead maintainer, and a coworker. One of the maintainers adds the extractive label to the review, both maintainers back off the review due to bandwidth… and now the coworker is the only one left and accepts the review because they don’t find AI tools to be extractive (they’re required to use AI tools by their management). With our PR workflow, that’s now ready to land, which seems like the exact wrong outcome IMO.

+1 to this.

I think this is a case where having a dead simple policy makes the most sense. I like the general idea of “extractive” and I think that’s a good benchmark to measure against, but I think the policy itself should be more straightforward; I’d prefer a blanket ban on use of AI.

3 Likes

The discussion seems to mix two topics: “low-quality” contributions and AI contributions. As I got burned publicly for something I thought to be a harmless [RFC] in Mesa, I’d like the LLVM community to be more upfront about whether they want to filter out whatever reviewers consider to be “low-quality” or “extractive” contributions. In my eyes Eghbal’s definition is too vague and highly subjective. Such decisions will only end up with a lot of arbitrary results due to reviewers’ subjective criteria in applying the intended filter mechanism (opening up new social issues along the way, with the risk of either overstating their cost of reviewing or understating the benefit). Hence I fully agree with Aaron’s opinion on that matter, mirroring the same problems only from the contributor’s perspective.

As I understand it, the “extractive” label ends up as a shaming mechanism for anything that doesn’t meet the bar of a reviewer’s quality standards (whatever these are, and which might be largely unknown to the contributor), with possible CoC implications when it is “earned” repeatedly. There is also risk on the reviewer side of misusing the label. I don’t know if that is the intended consequence, but I see a lot of potential for more drama than we currently have without that label.

From the contributor side, it would be great to have as many objective criteria as possible to be aware of what is expected to be met, so as not to face the fear of getting sanctioned, either by being labeled negatively or by being banned. I would also like to point out that such a decision to label a contributor would set a negative precedent for future contributions from the same contributor.

The current AI policy draft speaks of being “welcoming and open to aspiring compiler engineers who are willing to invest time and effort to learn and grow” as the target audience for LLVM contributors, while the contributor guideline is more broad and open (“If you are working with LLVM and run into a bug, we definitely want to know about it.”). PC enthusiast users are a significant portion of the LLVM/Clang user base, finding bugs along the way. I’ve reported quite a few over the years, including through the ClangBuiltLinux project. This led to several bug fixes in LLVM or the kernel, so shielding yourselves from working with end users like me does come with its own costs. Personally, I’d argue that it is in the best interest of the LLVM community to learn about any issues, even when there are a lot of false positives or low-value issue reports among them. Issue reports should face a lower bar of minimum requirements than PRs, and no shaming element like the proposed extractive label; otherwise the LLVM community will lose awareness of such issues, or lose those contributors outright. In practice, low-quality issues are not acted upon or will get closed pretty soon anyway.

However, if the LLVM community only values human programmers’ and compiler engineers’ input from now on, it would be as simple as adding a one-liner to the contributing guidelines (e.g. “Due to reviewer resource constraints, we reserve the right to only accept contributions made by programmers and compiler engineers with only informed and appropriate AI use.”). That would give you developers all the freedom you want to ignore any low-quality contributions.

On the AI policy subject, my ideas in the linked blog post also apply to LLVM. As for the licensing issue, it is not as big as some people make it out to be. The burden is largely on the contributor already (the reviewer still has to object to anything openly violating licensing terms, but if that is not obvious, there is no harm to the LLVM project). I also want to remind people that we all got educated using copyrighted material in school and university. :slight_smile: The 1:1 reproduction of copyrighted material is also no longer a problem in current LLMs, according to LLM devs.

1 Like