doc: explain that gfi is for training and add no AI policy#31142

Merged
timhoffm merged 1 commit into matplotlib:main from
story645:gfi-note
Feb 20, 2026

Conversation

@story645
Member

@story645 story645 commented Feb 12, 2026

PR summary

Jumping off #31131 and related, adds a paragraph to the contributing guide explaining that good first issues exist for training rather than b/c they're an urgent technical need. Unsure if the policy should be the broad "use AI tools" or the more specific "use AI tools without human in the loop".

attn: @scottshambaugh @melissawm

PR checklist

@story645 story645 added the Documentation: devdocs files in doc/devel label Feb 12, 2026
Member

@jklymak jklymak left a comment

You have already explained what a good first issue is in the first paragraph. It is redundant to explain here. I'd just say that AI agents are not to post and link the policy.

If you feel the first paragraph does not adequately explain what a good first issue is, then you could modify that.

@story645
Member Author

story645 commented Feb 13, 2026

If you feel the first paragraph does not adequately explain what a good first issue is, then you could modify that.

The aim of this new paragraph is to explain the purpose of good first issues - that they're for onboarding - rather than what a good first issue is (a tightly scoped low priority issue). I put it at the end b/c I don't think new contributors necessarily need to understand the organizational reasoning for gfis, but I think we should document the purpose for the sake of explaining the policy.

@story645 story645 force-pushed the gfi-note branch 2 times, most recently from cdb292f to 0940dbc Compare February 13, 2026 05:24
@rcomer
Member

rcomer commented Feb 13, 2026

This isn’t true of all GFIs: we often also add the “medium difficulty” label, and explain that the issue is suitable for someone new to Matplotlib but not new to development in general.

@story645
Member Author

story645 commented Feb 13, 2026

we often also add the “medium difficulty” label, and explain that the issue is suitable for someone new to Matplotlib but not new to development in general.

Not sure how this contradicts that the issue's purpose is more so for onboarding than b/c the project feels it's something that really needs to get done? Unless you mean that I should be more specific "opening and seeing through a pull request in Matplotlib"?

@rcomer
Member

rcomer commented Feb 13, 2026

I think I’m getting hung up on what the purpose of the GFI is

  • The purpose of any issue is to get a thing done.
  • Adding the GFI label signals that the issue should be accessible to someone new. Usually we do not care whether it gets done by someone new or someone already established. The label should help someone who wants to get started here find suitable issues to work on.

So maybe something like “helping people get started” rather than “training on the process…”

Maybe we should also switch out priority/importance for “urgency”. When I was new here I worked on #24148, which I would say was definitely important (CI would eventually have broken without it) but not urgent. Telling someone “here is something unimportant we’d like you to work on” feels like giving them busy work, though maybe there is a cultural/pond difference in how we interpret the terms.

I appreciate that I am somewhat contradicting the fact that I approved #31131, and apologies if I am being a bit incoherent in general. It has been a busy week and my brain is a bit fried!

@rcomer
Member

rcomer commented Feb 13, 2026

I am also wondering if we should revisit the idea of renaming the label 🤔

Member

@melissawm melissawm left a comment

I kind of agree with @rcomer on the wording, and I think we are using the "good first issue" label for two kinds of issues: good for onboarding new people, and easy/small issues. Both can happen at the same time, but they are different things.

I'd advocate for a label like "onboarding issue" instead of "good first issue" - it would maybe signal a bit better what the issue is for, without actually requiring that it be an easy issue (and we can use the "difficulty: easy" label for those). Onboarding issues can even be small projects, but serving primarily as an onboarding opportunity for a contributor that is motivated enough.

I know we could potentially achieve the same goal with the "good first issue" + "difficulty" labels, but I do think the gfi labels are, at this point, kinda contaminated to be honest. If we want to keep the gfi label to be consistent with other projects, then I'd maybe advocate for a slight rewording of the current proposal.

Comment on lines +277 to +282
Good first issues are a technical solution to the social problem of onboarding new
contributors to the repository; we label tasks good first issues because we think they
are useful for training on the process of opening and seeing through a pull request, not
because we think they are important technical issues that must be resolved. Therefore,
pull requests that use AI tools to fix issues labeled as "good first issues" will be
closed.
Member

I kind of agree that saying these issues are "not important" may give folks the wrong idea, that this is just about busy work. How about something like:

Suggested change
Good first issues are a technical solution to the social problem of onboarding new
contributors to the repository; we label tasks "good first issues" because we think they
are useful for training contributors on the process of opening and seeing through a pull request. Good first issues are also not high-priority, and having the issue fixed is only a secondary goal. Therefore,
pull requests that use AI tools to fix issues labeled as "good first issues" will be
closed.

@jklymak
Member

jklymak commented Feb 13, 2026

Perhaps this whole section could be much more succinctly stated:

"The Maintainers label Good First Issues because they are self-contained, are usually uncontroversial and with a clear solution, and are a good way to onboard new contributors to the process of creating a Matplotlib Pull Request. Good First Issues are not an appropriate venue for bots or agents (see our AI policy)."

I'd not digress into the differences between a "Medium", "Hard" etc.

@story645
Member Author

story645 commented Feb 13, 2026

not digress into the differences between a "Medium", "Hard" etc.

I agree they're not in scope for GFI, but I'd break out this scale into a subsection of issues b/c it applies broadly there.

Onboarding issues can even be small projects

Arguably that's gsoc, but I think community practice has pushed towards gfis being onboarding issues (for example, the llvm docs) so I think being clearer in purpose about gfis & decoupling leveling will get us to roughly the same place as adding a new tag.

@story645 story645 force-pushed the gfi-note branch 2 times, most recently from 2f79bee to 10f0f9b Compare February 13, 2026 19:04
@story645
Member Author

Ok so I decoupled leveling from GFI - new contributors should be reading the general issue guidance anyway - and tried to condense everything else into what I think is the working definition and purpose of gfi.

@story645 story645 force-pushed the gfi-note branch 2 times, most recently from 571d493 to 8732a98 Compare February 13, 2026 19:12
@tacaswell
Member

I'd advocate for a label like "onboarding issue" instead of "good first issue" - ...

I really like this reframing.

@story645
Member Author

appreciate that I am somewhat contradicting the fact that I approved #31131

No problem, context influences reviews. Changed the wording in the bot from "low priority" to "not urgent".

@story645
Member Author

story645 commented Feb 13, 2026

I'd advocate for a label like "onboarding issue" instead of "good first issue" - ...

I really like this reframing.

I think this is how projects in other communities are already using gfi (Apache, LLVM, kubernetes)

Part of the point of this PR is to more explicitly document that this is the context in which the good first issue tag should be applied.

it would maybe signal a bit better what the issue is for,

Part of this is also that while I love the term onboarding, I think it's very much internal hr speak rather than outward facing accessible. I think "onboarding" is the criterion that maintainers should use to classify it, but that should translate into "good first issue" for the new contributor trying to find it.

Like what would be an example of a good first issue that's not a good onboarding task, especially with leveling decoupled from gfi?

@rcomer
Member

rcomer commented Feb 13, 2026

I think all good first issues are good onboarding tasks but the problem is that a lot of would-be contributors (and now bots) read "good first issue" as "easily solvable" which won't always be the case. Unless we change how we're using the label - I haven't really followed what we mean by decoupling the difficulty levels, discussed above.

To get away from the HR-speak, maybe something like "Newcomer Friendly" or "Newcomer Suitable"?

@melissawm
Member

I'm not attached to "onboarding" at all, other wordings also work here. Also none of my comments here should be blocking anyway so I think we should merge as-is and if this comes up again later we can always revisit 😄

@story645
Member Author

I haven't really followed what we mean by decoupling the difficulty levels, discussed above.

Just that we should be evaluating the difficulty of an issue separately from whether it's a good first task. They're kind of intermingled b/c we rank difficulty based on conceptual understanding of matplotlib, but there are also plenty of technically easy tasks that are ill-suited for new contributors b/c of the social side (for example this PR).

Unless we change how we're using the label

I think different folks have been using the label in different ways, but I wanna move to us consistently using it in the onboarding context b/c that seems to be how the community at large is using it. Which is why I'm so hesitant about introducing a new label that we plan to use in the same way most folks use the old label.

@rcomer
Member

rcomer commented Feb 15, 2026

I don’t think the difficulty label is about how well you know Matplotlib. Often “medium difficulty” is used together with GFI to say you don’t need to know Matplotlib internals, but you do need experience in something more general like testing frameworks, or because the code you’re going to need to study is a bit involved. I’m going to tag @tacaswell here because I think he most often adds that combination of labels.

I think this project is already incredibly good at attracting inexperienced contributors who make a PR for their course credit and don’t stick around. Changing the definition of GFI to helping people with the PR process seems like it is going to codify that those are the people we want.

If a more experienced person happens to take an interest in this project they will likely not be worried about the process of making a PR, but some way to signal which issues are more accessible to them would be helpful. Of course those people will happen by far less often, but when they do they will likely make stronger contributions.

@rcomer
Member

rcomer commented Feb 15, 2026

As for the issue name, last time we discussed changing it, it was about making it less discoverable so we don’t get several people all wanting to work on it at once. Making it less discoverable may also help with the recent problem of AI spam.

#29686

This would of course mean we would be doing something different from other OSS projects. Perhaps @melissawm can speak to how the name change worked out for Numpy, as that was a few years ago now.

@story645
Member Author

The problem was that I did not realise how quickly the genuinely accessible ones get picked up.

I think this intersects a little w/ our reluctance to close bad PRs. I'm probably more guilty than most, but we really should be closing the ones where two interactions in it's clear they don't understand what they're doing. And redirect those folks to doing something more productive for the library. Also been thinking of github's new limit PRs feature and asked for an "allow" list version b/c I think that could be a different way to get at this - folks have to show some understanding of the issue and solution in the PR b/f we allow 'em to make a PR. We could also have a manual version of this where we just close PRs that don't show understanding.

@timhoffm
Member

timhoffm commented Feb 15, 2026

This is a multi-faceted topic and there will not be the one solution. Random thoughts:

  • Maybe we should actually start assigning issues and not consider 3rd party PRs if they are not assigned to the topic. Why didn’t we do this so far? I think we didn’t want to reserve an issue to prevent people sniping topics with the danger of not or insufficiently working on it. But overall I think the management overhead for assigning (and unassigning) is small compared to reviewing bad PRs and sorting concurrent ones.
    This would be a rate and quality gate. I’d request a brief solution description (without AI usage) from the contributor before assigning the issue.

  • Something like the new contributor meeting is much better than GFI. Personal communication creates more attachment, you can much better judge qualification and motivation and individually help on getting started. Btw. Thanks a lot for doing this! 🙏
    Maybe this should be the primary entry point for less experienced people and/or the ones just generally wanting to help.

@story645
Member Author

Maybe this should be the primary entry point for less experienced people and/or the ones just generally wanting to help.

tried to make this the more explicit entry point in #31163. thankfully @melissawm's back to help w/ the ncms 😄

This would be a rate and quality gate. I’d request a brief solution description (without AI usage) from the contributor before assigning the issue.

Same. I think we've avoided assignment 'cause process overhead, but we've basically ended up implementing a worse implicit system where nobody knows what to prioritize and everyone is burned out. I think even in #29686, most folks (at least grudgingly) conceded to an assignment system b/c it's explicit and enforceable.

@rcomer
Member

rcomer commented Feb 15, 2026

What about the current GFI definition makes it accessible here where adding that this is explicitly about project onboarding would be a turn off?

Updates since this comment make me more comfortable with that - i.e. removing the part about "the process of making a PR".

I think #23548 is an example where discussion about exactly what should be done is still needed, but it's GFI because you don't need to know anything about the Matplotlib code to work on it.

@story645
Member Author

I think #23548 is an example where discussion about exactly what should be done is still needed, but it's GFI because you don't need to know anything about the Matplotlib code to work on it.

This seems to lay out the what: #23548 (comment). I think that's actually what's tripping up the contributors - they're seeing that comment and not reading through the later discussion that modifies it. Also I think what's ambiguous here is more the how than the what.

@story645 story645 force-pushed the gfi-note branch 4 times, most recently from 712dfff to ebfb7b1 Compare February 15, 2026 23:23
@melissawm
Member

melissawm commented Feb 16, 2026

Thanks all and especially @story645 - I think this wording is great.

For the second discussion happening here, which is how do we more efficiently triage new contributions, there are some things we can consider.

  1. When we label something a good first issue, should we have a bot post a comment outlining specific contribution guidelines? Something like "we do not assign issues; please check if there is already a PR linked to this issue before submitting yours; if you are unsure about your solution, outline it here as a comment before submitting a PR; don't use agents; come to the new contributor's meeting for more guidance"?
  2. Should we have a "not ready for contribution" label? A few times in Matplotlib I've seen this happen, where we have an issue that is open for discussion and the implementation details are not agreed upon yet. This sometimes turns into contributions that are otherwise good, but maybe not the desired solution to the problem. A few of the long-time open PRs have this feeling. I'm not sure if this is a solution either, as sometimes you can only tell that something is not the right call when you see the PR, but I'm wondering if there's a way to filter out these issues and prevent folks from going in the wrong direction.
  3. Do we want triage meetings/dedicated time on the weekly call for triage of pending PRs? I know there's a lot to discuss already, but NumPy has seen a lot of success with dedicated triage meetings - making decisions such as when to close PRs and how to guide new contributors can be much more efficient in a synchronous discussion.

I would like to suggest moving this discussion to discourse, maybe, or another issue so we can unblock this PR.

@story645
Member Author

story645 commented Feb 16, 2026

1 and 3 are things we already do (the calls have just had too much other stuff lately for triage). We can maybe open an issue about 2?

Member

@timhoffm timhoffm left a comment

It’s almost there. Some wording and clarification.

@story645 story645 force-pushed the gfi-note branch 3 times, most recently from dc6b8a9 to de439a3 Compare February 16, 2026 22:28
Comment on lines -275 to -277
- It has less clearly defined tasks, which require some independent
exploration, making suggestions, or follow-up discussions to clarify a good
path to resolve the issue.
Member Author

I deleted this b/c I think this means there's not enough info yet to evaluate difficulty and the issue needs more triage:

graph TD;
    A(is the issue) -->unresolvable-->C(explain why <br /> technically infeasible <br /> or not aligned with <br /> values/mission/scope/etc);
    A-->resolved--> B(explain why, <br /> link to docs, API, etc) ;
    A-->resolvable-->D(work w/ issue author <br /> on an implementable <br /> code/docs solution);
    A-->E(not sure)-->F(ask for more information) -->A;

Member Author

This is in the same vein as the "not ready for contribution" label @melissawm proposed in #31142 (comment)


This issue is suited to new contributors because it does not require
understanding of the Matplotlib internals. This is a low priority task
understanding of the Matplotlib internals. This is a not urgent task
Member

Suggested change
understanding of the Matplotlib internals. This is a nonurgent task

Member

@jklymak jklymak left a comment

The sentence L268 forward could be simplified....

Comment on lines +263 to +266
Matplotlib, and do not need urgent resolution. Pull requests that are :ref:`AI generated <generative_ai>`
will be closed because good first issues are intended to onboard newcomers with a
genuine interest in improving Matplotlib in the hopes they will continue to participate
in our development community.
Member

Maybe:

Suggested change
Matplotlib, and do not need urgent resolution. Pull requests to Good First Issues that are :ref:`AI generated <generative_ai>`
will be closed. These issues are reserved to provide hands-on experience for new contributors.

Member

I like the part about hoping they will continue to participate...

Member

OK, so maybe make it a concluding sentence to the whole paragraph, rather than tack it onto the AI sentence.

Member Author

but it's specifically why we don't allow AI here. I want to be explicit about the reasoning b/c this PR was largely motivated by my wanting to explain why our gfi policy is no AI.

Member Author

Maybe I can invert things though so that it doesn't feel so tacked on.

Member Author

Also the continuing to participate is specifically cause of @timhoffm's review #31142 (comment)

@story645
Member Author

coverage failure definitely unrelated to PR

Matplotlib, and do not need urgent resolution. Good first issues are intended to onboard
newcomers with a genuine interest in improving Matplotlib in the hopes that they will
continue to participate in our development community; therefore, pull requests that are
:ref:`AI generated <generative_ai>` will be closed because
Member

Sentence not finished.

Member Author

bad git amend 🤦‍♀️

Member

@timhoffm timhoffm left a comment

This is good, apart from the unfinished sentence.

@story645 story645 force-pushed the gfi-note branch 2 times, most recently from 7310d60 to 0866b56 Compare February 19, 2026 19:37
move difficulty under issues
[ci doc]

Co-authored-by: Tim Hoffmann <[email protected]>
Co-authored-by: Ruth Comer <[email protected]>
@timhoffm timhoffm merged commit 6dbc0c7 into matplotlib:main Feb 20, 2026
33 of 36 checks passed
@story645 story645 deleted the gfi-note branch February 20, 2026 16:35
Labels

Documentation: devdocs files in doc/devel

7 participants