92 changes: 92 additions & 0 deletions 133-async-triage.md
@@ -0,0 +1,92 @@
# Asynchronous Issue Triage

This proposal introduces an asynchronous triage process for GitHub issues, allowing maintainers and component owners to triage issues outside of the community call using emoji reactions.
The process starts as a manual workflow with the goal of automating it in the future.

## Current situation

Currently, issue triage happens synchronously during community calls.
This takes up valuable call time and can delay triage for issues that are straightforward to evaluate.
Issues may sit untriaged until the next community call, even when maintainers already have a clear opinion on them.

## Motivation

Asynchronous triage allows maintainers to review and vote on issues at their own pace, freeing up community call time for issues that genuinely need discussion.
It also reduces the triage backlog by enabling continuous progress between calls.

## Proposal

> [!IMPORTANT]
> An issue is eligible for asynchronous triage when it is labeled `needs-triage`.
> All new issues should be marked with the label `needs-triage` when/after they are created.
Member

I know we do this today, but it's also true that quite often, if we think the issue is pretty clear and needs to be fixed, someone just opens a PR and fixes it without waiting for triage. What happens then is that during the community call we end up with "there is already a PR for this, so it's already triaged". Are we going to continue doing so after this proposal, or will we end up waiting for an easy issue to be seen and voted on by enough maintainers before proposing a PR?

Member

I am happy with our current behaviour that sometimes an issue has a PR opened before the issue is triaged. I don't think we should be delaying ourselves just to wait for triage.

Member Author

Okay so instead:

An issue is eligible for asynchronous triage when it is labeled `needs-triage`.
All new issues should be marked with the label `needs-triage` when/after they are created.

I can re-phrase like:

Issues that require triage should be marked with the label `needs-triage` when or after they are created. Straightforward issues (e.g., simple fixes or dependency version bumps) do not necessarily require triage and may proceed directly with a pull request.

Does this look better?


### Voting mechanism

Maintainers and component owners with merge rights in the given repository indicate their opinion using emoji reactions on the issue:
Contributor

Should we clarify where the emoji reaction is made? On the original issue post or as a comment?


| Reaction | Meaning |
|----------|---------------------------------------------------------------------------------|
| 👍 | We should keep this issue |
| 👎 | We should close this issue (with an explanatory comment) |
| 👀 | I saw this but want to discuss it further (ideally with an explanatory comment) |

Any user is encouraged to share their opinion.
But only votes from maintainers and component owners with merge rights in the given repository count toward the triage decision.
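
The counting rule above can be sketched as a small filter. This is only an illustration under stated assumptions: the `(login, content)` input shape, the function name, and the `eligible_voters` set are hypothetical. The reaction names `+1`, `-1`, and `eyes` are the content values the GitHub reactions API uses for 👍, 👎, and 👀.

```python
from collections import Counter

# GitHub reactions API content values for the three voting emoji.
VOTE_REACTIONS = {"+1", "-1", "eyes"}


def count_eligible_votes(reactions, eligible_voters):
    """Count voting reactions, ignoring users whose votes do not count.

    reactions: iterable of (login, content) pairs (hypothetical input shape).
    eligible_voters: set of logins of maintainers and component owners
        with merge rights in the repository (assumed to be known upfront).
    """
    seen = set()
    counts = Counter()
    for login, content in reactions:
        # Skip users without merge rights and non-voting reactions.
        if login not in eligible_voters or content not in VOTE_REACTIONS:
            continue
        # Guard against duplicate entries in the input.
        if (login, content) in seen:
            continue
        seen.add((login, content))
        counts[content] += 1
    return {name: counts[name] for name in sorted(VOTE_REACTIONS)}
```

Reactions from other users stay visible on the issue; they are simply left out of the tally.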

### Decision rules

After at least **5 days** since the last major comment on the issue, the following rules apply:
Member

I thought it is at least 5 days since issue creation. Major comment is pretty random.

Member Author

I thought that the 5-day window resets whenever a maintainer or component owner comments or reacts (in this case it's a reaction). So basically you then have enough time to react when new input is added. That's what I mean by major comment.

Having 5 days since issue creation has its own pros (easier to automate), but I'm not sure; you can imagine scenarios where this approach doesn't work (e.g., I comment on day 4 because I didn't find time until then, and then other maintainers have just 1 day to react?).

@strimzi/maintainers thoughts?

Member

Maybe it should be 5 days from creation and then e.g. 72 hours since the last major comment, just to give that extra buffer if there is a lot of discussion?

Member

My take from the Slack discussion was that we wait 5 days since creation, but I also agree that if there is a major comment, we should give enough people time to react. If the comment comes on the 5th day, we can't expect people to block everything else they are doing to look at such an issue. Additional time would be useful.

Member

I don't think we should add strict timing rules like the one currently in the proposal. Currently we don't have any; some issues are triaged pretty quickly, some are not, and usually just two or three people join the discussion. My expectation is:

  • Issue is created, the `needs-triage` label is added
  • Voting starts asynchronously
  • If there are +3 votes after 3/5 days (I don't care about the exact time, but +3 seems to me like an agreed solution, actually more votes than we have nowadays for some/most issues) -> issue is triaged
  • If there are some -1 or eyes reactions, the discussion should lead to some agreement or the issue will be triaged on the community call as it is nowadays

Waiting any specific amount of time after the latest comment doesn't sound to me like a proper way. If anyone needs more time to think about it, then he/she should explain it in a comment and the issue should be triaged on the community call (in case the person didn't update the issue until then).

Contributor

Without a clear definition of what a major comment is, it's unclear when the 5-day window resets.
I would just say after 5 days.

Contributor

Do we want to say who is responsible for checking whether the decision rules are satisfied? Pre-call checker? To avoid a situation where everyone assumes someone else will apply the rules.

Member Author

Okay, so I will modify the proposal based on the consensus of 5 days from issue creation, okay?

Member

Just to this:

We are moving from a situation where an issue was triaged only at the Strimzi community call, so you had 2 weeks to review them, to a situation where even 1 hour is not enough.

I know that nobody wanted to hear about the automation, but my plan is to have a check there for how long ago the issue was created, so that it is triaged only after the 5 days. That way there will not be any situation like "yeah we have 3 votes in 10 minutes, let's mark it as triaged".

Member

It's not that people need more time, it's about people being busy and coming to the issues later on. This is where the "time window" helps. Without it, it could happen that, for example, on Monday an issue is opened by a user, after 1 hour it already has 3 votes (worse if a maintainer opened the issue, as it then needs just 2 additional votes) and it's considered triaged. I think that 1 hour is not enough time for people to even notice that an issue was created.

But this is not the problem. I think everyone basically agrees that we will have some time window; the problem is with extending it endlessly. Without a clear definition of what a major comment is, it is not possible to use it effectively. Also, from my POV we are not going to switch to a totally different approach from the one we have now. The main target for this is issues that are clear to understand and where the output is pretty clear: a PR with a fix, or a proposal. For such issues, when we do triage on community calls there is usually silence. I cannot imagine we will use this for issues where there are 10 possible solutions with different pros/cons for us and users.


- **Approved (keep):**
No 👀 and no 👎, and at least 3 👍 reactions.
The issue is considered triaged and should be kept.
The `needs-triage` label is removed.
Contributor

Should we mention adding additional labels depending on comments here?

Contributor

Should we say what happens if a new comment introduces additional technical information, proposes a different solution, or raises a new concern relevant to the issue after the 3 approving votes have been made?

- **Rejected (close):**
No 👀 and no 👍, and at least 3 👎 reactions.
The issue is considered triaged and rejected.
It should be closed with an explanatory comment or with an acknowledgment of another maintainer's explanation.
Member

If it has got 3 👎 each of them with a comment, do we really need an additional comment to close and reject the issue?

Member

I think a simple "This issue is being closed due to the reasons given by X person" is useful for the people raising issues, that's what I understand from "acknowledgement of another maintainer explanation"

Contributor

I think a final decision comment is useful

Member Author

> I think a simple "This issue is being closed due to the reasons given by X person" is useful for the people raising issues, that's what I understand from "acknowledgement of another maintainer explanation"

+1

It's very similar to what we do in the community call when we triage an issue and then just comment on why we are closing it.

- **Needs discussion:**
Any other mix of reactions (e.g., presence of 👀, mixed votes, or fewer than 3 votes).
The issue remains labeled `needs-triage` and is discussed at the next community call.
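
The three outcomes above can be sketched as a single function. It is a minimal illustration of the rules as currently written in the proposal, not an agreed implementation: the function name, the outcome strings, and the `"pending"` result for issues whose waiting period has not yet passed are all hypothetical. The inputs are assumed to already count only votes from maintainers and component owners with merge rights.

```python
def triage_decision(up, down, eyes, window_elapsed, min_votes=3):
    """Apply the async triage decision rules to eligible vote counts.

    up, down, eyes: counts of 👍, 👎 and 👀 reactions from eligible voters.
    window_elapsed: True once the waiting period has passed.
    min_votes: votes needed for a decision (3 in the proposal).
    """
    if not window_elapsed:
        # Assumption: no rule applies before the waiting period is over.
        return "pending"
    if eyes == 0 and down == 0 and up >= min_votes:
        return "approved"   # keep the issue, remove needs-triage
    if eyes == 0 and up == 0 and down >= min_votes:
        return "rejected"   # close with an explanatory comment
    # Any 👀, mixed votes, or fewer than min_votes votes.
    return "needs-discussion"
```

For example, `triage_decision(3, 0, 1, True)` returns `"needs-discussion"`, because a single 👀 is enough to send the issue to the community call.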

### Additional labels

- **`needs-proposal`:**
Can be added at any point by anyone who thinks the issue requires a formal proposal.
If the issue is approved via async triage, the label will remain.
- **`good-start` / `help-wanted`:**
The person marking the issue as triaged (or anyone else later) can add these labels if desired, but they should be ready to provide guidance to whoever wants to work on it.

### Pre-call check

Until automation is in place, the person running the community call should do a manual check of `needs-triage` issues before the call.
This ensures that issues which reached consensus asynchronously are cleaned up, and remaining issues are queued for discussion.
Contributor

Do we want to get into the clean up checks?
Removing needs-triage, closing rejected issues, and preparing discussion items for the call agenda


### Future automation

A bot, scanner, or cron job (to be investigated separately) could automate the process:

- Monitor `needs-triage` issues for vote counts after the 5-day (i.e., 120-hour) window
- Automatically remove the `needs-triage` label when consensus is reached (based on decision rules stated above)
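
As a rough sketch of the time check such automation would need, the 120-hour window can be tested with plain timestamps. The function and constant names are hypothetical. `reference_time` is deliberately left generic, since the review discussion has not settled whether the window starts at issue creation or at the last major comment.

```python
from datetime import datetime, timedelta, timezone

# 5 days, expressed in hours as in the proposal.
TRIAGE_WINDOW = timedelta(hours=120)


def window_elapsed(reference_time, now=None):
    """Return True once the triage window has passed.

    reference_time: timezone-aware datetime marking the start of the window
        (for example, the issue creation time; this choice is an assumption).
    now: timezone-aware datetime to compare against; defaults to current UTC.
    """
    if now is None:
        now = datetime.now(timezone.utc)
    return now - reference_time >= TRIAGE_WINDOW
```

Keeping the window as one constant means a later change (for example, back to 72 hours) would touch a single line.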

## Affected/not affected projects

This proposal affects the Strimzi project's issue triage workflow across all repositories.
No code changes are required initially (i.e., this is a process change).
Future automation tooling may be developed separately.

## Compatibility

This process is additive and does not replace synchronous triage during community calls.
Issues that do not reach async consensus are still discussed during the call as before.

## Rejected alternatives

### Using comments or commands instead of emoji reactions

Using comments or bot commands (e.g., `/strimzi-triage +1`) was considered but rejected in favor of emoji reactions only.
Emoji reactions provide a quicker, more visible way to see the current vote count at a glance without cluttering the issue with additional comments.

### 72-hour triage window

An initial 72-hour window was discussed but extended to 5 days to give maintainers enough time to review issues, especially across different time zones and work schedules.