Skip to content

ci: auto pr assign by CODEOWNERS#248

Closed
feihongxu0824 wants to merge 2 commits intoalibaba:mainfrom
feihongxu0824:chore/codeowners
Closed

ci: auto pr assign by CODEOWNERS#248
feihongxu0824 wants to merge 2 commits intoalibaba:mainfrom
feihongxu0824:chore/codeowners

Conversation

@feihongxu0824
Copy link
Collaborator

@feihongxu0824 feihongxu0824 commented Mar 20, 2026

Greptile Summary

This PR introduces an automated PR assignment system for the alibaba/zvec repository, consisting of two workflows (auto-assign.yml and community-pr-handler.yml), a CODEOWNERS file mapping directory paths to maintainers, and an auto-assign-action configuration. Both workflows use the pull_request_target trigger — correctly checking out the base branch rather than the PR head to avoid supply-chain risks.

Key issues found:

  • Both workflows will fail to assign anyone at runtimeauto-assign.yml and community-pr-handler.yml both call POST /repos/.../issues/{number}/assignees, which requires issues: write. Neither workflow grants that permission (only pull-requests: write is declared), so all assignee API calls will return "Resource not accessible by integration".
  • Pagination missing in community-pr-handler.ymlgithub.rest.pulls.listFiles returns at most 30 files per page by default. PRs touching more than 30 files will have their extra files silently ignored, causing their CODEOWNERS owners to never be assigned.
  • .gitmodules CODEOWNERS entry is unanchored — Missing the leading / means it would match any .gitmodules file at any depth in the repository, not just the root-level one.
  • (Previously flagged) Directory pattern matching uses file.startsWith(dir) without a trailing /, which can produce false-positive owner matches for similarly-named paths.
  • (Previously flagged) Only rule.owners[0] is used, ignoring additional owners on the same CODEOWNERS line.

Confidence Score: 2/5

  • Not safe to merge — both workflows will fail at runtime due to missing permissions and a pagination bug in the main assignment logic.
  • The two most critical workflows (auto-assign.yml and community-pr-handler.yml) both lack the issues: write permission required to call the assignees API, meaning the core feature (auto-assigning reviewers) will silently fail entirely. Additionally, the listFiles call without pagination will miss changed files in any PR touching more than 30 files. These are functional blockers, not just style issues.
  • .github/workflows/auto-assign.yml (missing issues: write) and .github/workflows/community-pr-handler.yml (missing issues: write + no pagination).

Important Files Changed

Filename Overview
.github/workflows/community-pr-handler.yml Core workflow for assigning CODEOWNERS reviewers to community PRs — has three pre-existing issues (missing issues: write, overly broad startsWith matching, only first owner used) plus a new pagination bug that causes files beyond the first 30 to be silently ignored.
.github/workflows/auto-assign.yml Auto-assigns the PR author as assignee via kentaro-m/auto-assign-action; missing issues: write permission will cause the action to fail at runtime.
.github/CODEOWNERS Defines per-path code ownership rules; all directory patterns are correctly anchored with a leading slash except .gitmodules, which needs a /.gitmodules prefix to avoid matching nested files.
.github/auto-assign-config.yml Configuration for kentaro-m/auto-assign-action; correctly sets addAssignees: author, disables reviewer assignment (delegated to CODEOWNERS), and skips draft/WIP PRs.

Sequence Diagram

sequenceDiagram
    participant Author as PR Author (fork)
    participant GH as GitHub Events
    participant AutoAssign as auto-assign.yml<br/>(pull_request_target)
    participant CommunityHandler as community-pr-handler.yml<br/>(pull_request_target)
    participant API as GitHub REST API

    Author->>GH: Opens / reopens PR

    GH->>AutoAssign: Trigger (opened/reopened/ready_for_review)
    AutoAssign->>AutoAssign: Checkout base branch (sparse: auto-assign-config.yml)
    AutoAssign->>API: POST /issues/{pr}/assignees<br/>(assignee = PR author)<br/>⚠️ needs issues:write

    GH->>CommunityHandler: Trigger (opened/reopened/ready_for_review)
    CommunityHandler->>API: Check author_association
    API-->>CommunityHandler: OWNER / MEMBER / COLLABORATOR / CONTRIBUTOR ...
    alt Community contributor
        CommunityHandler->>CommunityHandler: Checkout base branch (sparse: CODEOWNERS)
        CommunityHandler->>API: GET /pulls/{pr}/files (page 1 only ⚠️ no pagination)
        API-->>CommunityHandler: Changed files (max 30)
        CommunityHandler->>CommunityHandler: Parse CODEOWNERS & match owners<br/>(last-rule-wins per file)
        CommunityHandler->>API: POST /issues/{pr}/assignees<br/>(matched CODEOWNERS owners)<br/>⚠️ needs issues:write
    else Maintainer
        CommunityHandler->>CommunityHandler: Skip — no action needed
    end
Loading

Last reviewed commit: "fix"

Greptile also left 1 inline comment on this PR.

Comment on lines +7 to +8
permissions:
pull-requests: write
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P0 Missing issues: write permission

The workflow calls github.rest.issues.addAssignees (line 132) to set assignees on the PR, but the permissions block only grants pull-requests: write. Assigning users to an issue/PR via the GitHub REST API requires issues: write — without it, the API call will fail with a "Resource not accessible by integration" error, and the core functionality of this workflow (assigning code owners as reviewers) will silently fail.

Suggested change
permissions:
pull-requests: write
permissions:
pull-requests: write
issues: write

matchedPattern = pattern;
} else if (pattern.endsWith('/')) {
const dir = pattern.replace(/^\//, '').replace(/\/$/, '');
if (file.startsWith(dir + '/') || file.startsWith(dir)) {
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Directory pattern matching too broad

The condition || file.startsWith(dir) (without a trailing /) causes false-positive matches. For example, a hypothetical file src/db_utils.cpp or src/binding_test.go at the repository root would incorrectly match the /src/db/ or /src/binding/ CODEOWNERS patterns because "src/db_utils.cpp".startsWith("src/db") evaluates to true.

Only the file.startsWith(dir + '/') check is correct for directory patterns. Remove the bare file.startsWith(dir) branch:

Suggested change
if (file.startsWith(dir + '/') || file.startsWith(dir)) {
if (file.startsWith(dir + '/')) {

Comment on lines +3 to +5
on:
pull_request_target:
types: [opened, reopened, ready_for_review]
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 pull_request_target + checkout requires caution

The workflow uses the pull_request_target trigger, which runs with write permissions against the base repository. Although the actions/checkout@v4 step correctly checks out the base branch (not the PR head), the pulls.listFiles and issues.addAssignees calls operate on data from the PR author's fork. If a malicious contributor crafts a file with a specially crafted filename (e.g., containing shell metacharacters), the logging statements (console.log) in the script could be misused. This is low-risk here since no eval or shell execution is performed on filenames, but it's worth being aware of the pull_request_target attack surface and ensuring no future changes execute arbitrary content from the PR.

Comment on lines +99 to +115
let matchedOwner = null;
let matchedPattern = null;
for (const rule of codeowners) {
const pattern = rule.pattern;
if (pattern === '*') {
matchedOwner = rule.owners[0];
matchedPattern = pattern;
} else if (pattern.endsWith('/')) {
const dir = pattern.replace(/^\//, '').replace(/\/$/, '');
if (file.startsWith(dir + '/') || file.startsWith(dir)) {
matchedOwner = rule.owners[0];
matchedPattern = pattern;
}
} else if (file === pattern || file === pattern.replace(/^\//, '')) {
matchedOwner = rule.owners[0];
matchedPattern = pattern;
}
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Only first owner per CODEOWNERS rule is assigned

The script always uses rule.owners[0], ignoring any additional owners listed on the same CODEOWNERS line. While the current CODEOWNERS file only has one owner per pattern, this is a fragile assumption. If a pattern is later updated to have multiple owners (e.g., /src/core/ @richyreachy @feihongxu0824), only the first will ever be assigned, which may lead to missed review coverage.

Consider using rule.owners (the full array) when building the matchedOwners set:

// Instead of: matchedOwner = rule.owners[0];
rule.owners.forEach(o => matchedOwners.add(o));

Comment on lines +65 to +69
const { data: files } = await github.rest.pulls.listFiles({
owner: context.repo.owner,
repo: context.repo.repo,
pull_number: prNumber
});
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 listFiles not paginated — files beyond page 1 will be silently skipped

github.rest.pulls.listFiles returns at most 30 files per page by default. For any PR that touches more than 30 files, only the first 30 are fetched here, so CODEOWNERS owners for the remaining files are never computed and never assigned. Use github.paginate to consume all pages:

const files = await github.paginate(github.rest.pulls.listFiles, {
  owner: context.repo.owner,
  repo: context.repo.repo,
  pull_number: prNumber,
  per_page: 100
});
const changedFiles = files.map(f => f.filename);

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant