Buildfarm Spreadsheet Automation #202

@Crola1702

Problem Description

Currently, the Buildfarm Issues Spreadsheet is manually updated once per week. This implies, first, that the spreadsheet is sometimes behind the current state of the issues that are happening; and second, that if we don't update the spreadsheet, the only other source of truth for those issues is the daily report.

The spreadsheet contains 3 main tabs (by project): Buildfarm Issues, Jobs Priority and Greenness Report History.

  • Buildfarm Issues: A collection of the current state of the issues reported by the buildfarmers.
  • Jobs Priority: An informative tab of the job names and assigned priorities based on job maintenance support tiers (and regex patterns).
  • Greenness Report History: The historic greenness of a project by collection.

Project Description

Goal: Automate the update of the Buildfarm Issues Spreadsheet using one source of truth (the buildfarmer database) of the current buildfarm issues state.

Epics:

  • Planning: Define what needs to be automated and how it is going to be integrated with the existing scripts
  • Execution: Automate each part of the Buildfarm Issues Spreadsheet

Automation objectives:

Buildfarm Issues

Columns data source:

| Column | Description | Possible Data Source |
| Issue | Issue name and link (for easy access) | Link: Buildfarmer database; Name: gh CLI output |
| Assignee | Developer assigned to fix the issue | gh CLI output |
| Priority | Priority assigned to an issue, calculated from buildfarm-tools/database/scripts/lib/buildfarm_tools.rb -> calculate_issue_priority | Buildfarmer database (and scripts) |
| Status | Current status of the issue. It can be one of: Not Assigned, Investigating, Pending Fix, Blocked, Completed, To Be Disabled, Help Wanted, Obliviated, Disabled | Option 1: Each status has its own set of rules that determine when it applies. Its source might be waterfall rules written in the scripts, which take into account the Buildfarmer database, gh CLI output and manual changes. E.g., Labels + Linked PRs + Flakiness over time |
| Notes | Manual notes for the issues | This is not actually added to the database, but instead to the spreadsheet |
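To make the "waterfall rules" idea for the Status column concrete, here is a minimal sketch. The `IssueInfo` fields, the label names and the order of the rules are all assumptions for illustration; the real rules are still to be defined (task 4.1.2), and only some of the listed statuses are covered.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class IssueInfo:
    """Hypothetical view of one issue, merged from the database and gh CLI output."""
    state: str = "open"                  # "open" or "closed"
    labels: set = field(default_factory=set)
    assignee: Optional[str] = None
    has_linked_pr: bool = False
    manual_status: Optional[str] = None  # manual override, if any


def derive_status(issue: IssueInfo) -> str:
    """Return the status of the first rule that matches, checked top to bottom."""
    if issue.manual_status:              # assumption: manual changes win
        return issue.manual_status
    if issue.state == "closed":
        return "Completed"
    if "blocked" in issue.labels:        # hypothetical label name
        return "Blocked"
    if "help wanted" in issue.labels:    # hypothetical label name
        return "Help Wanted"
    if issue.has_linked_pr:
        return "Pending Fix"
    if issue.assignee:
        return "Investigating"
    return "Not Assigned"


print(derive_status(IssueInfo(assignee="someone", has_linked_pr=True)))
```

Because the rules are checked in order, each status only needs to describe its own condition; changing precedence is a matter of reordering the checks.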

Schema update to a dynamic table: To keep spreadsheet generation separate from the existing scripts, we might want the test_fail_issues table updated in place with up-to-date information on all the issues (as part of the dailyWorkflow, for example). New columns will help track the state of the issues, e.g., CreatedAt, UpdatedAt, IssueLastActivity, Priority, Assignee, Status, LastStatusUpdatedBy (who authorized a change to a disabled test, or to a blocked issue?). Also, to keep track of how the issues are being updated, we should enable debug logging for the scripts that actually update the issues.
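As a rough sketch of the schema update, the snippet below adds the proposed columns to a stand-in table, assuming the buildfarmer database is SQLite. The column types and the one-column stand-in schema are assumptions; the real test_fail_issues table has its own existing columns.

```python
import logging
import sqlite3

logger = logging.getLogger(__name__)

# Column names come from the proposal; the SQL types are assumptions.
NEW_COLUMNS = {
    "CreatedAt": "TEXT",
    "UpdatedAt": "TEXT",
    "IssueLastActivity": "TEXT",
    "Priority": "REAL",
    "Assignee": "TEXT",
    "Status": "TEXT",
    "LastStatusUpdatedBy": "TEXT",
}


def add_missing_columns(conn, table="test_fail_issues"):
    """Add each proposed column that the table does not have yet."""
    # SQLite column names are case-insensitive, so compare in lower case.
    existing = {row[1].lower() for row in conn.execute(f"PRAGMA table_info({table})")}
    added = []
    for name, sql_type in NEW_COLUMNS.items():
        if name.lower() not in existing:
            conn.execute(f"ALTER TABLE {table} ADD COLUMN {name} {sql_type}")
            logger.debug("added column %s to %s", name, table)
            added.append(name)
    conn.commit()
    return added


# Stand-in database with a hypothetical minimal schema.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE test_fail_issues (github_issue TEXT)")
print(add_missing_columns(conn))
```

Checking the existing columns first makes the migration safe to run more than once, which matters if it ends up inside a daily workflow.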

Buildfarmer metrics: Updating the same table in place can lose information. However, it might be useful to save snapshots of the issue status at a regular interval (e.g., weekly) to track buildfarmer and PMC performance, such as "Time to Resolution" for issues over time.
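A small sketch of how "Time to Resolution" could be derived from such snapshots. The snapshot layout (date, issue, status) and the sample rows are made up for illustration; treating "Completed" as the resolved status is also an assumption.

```python
from datetime import date

# Hypothetical weekly snapshots: (snapshot_date, issue, status)
snapshots = [
    (date(2024, 1, 1), "issue-A", "Not Assigned"),
    (date(2024, 1, 8), "issue-A", "Investigating"),
    (date(2024, 1, 15), "issue-A", "Completed"),
    (date(2024, 1, 8), "issue-B", "Not Assigned"),
    (date(2024, 1, 15), "issue-B", "Not Assigned"),
]


def time_to_resolution(rows, resolved_status="Completed"):
    """Days between an issue's first snapshot and its first resolved snapshot."""
    first_seen, resolved = {}, {}
    for day, issue, status in sorted(rows):   # oldest snapshot first
        first_seen.setdefault(issue, day)
        if status == resolved_status:
            resolved.setdefault(issue, day)
    # Issues that were never resolved are left out of the result.
    return {issue: (resolved[issue] - first_seen[issue]).days for issue in resolved}


print(time_to_resolution(snapshots))  # {'issue-A': 14}
```

With weekly snapshots the result is only accurate to the week; the CreatedAt and UpdatedAt columns would give a finer measurement if they are recorded.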

Jobs priority

This is not an automation but a migration of this data source. It is currently updated manually following "Add a new Gazebo Distribution".

The job names are fetched using common.py. We might need a new source file for job pattern priorities, maybe based on common.py.
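A source file of job pattern priorities could look roughly like the sketch below, which maps job names to priorities with ordered regex patterns and writes them as CSV. The patterns, priority values, job names and CSV header are all hypothetical; the real job names would come from common.py.

```python
import csv
import io
import re

# Hypothetical (pattern, priority) pairs, checked in order: first match wins.
PRIORITY_PATTERNS = [
    (re.compile(r"-ci-main-"), 3),
    (re.compile(r"-ci-"), 2),
    (re.compile(r"-install-"), 1),
]
DEFAULT_PRIORITY = 0  # assumption: jobs matching no pattern get the lowest priority


def priority_for(job_name):
    """Return the priority of the first pattern that matches the job name."""
    for pattern, priority in PRIORITY_PATTERNS:
        if pattern.search(job_name):
            return priority
    return DEFAULT_PRIORITY


def write_priorities(job_names, out):
    """Write one CSV row per job: job_name, priority."""
    writer = csv.writer(out)
    writer.writerow(["job_name", "priority"])
    for name in sorted(job_names):
        writer.writerow([name, priority_for(name)])


# Made-up job names, for illustration only.
jobs = ["libfoo-ci-main-linux", "libfoo-ci-release1-linux",
        "libfoo-install-pkg-linux", "libfoo-docs"]
buffer = io.StringIO()
write_priorities(jobs, buffer)
print(buffer.getvalue())
```

Since the first matching pattern wins, more specific patterns have to be listed before more general ones.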

Greenness History

The source is the greenness_report.py script. The easy part is that we only need to upload the new greenness values to the spreadsheet whenever we generate the report.
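One way to upload only the new values is to filter the report rows against what the tab already holds before appending. This sketch is independent of any spreadsheet library, and the (date, collection, greenness) row layout and sample values are assumptions.

```python
def new_greenness_rows(existing_rows, report_rows):
    """Return the report rows whose (date, collection) pair is not in the tab yet."""
    seen = {(row[0], row[1]) for row in existing_rows}
    return [row for row in report_rows if (row[0], row[1]) not in seen]


# Made-up rows: (date, collection, greenness percentage)
existing = [("2024-01-08", "collection-a", 91.0)]
report = [("2024-01-08", "collection-a", 91.0),
          ("2024-01-15", "collection-a", 93.5)]
print(new_greenness_rows(existing, report))
```

Filtering first keeps the upload idempotent, so regenerating the same report twice does not duplicate rows.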

Tasks (WBS)

  • 1.0 New tools required.
    • 1.1 Investigate libraries to modify spreadsheets.
    • 1.2 Describe integration of each library with the current tools.
  • 2.0 Migrate Job priorities
    • 2.1 Define job patterns for each priority
    • 2.2 Code generate_priorities.py, to generate job_priorities.csv
    • 2.3 Integrate new script into the workflow
  • 3.0 Greenness Report History
    • 3.1 Update greenness_report.py to update Greenness tab whenever we generate the report.
  • 4.0 Buildfarm Issues
    • 4.1 Automate test-fail-issues table in-place updates
      • 4.1.1 Add new columns to the table
      • 4.1.2 Define rules for issue status
      • 4.1.3 Identify information from gh CLI scripts
      • 4.1.4 Create one single script to manage updates (join refresh_known_open_issues.sh and close_old_known_issues with their respective changes).
    • 4.2 Integrate new changes to the current workflow
      • 4.2.1 Separate the responsibility of spreadsheet update to a new script (e.g., update_spreadsheet.py) which fetches the updated buildfarmer database (be wary of gh api limits)
      • 4.2.2 Integrate new script with current workflow
