
[EPIC] [MVP] Improvements to Thoth advise output #434

@mayaCostantini

Description


Problem statement

As a Python Developer,
I would like to have concise information about the quality of my software stack and all its transitive dependencies,
so that I get some absolute metrics such as:

  • "95% of my dependencies are maintained with a dependency update tool (i.e. dependabot, etc)"
  • "45% of my dependencies have 3 or more maintainers"
  • ...

These metrics would be aggregated and compared to metrics for packages present in Thoth's database to provide a global quality metric for a given software stack, possibly broken down by criterion (maintenance, code quality, ...), expressed as a percentage or a score (A, B, C, ...).

We consider metrics derived from direct and transitive dependencies to be equally important, so both types of dependencies will carry the same weight.
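As a rough illustration, the per-stack percentages from the problem statement could be computed along these lines (the dependency metadata layout below is an assumption, not an existing Thoth schema):

```python
# Minimal sketch: derive the percentages from the problem statement for a stack
# whose direct and transitive dependencies have already been resolved.
# The metadata layout is an assumption for illustration only.
dependencies = {
    "flask": {"uses_dependency_update_tool": True, "maintainer_count": 4},
    "click": {"uses_dependency_update_tool": True, "maintainer_count": 2},
    "jinja2": {"uses_dependency_update_tool": False, "maintainer_count": 3},
}

def percentage(predicate) -> float:
    """Share of all dependencies (direct and transitive) matching a predicate."""
    return 100 * sum(predicate(meta) for meta in dependencies.values()) / len(dependencies)

print(f"{percentage(lambda m: m['uses_dependency_update_tool']):.0f}% use a dependency update tool")
print(f"{percentage(lambda m: m['maintainer_count'] >= 3):.0f}% have 3 or more maintainers")
```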

Proposal description

  1. Create an ADR regarding the implementation of the service as a bot (e.g. a GitHub App, Action, ...).
  2. PoC: implement an experimental thamos flag on the advise command to give users insights about the maintenance of their packages.
  3. Compute metrics for packages present in Thoth's database that will serve as a basis for a global software stack quality score.

Taking the example of OSSF Scorecards, we already aggregate this information in prescriptions, which are used directly by the adviser. However, the aggregation logic present in prescriptions-refresh-job only updates prescriptions for packages already present in the repository. We could either aggregate Scorecards data for more packages using the OSSF BigQuery dataset, or have our own tool that computes Scorecards metrics on each new package release, which could be integrated directly into package-update-job for instance. This would most likely consist of a simple script querying the GitHub API and computing the metrics on the project's last release commit.
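A minimal sketch of such a script, assuming a GitHub token in the environment and a couple of illustrative Scorecard-style checks (the real Scorecard tool covers many more):

```python
# Sketch of the "simple script querying the GitHub API" idea above.
# The chosen checks are illustrative only.
import os
import requests

GITHUB_API = "https://api.github.com"
HEADERS = {"Authorization": f"token {os.environ['GITHUB_TOKEN']}"}

def maintenance_signals(owner: str, repo: str) -> dict:
    """Compute a few maintenance signals for a repository."""
    base = f"{GITHUB_API}/repos/{owner}/{repo}"

    # Is a dependency update tool (Dependabot) configured?
    dependabot = requests.get(f"{base}/contents/.github/dependabot.yml", headers=HEADERS)

    # How many people contributed to the project?
    contributors = requests.get(f"{base}/contributors", headers=HEADERS, params={"per_page": 100})

    # Latest release, whose tagged commit the checks could be pinned to.
    release = requests.get(f"{base}/releases/latest", headers=HEADERS)

    return {
        "uses_dependency_update_tool": dependabot.status_code == 200,
        "maintainer_count": len(contributors.json()) if contributors.ok else 0,
        "last_release_tag": release.json().get("tag_name") if release.ok else None,
    }

print(maintenance_signals("thoth-station", "adviser"))
```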

  4. Schedule a new job to compute metrics on the aggregated data:
  • Implement the job to be run after each package-update-job or on a regular schedule
  • Compute percentiles for each metric that will serve as a basis to score a software stack
  5. Implement the global scoring logic

For example, if a software stack is in the 95th percentile of packages with the best development practices (CI/CD, testing...), score it as "A" for this category. Compute a global score from the different category scores.
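A sketch of this scoring step, assuming the percentiles are computed from metric values Thoth already stores for known packages (thresholds and sample values are illustrative):

```python
# Sketch of percentile-based scoring: rank a stack's aggregated metric against
# the distribution of values stored for known packages, then map the percentile
# to a letter grade. Thresholds and sample values are illustrative assumptions.
from bisect import bisect_left

def percentile_rank(value: float, known_values: list) -> float:
    """Percentage of known package values strictly below the given value."""
    ordered = sorted(known_values)
    return 100 * bisect_left(ordered, value) / len(ordered)

def grade(percentile: float) -> str:
    """Map a percentile to a coarse letter score for one category."""
    if percentile >= 95:
        return "A"
    if percentile >= 75:
        return "B"
    if percentile >= 50:
        return "C"
    return "D"

# Example: the stack's aggregated "development practices" metric vs. the database.
known = [0.2, 0.4, 0.55, 0.6, 0.7, 0.8, 0.85, 0.9, 0.93, 0.97]
print(grade(percentile_rank(0.98, known)))  # 100th percentile here -> "A"
```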

  • Gather user feedback/opinions on useful Scorecard metrics #442
  • Implement this logic either on the adviser side, by performing a database lookup when an advise is computed and integrating these metrics into the advise report, or on each endpoint separately if we wish to keep the information carried by the metrics separate from advise reports.
  • Make the scoring logic publicly accessible via justification URLs provided with each score
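For illustration, the new section of an advise report could look roughly like the following (field names and the justification link are hypothetical, not an existing Thoth schema):

```python
# Hypothetical shape of a stack-quality section attached to an advise report.
# Field names and the justification URL are assumptions for illustration.
stack_quality = {
    "overall_score": "B",
    "categories": {
        "maintenance": {"score": "A", "percentile": 96.2},
        "code_quality": {"score": "C", "percentile": 54.8},
    },
    # Hypothetical justification page explaining how the score was computed.
    "justification": "https://thoth-station.ninja/j/stack_quality",
}
```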

Additional context

Actionable items
If implemented, these improvements will most likely be a way for maintainers of a project to show their users that they use a trusted software stack. AFAICS, this would not provide any actionable feedback to developers about their dependencies.

Acceptance Criteria

To be defined.
