Description
Problem statement
As a Python Developer,
I would like to have concise information about the quality of my software stack and all its transitive dependencies,
so that I get some absolute metrics such as:
- "95% of my dependencies are maintained with a dependency update tool (i.e. dependabot, etc)"
- "45% of my dependencies have 3 or more maintainers"
- ...
These metrics would be aggregated and compared to the metrics of packages present in Thoth's database to provide a global quality metric for a given software stack, possibly per criterion (maintenance, code quality, ...), in the form of a percentage or a score (A, B, C, ...).
We consider metrics derived from direct and transitive dependencies to be equally important, so both types of dependencies carry the same weight in the computation.
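As a rough illustration (not part of the proposal itself), the per-stack percentages above could be computed from per-dependency metadata along these lines; the field names has_update_tool and maintainer_count are hypothetical placeholders, not actual Thoth schema fields:

```python
from typing import Dict, List


def stack_metrics(dependencies: List[Dict]) -> Dict[str, float]:
    """Compute aggregate percentages over direct and transitive dependencies alike."""
    total = len(dependencies)
    if total == 0:
        return {}

    # Dependencies maintained with a dependency update tool (e.g. Dependabot).
    with_update_tool = sum(1 for d in dependencies if d.get("has_update_tool"))
    # Dependencies with 3 or more maintainers.
    with_maintainers = sum(1 for d in dependencies if d.get("maintainer_count", 0) >= 3)

    return {
        "dependency_update_tool": 100 * with_update_tool / total,
        "three_or_more_maintainers": 100 * with_maintainers / total,
    }


if __name__ == "__main__":
    deps = [
        {"name": "requests", "has_update_tool": True, "maintainer_count": 4},
        {"name": "urllib3", "has_update_tool": True, "maintainer_count": 2},
    ]
    print(stack_metrics(deps))
```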
Proposal description
- Create an ADR w.r.t. the implementation of the service as a bot (e.g. GitHub App, Action, ...)?
- PoC: Implement an experimental thamos flag on the advise command to give users insights about the maintenance of their packages
- Print a concise summary of Scorecard metrics when the `--scoring` flag is passed to `thamos advise` thamos#1149
- Implement the logic to compute maintenance metrics for dependencies of an advised software stack using data from the advise report thamos#1148
- Compute metrics for packages present in Thoth's database that will serve as a basis for a global software stack quality score
Taking OSSF Scorecards as an example, we already aggregate this information in prescriptions, which are used directly by the adviser. However, the aggregation logic present in prescriptions-refresh-job only updates prescriptions for packages already present in the repository. We could either aggregate Scorecards data for more packages using the OSSF BigQuery dataset, or have our own tool that computes Scorecards metrics on each new package release, which could be integrated directly into package-update-job for instance. This would most likely consist of a simple script querying the GitHub API and computing the metrics on the project's last release commit; a sketch of such a script is included after this list.
- Aggregate Scorecards metrics on a new package release #440
- Create a new table for storing Scorecard metrics storages#2668
- Schedule a new job to compute metrics on aggregated data
- Implement the job to be run after each `package-update-job` or on a regular schedule
- Compute percentiles for each metric that will serve as a basis to score a software stack
- Implement the global scoring logic
For example, if a software stack is in the 95th percentile of packages with the best development practices (CI/CD, testing...), score it as "A" for this category. Compute a global score from the different category scores. A sketch of this scoring logic is included after this list.
- Gather user feedback/opinions on useful Scorecard metrics #442
- Implement this logic either on the adviser side, by performing a lookup to the database when an advise is computed and integrating these metrics into the advise report, or on each endpoint separately if we wish to keep the information carried by metrics separate from advise reports.
- Make the scoring logic publicly accessible via justification URLs provided with each score
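A possible sketch for aggregating Scorecard data on a new package release, assuming the OSSF `scorecard` CLI is installed and a GITHUB_AUTH_TOKEN is exported; the `scorecard_checks` helper and the example repository are illustrative only, not part of package-update-job today:

```python
import json
import subprocess


def scorecard_checks(repo: str) -> dict:
    """Run the OSSF Scorecard CLI against a repository and return its per-check scores.

    `repo` is e.g. "github.com/psf/requests".
    """
    result = subprocess.run(
        ["scorecard", f"--repo={repo}", "--format=json"],
        capture_output=True,
        text=True,
        check=True,
    )
    report = json.loads(result.stdout)
    # Keep only the per-check scores, keyed by check name.
    return {check["name"]: check["score"] for check in report.get("checks", [])}


if __name__ == "__main__":
    print(scorecard_checks("github.com/psf/requests"))
```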
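A minimal sketch of the percentile-based grading mentioned above; the grade boundaries, helper names, and in-memory sample scores are illustrative placeholders for data that would come from Thoth's database:

```python
from bisect import bisect_left
from typing import Sequence


def percentile_of(value: float, population: Sequence[float]) -> float:
    """Percentile rank of `value` within a population of per-package scores."""
    ordered = sorted(population)
    return 100 * bisect_left(ordered, value) / len(ordered)


def letter_grade(percentile: float) -> str:
    """Map a percentile rank to a coarse letter grade (boundaries are illustrative)."""
    if percentile >= 95:
        return "A"
    if percentile >= 75:
        return "B"
    if percentile >= 50:
        return "C"
    return "D"


if __name__ == "__main__":
    # Scores for the "development practices" category of all known packages.
    known_packages = [2.1, 3.4, 4.7, 5.0, 6.2, 7.8, 8.5, 9.1, 9.6, 9.9]
    stack_score = 9.95  # aggregate score of the advised software stack
    rank = percentile_of(stack_score, known_packages)
    print(f"{rank:.0f}th percentile -> grade {letter_grade(rank)}")
```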
Additional context
Actionable items
If implemented, those improvements will most likely be a way for maintainers of a project to show their users that they use a trusted software stack. AFAICS, this would not provide any actionable feedback to developers about their dependencies.
Acceptance Criteria
To be defined.