GDCD script for page counts, remove deprecated project #3
cbullinger
left a comment
love it!
1. Runs audit-cli once to identify projects that exist only in audit-cli (not in the log)
2. Re-runs audit-cli with those projects excluded using the `--exclude-dirs` flag
3. Compares the filtered results for a cleaner comparison
nice
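The three-step flow quoted above could be sketched roughly as follows. This is a minimal illustration, not the script from the PR: the helper names `onlyInAudit` and `excludeArgs`, the sample project names and counts, and the comma-separated value format for `--exclude-dirs` are all assumptions.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// onlyInAudit returns the project names that appear in auditCounts but not in
// logCounts, sorted for stable output. (Hypothetical helper.)
func onlyInAudit(logCounts, auditCounts map[string]int) []string {
	var extra []string
	for name := range auditCounts {
		if _, ok := logCounts[name]; !ok {
			extra = append(extra, name)
		}
	}
	sort.Strings(extra)
	return extra
}

// excludeArgs builds the extra arguments for the second audit-cli run.
// Assumption: --exclude-dirs accepts a comma-separated list; check the
// audit-cli help output for the real format.
func excludeArgs(projects []string) []string {
	if len(projects) == 0 {
		return nil
	}
	return []string{"--exclude-dirs", strings.Join(projects, ",")}
}

func main() {
	// Made-up sample data standing in for the parsed log and the first run.
	logCounts := map[string]int{"project-a": 120, "project-b": 80}
	auditCounts := map[string]int{"project-a": 118, "project-b": 80, "project-c": 12}

	extra := onlyInAudit(logCounts, auditCounts)
	fmt.Println("only in audit-cli:", extra)
	fmt.Println("second-run args:", excludeArgs(extra))
}
```

Sorting the excluded names keeps the second invocation deterministic, which makes the output easier to diff between runs.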
// projectNameMapping maps log file project names to their audit-cli equivalents.
// This handles cases where the same project has different names in the GDCD logs
// versus the audit-cli output. Add new mappings here as needed.
var projectNameMapping = map[string]string{
ugh another place to custom map names 😑
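For context, a lookup against a map like `projectNameMapping` might look something like the sketch below. The map entry and the `auditName` helper are hypothetical; the real map contents are not shown in this excerpt.

```go
package main

import "fmt"

// Illustrative stand-in for the map in the PR. The entry below is invented
// for this example and is not one of the real mappings.
var projectNameMapping = map[string]string{
	"example-log-name": "example-audit-name",
}

// auditName returns the audit-cli name for a log project name, falling back
// to the log name unchanged when no mapping exists. (Hypothetical helper.)
func auditName(logName string) string {
	if mapped, ok := projectNameMapping[logName]; ok {
		return mapped
	}
	return logName
}

func main() {
	fmt.Println(auditName("example-log-name"))
	fmt.Println(auditName("unmapped-project"))
}
```

With a fallback like this, a new project that lacks a mapping and whose names differ would surface as "only in log" rather than failing the run.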
audit/gdcd/scripts/README.md
Outdated
- **Only in log**: Projects found in the log but not in audit-cli output (may indicate project name mismatches)
- **Only in audit-cli**: Projects found in audit-cli but not in the log - these are automatically excluded in the second run for a cleaner comparison
i'm confused by the output, i think. will these ever be populated? e.g. i see we have a handful of "only in <log/audit-cli>" entries but these are both 0 in the summary -- would they have values on the first run?
The way the code is structured, they're populated on the "initial run", and then the tool re-runs audit-cli with excluded dirs for the "only in audit-cli" entries. At that point, that number is reduced to 0. The "only in log" entries can be populated by new projects that we haven't added a name mapping for (if the project name in the log does not match the name in audit-cli), but will probably be 0 other than that.
the example we give shows both types in the results, though, which is why i'm confused. shouldn't the summary only in log reflect the three results that are marked with only in log?
i'm also not really seeing the value of showing the only in audit-cli if it effectively gets reduced to 0 every time
Yeah, good points. Made some minor tweaks to the way the output is generated to:
- Omit `only in audit-cli` since it should never be populated
- Only conditionally show `only in log` if there are projects that only appear in the log but not `audit-cli`
- Also check that `audit-cli` is available before trying to run the thing
I also updated the example output in the README so hopefully this is all consistent now. 🤞
Co-authored-by: cory <115956901+cbullinger@users.noreply.github.com>
This PR adds a new script to compare the GDCD ingest logs (from Snooty Data API ingest job) to the `audit-cli` output from local monorepo files.

In investigating discrepancies, I also discovered that `docs-k8s-operator` is deprecated and we should no longer be ingesting data for it during our weekly ingest job.