Open
Description
It would be more credible to show a measured result, for example: "With the 100k token dump, Opus 4.5 achieves a 95% improvement in finding vulnerabilities in Y codebase, measured across 50 single-shot scenarios."
Without this, it's hard to optimise or improve the existing context.
How do you know if a change is a measurable improvement?
And if someone makes a different set of security guidelines, how do you know if they're as good (or better) than this?
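To make the questions above concrete, here is a rough sketch of what such a measurement could look like. It is not an existing harness from this repo: `Scenario`, `Reviewer`, `detection_rate`, `relative_improvement` and `toy_reviewer` are all hypothetical names, and `toy_reviewer` is a keyword-matching stand-in for a real model call. The idea is to run the same labelled scenarios with two different contexts and compare the share of known vulnerabilities found.

```python
from dataclasses import dataclass
from typing import Callable, Sequence


@dataclass(frozen=True)
class Scenario:
    """One single-shot test case: code to review plus its known vulnerability IDs."""
    code: str
    known_vulns: frozenset[str]


# Hypothetical reviewer signature: (context, code) -> set of reported vulnerability IDs.
Reviewer = Callable[[str, str], set[str]]


def detection_rate(reviewer: Reviewer, context: str,
                   scenarios: Sequence[Scenario]) -> float:
    """Fraction of known vulnerabilities the reviewer reports across all scenarios."""
    found = 0
    total = 0
    for scenario in scenarios:
        reported = reviewer(context, scenario.code)
        found += len(reported & scenario.known_vulns)
        total += len(scenario.known_vulns)
    return found / total if total else 0.0


def relative_improvement(baseline: float, candidate: float) -> float:
    """Relative change of candidate over baseline, e.g. 0.95 means a 95% improvement."""
    if baseline == 0:
        raise ValueError("baseline rate is zero; relative improvement is undefined")
    return (candidate - baseline) / baseline


def toy_reviewer(context: str, code: str) -> set[str]:
    """Stand-in for a model call (assumption): flags a pattern only if the context names it."""
    patterns = {"sqli": "execute(", "xss": "innerHTML"}
    return {vid for vid, needle in patterns.items()
            if vid in context and needle in code}


scenarios = [
    Scenario("cursor.execute(query + user_input)", frozenset({"sqli"})),
    Scenario("element.innerHTML = name", frozenset({"xss"})),
]

# Compare two candidate contexts on the same scenarios.
old_rate = detection_rate(toy_reviewer, "guidelines: sqli", scenarios)
new_rate = detection_rate(toy_reviewer, "guidelines: sqli, xss", scenarios)
improvement = relative_improvement(old_rate, new_rate)
```

This only counts true positives. A real comparison would probably also need to track false positives and repeat runs to account for model variance, otherwise a context that flags everything would look best.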