Overview
Define provenance metadata for OpenPAKT reports and findings.
Provenance metadata captures where a finding originated, how it was produced, and what system generated it. This information is important for traceability, reproducibility, and trust when OpenPAKT findings are consumed by CI pipelines, dashboards, or security analysis tools.
This proposal explores how OpenPAKT can represent minimal provenance information while keeping the core specification lightweight and implementation-agnostic.
The goal is to allow tools to understand the origin and context of a finding without relying on scanner-specific formats.
Motivation
In real-world DevSecOps environments, security findings often pass through multiple systems before reaching a developer or security dashboard.
For example:
- a scanner analyzes a repository in CI
- findings are exported as an OpenPAKT report
- the report is processed by a CI policy engine
- results appear in a security dashboard
- the same findings may be archived for audit purposes
Without provenance metadata, it becomes difficult to answer important questions such as:
- Which scanner produced this finding?
- What version of the scanner generated it?
- What artifact or environment was analyzed?
- When was the scan executed?
- Was the finding produced during CI, local analysis, or another environment?
Providing standardized provenance metadata allows OpenPAKT findings to remain traceable and trustworthy across tools and workflows.
This is particularly important for automated security pipelines where results may influence deployment decisions.
Proposed Approach
Introduce optional provenance metadata fields within OpenPAKT reports.
These fields would describe the origin of the scan and the context in which findings were produced.
Example provenance metadata structure:

    scan:
      tool:
        name: detektor
        version: 0.1.0
      environment:
        type: ci
        system: github_actions
      timestamp: 2026-03-08T10:00:00Z

Example JSON representation:

    {
      "scan": {
        "tool": {
          "name": "detektor",
          "version": "0.1.0"
        },
        "environment": {
          "type": "ci",
          "system": "github_actions"
        },
        "timestamp": "2026-03-08T10:00:00Z"
      }
    }

This structure aligns with the minimal report model already used by OpenPAKT reports, which include a scan section describing the tool and execution context.
Possible provenance attributes may include:
| Field | Description |
|---|---|
| tool.name | Scanner or tool producing the report |
| tool.version | Version of the scanner |
| environment.type | Execution environment (CI, local, etc.) |
| environment.system | CI platform or runtime system |
| timestamp | Time when the scan was executed |
These fields should remain minimal and optional, ensuring that scanners with limited metadata capabilities can still produce valid reports.
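To illustrate what "optional" means for a consumer, here is a minimal Python sketch of reading the fields listed above from a parsed report without requiring any of them. The `Provenance` class and `read_provenance` function are hypothetical names chosen for this example; they are not part of OpenPAKT or Detektor.

```python
import json
from dataclasses import dataclass
from typing import Optional


@dataclass
class Provenance:
    """Provenance fields from the table above; every field may be absent."""
    tool_name: Optional[str] = None
    tool_version: Optional[str] = None
    environment_type: Optional[str] = None
    environment_system: Optional[str] = None
    timestamp: Optional[str] = None


def read_provenance(report: dict) -> Provenance:
    """Extract provenance from a parsed report, tolerating missing sections."""
    scan = report.get("scan") or {}
    tool = scan.get("tool") or {}
    environment = scan.get("environment") or {}
    return Provenance(
        tool_name=tool.get("name"),
        tool_version=tool.get("version"),
        environment_type=environment.get("type"),
        environment_system=environment.get("system"),
        timestamp=scan.get("timestamp"),
    )


# A scanner with limited metadata capabilities might only report its name.
minimal = json.loads('{"scan": {"tool": {"name": "detektor"}}, "findings": []}')
provenance = read_provenance(minimal)
print(provenance.tool_name)   # detektor
print(provenance.timestamp)   # None
```

A consumer written this way accepts both fully populated and sparse reports, which is the behaviour the optional-field design is meant to allow.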
Alternatives Considered
Tool-specific metadata
Scanners could include proprietary metadata fields to describe provenance.
However, this approach reduces interoperability and makes it harder for external tools to interpret findings consistently.
External provenance tracking
Provenance information could be maintained entirely outside of OpenPAKT, such as within CI systems or security dashboards.
While this may work in some environments, embedding minimal provenance metadata in the report improves portability and traceability across systems.
Mandatory provenance metadata
Another option would be to require all provenance fields in OpenPAKT reports.
However, this would increase the burden on simple scanners and reduce the accessibility of the specification.
For this reason, provenance metadata should remain optional but recommended.
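One way a consumer could treat "optional but recommended" is to report missing fields as warnings instead of rejecting the report. The sketch below assumes the five fields from the attribute table are the recommended set; `RECOMMENDED_FIELDS` and `missing_recommended_fields` are hypothetical names used only for illustration.

```python
from typing import List

# Field paths under "scan", assumed here to be the recommended set.
RECOMMENDED_FIELDS = (
    "tool.name",
    "tool.version",
    "environment.type",
    "environment.system",
    "timestamp",
)


def missing_recommended_fields(report: dict) -> List[str]:
    """Return the recommended provenance paths that the report does not set."""
    scan = report.get("scan")
    missing = []
    for path in RECOMMENDED_FIELDS:
        node = scan
        # Walk the dotted path; any missing or non-dict step yields None.
        for key in path.split("."):
            node = node.get(key) if isinstance(node, dict) else None
        if node is None:
            missing.append(path)
    return missing


report = {
    "schema_version": "0.1",
    "scan": {"tool": {"name": "detektor"}},
    "findings": [],
}
# The report stays valid; gaps are surfaced as warnings only.
for path in missing_recommended_fields(report):
    print(f"warning: recommended provenance field scan.{path} is missing")
```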
Risks and Trade-offs
Metadata inconsistency
Different scanners may populate provenance metadata differently.
Providing guidance and examples can help encourage consistent usage.
Privacy considerations
In some environments, including detailed provenance metadata could reveal internal system details.
Implementations should allow sensitive fields to be omitted when necessary.
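As a rough sketch of how omission could work in practice, the Python function below strips selected provenance fields from a report before it leaves a trusted environment. The function name `redact_provenance` and the dotted-path convention are assumptions made for this example, not something the proposal defines.

```python
import copy
from typing import Iterable


def redact_provenance(report: dict, sensitive_paths: Iterable[str]) -> dict:
    """Return a copy of the report with the given scan fields removed."""
    redacted = copy.deepcopy(report)  # never mutate the caller's report
    scan = redacted.get("scan")
    if not isinstance(scan, dict):
        return redacted
    for path in sensitive_paths:
        *parents, leaf = path.split(".")
        node = scan
        for key in parents:
            node = node.get(key)
            if not isinstance(node, dict):
                node = None
                break
        if node is not None:
            node.pop(leaf, None)
    # Drop sections that became empty so the report carries no hollow objects.
    for key in [k for k, v in scan.items() if v == {}]:
        del scan[key]
    return redacted


report = {
    "schema_version": "0.1",
    "scan": {
        "tool": {"name": "detektor", "version": "0.1.0"},
        "environment": {"type": "ci", "system": "github_actions"},
        "timestamp": "2026-03-08T10:00:00Z",
    },
    "findings": [],
}
public_report = redact_provenance(report, ["environment.system"])
print(public_report["scan"]["environment"])  # {'type': 'ci'}
```

Because all provenance fields are optional, the redacted report remains structurally valid.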
Specification growth
Adding provenance metadata expands the report structure.
This risk can be mitigated by keeping the model minimal and avoiding complex provenance chains.
Open Questions
- Should provenance metadata remain optional or become recommended for CI environments?
- Should provenance metadata include unique scan identifiers?
- Should provenance metadata support linking findings to specific pipeline runs or commits?
- Should OpenPAKT support multiple scanners contributing to a single report in future versions?
Examples
Example OpenPAKT report snippet showing provenance metadata:

    {
      "schema_version": "0.1",
      "scan": {
        "tool": {
          "name": "detektor",
          "version": "0.1.0"
        },
        "environment": {
          "type": "ci",
          "system": "github_actions"
        },
        "timestamp": "2026-03-08T10:00:00Z"
      },
      "findings": []
    }

This allows downstream systems to understand who produced the findings and under what context.
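For the producing side, a scanner could assemble the `scan` section roughly as follows. This is a sketch under stated assumptions: `build_scan_metadata` is a hypothetical helper, the `"local"` value is one possible reading of "CI, local, etc.", and environment detection relies on the `GITHUB_ACTIONS` and `CI` variables that GitHub Actions sets to `true`. Other CI systems would need their own detection.

```python
import json
import os
from datetime import datetime, timezone
from typing import Mapping, Optional


def build_scan_metadata(
    tool_name: str,
    tool_version: str,
    env: Optional[Mapping[str, str]] = None,
    now: Optional[datetime] = None,
) -> dict:
    """Build the provenance "scan" section for a report."""
    env = os.environ if env is None else env
    now = now or datetime.now(timezone.utc)

    if env.get("GITHUB_ACTIONS") == "true":
        environment = {"type": "ci", "system": "github_actions"}
    elif env.get("CI"):
        # Some CI system, but not one this sketch recognises; omit "system".
        environment = {"type": "ci"}
    else:
        environment = {"type": "local"}  # assumed value for non-CI runs

    return {
        "tool": {"name": tool_name, "version": tool_version},
        "environment": environment,
        # UTC timestamp in the same format as the examples in this proposal.
        "timestamp": now.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ"),
    }


report = {
    "schema_version": "0.1",
    "scan": build_scan_metadata("detektor", "0.1.0"),
    "findings": [],
}
print(json.dumps(report, indent=2))
```

Passing `env` and `now` explicitly keeps the helper deterministic in tests, while the defaults read the real process environment and clock.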
Next Steps
If this proposal gains support:
- Define the minimal provenance metadata fields for OpenPAKT reports.
- Document normative and recommended usage guidance.
- Provide example report snippets demonstrating provenance fields.
- Validate compatibility with CI pipelines and reporting systems.
- Implement provenance metadata generation in the Detektor reference scanner.
Architectural note
This issue is closely related to two other v0.2 ecosystem issues:
- Cross-surface correlation model (relationships between components)
- SARIF mapping (interoperability with external security dashboards)
Together these features establish the OpenPAKT interoperability layer, enabling findings to move across scanners, CI systems, and security tooling while preserving context and traceability.