Define Provenance Metadata for OpenPAKT Reports and Findings #27

@meisterware-admin

Overview

Define provenance metadata for OpenPAKT reports and findings.

Provenance metadata captures where a finding originated, how it was produced, and what system generated it. This information is important for traceability, reproducibility, and trust when OpenPAKT findings are consumed by CI pipelines, dashboards, or security analysis tools.

This proposal explores how OpenPAKT can represent minimal provenance information while keeping the core specification lightweight and implementation-agnostic.

The goal is to allow tools to understand the origin and context of a finding without relying on scanner-specific formats.


Motivation

In real-world DevSecOps environments, security findings often pass through multiple systems before reaching a developer or security dashboard.

For example:

  • a scanner analyzes a repository in CI
  • findings are exported as an OpenPAKT report
  • the report is processed by a CI policy engine
  • results appear in a security dashboard
  • the same findings may be archived for audit purposes

Without provenance metadata, it becomes difficult to answer important questions such as:

  • Which scanner produced this finding?
  • What version of the scanner generated it?
  • What artifact or environment was analyzed?
  • When was the scan executed?
  • Was the finding produced during CI, local analysis, or another environment?

Providing standardized provenance metadata allows OpenPAKT findings to remain traceable and trustworthy across tools and workflows.

This is particularly important for automated security pipelines where results may influence deployment decisions.


Proposed Approach

Introduce optional provenance metadata fields within OpenPAKT reports.

These fields would describe the origin of the scan and the context in which findings were produced.

Example provenance metadata structure:

scan:
  tool:
    name: detektor
    version: 0.1.0
  environment:
    type: ci
    system: github_actions
  timestamp: 2026-03-08T10:00:00Z

Example JSON representation:

{
  "scan": {
    "tool": {
      "name": "detektor",
      "version": "0.1.0"
    },
    "environment": {
      "type": "ci",
      "system": "github_actions"
    },
    "timestamp": "2026-03-08T10:00:00Z"
  }
}

This structure aligns with the minimal report model already used by OpenPAKT, in which each report includes a scan section describing the tool and execution context.

Candidate provenance attributes, given relative to the scan section:

Field               Description
tool.name           Scanner or tool producing the report
tool.version        Version of the scanner
environment.type    Execution environment (CI, local, etc.)
environment.system  CI platform or runtime system
timestamp           Time when the scan was executed

These fields should remain minimal and optional, ensuring that scanners with limited metadata capabilities can still produce valid reports.
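
To illustrate what "minimal and optional" could mean on the producer side, here is a rough Python sketch of a helper that emits only the fields a scanner actually knows. The function name build_scan_section and its parameters are hypothetical and not part of OpenPAKT; the sketch assumes that an absent field is simply left out of the report rather than written as null.

```python
from typing import Optional


def build_scan_section(
    tool_name: Optional[str] = None,
    tool_version: Optional[str] = None,
    environment_type: Optional[str] = None,
    environment_system: Optional[str] = None,
    timestamp: Optional[str] = None,
) -> dict:
    """Build a `scan` object, leaving out any field the scanner does not know.

    Hypothetical helper: the name and signature are illustrative only.
    """

    def compact(fields: dict) -> dict:
        # Drop unset values and sub-objects that ended up empty.
        return {k: v for k, v in fields.items() if v not in (None, {})}

    return compact({
        "tool": compact({"name": tool_name, "version": tool_version}),
        "environment": compact({
            "type": environment_type,
            "system": environment_system,
        }),
        "timestamp": timestamp,
    })
```

Under these assumptions, a scanner that only knows its own name would call build_scan_section(tool_name="detektor") and get a scan section containing nothing but tool.name, while a scanner with no metadata at all would get an empty object.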


Alternatives Considered

Tool-specific metadata

Scanners could include proprietary metadata fields to describe provenance.

However, this approach reduces interoperability and makes it harder for external tools to interpret findings consistently.


External provenance tracking

Provenance information could be maintained entirely outside of OpenPAKT, such as within CI systems or security dashboards.

While this may work in some environments, embedding minimal provenance metadata in the report improves portability and traceability across systems.


Mandatory provenance metadata

Another option would be to require all provenance fields in OpenPAKT reports.

However, this would increase the burden on simple scanners and reduce the accessibility of the specification.

For this reason, provenance metadata should remain optional but recommended.


Risks and Trade-offs

Metadata inconsistency

Different scanners may populate provenance metadata differently.

Providing guidance and examples can help encourage consistent usage.
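
As one possible form such guidance could take, the Python sketch below normalizes two fields that scanners are likely to populate differently. The conventions it applies (timestamps in UTC with a trailing "Z", lowercase environment.type values, naive timestamps treated as UTC) are assumptions made for illustration; the proposal does not define them.

```python
from datetime import datetime, timezone


def normalize_timestamp(value: str) -> str:
    """Convert an ISO 8601 timestamp to UTC with a trailing 'Z'.

    Sub-second precision is dropped in this sketch.
    """
    text = value.strip()
    # datetime.fromisoformat() accepts a 'Z' suffix only from Python 3.11 on,
    # so rewrite it as an explicit offset first.
    if text.endswith(("Z", "z")):
        text = text[:-1] + "+00:00"
    parsed = datetime.fromisoformat(text)
    if parsed.tzinfo is None:
        # Assumption: timestamps without an offset are treated as UTC.
        parsed = parsed.replace(tzinfo=timezone.utc)
    return parsed.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")


def normalize_environment_type(value: str) -> str:
    """Trim and lowercase, so 'CI' and ' ci ' both become 'ci'."""
    return value.strip().lower()
```

With these helpers, a report stamped 2026-03-08T11:00:00+01:00 and one stamped 2026-03-08T10:00:00Z would compare equal after normalization.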


Privacy considerations

In some environments, including detailed provenance metadata could reveal internal system details.

Implementations should allow sensitive fields to be omitted when necessary.
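
A minimal Python sketch of how an implementation might omit sensitive fields before a report leaves a trusted environment. The helper omit_fields and its dotted-path convention (paths relative to the scan section, as in the field table) are hypothetical; the proposal only states that omission should be possible.

```python
import copy
from typing import Iterable


def omit_fields(report: dict, paths: Iterable[str]) -> dict:
    """Return a copy of `report` without the given fields of its `scan` section.

    Each path is dotted and relative to `scan`, e.g. "environment.system".
    Hypothetical helper; paths that do not exist are ignored.
    """
    redacted = copy.deepcopy(report)  # never mutate the caller's report
    for path in paths:
        node = redacted.get("scan")
        *parents, leaf = path.split(".")
        for key in parents:
            node = node.get(key) if isinstance(node, dict) else None
        if isinstance(node, dict):
            node.pop(leaf, None)
    return redacted
```

For example, omit_fields(report, ["environment.system"]) would drop the CI platform name while keeping environment.type, so downstream tools can still tell that the scan ran in CI.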


Specification growth

Adding provenance metadata expands the report structure.

This risk can be mitigated by keeping the model minimal and avoiding complex provenance chains.


Open Questions

  • Should provenance metadata remain purely optional, or should it be explicitly recommended for reports produced in CI environments?
  • Should provenance metadata include unique scan identifiers?
  • Should provenance metadata support linking findings to specific pipeline runs or commits?
  • Should OpenPAKT support multiple scanners contributing to a single report in future versions?

Examples

Example OpenPAKT report snippet showing provenance metadata:

{
  "schema_version": "0.1",
  "scan": {
    "tool": {
      "name": "detektor",
      "version": "0.1.0"
    },
    "environment": {
      "type": "ci",
      "system": "github_actions"
    },
    "timestamp": "2026-03-08T10:00:00Z"
  },
  "findings": []
}

This allows downstream systems to determine which tool produced the findings and in what context.
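
On the consumer side, a dashboard or CI policy engine might read the snippet above roughly as follows. This is a Python sketch, not a reference implementation: describe_provenance and its fallback strings are hypothetical, and it assumes any provenance field may be missing because the fields are optional.

```python
import json


def describe_provenance(report: dict) -> str:
    """Summarize a report's provenance, tolerating absent optional fields."""
    scan = report.get("scan") or {}
    tool = scan.get("tool") or {}
    env = scan.get("environment") or {}

    name = tool.get("name", "unknown tool")
    version = tool.get("version")
    producer = f"{name} {version}" if version else name

    context = env.get("type", "unknown environment")
    if env.get("system"):
        context += f" ({env['system']})"

    when = scan.get("timestamp", "unknown time")
    return f"{producer}, run in {context} at {when}"


report = json.loads("""
{
  "schema_version": "0.1",
  "scan": {
    "tool": {"name": "detektor", "version": "0.1.0"},
    "environment": {"type": "ci", "system": "github_actions"},
    "timestamp": "2026-03-08T10:00:00Z"
  },
  "findings": []
}
""")

print(describe_provenance(report))
# prints: detektor 0.1.0, run in ci (github_actions) at 2026-03-08T10:00:00Z
```

A report with no scan section still produces a usable summary, which reflects the optional nature of the fields.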


Next Steps

If this proposal gains support:

  1. Define the minimal provenance metadata fields for OpenPAKT reports.
  2. Document normative and recommended usage guidance.
  3. Provide example report snippets demonstrating provenance fields.
  4. Validate compatibility with CI pipelines and reporting systems.
  5. Implement provenance metadata generation in the Detektor reference scanner.

Architectural note

This issue is closely related to two other v0.2 ecosystem issues:

  • Cross-surface correlation model (relationships between components)
  • SARIF mapping (interoperability with external security dashboards)

Together these features establish the OpenPAKT interoperability layer, enabling findings to move across scanners, CI systems, and security tooling while preserving context and traceability.

Labels

  • design: Architectural or structural discussions affecting the direction of the specification.
  • enhancement: New feature or request
  • proposal: Early-stage ideas requiring discussion before becoming specification changes.
  • spec: OpenPAKT specification definition or normative behavior.

Project status: Backlog
