Merged
218 changes: 200 additions & 18 deletions spec/ci-policy.md
@@ -1,35 +1,217 @@
Specification: OpenPAKT
Document: CI Policy Semantics
Document: CI Policy Evaluation Semantics
Version: v0.1
Status: Draft

# OpenPAKT — CI Policy Semantics
# OpenPAKT — CI Policy Evaluation Semantics

## Overview
## Purpose

This document defines the **CI policy semantics** used to evaluate OpenPAKT security findings within continuous integration pipelines.
This document defines a minimal, deterministic model for evaluating OpenPAKT findings in CI.

CI policies enable automated enforcement of security requirements based on finding severity and taxonomy categories.
The v0.1 model provides a tool-independent way to determine pass/fail outcomes from normalized findings.

## Design Goals
CI policy evaluation operates on findings that conform to the OpenPAKT report schema.

CI policy semantics are designed to:
CI evaluation input is the normalized findings array from an OpenPAKT report (`report.findings`) or an equivalent extracted normalized findings list.

- enable deterministic CI pipeline evaluation
- support consistent enforcement across tools
- remain simple and portable
- allow flexible policy configuration
OpenPAKT v0.1 CI policy evaluation applies to normalized findings and does **not** directly evaluate scenario definitions or scenario execution outcomes.

## Specification
## Scope

CI policies operate on OpenPAKT findings and determine whether a build should pass, fail, or report warnings.
This document defines:

Detailed policy evaluation rules will be defined in future revisions.
- a minimal CI policy input shape
- deterministic pass/fail evaluation rules
- deterministic handling for ignored severities and ignored finding types
- severity threshold behavior aligned to the OpenPAKT severity model
- compatibility guidance for CI systems and external reporting formats

## Examples
This document does **not** define:

Examples of CI policy evaluation will be included in future revisions of the OpenPAKT specification.
- a policy DSL or query language
- scanner normalization logic
- taxonomy or severity definitions (see dedicated specification documents)
- SARIF mapping
- provenance or registry semantics
- implementation-specific workflow logic

## Compatibility Considerations
## Design goals

CI policy semantics are designed to integrate with common CI systems such as GitHub Actions, GitLab CI, and Azure Pipelines.
The v0.1 CI policy evaluation semantics are designed to be:

- minimal
- deterministic
- implementation-agnostic
- CI-friendly
- compatible with simple pipeline gate behavior

## Normative guidance

- CI policy evaluation **MUST** operate on normalized OpenPAKT findings.
- Evaluators **MUST** apply the severity ordering defined in the OpenPAKT severity model and referenced in this document.
- Policies **MUST** define `fail_on`, and the value **MUST** be one of the severity levels defined in the OpenPAKT severity model.
- Evaluators **MUST** treat policies with a missing `fail_on` key or unsupported `fail_on` value as invalid input and **MUST** stop evaluation with an `invalid-policy` result (no pass/fail decision is produced).
- Policies **MAY** define `ignore_severities`.
- Policies **MAY** define `ignore_types`.
- Evaluators **MUST** ignore unknown top-level policy keys.
- If present, `ignore_severities` **MUST** be an array of strings; entries that are not severity levels defined in the OpenPAKT severity model **MUST** be ignored.
- If present, `ignore_types` **MUST** be an array of strings; entries that are not canonical taxonomy identifiers defined in the OpenPAKT taxonomy specification **MUST** be ignored.
- Evaluators **MUST** treat non-array `ignore_severities`/`ignore_types` values as invalid policy input and **MUST** stop evaluation with an `invalid-policy` result (no pass/fail decision is produced).
- If evaluated findings input is malformed or not normalized (for example missing required finding fields or unsupported severity/type values), evaluators **MUST** stop evaluation with an `invalid-findings` result (no pass/fail decision is produced).
- Evaluators **MUST** exclude ignored findings from fail/pass evaluation.
- A build **MUST** fail if at least one non-ignored finding has severity at or above `fail_on`.
- A build **MUST** pass if no non-ignored finding has severity at or above `fail_on`.
- Evaluators **MUST NOT** use tool-specific extensions to alter the normative pass/fail outcome.
- Evaluators **SHOULD** return a machine-readable evaluation result that includes at least: decision (`pass`/`fail`/`invalid-policy`/`invalid-findings`), `fail_on`, and `matched_finding_ids`.
- Evaluators **MUST** emit `matched_finding_ids` in the original finding order from the evaluated findings list and **MUST** preserve duplicates.
- For `invalid-policy` decisions, machine-readable results **MUST** set `fail_on` to `null` and `matched_finding_ids` to an empty array.
- For `invalid-findings` decisions, machine-readable results **MUST** set `fail_on` to the validated policy threshold and `matched_finding_ids` to an empty array.

## Policy input model (v0.1)

A v0.1 policy input uses three concepts:

- `fail_on` (required): severity threshold for failing the build
- `ignore_severities` (optional): list of severities to exclude
- `ignore_types` (optional): list of finding `type` values to exclude

Policy keys are case-sensitive and **MUST** appear exactly as defined. Unknown top-level keys are allowed and **MUST** be ignored.

If present, `ignore_severities` and `ignore_types` **MUST** be arrays of strings. Entries that do not use canonical identifiers defined by the severity and taxonomy specifications **MUST** be ignored.
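
As an illustration only, the validation rules above might be sketched as follows. The severity set comes from this specification; the taxonomy subset and all function names are hypothetical placeholders, not normative:

```python
# Sketch of v0.1 policy validation. SEVERITIES follows the severity model;
# KNOWN_TYPES is a placeholder subset standing in for the taxonomy spec.
SEVERITIES = {"critical", "high", "medium", "low", "informational"}
KNOWN_TYPES = {"prompt_injection", "tool_abuse_privilege_escalation",
               "sensitive_data_exposure"}  # illustrative, not exhaustive

def validate_policy(policy):
    """Return a cleaned policy dict, or None for an invalid-policy result."""
    if policy.get("fail_on") not in SEVERITIES:
        return None  # missing or unsupported fail_on -> invalid-policy
    cleaned = {"fail_on": policy["fail_on"]}
    for key in ("ignore_severities", "ignore_types"):
        value = policy.get(key, [])
        if not isinstance(value, list):
            return None  # non-array value -> invalid-policy
        allowed = SEVERITIES if key == "ignore_severities" else KNOWN_TYPES
        # Non-canonical entries are silently ignored, not errors.
        cleaned[key] = [v for v in value if isinstance(v, str) and v in allowed]
    return cleaned  # unknown top-level keys are dropped (ignored)
```

Note that a non-canonical list entry and a non-array list value are handled differently on purpose: the former is ignored, the latter invalidates the whole policy.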

### Example policy input (YAML)

```yaml
fail_on: high
ignore_severities:
- informational
ignore_types:
- prompt_injection
```

## Evaluation model

Given:

- a policy `P`
- a findings list `F` sourced from `report.findings` or an equivalent extracted normalized findings list

evaluation proceeds as follows:

1. Validate `P` according to this document. If invalid, decision is `invalid-policy` and evaluation stops.
2. Validate `F` as normalized OpenPAKT findings. If invalid, decision is `invalid-findings` and evaluation stops.
3. Start with all findings in `F`.
4. Remove findings where `severity` is listed in `P.ignore_severities`.
5. Remove findings where `type` is listed in `P.ignore_types`.
6. From the remaining findings, select findings with `severity >= P.fail_on` according to the severity ordering defined in this document.
7. If one or more findings match step 6, decision is `fail`; otherwise decision is `pass`.

If `ignore_severities` or `ignore_types` are omitted, evaluators **MUST** treat them as empty sets.
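
Steps 3-7 can be sketched as below, assuming steps 1-2 (validation) have already succeeded; the rank table and names are illustrative, not a normative API:

```python
# Sketch of evaluation steps 3-7. Lower rank value = more severe.
RANK = {"critical": 0, "high": 1, "medium": 2, "low": 3, "informational": 4}

def evaluate(policy, findings):
    ignore_sev = set(policy.get("ignore_severities", []))  # omitted -> empty set
    ignore_typ = set(policy.get("ignore_types", []))
    # Steps 3-5: drop ignored findings (logical OR of the two ignore lists).
    kept = [f for f in findings
            if f["severity"] not in ignore_sev and f["type"] not in ignore_typ]
    # Step 6: select findings at or above the fail_on threshold.
    threshold = RANK[policy["fail_on"]]
    matched = [f["id"] for f in kept if RANK[f["severity"]] <= threshold]
    # Step 7: any match fails the build; order and duplicates are preserved.
    return {"decision": "fail" if matched else "pass",
            "fail_on": policy["fail_on"],
            "matched_finding_ids": matched}
```

Because `matched` is built by a single pass over the findings list, the original order and any duplicate identifiers are preserved, as the normative guidance requires.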

## Deterministic severity threshold behavior

Severity comparison **MUST** use this strict ranking:

1. `critical`
2. `high`
3. `medium`
4. `low`
5. `informational`

For threshold checks, a finding meets `fail_on` when its severity equals the threshold or appears earlier (more severe) in the ordered list above.

Examples:

- with `fail_on: medium`, severities `medium`, `high`, and `critical` meet the threshold
- with `fail_on: high`, only `high` and `critical` meet the threshold
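
The check reduces to an index comparison over the ordered list, where a smaller index means a more severe level; the function name is illustrative:

```python
# Sketch of the threshold check over the strict severity ranking.
ORDER = ["critical", "high", "medium", "low", "informational"]

def meets_threshold(severity, fail_on):
    # Earlier in ORDER (smaller index) = more severe, so <= means
    # "at or above the threshold".
    return ORDER.index(severity) <= ORDER.index(fail_on)
```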

## Deterministic ignore handling

Ignore logic applies before threshold comparison.

A finding is ignored when at least one of the following is true:

- its `severity` is in `ignore_severities`
- its `type` is in `ignore_types`

If both ignore lists are present, evaluators **MUST** treat ignore matching as logical OR.

Ignored findings:

- **MUST NOT** contribute to threshold matching
- **MAY** be reported as excluded in implementation-specific output
- **MAY** include ignored finding identifiers and exclusion reasons in implementation-specific output
- **MUST NOT** change the normative pass/fail rule

## Compatibility guidance

### CI system compatibility

Implementations in CI systems (for example GitHub Actions, GitLab CI, and Azure Pipelines) **SHOULD** preserve the normative evaluation order and pass/fail rules in this document.

The CI platform exit status **MUST** be derived directly from the policy decision:

- `pass` -> successful job/stage
- `fail` -> failed job/stage
- `invalid-policy` -> failed job/stage
- `invalid-findings` -> failed job/stage
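
Under this mapping, deriving a process exit status might be sketched as (function name is illustrative; only `pass` maps to success):

```python
def exit_code(decision):
    # pass -> 0 (successful job/stage); fail and both invalid-* results
    # -> 1 (failed job/stage), so invalid input never silently passes.
    return 0 if decision == "pass" else 1
```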

### External reporting compatibility

When exporting results to external reporting formats, producers **SHOULD** preserve:

- the original policy inputs used for evaluation
- the final decision (`pass`/`fail`/`invalid-policy`/`invalid-findings`)
- `matched_finding_ids` as the ordered list of matching non-ignored finding identifiers (preserving duplicates in original finding order)

Export behavior **MUST NOT** redefine OpenPAKT evaluation semantics.

## Deterministic examples

### Example findings (normalized)

```yaml
findings:
  - id: f-001
    type: tool_abuse_privilege_escalation
    severity: high
  - id: f-002
    type: prompt_injection
    severity: medium
  - id: f-003
    type: sensitive_data_exposure
    severity: informational
```

### Evaluation examples

| Policy input | Non-ignored findings | Threshold matches | Decision |
|---|---|---|---|
| `fail_on: high` | `f-001`, `f-002`, `f-003` | `f-001` | `fail` |
| `fail_on: high`, `ignore_types: [prompt_injection]` | `f-001`, `f-003` | `f-001` | `fail` |
| `fail_on: critical`, `ignore_severities: [informational]` | `f-001`, `f-002` | none | `pass` |
| `fail_on: medium`, `ignore_severities: [high, medium]` | `f-003` | none | `pass` |

### Invalid input example

```yaml
findings:
  - id: f-001
    type: tool_abuse_privilege_escalation
    severity: severe
```

Expected machine-readable result:

```yaml
decision: invalid-findings
fail_on: high
matched_finding_ids: []
```

## Versioning and compatibility notes

This document defines the minimal CI policy evaluation semantics for OpenPAKT v0.1.

Future versions may extend policy expressiveness, but v0.1 implementations should treat this evaluation model as the normative baseline for deterministic pass/fail behavior.
4 changes: 2 additions & 2 deletions spec/severity.md
@@ -88,10 +88,10 @@ evidence:
### CI threshold style example

```txt
fail-on: high
fail_on: high
```

Expected deterministic behaviour for `fail-on: high`:
Expected deterministic behaviour for `fail_on: high`:

- `critical` -> fail build
- `high` -> fail build