[FEAT] New control: pipelineMustNotExecuteUnverifiedScripts: detect curl|bash and similar patterns

## **Is your feature request related to a problem? Please describe.**

When a CI job's `script:`, `before_script:`, or `after_script:` block contains patterns like `curl ... | bash` or `wget ... | sh`, the pipeline downloads and executes code from the internet **without any integrity verification**. This is a well-documented supply chain attack vector:

- **Script tampering**: An attacker who compromises the remote URL (via DNS hijacking, CDN compromise, or maintainer account takeover) can serve a modified script that exfiltrates CI/CD secrets (`$CI_JOB_TOKEN`, deploy keys, custom variables)
- **Server-side detection**: The remote server can detect that `curl` is piping to `bash` (vs. downloading to a file) and serve different content rendering manual verification useless
- **Silent exfiltration**: The malicious payload can inspect environment variables for patterns like `AWS_`, `SECRET_`, `TOKEN`, `API_KEY` and exfiltrate them via a POST request, while the legitimate tool still installs correctly

This risk compounds in CI/CD because pipelines run repeatedly, often with elevated privileges, and the same poisoned URL serves malware to every pipeline execution across the organization.


## **Describe the solution you'd like**

Add a new control `pipelineMustNotExecuteUnverifiedScripts` that:

- Scans `script:`, `before_script:`, and `after_script:` blocks across all jobs (including global `before_script`/`after_script`)
- Detects patterns where downloaded content is piped directly to a shell interpreter or executed immediately after download
- Reports the job name, the offending line, and the detected pattern

### Patterns to detect

```
# Direct pipe to shell
curl ... | bash
curl ... | sh
wget ... | bash
wget ... | sh
curl ... | sudo bash
curl ... | sudo sh

# Download-and-execute (single script block)
curl -o script.sh ... && bash script.sh
wget -O script.sh ... && sh script.sh
curl ... > install.sh; sh install.sh

# Python/other interpreters
curl ... | python
curl ... | python3
wget ... | perl
```

### Configuration in `.plumber.yaml`

```yaml
controls:
  pipelineMustNotExecuteUnverifiedScripts:
    enabled: true
    # Severity: "error" or "warning"
    severity: warning
    # URLs that are trusted and should not trigger findings
    # (e.g., internal artifact server with integrity checks)
    trustedUrls: []
      # - https://internal-artifacts.example.com/*
```

## **Implementation Hints**

These are just ideas. Feel free to change the implementation.

1. **Data source**: The `PipelineOriginData` collector already provides the merged CI configuration via `MergedConf` (`GitlabCIConf`). The `GitlabJob` struct exposes `Script`, `BeforeScript`, and `AfterScript` as `interface{}` fields. The global-level `BeforeScript` and `AfterScript` are on `GitlabCIConf` itself. Use `ParseJobVariables()` and iterate over `GitlabJobs` to access each job's script blocks.
2. **New control file**: Create `control/controlGitlabPipelineUnverifiedScripts.go`.
3. **New collector** (optional): A new `collector/dataCollectionGitlabPipelineScript.go` may be useful if script parsing logic is reused across multiple controls (this control + the injection control). Alternatively, the logic can live entirely in the control.
4. **Logic**:
   - For each job, normalize the script blocks into a flat list of command strings
   - Apply regex patterns to detect `curl|bash`, `wget|sh`, and download-then-execute variants
   - For each match, check if the URL matches any `trustedUrls` pattern
   - If not trusted, create an issue with the job name, the matched line, and the pattern type
5. **Regex approach**: Use a set of compiled regexes. Key patterns:
   - `(curl|wget)\s+[^|]*\|\s*(sudo\s+)?(bash|sh|zsh|python[23]?|perl|ruby)`
   - `(curl|wget)\s+.*(-o|-O)\s+\S+.*&&\s*(bash|sh|source)\s+`
   - `(curl|wget)\s+.*>\s*\S+\.sh\s*[;&]\s*(bash|sh|source)\s+`
6. **Compliance**: 0% if any unverified script execution is found, 100% otherwise.

## **Files Touched**

- `control/controlGitlabPipelineUnverifiedScripts.go` (new control)
- `control/types.go` (add `UnverifiedScriptsResult` field to `AnalysisResult`)
- `control/task.go` (wire the new control in `RunAnalysis()`)
- `configuration/plumberconfig.go` (add config struct and getter)
- `.plumber.yaml` (add default config section)
- `cmd/analyze.go` (add output formatting)

## **Why It's Valuable**

The `curl | bash` pattern is one of the most widespread and dangerous anti-patterns in CI/CD pipelines. Real-world incidents like the [Codecov breach (2021)](https://about.codecov.io/security-update/) demonstrated how a compromised installation script can silently exfiltrate secrets from thousands of CI/CD pipelines for months. Unlike container image controls (which Plumber already covers), script execution controls address the second major supply chain vector: **runtime code injection via untrusted downloads**. 


Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[FEAT] New control: pipelineMustNotExecuteUnverifiedScripts: detect curl|bash and similar patterns #88

Is your feature request related to a problem? Please describe.

Describe the solution you'd like

Patterns to detect

Configuration in `.plumber.yaml`

Implementation Hints

Files Touched

Why It's Valuable

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

[FEAT] New control: pipelineMustNotExecuteUnverifiedScripts: detect curl|bash and similar patterns #88

Description

Is your feature request related to a problem? Please describe.

Describe the solution you'd like

Patterns to detect

Configuration in .plumber.yaml

Implementation Hints

Files Touched

Why It's Valuable

Metadata

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Issue actions

Configuration in `.plumber.yaml`