
feat: integrate Drain3-style log template mining into audit report and logs #24328

Draft

Copilot wants to merge 8 commits into main from copilot/integrate-drain3-style-analysis

Conversation


Copilot AI commented Apr 3, 2026

Summary

Implements a production-oriented Drain3-style log template mining module and integrates it into the gh aw audit, gh aw audit report, and gh aw logs observability pipeline. Includes a --train flag on the logs command to build and embed default weights.

What this PR does

New package: pkg/agentdrain

A standalone, zero-dependency Go package implementing online Drain3-style log template mining, specifically designed for agent session observability:

| File | Purpose |
| --- | --- |
| types.go | Exported types: Config, MaskRule, Cluster, MatchResult, AnomalyReport, AgentEvent |
| config.go | DefaultConfig() with sensible defaults and built-in masking rules |
| mask.go | Masker (UUID, session ID, number, URL, timestamp, quoted-string), FlattenEvent, Tokenize |
| tree.go | Fixed-depth Drain parse tree (token-count bucket → first-token → cluster IDs) |
| cluster.go | clusterStore, computeSimilarity, mergeTemplate |
| miner.go | Thread-safe Miner: Train / Match / TrainEvent / AnalyzeEvent |
| anomaly.go | AnomalyDetector — weighted anomaly score (new=1.0, low-sim=0.7, rare=0.3) |
| pretrain.go | PreTrainTemplate / PreTrainTemplates / PreTrainTemplateCounts |
| persist.go | JSON round-trip: SaveJSON / LoadJSON / LoadMinerJSON |
| coordinator.go | Stage-based Coordinator (plan, tool_call, tool_result, retry, error, finish) + StageSequence + SaveWeightsJSON / LoadWeightsJSON |
| defaults.go | Embeds data/default_weights.json; LoadDefaultWeights() preloads embedded templates |
| miner_test.go | 12 tests: cluster creation/merge, inference-only, masking, flatten, pretrain, save/load, concurrency, stage routing, similarity, merge |
| anomaly_test.go | 5 tests: new template, low similarity, rare cluster, normal match, full pipeline |

New: pkg/cli/drain3_integration.go

Integration layer connecting pkg/agentdrain to the existing audit pipeline. Coordinators call LoadDefaultWeights() on startup so trained templates are available immediately:

  • buildDrain3Insights(processedRun, metrics, toolUsage) — converts a single run's structured data into AgentEvents, mines templates, and returns ObservabilityInsights covering cluster summary, anomaly report, and stage sequence evidence.
  • buildDrain3InsightsMultiRun(processedRuns) — same for cross-run analysis with a shared coordinator, detecting patterns across ≤50 runs.
  • buildDrain3InsightsFromCrossRunInputs(inputs) — converts crossRunInput entries to ProcessedRuns and delegates to buildDrain3InsightsMultiRun, used by the audit report subcommand.

New: pkg/cli/drain3_train.go

TrainDrain3Weights(processedRuns, outputDir, verbose) trains a Drain3 coordinator across all processed runs and writes drain3_weights.json to the output directory. The file can be committed to pkg/agentdrain/data/default_weights.json and rebuilt to embed it as the binary default.

Modified: pkg/cli/logs_command.go (--train flag)

```
gh aw logs --train                    # train on last 10 runs
gh aw logs my-workflow --train -c 50  # train on up to 50 runs
```

When --train is set, DownloadWorkflowLogs runs TrainDrain3Weights after processing all runs and returns an error if training fails.

Modified: pkg/cli/audit_report.go

Appends Drain3 insights to ObservabilityInsights in buildAuditData():

```go
observabilityInsights := buildAuditObservabilityInsights(...)
observabilityInsights = append(observabilityInsights, buildDrain3Insights(...)...)
```

Modified: pkg/cli/logs_report.go

Appends Drain3 multi-run insights to Observability in BuildLogsData():

```go
observability := buildLogsObservabilityInsights(processedRuns, toolUsage)
observability = append(observability, buildDrain3InsightsMultiRun(processedRuns)...)
```

Modified: pkg/cli/audit_cross_run.go (gh aw audit report integration)

CrossRunAuditReport now includes a drain3_insights field ([]ObservabilityInsight). Phase 7 of buildCrossRunAuditReport calls buildDrain3InsightsFromCrossRunInputs to mine event templates across all runs in the report. Both markdown and pretty renderers include an "Agent Event Pattern Analysis (Drain3)" section with severity icons (🔴/🟠/🟡/ℹ), category labels, summaries, and evidence strings.

```markdown
## Agent Event Pattern Analysis (Drain3)

### ℹ Log template patterns mined

**Category:** execution | **Severity:** info

Drain3 analysis identified 4 distinct event templates across 3 pipeline stages from 12 events.

_Evidence:_ `error=1 plan=1 tool_call=2`
```

New: cmd/agentdrain-demo/main.go

Runnable demo that shows the full pipeline: coordinator setup → pretraining → event ingestion → cluster/template/param/anomaly output → snapshot save/reload → post-restore inference.

Embedding default weights workflow

```
gh aw logs --train -c 100
cp .github/aw/logs/drain3_weights.json pkg/agentdrain/data/default_weights.json
make build   # re-embeds the trained weights into the binary
```

After this, all future gh aw audit and gh aw logs runs preload those templates into the Drain3 coordinator before processing new events, giving better anomaly detection from the first observation.

Testing

  • 17 new unit tests in pkg/agentdrain/ — all pass
  • 8 new tests in pkg/cli/drain3_integration_test.go — all pass
  • 4 new tests in pkg/cli/drain3_train_test.go — all pass (including --train flag presence, JSON structure, stage key assertions)
  • 5 new tests in pkg/cli/audit_cross_run_test.go — all pass (field-mapping assertions, insight field validation, markdown rendering)
  • make lint — 0 issues
  • make test-unit — all pre-existing failures are unrelated (require live GitHub API access)

Design decisions

  • No external dependencies — pure standard library
  • Structured event inputs — uses already-extracted audit fields (tool usage, MCP failures, etc.) rather than re-parsing raw logs, giving clean integration without I/O overhead
  • Drain3 similarity — based on fraction of exact token matches at non-wildcard positions; length mismatch → similarity 0
  • Thread-safe — sync.RWMutex throughout; Match uses RLock, Train uses full Lock
  • Graceful degradation — unknown stages are skipped without error in the audit pipeline
  • --train returns error — since the user explicitly requested training, failures are surfaced rather than silently warned


pelikhan commented Apr 3, 2026

@copilot

add a flag "--train" to the "logs" command that runs the drain3 pre training to build the weights file which should be embedded to the compiler as default


pelikhan commented Apr 3, 2026

@copilot

  • integrate drain3 analysis into the "audit report" sub command report

Copilot AI and others added 2 commits April 3, 2026 15:21

Copilot AI commented Apr 3, 2026

@copilot

add a flag "--train" to the "logs" command that runs the drain3 pre training to build the weights file which should be embedded to the compiler as default...

Done in commit d90b5f3.

--train flag added to gh aw logs:

```
gh aw logs --train                    # train on last 10 runs
gh aw logs --train -c 50              # train on up to 50 runs
gh aw logs my-workflow --train -o ./out  # custom output dir
```

How it works:

  1. Downloads and processes runs as normal
  2. Trains a Drain3 coordinator across all processed events (plan, tool_call, tool_result, retry, error, finish stages)
  3. Writes drain3_weights.json to the output directory (pretty-printed JSON)
  4. Prints instructions to embed as default:
    cp .github/aw/logs/drain3_weights.json pkg/agentdrain/data/default_weights.json
    make build
    

Embedding as default: The pkg/agentdrain/data/default_weights.json file is embedded via //go:embed in defaults.go. Once a trained weights file is committed there, all future gh aw audit and gh aw logs runs will preload those templates into the Drain3 coordinator before processing new events — giving better anomaly detection from the first run.

Copilot AI and others added 2 commits April 3, 2026 15:36

Copilot AI commented Apr 3, 2026

@copilot

  • integrate drain3 analysis into the "audit report" sub command report...

Done in commit 9809766.

Drain3 multi-run pattern analysis is now part of gh aw audit report (gh aw audit report --workflow my-workflow).

What was added:

  • CrossRunAuditReport gets a new drain3_insights JSON field ([]ObservabilityInsight)
  • Phase 7 in buildCrossRunAuditReport converts the existing crossRunInput slice to ProcessedRuns and runs buildDrain3InsightsMultiRun
  • Both markdown (--format markdown) and pretty (--format pretty) renderers include an "Agent Event Pattern Analysis (Drain3)" section with severity icons (🔴/🟠/🟡/ℹ), category, summary, and evidence

Markdown output example:

```markdown
## Agent Event Pattern Analysis (Drain3)

### ℹ Log template patterns mined

**Category:** execution | **Severity:** info

Drain3 analysis identified 4 distinct event templates across 3 pipeline stages from 12 events.

_Evidence:_ `error=1 plan=1 tool_call=2`
```


pelikhan commented Apr 3, 2026

@copilot don't mention drain3 in report (internal detail)


Copilot AI commented Apr 3, 2026

@copilot don't mention drain3 in report (internal detail)

Done in commit 2a9876a. All user-facing output now refers to "Agent Event Pattern Analysis" / "log pattern weights" without mentioning Drain3. The name remains only in internal identifiers (package, variable names, function names, debug logs).
