
feat: integrate log template mining into audit report and logs#24328

Merged
pelikhan merged 10 commits into main from copilot/integrate-drain3-style-analysis on Apr 3, 2026

Conversation

Contributor

Copilot AI commented Apr 3, 2026

Summary

Implements a production-oriented log template mining module (pkg/agentdrain) and integrates it into the gh aw audit, gh aw audit report, and gh aw logs observability pipeline. Includes a --train flag on the logs command to build and embed default weights.

What this PR does

New package: pkg/agentdrain

A standalone, zero-dependency Go package implementing online log template mining (Drain3-style algorithm), specifically designed for agent session observability:

| File | Purpose |
| --- | --- |
| `types.go` | Exported types: `Config`, `MaskRule`, `Cluster`, `MatchResult`, `AnomalyReport`, `AgentEvent` |
| `config.go` | `DefaultConfig()` with sensible defaults and built-in masking rules |
| `mask.go` | `Masker` (UUID, session ID, number, URL, timestamp, quoted-string), `FlattenEvent`, `Tokenize` |
| `tree.go` | Fixed-depth parse tree (token-count bucket → first-token → cluster IDs) |
| `cluster.go` | `clusterStore`, `computeSimilarity`, `mergeTemplate` |
| `miner.go` | Thread-safe `Miner`: `Train` / `Match` / `TrainEvent` / `AnalyzeEvent` |
| `anomaly.go` | `AnomalyDetector` — weighted anomaly score (new=1.0, low-sim=0.7, rare=0.3) |
| `pretrain.go` | `PreTrainTemplate` / `PreTrainTemplates` / `PreTrainTemplateCounts` |
| `persist.go` | JSON round-trip: `SaveJSON` / `LoadJSON` / `LoadMinerJSON` |
| `coordinator.go` | Stage-based `Coordinator` (plan, tool_call, tool_result, retry, error, finish) + `StageSequence` + `SaveWeightsJSON` / `LoadWeightsJSON` |
| `defaults.go` | Embeds `data/default_weights.json`; `LoadDefaultWeights()` preloads embedded templates |
| `miner_test.go` | 12 tests: cluster creation/merge, inference-only, masking, flatten, pretrain, save/load, concurrency, stage routing, similarity, merge |
| `anomaly_test.go` | 5 tests: new template, low similarity, rare cluster, normal match, full pipeline |

New: pkg/cli/drain3_integration.go

Integration layer connecting pkg/agentdrain to the existing audit pipeline. Coordinators call LoadDefaultWeights() on startup so trained templates are available immediately:

  • buildDrain3Insights(processedRun, metrics, toolUsage) — converts a single run's structured data into AgentEvents, mines templates, and returns ObservabilityInsights covering cluster summary, anomaly report, and stage sequence evidence.
  • buildDrain3InsightsMultiRun(processedRuns) — same for cross-run analysis with a shared coordinator, detecting patterns across ≤50 runs.
  • buildDrain3InsightsFromCrossRunInputs(inputs) — converts crossRunInput entries to ProcessedRuns and delegates to buildDrain3InsightsMultiRun, used by the audit report subcommand.

New: pkg/cli/drain3_train.go

TrainDrain3Weights(processedRuns, outputDir, verbose) trains a coordinator across all processed runs and writes drain3_weights.json to the output directory. The file can be committed to pkg/agentdrain/data/default_weights.json and rebuilt to embed it as the binary default.
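The exact schema of `drain3_weights.json` is not shown in the PR; as a hedged sketch, a per-stage map of cluster snapshots round-trips cleanly through `encoding/json`. All type and field names below are assumptions, not the actual `pkg/agentdrain` schema:

```go
package main

import (
	"encoding/json"
	"fmt"
)

// clusterSnapshot is an illustrative guess at what gets persisted per cluster.
type clusterSnapshot struct {
	Template []string `json:"template"`
	Count    int      `json:"count"`
}

// weights maps a coordinator stage (plan, tool_call, ...) to its clusters.
type weights map[string][]clusterSnapshot

// roundTrip marshals then unmarshals the weights, standing in for the
// SaveWeightsJSON / LoadWeightsJSON pair described above.
func roundTrip(w weights) (weights, error) {
	data, err := json.MarshalIndent(w, "", "  ")
	if err != nil {
		return nil, err
	}
	var back weights
	if err := json.Unmarshal(data, &back); err != nil {
		return nil, err
	}
	return back, nil
}

func main() {
	w := weights{
		"plan":      {{Template: []string{"plan", "step", "<*>"}, Count: 3}},
		"tool_call": {{Template: []string{"call", "<*>"}, Count: 7}},
	}
	back, err := roundTrip(w)
	if err != nil {
		panic(err)
	}
	fmt.Println(back["tool_call"][0].Count) // 7
}
```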

Modified: pkg/cli/logs_command.go (new --train flag)

```
gh aw logs --train                    # train on last 10 runs
gh aw logs my-workflow --train -c 50  # train on up to 50 runs
```

When --train is set, DownloadWorkflowLogs runs TrainDrain3Weights after processing all runs and returns an error if training fails.

Modified: pkg/cli/audit_report.go

Appends log pattern insights to ObservabilityInsights in buildAuditData():

```go
observabilityInsights := buildAuditObservabilityInsights(...)
observabilityInsights = append(observabilityInsights, buildDrain3Insights(...)...)
```

Modified: pkg/cli/logs_report.go

Appends cross-run log pattern insights to Observability in BuildLogsData():

```go
observability := buildLogsObservabilityInsights(processedRuns, toolUsage)
observability = append(observability, buildDrain3InsightsMultiRun(processedRuns)...)
```

Modified: pkg/cli/audit_cross_run.go (gh aw audit report integration)

CrossRunAuditReport now includes a drain3_insights field ([]ObservabilityInsight). Phase 7 of buildCrossRunAuditReport mines event templates across all runs in the report. Both markdown and pretty renderers include an "Agent Event Pattern Analysis" section with severity icons (🔴/🟠/🟡/ℹ), category labels, summaries, and evidence strings.

```markdown
## Agent Event Pattern Analysis

### ℹ Log template patterns mined

**Category:** execution | **Severity:** info

Analysis identified 4 distinct event templates across 3 pipeline stages from 12 events.

_Evidence:_ `error=1 plan=1 tool_call=2`
```
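The evidence string counts mined clusters per stage. A simplified stand-in for how such a breakdown can be rendered (not the PR's actual code; keys are sorted so the output is deterministic):

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// stageBreakdown renders per-stage cluster counts as "stage=N" pairs,
// skipping empty stages. Sorting the keys keeps the evidence string
// stable across runs, since Go map iteration order is randomized.
func stageBreakdown(counts map[string]int) string {
	stages := make([]string, 0, len(counts))
	for s, n := range counts {
		if n > 0 {
			stages = append(stages, s)
		}
	}
	sort.Strings(stages)
	parts := make([]string, 0, len(stages))
	for _, s := range stages {
		parts = append(parts, fmt.Sprintf("%s=%d", s, counts[s]))
	}
	return strings.Join(parts, " ")
}

func main() {
	fmt.Println(stageBreakdown(map[string]int{"plan": 1, "tool_call": 2, "error": 1}))
	// error=1 plan=1 tool_call=2
}
```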

New: cmd/agentdrain-demo/main.go

Runnable demo that shows the full pipeline: coordinator setup → pretraining → event ingestion → cluster/template/param/anomaly output → snapshot save/reload → post-restore inference.

Embedding default weights workflow

```
gh aw logs --train -c 100
cp .github/aw/logs/drain3_weights.json pkg/agentdrain/data/default_weights.json
make build   # re-embeds the trained weights into the binary
```

After this, all future gh aw audit and gh aw logs runs preload those templates before processing new events, giving better anomaly detection from the first observation.

Testing

  • 17 new unit tests in pkg/agentdrain/ — all pass
  • 8 new tests in pkg/cli/drain3_integration_test.go — all pass
  • 4 new tests in pkg/cli/drain3_train_test.go — all pass (including --train flag presence, JSON structure, stage key assertions)
  • 5 new tests in pkg/cli/audit_cross_run_test.go — all pass (field-mapping assertions, insight field validation, markdown rendering)
  • make lint — 0 issues
  • make test-unit — all pre-existing failures are unrelated (require live GitHub API access)

Design decisions

  • No external dependencies — pure standard library
  • Structured event inputs — uses already-extracted audit fields (tool usage, MCP failures, etc.) rather than re-parsing raw logs, giving clean integration without I/O overhead
  • Template similarity — based on fraction of exact token matches at non-wildcard positions; length mismatch → similarity 0
  • Thread-safe — sync.RWMutex throughout; Match uses RLock, Train uses full Lock
  • Graceful degradation — unknown stages are skipped without error in the audit pipeline
  • --train returns error — since the user explicitly requested training, failures are surfaced rather than silently warned
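The weighted anomaly score cited for anomaly.go (new=1.0, low-sim=0.7, rare=0.3) can be sketched as a simple precedence check; how the real `AnomalyDetector` combines or caps the three signals is an assumption here:

```go
package main

import "fmt"

// anomalyScore applies the weighting described for anomaly.go: a brand-new
// template scores 1.0, a low-similarity match 0.7, and a rare cluster 0.3.
// The first-match precedence among the signals is an assumption, not taken
// from the package's actual implementation.
func anomalyScore(isNew, lowSim, rare bool) float64 {
	switch {
	case isNew:
		return 1.0
	case lowSim:
		return 0.7
	case rare:
		return 0.3
	default:
		return 0
	}
}

func main() {
	fmt.Println(anomalyScore(true, false, false)) // 1
	fmt.Println(anomalyScore(false, false, true)) // 0.3
}
```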

Collaborator

pelikhan commented Apr 3, 2026

@copilot

add a flag "--train" to the "logs" command that runs the drain3 pre training to build the weights file which should be embedded to the compiler as default

Collaborator

pelikhan commented Apr 3, 2026

@copilot

  • integrate drain3 analysis into the "audit report" sub command report

Copilot AI and others added 2 commits April 3, 2026 15:21
Contributor Author

Copilot AI commented Apr 3, 2026

@copilot

add a flag "--train" to the "logs" command that runs the drain3 pre training to build the weights file which should be embedded to the compiler as default...

Done in commit d90b5f3.

--train flag added to gh aw logs:

```
gh aw logs --train                    # train on last 10 runs
gh aw logs --train -c 50              # train on up to 50 runs
gh aw logs my-workflow --train -o ./out  # custom output dir
```

How it works:

  1. Downloads and processes runs as normal
  2. Trains a Drain3 coordinator across all processed events (plan, tool_call, tool_result, retry, error, finish stages)
  3. Writes drain3_weights.json to the output directory (pretty-printed JSON)
  4. Prints instructions to embed as default:
    cp .github/aw/logs/drain3_weights.json pkg/agentdrain/data/default_weights.json
    make build
    

Embedding as default: The pkg/agentdrain/data/default_weights.json file is embedded via //go:embed in defaults.go. Once a trained weights file is committed there, all future gh aw audit and gh aw logs runs will preload those templates into the Drain3 coordinator before processing new events — giving better anomaly detection from the first run.

Copilot AI and others added 2 commits April 3, 2026 15:36
Contributor Author

Copilot AI commented Apr 3, 2026

@copilot

  • integrate drain3 analysis into the "audit report" sub command report...

Done in commit 9809766.

Drain3 multi-run pattern analysis is now part of gh aw audit report (gh aw audit report --workflow my-workflow).

What was added:

  • CrossRunAuditReport gets a new drain3_insights JSON field ([]ObservabilityInsight)
  • Phase 7 in buildCrossRunAuditReport converts the existing crossRunInput slice to ProcessedRuns and runs buildDrain3InsightsMultiRun
  • Both markdown (--format markdown) and pretty (--format pretty) renderers include an "Agent Event Pattern Analysis (Drain3)" section with severity icons (🔴/🟠/🟡/ℹ), category, summary, and evidence

Markdown output example:

```markdown
## Agent Event Pattern Analysis (Drain3)

### ℹ Log template patterns mined

**Category:** execution | **Severity:** info

Drain3 analysis identified 4 distinct event templates across 3 pipeline stages from 12 events.

_Evidence:_ `error=1 plan=1 tool_call=2`
```

@pelikhan
Collaborator

pelikhan commented Apr 3, 2026

@copilot don't mention drain3 in report (internal detail)

Contributor Author

Copilot AI commented Apr 3, 2026

@copilot don't mention drain3 in report (internal detail)

Done in commit 2a9876a. All user-facing output now refers to "Agent Event Pattern Analysis" / "log pattern weights" without mentioning Drain3. The name remains only in internal identifiers (package, variable names, function names, debug logs).

Copilot AI changed the title from "feat: integrate Drain3-style log template mining into audit report and logs" to "feat: integrate log template mining into audit report and logs" on Apr 3, 2026
Collaborator

@pelikhan pelikhan left a comment


@copilot Use copilot to add a GitHub workflow (non-agentic) that runs daily, downloads run logs using the train flag, and commits the training file, then creates a PR with the training file back to the…

```go
// - printing matched cluster ID, template, extracted params, and anomaly report
// - saving and reloading a snapshot
// - running inference on a new event after restore
package main
```
Collaborator


@copilot delete file

Contributor Author

Copilot AI commented Apr 3, 2026

@pelikhan I've opened a new pull request, #24344, to work on those changes. Once the pull request is ready, I'll request review from you.

Copilot AI and others added 2 commits April 3, 2026 10:20
…emo binary (#24344)

* Initial plan

* feat: add daily drain3 weight training workflow and delete demo binary

Agent-Logs-Url: https://github.com/github/gh-aw/sessions/c5c57258-f0a1-4ade-8afc-9c4464b162cc

Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com>

@pelikhan pelikhan marked this pull request as ready for review April 3, 2026 17:25
Copilot AI review requested due to automatic review settings April 3, 2026 17:25
@pelikhan pelikhan merged commit 1355331 into main Apr 3, 2026
42 of 56 checks passed
@pelikhan pelikhan deleted the copilot/integrate-drain3-style-analysis branch April 3, 2026 17:25
Contributor

Copilot AI left a comment


Pull request overview

Adds an embedded, production-oriented Drain3-style log template mining module and wires it into existing gh aw audit, gh aw audit report, and gh aw logs observability outputs, including an opt-in --train workflow to generate and persist default weights.

Changes:

  • Introduces pkg/agentdrain (masking, mining, persistence, coordinator, anomaly scoring) with unit tests and embedded default weights.
  • Integrates mined template/anomaly/stage-sequence insights into logs/audit single-run and cross-run reports.
  • Adds gh aw logs --train to write drain3_weights.json, plus a scheduled workflow to retrain and open a PR updating embedded defaults.

Reviewed changes

Copilot reviewed 30 out of 30 changed files in this pull request and generated 5 comments.

| File | Description |
| --- | --- |
| `pkg/cli/logs_report.go` | Appends multi-run Drain3 insights to logs report observability output. |
| `pkg/cli/logs_orchestrator.go` | Adds train parameter and triggers weight training after processing runs. |
| `pkg/cli/logs_json_stderr_order_test.go` | Updates orchestrator calls to include the new train arg. |
| `pkg/cli/logs_download_test.go` | Updates orchestrator calls to include the new train arg. |
| `pkg/cli/logs_command.go` | Adds `--train` flag and passes it through to the orchestrator. |
| `pkg/cli/logs_ci_scenario_test.go` | Updates orchestrator calls to include the new train arg. |
| `pkg/cli/drain3_train.go` | Implements coordinator training plus `drain3_weights.json` writing and instructions. |
| `pkg/cli/drain3_train_test.go` | Adds tests for training behavior and `--train` flag presence. |
| `pkg/cli/drain3_integration.go` | Adds integration layer that converts run data into events and emits insights. |
| `pkg/cli/drain3_integration_test.go` | Adds tests validating insight generation and event conversion behavior. |
| `pkg/cli/context_cancellation_test.go` | Updates orchestrator calls to include the new train arg. |
| `pkg/cli/audit_report.go` | Appends single-run Drain3 insights to audit report observability insights. |
| `pkg/cli/audit_cross_run.go` | Adds `drain3_insights` field and builds cross-run Drain3 insights. |
| `pkg/cli/audit_cross_run_test.go` | Adds tests for cross-run Drain3 mapping, inclusion, and rendering. |
| `pkg/cli/audit_cross_run_render.go` | Renders Drain3 insights in markdown and pretty cross-run reports. |
| `pkg/agentdrain/types.go` | Defines exported config/event/cluster/match/anomaly types. |
| `pkg/agentdrain/tree.go` | Implements the parse tree used to bucket candidate clusters. |
| `pkg/agentdrain/pretrain.go` | Adds pretraining helpers for seeding templates/counts. |
| `pkg/agentdrain/persist.go` | Adds JSON snapshot save/load and miner restore constructor. |
| `pkg/agentdrain/miner.go` | Implements the concurrent miner (train/match/analyze/event ingestion). |
| `pkg/agentdrain/miner_test.go` | Adds miner/coordinator/masking/persistence/concurrency tests. |
| `pkg/agentdrain/mask.go` | Implements regex-based masking plus deterministic event flattening/tokenization. |
| `pkg/agentdrain/defaults.go` | Embeds `data/default_weights.json` and loads it into a coordinator. |
| `pkg/agentdrain/data/default_weights.json` | Adds an initial empty embedded weights file (`{}`). |
| `pkg/agentdrain/coordinator.go` | Adds stage-based coordinator and combined weights save/load helpers. |
| `pkg/agentdrain/config.go` | Provides default configuration and masking rules. |
| `pkg/agentdrain/cluster.go` | Implements cluster store, similarity, and template merging utilities. |
| `pkg/agentdrain/anomaly.go` | Implements weighted anomaly scoring and reason strings. |
| `pkg/agentdrain/anomaly_test.go` | Adds anomaly detector and end-to-end `AnalyzeEvent` tests. |
| `.github/workflows/train-drain3-weights.yml` | Adds scheduled/manual workflow to retrain weights and open a PR with updates. |

Comment on lines +225 to +229
```go
for g := range goroutines {
	wg.Add(1)
	go func(id int) {
		defer wg.Done()
		for i := range linesEach {
```

Copilot AI Apr 3, 2026


TestConcurrency uses for g := range goroutines / for i := range linesEach, but goroutines and linesEach are integers, so ranging over them does not compile on Go versions before 1.22 (range-over-int was added in Go 1.22). Use a standard index loop (e.g., for g := 0; g < goroutines; g++ { ... }) for both loops.

Suggested change:

```diff
-for g := range goroutines {
+for g := 0; g < goroutines; g++ {
 	wg.Add(1)
 	go func(id int) {
 		defer wg.Done()
-		for i := range linesEach {
+		for i := 0; i < linesEach; i++ {
```

Comment on lines +29 to +31
```go
if !isNew {
	report.LowSimilarity = result.Similarity < d.threshold
}
```

Copilot AI Apr 3, 2026


LowSimilarity is effectively unreachable in the current pipeline: Miner.match only returns results when Similarity >= SimThreshold, and AnalyzeEvent constructs the detector with the same threshold (m.cfg.SimThreshold). That means result.Similarity < d.threshold can never be true when isNew == false, so the low-similarity anomaly path won't trigger. Consider returning the best match even when it is below the threshold (so you can report low similarity), or using a separate (higher) detector threshold than the match threshold.

Comment on lines +343 to +349
```go
parts := make([]string, 0, len(allClusters))
for stage, clusters := range allClusters {
	if len(clusters) > 0 {
		parts = append(parts, fmt.Sprintf("%s=%d", stage, len(clusters)))
	}
}
return strings.Join(parts, " ")
```

Copilot AI Apr 3, 2026


buildStageBreakdown iterates over a Go map (allClusters), so the resulting evidence string order is nondeterministic across runs. This can lead to unstable CLI/report output (and flaky golden tests if added later). Consider sorting the stage keys before building parts so the output is deterministic.

```go
MaskRules: []MaskRule{
	{
		Name:    "uuid",
		Pattern: `[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}`,
```

Copilot AI Apr 3, 2026


The UUID masking regex only matches lowercase hex ([0-9a-f]). If log lines contain uppercase UUIDs, they won't be masked and could leak identifiers into templates/params. Consider making the pattern case-insensitive or expanding the character classes to include A-F.

Suggested change:

```diff
-Pattern: `[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}`,
+Pattern: `(?i)[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}`,
```

Comment on lines +87 to +92
```go
fmt.Fprintln(os.Stderr, console.FormatSuccessMessage("Log pattern weights written to: "+outputPath))
fmt.Fprintln(os.Stderr, console.FormatInfoMessage(
	"To embed these weights as default, copy the file and rebuild:\n"+
		"  cp "+outputPath+" pkg/agentdrain/data/default_weights.json\n"+
		"  make build",
))
```

Copilot AI Apr 3, 2026


The copy/paste instructions build a cp command by concatenating paths without quoting. If outputDir contains spaces (or shell-special characters), the suggested command will fail. Consider quoting/escaping outputPath in the printed instructions.
