Add trace analysis toolkit #4

Open
doneyli wants to merge 1 commit into main from feat/trace-analysis-toolkit

@doneyli doneyli commented Feb 8, 2026

Summary

  • scripts/analyze-traces.sh — ClickHouse-direct shell script for instant full-dataset analytics on self-hosted Langfuse. Zero deps beyond curl and Docker. Supports --json and --tag flags.
  • scripts/analyze-traces-sdk.py — Official Langfuse SDK alternative using REST API pagination. Works with both Cloud and self-hosted deployments.
  • docs/trace-analysis.md — Full methodology guide with ClickHouse schema reference, query cookbook, and interpretation guide for five analyses.
  • README.md — Added "Analyze Your Traces" section with quick-start commands.

Why two scripts?

The Langfuse Metrics API v2 (which supports server-side aggregations) is Cloud-only. On self-hosted, calling it returns "v2 APIs are currently in beta and only available on Langfuse Cloud". The REST API v1 paginates at 100 items/page with no aggregation support.
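
As a rough illustration of what client-side aggregation over a paginated API involves, here is a minimal sketch. The `fetch_page` callable is hypothetical and stands in for whatever SDK or HTTP call returns one page of items; it is not the Langfuse SDK's actual interface, and 1-based page numbering is an assumption. Only the 100-item page size comes from the description above.

```python
from collections import Counter
from typing import Callable, Iterable, Iterator

PAGE_SIZE = 100  # page size cited above for the REST API v1


def iter_items(
    fetch_page: Callable[[int, int], list],
    page_size: int = PAGE_SIZE,
) -> Iterator[dict]:
    """Yield every item by walking pages until a short (or empty) page appears.

    fetch_page(page, limit) is a hypothetical callable that returns the items
    for one page; pages are assumed to be numbered from 1.
    """
    page = 1
    while True:
        items = fetch_page(page, page_size)
        yield from items
        if len(items) < page_size:
            return  # a short page means there is nothing left to fetch
        page += 1


def count_by(items: Iterable[dict], key: str) -> Counter:
    """Aggregate client-side, since the paginated API returns raw items only."""
    return Counter(item.get(key, "unknown") for item in items)
```

Every item has to cross the network before anything can be counted, so the cost grows linearly with dataset size. That is the trade-off the second script accepts in exchange for working on both Cloud and self-hosted deployments.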

For self-hosted deployments with thousands of traces, querying ClickHouse directly (port 8124) is the only practical path for full-dataset analytics.
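
For readers who want to see what a direct query looks like outside the shell script, the sketch below posts SQL to ClickHouse's HTTP interface and parses `JSONEachRow` output. It is not taken from `analyze-traces.sh`. The host, credentials, and the `traces` table name in the usage note are assumptions to check against your own stack; the port follows the 8124 mentioned above.

```python
import json
import urllib.request


def build_request(sql, host="localhost", port=8124, user="default", password=""):
    """Build a POST request for ClickHouse's HTTP interface.

    host, user and password are assumed defaults; adjust them to match
    the ClickHouse container in your own Langfuse deployment.
    """
    # Ask for one JSON object per line so the response is easy to parse.
    body = f"{sql.rstrip().rstrip(';')} FORMAT JSONEachRow".encode("utf-8")
    headers = {"X-ClickHouse-User": user}
    if password:
        headers["X-ClickHouse-Key"] = password
    return urllib.request.Request(
        f"http://{host}:{port}/", data=body, headers=headers, method="POST"
    )


def parse_rows(payload):
    """Parse JSONEachRow output into a list of dicts, skipping blank lines."""
    return [json.loads(line) for line in payload.splitlines() if line.strip()]


def run_query(sql, **conn):
    """Send the query and return parsed rows. Needs a reachable ClickHouse."""
    with urllib.request.urlopen(build_request(sql, **conn), timeout=30) as resp:
        return parse_rows(resp.read().decode("utf-8"))
```

Assuming a `traces` table exists in the target database, `run_query("SELECT count() AS total FROM traces")` would return a single row. Depending on server settings, 64-bit integers may come back as JSON strings, so cast with `int()` before doing arithmetic.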

The five analyses

  1. Overview — total traces, observations, sessions, date range
  2. Tool usage distribution — Read/Search/Write/Execute breakdown with percentages
  3. Session turn distribution — bucketed counts showing session length patterns
  4. Productivity by session length — code changes per turn revealing the efficiency cliff
  5. Read-before-edit pattern — behavioral pattern validation (measure twice, cut once)
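
To make the last four analyses concrete, here is a minimal sketch of how each could be computed once the data is in memory. It is an illustration, not the logic of either script. The bucket edges, the `(turns, code_changes)` input shape, and the definition of "read before edit" (a Write that follows at least one earlier Read in the same session) are all assumptions.

```python
from collections import Counter

# Hypothetical upper edges for session-length buckets: 1-5, 6-20, 21-50, 51+.
BUCKET_UPPER_EDGES = [5, 20, 50]


def usage_percentages(categories):
    """Map each tool category (Read/Search/Write/Execute) to its share in percent."""
    counts = Counter(categories)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {name: round(100 * n / total, 1) for name, n in counts.most_common()}


def bucket_label(turns):
    """Return the bucket label for a session with `turns` turns (assumes turns >= 1)."""
    low = 1
    for high in BUCKET_UPPER_EDGES:
        if turns <= high:
            return f"{low}-{high}"
        low = high + 1
    return f"{low}+"


def changes_per_turn(sessions):
    """sessions: iterable of (turns, code_changes) pairs, one per session.

    Returns bucket label -> total code changes / total turns in that bucket.
    """
    turn_totals, change_totals = Counter(), Counter()
    for turns, changes in sessions:
        label = bucket_label(turns)
        turn_totals[label] += turns
        change_totals[label] += changes
    return {label: change_totals[label] / turn_totals[label] for label in turn_totals}


def read_before_edit_rate(tool_sequence):
    """Fraction of Write calls preceded by at least one Read in the same session."""
    seen_read = False
    edits = preceded = 0
    for tool in tool_sequence:
        if tool == "Read":
            seen_read = True
        elif tool == "Write":
            edits += 1
            if seen_read:
                preceded += 1
    return preceded / edits if edits else 0.0
```

In `changes_per_turn`, a drop in the ratio from shorter buckets to longer ones is the shape the "efficiency cliff" refers to. Where the cliff appears depends on the bucket edges chosen.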

Test plan

  • Run ./scripts/analyze-traces.sh against a running Langfuse stack
  • Verify --json output parses correctly with jq
  • Verify --tag filtering works
  • Run python3 scripts/analyze-traces-sdk.py with valid credentials
  • Verify docs/trace-analysis.md custom query examples execute correctly

🤖 Generated with Claude Code

Add two scripts for analyzing Claude Code session traces:
- analyze-traces.sh: queries Langfuse's ClickHouse directly for instant
  full-dataset analytics (self-hosted only, zero deps beyond curl)
- analyze-traces-sdk.py: uses the official Langfuse Python SDK with
  REST API pagination (works with Cloud and self-hosted)

Both produce five analyses: an overview, tool usage distribution,
session turn distribution, productivity by session length (the
efficiency cliff), and read-before-edit behavioral patterns.

Includes docs/trace-analysis.md with methodology, ClickHouse schema
reference, interpretation guide, and a query cookbook for custom analytics.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>