feat: add tableau-extract-calcs script and tableau-upstream slash command#3473
feat: add tableau-extract-calcs script and tableau-upstream slash command#3473GabyRangelB wants to merge 14 commits intomainfrom
Conversation
…mand Adds two artifacts to support moving Tableau calculated fields upstream into dbt models: - scripts/tableau-extract-calcs.py: extracts calculated fields from a .twbx/.twb file (local or downloaded from Tableau Server). Supports exposure-aware mode that looks up the workbook LSID and depends_on models directly from models/exposures/tableau.yml. - .claude/commands/tableau-upstream.md: /tableau-upstream slash command that guides Claude through the full workflow — run the extraction script, identify the target rpt_tableau__ model, translate Tableau formula syntax to BigQuery SQL, update the model SQL and YAML, and validate with dbt show. Closes #3472 Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
|
Claude finished @GabyRangelB's task in 1m 10s —— View job Code Review: PR #3473 — tableau-extract-calcs + tableau-upstream commandTasks
|
- Replace xml.etree.ElementTree with defusedxml to resolve Bandit B314 - Default --server and --site from TABLEAU_SERVER_ADDRESS / TABLEAU_SITE_ID env vars (already injected via 1Password for all team members) - Authenticate via TABLEAU_TOKEN_NAME + TABLEAU_PERSONAL_ACCESS_TOKEN env vars by default (no password prompt); fall back to --username/password only when PAT env vars are absent Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Tableau Server returns XML by default. Without the Accept header the signin and workbook list endpoints return XML, causing a JSONDecodeError when the script tries to parse the response. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Without --datasource, the script now prints a table of unique data sources and their calc counts, then exits with instructions to re-run with -d. This avoids flooding output with 983 fields across all sources when the user only wants one. Also fixes Bandit B314 cleanly by dropping the stdlib xml import entirely (defusedxml.ElementTree handles all parsing) and removing the type annotation from private helpers where the Element type was only needed as a hint. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
… command Script: when run with no arguments, launches a guided prompt sequence — numbered exposure list, Tableau Server download, numbered datasource picker — so users don't need to know any CLI flags. Slash command: rewritten to drive the same workflow through Claude chat. Removed all workbook-specific examples (replaced with generic placeholders) so the command works for any exposure. Also fixed markdown lint warnings (table alignment, fenced code block languages). Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Design for smart classification of Tableau calculated fields into READY / NEEDS WORK / LOD / SKIP categories, with semantic layer recommendations for LOD expressions and slash command UX updates. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Design for Step 4b in /tableau-upstream: trace calculated field refs through the dbt DAG, recommend closest-to-extract insertion point, draft BigQuery SQL at that layer, and let the analyst choose how far upstream to push each field. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Covers field categorization (READY/NEEDS WORK/LOD/SKIP), --model flag, LOD mart-matching, grouped script output, and /tableau-upstream slash command updates (Step 4b lineage tracing, output format preference, LOD/NEEDS WORK handling). Reviewed and approved via automated spec review loop. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…iterals upstream - Re-prompt with valid range when user enters an out-of-bounds number for workbook or datasource selection (Steps 1 and 2) - Before marking bare org-name string literals as SKIP, check int_extracts__student_enrollments for a district/equivalent column and propose adding it to the extract instead Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
|
@claude please review this PR for vulnerabilities and/or exposed sensitive information |
|
Claude finished @GabyRangelB's task in 1m 12s —— View job Security Review: PR #3473Tasks
Security Review ResultsNo hardcoded credentials or sensitive information found. The PR is clean in that regard. Below is a full assessment of security-relevant patterns.
|
| Issue | Severity | Location |
|---|---|---|
| No HTTPS enforcement before sending credentials | Medium | get_auth_token, lines 164–169 |
| Session token not signed out after use | Low | signin_pat/signin_password, lines 119–150 |
| Tableau API filter value not sanitized | Very low / informational | find_workbook_id_by_name, line 198 |
| Hardcoded credentials | None | — |
| Exposed secrets in docs or command file | None | — |
| XML injection risk | None (mitigated by defusedxml) |
— |
The only item worth addressing before merge is the HTTPS enforcement check — it's a one-liner guard that prevents accidental credential exposure if the env var or flag is misconfigured. The rest are informational.
…adable names - Add _build_calc_name_map() to map Calculation_* and *(copy)_* XML names to their human captions within a datasource element - Add _resolve_calc_refs() to substitute those IDs in formulas before display - Wire both into _extract_fields() so all output (and future classification) sees human-readable field names instead of opaque Tableau internal IDs - Add Task 0 to the implementation plan covering this with TDD steps Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- Enforce HTTPS before sending credentials: get_auth_token() now raises SystemExit if the server URL does not start with https:// - Sign out session token after use: add signout() and wrap all three download paths (interactive, --exposure, --workbook) in try/finally so the token is always invalidated, even on error - Sanitize Tableau API filter value: strip ':' and ',' from workbook names in find_workbook_id_by_name() to prevent filter injection Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…anatory comment CodeQL flagged the bare `except: pass` as an empty handler with no explanation. Keep best-effort semantics but name the exception, add a comment clarifying the intent, and print to stderr when TABLEAU_DEBUG is set. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Summary
scripts/tableau-extract-calcs.pyto extract calculated fields from Tableau workbooks (local.twbx/.twbor downloaded from Tableau Server). Supports exposure-aware mode that resolves the workbook LSID anddepends_onmodels directly frommodels/exposures/tableau.yml..claude/commands/tableau-upstream.md— a/tableau-upstreamslash command that guides Claude through the full upstream workflow: extract calcs → identify targetrpt_tableau__model → translate Tableau formula syntax to BigQuery SQL → update model SQL and YAML → validate withdbt show.Closes #3472
Test plan
uv run scripts/tableau-extract-calcs.py --exposure gradebook_and_gpa_dashboard --list-onlyand confirm it prints the workbook LSID anddepends_onmodels fromtableau.ymlwithout making network calls.twbxfile and confirm the markdown table of calculated fields is printed correctly/tableau-upstreamin a new Claude session and confirm the guided workflow prompts correctly at each step🤖 Generated with Claude Code