Skip to content

feat: add /star-schema-advisor slash command and tableau workbook parser#3537

Open
GabyRangelB wants to merge 1 commit intomainfrom
cbaldor/feat/claude-star-schema-advisor
Open

feat: add /star-schema-advisor slash command and tableau workbook parser#3537
GabyRangelB wants to merge 1 commit intomainfrom
cbaldor/feat/claude-star-schema-advisor

Conversation

@GabyRangelB
Copy link
Copy Markdown
Contributor

Pull Request

Summary & Motivation

"When merged, this pull request will..."

Add the /star-schema-advisor slash command and tableau-analyze-workbook.py
script, consolidating three parallel efforts (#3513, PR #3508, and an
uncommitted handoff session) into a single solution for triaging Tableau
calculated fields into dbt mart changes and a semantic layer queue.

Components:

  • scripts/tableau-analyze-workbook.py — Parses .twb/.twbx XML, emits
    structured JSON with field metadata, viz usage (dimension vs measure per
    worksheet), LOD detection, and parameter extraction. Supports both local file
    (--file) and Tableau Server API (--exposure) input modes.
  • .claude/commands/star-schema-advisor.md — 8-step interactive workflow with
    4-bucket hybrid verdict system (move upstream / existing mart for Cube / new
    mart for Cube / stay on Tableau) and analyst approval gate before dbt changes.
  • Design spec at docs/superpowers/specs/2026-03-27-star-schema-advisor-design.md
  • 21 passing tests covering all script functionality
  • workbooks/ gitignored drop folder for .twbx files
  • docs/superpowers/star-schema-reports/ directory for audit trail reports

AI Assistance

Co-authored with Claude Code. Human-directed: requirements, approach selection
(Approach B: thin script + rich slash command), 4-bucket verdict taxonomy,
dual extraction mode decision. AI-assisted: script implementation, test fixtures,
slash command workflow prose, design spec drafting.

Self-review

General

  • Update due date and assignee on the
    TEAMster Asana Project
  • Review the Claude Code Review comment posted on this PR. Address valid
    feedback; dismiss false positives with a brief reply explaining why.

CI checks

  • Trunk — passes.
  • dbt Cloud — N/A (no dbt model changes).
  • Dagster Cloud — N/A (no Dagster changes).

Consolidates three parallel efforts into a single solution for triaging
Tableau calculated fields into dbt mart changes and a semantic layer queue.

Components:
- scripts/tableau-analyze-workbook.py: parses twb/twbx XML, emits structured
  JSON with field metadata, viz usage, LOD detection, and parameters
- .claude/commands/star-schema-advisor.md: 8-step interactive workflow with
  4-bucket hybrid verdict system and analyst approval gate
- 21 passing tests covering zip extraction, dedup, field filtering, calc
  name resolution, viz usage tracking, LOD detection, and integration

Closes #3513

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@GabyRangelB GabyRangelB requested a review from a team as a code owner March 27, 2026 16:02
@GabyRangelB GabyRangelB requested a review from a team as a code owner March 27, 2026 16:02
@GabyRangelB GabyRangelB requested review from a team as code owners March 27, 2026 16:02
@github-actions
Copy link
Copy Markdown
Contributor

github-actions bot commented Mar 27, 2026

Claude finished @GabyRangelB's task in 5m 6s —— View job


Code Review: feat: add /star-schema-advisor slash command and tableau workbook parser

  • Gather context and understand request
  • Read changed files
  • Analyze and write review findings

The overall architecture is solid — thin script + rich slash command, defusedxml used consistently, HTTPS enforced before credential transmission, finally guarantees signout on download failure, and test fixtures are well-structured. Two items should block merge; the rest are fixable here or in a fast-follow.


Critical

1. Unhandled StopIteration from bare next() in zip extraction (tableau-analyze-workbook.py:56, 64)

Both zip functions use:

twb_name = next(n for n in zf.namelist() if n.endswith(".twb"))

If the .twbx archive contains no .twb entry, Python raises StopIteration with a raw traceback — no actionable message for the analyst. This affects both the local-file and server paths. No test covers this failure case.

Fix: use next(..., None) and raise SystemExit with a useful message, or filter-then-check with LBYL.


Important

2. Dead code: find_workbook_id_by_name() is never called (tableau-analyze-workbook.py:488–504)

The server flow uses get_workbook_id_from_exposure() instead. find_workbook_id_by_name() has no callers anywhere. It should be removed.

3. Untyped XML element and Namespace parameters (tableau-analyze-workbook.py:71, 78, 103, 222, 372, 538)

_is_internal(column), _build_calc_name_map(datasource), _extract_viz_usage(root, ...), _parse_encoding_field(element, ...), _parse_parameters(root), and _fetch_from_server(args) all accept untyped parameters. With from __future__ import annotations already in place, these should use ET.Element and argparse.Namespace respectively.

4. Relative EXPOSURES_PATH + missing encoding= on read_text() (tableau-analyze-workbook.py:39, 512)

pathlib.Path("src/dbt/kipptaf/...") resolves against the caller's cwd. When load_exposure() fails, the error is a bare FileNotFoundError. Should be anchored to pathlib.Path(__file__).parent.parent or guarded with an explicit SystemExit. Also: EXPOSURES_PATH.read_text() should pass encoding="utf-8".

5. Inline import inside function body (test_tableau_analyze_workbook.py:25)

def _load_module():
    import importlib.util  # ← should be at module level

importlib.util is always available; there's no reason for a deferred import here.

6. Relative test paths break outside repo root (test_tableau_analyze_workbook.py:14–16)

SCRIPT_PATH = pathlib.Path("scripts/tableau-analyze-workbook.py")
FIXTURES = pathlib.Path("tests/scripts/fixtures")

These fail silently if pytest is invoked from any directory other than the repo root. Use __file__-anchored absolute paths:

_HERE = pathlib.Path(__file__).parent
FIXTURES = _HERE / "fixtures"
SCRIPT_PATH = _HERE.parent.parent / "scripts" / "tableau-analyze-workbook.py"

7. Return type -> dict | None is too loose (tableau-analyze-workbook.py:302)

The output shape is fully known. dict | None gives type checkers nothing to work with. Use dict[str, object] at minimum, or a TypedDict for the structured output (appropriate for Python 3.13).


Minor

8. Deleted slash command — deliverable is incomplete (.claude/commands/star-schema-advisor.md)

git status shows D .claude/commands/star-schema-advisor.md. The commit added the file (258 lines) but the working tree has it deleted. The spec lists this file as a primary deliverable. Without it, the script exists but the 8-step workflow that consumes it does not. Was this deletion intentional? If iterating further, this PR shouldn't merge until the file is restored or a follow-up is tracked.

9. Silent pass on RequestException in signout() (tableau-analyze-workbook.py:469–472)

Best-effort signout is a valid design, but bare pass means network failures are completely invisible. Add a warnings.warn() or logging.warning() call so failures are diagnosable.

10. Redundant str() cast on caption (tableau-analyze-workbook.py:576–586)

list_datasources() already returns caption as str. The str(s.get("caption", "")) cast in _build_output() is defensive noise; fix the type annotations upstream instead.

11. XML parsed N+1 times per workbook (tableau-analyze-workbook.py:283, 304)

list_datasources() and each parse_datasource() call both invoke ET.fromstring(twb_bytes). For N datasources, the XML is parsed N+1 times. Parse once and thread the root element through.

12. _is_internal() docstring vs. implementation mismatch (tableau-analyze-workbook.py:71–75)

Docstring says "system fields," but not caption also filters uncaptioned non-system fields. Not a logic error, but the docstring should reflect the actual behavior.


Summary

# Severity Location Issue
1 Critical tableau-analyze-workbook.py:56,64 Unhandled StopIteration from bare next() on zip namelist
2 Important tableau-analyze-workbook.py:488–504 find_workbook_id_by_name() is dead code
3 Important tableau-analyze-workbook.py:71,78,103,222,372,538 Untyped XML element and Namespace parameters
4 Important tableau-analyze-workbook.py:39,512 Relative EXPOSURES_PATH, missing encoding= on read_text()
5 Important test_tableau_analyze_workbook.py:25 Inline import inside function body
6 Important test_tableau_analyze_workbook.py:14–16 Relative test paths, location-sensitive
7 Important tableau-analyze-workbook.py:302 -> dict | None return type too loose
8 Minor .claude/commands/star-schema-advisor.md File deleted in working tree — deliverable incomplete
9 Minor tableau-analyze-workbook.py:469–472 Silent pass on signout failure with no log
10 Minor tableau-analyze-workbook.py:576–586 Redundant str() cast on typed caption
11 Minor tableau-analyze-workbook.py:283,304 XML parsed N+1 times
12 Minor tableau-analyze-workbook.py:71–75 _is_internal docstring vs. implementation mismatch

Items 1 (runtime crash on malformed .twbx) and 8 (slash command deleted, primary deliverable missing) should be resolved before merge.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

feat: star-schema-advisor slash command

2 participants