Skip to content

refactor(http): extract BaseHTTPResource base class from 9 API resources#3558

Open
cbini wants to merge 27 commits intomainfrom
cbini/refactor/claude-base-http-resource
Open

refactor(http): extract BaseHTTPResource base class from 9 API resources#3558
cbini wants to merge 27 commits intomainfrom
cbini/refactor/claude-base-http-resource

Conversation

@cbini
Copy link
Copy Markdown
Member

@cbini cbini commented Apr 1, 2026

Pull Request

Summary & Motivation

"When merged, this pull request will..."

Extract a reusable BaseHTTPResource base class from 9 existing Dagster HTTP
resource classes, eliminating duplicated session lifecycle, retry, error
handling, and pagination logic.

What changed:

  • New src/teamster/libraries/http/ package with BaseHTTPResource (lifecycle,
    request pipeline with tenacity retry, error handling with rate-limit awareness,
    re-auth guard) and three pagination helpers (cursor, offset, page)
  • All 9 active HTTP resources migrated to inherit from BaseHTTPResource:
    SmartRecruiters, PowerSchool Enrollment, Overgrad, KnowBe4 (Tier 1);
    Zendesk, Coupa, Grow (Tier 2); ADP WFN, DeansList (Tier 3)
  • ADP WFN post() renamed to post_action() (4 callers updated) to avoid
    base class conflict
  • 67 unit tests added (37 base class + pagination, 30 subclass migration tests)
  • Updated spec with audit findings (FinalsiteResource reclassified as vestigial)

Vestigial resources left untouched: AdpWorkforceManagerResource,
MClassResource, DibelsDataSystemResource, CouchdropResource,
AlchemerResource, FinalsiteResource.

AI Assistance

Implementation was AI-assisted (Claude Code) following a human-directed spec and
implementation plan. Spec design, audit, and all architectural decisions were
human-directed. Implementation was subagent-driven with review between tasks.

Self-review

General

  • Update due date and assignee on the
    TEAMster Asana Project
  • Review the Claude Code Review comment posted on this PR. Address valid
    feedback; dismiss false positives with a brief reply explaining why.
    (Claude is advisory — use your judgement, but don't ignore it.)

Dagster (skip if no Dagster changes)

  • Run uv run dagster definitions validate for any modified code location
  • Run uv run pytest tests/test_dagster_definitions.py for any modified
    code location
  • New integrations follow the
    Library + Config pattern
    with the correct asset key format and IO manager

CI checks

  • Trunk — passes. If it fails, run trunk check and trunk fmt
    locally, fix any issues, and push again.
  • dbt Cloud — passes. If it fails: click Details on the check →
    expand Invoke dbt build ...Debug Logs. Or tag Claude on
    the PR with the error details.
  • Dagster Cloud — passes or not triggered (only runs when relevant paths
    change and the PR is not a draft). If it fails: check the Actions tab
    for the workflow run logs. Re-run failed jobs if the failure looks
    transient; otherwise check the Dagster Cloud branch deployment for
    error details. Or tag Claude on the PR for help.

🤖 Generated with Claude Code

cbini and others added 20 commits March 25, 2026 22:25
Elevate design specs directive to top-level Working Convention in CLAUDE.md.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Add InitResourceContext types, check.not_none pattern, teardown
net-new note, tenacity cancellation caveat, and OAuth2Session typing
guidance.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Fix "see note below/above" reference direction and clarify request
pipeline diagram to show _prepare_request destructuring.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Audit found FinalsiteResource is vestigial (code locations use SFTP),
Overgrad/DeansList use page-based pagination (not offset), and ADP WFN
stops on HTTP 204. Updated spec with all resolutions and wrote 13-task
implementation plan for BaseHTTPResource extraction.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…d URL construction

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…h mTLS and offset pagination

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…tion and beta URL logic

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@cbini cbini requested review from a team as code owners April 1, 2026 03:34
@github-actions
Copy link
Copy Markdown
Contributor

github-actions bot commented Apr 1, 2026

Claude finished @cbini's task in 14m 26s —— View job


Code Review: PR #3558refactor(http): extract BaseHTTPResource base class

  • Gather context and git SHAs
  • Read key source files
  • Run code-reviewer subagent
  • Post review findings

Strengths

The PR achieves its core goal cleanly. Eliminating duplicated session lifecycle, retry, and error handling code across nine resources is real and substantial. The three pagination helpers (paginate_cursor, paginate_offset, paginate_page) are clean, composable, and well-parameterized. The _prepare_request hook pattern is a good extension point. The tenacity retry + _NonRetryableHTTPError sentinel is an elegant way to selectively escape the retry loop. Test coverage is thorough for both the base class and each subclass, with mocking done at the right seam (_session.request rather than _request). The post_action() rename is handled completely across all four callers.


Critical Issues

1. Bare raise outside an exception handler will crash at runtime

src/teamster/libraries/http/resources.py, line 129:

def _reauthenticate(self) -> None:
    raise

A bare raise re-raises the active exception from the current thread's exception context. _reauthenticate() is called from _handle_error(), which is called from inside an except HTTPError as exc: block in _request_with_retry(). By the time Python enters _reauthenticate, it has already left the except block — there is no active exception on the stack. The raise will produce RuntimeError: No active exception to re-raise, not the intended HTTPError propagation.

Any 401 response will crash with an unexpected RuntimeError rather than surfacing as an HTTPError to the retry machinery.

The test test_401_calls_reauthenticate_once patches _reauthenticate with a no-op mock, which intercepts the call before the bare raise executes — this is why the test passes while masking the defect.

Fix: Accept the HTTPError as a parameter and re-raise it explicitly:

def _reauthenticate(self, error: HTTPError) -> None:
    raise error

2. paginate_cursor hardcodes page[after] as the cursor parameter name

src/teamster/libraries/http/pagination.py, line 35:

params["page[after]"] = cursor

This is Zendesk's convention. paginate_cursor is a general-purpose helper, but the cursor injection is hardcoded. Any future integration that uses a different cursor param name (after, cursor, next_page_token) cannot use this helper without copying it. paginate_offset and paginate_page both parameterize their key names; paginate_cursor is inconsistent.

Fix: Add cursor_param: str = "page[after]" parameter.


3. test_get_retry_after_parses_http_date has a dead assertion and a broken mock

tests/libraries/http/test_resources.py, lines 249–264:

datetime(2026, 4, 1, 12, 0, 5, tzinfo=timezone.utc)  # no-op, immediately discarded

The mock patches teamster.libraries.http.resources.datetime, but _get_retry_after calls parsedate_to_datetime (from email.utils) which is not patched — so it uses the real system clock for datetime.now(). The assertion abs(result - 5.0) < 1.0 is only correct if the test runs at exactly 2026-04-01T12:00:00Z. This test will silently fail on any other date/time.

Fix: Patch email.utils.parsedate_to_datetime or pin the mock's now() return value to make the test time-independent.


Important Issues

4. _post_request in DeansListResource is misleadingly named

src/teamster/libraries/deanslist/resources.py, lines 57–59 and 171–175:

_post_request sounds like a base class pipeline hook (analogous to _prepare_request) but is not — it's a private method only called by _request_with_cleanup(). A reader of the base class will naturally look for a matching _post_request counterpart that doesn't exist.

Fix: Rename to something that reflects its specific purpose (e.g., _strip_api_key), or add a proper _post_request hook to the base class.

5. Orphaned Session in ADP and Coupa _setup_session

src/teamster/libraries/adp/workforce_now/api/resources.py, line 23; src/teamster/libraries/coupa/resources.py, line 24:

BaseHTTPResource initializes _session via PrivateAttr(default_factory=Session). These subclasses then replace self._session entirely with an OAuth2Session. The initially-allocated Session is orphaned (never closed). teardown_after_execution closes the replacement correctly, but the first Session leaks.

Fix: Either don't allocate the default Session (change the PrivateAttr default to None and construct in _setup_session), or add a base class docstring note that subclasses replacing self._session should call self._session.close() first.

6. ZendeskResource._get_retry_after can return a negative value

src/teamster/libraries/zendesk/resources.py, lines 39 and 45:

return float(reset) - time.time() + 1

The base class clamps to max(delta, 0.0). This override adds 1 second padding but does not clamp — if the reset timestamp has already passed, the return value will be negative. time.sleep(-n) is a no-op on most platforms, but this is inconsistent with the base class contract.


Minor Issues

7. DeansListResource._current_school_id not reset on exception

src/teamster/libraries/deanslist/resources.py, lines 96, 165:

If a request inside get() or list() raises, self._current_school_id is never reset to 0. A subsequent call on a different school would inject the wrong API key via _prepare_request. Pre-existing issue, but this refactor is a good opportunity to add try/finally.

8. AdpWorkforceNowResource.extract_records on 204 No Content

src/teamster/libraries/adp/workforce_now/api/resources.py, lines 71–83:

When ADP returns 204, extract_records calls resp.json()[endpoint_name], but a 204 response has no body — resp.json() will raise JSONDecodeError before is_last_page is checked. Pre-existing behavior preserved by the refactor. A comment noting that ADP never actually returns 204 in practice (or a body guard) would clarify intent.

9. All three pagination helpers yield an empty last page without documentation

src/teamster/libraries/http/pagination.py:

yield records is unconditional before the cursor/next-page check. If the final page has zero records, callers receive one empty list. Internally consistent across all three helpers, but undocumented. Accumulation patterns like list(chain.from_iterable(...)) handle this fine; direct iteration may not.


Assessment

Needs work before merge.

Issues 1 and 3 are the blockers: the bare raise will cause any 401 response to raise RuntimeError in production (the only subclasses that override _reauthenticate are the ones that need it), and the HTTP-date test is time-dependent and already broken. Issue 2 is a design gap better fixed now than after other integrations build on the hardcoded behavior. Issues 4–9 are correctness or clarity concerns that should be addressed but are not merge blockers.

- Add cursor_param to paginate_cursor (was hardcoded to page[after])
- Clean up HTTP-date test (remove dead code, use pytest.approx)
- Rename _post_request to _strip_api_key in DeansListResource
- Document session replacement in _setup_session docstring
- Clamp Zendesk _get_retry_after to max(..., 0.0)
- Add try/finally for DeansList _current_school_id reset
- Guard ADP extract_records against 204 No Content

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@cbini
Copy link
Copy Markdown
Member Author

cbini commented Apr 1, 2026

@claude

#1 — bare raise is correct. _reauthenticate() is called from within the except HTTPError block in _request_with_retry (line 224). Python's exception context persists through nested function calls. Verified with a reproducer.
#9 — empty last-page yield is by design, consistent across all three paginators, and harmless to callers using .extend().

update your review

@github-actions
Copy link
Copy Markdown
Contributor

github-actions bot commented Apr 1, 2026

Claude finished @cbini's task in 1m 24s —— View job


Code Review Update: PR #3558

  • Verify bare raise claim against actual source
  • Verify empty last-page yield claim
  • Update review findings

Updated Findings

Issue #1 — retracted. The bare raise in _reauthenticate() is correct. Python's exception context (sys.exc_info()) is thread-local and persists through nested function calls made within an except block. Since _reauthenticate() is called (via _handle_error) from within the except HTTPError as exc: block at resources.py:228, the active exception is still on the stack and bare raise re-raises it as intended. The original review was wrong on this point.

Issue #9 — acknowledged as by design. yield records before the cursor/stop check is intentional and consistent across all three pagination helpers. Callers using .extend() or chain.from_iterable() handle an occasional empty list correctly. No change needed.


Status

All other issues from the original review have been addressed in 6a8d3e9b4:

  • cursor_param added to paginate_cursor (was hardcoded)
  • HTTP-date test cleaned up (dead assertion removed, pytest.approx used)
  • _post_request renamed to _strip_api_key in DeansListResource
  • Session replacement documented in _setup_session docstring
  • Zendesk _get_retry_after clamped to max(..., 0.0)
  • _current_school_id reset guarded with try/finally
  • ADP extract_records guarded against 204 No Content

The PR looks good to merge.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

refactor: extract base HTTP resource from repetitive requests.Session wrappers

2 participants