Skip to content

fix: replace tick-based SinceCondition with storage-id comparison#3422

Draft
cbini wants to merge 1 commit intomainfrom
fix/automation-condition-tick-collision
Draft

fix: replace tick-based SinceCondition with storage-id comparison#3422
cbini wants to merge 1 commit intomainfrom
fix/automation-condition-tick-collision

Conversation

@cbini
Copy link
Copy Markdown
Member

@cbini cbini commented Mar 12, 2026

Summary

  • Fixes a production bug where Dagster's SinceCondition silently swallows dep updates when both any_deps_updated (trigger) and newly_updated (reset) fire within the same evaluation tick — the reset unconditionally wins regardless of wall-clock ordering (SinceCondition discards trigger events when trigger and reset both fire in the same evaluation tick dagster-io/dagster#33587)
  • Introduces AnyDepsUpdatedSinceSelf, a custom SubsetAutomationCondition that compares monotonically-increasing storage_ids instead of relying on tick boundaries, with view lookthrough and diamond-DAG protection
  • Removes the now-dead _build_any_ancestor_updated function whose functionality is subsumed by the new condition

Test plan

  • Regression test test_table_requested_when_dep_updates_between_request_and_completion confirms the fix
  • All 28 automation condition tests pass (uv run pytest tests/test_automation_conditions.py -v)
  • All 5 individual code location definition validation tests pass
  • Monitor production automation evaluations after deploy to confirm no regressions

🤖 Generated with Claude Code

… dep updates

SinceCondition's tick-based state machine silently swallows dep updates when
both any_deps_updated (trigger) and newly_updated (reset) fire in the same
evaluation tick — the reset unconditionally wins regardless of wall-clock
ordering. This was observed in production when an upstream asset materialized
after the downstream's run completed but within the same tick window.

Introduces AnyDepsUpdatedSinceSelf, a custom SubsetAutomationCondition that
compares monotonically-increasing storage_ids instead of relying on tick
boundaries. Also supports will_be_requested() semantics for same-tick chain
propagation and view lookthrough with diamond-DAG protection.

Removes the now-dead _build_any_ancestor_updated function whose functionality
is subsumed by the new condition's _gather_dep_keys recursion.

Ref: dagster-io/dagster#33587

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@cbini cbini requested review from a team as code owners March 12, 2026 19:41
@github-actions
Copy link
Copy Markdown
Contributor

github-actions bot commented Mar 12, 2026

Claude finished @cbini's task in 2m 11s —— View job


Code Review

  • Gather context
  • Review automation_conditions.py
  • Review test_automation_conditions.py
  • Review CLAUDE.md update
  • Post final feedback

Overall Assessment

This is a well-reasoned fix for a real production bug. The storage-id comparison approach correctly sidesteps the tick-collision issue in SinceCondition, and the implementation is clean with good test coverage. A few things worth flagging:


AnyDepsUpdatedSinceSelf — design and correctness

get_inner_queryer_for_back_compat() coupling (automation_conditions.py:127)

The _back_compat suffix is a yellow flag — Dagster may remove this in a future version without a major version bump. The same is true for the two queryer methods used below it (get_latest_materialization_or_observation_storage_id, get_latest_materialization_or_observation_record). The codebase already does private-API patching (_patched_get_dep_keys), so this is a known trade-off, but it's worth keeping an eye on when upgrading Dagster.

Unpartitioned-only behavior (automation_conditions.py:128)

asset_partition = AssetKeyPartitionKey(context.key)

AssetKeyPartitionKey(key) with no partition key returns an unpartitioned record. For partitioned dbt assets this would silently query the wrong storage slot. Since all dbt models in this project appear to be unpartitioned this is fine in practice, but a comment or assertion (assert not context.partition_key) would make the assumption explicit and prevent silent misbehavior if a partitioned table ever adopts this condition.

result includes view deps themselves (automation_conditions.py:109)

dep_keys = set(asset_graph.get(key).parent_entity_keys) - ignore_keys
result = set(dep_keys)

Views are included in result along with their parents. This means if an intermediate view was re-materialized after self, the condition fires even without a table data change. This is arguably correct (a view refresh can indicate upstream data changed), but differs from the intent described in the docstring ("looks through intermediate views"). Worth a clarifying comment at the call site or in the docstring.

Diamond-DAG protection is correct_visited is shared across the full recursive traversal so each view is expanded at most once. Good.


compute_subset — logic

The two-check structure is sound:

  1. Storage-id comparison for materializations that already happened
  2. request_subsets_by_key for same-tick propagation through chains (mirrors will_be_requested() semantics)

The early-exit on my_storage_id is None (never materialized → defer to newly_missing) is correct and the comment explains it well.

One minor nit: the dep_record is not None check at line 159 returns the entire candidate subset on the first matching dep. For most dbt DAGs this is fine, but it means the condition can fire for a dep in ignore_selection-adjacent territory if _gather_dep_keys was called with a stale ignore_keys set. This looks correct as written — just confirming the ignore_keys subtraction happens before the loop.


Tests

The regression test (test_table_requested_when_dep_updates_between_request_and_completion) correctly mirrors the production scenario, including the T3 < T4 < T5 ordering that proves the storage-id comparison is doing the right thing. The comment mapping test steps to real timestamps is excellent for long-term maintainability.

Missing test coverage worth adding:

  • ignore_selection actually excludes the specified dep from triggering (currently only the empty selection is tested via _EMPTY_SELECTION)
  • Behavior when max_view_depth=0 (no view lookthrough)
  • View itself being newer than self fires the condition (verify the implicit behavior documented above)

These are non-blocking for the current fix — the regression test and the chain tests give solid confidence in the core change.


CLAUDE.md

The updated description accurately reflects the new implementation. The note about "unsynced badge" behavior is a useful callout for future maintainers.


Summary: The core fix is correct and well-tested. The most actionable suggestion before merging is adding a comment or assertion about the unpartitioned assumption in compute_subset. Everything else is minor or nice-to-have.

@cbini cbini marked this pull request as draft March 12, 2026 19:44
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant