[ty] improve complex TDD-based narrowing #23201
mtshiba wants to merge 75 commits into astral-sh:main
Conversation
Typing conformance results: No changes detected ✅

Memory usage report: significant changes in prefect, sphinx, trio, flake8

Merging this PR will improve performance by 24.57%
My original idea didn't work, but I was able to suppress the exponential blowup by implementing a cache for `narrow_by_constraint` instead.
| Lint rule | Added | Removed | Changed |
|---|---|---|---|
| invalid-key | 0 | 51 | 0 |
| unresolved-attribute | 0 | 4 | 32 |
| possibly-unresolved-reference | 0 | 20 | 0 |
| invalid-argument-type | 1 | 3 | 12 |
| unresolved-reference | 0 | 10 | 0 |
| unsupported-operator | 0 | 1 | 6 |
| invalid-return-type | 2 | 0 | 3 |
| not-iterable | 0 | 0 | 2 |
| unused-type-ignore-comment | 1 | 1 | 0 |
| invalid-assignment | 0 | 0 | 1 |
| no-matching-overload | 0 | 1 | 0 |
| possibly-missing-attribute | 0 | 1 | 0 |
| type-assertion-failure | 0 | 1 | 0 |
| Total | 4 | 93 | 56 |
```rust
// ...
}) => Some(!resolve_to_literal(operand)?),
_ => None,

#[derive(Copy, Clone)]
```
In the end, this didn't seem to do much to improve the performance degradation, but I don't think there's anything bad that can come from leaving it.
```diff
-self.record_narrowing_constraint(negated_predicate);
-self.record_reachability_constraint(negated_predicate);
+let predicate_id = self.record_narrowing_constraint(negated_predicate);
+self.record_reachability_constraint_id(predicate_id);
```
Because the reachability constraint and the narrowing constraint point to the same predicate, we can reuse the same ID to save memory. That was the original intention, but this change turned out to be essential for this PR (specifically for the `Bindings::merge` change, https://github.com/astral-sh/ruff/pull/23201/changes#r2800954123).
Without this identification, narrowing would be mistakenly treated as a non-no-op when it should be a no-op.
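A minimal sketch of the ID-sharing idea (all names hypothetical, loosely modeled on the diff above; not ty's actual API): interning the predicate once and reusing its ID for both constraint lists makes later identity comparisons between the two constraints reliable.

```rust
use std::collections::HashMap;

#[derive(Clone, Copy, PartialEq, Eq, Hash, Debug)]
struct PredicateId(u32);

#[derive(Default)]
struct UseDefMapBuilder {
    interned: HashMap<String, PredicateId>,
    narrowing: Vec<PredicateId>,
    reachability: Vec<PredicateId>,
}

impl UseDefMapBuilder {
    /// Record a narrowing constraint and return the interned predicate ID.
    fn record_narrowing_constraint(&mut self, predicate: &str) -> PredicateId {
        let next = PredicateId(self.interned.len() as u32);
        let id = *self.interned.entry(predicate.to_string()).or_insert(next);
        self.narrowing.push(id);
        id
    }

    /// Reuse the already-interned ID instead of re-recording the predicate.
    fn record_reachability_constraint_id(&mut self, id: PredicateId) {
        self.reachability.push(id);
    }
}

fn main() {
    let mut builder = UseDefMapBuilder::default();
    let id = builder.record_narrowing_constraint("x is not None");
    builder.record_reachability_constraint_id(id);
    // Both constraint lists now point at the identical predicate ID, so an
    // ID-equality check can recognize that they describe the same predicate.
    assert_eq!(builder.narrowing, builder.reachability);
}
```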
```diff
     place: ScopedPlaceId,
 ) -> Type<'db> {
-    self.narrow_by_constraint_inner(db, predicates, id, base_ty, place, None)
+    let mut memo = FxHashMap::default();
```
Caching helps mitigate the exponential blowup reported in 5fd9a9c.
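As a rough illustration of why the memo matters, here is a minimal sketch (hypothetical structures, not ty's actual TDD): a walk over a decision diagram with shared subgraphs is exponential without a cache and linear with one, because each node is evaluated once per incoming edge instead of once overall.

```rust
use std::collections::HashMap;

// Hypothetical mini-TDD: each interior node branches three ways, mirroring
// a ternary decision diagram. All three branches of each node point at the
// same shared successor, so a naive walk revisits it three times per level.
#[derive(Clone, Copy, PartialEq, Eq, Hash)]
struct NodeId(usize);

enum Node {
    Terminal(bool),
    Interior { if_true: NodeId, if_false: NodeId, if_ambiguous: NodeId },
}

struct Tdd { nodes: Vec<Node> }

impl Tdd {
    // Count satisfying paths, memoized per node: without `memo`, this walk
    // takes 3^depth recursive calls; with it, one call per distinct node.
    fn count_paths(&self, id: NodeId, memo: &mut HashMap<NodeId, u64>) -> u64 {
        if let Some(&cached) = memo.get(&id) {
            return cached;
        }
        let result = match &self.nodes[id.0] {
            Node::Terminal(true) => 1,
            Node::Terminal(false) => 0,
            Node::Interior { if_true, if_false, if_ambiguous } => {
                self.count_paths(*if_true, memo)
                    + self.count_paths(*if_false, memo)
                    + self.count_paths(*if_ambiguous, memo)
            }
        };
        memo.insert(id, result);
        result
    }
}

fn main() {
    // A chain of 30 interior nodes: 3^30 paths, but only 31 cache entries.
    let n = 30;
    let mut nodes = vec![Node::Terminal(true), Node::Terminal(false)];
    let mut next = NodeId(0);
    for _ in 0..n {
        nodes.push(Node::Interior { if_true: next, if_false: next, if_ambiguous: next });
        next = NodeId(nodes.len() - 1);
    }
    let tdd = Tdd { nodes };
    let mut memo = HashMap::new();
    assert_eq!(tdd.count_paths(next, &mut memo), 3u64.pow(30));
    assert_eq!(memo.len(), n + 1); // 30 interior nodes + the `true` terminal
}
```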
```rust
{
    ScopedNarrowingConstraint::ALWAYS_TRUE
} else {
    // A branch contributes narrowing only when it is reachable.
```
This technique is possible because narrowing and reachability are now managed in the same data structure: the narrowing of `a`/`b` is triggered only when the reachability of `a`/`b` is ALWAYS_TRUE.
In other words, each narrowing constraint is gated with its corresponding reachability constraint via a logical AND.
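The gating can be sketched with a minimal model (sets of variant names standing in for types; all names are hypothetical, not ty's actual API): when merging bindings from several branches, a branch contributes its narrowed type only if its reachability constraint holds, so types from dead paths never widen the merge.

```rust
use std::collections::BTreeSet;

// Merge the narrowed types of several branches, gating each branch's
// narrowing constraint with that branch's reachability constraint.
fn merge_bindings(
    branches: &[(BTreeSet<&'static str>, bool)], // (narrowed variants, reachable?)
) -> BTreeSet<&'static str> {
    let mut merged = BTreeSet::new();
    for (narrowed, reachable) in branches {
        // The AND gate: an unreachable branch contributes nothing.
        if *reachable {
            merged.extend(narrowed.iter().copied());
        }
    }
    merged
}

fn main() {
    // `x: A | B`; one branch narrows x to {A} and is reachable, the other
    // narrows it to {B} but is statically unreachable.
    let reachable = (BTreeSet::from(["A"]), true);
    let dead = (BTreeSet::from(["B"]), false);
    // Without gating the merge would be {A, B}; with gating it is {A}.
    assert_eq!(merge_bindings(&[reachable, dead]), BTreeSet::from(["A"]));
}
```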
I've tried various things, but this is the limit: I'm out of ideas for further optimization, and the codspeed profiling results show no abnormal hot spots. So I'll mark this PR as ready for review.
```rust
/// Inner recursive helper that accumulates narrowing constraints along each TDD path.
#[allow(clippy::too_many_arguments)]
fn narrow_by_constraint_inner<'db>(
```
This function was the hotspot in most of the performance-degradation cases: once we started building complex narrowing TDDs, the number of recursive calls grew significantly.
````rust
/// ...
/// case _: pass
/// ```
fn benchmark_large_union_narrowing(criterion: &mut Criterion) {
````
This benchmark currently shows up as "new" in the codspeed report, so we can't see how much this PR improves performance on it relative to main. It might be better to add the benchmark in a standalone PR, wait for codspeed to finish on that PR branch, and then rebase this PR on top of it; codspeed should then tell us how much this PR improves performance on this benchmark.
```rust
/// Inner recursive helper that accumulates narrowing constraints along each TDD path.
#[allow(clippy::too_many_arguments)]
```

Suggested change:

```diff
-#[allow(clippy::too_many_arguments)]
+#[expect(clippy::too_many_arguments)]
```
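For context on these suggestions: `#[expect]` (stable since Rust 1.81) suppresses a lint exactly like `#[allow]`, but additionally fires the `unfulfilled_lint_expectations` warning if the suppressed lint no longer triggers, so stale suppressions surface instead of lingering. A minimal illustration:

```rust
// With `#[allow]`, if this loop were later rewritten to an iterator, the
// now-useless attribute would silently remain; with `#[expect]`, the
// compiler would warn that the expectation is unfulfilled.
#[expect(clippy::needless_range_loop)]
fn sum(values: &[i32]) -> i32 {
    let mut total = 0;
    for i in 0..values.len() {
        total += values[i];
    }
    total
}

fn main() {
    assert_eq!(sum(&[1, 2, 3]), 6);
}
```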
```rust
/// if reachability analysis etc. fails when analysing these enums.
const MAX_NON_RECURSIVE_UNION_ENUM_LITERALS: usize = 8192;

#[allow(clippy::struct_excessive_bools)]
```

Suggested change:

```diff
-#[allow(clippy::struct_excessive_bools)]
+#[expect(clippy::struct_excessive_bools)]
```
```rust
    }
}

#[allow(clippy::unnecessary_wraps)]
```

Suggested change:

```diff
-#[allow(clippy::unnecessary_wraps)]
+#[expect(clippy::unnecessary_wraps)]
```
```rust
    )
}

#[allow(clippy::unnecessary_wraps)]
```

Suggested change:

```diff
-#[allow(clippy::unnecessary_wraps)]
+#[expect(clippy::unnecessary_wraps)]
```
```rust
// Fast path for intersection types: use set-based subset check instead of
// the full `has_relation_to` machinery. This is critical for narrowing where
// many intersection types with overlapping positive elements are produced.
if let (Type::Intersection(self_inter), Type::Intersection(other_inter)) = (self, other) {
```
I'm not sure I understand this comment: why is it important for this fast path to live in `Type::has_relation_to()` (an uncached function) rather than `Type::has_relation_to_impl` (a cached function)? Can the fast path not be applied in the `(Type::Intersection(...), Type::Intersection(...))` branch of `Type::has_relation_to_impl`?
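As a rough model of the set-based check being discussed (simplified: strings stand in for positive intersection elements, and only positive elements are considered; not ty's actual representation): if every positive element of one intersection also appears in the other, the subset relation can be decided without running the full pairwise relation machinery.

```rust
use std::collections::BTreeSet;

// An intersection `A & B & C` is at least as narrow as `A & B` whenever the
// positive elements of the latter form a subset of the former, so a plain
// set-subset test can answer the question in one pass.
fn intersection_subset_fast_path(
    self_positive: &BTreeSet<&str>,
    other_positive: &BTreeSet<&str>,
) -> bool {
    other_positive.is_subset(self_positive)
}

fn main() {
    let abc: BTreeSet<&str> = BTreeSet::from(["HasX", "HasY", "HasZ"]);
    let ab: BTreeSet<&str> = BTreeSet::from(["HasX", "HasY"]);
    // HasX & HasY & HasZ relates to HasX & HasY, but not the other way round.
    assert!(intersection_subset_fast_path(&abc, &ab));
    assert!(!intersection_subset_fast_path(&ab, &abc));
}
```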
```rust
#[salsa::tracked(
    cycle_initial=|_, id, _, _| Type::divergent(id),
    cycle_fn=|db, cycle, previous: &Type<'db>, result: Type<'db>, _, _| {
        result.cycle_normalized(db, *previous, cycle)
    },
    heap_size=ruff_memory_usage::heap_size
)]
fn from_two_elements(db: &'db dyn Db, a: Type<'db>, b: Type<'db>) -> Type<'db> {
    IntersectionBuilder::new(db)
        .positive_elements([a, b])
        .build()
}
```
I would be very interested to see the effect of this change (and the changes elsewhere that use this new constructor) in isolation, pulled out into a standalone PR. I suspect it may be a significant factor in both the memory-usage increase and the pydantic performance improvement on this branch.
Summary
An unsolved issue in #23109 is that the following narrowing doesn't work properly:
A fix was attempted in ada1084 but abandoned due to the exponential blowup (5fd9a9c). I've come up with a different approach, so I'll try it out to see whether it works.
What this PR does is simple: when merging bindings, it "gates" narrowing constraints with reachability constraints.
This allows us to exclude types from unreachable paths.
The problem is that this increases the size of the TDD used for narrowing, which causes serious performance degradation.
This PR also implements optimizations to prevent this.
In order of importance:
1. `narrow_by_constraint` cache: without it, CI would be so slow that it would not finish (989268d).
2. `NarrowingConstraint` structure: `intersection_disjunct` now holds multiple types (`Conjunction`s), and intersection construction is now delayed (a1e303e).
3. `all_negative_narrowing_constraints_for_expression`: `all_narrowing_constraints_for_expression` now calculates both positive and negative values simultaneously (8468a9e).
4. `build_predicate`: this did not seem to have much effect on this PR. ...
ecosystem-analyzer timing results showed a significant regression in egglog-python that was missed by codspeed. Each commit gradually mitigates this regression.
Related changes: `narrow_by_constraint_inner`, `all_negative_narrowing_constraints_for_...`, `NarrowingConstraint` has `Conjunction`s.

Test Plan
mdtest updated