⚡ Optimize duplicate export detection from O(N^2) to O(N)#15
⚡ Optimize duplicate export detection from O(N^2) to O(N)#15bashandbone wants to merge 2 commits intomainfrom
Conversation
Replaces the O(N^2) duplicate detection logic in `PropagationGraph._validate_no_duplicates` (which used `export_names.count(name)` inside a list comprehension) with an O(N) approach using `collections.Counter(export_names)`. This significantly improves the performance of validation for large module graphs. Co-authored-by: bashandbone <89049923+bashandbone@users.noreply.github.com>
|
👋 Jules, reporting for duty! I'm here to lend a hand with this pull request. When you start a review, I'll add a 👀 emoji to each comment to let you know I've read it. I'll focus on feedback directed at me and will do my best to stay out of conversations between you and other bots or reviewers to keep the noise down. I'll push a commit with your requested changes shortly after. Please note there might be a delay between these steps, but rest assured I'm on the job! For more direct control, you can switch me to Reactive Mode. When this mode is on, I will only act on comments where you specifically mention me with New to Jules? Learn more at jules.google/docs. For security, I will only act on instructions from the user who triggered this task. |
There was a problem hiding this comment.
Pull request overview
Optimizes the duplicate-export validation path in PropagationGraph by replacing an O(N²) duplicate scan with collections.Counter, aiming to speed up manifest generation for large export sets.
Changes:
- Add
import collections. - Use
collections.Counter(export_names)to compute duplicate names in_validate_no_duplicates.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| duplicates = [name for name, count in collections.Counter(export_names).items() if count > 1] | ||
| raise ValueError( | ||
| f"❌ Error: Duplicate exports in module {module_path}\n\n" | ||
| f" Duplicates: {', '.join(set(duplicates))}" |
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com> Signed-off-by: Adam Poulemanos <89049923+bashandbone@users.noreply.github.com>
💡 What: Replaced the list comprehension
export_names.count(name)withcollections.Counter(export_names)in_validate_no_duplicates.🎯 Why:
export_names.count()iterates over the entire list for every single element, making it an O(N^2) operation. When generating manifests for modules with thousands of exports, this check becomes a bottleneck. Using acollections.Countercorrectly counts occurrences in O(N) time.📊 Measured Improvement: In a targeted benchmark testing duplicate detection on an array of 20,000 strings:
PR created automatically by Jules for task 6934359350758911397 started by @bashandbone