refactor: hub-aware decompose grouping to prevent mega-clusters #956

Open
chubes4 wants to merge 1 commit into main from improve-decompose-grouping

chubes4 (Member) commented Mar 23, 2026

Summary

  • Fix: The refactor decompose algorithm used undirected union-find for call graph clustering, which collapsed everything reachable from a high-fan-out orchestrator function into a single mega-group
  • Change: Identify "hub" functions (≥4 callees) and exclude their edges from clustering, producing focused sub-groups instead of mega-clusters
  • Change: Add dominant prefix detection for semantic cluster naming (e.g., resolve_* functions → resolve.rs) instead of naming after the most-called function
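A minimal sketch of what dominant prefix detection could look like, for illustration only. The function name `dominant_prefix`, the "strict majority" threshold, and the choice to use only the first snake_case token are assumptions; the PR does not show the actual implementation, and some of its results (for example `generate_test.rs`) suggest the real logic can pick longer prefixes.

```rust
use std::collections::HashMap;

/// Hypothetical helper: returns the leading snake_case token shared by a
/// strict majority of `names`, or `None` when no token dominates.
/// The majority threshold is an assumption, not taken from the PR.
fn dominant_prefix(names: &[&str]) -> Option<String> {
    let mut counts: HashMap<&str, usize> = HashMap::new();
    for name in names {
        // The prefix is the text before the first underscore.
        // Names without an underscore contribute no prefix.
        if let Some((prefix, _rest)) = name.split_once('_') {
            if !prefix.is_empty() {
                *counts.entry(prefix).or_insert(0) += 1;
            }
        }
    }
    // At most one prefix can cover more than half of the names,
    // so the result does not depend on HashMap iteration order.
    counts
        .into_iter()
        .filter(|&(_, count)| count * 2 > names.len())
        .max_by_key(|&(_, count)| count)
        .map(|(prefix, _)| prefix.to_string())
}
```

Under these assumptions, a cluster of `resolve_path`, `resolve_constructor`, and `slugify` would be labelled `resolve`, while a cluster with no shared leading token would fall back to some other naming rule.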

Before vs After

contract_testgen.rs (2,998 lines)

Before (3 groups, one mega-group of 15):

infer_hint_for_param.rs — 15 functions (including the main API!)
make_test_async.rs      — 2 functions
types.rs                — 3 structs

After (8 focused groups):

build.rs              — build_param_inputs, build_variables
condition_contains.rs — condition_contains_param_method, condition_contains_negated_method
default_call_arg.rs   — default_call_arg, infer_setup_from_condition, resolve_constructor
generate_test.rs      — generate_test_plan, generate_test_plan_with_types
helpers.rs            — derive_template_key, slugify, infer_hint_for_param, extract_method_string_arg
like.rs               — is_path_like, is_numeric_like
make_test_async.rs    — make_test_async, render_test_plan
types.rs              — SetupOverride, TestCase, TestPlan

extension/mod.rs (1,130 lines) → 9 groups

rename/mod.rs (1,912 lines) → 9 groups

How It Works

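The key insight is that an orchestrator function (one that calls 4+ other functions in the same file) links all of its callees into a single connected component once call edges are treated as undirected, so union-find merges them transitively. By excluding hub edges: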

  1. Leaf helper clusters form naturally (2-3 tightly coupled functions)
  2. Hub functions fall through to name-based clustering where they group with similarly-named peers
  3. No single group absorbs the majority of the file's functions
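The effect of skipping hub edges can be sketched with a small union-find over function indices. This is an illustrative sketch, not the PR's code: `cluster`, `find`, and `HUB_MIN_CALLEES` are hypothetical names, functions are represented as plain indices, and only a hub's outgoing edges are skipped (the PR does not say whether incoming edges are also excluded).

```rust
use std::collections::{BTreeMap, BTreeSet};

/// Hub threshold from the PR description (>= 4 callees); the name is hypothetical.
const HUB_MIN_CALLEES: usize = 4;

/// Finds the representative of `x`, compressing the path along the way.
fn find(parent: &mut Vec<usize>, x: usize) -> usize {
    let mut root = x;
    while parent[root] != root {
        root = parent[root];
    }
    let mut cur = x;
    while parent[cur] != root {
        let next = parent[cur];
        parent[cur] = root;
        cur = next;
    }
    root
}

/// Groups functions `0..n` by `(caller, callee)` edges, treated as undirected,
/// while ignoring every edge whose caller is a hub.
fn cluster(n: usize, calls: &[(usize, usize)]) -> Vec<Vec<usize>> {
    // Count distinct callees per caller to identify hubs.
    let mut callees: Vec<BTreeSet<usize>> = vec![BTreeSet::new(); n];
    for &(caller, callee) in calls {
        if caller != callee {
            callees[caller].insert(callee);
        }
    }

    let mut parent: Vec<usize> = (0..n).collect();
    for &(caller, callee) in calls {
        // Assumption: only a hub's outgoing edges are excluded.
        if callees[caller].len() >= HUB_MIN_CALLEES {
            continue;
        }
        let a = find(&mut parent, caller);
        let b = find(&mut parent, callee);
        if a != b {
            parent[a] = b;
        }
    }

    // Collect members per representative; members are pushed in index order.
    let mut groups: BTreeMap<usize, Vec<usize>> = BTreeMap::new();
    for i in 0..n {
        let root = find(&mut parent, i);
        groups.entry(root).or_default().push(i);
    }
    let mut out: Vec<Vec<usize>> = groups.into_values().collect();
    out.sort();
    out
}
```

With function 0 calling 1, 2, 3, and 4, plus the helper edges 1→2, 3→4, and 5→6, this sketch leaves 0 on its own and yields the pairs {1, 2}, {3, 4}, and {5, 6}. In the PR's design the unclustered hub would then be handed to name-based clustering, which is not modelled here.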

Note

The Rust grammar parser currently only matches pub and pub(crate) function visibility — pub(super) and private functions with complex signatures may not be parsed. This limits decompose to ~20 of ~30 items in contract_testgen.rs. That's a separate grammar issue (#818 scope).

Related

The call graph clustering used undirected union-find, which collapsed
everything reachable from a high-fan-out orchestrator into a single
mega-group. For contract_testgen.rs (3k lines), this produced 3 groups
with one containing 15 functions.

Changes:
- Identify hub functions (>=4 callees) and exclude their edges from
  union-find, preventing transitive mega-clusters
- Add dominant prefix detection for semantic cluster labels (e.g.,
  resolve_*, infer_*) instead of naming after the most-called function
- Expand stop word list to avoid generic cluster names
- Hub functions fall through to name-based clustering where they can
  form focused groups with similarly-named functions

Results on contract_testgen.rs:
  Before: 3 groups (one with 15 functions named 'infer_hint_for_param')
  After:  8 focused groups (build, generate_test, helpers, types, etc.)

Results on extension/mod.rs:
  Before: monolithic groups
  After:  9 groups (capability, find_extension, resolve, types, etc.)

Results on rename/mod.rs:
  Before: monolithic groups
  After:  9 groups (case_utilities, reference_finding, rename_generation, types, etc.)

chubes4 added the cleanup (Dead code removal and cleanup) label Mar 23, 2026

homeboy-ci bot (Contributor) commented Mar 23, 2026

Homeboy Results — homeboy

Lint

Failure Digest

Lint Failure Digest

Autofixability classification

  • Overall: auto_fixable
  • Autofix enabled: yes
  • Autofix attempted this run: no
  • Auto-fixable failed commands:
    • lint
  • Failed commands with available automated fixes:
    • lint

Machine-readable artifacts

  • homeboy-lint-summary.json
  • homeboy-test-failures.json
  • homeboy-audit-summary.json
  • homeboy-autofixability.json

⚡ Scope: changed files only

lint (changed files only)

Audit

⚡ Scope: changed files only

audit (changed files only)

Auto-refactor

ℹ️ Autofix enabled, but no fixable file changes were produced

Failure Digest

Autofixability classification

  • Overall: human_needed
  • Autofix enabled: yes
  • Autofix attempted this run: no
  • Human-needed failed commands:
    • refactor --from all

Machine-readable artifacts

  • homeboy-lint-summary.json
  • homeboy-test-failures.json
  • homeboy-audit-summary.json
  • homeboy-autofixability.json

⚡ Scope: changed files only

refactor --from all

Tooling versions
  • Homeboy CLI: homeboy 0.85.3+a5999fe
  • Extension: rust from https://github.com/Extra-Chill/homeboy-extensions
  • Extension revision: unknown
  • Action: Extra-Chill/homeboy-action@v2

Homeboy Action v1
