Releases: peteromallet/desloppify
v0.9.9
This release focuses on plan lifecycle robustness — fixing workflow deadlocks, auto-resolving stale issues, hardening the reconciliation pipeline, and replacing heuristics with explicit cluster semantics. It also includes C++ detector scoping improvements from a community contributor and several UX fixes that prevent agents from getting stuck mid-cycle.
366 files changed | 16 commits | 5,367 tests passing
Refactoring & Internal Cleanup
This release continues the pattern of tightening seams and reducing indirection across the codebase. Over half the 366 changed files are internal restructuring:
- Cluster and override → subpackages: cluster_ops_display.py, cluster_ops_manage.py, cluster_ops_reorder.py, cluster_update.py, and cluster_steps.py moved into a cluster/ subpackage. Same treatment for override_io.py, override_misc.py, override_skip.py, and override_resolve_* into override/.
- Holistic cluster accessors inlined: ~8 small wrapper files in context_holistic/ deleted (_clusters_complexity.py, _clusters_consistency.py, _clusters_dependency.py, _clusters_security.py, etc.) and inlined into their callers
- Plan sync pipeline extracted: new sync/pipeline.py and sync/phase_cleanup.py pulled out of the monolithic workflow, with reconcile.py renamed to scan_issue_reconcile.py and review import reconcile moved into sync/review_import.py
- Issue semantics centralized: new issue_semantics.py (~225 lines) consolidating classification logic that was previously scattered across multiple modules
- Plan reconcile simplified: scan/plan_reconcile.py cut from ~470 lines to ~200 by extracting shared logic into the engine layer
- Work queue snapshot overhaul: snapshot.py gained ~470 lines of phase-aware partitioning and ranking refinements, replacing ad-hoc ordering logic
- TS dead code removed: helpers_blocks.py and helpers_line_state.py deleted (~200 lines of unused smell detection helpers)
- Broad type/schema updates: issue type references and state schema types updated across 130+ files for consistency with the new issue semantics
Auto-Resolve Issues for Deleted Files
When a scan runs and a previously-flagged file no longer exists on disk, its open issues are now automatically set to auto_resolved with a clear note. Previously, issues for deleted files would remain open and pollute the work queue indefinitely — particularly painful in Rust projects where module reorganization is common. Closes #412.
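As a rough illustration of the behavior described above, the check could look like the following. This is a minimal sketch, not the project's code: the issue shape (dicts with "file", "status", and "note" keys), the function name, and the note text are all assumptions.

```python
from pathlib import Path


def auto_resolve_deleted(issues, root):
    """Mark open issues as auto_resolved when their file is gone from disk.

    `issues` is a hypothetical list of dicts; only the "auto_resolved"
    status name comes from the release notes.
    """
    resolved = []
    for issue in issues:
        if issue["status"] != "open":
            continue  # only open issues are candidates
        if not (Path(root) / issue["file"]).exists():
            issue["status"] = "auto_resolved"
            issue["note"] = "file no longer exists on disk"
            resolved.append(issue)
    return resolved
```

Issues in any other status are left untouched, so earlier manual decisions are not overwritten.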
Triage Deadlock Fix
Fixed a deadlock where triage was stale (new review issues arrived mid-cycle), but triage couldn't start because objective backlog was still open, and objective resolves were blocked because triage was stale. The fix detects this "pending behind objective backlog" state and allows objective work to continue while keeping review resolves gated. The banner now shows TRIAGE PENDING instead of nudging toward a triage command that can't run yet. Community contribution from @imetandy (#413).
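The gating rule can be pictured as a small decision function. The sketch below is illustrative only: the `Gate` tuple, the function name, and the behavior in the two non-pending branches are assumptions. Only the "pending behind objective backlog" case and the TRIAGE PENDING banner come from the notes above.

```python
from typing import NamedTuple, Optional


class Gate(NamedTuple):
    banner: Optional[str]
    objective_allowed: bool
    review_allowed: bool


def triage_gate(triage_stale, objective_backlog_open):
    """Decide which resolves are allowed for the current plan state."""
    if triage_stale and objective_backlog_open:
        # Pending behind objective backlog: objective work continues,
        # review resolves stay gated until triage can actually run.
        return Gate("TRIAGE PENDING", True, False)
    if triage_stale:
        # Assumption: with the backlog clear, triage runs before any resolve.
        return Gate(None, False, False)
    return Gate(None, True, True)
```

The old deadlock corresponds to returning "objective not allowed" in the first branch, which left no action that could ever clear the backlog.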
Batch Runner Stall Detection Fix
The review batch runner's stall detector was prematurely killing codex batches during their initialization phase — before any output file was written. This caused --import-run to fail with "missing result files for batches" errors. The stall detector now never declares a stall when no output file exists yet, while the hard timeout still catches truly hung batches. Closes #417 and #401.
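One way to express the corrected rule is to separate the stall check from the hard timeout. The following is a sketch under assumptions: the function names, the use of file modification time as the progress signal, and the parameters are hypothetical and not taken from the batch runner itself.

```python
import os
import time


def is_stalled(output_path, stall_seconds, now=None):
    """Report a stall only when an output file exists and has gone quiet."""
    if not os.path.exists(output_path):
        return False  # still initializing: never a stall
    now = time.time() if now is None else now
    return now - os.path.getmtime(output_path) > stall_seconds


def should_kill(output_path, started_at, stall_seconds, hard_timeout, now=None):
    """Combine the hard timeout with the stall check."""
    now = time.time() if now is None else now
    if now - started_at > hard_timeout:
        return True  # the hard timeout still catches truly hung batches
    return is_stalled(output_path, stall_seconds, now)
```

A batch that never writes any output is therefore stopped only by the hard timeout, never by the stall detector.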
Sequential Reconciliation Pipeline
Fixes a cluster tracker race condition on parallel updates. A new shared reconciliation pipeline runs all sync steps sequentially: subjective dimensions, auto-clustering, score communication, plan creation, triage, and lifecycle phase. This replaces the previous approach where parallel operations could produce inconsistent plan state.
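The idea of a single ordered pipeline can be sketched as below. The step names mirror the list above; the `reconcile` function, the `steps` mapping, and the plan representation are hypothetical stand-ins rather than the project's API.

```python
# Step order as listed in the release notes.
STEP_ORDER = [
    "subjective_dimensions",
    "auto_clustering",
    "score_communication",
    "plan_creation",
    "triage",
    "lifecycle_phase",
]


def reconcile(plan, steps):
    """Run every sync step in a fixed order, one at a time.

    `steps` maps a step name to a callable that takes the plan and
    returns the updated plan (an assumed interface).
    """
    for name in STEP_ORDER:
        plan = steps[name](plan)  # each step sees the previous step's result
    return plan
```

Because each step receives the output of the one before it, no two steps can write to the plan at the same time.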
Explicit Cluster Semantics
Clusters now carry explicit action_type (auto_fix, refactor, manual_fix, reorganize) and execution_policy (ephemeral_autopromote, planned_only) rather than relying on command-string sniffing. A new cluster_semantics.py module provides canonical semantic helpers, and the work queue uses these for phase-aware ordering instead of inferring intent from command strings.
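A minimal sketch of what explicit semantics might look like follows. The enum values are the ones named above; the class names, the `Cluster` dataclass, and the helper are assumptions, and the real cluster_semantics.py may be organized differently.

```python
from dataclasses import dataclass
from enum import Enum


class ActionType(str, Enum):
    AUTO_FIX = "auto_fix"
    REFACTOR = "refactor"
    MANUAL_FIX = "manual_fix"
    REORGANIZE = "reorganize"


class ExecutionPolicy(str, Enum):
    EPHEMERAL_AUTOPROMOTE = "ephemeral_autopromote"
    PLANNED_ONLY = "planned_only"


@dataclass(frozen=True)
class Cluster:
    name: str
    action_type: ActionType
    execution_policy: ExecutionPolicy


def is_autopromoted(cluster):
    """Read intent from the explicit field instead of a command string."""
    return cluster.execution_policy is ExecutionPolicy.EPHEMERAL_AUTOPROMOTE
```

With fields like these, queue ordering can branch on `action_type` and `execution_policy` directly, and renaming a command no longer changes how a cluster is classified.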
C++ Detector Scoping Improvements
Targeted fixes to the C++ plugin, contributed by @Dragoy (#415):
- Security findings scoped to first-party files: clang-tidy and cppcheck findings from vendor/external headers are now filtered out instead of being reported as project issues
- CMake-based test coverage mapping: CMakeLists.txt files are parsed for add_executable/add_library/target_sources to discover which source files a test target compiles, treating that as direct test coverage
- Unused-imports phase disabled for C++: the generic tree-sitter unused-import detector is unsound for #include semantics and now skips C++ projects
- Header extension support: _extract_import_name now handles .h, .hh, .hpp extensions correctly
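To make the CMake-based mapping concrete, here is a simplified sketch of extracting per-target source lists. It is not the plugin's parser: the regex, the function name, and the extension list are assumptions, and it ignores CMake variables, generator expressions, and comments.

```python
import re

# Matches add_executable(...), add_library(...), and target_sources(...).
_CMD_RE = re.compile(
    r"\b(add_executable|add_library|target_sources)\s*\(([^)]*)\)",
    re.IGNORECASE,
)
_SOURCE_EXTS = (".c", ".cc", ".cpp", ".cxx")


def parse_target_sources(cmake_text):
    """Map each CMake target name to the source files listed for it."""
    targets = {}
    for _command, args in _CMD_RE.findall(cmake_text):
        tokens = [t.strip('"') for t in args.split()]
        if not tokens:
            continue
        name, rest = tokens[0], tokens[1:]
        # Keywords such as STATIC or PRIVATE fall out of the extension filter.
        sources = [t for t in rest if t.endswith(_SOURCE_EXTS)]
        targets.setdefault(name, []).extend(sources)
    return targets
```

If a test target such as a hypothetical `core_tests` lists `src/core.cpp` among its sources, that file can then be treated as directly covered by the test target.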
Flexible Triage Attestations
Triage attestation validation for organize, enrich, and sense-check stages no longer requires literal cluster name references. Users can now provide substantive work-product descriptions as an alternative, making the triage workflow less rigid for both human and AI operators.
Triage Validation & Sense-Check Enhancements
- Sense-check stage gets a dedicated orchestrator with expanded prompts and evidence parsing
- Triage completion policy significantly enhanced with richer stage validation
- Stage prompt instruction blocks expanded for clearer agent guidance
- Evidence parsing extracted into a dedicated module
Other Improvements
- .gitignore reminder added to README setup instructions (#416)
- PyPI publish workflow push triggers restored while maintaining the main-branch gate
- Tweet release tests now properly stub the requests module for CI isolation
Community
Thanks to @imetandy for the triage deadlock fix and @Dragoy for the C++ detector scoping improvements. Issues and feedback from @guillaumejay, @wuurrd, @astappiev, @efstathiosntonas, @xliry, @kendonB, @WojciechBednarski, and @jakob1379 helped shape this release.
v0.9.8
This release adds full C++ and Rust language plugins, introduces two-phase review scoring, unified issue lifecycle status, anti-gaming safeguards, and delivers extensive triage validation, work queue, and cross-platform improvements — alongside continued code quality cleanup that removes 23 compat wrappers and tightens seams throughout the codebase.
609 files changed | 78 commits | 5,266 tests passing
C++ Language Support
C++ is now a full-depth language plugin with tree-sitter-based extraction, structural analysis, and tool-backed security scanning. The plugin includes:
- Function/class/include extraction from C++ source files
- Dependency graph analysis via #include graphs from compile_commands.json and Makefile projects
- Structural and coupling phases with cppcheck integration and batch issue scanning
- Security detection with normalized findings from cppcheck and clang-tidy
- Review surfaces, test coverage hooks, and move support
- 14 test files with fixtures for CMake and Makefile sample projects
Rust Language Support
Rust gains full plugin parity with 13 Rust-specific detectors, 3 auto-fixers, and deep cargo toolchain integration:
- 13 detectors across 6 modules: API surface, cargo policy, safety, smells, dependencies, and custom rules
- 3 auto-fixers: crate imports, cargo features, readme doctests
- Cargo tool integration: clippy, cargo check, rustdoc
- Rust-aware dependency graphing and test coverage mapping with inline #[cfg(test)] recognition
- 117 tests across 12 test files
Two-Phase Review Scoring
Holistic review is restructured into two distinct phases:
- Phase 1 — Observe: collect characteristics and defects without scoring
- Phase 2 — Judge: synthesize dimension character from observations, then score
Positive observations now persist as context insights with full provenance (added_at, source, positive: true), replacing ephemeral strengths. A new context_schema.json defines the review data framework.
Unified Issue Lifecycle Status
DEFERRED and TRIAGED_OUT are added to the Status enum so state is always authoritative for issue disposition. Previously, temporary and triaged-out skips left issue.status as "open", causing overcounting in plan rendering and queue surfaces. The change includes status migration on scan reconcile, new status icons, and updated plan rendering, and it surfaces history to reviewers with --retrospective True by default.
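The overcounting problem is easy to see in a small example. In this sketch the enum member names DEFERRED and TRIAGED_OUT come from the notes above, while the string values, the reduced member list, and the issue shape are assumptions.

```python
from enum import Enum


class Status(str, Enum):
    OPEN = "open"
    DEFERRED = "deferred"        # temporary skip (value is an assumption)
    TRIAGED_OUT = "triaged_out"  # skipped during triage (value is an assumption)


def open_count(issues):
    """Count issues that are still open according to their status alone."""
    return sum(1 for issue in issues if issue["status"] == Status.OPEN)
```

When skips are recorded in the status itself, a queue or plan view can count open work without consulting a separate skip list.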
Anti-Gaming Safeguards
Two targeted fixes prevent AI agent score-anchoring:
- Numeric target redacted from penalty messages — "matched target 95.0" replaced with "clustered on the scoring target" so agents cannot infer and anchor on the exact number
- Blind-review workflow surfaced in the very first penalty message (previously only after streak ≥ 2), pointing agents to the blind packet and overlay docs immediately
Triage Validation Overhaul
- Reflect dispositions are now binding for organize: a structured ReflectDisposition dataclass is parsed from the Coverage Ledger; organize validates that plan state matches every reflected disposition before submission
- Organize validation extracted into dedicated organize_policy.py, separated from batch context normalization
- Suggestion and evidence surfaced in show and cluster commands
- Completion flow, observe batches, and stage queue gained new focused submodules
Work Queue Decomposition
The monolithic _work_queue/core.py and lifecycle.py are split into 5 focused modules (models.py, inputs.py, selection.py, finalize.py, snapshot.py). The engine/plan_queue.py facade is deleted. A canonical QueueSnapshot provides phase-aware partitioning.
Cross-Platform Hardening
- Cross-platform state locking: fcntl on Unix, msvcrt on Windows for atomic state file persistence
- Windows tool argv parsing: generic tool commands now execute correctly on Windows
- Windows WinError 2 fix: codex exec spawning uses shutil.which() to resolve .cmd batch shims
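The locking approach in the first bullet follows a common pattern, sketched below. This is an illustration and not the project's implementation: the function names, the separate ".lock" file, and the write-then-rename step are assumptions. Note that msvcrt.LK_LOCK retries for a limited time and then raises OSError, whereas fcntl.flock blocks until the lock is free.

```python
import os
import sys
from contextlib import contextmanager

if sys.platform == "win32":
    import msvcrt

    def _lock(fd):
        msvcrt.locking(fd, msvcrt.LK_LOCK, 1)  # lock one byte at offset 0

    def _unlock(fd):
        msvcrt.locking(fd, msvcrt.LK_UNLCK, 1)
else:
    import fcntl

    def _lock(fd):
        fcntl.flock(fd, fcntl.LOCK_EX)  # exclusive advisory lock

    def _unlock(fd):
        fcntl.flock(fd, fcntl.LOCK_UN)


@contextmanager
def state_lock(lock_path):
    """Hold an exclusive lock on `lock_path` for the duration of the block."""
    fd = os.open(lock_path, os.O_RDWR | os.O_CREAT)
    try:
        _lock(fd)
        try:
            yield
        finally:
            _unlock(fd)
    finally:
        os.close(fd)


def save_state(path, text):
    """Write the state file under the lock, replacing it in one step."""
    with state_lock(path + ".lock"):
        tmp = path + ".tmp"
        with open(tmp, "w", encoding="utf-8") as fh:
            fh.write(text)
        os.replace(tmp, path)  # readers see the old file or the new one
```

Locking a sidecar file rather than the state file itself keeps the lock valid across the rename.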
Code Quality & Cleanup
- 23 compat wrapper files deleted (13 _framework/ wrappers, 8 context_holistic/ wrappers, 2 helpers/ wrappers), plus removal of SimpleNamespace fake-module antipatterns
- Go generated files now skipped during scanning, with improved import-run error messages
- Scan export and scoring impact crash paths fixed
- Rust workspace rustdoc execution repaired
- Extensive seam-tightening: holistic review prep, triage validation, scan sync workflow, language framework surface, tree-sitter runtime caches, and reporting/planning helpers all streamlined
v0.9.5
This release adds Julia language support, extends the tree-sitter framework, rebalances the health score toward subjective dimensions (now 75/25), and delivers a broad set of improvements to stability, triage process reliability, security hardening, and platform-specific fixes — alongside significant code quality cleanups that reduce indirection and remove over-extractions.
354+ files changed | 80+ commits | 5,022 tests passing
Julia Language Support
Julia is now a supported language. The initial plugin skeleton includes tree-sitter-based parsing and import resolution, following the same framework as the existing Python and TypeScript plugins.
Tree-sitter Framework Extensions
The tree-sitter framework gains functional specs and functional import resolvers — new extension points that language plugins can use to define analysis rules declaratively. These underpin the Julia plugin and will simplify future language additions.
Reviewer Finding Adjudication
Review subagents now receive structured access to judgment-required findings during batch construction. Per-detector finding counts are embedded in batches, concern signals carry fingerprints and finding IDs, and the CLI renders exploration commands (desloppify show <det> --no-budget). The concern signal cap is raised from 8 to 30 with overflow guidance, and the dismiss path is simplified to 2 fields.
Scoring Rebalance
Scoring weights shift to 75% subjective / 25% mechanical. Subjective design quality — review dimensions like naming, cohesion, and abstraction quality — is now the primary driver of the health score. All docs, reporting strings, tests, and snapshots updated to match.
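As a worked example of the new split, assume the health score is a plain weighted sum of two 0 to 100 component scores. That linear form, the function name, and the constants' names are assumptions; the actual aggregation may differ.

```python
SUBJECTIVE_WEIGHT = 0.75
MECHANICAL_WEIGHT = 0.25


def health_score(subjective, mechanical):
    """Combine the two component scores using the 75/25 weighting."""
    return SUBJECTIVE_WEIGHT * subjective + MECHANICAL_WEIGHT * mechanical
```

Under this assumption, a project with a subjective score of 80 and a perfect mechanical score of 100 lands at 85, so mechanical cleanliness alone can lift the result by at most 25 points.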
Judgment-Required Detectors Excluded from Auto-Clustering
Detectors marked needs_judgment (e.g., structural, dict_keys, smells, responsibility_cohesion) now return None from the clustering grouping key. This means they flow through the review process as mechanical evidence rather than being auto-grouped into plan tasks. The cluster strategy is simplified: special-case grouping by file, subtype, and detector for judgment-required issues has been removed entirely, along with unused parameters in generate_description.
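The effect of returning None from the grouping key can be sketched as follows. The detector names in the set are the examples given above; the function names, the issue shape, and the fallback of grouping by detector name are hypothetical.

```python
# Example detectors marked needs_judgment in the release notes.
NEEDS_JUDGMENT = {"structural", "dict_keys", "smells", "responsibility_cohesion"}


def grouping_key(issue):
    """Return a cluster key, or None for judgment-required detectors."""
    if issue["detector"] in NEEDS_JUDGMENT:
        return None
    return issue["detector"]  # assumed fallback: group by detector name


def auto_cluster(issues):
    """Group issue ids by key, leaving keyless issues for review."""
    clusters = {}
    for issue in issues:
        key = grouping_key(issue)
        if key is None:
            continue  # flows through review as mechanical evidence
        clusters.setdefault(key, []).append(issue["id"])
    return clusters
```

A single None check replaces the per-file, per-subtype, and per-detector special cases that the older strategy needed for these issues.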
Living Plan & Queue Overhaul
The next command now follows the living plan directly rather than computing priorities independently. The queue system was substantially rearchitected:
- Execution vs. backlog queues are now separate surfaces: next pulls from the execution queue (current cluster work), while the backlog queue shows what's upcoming.
- Queue lifecycle phases are explicitly persisted, giving clear visibility into where you are in the scan → triage → execute flow.
- Scan is a first-class postflight phase — the queue system recognizes and tracks post-scan state transitions.
- Queue output is labeled by surface so it's always clear which queue you're looking at.
- Prompts and coaching text aligned with the new execution queue semantics.
Triage & Review Improvements
- Triage dashboard fix: after completing triage, the dashboard no longer incorrectly shows "start with observe" restart guidance. Root cause was inferring status from empty triage_stages dict instead of checking triaged_ids.
- No-op triage completion allowed for empty review batches: completing triage without findings no longer errors.
- Staged triage flow consolidated — routing, validation, and state contracts tightened across the triage pipeline.
- Review rerun preflight scoped — rerun checks are now properly gated.
- Review dimension metadata made monkeypatchable — enables test and plugin customization.
- Stale wontfix review tails are now cleared on completion.
State Recovery & Resilience
- Triage state recovery from saved plans — if triage state is lost (e.g., from a crash), it can be reconstructed from the persisted plan.
- State recovery from saved plans with deduplication of update-skill operations.
- Plan recovery consolidated on scan_metadata, dropping the _saved_plan_recovery marker.
- scan_metadata schema simplified: inventory_available/metrics_available are now derived from scan source rather than stored as separate bools.
- Stale cluster focus cleared on completion and on skip/cluster mutations.
Security Hardening
- Subprocess command paths hardened for detectors — command resolution tightened.
- Controlled subprocess security seams tightened: removed # nosec comment noise and the copy-pasted _resolve_cli_executable helper that existed in 8 files.
- Source security findings hardened in detector outputs.
- Silent excepts removed from treesitter resolvers — errors are no longer swallowed.
- Defer policy key overwrites removed — policy values are now immutable once set.
Platform Fixes
- Windows WinError 2 fix (#383): codex subprocess spawning now uses shutil.which() to resolve .cmd batch shims on Windows. Also recognizes [WinError 2] in runner failure detection.
- Monorepo path validation fix (#387): the _PATH_RE regex in triage enrich was anchored on src/ and discarded leading path components like packages/backend/, causing false-positive path failures in monorepos.
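The monorepo failure mode can be reproduced with two small patterns. Both regexes below are hypothetical stand-ins written for illustration; the actual _PATH_RE is not shown in these notes.

```python
import re

# Anchored on src/: the match starts at "src/", so leading components are lost.
ANCHORED = re.compile(r"src/[\w./-]+")

# Relaxed: optional leading segments are kept as part of the match.
RELAXED = re.compile(r"(?:[\w.-]+/)*src/[\w./-]+")


def extract_path(pattern, text):
    """Return the first path-like match in `text`, or None."""
    match = pattern.search(text)
    return match.group(0) if match else None
```

For text mentioning `packages/backend/src/app.py`, the anchored pattern yields only `src/app.py`, a path that does not exist relative to the repository root, which is how a valid reference can be reported as a path failure.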
Code Quality & Cleanup
A major theme of this release is reversing over-extractions and reducing indirection. Several rounds of mechanical function splits were reverted:
- ~800+ net lines removed across 4 revert/inline commits — single-use helpers inlined back into their callers across triage stages, cluster display, queue flow, render, dispatch, and batch orchestration.
- Triage stage commands (observe/reflect/organize) now read top-to-bottom as linear validation chains with early returns.
- QueueRenderContext dataclass removed: explicit kwargs proved more readable.
- review_quality unified as the single canonical key (killed dual-key handling).
- Import score provenance metadata reduced from 10 fields to 4.
- Scan workflow imports tightened: direct imports from planning.scan instead of going through the package namespace.
- Remaining split-driver plan items bulk-skipped to prevent recurrence.
Tree-sitter Cleanup
- Import cycle broken in tree-sitter spec modules.
- Unused bridge modules deleted.
- Compatibility bridges centralized into a single location.
Contract & Registry Alignment
- Registry and subjective contracts aligned.
- Plugin scaffold contracts updated.
- TypeScript command surfaces normalized.
- Command registry annotations tightened.
- Schema drift payload builders normalized.
Test Improvements
- Over-mocked flow tests replaced with direct coverage.
- Direct coverage strengthened for import flows.
- New scan orchestration test verifying the planning scan surface integration.
- New review import support test helpers for plan sync runtime patching.
- Batch triage test helpers refreshed.
