diff --git a/.lad/.copilot-instructions.md b/.lad/.copilot-instructions.md index a17d77bab..a0c0ec543 100755 --- a/.lad/.copilot-instructions.md +++ b/.lad/.copilot-instructions.md @@ -1,6 +1,6 @@ # Global Copilot Instructions -* Prioritize **minimal scope**: only edit code directly implicated by the failing test. +* Prioritize **minimal scope**: only edit code directly implicated by the failing test. * Protect existing functionality: do **not** delete or refactor code outside the immediate test context. * Before deleting any code, follow the "Coverage & Code Safety" guidelines below. @@ -8,140 +8,82 @@ Copilot, do not modify any files under .lad/. All edits must occur outside .lad/, or in prompts/ when explicitly updating LAD itself. Coding & formatting -* Follow PEP 8; run Black. +* Follow PEP 8; formatting enforced by pre-commit hooks (black, isort). * Use type hints everywhere. -* External dependencies limited to numpy, pandas, requests. -* Target Python 3.11. +* Respect existing project dependencies declared in pyproject.toml. Testing & linting -* Write tests using component-appropriate strategy (see Testing Strategy below). -* Run flake8 with `--max-complexity=10`; keep complexity ≤ 10. +* Write tests using pytest; run via `tox -e py3` or `python -m pytest dandi`. +* Tests requiring the DANDI archive use Docker Compose fixtures. +* Mark AI-generated tests with `@pytest.mark.ai_generated`. * Every function/class **must** include a **NumPy-style docstring** (Sections: Parameters, Returns, Raises, Examples). 
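The testing bullets above can be sketched together in one test module; `parse_dandiset_id` is a hypothetical helper invented for illustration (not a real `dandi` function), and the `ai_generated` marker is assumed to be registered under the `[pytest]` section of `tox.ini` (unregistered markers only emit a warning):

```python
import pytest


def parse_dandiset_id(raw: str) -> str:
    """Extract the six-digit dandiset identifier from a ``DANDI:`` string.

    Parameters
    ----------
    raw : str
        Identifier such as ``"DANDI:000027"`` or a bare ``"27"``.

    Returns
    -------
    str
        The zero-padded six-digit identifier.

    Raises
    ------
    ValueError
        If ``raw`` does not contain a numeric identifier.

    Examples
    --------
    >>> parse_dandiset_id("DANDI:000027")
    '000027'
    """
    # Keep only the part after "DANDI:" (if present), then validate it.
    ident = raw.split(":", 1)[-1]
    if not ident.isdigit():
        raise ValueError(f"not a dandiset id: {raw!r}")
    return ident.zfill(6)


@pytest.mark.ai_generated
def test_parse_dandiset_id() -> None:
    assert parse_dandiset_id("DANDI:000027") == "000027"
    assert parse_dandiset_id("27") == "000027"
    with pytest.raises(ValueError):
        parse_dandiset_id("DANDI:draft")
```

Marked tests can then be selected or excluded with `python -m pytest dandi -m ai_generated` (or `-m "not ai_generated"`).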
## Testing Strategy by Component Type -**API Endpoints & Web Services:** -* Use **integration testing** - import the real FastAPI/Django/Flask app -* Mock only external dependencies (databases, external APIs, file systems) -* Test actual HTTP routing, validation, serialization, and error handling -* Verify real request/response behavior and framework integration +**CLI Commands:** +* Use click's `CliRunner` for testing CLI entry points +* Test argument parsing, output formatting, and error messages +* Mock API calls and filesystem where appropriate -**Business Logic & Algorithms:** -* Use **unit testing** - mock all dependencies completely -* Test logic in complete isolation, focus on edge cases -* Maximize test speed and reliability -* Test pure business logic without framework concerns +**API Client Operations (upload, download, move, etc.):** +* Use **integration testing** with Docker Compose fixtures for archive interactions +* Mock only external services not under test +* Test actual HTTP interactions, authentication, and error handling -**Data Processing & Utilities:** +**File Processing & Utilities:** * Use **unit testing** with minimal dependencies -* Use test data fixtures for predictable inputs +* Use test data fixtures (tmp_path, simple NWB files) for predictable inputs * Focus on input/output correctness and error handling ## Regression Prevention **Before making changes:** -* Run full test suite to establish baseline: `pytest -q --tb=short` +* Run full test suite to establish baseline: `tox -e py3` or `python -m pytest dandi` * Identify dependencies: `grep -r "function_name" . 
--include="*.py"` * Understand impact scope before modifications **During development:** -* Run affected tests after each change: `pytest -q tests/test_modified_module.py` +* Run affected tests after each change: `python -m pytest dandi/tests/test_modified_module.py` * Preserve public API interfaces or update all callers * Make minimal changes focused on the failing test **Before commit:** -* Run full test suite: `pytest -q --tb=short` +* Run full test suite: `tox -e py3` * Verify no regressions introduced * Ensure test coverage maintained or improved -## Code Quality Setup (One-time per project) +## Code Quality Setup -**1. Install quality tools:** -```bash -pip install flake8 pytest coverage radon flake8-radon black -``` - -**2. Configure .flake8 file in project root:** -```ini -[flake8] -max-complexity = 10 -radon-max-cc = 10 -exclude = - __pycache__, - .git, - .lad, - .venv, - venv, - build, - dist -``` - -**3. Configure .coveragerc file (see kickoff prompt for template)** - -**4. Verify setup:** -```bash -flake8 --version # Should show flake8-radon plugin -radon --version # Confirm radon installation -pytest --cov=. --version # Confirm coverage plugin -``` - -## Installing & Configuring Radon +**This project already has quality tooling configured.** Do not create new config files; use existing ones. -**Install Radon and its Flake8 plugin:** +**Verify setup:** ```bash -pip install radon flake8-radon +pre-commit install # Install pre-commit hooks if not present +tox -e lint # Run linting +tox -e typing # Run type checking +python -m pytest dandi # Run tests ``` -This installs Radon's CLI and enables the `--radon-max-cc` option in Flake8. -**Enable Radon in Flake8** by adding to `.flake8` or `setup.cfg`: -```ini -[flake8] -max-complexity = 10 -radon-max-cc = 10 -``` -Functions exceeding cyclomatic complexity 10 will be flagged as errors (C901). 
- -**Verify Radon raw metrics:** -```bash -radon raw path/to/your/module.py -``` -Outputs LOC, LLOC, comments, blank lines—helping you spot oversized modules quickly. - -**(Optional) Measure Maintainability Index:** -```bash -radon mi path/to/your/module.py -``` -Gives a 0–100 score indicating code maintainability. +**Existing configuration locations:** +- **Linting/formatting**: `.pre-commit-config.yaml` (black, isort, flake8) +- **Pytest config**: `tox.ini` under `[pytest]` section +- **Type checking**: `tox.ini` under `[testenv:typing]` +- **Dependencies**: `pyproject.toml` Coverage & Code Safety -* For safety checks, do **not** run coverage inside VS Code. - Instead, ask the user: - > "Please run in your terminal: - > ```bash - > coverage run -m pytest [test_files] -q && coverage html - > ``` - > then reply **coverage complete**." - * Before deleting code, verify: 1. 0% coverage via `coverage report --show-missing` - 2. Absence from Level-2 API docs - If both hold, prompt: - - Delete ? (y/n) - Reason: 0% covered and not documented. - (Tip: use VS Code "Find All References" on .) + 2. No references found via grep + If both hold, prompt for confirmation before deletion. Commits -* Use Conventional Commits. Example: - `feat(pipeline-filter): add ROI masking helper` -* Keep body as bullet list of sub-tasks completed. +* Follow existing project conventions for commit messages. +* pre-commit hooks will auto-fix formatting; if commit fails due to auto-fixes, re-run the commit. Docs -* High-level docs live under the target project's `docs/` and are organised in three nested levels using `
` tags. +* High-level docs live under the target project's `docs/` directory (Sphinx RST format). * After completing each **main task** (top-level checklist item), run: - • `flake8 {{PROJECT_NAME}} --max-complexity=10` - • `python -m pytest --cov={{PROJECT_NAME}} --cov-context=test -q --maxfail=1` + • `tox -e lint` + • `python -m pytest dandi -q --maxfail=1` If either step fails, pause for user guidance. - -* **Radon checks:** Use `radon raw ` to get SLOC; use `radon mi ` to check maintainability. If `raw` LOC > 500 or MI < 65, propose splitting the module. diff --git a/.lad/CLAUDE.md b/.lad/CLAUDE.md index 1fa510f06..5228ad6f4 100755 --- a/.lad/CLAUDE.md +++ b/.lad/CLAUDE.md @@ -4,10 +4,10 @@ *Auto-updated by LAD workflows - current system understanding* ## Code Style Requirements -- **Docstrings**: NumPy-style required for all functions/classes -- **Linting**: Flake8 compliance (max-complexity 10) -- **Testing**: TDD approach, component-aware strategies -- **Coverage**: 90%+ target for new code +- **Docstrings**: NumPy-style for public APIs +- **Formatting**: Black (line length 100), isort (profile="black"), enforced via pre-commit +- **Linting**: `tox -e lint`; type checking: `tox -e typing` +- **Testing**: pytest via `tox -e py3`; Docker Compose for integration tests ## Communication Guidelines **Objective, European-Style Communication**: @@ -26,9 +26,10 @@ - **Progress tracking**: Update both TodoWrite and plan.md files consistently ## Testing Strategy Guidelines -- **API Endpoints**: Integration testing (real app + mocked external deps) -- **Business Logic**: Unit testing (complete isolation + mocks) -- **Data Processing**: Unit testing (minimal deps + test fixtures) +- **CLI Commands**: click CliRunner + mocked API calls +- **API Client Operations**: Integration testing with Docker Compose fixtures +- **File Processing & Utilities**: Unit testing (tmp_path + test data fixtures) +- **AI-generated tests**: Mark with `@pytest.mark.ai_generated` ## Project 
Structure Patterns *Learned from exploration - common patterns and conventions* @@ -46,6 +47,8 @@ ### Token Optimization for Large Codebases **Standard test commands:** +- **Full suite**: `tox -e py3` or `python -m pytest dandi` +- **Single test**: `python -m pytest dandi/tests/test_file.py::test_function -v` - **Large test suites**: Use `2>&1 | tail -n 100` for pytest commands to capture only final results/failures - **Coverage reports**: Use `tail -n 150` for comprehensive coverage output to include summary - **Keep targeted tests unchanged**: Single test runs (`pytest -xvs`) don't need redirection @@ -94,4 +97,4 @@ - *No anti-patterns logged* --- -*Last updated by Claude Code LAD Framework* \ No newline at end of file +*Last updated by Claude Code LAD Framework* diff --git a/.lad/LAD_RECIPE.md b/.lad/LAD_RECIPE.md index 390bfdd12..9438a923b 100755 --- a/.lad/LAD_RECIPE.md +++ b/.lad/LAD_RECIPE.md @@ -3,11 +3,11 @@ > **Goal**: Provide repeatable workflows for implementing complex Python features iteratively and safely. > > **Two Optimized Approaches:** -> +> > ## 🚀 Claude Code Workflow (Recommended for 2025) > **3-phase autonomous workflow optimized for command-line development** > 1. **Autonomous Context & Planning** — Dynamic codebase exploration + TDD planning -> 2. **Iterative Implementation** — TDD loop with continuous quality monitoring +> 2. **Iterative Implementation** — TDD loop with continuous quality monitoring > 3. 
**Quality & Finalization** — Self-review + comprehensive validation > > ## 🛠️ GitHub Copilot Chat Workflow (VSCode) @@ -39,7 +39,7 @@ │ ├── 04b_test_analysis_framework.md # 🆕 Pattern recognition │ ├── 04c_test_improvement_cycles.md # 🆕 PDCA methodology │ └── 04d_test_session_management.md # 🆕 Session continuity -├── copilot_prompts/ # 🛠️ Copilot Chat workflow +├── copilot_prompts/ # 🛠️ Copilot Chat workflow │ ├── 00_feature_kickoff.md │ ├── 01_context_gathering.md │ ├── 02_plan_feature.md @@ -56,14 +56,14 @@ │ ├── 05_code_review_package.md │ └── 06_self_review_with_chatgpt.md └── .vscode/ # optional for Copilot workflow - ├── settings.json + ├── settings.json └── extensions.json ``` Import the complete `.lad/` directory into any target project once on main. -* Target Python 3.11. -* Commit messages follow Conventional Commits. +* Target the Python versions supported by the project (see pyproject.toml). +* Commit messages follow project conventions. * All generated docs follow the *plain summary + nested `
`* convention. --- @@ -102,10 +102,10 @@ Import the complete `.lad/` directory into any target project once on main. | **4c. Test Improvement Cycles** | `claude_prompts/04c_test_improvement_cycles.md` | ~30-60 min | PDCA cycles, TodoWrite integration, systematic implementation with validation | | **4d. Test Session Management** | `claude_prompts/04d_test_session_management.md` | ~5-10 min | Session continuity, context optimization, adaptive decision framework | -**Key Benefits**: +**Key Benefits**: - 🎯 **Autonomous execution** — Minimal intervention points with autonomous tool usage - ⚡ **3-5x faster development** — Autonomous execution with real-time feedback -- 🔄 **Continuous quality** — Integrated testing and regression prevention +- 🔄 **Continuous quality** — Integrated testing and regression prevention - 📊 **Progress visibility** — TodoWrite integration for status tracking - 🛡️ **Quality assurance** — Comprehensive validation and testing - 🔬 **Systematic improvement** — PDCA cycles for test quality optimization @@ -113,7 +113,7 @@ Import the complete `.lad/` directory into any target project once on main. ### 2.4 Claude Code Workflow Features -**Autonomous Context Gathering**: +**Autonomous Context Gathering**: - Uses Task/Glob/Grep tools for codebase exploration - No need to manually open files or navigate directories - Dynamic context based on feature requirements @@ -206,7 +206,7 @@ Import the complete `.lad/` directory into any target project once on main. **Common Anti-Patterns to Avoid**: - ❌ Starting implementation without baseline testing -- ❌ Running multiple tasks in_progress simultaneously +- ❌ Running multiple tasks in_progress simultaneously - ❌ Skipping validation steps in test improvement cycles - ❌ Not using `/compact` when context becomes unwieldy - ❌ Manual context management instead of using LAD session state @@ -279,7 +279,7 @@ Import the complete `.lad/` directory into any target project once on main. 
**Usage Pattern**: ```python -# Initialize comprehensive test analysis environment +# Initialize comprehensive test analysis environment # Purpose: Systematic test quality improvement for solo programmers # Methodology: PDCA cycles with holistic pattern recognition @@ -293,14 +293,14 @@ categorized_failures = aggregate_failure_patterns_across_categories(test_results **Splitting Benefits:** - **Foundation-First**: Core models and infrastructure implemented first -- **Domain Separation**: Security, performance, and API concerns handled separately +- **Domain Separation**: Security, performance, and API concerns handled separately - **Context Inheritance**: Each sub-plan builds on previous implementations - **Manageable Scope**: Each sub-plan stays ≤6 tasks, ≤25 sub-tasks **Sub-Plan Structure:** - `plan_0a_foundation.md` - Core models, job management, infrastructure - `plan_0b_{{domain}}.md` - Business logic, pipeline integration -- `plan_0c_interface.md` - API endpoints, external interfaces +- `plan_0c_interface.md` - API endpoints, external interfaces - `plan_0d_security.md` - Security, performance, compatibility **Context Evolution:** As each sub-plan completes, context files for subsequent sub-plans are updated with new APIs, interfaces, and integration points, ensuring later phases have complete system visibility. 
@@ -309,36 +309,36 @@ categorized_failures = aggregate_failure_patterns_across_categories(test_results **LAD uses component-appropriate testing strategies** to ensure both comprehensive coverage and efficient development: -**API Endpoints & Web Services:** -- **Integration Testing**: Import and test the real FastAPI/Django/Flask app -- **Mock External Dependencies**: Only databases, external APIs, file systems -- **Test Framework Behavior**: HTTP routing, validation, serialization, error handling -- **Why**: APIs are integration points - the framework behavior is part of what you're building - -**Business Logic & Algorithms:** -- **Unit Testing**: Mock all dependencies, test in complete isolation -- **Focus**: Edge cases, error conditions, algorithmic correctness -- **Benefits**: Fast execution, complete control, reliable testing -- **Why**: Pure logic should be testable without external concerns - -**Data Processing & Utilities:** -- **Unit Testing**: Minimal dependencies, test data fixtures -- **Focus**: Input/output correctness, transformation accuracy +**CLI Commands:** +- **Click CliRunner Testing**: Test CLI entry points with click's test runner +- **Mock External Dependencies**: API calls, filesystem, network +- **Test Behavior**: Argument parsing, output formatting, error messages, exit codes +- **Why**: CLI is the user-facing interface - test user workflows end-to-end + +**API Client & Operations (upload, download, move, etc.):** +- **Integration Testing**: Use Docker Compose fixtures for DANDI archive interactions +- **Mock External Dependencies**: Only services not under test +- **Test Behavior**: HTTP interactions, authentication, error handling, retries +- **Why**: Operations involve real API interactions that need integration coverage + +**File Processing & Utilities:** +- **Unit Testing**: Minimal dependencies, test data fixtures (tmp_path, NWB files) +- **Focus**: Input/output correctness, metadata extraction, validation - **Benefits**: Predictable 
test data, isolated behavior verification -**Example - API Testing:** +**Example - CLI Testing:** ```python -# ✅ Integration testing for API endpoints -from myapp.app import create_app # Real app +# ✅ Integration testing for CLI commands +from click.testing import CliRunner from unittest.mock import patch -def test_api_endpoint(): - app = create_app() - with patch('myapp.database.get_user') as mock_db: # Mock external deps - mock_db.return_value = {"id": 1, "name": "test"} - client = TestClient(app) # Test real routing/validation - response = client.get("/api/users/1") - assert response.status_code == 200 +from dandi.cli.command import main + +def test_cli_command(): + runner = CliRunner() + with patch('dandi.dandiapi.DandiAPIClient') as mock_client: + result = runner.invoke(main, ["ls", "DANDI:000027"]) + assert result.exit_code == 0 ``` --- @@ -441,8 +441,8 @@ The agent may run commands (push, commit), but will: ## 9 ⚙️ Settings & Linting -* Lint using **Flake8**. -* Commit messages follow **Conventional Commits**. +* Lint and format via **pre-commit hooks** (black, isort, flake8). +* Run linting: `tox -e lint`; type checking: `tox -e typing`. * Docstrings follow **NumPy style**. 
--- @@ -500,7 +500,7 @@ The agent may run commands (push, commit), but will: **Knowledge Accumulation Patterns**: - **Successful approaches**: Preserve working patterns in CLAUDE.md - **Failed approaches**: Document what to avoid and why -- **User preferences**: Learn decision patterns for framework adaptation +- **User preferences**: Learn decision patterns for framework adaptation - **Process optimization**: Compound improvement across multiple sessions **Context File Organization**: @@ -547,4 +547,4 @@ Enjoy faster, safer feature development with comprehensive test quality improvem - **90%+ test success rates** through systematic improvement - **Seamless session resumption** across interruptions and context switches -This enhanced LAD framework represents the culmination of real-world usage patterns, systematic test improvement methodologies, and cross-session productivity optimization for solo programmers working on complex research software. \ No newline at end of file +This enhanced LAD framework represents the culmination of real-world usage patterns, systematic test improvement methodologies, and cross-session productivity optimization for solo programmers working on complex research software. diff --git a/.lad/claude_prompts/00_feature_kickoff.md b/.lad/claude_prompts/00_feature_kickoff.md index 9050b5d81..6f26dabb6 100755 --- a/.lad/claude_prompts/00_feature_kickoff.md +++ b/.lad/claude_prompts/00_feature_kickoff.md @@ -5,19 +5,19 @@ You are Claude, an expert software architect setting up a robust development env **Autonomous Capabilities**: File operations (Read, Write, Edit), command execution (Bash), environment validation, and configuration setup. -**Quality Standards**: +**Quality Standards**: - Flake8 compliance (max-complexity 10) - Test coverage ≥90% for new code - NumPy-style docstrings required - Conventional commit standards -**Objectivity Guidelines**: +**Objectivity Guidelines**: - Challenge assumptions - Ask "How do I know this is true?" 
- State limitations clearly - "I cannot verify..." or "This assumes..." - Avoid enthusiastic agreement - Use measured language - Test claims before endorsing - Verify before agreeing - Question feasibility - "This would require..." or "The constraint is..." -- Admit uncertainty - "I'm not confident about..." +- Admit uncertainty - "I'm not confident about..." - Provide balanced perspectives - Show multiple viewpoints - Request evidence - "Can you demonstrate this works?" @@ -38,81 +38,43 @@ You are Claude, an expert software architect setting up a robust development env - Validate framework integrity (don't modify `.lad/` contents) 2. **Python Environment**: - - Check Python version (3.11+ required) + - Check Python version matches project's supported versions (see pyproject.toml) - Verify required packages are installable - - Test basic development tools + - Test basic development tools (tox, pre-commit) 3. **Git Repository**: - Confirm we're in a git repository - Check current branch status - Verify clean working directory or document current state -### Step 2: Quality Standards Setup +### Step 2: Quality Standards Verification -**Create/verify quality configuration files**: +**Verify existing quality configuration** (do NOT create new config files if they already exist): -1. **Flake8 Configuration** (`.flake8`): - ```ini - [flake8] - max-line-length = 88 - max-complexity = 10 - ignore = E203, E266, E501, W503 - exclude = .git,__pycache__,docs/,build/,dist/,.lad/ - ``` - -2. **Coverage Configuration** (`.coveragerc`): - ```ini - [run] - branch = True - source = . - omit = - */tests/* - */test_* - */__pycache__/* - */.* - .lad/* - setup.py - */venv/* - */env/* - - [report] - show_missing = True - skip_covered = False - - [html] - directory = coverage_html - ``` +1. **Pre-commit hooks**: Check `.pre-commit-config.yaml` exists; run `pre-commit install` if hooks not installed +2. **Linting config**: Verify via `tox -e lint` +3. 
**Pytest config**: Check `[pytest]` section in `tox.ini` +4. **Type checking**: Verify via `tox -e typing` -3. **Pytest Configuration** (add to `pytest.ini` or `pyproject.toml` if missing): - ```ini - [tool:pytest] - testpaths = tests - python_files = test_*.py - python_classes = Test* - python_functions = test_* - addopts = --strict-markers --strict-config - markers = - slow: marks tests as slow (deselect with '-m "not slow"') - integration: marks tests as integration tests - ``` +**Only create configuration files for NEW projects that lack them.** ### Step 3: Baseline Quality Assessment **Establish current state**: 1. **Test Suite Baseline**: ```bash - pytest --collect-only # Count existing tests - pytest -q --tb=short # Run existing tests + python -m pytest dandi --collect-only # Count existing tests + python -m pytest dandi -q --tb=short # Run existing tests ``` 2. **Coverage Baseline**: ```bash - pytest --cov=. --cov-report=term-missing --cov-report=html + python -m pytest dandi --cov=dandi --cov-report=term-missing ``` 3. **Code Quality Baseline**: ```bash - flake8 --statistics + tox -e lint ``` 4. 
**Document Baseline**: @@ -200,7 +162,7 @@ You are Claude, an expert software architect setting up a robust development env - Feature context is prepared for autonomous implementation - All tools and configurations are functional -**Important**: +**Important**: - Never modify files in `.lad/` folder - this contains the framework - All feature work goes in `docs/` folder - Preserve existing project structure and configurations @@ -209,4 +171,4 @@ You are Claude, an expert software architect setting up a robust development env ### Next Phase After successful kickoff, proceed to Phase 1: Autonomous Context Planning using `.lad/claude_prompts/01_autonomous_context_planning.md` - \ No newline at end of file + diff --git a/.lad/claude_prompts/04_test_quality_systematic.md b/.lad/claude_prompts/04_test_quality_systematic.md index 2d2b1ec36..5e9033ebe 100755 --- a/.lad/claude_prompts/04_test_quality_systematic.md +++ b/.lad/claude_prompts/04_test_quality_systematic.md @@ -12,7 +12,7 @@ You are Claude performing systematic test quality analysis and remediation with 2>&1 | tee full_output.txt | grep -iE "(warning|error|failed|exception|fatal|critical)" | tail -n 30; echo "--- FINAL OUTPUT ---"; tail -n 100 full_output.txt ``` -**Research Software Quality Standards**: +**Research Software Quality Standards**: - Scientific reproducibility maintained across test fixes - Test effectiveness prioritized over coverage metrics - Research impact assessment for all test failures @@ -39,18 +39,17 @@ You are Claude performing systematic test quality analysis and remediation with **Intelligent Chunking Strategy**: ```bash # Category-based execution with proven chunk sizing -pytest tests/security/ -v --tb=short 2>&1 | tee security_results.txt | grep -E "(PASSED|FAILED|SKIPPED|ERROR|warnings|collected)" | tail -n 15 +# Simple/fast categories - run in full +pytest tests/{{category_1}}/ -v --tb=short 2>&1 | tee {{category_1}}_results.txt | grep -E 
"(PASSED|FAILED|SKIPPED|ERROR|warnings|collected)" | tail -n 15 -# Model registry chunking (large category) -pytest tests/model_registry/test_local*.py tests/model_registry/test_api*.py tests/model_registry/test_database*.py -v --tb=short 2>&1 | tee registry_chunk1.txt | tail -n 10 +# Large categories - split into logical chunks +pytest tests/{{large_category}}/test_subset1*.py tests/{{large_category}}/test_subset2*.py -v --tb=short 2>&1 | tee category_chunk1.txt | tail -n 10 -# Performance and tools (timeout-prone categories) -pytest tests/performance/ -v --tb=short 2>&1 | tee performance_results.txt | grep -E "(PASSED|FAILED|SKIPPED|ERROR)" | tail -n 10 -pytest tests/tools/ -v --tb=short 2>&1 | tee tools_results.txt | grep -E "(PASSED|FAILED|SKIPPED|ERROR)" | tail -n 10 +# Timeout-prone categories - smaller chunks or individual execution +pytest tests/{{slow_category}}/ -v --tb=short 2>&1 | tee slow_results.txt | grep -E "(PASSED|FAILED|SKIPPED|ERROR)" | tail -n 10 -# Integration and multi-user (complex categories) -pytest tests/integration/test_unified*.py tests/integration/test_cross*.py -v --tb=short 2>&1 | tee integration_chunk1.txt | tail -n 10 -pytest tests/multi-user-service/test_auth*.py tests/multi-user-service/test_workspace*.py -v --tb=short 2>&1 | tee multiuser_chunk1.txt | tail -n 10 +# Complex categories with setup requirements +pytest tests/{{complex_category}}/test_subset1*.py tests/{{complex_category}}/test_subset2*.py -v --tb=short 2>&1 | tee complex_chunk1.txt | tail -n 10 ``` **Comprehensive Baseline Establishment**: @@ -84,7 +83,7 @@ python -c " import re with open('all_failures.txt') as f: failures = f.readlines() - + # Group by failure types import_failures = [f for f in failures if 'import' in f.lower() or 'modulenotfound' in f.lower()] api_failures = [f for f in failures if 'attribute' in f.lower() or 'missing' in f.lower()] @@ -121,7 +120,7 @@ For each SKIPPED test, validate against multiple standards: - Research impact if fixed: 
[Scientific validity / Workflow / Performance / Cosmetic] **Enterprise Standard (85-95% pass rate expectation)**: -- Justified: [Y/N] + Reasoning +- Justified: [Y/N] + Reasoning - Business impact if fixed: [Critical / High / Medium / Low] **IEEE Testing Standard (Industry best practices)**: @@ -149,7 +148,7 @@ echo "# Test Quality Improvement Plan - $(date)" > notes/test_analysis/improveme **Priority Matrix (Enhanced for Solo Programmer)**: - **P1-CRITICAL**: Scientific validity + High impact/Low effort fixes -- **P2-HIGH**: System reliability + Quick wins enabling other fixes +- **P2-HIGH**: System reliability + Quick wins enabling other fixes - **P3-MEDIUM**: Performance + Moderate effort with clear value - **P4-LOW**: Cosmetic + High effort/Low value (defer or remove) @@ -179,7 +178,7 @@ echo "# Test Quality Improvement Plan - $(date)" > notes/test_analysis/improveme # Initialize test quality improvement TodoWrite TodoWrite tasks: 1. Infrastructure fixes (P1-CRITICAL): Import/dependency issues -2. API compatibility fixes (P1-P2): Method signature updates +2. API compatibility fixes (P1-P2): Method signature updates 3. Test design improvements (P2-P3): Brittle test redesign 4. Coverage gap filling (P3): Integration point testing 5. 
Configuration standardization (P4): Settings/path cleanup @@ -205,7 +204,7 @@ echo "# Test Fix Decision Analysis - {{fix_category}}" > notes/test_decisions/{{ # Targeted validation pytest tests/{{affected_category}}/ -v --tb=short 2>&1 | tail -n 20 -# Integration validation +# Integration validation python -c "import {{affected_module}}; print('Import successful')" # Regression prevention @@ -222,7 +221,7 @@ echo "## Baseline vs Current Status" >> test_health_report.md pytest --collect-only 2>&1 | grep "collected\|error" >> test_health_report.md # Category-wise success rates -for category in security model_registry integration performance tools; do +for category in $(find tests/ -mindepth 1 -maxdepth 1 -type d -printf '%f\n' | sort); do echo "### $category category:" >> test_health_report.md pytest tests/$category/ -q --tb=no 2>&1 | grep "passed\|failed\|skipped" >> test_health_report.md done @@ -235,18 +234,18 @@ done **TEST QUALITY IMPROVEMENT CYCLE COMPLETE** **Progress Summary**: -- Fixed: {{number}} test failures +- Fixed: {{number}} test failures - Success rate improvement: {{baseline}}% → {{current}}% - Priority fixes completed: {{P1_count}} P1, {{P2_count}} P2, {{P3_count}} P3 **Current Status**: -- Critical systems (Security/Model Registry): {{status}} +- Critical systems: {{status}} - Integration tests: {{status}} - Total test health: {{overall_percentage}}% **Remaining Issues**: - {{count}} P1-CRITICAL remaining -- {{count}} P2-HIGH remaining +- {{count}} P2-HIGH remaining - {{count}} P3-MEDIUM remaining - {{count}} justified skips (validated against industry standards) @@ -276,7 +275,7 @@ done **Success Criteria Thresholds** (Configurable based on context): - **Research Software**: >90% success for critical systems, >70% overall -- **Enterprise Standard**: >95% success for critical systems, >85% overall +- **Enterprise Standard**: >95% success for critical systems, >85% overall - **Solo Programmer**: 100% critical systems, >80% overall (realistic for
resource constraints) ### Coverage Integration Framework @@ -315,7 +314,7 @@ grep -n "missing coverage" coverage_{{module}}.txt ```bash # Save comprehensive session state echo "# Test Quality Session State - $(date)" > notes/session_state.md -echo "## TodoWrite Progress:" >> notes/session_state.md +echo "## TodoWrite Progress:" >> notes/session_state.md # [TodoWrite state documentation] echo "## Current PDCA Cycle:" >> notes/session_state.md @@ -364,7 +363,7 @@ echo "## Context for Resumption:" >> notes/session_state.md **Research Software Compliance**: - [ ] Scientific validity tests: 100% success -- [ ] Computational accuracy tests: 100% success +- [ ] Computational accuracy tests: 100% success - [ ] Research workflow tests: >95% success - [ ] Overall test collection: >90% success @@ -408,4 +407,4 @@ echo "## Context for Resumption:" >> notes/session_state.md 4. **Continuous Improvement Process**: Sustainable test maintenance procedures This enhanced framework combines research software rigor with enterprise-grade systematic improvement methodologies, adapted for solo programmer resource constraints while ensuring production-ready quality standards. - \ No newline at end of file + diff --git a/.lad/claude_prompts/04a_test_execution_infrastructure.md b/.lad/claude_prompts/04a_test_execution_infrastructure.md index 7c92da433..2c39d4af7 100755 --- a/.lad/claude_prompts/04a_test_execution_infrastructure.md +++ b/.lad/claude_prompts/04a_test_execution_infrastructure.md @@ -39,7 +39,7 @@ echo "## Direct References:" >> impact_analysis.md grep -r "$target_function" --include="*.py" . >> impact_analysis.md # Check import dependencies -echo "## Import Dependencies:" >> impact_analysis.md +echo "## Import Dependencies:" >> impact_analysis.md grep -r "from.*import.*$target_function\|import.*$target_function" --include="*.py" . 
>> impact_analysis.md # Identify calling patterns @@ -65,17 +65,8 @@ grep -r "$target_function" docs/API_REFERENCE.md docs/**/api*.md 2>/dev/null >> # Map critical system interactions echo "## Integration Points:" >> impact_analysis.md -# Statistical analysis pipeline interactions -grep -r "$target_function" emuses/**/statistical*.py emuses/**/analysis*.py 2>/dev/null >> impact_analysis.md - -# Model registry interactions -grep -r "$target_function" emuses/**/model_registry*.py emuses/**/registry*.py 2>/dev/null >> impact_analysis.md - -# Multi-user service compatibility -grep -r "$target_function" emuses/**/service*.py emuses/**/multi_user*.py 2>/dev/null >> impact_analysis.md - -# CLI and API endpoints -grep -r "$target_function" emuses/cli/*.py emuses/api/*.py 2>/dev/null >> impact_analysis.md +# Find all modules that reference the target function +grep -r "$target_function" --include="*.py" . 2>/dev/null >> impact_analysis.md ``` **4. Test Impact Prediction**: @@ -146,25 +137,13 @@ echo "git reset --hard $(git rev-parse HEAD)" >> impact_analysis.md **Immediate Validation** (run after each change): ```bash # Test affected categories immediately -pytest $(grep -l "$target_function" tests/**/*.py 2>/dev/null) -x --tb=short - -# Quick integration smoke test -python scripts/dev_test_runner.py - -# Verify documentation examples still work -python -c "exec(open('docs/examples/validate_examples.py').read())" 2>/dev/null || echo "No example validation script" +pytest $(grep -l "$target_function" dandi/tests/*.py 2>/dev/null) -x --tb=short ``` **Comprehensive Validation** (before committing): ```bash -# Full category testing for affected areas -affected_categories=$(grep -r "$target_function" tests/ --include="*.py" | cut -d'/' -f2 | sort -u | tr '\n' ' ') -for category in $affected_categories; do - pytest tests/$category/ -q --tb=short -done - -# Cross-integration validation -pytest tests/integration/ -k "$target_function" -v --tb=short 2>/dev/null || echo "No 
integration tests found" +# Full test suite validation +python -m pytest dandi -q --tb=short ``` ### ⚠️ **Emergency Rollback Procedure** @@ -188,30 +167,23 @@ echo "Recovery: Baseline restored, ready for alternative approach" >> impact_ana #### Intelligent Chunking Strategy (Timeout Prevention) -**Proven Chunk Sizing for Different Test Categories**: +**Chunk Sizing for Different Test Categories**: ```bash -# Security tests (typically fast, stable execution) -pytest tests/security/ -v --tb=short 2>&1 | tee security_results.txt | grep -E "(PASSED|FAILED|SKIPPED|ERROR|warnings|collected)" | tail -n 15 - -# Model registry (large category - requires chunking) -pytest tests/model_registry/test_local*.py tests/model_registry/test_api*.py tests/model_registry/test_database*.py -v --tb=short 2>&1 | tee registry_chunk1.txt | tail -n 10 - -pytest tests/model_registry/test_advanced*.py tests/model_registry/test_analytics*.py tests/model_registry/test_benchmarking*.py -v --tb=short 2>&1 | tee registry_chunk2.txt | tail -n 10 +# Fast unit tests (no Docker needed) +python -m pytest dandi/tests/test_utils.py dandi/tests/test_metadata.py -v --tb=short 2>&1 | tee unit_results.txt | tail -n 15 -# Integration tests (complex, potentially slow) -pytest tests/integration/test_unified*.py tests/integration/test_cross*.py -v --tb=short 2>&1 | tee integration_chunk1.txt | tail -n 10 +# CLI tests +python -m pytest dandi/tests/test_command.py -v --tb=short 2>&1 | tee cli_results.txt | tail -n 15 -# Performance tests (timeout-prone) -pytest tests/performance/ -v --tb=short 2>&1 | tee performance_results.txt | grep -E "(PASSED|FAILED|SKIPPED|ERROR)" | tail -n 10 +# Operation tests (may need Docker for some) +python -m pytest dandi/tests/test_download.py dandi/tests/test_upload.py -v --tb=short 2>&1 | tee operations_results.txt | tail -n 15 -# Tools and CLI (mixed complexity) -pytest tests/tools/ -v --tb=short 2>&1 | tee tools_results.txt | grep -E "(PASSED|FAILED|SKIPPED|ERROR)" | tail -n 10 +# 
Move/organize tests +python -m pytest dandi/tests/test_move.py dandi/tests/test_organize.py -v --tb=short 2>&1 | tee move_results.txt | tail -n 15 -pytest tests/enhanced-cli-typer/test_cli_integration.py tests/enhanced-cli-typer/test_service_client.py -v --tb=short 2>&1 | tee cli_chunk1.txt | tail -n 10 - -# Multi-user service (complex setup requirements) -pytest tests/multi-user-service/test_auth*.py tests/multi-user-service/test_workspace*.py -v --tb=short 2>&1 | tee multiuser_chunk1.txt | tail -n 10 +# Full suite (uses tox for reproducibility) +tox -e py3 2>&1 | tee full_results.txt | tail -n 50 ``` **Dynamic Chunk Size Guidelines**: @@ -246,12 +218,11 @@ with open('test_collection_baseline.txt') as f: echo "# Test Execution Baseline - $(date)" > test_execution_baseline.md # Execute and track each category -for category in security model_registry integration performance tools multi-user-service enhanced-cli-typer; do +for category in unit cli operations move; do echo "## $category Category Results" >> test_execution_baseline.md - if [ -f "${category}_results.txt" ] || ls ${category}_chunk*.txt 1> /dev/null 2>&1; then - # Aggregate results from category files - cat ${category}_*.txt 2>/dev/null | grep -E "(PASSED|FAILED|SKIPPED|ERROR)" | tail -n 5 >> test_execution_baseline.md - cat ${category}_*.txt 2>/dev/null | grep "===.*===" | tail -n 1 >> test_execution_baseline.md + if [ -f "${category}_results.txt" ]; then + grep -E "(PASSED|FAILED|SKIPPED|ERROR)" "${category}_results.txt" | tail -n 5 >> test_execution_baseline.md + grep "===.*===" "${category}_results.txt" | tail -n 1 >> test_execution_baseline.md else echo "Category not executed" >> test_execution_baseline.md fi @@ -275,7 +246,7 @@ python -c " import re with open('comprehensive_test_output.txt') as f: content = f.read() - + # Extract final summary lines that show totals summary_lines = [line for line in content.split('\n') if '=====' in line and ('passed' in line or 'failed' in line)] @@ -289,7 +260,7 
@@ for line in summary_lines: failed = re.findall(r'(\d+) failed', line) skipped = re.findall(r'(\d+) skipped', line) warnings = re.findall(r'(\d+) warning', line) - + if passed: total_passed += int(passed[0]) if failed: total_failed += int(failed[0]) if skipped: total_skipped += int(skipped[0]) @@ -340,7 +311,7 @@ echo "## Next Phase: Ready for analysis framework (04b)" >> test_context_summary **Readiness for Next Phase**: - [ ] `test_execution_baseline.md` contains category results -- [ ] `test_health_metrics.md` shows overall statistics +- [ ] `test_health_metrics.md` shows overall statistics - [ ] `comprehensive_test_output.txt` available for pattern analysis - [ ] Context preserved for analysis phase (04b) @@ -369,4 +340,4 @@ echo "## Next Phase: Ready for analysis framework (04b)" >> test_context_summary **Usage**: Complete this phase before proceeding to `04b_test_analysis_framework.md` for holistic pattern recognition and root cause analysis. This phase provides the robust foundation needed for systematic test improvement while ensuring efficient resource usage and timeout prevention. 
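The chunked-execution pattern these bash commands describe (run a group of test files with a hard timeout, log the full output, keep only the summary tail) can be sketched in Python. This is a minimal illustration only; the chunk groupings, test-file paths, and 120 s timeout are assumptions, not part of the patch or any project configuration:

```python
import subprocess
from pathlib import Path
from typing import Dict, List

# Hypothetical chunk groupings mirroring the bash commands above;
# the test-file paths are illustrative assumptions.
CHUNKS: Dict[str, List[str]] = {
    "unit": ["dandi/tests/test_utils.py", "dandi/tests/test_metadata.py"],
    "cli": ["dandi/tests/test_command.py"],
}

def tail_summary(output: str, n: int = 15) -> str:
    """Keep only the last *n* lines, mirroring `... | tail -n 15`."""
    return "\n".join(output.splitlines()[-n:])

def run_chunk(name: str, paths: List[str], timeout: int = 120) -> str:
    """Run one pytest chunk with a hard timeout; save full log, return summary."""
    cmd = ["python", "-m", "pytest", *paths, "-v", "--tb=short"]
    try:
        proc = subprocess.run(cmd, capture_output=True, text=True, timeout=timeout)
        output = proc.stdout + proc.stderr
    except subprocess.TimeoutExpired:
        output = f"TIMEOUT: chunk {name!r} exceeded {timeout}s"
    # Full log preserved for the later analysis phase, like `tee <name>_results.txt`
    Path(f"{name}_results.txt").write_text(output)
    return tail_summary(output)
```

The point of the design is that a hung chunk costs at most `timeout` seconds instead of stalling the whole baseline run, while the complete output still lands in a per-chunk log file for the analysis phase.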
- \ No newline at end of file + diff --git a/.lad/claude_prompts/04c_test_improvement_cycles.md b/.lad/claude_prompts/04c_test_improvement_cycles.md index 891010108..e263b018e 100755 --- a/.lad/claude_prompts/04c_test_improvement_cycles.md +++ b/.lad/claude_prompts/04c_test_improvement_cycles.md @@ -42,7 +42,7 @@ grep -r "$target_area" tests/ --include="*.py" | cut -d':' -f1 | sort -u >> cycl **Risk-Based Implementation Strategy**: - **Low Risk**: Test fixture improvements, test data corrections → Standard validation -- **Medium Risk**: Test logic changes, assertion updates → Focused category validation +- **Medium Risk**: Test logic changes, assertion updates → Focused category validation - **High Risk**: Core functionality fixes, algorithm changes → Comprehensive validation #### PDCA Integration with Risk Management @@ -124,13 +124,13 @@ TodoWrite initialization based on analysis results: # Implement first task in current cycle # Example implementation pattern: -echo "Starting implementation of: {{current_task}}" +echo "Starting implementation of: {{current_task}}" echo "PDCA Cycle {{N}}, DO Phase - Task {{M}}" > current_implementation_log.md # [Implement specific fix based on root cause analysis] # Infrastructure fix example: # - Update import statements -# - Fix dependency issues +# - Fix dependency issues # - Resolve environment setup # API compatibility fix example: @@ -138,7 +138,7 @@ echo "PDCA Cycle {{N}}, DO Phase - Task {{M}}" > current_implementation_log.md # - Fix parameter mismatches # - Resolve interface changes -# Test design fix example: +# Test design fix example: # - Update test expectations # - Fix brittle test logic # - Improve test reliability @@ -184,15 +184,15 @@ echo "## CHECK Phase Validation - Task: {{current_task}}" >> current_implementat # 1. Direct test validation pytest tests/{{affected_category}}/ -v --tb=short 2>&1 | tail -n 20 -# 2. Integration validation +# 2. 
Integration validation python -c "import {{affected_module}}; print('Import successful')" # 3. Regression prevention for critical systems -pytest tests/security/ tests/model_registry/test_local*.py -q --tb=short 2>&1 | tail -n 10 +pytest tests/{{critical_category}}/ -q --tb=short 2>&1 | tail -n 10 # 4. Update health metrics echo "### Validation Results:" >> current_implementation_log.md -echo "- Target tests now passing: {{Y_or_N}}" >> current_implementation_log.md +echo "- Target tests now passing: {{Y_or_N}}" >> current_implementation_log.md echo "- No regressions in critical systems: {{Y_or_N}}" >> current_implementation_log.md echo "- Integration points working: {{Y_or_N}}" >> current_implementation_log.md @@ -207,7 +207,7 @@ echo "- Integration points working: {{Y_or_N}}" >> current_implementation_log.md echo "# Updated Test Health Report - PDCA Cycle {{N}}" > cycle_{{N}}_health_report.md # Re-run key categories to measure improvement -for category in security model_registry integration performance tools; do +for category in $(find tests/ -mindepth 1 -maxdepth 1 -type d -printf '%f\n' | sort); do echo "## $category Category Status:" >> cycle_{{N}}_health_report.md if pytest tests/$category/ -q --tb=no 2>/dev/null; then pytest tests/$category/ -q --tb=no 2>&1 | grep -E "(passed|failed|skipped)" >> cycle_{{N}}_health_report.md @@ -238,7 +238,7 @@ echo "- Remaining P1-P2 issues: {{remaining_high_priority}}" >> cycle_{{N}}_heal - **Priority Fixes**: {{P1_completed}} P1, {{P2_completed}} P2 completed **Current Status**: -- **Critical Systems**: {{security_status}}, {{model_registry_status}}, {{integration_status}} +- **Critical Systems**: {{category_1_status}}, {{category_2_status}}, {{category_3_status}} - **Overall Health**: {{current_percentage}}% success rate - **Industry Compliance**: {{research_standard_status}}, {{enterprise_standard_status}} @@ -256,14 +256,14 @@ echo "- Remaining P1-P2 issues: {{remaining_high_priority}}" >> cycle_{{N}}_heal - Estimated 
effort: {{next_cycle_time_estimate}} - Target improvement: {{target_success_rate}}% -**B) 🔧 ADJUST APPROACH** - Modify strategy based on findings +**B) 🔧 ADJUST APPROACH** - Modify strategy based on findings - Will pause for approach refinement - Address: {{any_systemic_issues_discovered}} - Update: {{priority_matrix_or_batching_strategy}} - Reassess: {{resource_allocation_or_complexity}} **C) 📊 ADD COVERAGE ANALYSIS** - Integrate test coverage improvement - - Will run comprehensive coverage analysis + - Will run comprehensive coverage analysis - Identify: {{critical_code_gaps_requiring_tests}} - Balance: {{test_quality_vs_coverage_enhancement}} - Estimated scope: {{coverage_improvement_effort}} @@ -299,7 +299,7 @@ echo "- Completed this session: {{completed_tasks}}" >> notes/pdca_session_state echo "## TodoWrite State:" >> notes/pdca_session_state.md echo "- Total tasks: {{total_count}}" >> notes/pdca_session_state.md -echo "- Completed: {{completed_count}}" >> notes/pdca_session_state.md +echo "- Completed: {{completed_count}}" >> notes/pdca_session_state.md echo "- In progress: {{in_progress_count}}" >> notes/pdca_session_state.md echo "- Pending: {{pending_count}}" >> notes/pdca_session_state.md @@ -325,7 +325,7 @@ echo "# Essential Context for Continuation" > pdca_essential_context.md echo "## Current Achievement Level:" >> pdca_essential_context.md echo "- Success rate: {{current_percentage}}%" >> pdca_essential_context.md echo "- Industry standard compliance: {{status}}" >> pdca_essential_context.md -echo "- Critical systems status: {{security_registry_integration_status}}" >> pdca_essential_context.md +echo "- Critical systems status: {{critical_systems_status}}" >> pdca_essential_context.md echo "## Active PDCA Context:" >> pdca_essential_context.md echo "- Cycle: {{N}}, Phase: {{current_phase}}" >> pdca_essential_context.md @@ -353,7 +353,7 @@ cat cycle_*_health_report.md >> PROJECT_STATUS.md 2>/dev/null || true echo "# Coverage Analysis Integration - 
PDCA Cycle {{N}}" > coverage_integration_analysis.md # Run coverage for key modules -pytest --cov=emuses --cov-report=term-missing tests/ 2>&1 | tee comprehensive_coverage.txt +pytest --cov=. --cov-report=term-missing tests/ 2>&1 | tee comprehensive_coverage.txt # Identify critical functions with <80% coverage python -c " @@ -372,7 +372,7 @@ cat critical_coverage_gaps.txt >> coverage_integration_analysis.md echo "## Integration with Current Test Quality:" >> coverage_integration_analysis.md echo "- Current test success rate: {{percentage}}%" >> coverage_integration_analysis.md -echo "- Coverage enhancement opportunities: {{count}} critical gaps" >> coverage_integration_analysis.md +echo "- Coverage enhancement opportunities: {{count}} critical gaps" >> coverage_integration_analysis.md echo "- Resource allocation: {{balance_quality_fixes_vs_coverage}}" >> coverage_integration_analysis.md ``` @@ -418,4 +418,4 @@ echo "- Resource allocation: {{balance_quality_fixes_vs_coverage}}" >> coverage_ **Usage**: Execute PDCA cycles until target success criteria achieved, then proceed to `04d_test_session_management.md` for advanced session continuity and user decision optimization. This phase ensures systematic, measurable improvement toward 100% meaningful test success while maintaining productivity and preventing regressions. - \ No newline at end of file + diff --git a/.lad/copilot_prompts/04_implement_next_task.md b/.lad/copilot_prompts/04_implement_next_task.md index 75a288579..7b45c328d 100755 --- a/.lad/copilot_prompts/04_implement_next_task.md +++ b/.lad/copilot_prompts/04_implement_next_task.md @@ -6,7 +6,7 @@ You are Claude in Agent Mode. - After each task, update context files for subsequent sub-plans (e.g., update `context_0b_*.md` after 0a, etc.). - Track completion and integration for each sub-plan. On sub-plan completion, verify integration points and update the next sub-plan's context. -**Pre-flight Check:** +**Pre-flight Check:** 1. 
**Full regression test**: Run the complete test suite to establish baseline: ```bash pytest -q --tb=short @@ -22,9 +22,9 @@ You are Claude in Agent Mode. 3. **Coverage baseline**: Establish current coverage before changes: ```bash pytest --cov=. --cov-report=term-missing --tb=no -q | grep "TOTAL" - ``` + ``` -**Scope Guard:** Before making any edits, identify the minimal code region needed to satisfy the current failing test. Do **not** modify or delete code outside this region. +**Scope Guard:** Before making any edits, identify the minimal code region needed to satisfy the current failing test. Do **not** modify or delete code outside this region. **Regression Prevention:** 1. **Dependency Analysis**: Before changing any function/class, run: @@ -59,17 +59,17 @@ You are Claude in Agent Mode. Implement the **next unchecked task** only from the current sub-plan. **Workflow** -1. **Write the failing test first.** +1. **Write the failing test first.** **Testing Strategy by Component Type:** - • **API Endpoints & Web Services**: Use integration testing - import the real FastAPI/Django app, mock only external dependencies (databases, APIs, file systems). Test actual HTTP routing, validation, serialization, and error handling. + • **API Endpoints & Web Services**: Use integration testing - import the real application, mock only external dependencies (databases, APIs, file systems). Test actual HTTP routing, validation, serialization, and error handling. • **Business Logic & Algorithms**: Use unit testing - mock all dependencies, test logic in complete isolation, focus on edge cases. • **Data Processing & Utilities**: Use unit testing with minimal dependencies, use test data fixtures. - - • If you need to store intermediate notes or dependency maps, write them to `docs/_scratch/{{FEATURE_SLUG}}.md` and reference this file in subsequent sub-tasks. 
+ + • If you need to store intermediate notes or dependency maps, write them to `docs/_scratch/{{FEATURE_SLUG}}.md` and reference this file in subsequent sub-tasks. • If the next sub-task will touch >200 lines of code or >10 files, break it into 2–5 indented sub-sub-tasks in the plan, commit that plan update, then proceed with implementation. -2. **Modify minimal code** to pass the new test without breaking existing ones. -3. **Ensure NumPy-style docstrings** on all additions. +2. **Modify minimal code** to pass the new test without breaking existing ones. +3. **Ensure NumPy-style docstrings** on all additions. 4. **Run** `pytest -q` **repeatedly until green.** 4.5 **Continuous Regression Check**: After each code change, run a quick regression test: @@ -79,16 +79,16 @@ Implement the **next unchecked task** only from the current sub-plan. ``` If any existing tests fail, fix immediately before continuing. -5. **Update docs & plan**: - • If `SPLIT=true` or SUB_PLAN_ID is set → update any `docs/{{DOC_BASENAME}}_*` or `docs/context_{{SUB_PLAN_ID}}.md` files you previously created. - • Else → update `docs/{{DOC_BASENAME}}.md`. - • **Check the box** in your plan file (`plan_{{SUB_PLAN_ID}}.md` or `plan.md`): change the leading `- [ ]` on the task (and any completed sub-steps) you just implemented to `- [x]`. +5. **Update docs & plan**: + • If `SPLIT=true` or SUB_PLAN_ID is set → update any `docs/{{DOC_BASENAME}}_*` or `docs/context_{{SUB_PLAN_ID}}.md` files you previously created. + • Else → update `docs/{{DOC_BASENAME}}.md`. + • **Check the box** in your plan file (`plan_{{SUB_PLAN_ID}}.md` or `plan.md`): change the leading `- [ ]` on the task (and any completed sub-steps) you just implemented to `- [x]`. • **Update documentation**: - In each modified source file, ensure any new or changed functions/classes have NumPy-style docstrings. - If you've added new public APIs, append their signature/purpose to the Level 2 API table in your context doc(s). 
- Save all doc files (`docs/{{DOC_BASENAME}}.md` or split docs). -5.5 **Quality Gate** - • Run flake8 and quick coverage as described in .copilot-instructions.md. +5.5 **Quality Gate** + • Run flake8 and quick coverage as described in .copilot-instructions.md. • **Final regression test**: Run full test suite to ensure no regressions: ```bash pytest -q --tb=short @@ -96,10 +96,10 @@ Implement the **next unchecked task** only from the current sub-plan. • If violations or test failures, pause and show first 10 issues, ask user whether to fix now. 6. **Draft commit**: - * Header ↠ `feat({{FEATURE_SLUG}}): ` ← **one sub-task only** + * Header ↠ `feat({{FEATURE_SLUG}}): ` ← **one sub-task only** * Body ↠ bullet list of the sub-steps you just did. -7. **Show changes & await approval**: +7. **Show changes & await approval**: Output `git diff --stat --staged` and await user approval. **When you're ready** to commit and push, type **y**. Then run: diff --git a/.lad/copilot_prompts/04_test_quality_systematic.md b/.lad/copilot_prompts/04_test_quality_systematic.md index ff7d55059..9b4c35d8c 100755 --- a/.lad/copilot_prompts/04_test_quality_systematic.md +++ b/.lad/copilot_prompts/04_test_quality_systematic.md @@ -69,16 +69,16 @@ class TestFailure: def execute_test_chunk_with_timeout_prevention(test_category: str) -> Dict[str, any]: """ Execute test category using proven chunking strategy to prevent timeouts - + Args: - test_category: Category like 'security', 'model_registry', 'integration' - + test_category: Category like 'unit', 'functional', 'integration' + Returns: Dict containing test results and execution metadata - + Example usage: - # Test security category with comprehensive error capture - security_results = execute_test_chunk_with_timeout_prevention('security') + # Test a category with comprehensive error capture + results = execute_test_chunk_with_timeout_prevention('unit') """ # [Copilot will generate implementation based on this comment structure] pass @@ -86,19 
+86,19 @@ def execute_test_chunk_with_timeout_prevention(test_category: str) -> Dict[str, def aggregate_failure_patterns_across_categories(test_results: List[Dict]) -> Dict[TestFailureCategory, List[TestFailure]]: """ Perform holistic pattern recognition across ALL test failures - + Instead of analyzing failures sequentially, this function aggregates all failures first to identify: - Cascading failure patterns (one root cause affects multiple tests) - Cross-cutting concerns (similar issues across different modules) - Solution interaction opportunities (single fix resolves multiple issues) - + Args: test_results: List of test execution results from all categories - + Returns: Dictionary mapping failure categories to structured failure objects - + Implementation approach: 1. Extract all FAILED and ERROR entries from test outputs 2. Classify each failure using root cause taxonomy @@ -111,19 +111,19 @@ def aggregate_failure_patterns_across_categories(test_results: List[Dict]) -> Di def validate_test_against_industry_standards(test_failure: TestFailure) -> Dict[str, bool]: """ Multi-tier validation of test justification against industry standards - + Validates each test failure against: - Research Software Standard (30-60% baseline acceptable) - Enterprise Standard (85-95% expectation) - IEEE Testing Standard (industry best practices) - Solo Programmer Context (resource constraints) - + Args: test_failure: Structured test failure object - + Returns: Dictionary with justification status for each standard level - + Example output: { 'research_justified': True, @@ -142,22 +142,22 @@ def validate_test_against_industry_standards(test_failure: TestFailure) -> Dict[ def plan_phase_solution_optimization(failures: Dict[TestFailureCategory, List[TestFailure]]) -> Dict[str, any]: """ PLAN phase: Strategic solution planning with resource optimization - + Performs comprehensive solution interaction analysis: - Identifies fixes that can be batched together (compatible) - Maps 
dependency ordering (Fix A must complete before Fix B) - Assesses risk levels for regression prevention - Optimizes resource allocation for solo programmer context - + Priority Matrix (Enhanced for Solo Programmer): - P1-CRITICAL: Scientific validity + High impact/Low effort - P2-HIGH: System reliability + Quick wins enabling other fixes - P3-MEDIUM: Performance + Moderate effort with clear value - P4-LOW: Cosmetic + High effort/Low value (defer or remove) - + Args: failures: Categorized and structured test failures - + Returns: Implementation plan with optimized fix sequence """ @@ -167,18 +167,18 @@ def plan_phase_solution_optimization(failures: Dict[TestFailureCategory, List[Te def do_phase_systematic_implementation(implementation_plan: Dict) -> List[str]: """ DO phase: Execute fixes using optimized sequence - + Implementation strategy: 1. Quick wins first (high-impact/low-effort for momentum) 2. Dependency resolution (fixes that enable other fixes) 3. Batch compatible fixes (minimize context switching) 4. 
Risk management (high-risk fixes with validation) - + Integrates with TodoWrite-style progress tracking for session continuity - + Args: implementation_plan: Output from plan_phase_solution_optimization - + Returns: List of completed fix descriptions for check phase validation """ @@ -188,21 +188,21 @@ def do_phase_systematic_implementation(implementation_plan: Dict) -> List[str]: def check_phase_comprehensive_validation(completed_fixes: List[str]) -> Dict[str, any]: """ CHECK phase: Validate implementation with regression prevention - + Validation protocol: - Targeted validation for affected test categories - Integration validation (import testing) - Regression prevention for critical modules - Health metrics tracking (baseline vs current) - + Generates comparative health report: - Test collection success rate - Category-wise success rates - Critical system status validation - + Args: completed_fixes: List of fixes implemented in DO phase - + Returns: Comprehensive validation report with success metrics """ @@ -212,18 +212,18 @@ def check_phase_comprehensive_validation(completed_fixes: List[str]) -> Dict[str def act_phase_decision_framework(validation_report: Dict) -> str: """ ACT phase: Generate user decision prompt for next iteration - + Analyzes validation results and presents structured options: A) Continue cycles - Implement next priority fixes - B) Adjust approach - Modify strategy based on findings + B) Adjust approach - Modify strategy based on findings C) Add coverage analysis - Integrate coverage improvement D) Complete current level - Achieve target success threshold - + Provides specific metrics and recommendations for each option - + Args: validation_report: Output from check_phase_comprehensive_validation - + Returns: Formatted decision prompt string for user choice """ @@ -237,21 +237,21 @@ def act_phase_decision_framework(validation_report: Dict) -> str: def integrate_coverage_analysis_with_test_quality(module_name: str) -> Dict[str, any]: """ 
Coverage-driven test improvement using CoverUp-style methodology - + Links test failures to coverage gaps: - Identifies critical functions with <80% coverage requiring tests - Maps uncovered integration points to test failure patterns - Prioritizes test improvements by coverage impact - + Implementation approach: 1. Run coverage analysis for specified module 2. Parse coverage report for low-coverage functions 3. Cross-reference with existing test failures 4. Generate priority list for coverage-driven test creation - + Args: - module_name: Python module to analyze (e.g., 'emuses.model_registry') - + module_name: Python module to analyze (e.g., 'mypackage.submodule') + Returns: Coverage analysis with linked test improvement recommendations """ @@ -261,16 +261,16 @@ def integrate_coverage_analysis_with_test_quality(module_name: str) -> Dict[str, def generate_coverage_driven_tests(coverage_gaps: List[str], test_failures: List[TestFailure]) -> List[str]: """ Generate test code for critical coverage gaps - + Uses iterative improvement approach: - Focus on critical system components with <80% coverage - Prioritize uncovered integration points - Quality over quantity - meaningful tests vs coverage padding - + Args: coverage_gaps: List of functions/methods with insufficient coverage test_failures: Related test failures that might be coverage-related - + Returns: List of generated test code snippets ready for implementation """ @@ -284,15 +284,15 @@ def generate_coverage_driven_tests(coverage_gaps: List[str], test_failures: List def save_session_state_for_resumption(current_pdca_cycle: int, analysis_findings: Dict) -> None: """ Enhanced session state preservation for seamless resumption - + Saves comprehensive session state including: - Current PDCA cycle and phase - TodoWrite progress tracking - Analysis findings and patterns discovered - Critical context for next session - + Uses structured markdown files for human readability and tool parsing - + Args: current_pdca_cycle: 
Which PDCA iteration we're currently in analysis_findings: Key patterns and insights discovered @@ -303,13 +303,13 @@ def save_session_state_for_resumption(current_pdca_cycle: int, analysis_findings def load_session_state_and_resume() -> Dict[str, any]: """ Automatic session resumption with state detection - + Detects current state and determines next action: - Checks for existing TodoWrite tasks - Identifies current PDCA cycle phase - Loads previous analysis findings - Determines optimal resumption point - + Returns: Session state dictionary with resumption context """ @@ -319,16 +319,16 @@ def load_session_state_and_resume() -> Dict[str, any]: def optimize_context_for_token_efficiency(session_data: Dict) -> Dict[str, any]: """ Context optimization strategy for long-running sessions - + Implements equivalent of Claude's /compact command: - Identifies critical context to preserve - Archives resolved issues and outdated analysis - Maintains active analysis context - Saves detailed findings to permanent files - + Args: session_data: Current session context and analysis data - + Returns: Optimized context dictionary with preserved essentials """ @@ -350,13 +350,13 @@ test_analyzer = TestQualityAnalyzer() # Copilot will suggest class structure ### 2. Pattern Recognition ```python # Execute holistic pattern recognition across all test categories -# Aggregate failures from security, model_registry, integration, performance, tools +# Aggregate failures from all discovered test categories # Classify failures using root cause taxonomy: INFRASTRUCTURE, API_COMPATIBILITY, TEST_DESIGN, COVERAGE_GAPS, CONFIGURATION all_failures = aggregate_failure_patterns_across_categories(test_results) ``` -### 3. PDCA Cycle Execution +### 3. 
PDCA Cycle Execution ```python # PLAN: Strategic solution optimization for solo programmer context # Prioritize fixes: P1-CRITICAL (scientific validity), P2-HIGH (system reliability), P3-MEDIUM (performance), P4-LOW (cosmetic) @@ -402,4 +402,4 @@ session_state = load_session_state_and_resume() 5. **Context Provision**: Examples and usage patterns provided in function docstrings 6. **Explicit Parameter Documentation**: Clear argument descriptions help Copilot understand intent -This framework provides the same systematic test improvement capabilities as the Claude version while adapting to GitHub Copilot's strengths in function completion and comment-based prompting. \ No newline at end of file +This framework provides the same systematic test improvement capabilities as the Claude version while adapting to GitHub Copilot's strengths in function completion and comment-based prompting. diff --git a/.lad/copilot_prompts/04a_test_execution_infrastructure.md b/.lad/copilot_prompts/04a_test_execution_infrastructure.md index 803cd2175..f1ea2320f 100755 --- a/.lad/copilot_prompts/04a_test_execution_infrastructure.md +++ b/.lad/copilot_prompts/04a_test_execution_infrastructure.md @@ -40,34 +40,34 @@ class TestChunkSize(Enum): INDIVIDUAL = 1 # Timeout-prone tests def execute_test_chunk_with_timeout_prevention( - test_category: str, + test_category: str, chunk_size: Optional[int] = None, timeout_seconds: int = 120 ) -> TestExecutionResult: """ Execute test category using proven chunking strategy to prevent timeouts - + Implements intelligent chunking based on test category complexity: - - Security tests: 10-20 tests per chunk (fast, stable execution) - - Model registry: Split into logical chunks (local, API, database) + - Simple/unit tests: 10-20 tests per chunk (fast, stable execution) + - Large categories: Split into logical chunks by subgroup - Integration tests: 5-10 tests per chunk (complex setup) - Performance tests: Individual or small groups (timeout-prone) - + Args: 
-        test_category: Category like 'security', 'model_registry', 'integration'
+        test_category: Category like 'unit', 'functional', 'integration'
         chunk_size: Override default chunk size if needed
         timeout_seconds: Maximum execution time per chunk
-    
+
     Returns:
         TestExecutionResult with comprehensive execution metadata
-    
+
     Example usage:
-        # Execute security tests with optimized chunking
-        security_results = execute_test_chunk_with_timeout_prevention('security')
-        
-        # Execute model registry with custom chunking
-        registry_results = execute_test_chunk_with_timeout_prevention(
-            'model_registry',
+        # Execute unit tests with optimized chunking
+        unit_results = execute_test_chunk_with_timeout_prevention('unit')
+
+        # Execute functional tests with custom chunking
+        functional_results = execute_test_chunk_with_timeout_prevention(
+            'functional',
             chunk_size=8
         )
     """
@@ -82,16 +82,16 @@ def execute_test_chunk_with_timeout_prevention(
 def establish_comprehensive_test_baseline() -> Dict[str, TestExecutionResult]:
     """
     Create complete test inventory and execute baseline analysis
-    
+
     Performs comprehensive test discovery and categorization:
     - Test collection with error detection
     - Category-wise execution tracking
     - Health metrics establishment
     - Baseline statistics for comparison
-    
+
     Returns:
         Dictionary mapping test categories to execution results
-    
+
     Implementation approach:
     1. Run pytest --collect-only for complete test discovery
     2. Extract collection statistics and error rates
@@ -107,19 +107,19 @@ def aggregate_test_results_across_categories(
 ) -> Dict[str, any]:
     """
     Aggregate test execution results for comprehensive health analysis
-    
+
     Combines results from all test categories to provide:
     - Overall success rate calculations
     - Category-wise performance comparison
     - Health metrics trending
     - Execution efficiency analysis
-    
+
     Args:
         category_results: Results from all executed test categories
-    
+
     Returns:
         Comprehensive health metrics dictionary
-    
+
     Output structure:
     {
         'total_tests': int,
@@ -138,18 +138,18 @@ def generate_test_health_metrics_report(
 ) -> None:
     """
     Generate comprehensive test health report with baseline statistics
-    
+
     Creates structured markdown report containing:
     - Executive summary of test health
     - Category-wise success rates
     - Collection error analysis
     - Execution efficiency metrics
     - Baseline establishment confirmation
-    
+
     Args:
         aggregated_results: Output from aggregate_test_results_across_categories
         output_file: Path for generated health report
-    
+
     Report sections:
     1. Overall Statistics
     2. Category Performance Analysis
@@ -167,21 +167,21 @@ def optimize_test_execution_for_token_efficiency(
 ) -> Tuple[str, str]:
     """
     Execute tests with token-optimized output handling
-    
+
     Implements proven patterns for large test suite execution:
     - Comprehensive output capture with intelligent filtering
     - Error and warning prioritization
     - Summary extraction and preservation
     - Detailed logging for later analysis
-    
+
     Args:
         test_command: Complete pytest command to execute
         category: Test category for context-specific filtering
         max_output_lines: Maximum lines to return for immediate analysis
-    
+
     Returns:
         Tuple of (filtered_output, full_output_file_path)
-    
+
     Token optimization strategy:
     - Capture full output to file for comprehensive analysis
     - Filter critical information (errors, warnings, failures)
@@ -197,17 +197,17 @@ def save_execution_context_for_analysis_phase(
 ) -> None:
     """
     Preserve execution context for next phase (04b Analysis Framework)
-    
+
     Creates structured context files needed for pattern analysis:
     - test_execution_baseline.md: Category-wise results
     - test_health_metrics.md: Overall statistics
     - comprehensive_test_output.txt: Aggregated results
     - test_context_summary.md: Context preservation
-    
+
     Args:
         execution_results: Results from all test category executions
         health_metrics: Aggregated health analysis
-    
+
     Context preservation strategy:
     1. Structure results for pattern recognition
     2. Preserve baseline for comparison tracking
@@ -230,17 +230,17 @@ test_executor = TestExecutionInfrastructure()  # Copilot will suggest class stru
 ### 2. Category-Specific Execution
 
 ```python
-# Execute security tests with timeout prevention
-# Use proven chunk size for fast, stable security test execution
-# Generate comprehensive results with health metrics
-security_results = execute_test_chunk_with_timeout_prevention('security')
+# Execute unit tests with timeout prevention
+# Use proven chunk size for fast, stable test execution
+# Generate comprehensive results with health metrics
+unit_results = execute_test_chunk_with_timeout_prevention('unit')
 
-# Execute model registry tests with intelligent chunking
-# Split into logical groups: local, API, database tests
-# Handle complex setup requirements with appropriate timeouts
-registry_results = execute_test_chunk_with_timeout_prevention('model_registry')
+# Execute functional tests with intelligent chunking
+# Split into logical groups based on test category complexity
+# Handle complex setup requirements with appropriate timeouts
+functional_results = execute_test_chunk_with_timeout_prevention('functional')
 ```
 
 ### 3. Comprehensive Baseline Establishment
@@ -276,4 +276,4 @@ filtered_output, full_file = optimize_test_execution_for_token_efficiency(
 5. **Token Awareness**: Built-in optimization for large output handling
 6. **Context Preparation**: Structured output preparation for next phase
 
-This module provides the foundation for systematic test improvement while leveraging GitHub Copilot's strengths in function completion and structured development patterns.
\ No newline at end of file
+This module provides the foundation for systematic test improvement while leveraging GitHub Copilot's strengths in function completion and structured development patterns.
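The token-efficiency pattern that `optimize_test_execution_for_token_efficiency` documents above — capture everything to a file, hand back only the high-signal lines — can be sketched as a small pure helper. The function name, the marker list, and the line cap are illustrative assumptions, not an implementation mandated by the prompt:

```python
from pathlib import Path
from typing import List, Tuple


def filter_pytest_output(
    full_output: str,
    log_path: Path,
    max_output_lines: int = 40,
) -> Tuple[str, str]:
    """Persist full pytest output and return only high-signal lines.

    Parameters
    ----------
    full_output : str
        Complete captured stdout/stderr from a pytest run.
    log_path : Path
        File that receives the unfiltered output for later analysis.
    max_output_lines : int
        Cap on the number of lines returned for immediate review.

    Returns
    -------
    Tuple[str, str]
        (filtered_output, path_to_full_log)
    """
    # Keep the comprehensive record on disk for the analysis phase.
    log_path.write_text(full_output)
    # Heuristic markers (an assumption): failures, errors, warnings,
    # and pytest's final summary line ("... failed, ... passed ...").
    markers = ("FAILED", "ERROR", "WARNING", "passed", "failed", "error")
    keep: List[str] = [
        line for line in full_output.splitlines()
        if any(marker in line for marker in markers)
    ]
    return "\n".join(keep[:max_output_lines]), str(log_path)
```

The full log file stays available for deeper pattern analysis, while only the filtered slice is fed back into the token-constrained conversation.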
diff --git a/.lad/copilot_prompts/04c_test_improvement_cycles.md b/.lad/copilot_prompts/04c_test_improvement_cycles.md
index fda1fbea8..7e57db1cd 100755
--- a/.lad/copilot_prompts/04c_test_improvement_cycles.md
+++ b/.lad/copilot_prompts/04c_test_improvement_cycles.md
@@ -48,15 +48,15 @@ class ProgressTracker:
     def __init__(self):
         self.tasks: Dict[str, ImplementationTask] = {}
         self.cycles: List[PDCACycle] = []
-    
+
     def add_task(self, task: ImplementationTask) -> None:
         """Add task to progress tracking"""
         pass
-    
+
     def update_task_status(self, task_id: str, status: str) -> None:
         """Update task status with timestamp"""
         pass
-    
+
     def get_progress_summary(self) -> Dict[str, any]:
         """Generate current progress summary"""
         pass
@@ -67,20 +67,20 @@ def initialize_pdca_cycle_with_prioritized_tasks(
 ) -> Tuple[PDCACycle, ProgressTracker]:
     """
     PLAN Phase: Initialize PDCA cycle with strategic solution planning
-    
+
     Creates systematic implementation plan with TodoWrite-style tracking:
     - Priority-based task selection (P1-CRITICAL first)
     - Solution batching optimization for efficiency
     - Resource allocation and effort estimation
     - Success criteria definition with measurable outcomes
-    
+
     Args:
         implementation_context: Output from analysis framework (04b)
         cycle_number: Current PDCA cycle iteration
-    
+
     Returns:
         Tuple of (PDCACycle object, ProgressTracker instance)
-    
+
     PLAN phase implementation:
     1. Extract P1-CRITICAL and P2-HIGH tasks from context
     2. Identify compatible tasks for batching
@@ -88,7 +88,7 @@ def initialize_pdca_cycle_with_prioritized_tasks(
     4. Estimate effort and set realistic cycle scope
     5. Define success criteria and validation requirements
     6. Initialize TodoWrite progress tracking
-    
+
     Example task organization:
     P1-CRITICAL: Scientific validity + High impact/Low effort
     P2-HIGH: System reliability + Quick wins enabling other fixes
@@ -104,21 +104,21 @@ def execute_systematic_implementation_with_progress_tracking(
 ) -> Dict[str, any]:
     """
     DO Phase: Systematic implementation with real-time progress tracking
-    
+
     Executes fixes using optimized sequence and tracks progress:
     - Mark current task as in_progress before beginning work
     - Implement fixes based on root cause analysis and strategy
     - Document implementation decisions and approach
     - Update progress tracker in real-time
     - Handle dependencies and validation requirements
-    
+
     Args:
         pdca_cycle: Current PDCA cycle with selected tasks
         progress_tracker: TodoWrite-style progress tracking
-    
+
     Returns:
         Implementation results with completed tasks and metadata
-    
+
     DO phase implementation strategy:
     1. Process tasks in dependency order
     2. Mark each task in_progress before starting
@@ -130,7 +130,7 @@ def execute_systematic_implementation_with_progress_tracking(
     4. Document implementation approach and rationale
     5. Mark tasks completed only after successful implementation
     6. Handle blockers by creating new tasks or adjusting approach
-    
+
     Implementation patterns:
     Quick wins first (momentum building)
     Dependency resolution (unblock other work)
@@ -146,33 +146,33 @@ def perform_comprehensive_validation_with_regression_prevention(
 ) -> Dict[str, any]:
     """
     CHECK Phase: Comprehensive validation with regression prevention
-    
+
     Validates implementation results using systematic approach:
     - Targeted validation for affected test categories
     - Integration validation (import testing, basic functionality)
     - Regression prevention for critical systems
     - Health metrics update and comparison with baseline
-    
+
     Args:
         implementation_results: Output from DO phase execution
         pdca_cycle: Current PDCA cycle with success criteria
-    
+
     Returns:
         Comprehensive validation report with health metrics
-    
+
     CHECK phase validation protocol:
     1. Direct test validation: Run tests for implemented fixes
     2. Integration validation: Verify imports and basic functionality
     3. Regression testing: Ensure critical systems remain functional
     4. Health metrics update: Compare current vs baseline success rates
     5. Success criteria evaluation: Assess cycle objectives achievement
-    
+
     Validation levels:
     Immediate: Affected tests pass without errors
     Integration: Related modules import and function correctly
     System: Critical test categories maintain high success rates
     Baseline: Overall health metrics show improvement or stability
-    
+
     Health metrics tracking:
     - Test collection success rate
     - Category-wise success rate improvements
@@ -189,21 +189,21 @@ def generate_user_decision_framework_with_options(
 ) -> str:
     """
     ACT Phase: Generate structured user decision framework
-    
+
     Analyzes validation results and presents strategic options:
     A) Continue cycles - Implement next priority fixes
-    B) Adjust approach - Modify strategy based on findings 
+    B) Adjust approach - Modify strategy based on findings
     C) Add coverage analysis - Integrate coverage improvement
     D) Complete current level - Achieve target success threshold
-    
+
     Args:
         validation_report: Results from CHECK phase validation
         pdca_cycle: Completed PDCA cycle with results
        progress_tracker: Current progress state
-    
+
     Returns:
         Formatted decision prompt with specific recommendations
-    
+
     ACT phase decision framework:
     1. Analyze cycle completion and success metrics
     2. Assess remaining priority tasks and effort required
@@ -211,13 +211,13 @@ def generate_user_decision_framework_with_options(
     4. Present structured options with specific metrics
     5. Provide technical recommendation based on analysis
     6. Consider resource optimization for solo programmer context
-    
+
     Decision option details:
     A) CONTINUE: Next cycle focus, estimated effort, target improvement
     B) ADJUST: Strategy refinement needs, approach modifications
     C) COVERAGE: Coverage gap analysis, integration complexity
     D) COMPLETE: Achievement validation, resource optimization
-    
+
     User decision tracking:
     - Track choice patterns for preference learning
     - Optimize future decision presentations
@@ -233,26 +233,26 @@ def save_comprehensive_session_state_for_resumption(
 ) -> None:
     """
     Enhanced session state preservation for seamless resumption
-    
+
     Saves complete session state including:
     - Current PDCA cycle and phase
     - TodoWrite progress tracking state
     - Analysis findings and patterns discovered
     - Implementation decisions and approaches used
     - Critical context for next session continuation
-    
+
     Args:
         pdca_cycle: Current PDCA cycle state
         progress_tracker: TodoWrite progress tracking
        cycle_findings: Key insights and patterns discovered
-    
+
     Session state preservation:
     1. PDCA cycle progress: Which cycle, phase, tasks status
     2. TodoWrite state: All tasks with current status
     3. Key findings: Successful approaches, patterns discovered
     4. Implementation context: Decision rationale, approaches used
     5. Next session preparation: Immediate actions, context to load
-    
+
     File organization:
     - pdca_session_state.md: Comprehensive session overview
     - essential_context.md: Critical information for resumption
@@ -268,20 +268,20 @@ def integrate_coverage_analysis_with_pdca_cycles(
 ) -> Dict[str, any]:
     """
     Coverage-driven test enhancement integration (Option C)
-    
+
     Links test failures to coverage gaps for comprehensive improvement:
     - Identifies critical functions with <80% coverage
     - Maps uncovered integration points to test failure patterns
     - Prioritizes coverage improvements by impact and effort
     - Integrates coverage tasks into PDCA cycle framework
-    
+
     Args:
         current_implementation_context: Active PDCA cycle context
         coverage_focus_modules: Modules to analyze for coverage gaps
-    
+
     Returns:
         Enhanced implementation context with coverage-driven tasks
-    
+
     Coverage integration approach:
     1. Run coverage analysis for specified modules
     2. Identify critical gaps requiring test creation/improvement
@@ -289,7 +289,7 @@ def integrate_coverage_analysis_with_pdca_cycles(
     4. Prioritize coverage tasks by system criticality
     5. Integrate coverage tasks into existing PDCA framework
     6. Balance test quality fixes vs coverage enhancement
-    
+
     CoverUp-style methodology:
     - Focus on critical system components with low coverage
     - Prioritize uncovered integration points
@@ -305,20 +305,20 @@ def optimize_pdca_cycles_for_solo_programmer_efficiency(
 ) -> Dict[str, any]:
     """
     Resource optimization for solo programmer productivity
-    
+
     Optimizes PDCA cycle execution for individual developer constraints:
     - Time management and session length optimization
     - Context switching minimization through batching
     - Energy management and optimal task sequencing
     - Productivity pattern recognition and adaptation
-    
+
     Args:
         implementation_plan: Current PDCA cycle implementation plan
         resource_constraints: Developer time, energy, focus constraints
-    
+
     Returns:
         Optimized implementation plan for solo programmer efficiency
-    
+
     Solo programmer optimizations:
     1. Batch compatible fixes to minimize context switching
     2. Sequence tasks by complexity and energy requirements
@@ -326,7 +326,7 @@ def optimize_pdca_cycles_for_solo_programmer_efficiency(
     4. Prioritize high-impact/low-effort combinations
     5. Build momentum with quick wins before complex tasks
     6. Plan break timing and energy management
-    
+
     Efficiency strategies:
     - Start sessions with momentum-building quick wins
     - Group similar task types to maintain focus
@@ -418,7 +418,7 @@ save_comprehensive_session_state_for_resumption(
 
 enhanced_context = integrate_coverage_analysis_with_pdca_cycles(
     current_implementation_context,
-    ['emuses.model_registry', 'emuses.analysis', 'emuses.security']
+    ['mypackage.core', 'mypackage.utils', 'mypackage.api']
 )
 ```
 
@@ -432,4 +432,4 @@ enhanced_context = integrate_coverage_analysis_with_pdca_cycles(
 6. **Decision Framework**: Structured user decision support with metrics and recommendations
 7. **Validation Protocols**: Systematic regression prevention and health tracking
 
-This module ensures systematic, measurable improvement toward 100% meaningful test success while maintaining productivity and preventing regressions through structured PDCA cycles optimized for individual developer workflows.
\ No newline at end of file
+This module ensures systematic, measurable improvement toward 100% meaningful test success while maintaining productivity and preventing regressions through structured PDCA cycles optimized for individual developer workflows.
diff --git a/.lad/documentation_standards/MKDOCS_MATERIAL_FORMATTING_GUIDE.md b/.lad/documentation_standards/MKDOCS_MATERIAL_FORMATTING_GUIDE.md
index 67f494c34..b5f0df4ed 100755
--- a/.lad/documentation_standards/MKDOCS_MATERIAL_FORMATTING_GUIDE.md
+++ b/.lad/documentation_standards/MKDOCS_MATERIAL_FORMATTING_GUIDE.md
@@ -1,7 +1,7 @@
 # MkDocs Material Formatting Guide for Claude
 
-**Version**: 1.0 
-**Date**: 2025-08-17 
+**Version**: 1.0
+**Date**: 2025-08-17
 **Purpose**: LAD Framework documentation standards to prevent systematic markdown errors in MkDocs Material projects
 
 ---
@@ -167,7 +167,7 @@ def process_data():
 ```
 
 ```bash
-emuses analyze --input data.csv
+mycommand analyze --input data.csv
 ```
 
 ```yaml
@@ -269,5 +269,5 @@ Technical details for developers.
 
 ---
 
-*LAD Framework Documentation Standards v1.0* 
-*Research-based guidelines for error-free technical documentation*
\ No newline at end of file
+*LAD Framework Documentation Standards v1.0*
+*Research-based guidelines for error-free technical documentation*
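The aggregation step that `aggregate_test_results_across_categories` specifies — combining category-wise results into overall success rates — could take roughly this shape; the helper name and the `{'passed': …, 'failed': …}` input schema are assumptions for illustration, not the framework's required interface:

```python
from typing import Any, Dict


def aggregate_category_results(
    category_results: Dict[str, Dict[str, int]],
) -> Dict[str, Any]:
    """Combine per-category pass/fail counts into overall health metrics.

    Parameters
    ----------
    category_results : Dict[str, Dict[str, int]]
        Mapping of category name to {'passed': int, 'failed': int}.

    Returns
    -------
    Dict[str, Any]
        Overall totals plus a per-category success-rate breakdown.
    """
    total_passed = sum(r["passed"] for r in category_results.values())
    total_failed = sum(r["failed"] for r in category_results.values())
    total = total_passed + total_failed
    return {
        "total_tests": total,
        # Guard against an empty suite rather than dividing by zero.
        "overall_success_rate": total_passed / total if total else 0.0,
        "by_category": {
            name: r["passed"] / (r["passed"] + r["failed"])
            for name, r in category_results.items()
            if r["passed"] + r["failed"] > 0
        },
    }
```

A baseline comparison then reduces to diffing two such dictionaries between PDCA cycles.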