diff --git a/.lad/.copilot-instructions.md b/.lad/.copilot-instructions.md index a17d77bab..a0c0ec543 100755 --- a/.lad/.copilot-instructions.md +++ b/.lad/.copilot-instructions.md @@ -1,6 +1,6 @@ # Global Copilot Instructions -* Prioritize **minimal scope**: only edit code directly implicated by the failing test. +* Prioritize **minimal scope**: only edit code directly implicated by the failing test. * Protect existing functionality: do **not** delete or refactor code outside the immediate test context. * Before deleting any code, follow the "Coverage & Code Safety" guidelines below. @@ -8,140 +8,82 @@ Copilot, do not modify any files under .lad/. All edits must occur outside .lad/, or in prompts/ when explicitly updating LAD itself. Coding & formatting -* Follow PEP 8; run Black. +* Follow PEP 8; formatting enforced by pre-commit hooks (black, isort). * Use type hints everywhere. -* External dependencies limited to numpy, pandas, requests. -* Target Python 3.11. +* Respect existing project dependencies declared in pyproject.toml. Testing & linting -* Write tests using component-appropriate strategy (see Testing Strategy below). -* Run flake8 with `--max-complexity=10`; keep complexity ≤ 10. +* Write tests using pytest; run via `tox -e py3` or `python -m pytest dandi`. +* Tests requiring the DANDI archive use Docker Compose fixtures. +* Mark AI-generated tests with `@pytest.mark.ai_generated`. * Every function/class **must** include a **NumPy-style docstring** (Sections: Parameters, Returns, Raises, Examples). 
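The testing bullets above can be sketched together in one test module; `parse_dandiset_id` is a hypothetical helper invented for illustration (not a real `dandi` function), and the `ai_generated` marker is assumed to be registered under the `[pytest]` section of `tox.ini` (unregistered markers only emit a warning):

```python
import pytest


def parse_dandiset_id(raw: str) -> str:
    """Extract the six-digit dandiset identifier from a ``DANDI:`` string.

    Parameters
    ----------
    raw : str
        Identifier such as ``"DANDI:000027"`` or a bare ``"27"``.

    Returns
    -------
    str
        The zero-padded six-digit identifier.

    Raises
    ------
    ValueError
        If ``raw`` does not contain a numeric identifier.

    Examples
    --------
    >>> parse_dandiset_id("DANDI:000027")
    '000027'
    """
    # Keep only the part after "DANDI:" (if present), then validate it.
    ident = raw.split(":", 1)[-1]
    if not ident.isdigit():
        raise ValueError(f"not a dandiset id: {raw!r}")
    return ident.zfill(6)


@pytest.mark.ai_generated
def test_parse_dandiset_id() -> None:
    assert parse_dandiset_id("DANDI:000027") == "000027"
    assert parse_dandiset_id("27") == "000027"
    with pytest.raises(ValueError):
        parse_dandiset_id("DANDI:draft")
```

Marked tests can then be selected or excluded with `python -m pytest dandi -m ai_generated` (or `-m "not ai_generated"`).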
## Testing Strategy by Component Type -**API Endpoints & Web Services:** -* Use **integration testing** - import the real FastAPI/Django/Flask app -* Mock only external dependencies (databases, external APIs, file systems) -* Test actual HTTP routing, validation, serialization, and error handling -* Verify real request/response behavior and framework integration +**CLI Commands:** +* Use click's `CliRunner` for testing CLI entry points +* Test argument parsing, output formatting, and error messages +* Mock API calls and filesystem where appropriate -**Business Logic & Algorithms:** -* Use **unit testing** - mock all dependencies completely -* Test logic in complete isolation, focus on edge cases -* Maximize test speed and reliability -* Test pure business logic without framework concerns +**API Client Operations (upload, download, move, etc.):** +* Use **integration testing** with Docker Compose fixtures for archive interactions +* Mock only external services not under test +* Test actual HTTP interactions, authentication, and error handling -**Data Processing & Utilities:** +**File Processing & Utilities:** * Use **unit testing** with minimal dependencies -* Use test data fixtures for predictable inputs +* Use test data fixtures (tmp_path, simple NWB files) for predictable inputs * Focus on input/output correctness and error handling ## Regression Prevention **Before making changes:** -* Run full test suite to establish baseline: `pytest -q --tb=short` +* Run full test suite to establish baseline: `tox -e py3` or `python -m pytest dandi` * Identify dependencies: `grep -r "function_name" . 
--include="*.py"` * Understand impact scope before modifications **During development:** -* Run affected tests after each change: `pytest -q tests/test_modified_module.py` +* Run affected tests after each change: `python -m pytest dandi/tests/test_modified_module.py` * Preserve public API interfaces or update all callers * Make minimal changes focused on the failing test **Before commit:** -* Run full test suite: `pytest -q --tb=short` +* Run full test suite: `tox -e py3` * Verify no regressions introduced * Ensure test coverage maintained or improved -## Code Quality Setup (One-time per project) +## Code Quality Setup -**1. Install quality tools:** -```bash -pip install flake8 pytest coverage radon flake8-radon black -``` - -**2. Configure .flake8 file in project root:** -```ini -[flake8] -max-complexity = 10 -radon-max-cc = 10 -exclude = - __pycache__, - .git, - .lad, - .venv, - venv, - build, - dist -``` - -**3. Configure .coveragerc file (see kickoff prompt for template)** - -**4. Verify setup:** -```bash -flake8 --version # Should show flake8-radon plugin -radon --version # Confirm radon installation -pytest --cov=. --version # Confirm coverage plugin -``` - -## Installing & Configuring Radon +**This project already has quality tooling configured.** Do not create new config files; use existing ones. -**Install Radon and its Flake8 plugin:** +**Verify setup:** ```bash -pip install radon flake8-radon +pre-commit install # Install pre-commit hooks if not present +tox -e lint # Run linting +tox -e typing # Run type checking +python -m pytest dandi # Run tests ``` -This installs Radon's CLI and enables the `--radon-max-cc` option in Flake8. -**Enable Radon in Flake8** by adding to `.flake8` or `setup.cfg`: -```ini -[flake8] -max-complexity = 10 -radon-max-cc = 10 -``` -Functions exceeding cyclomatic complexity 10 will be flagged as errors (C901). 
- -**Verify Radon raw metrics:** -```bash -radon raw path/to/your/module.py -``` -Outputs LOC, LLOC, comments, blank lines—helping you spot oversized modules quickly. - -**(Optional) Measure Maintainability Index:** -```bash -radon mi path/to/your/module.py -``` -Gives a 0–100 score indicating code maintainability. +**Existing configuration locations:** +- **Linting/formatting**: `.pre-commit-config.yaml` (black, isort, flake8) +- **Pytest config**: `tox.ini` under `[pytest]` section +- **Type checking**: `tox.ini` under `[testenv:typing]` +- **Dependencies**: `pyproject.toml` Coverage & Code Safety -* For safety checks, do **not** run coverage inside VS Code. - Instead, ask the user: - > "Please run in your terminal: - > ```bash - > coverage run -m pytest [test_files] -q && coverage html - > ``` - > then reply **coverage complete**." - * Before deleting code, verify: 1. 0% coverage via `coverage report --show-missing` - 2. Absence from Level-2 API docs - If both hold, prompt: - - Delete ? (y/n) - Reason: 0% covered and not documented. - (Tip: use VS Code "Find All References" on .) + 2. No references found via grep + If both hold, prompt for confirmation before deletion. Commits -* Use Conventional Commits. Example: - `feat(pipeline-filter): add ROI masking helper` -* Keep body as bullet list of sub-tasks completed. +* Follow existing project conventions for commit messages. +* pre-commit hooks will auto-fix formatting; if commit fails due to auto-fixes, re-run the commit. Docs -* High-level docs live under the target project's `docs/` and are organised in three nested levels using `
` tags. +* High-level docs live under the target project's `docs/` directory (Sphinx RST format). * After completing each **main task** (top-level checklist item), run: - • `flake8 {{PROJECT_NAME}} --max-complexity=10` - • `python -m pytest --cov={{PROJECT_NAME}} --cov-context=test -q --maxfail=1` + • `tox -e lint` + • `python -m pytest dandi -q --maxfail=1` If either step fails, pause for user guidance. - -* **Radon checks:** Use `radon raw ` to get SLOC; use `radon mi ` to check maintainability. If `raw` LOC > 500 or MI < 65, propose splitting the module. diff --git a/.lad/CLAUDE.md b/.lad/CLAUDE.md index 1fa510f06..5228ad6f4 100755 --- a/.lad/CLAUDE.md +++ b/.lad/CLAUDE.md @@ -4,10 +4,10 @@ *Auto-updated by LAD workflows - current system understanding* ## Code Style Requirements -- **Docstrings**: NumPy-style required for all functions/classes -- **Linting**: Flake8 compliance (max-complexity 10) -- **Testing**: TDD approach, component-aware strategies -- **Coverage**: 90%+ target for new code +- **Docstrings**: NumPy-style for public APIs +- **Formatting**: Black (line length 100), isort (profile="black"), enforced via pre-commit +- **Linting**: `tox -e lint`; type checking: `tox -e typing` +- **Testing**: pytest via `tox -e py3`; Docker Compose for integration tests ## Communication Guidelines **Objective, European-Style Communication**: @@ -26,9 +26,10 @@ - **Progress tracking**: Update both TodoWrite and plan.md files consistently ## Testing Strategy Guidelines -- **API Endpoints**: Integration testing (real app + mocked external deps) -- **Business Logic**: Unit testing (complete isolation + mocks) -- **Data Processing**: Unit testing (minimal deps + test fixtures) +- **CLI Commands**: click CliRunner + mocked API calls +- **API Client Operations**: Integration testing with Docker Compose fixtures +- **File Processing & Utilities**: Unit testing (tmp_path + test data fixtures) +- **AI-generated tests**: Mark with `@pytest.mark.ai_generated` ## Project 
Structure Patterns *Learned from exploration - common patterns and conventions* @@ -46,6 +47,8 @@ ### Token Optimization for Large Codebases **Standard test commands:** +- **Full suite**: `tox -e py3` or `python -m pytest dandi` +- **Single test**: `python -m pytest dandi/tests/test_file.py::test_function -v` - **Large test suites**: Use `2>&1 | tail -n 100` for pytest commands to capture only final results/failures - **Coverage reports**: Use `tail -n 150` for comprehensive coverage output to include summary - **Keep targeted tests unchanged**: Single test runs (`pytest -xvs`) don't need redirection @@ -94,4 +97,4 @@ - *No anti-patterns logged* --- -*Last updated by Claude Code LAD Framework* \ No newline at end of file +*Last updated by Claude Code LAD Framework* diff --git a/.lad/LAD_RECIPE.md b/.lad/LAD_RECIPE.md index 390bfdd12..9438a923b 100755 --- a/.lad/LAD_RECIPE.md +++ b/.lad/LAD_RECIPE.md @@ -3,11 +3,11 @@ > **Goal**: Provide repeatable workflows for implementing complex Python features iteratively and safely. > > **Two Optimized Approaches:** -> +> > ## 🚀 Claude Code Workflow (Recommended for 2025) > **3-phase autonomous workflow optimized for command-line development** > 1. **Autonomous Context & Planning** — Dynamic codebase exploration + TDD planning -> 2. **Iterative Implementation** — TDD loop with continuous quality monitoring +> 2. **Iterative Implementation** — TDD loop with continuous quality monitoring > 3. 
**Quality & Finalization** — Self-review + comprehensive validation > > ## 🛠️ GitHub Copilot Chat Workflow (VSCode) @@ -39,7 +39,7 @@ │ ├── 04b_test_analysis_framework.md # 🆕 Pattern recognition │ ├── 04c_test_improvement_cycles.md # 🆕 PDCA methodology │ └── 04d_test_session_management.md # 🆕 Session continuity -├── copilot_prompts/ # 🛠️ Copilot Chat workflow +├── copilot_prompts/ # 🛠️ Copilot Chat workflow │ ├── 00_feature_kickoff.md │ ├── 01_context_gathering.md │ ├── 02_plan_feature.md @@ -56,14 +56,14 @@ │ ├── 05_code_review_package.md │ └── 06_self_review_with_chatgpt.md └── .vscode/ # optional for Copilot workflow - ├── settings.json + ├── settings.json └── extensions.json ``` Import the complete `.lad/` directory into any target project once on main. -* Target Python 3.11. -* Commit messages follow Conventional Commits. +* Target the Python versions supported by the project (see pyproject.toml). +* Commit messages follow project conventions. * All generated docs follow the *plain summary + nested `
`* convention. --- @@ -102,10 +102,10 @@ Import the complete `.lad/` directory into any target project once on main. | **4c. Test Improvement Cycles** | `claude_prompts/04c_test_improvement_cycles.md` | ~30-60 min | PDCA cycles, TodoWrite integration, systematic implementation with validation | | **4d. Test Session Management** | `claude_prompts/04d_test_session_management.md` | ~5-10 min | Session continuity, context optimization, adaptive decision framework | -**Key Benefits**: +**Key Benefits**: - 🎯 **Autonomous execution** — Minimal intervention points with autonomous tool usage - ⚡ **3-5x faster development** — Autonomous execution with real-time feedback -- 🔄 **Continuous quality** — Integrated testing and regression prevention +- 🔄 **Continuous quality** — Integrated testing and regression prevention - 📊 **Progress visibility** — TodoWrite integration for status tracking - 🛡️ **Quality assurance** — Comprehensive validation and testing - 🔬 **Systematic improvement** — PDCA cycles for test quality optimization @@ -113,7 +113,7 @@ Import the complete `.lad/` directory into any target project once on main. ### 2.4 Claude Code Workflow Features -**Autonomous Context Gathering**: +**Autonomous Context Gathering**: - Uses Task/Glob/Grep tools for codebase exploration - No need to manually open files or navigate directories - Dynamic context based on feature requirements @@ -206,7 +206,7 @@ Import the complete `.lad/` directory into any target project once on main. **Common Anti-Patterns to Avoid**: - ❌ Starting implementation without baseline testing -- ❌ Running multiple tasks in_progress simultaneously +- ❌ Running multiple tasks in_progress simultaneously - ❌ Skipping validation steps in test improvement cycles - ❌ Not using `/compact` when context becomes unwieldy - ❌ Manual context management instead of using LAD session state @@ -279,7 +279,7 @@ Import the complete `.lad/` directory into any target project once on main. 
**Usage Pattern**: ```python -# Initialize comprehensive test analysis environment +# Initialize comprehensive test analysis environment # Purpose: Systematic test quality improvement for solo programmers # Methodology: PDCA cycles with holistic pattern recognition @@ -293,14 +293,14 @@ categorized_failures = aggregate_failure_patterns_across_categories(test_results **Splitting Benefits:** - **Foundation-First**: Core models and infrastructure implemented first -- **Domain Separation**: Security, performance, and API concerns handled separately +- **Domain Separation**: Security, performance, and API concerns handled separately - **Context Inheritance**: Each sub-plan builds on previous implementations - **Manageable Scope**: Each sub-plan stays ≤6 tasks, ≤25 sub-tasks **Sub-Plan Structure:** - `plan_0a_foundation.md` - Core models, job management, infrastructure - `plan_0b_{{domain}}.md` - Business logic, pipeline integration -- `plan_0c_interface.md` - API endpoints, external interfaces +- `plan_0c_interface.md` - API endpoints, external interfaces - `plan_0d_security.md` - Security, performance, compatibility **Context Evolution:** As each sub-plan completes, context files for subsequent sub-plans are updated with new APIs, interfaces, and integration points, ensuring later phases have complete system visibility. 
@@ -309,36 +309,36 @@ categorized_failures = aggregate_failure_patterns_across_categories(test_results **LAD uses component-appropriate testing strategies** to ensure both comprehensive coverage and efficient development: -**API Endpoints & Web Services:** -- **Integration Testing**: Import and test the real FastAPI/Django/Flask app -- **Mock External Dependencies**: Only databases, external APIs, file systems -- **Test Framework Behavior**: HTTP routing, validation, serialization, error handling -- **Why**: APIs are integration points - the framework behavior is part of what you're building - -**Business Logic & Algorithms:** -- **Unit Testing**: Mock all dependencies, test in complete isolation -- **Focus**: Edge cases, error conditions, algorithmic correctness -- **Benefits**: Fast execution, complete control, reliable testing -- **Why**: Pure logic should be testable without external concerns - -**Data Processing & Utilities:** -- **Unit Testing**: Minimal dependencies, test data fixtures -- **Focus**: Input/output correctness, transformation accuracy +**CLI Commands:** +- **Click CliRunner Testing**: Test CLI entry points with click's test runner +- **Mock External Dependencies**: API calls, filesystem, network +- **Test Behavior**: Argument parsing, output formatting, error messages, exit codes +- **Why**: CLI is the user-facing interface - test user workflows end-to-end + +**API Client & Operations (upload, download, move, etc.):** +- **Integration Testing**: Use Docker Compose fixtures for DANDI archive interactions +- **Mock External Dependencies**: Only services not under test +- **Test Behavior**: HTTP interactions, authentication, error handling, retries +- **Why**: Operations involve real API interactions that need integration coverage + +**File Processing & Utilities:** +- **Unit Testing**: Minimal dependencies, test data fixtures (tmp_path, NWB files) +- **Focus**: Input/output correctness, metadata extraction, validation - **Benefits**: Predictable 
test data, isolated behavior verification -**Example - API Testing:** +**Example - CLI Testing:** ```python -# ✅ Integration testing for API endpoints -from myapp.app import create_app # Real app +# ✅ Integration testing for CLI commands +from click.testing import CliRunner from unittest.mock import patch -def test_api_endpoint(): - app = create_app() - with patch('myapp.database.get_user') as mock_db: # Mock external deps - mock_db.return_value = {"id": 1, "name": "test"} - client = TestClient(app) # Test real routing/validation - response = client.get("/api/users/1") - assert response.status_code == 200 +from dandi.cli.command import main + +def test_cli_command(): + runner = CliRunner() + with patch('dandi.dandiapi.DandiAPIClient') as mock_client: + result = runner.invoke(main, ["ls", "DANDI:000027"]) + assert result.exit_code == 0 ``` --- @@ -441,8 +441,8 @@ The agent may run commands (push, commit), but will: ## 9 ⚙️ Settings & Linting -* Lint using **Flake8**. -* Commit messages follow **Conventional Commits**. +* Lint and format via **pre-commit hooks** (black, isort, flake8). +* Run linting: `tox -e lint`; type checking: `tox -e typing`. * Docstrings follow **NumPy style**. 
--- @@ -500,7 +500,7 @@ The agent may run commands (push, commit), but will: **Knowledge Accumulation Patterns**: - **Successful approaches**: Preserve working patterns in CLAUDE.md - **Failed approaches**: Document what to avoid and why -- **User preferences**: Learn decision patterns for framework adaptation +- **User preferences**: Learn decision patterns for framework adaptation - **Process optimization**: Compound improvement across multiple sessions **Context File Organization**: @@ -547,4 +547,4 @@ Enjoy faster, safer feature development with comprehensive test quality improvem - **90%+ test success rates** through systematic improvement - **Seamless session resumption** across interruptions and context switches -This enhanced LAD framework represents the culmination of real-world usage patterns, systematic test improvement methodologies, and cross-session productivity optimization for solo programmers working on complex research software. \ No newline at end of file +This enhanced LAD framework represents the culmination of real-world usage patterns, systematic test improvement methodologies, and cross-session productivity optimization for solo programmers working on complex research software. diff --git a/.lad/claude_prompts/00_feature_kickoff.md b/.lad/claude_prompts/00_feature_kickoff.md index 9050b5d81..6f26dabb6 100755 --- a/.lad/claude_prompts/00_feature_kickoff.md +++ b/.lad/claude_prompts/00_feature_kickoff.md @@ -5,19 +5,19 @@ You are Claude, an expert software architect setting up a robust development env **Autonomous Capabilities**: File operations (Read, Write, Edit), command execution (Bash), environment validation, and configuration setup. -**Quality Standards**: +**Quality Standards**: - Flake8 compliance (max-complexity 10) - Test coverage ≥90% for new code - NumPy-style docstrings required - Conventional commit standards -**Objectivity Guidelines**: +**Objectivity Guidelines**: - Challenge assumptions - Ask "How do I know this is true?" 
- State limitations clearly - "I cannot verify..." or "This assumes..." - Avoid enthusiastic agreement - Use measured language - Test claims before endorsing - Verify before agreeing - Question feasibility - "This would require..." or "The constraint is..." -- Admit uncertainty - "I'm not confident about..." +- Admit uncertainty - "I'm not confident about..." - Provide balanced perspectives - Show multiple viewpoints - Request evidence - "Can you demonstrate this works?" @@ -38,81 +38,43 @@ You are Claude, an expert software architect setting up a robust development env - Validate framework integrity (don't modify `.lad/` contents) 2. **Python Environment**: - - Check Python version (3.11+ required) + - Check Python version matches project's supported versions (see pyproject.toml) - Verify required packages are installable - - Test basic development tools + - Test basic development tools (tox, pre-commit) 3. **Git Repository**: - Confirm we're in a git repository - Check current branch status - Verify clean working directory or document current state -### Step 2: Quality Standards Setup +### Step 2: Quality Standards Verification -**Create/verify quality configuration files**: +**Verify existing quality configuration** (do NOT create new config files if they already exist): -1. **Flake8 Configuration** (`.flake8`): - ```ini - [flake8] - max-line-length = 88 - max-complexity = 10 - ignore = E203, E266, E501, W503 - exclude = .git,__pycache__,docs/,build/,dist/,.lad/ - ``` - -2. **Coverage Configuration** (`.coveragerc`): - ```ini - [run] - branch = True - source = . - omit = - */tests/* - */test_* - */__pycache__/* - */.* - .lad/* - setup.py - */venv/* - */env/* - - [report] - show_missing = True - skip_covered = False - - [html] - directory = coverage_html - ``` +1. **Pre-commit hooks**: Check `.pre-commit-config.yaml` exists; run `pre-commit install` if hooks not installed +2. **Linting config**: Verify via `tox -e lint` +3. 
**Pytest config**: Check `[pytest]` section in `tox.ini` +4. **Type checking**: Verify via `tox -e typing` -3. **Pytest Configuration** (add to `pytest.ini` or `pyproject.toml` if missing): - ```ini - [tool:pytest] - testpaths = tests - python_files = test_*.py - python_classes = Test* - python_functions = test_* - addopts = --strict-markers --strict-config - markers = - slow: marks tests as slow (deselect with '-m "not slow"') - integration: marks tests as integration tests - ``` +**Only create configuration files for NEW projects that lack them.** ### Step 3: Baseline Quality Assessment **Establish current state**: 1. **Test Suite Baseline**: ```bash - pytest --collect-only # Count existing tests - pytest -q --tb=short # Run existing tests + python -m pytest dandi --collect-only # Count existing tests + python -m pytest dandi -q --tb=short # Run existing tests ``` 2. **Coverage Baseline**: ```bash - pytest --cov=. --cov-report=term-missing --cov-report=html + python -m pytest dandi --cov=dandi --cov-report=term-missing ``` 3. **Code Quality Baseline**: ```bash - flake8 --statistics + tox -e lint ``` 4. 
**Document Baseline**: @@ -200,7 +162,7 @@ You are Claude, an expert software architect setting up a robust development env - Feature context is prepared for autonomous implementation - All tools and configurations are functional -**Important**: +**Important**: - Never modify files in `.lad/` folder - this contains the framework - All feature work goes in `docs/` folder - Preserve existing project structure and configurations @@ -209,4 +171,4 @@ You are Claude, an expert software architect setting up a robust development env ### Next Phase After successful kickoff, proceed to Phase 1: Autonomous Context Planning using `.lad/claude_prompts/01_autonomous_context_planning.md` - \ No newline at end of file + diff --git a/.lad/claude_prompts/04_test_quality_systematic.md b/.lad/claude_prompts/04_test_quality_systematic.md index 2d2b1ec36..5e9033ebe 100755 --- a/.lad/claude_prompts/04_test_quality_systematic.md +++ b/.lad/claude_prompts/04_test_quality_systematic.md @@ -12,7 +12,7 @@ You are Claude performing systematic test quality analysis and remediation with 2>&1 | tee full_output.txt | grep -iE "(warning|error|failed|exception|fatal|critical)" | tail -n 30; echo "--- FINAL OUTPUT ---"; tail -n 100 full_output.txt ``` -**Research Software Quality Standards**: +**Research Software Quality Standards**: - Scientific reproducibility maintained across test fixes - Test effectiveness prioritized over coverage metrics - Research impact assessment for all test failures @@ -39,18 +39,17 @@ You are Claude performing systematic test quality analysis and remediation with **Intelligent Chunking Strategy**: ```bash # Category-based execution with proven chunk sizing -pytest tests/security/ -v --tb=short 2>&1 | tee security_results.txt | grep -E "(PASSED|FAILED|SKIPPED|ERROR|warnings|collected)" | tail -n 15 +# Simple/fast categories - run in full +pytest tests/{{category_1}}/ -v --tb=short 2>&1 | tee {{category_1}}_results.txt | grep -E 
"(PASSED|FAILED|SKIPPED|ERROR|warnings|collected)" | tail -n 15 -# Model registry chunking (large category) -pytest tests/model_registry/test_local*.py tests/model_registry/test_api*.py tests/model_registry/test_database*.py -v --tb=short 2>&1 | tee registry_chunk1.txt | tail -n 10 +# Large categories - split into logical chunks +pytest tests/{{large_category}}/test_subset1*.py tests/{{large_category}}/test_subset2*.py -v --tb=short 2>&1 | tee category_chunk1.txt | tail -n 10 -# Performance and tools (timeout-prone categories) -pytest tests/performance/ -v --tb=short 2>&1 | tee performance_results.txt | grep -E "(PASSED|FAILED|SKIPPED|ERROR)" | tail -n 10 -pytest tests/tools/ -v --tb=short 2>&1 | tee tools_results.txt | grep -E "(PASSED|FAILED|SKIPPED|ERROR)" | tail -n 10 +# Timeout-prone categories - smaller chunks or individual execution +pytest tests/{{slow_category}}/ -v --tb=short 2>&1 | tee slow_results.txt | grep -E "(PASSED|FAILED|SKIPPED|ERROR)" | tail -n 10 -# Integration and multi-user (complex categories) -pytest tests/integration/test_unified*.py tests/integration/test_cross*.py -v --tb=short 2>&1 | tee integration_chunk1.txt | tail -n 10 -pytest tests/multi-user-service/test_auth*.py tests/multi-user-service/test_workspace*.py -v --tb=short 2>&1 | tee multiuser_chunk1.txt | tail -n 10 +# Complex categories with setup requirements +pytest tests/{{complex_category}}/test_subset1*.py tests/{{complex_category}}/test_subset2*.py -v --tb=short 2>&1 | tee complex_chunk1.txt | tail -n 10 ``` **Comprehensive Baseline Establishment**: @@ -84,7 +83,7 @@ python -c " import re with open('all_failures.txt') as f: failures = f.readlines() - + # Group by failure types import_failures = [f for f in failures if 'import' in f.lower() or 'modulenotfound' in f.lower()] api_failures = [f for f in failures if 'attribute' in f.lower() or 'missing' in f.lower()] @@ -121,7 +120,7 @@ For each SKIPPED test, validate against multiple standards: - Research impact if fixed: 
[Scientific validity / Workflow / Performance / Cosmetic] **Enterprise Standard (85-95% pass rate expectation)**: -- Justified: [Y/N] + Reasoning +- Justified: [Y/N] + Reasoning - Business impact if fixed: [Critical / High / Medium / Low] **IEEE Testing Standard (Industry best practices)**: @@ -149,7 +148,7 @@ echo "# Test Quality Improvement Plan - $(date)" > notes/test_analysis/improveme **Priority Matrix (Enhanced for Solo Programmer)**: - **P1-CRITICAL**: Scientific validity + High impact/Low effort fixes -- **P2-HIGH**: System reliability + Quick wins enabling other fixes +- **P2-HIGH**: System reliability + Quick wins enabling other fixes - **P3-MEDIUM**: Performance + Moderate effort with clear value - **P4-LOW**: Cosmetic + High effort/Low value (defer or remove) @@ -179,7 +178,7 @@ echo "# Test Quality Improvement Plan - $(date)" > notes/test_analysis/improveme # Initialize test quality improvement TodoWrite TodoWrite tasks: 1. Infrastructure fixes (P1-CRITICAL): Import/dependency issues -2. API compatibility fixes (P1-P2): Method signature updates +2. API compatibility fixes (P1-P2): Method signature updates 3. Test design improvements (P2-P3): Brittle test redesign 4. Coverage gap filling (P3): Integration point testing 5. 
Configuration standardization (P4): Settings/path cleanup @@ -205,7 +204,7 @@ echo "# Test Fix Decision Analysis - {{fix_category}}" > notes/test_decisions/{{ # Targeted validation pytest tests/{{affected_category}}/ -v --tb=short 2>&1 | tail -n 20 -# Integration validation +# Integration validation python -c "import {{affected_module}}; print('Import successful')" # Regression prevention @@ -222,7 +221,7 @@ echo "## Baseline vs Current Status" >> test_health_report.md pytest --collect-only 2>&1 | grep "collected\|error" >> test_health_report.md # Category-wise success rates -for category in security model_registry integration performance tools; do +for category in $(find tests/ -mindepth 1 -maxdepth 1 -type d -printf '%f\n' | sort); do echo "### $category category:" >> test_health_report.md pytest tests/$category/ -q --tb=no 2>&1 | grep "passed\|failed\|skipped" >> test_health_report.md done @@ -235,18 +234,18 @@ done **TEST QUALITY IMPROVEMENT CYCLE COMPLETE** **Progress Summary**: -- Fixed: {{number}} test failures +- Fixed: {{number}} test failures - Success rate improvement: {{baseline}}% → {{current}}% - Priority fixes completed: {{P1_count}} P1, {{P2_count}} P2, {{P3_count}} P3 **Current Status**: -- Critical systems (Security/Model Registry): {{status}} +- Critical systems: {{status}} - Integration tests: {{status}} - Total test health: {{overall_percentage}}% **Remaining Issues**: - {{count}} P1-CRITICAL remaining -- {{count}} P2-HIGH remaining +- {{count}} P2-HIGH remaining - {{count}} P3-MEDIUM remaining - {{count}} justified skips (validated against industry standards) @@ -276,7 +275,7 @@ done **Success Criteria Thresholds** (Configurable based on context): - **Research Software**: >90% success for critical systems, >70% overall -- **Enterprise Standard**: >95% success for critical systems, >85% overall +- **Enterprise Standard**: >95% success for critical systems, >85% overall - **Solo Programmer**: 100% critical systems, >80% overall (realistic for
resource constraints) ### Coverage Integration Framework @@ -315,7 +314,7 @@ grep -n "missing coverage" coverage_{{module}}.txt ```bash # Save comprehensive session state echo "# Test Quality Session State - $(date)" > notes/session_state.md -echo "## TodoWrite Progress:" >> notes/session_state.md +echo "## TodoWrite Progress:" >> notes/session_state.md # [TodoWrite state documentation] echo "## Current PDCA Cycle:" >> notes/session_state.md @@ -364,7 +363,7 @@ echo "## Context for Resumption:" >> notes/session_state.md **Research Software Compliance**: - [ ] Scientific validity tests: 100% success -- [ ] Computational accuracy tests: 100% success +- [ ] Computational accuracy tests: 100% success - [ ] Research workflow tests: >95% success - [ ] Overall test collection: >90% success @@ -408,4 +407,4 @@ echo "## Context for Resumption:" >> notes/session_state.md 4. **Continuous Improvement Process**: Sustainable test maintenance procedures This enhanced framework combines research software rigor with enterprise-grade systematic improvement methodologies, adapted for solo programmer resource constraints while ensuring production-ready quality standards. - \ No newline at end of file + diff --git a/.lad/claude_prompts/04a_test_execution_infrastructure.md b/.lad/claude_prompts/04a_test_execution_infrastructure.md index 7c92da433..2c39d4af7 100755 --- a/.lad/claude_prompts/04a_test_execution_infrastructure.md +++ b/.lad/claude_prompts/04a_test_execution_infrastructure.md @@ -39,7 +39,7 @@ echo "## Direct References:" >> impact_analysis.md grep -r "$target_function" --include="*.py" . >> impact_analysis.md # Check import dependencies -echo "## Import Dependencies:" >> impact_analysis.md +echo "## Import Dependencies:" >> impact_analysis.md grep -r "from.*import.*$target_function\|import.*$target_function" --include="*.py" . 
>> impact_analysis.md # Identify calling patterns @@ -65,17 +65,8 @@ grep -r "$target_function" docs/API_REFERENCE.md docs/**/api*.md 2>/dev/null >> # Map critical system interactions echo "## Integration Points:" >> impact_analysis.md -# Statistical analysis pipeline interactions -grep -r "$target_function" emuses/**/statistical*.py emuses/**/analysis*.py 2>/dev/null >> impact_analysis.md - -# Model registry interactions -grep -r "$target_function" emuses/**/model_registry*.py emuses/**/registry*.py 2>/dev/null >> impact_analysis.md - -# Multi-user service compatibility -grep -r "$target_function" emuses/**/service*.py emuses/**/multi_user*.py 2>/dev/null >> impact_analysis.md - -# CLI and API endpoints -grep -r "$target_function" emuses/cli/*.py emuses/api/*.py 2>/dev/null >> impact_analysis.md +# Find all modules that reference the target function +grep -r "$target_function" --include="*.py" . 2>/dev/null >> impact_analysis.md ``` **4. Test Impact Prediction**: @@ -146,25 +137,13 @@ echo "git reset --hard $(git rev-parse HEAD)" >> impact_analysis.md **Immediate Validation** (run after each change): ```bash # Test affected categories immediately -pytest $(grep -l "$target_function" tests/**/*.py 2>/dev/null) -x --tb=short - -# Quick integration smoke test -python scripts/dev_test_runner.py - -# Verify documentation examples still work -python -c "exec(open('docs/examples/validate_examples.py').read())" 2>/dev/null || echo "No example validation script" +pytest $(grep -l "$target_function" dandi/tests/*.py 2>/dev/null) -x --tb=short ``` **Comprehensive Validation** (before committing): ```bash -# Full category testing for affected areas -affected_categories=$(grep -r "$target_function" tests/ --include="*.py" | cut -d'/' -f2 | sort -u | tr '\n' ' ') -for category in $affected_categories; do - pytest tests/$category/ -q --tb=short -done - -# Cross-integration validation -pytest tests/integration/ -k "$target_function" -v --tb=short 2>/dev/null || echo "No 
integration tests found" +# Full test suite validation +python -m pytest dandi -q --tb=short ``` ### ⚠️ **Emergency Rollback Procedure** @@ -188,30 +167,23 @@ echo "Recovery: Baseline restored, ready for alternative approach" >> impact_ana #### Intelligent Chunking Strategy (Timeout Prevention) -**Proven Chunk Sizing for Different Test Categories**: +**Chunk Sizing for Different Test Categories**: ```bash -# Security tests (typically fast, stable execution) -pytest tests/security/ -v --tb=short 2>&1 | tee security_results.txt | grep -E "(PASSED|FAILED|SKIPPED|ERROR|warnings|collected)" | tail -n 15 - -# Model registry (large category - requires chunking) -pytest tests/model_registry/test_local*.py tests/model_registry/test_api*.py tests/model_registry/test_database*.py -v --tb=short 2>&1 | tee registry_chunk1.txt | tail -n 10 - -pytest tests/model_registry/test_advanced*.py tests/model_registry/test_analytics*.py tests/model_registry/test_benchmarking*.py -v --tb=short 2>&1 | tee registry_chunk2.txt | tail -n 10 +# Fast unit tests (no Docker needed) +python -m pytest dandi/tests/test_utils.py dandi/tests/test_metadata.py -v --tb=short 2>&1 | tee unit_results.txt | tail -n 15 -# Integration tests (complex, potentially slow) -pytest tests/integration/test_unified*.py tests/integration/test_cross*.py -v --tb=short 2>&1 | tee integration_chunk1.txt | tail -n 10 +# CLI tests +python -m pytest dandi/tests/test_command.py -v --tb=short 2>&1 | tee cli_results.txt | tail -n 15 -# Performance tests (timeout-prone) -pytest tests/performance/ -v --tb=short 2>&1 | tee performance_results.txt | grep -E "(PASSED|FAILED|SKIPPED|ERROR)" | tail -n 10 +# Operation tests (may need Docker for some) +python -m pytest dandi/tests/test_download.py dandi/tests/test_upload.py -v --tb=short 2>&1 | tee operations_results.txt | tail -n 15 -# Tools and CLI (mixed complexity) -pytest tests/tools/ -v --tb=short 2>&1 | tee tools_results.txt | grep -E "(PASSED|FAILED|SKIPPED|ERROR)" | tail -n 10 +# 
Move/organize tests +python -m pytest dandi/tests/test_move.py dandi/tests/test_organize.py -v --tb=short 2>&1 | tee move_results.txt | tail -n 15 -pytest tests/enhanced-cli-typer/test_cli_integration.py tests/enhanced-cli-typer/test_service_client.py -v --tb=short 2>&1 | tee cli_chunk1.txt | tail -n 10 - -# Multi-user service (complex setup requirements) -pytest tests/multi-user-service/test_auth*.py tests/multi-user-service/test_workspace*.py -v --tb=short 2>&1 | tee multiuser_chunk1.txt | tail -n 10 +# Full suite (uses tox for reproducibility) +tox -e py3 2>&1 | tee full_results.txt | tail -n 50 ``` **Dynamic Chunk Size Guidelines**: @@ -246,12 +218,11 @@ with open('test_collection_baseline.txt') as f: echo "# Test Execution Baseline - $(date)" > test_execution_baseline.md # Execute and track each category -for category in security model_registry integration performance tools multi-user-service enhanced-cli-typer; do +for category in unit cli operations move; do echo "## $category Category Results" >> test_execution_baseline.md - if [ -f "${category}_results.txt" ] || ls ${category}_chunk*.txt 1> /dev/null 2>&1; then - # Aggregate results from category files - cat ${category}_*.txt 2>/dev/null | grep -E "(PASSED|FAILED|SKIPPED|ERROR)" | tail -n 5 >> test_execution_baseline.md - cat ${category}_*.txt 2>/dev/null | grep "===.*===" | tail -n 1 >> test_execution_baseline.md + if [ -f "${category}_results.txt" ]; then + grep -E "(PASSED|FAILED|SKIPPED|ERROR)" "${category}_results.txt" | tail -n 5 >> test_execution_baseline.md + grep "===.*===" "${category}_results.txt" | tail -n 1 >> test_execution_baseline.md else echo "Category not executed" >> test_execution_baseline.md fi @@ -275,7 +246,7 @@ python -c " import re with open('comprehensive_test_output.txt') as f: content = f.read() - + # Extract final summary lines that show totals summary_lines = [line for line in content.split('\n') if '=====' in line and ('passed' in line or 'failed' in line)] @@ -289,7 +260,7 
@@ for line in summary_lines: failed = re.findall(r'(\d+) failed', line) skipped = re.findall(r'(\d+) skipped', line) warnings = re.findall(r'(\d+) warning', line) - + if passed: total_passed += int(passed[0]) if failed: total_failed += int(failed[0]) if skipped: total_skipped += int(skipped[0]) @@ -340,7 +311,7 @@ echo "## Next Phase: Ready for analysis framework (04b)" >> test_context_summary **Readiness for Next Phase**: - [ ] `test_execution_baseline.md` contains category results -- [ ] `test_health_metrics.md` shows overall statistics +- [ ] `test_health_metrics.md` shows overall statistics - [ ] `comprehensive_test_output.txt` available for pattern analysis - [ ] Context preserved for analysis phase (04b) @@ -369,4 +340,4 @@ echo "## Next Phase: Ready for analysis framework (04b)" >> test_context_summary **Usage**: Complete this phase before proceeding to `04b_test_analysis_framework.md` for holistic pattern recognition and root cause analysis. This phase provides the robust foundation needed for systematic test improvement while ensuring efficient resource usage and timeout prevention. 
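The chunked-execution pattern these bash commands describe (run a group of test files with a hard timeout, log the full output, keep only the summary tail) can be sketched in Python. This is a minimal illustration only; the chunk groupings, test-file paths, and 120 s timeout are assumptions, not part of the patch or any project configuration:

```python
import subprocess
from pathlib import Path
from typing import Dict, List

# Hypothetical chunk groupings mirroring the bash commands above;
# the test-file paths are illustrative assumptions.
CHUNKS: Dict[str, List[str]] = {
    "unit": ["dandi/tests/test_utils.py", "dandi/tests/test_metadata.py"],
    "cli": ["dandi/tests/test_command.py"],
}

def tail_summary(output: str, n: int = 15) -> str:
    """Keep only the last *n* lines, mirroring `... | tail -n 15`."""
    return "\n".join(output.splitlines()[-n:])

def run_chunk(name: str, paths: List[str], timeout: int = 120) -> str:
    """Run one pytest chunk with a hard timeout; save full log, return summary."""
    cmd = ["python", "-m", "pytest", *paths, "-v", "--tb=short"]
    try:
        proc = subprocess.run(cmd, capture_output=True, text=True, timeout=timeout)
        output = proc.stdout + proc.stderr
    except subprocess.TimeoutExpired:
        output = f"TIMEOUT: chunk {name!r} exceeded {timeout}s"
    # Full log preserved for the later analysis phase, like `tee <name>_results.txt`
    Path(f"{name}_results.txt").write_text(output)
    return tail_summary(output)
```

The point of the design is that a hung chunk costs at most `timeout` seconds instead of stalling the whole baseline run, while the complete output still lands in a per-chunk log file for the analysis phase.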
- \ No newline at end of file + diff --git a/.lad/claude_prompts/04c_test_improvement_cycles.md b/.lad/claude_prompts/04c_test_improvement_cycles.md index 891010108..e263b018e 100755 --- a/.lad/claude_prompts/04c_test_improvement_cycles.md +++ b/.lad/claude_prompts/04c_test_improvement_cycles.md @@ -42,7 +42,7 @@ grep -r "$target_area" tests/ --include="*.py" | cut -d':' -f1 | sort -u >> cycl **Risk-Based Implementation Strategy**: - **Low Risk**: Test fixture improvements, test data corrections → Standard validation -- **Medium Risk**: Test logic changes, assertion updates → Focused category validation +- **Medium Risk**: Test logic changes, assertion updates → Focused category validation - **High Risk**: Core functionality fixes, algorithm changes → Comprehensive validation #### PDCA Integration with Risk Management @@ -124,13 +124,13 @@ TodoWrite initialization based on analysis results: # Implement first task in current cycle # Example implementation pattern: -echo "Starting implementation of: {{current_task}}" +echo "Starting implementation of: {{current_task}}" echo "PDCA Cycle {{N}}, DO Phase - Task {{M}}" > current_implementation_log.md # [Implement specific fix based on root cause analysis] # Infrastructure fix example: # - Update import statements -# - Fix dependency issues +# - Fix dependency issues # - Resolve environment setup # API compatibility fix example: @@ -138,7 +138,7 @@ echo "PDCA Cycle {{N}}, DO Phase - Task {{M}}" > current_implementation_log.md # - Fix parameter mismatches # - Resolve interface changes -# Test design fix example: +# Test design fix example: # - Update test expectations # - Fix brittle test logic # - Improve test reliability @@ -184,15 +184,15 @@ echo "## CHECK Phase Validation - Task: {{current_task}}" >> current_implementat # 1. Direct test validation pytest tests/{{affected_category}}/ -v --tb=short 2>&1 | tail -n 20 -# 2. Integration validation +# 2. 
Integration validation python -c "import {{affected_module}}; print('Import successful')" # 3. Regression prevention for critical systems -pytest tests/security/ tests/model_registry/test_local*.py -q --tb=short 2>&1 | tail -n 10 +pytest tests/{{critical_category}}/ -q --tb=short 2>&1 | tail -n 10 # 4. Update health metrics echo "### Validation Results:" >> current_implementation_log.md -echo "- Target tests now passing: {{Y_or_N}}" >> current_implementation_log.md +echo "- Target tests now passing: {{Y_or_N}}" >> current_implementation_log.md echo "- No regressions in critical systems: {{Y_or_N}}" >> current_implementation_log.md echo "- Integration points working: {{Y_or_N}}" >> current_implementation_log.md @@ -207,7 +207,7 @@ echo "- Integration points working: {{Y_or_N}}" >> current_implementation_log.md echo "# Updated Test Health Report - PDCA Cycle {{N}}" > cycle_{{N}}_health_report.md # Re-run key categories to measure improvement -for category in security model_registry integration performance tools; do +for category in $(find tests/ -mindepth 1 -maxdepth 1 -type d -printf '%f\n' | sort); do echo "## $category Category Status:" >> cycle_{{N}}_health_report.md if pytest tests/$category/ -q --tb=no 2>/dev/null; then pytest tests/$category/ -q --tb=no 2>&1 | grep -E "(passed|failed|skipped)" >> cycle_{{N}}_health_report.md @@ -238,7 +238,7 @@ echo "- Remaining P1-P2 issues: {{remaining_high_priority}}" >> cycle_{{N}}_heal - **Priority Fixes**: {{P1_completed}} P1, {{P2_completed}} P2 completed **Current Status**: -- **Critical Systems**: {{security_status}}, {{model_registry_status}}, {{integration_status}} +- **Critical Systems**: {{category_1_status}}, {{category_2_status}}, {{category_3_status}} - **Overall Health**: {{current_percentage}}% success rate - **Industry Compliance**: {{research_standard_status}}, {{enterprise_standard_status}} @@ -256,14 +256,14 @@ echo "- Remaining P1-P2 issues: {{remaining_high_priority}}" >> cycle_{{N}}_heal - Estimated 
effort: {{next_cycle_time_estimate}} - Target improvement: {{target_success_rate}}% -**B) 🔧 ADJUST APPROACH** - Modify strategy based on findings +**B) 🔧 ADJUST APPROACH** - Modify strategy based on findings - Will pause for approach refinement - Address: {{any_systemic_issues_discovered}} - Update: {{priority_matrix_or_batching_strategy}} - Reassess: {{resource_allocation_or_complexity}} **C) 📊 ADD COVERAGE ANALYSIS** - Integrate test coverage improvement - - Will run comprehensive coverage analysis + - Will run comprehensive coverage analysis - Identify: {{critical_code_gaps_requiring_tests}} - Balance: {{test_quality_vs_coverage_enhancement}} - Estimated scope: {{coverage_improvement_effort}} @@ -299,7 +299,7 @@ echo "- Completed this session: {{completed_tasks}}" >> notes/pdca_session_state echo "## TodoWrite State:" >> notes/pdca_session_state.md echo "- Total tasks: {{total_count}}" >> notes/pdca_session_state.md -echo "- Completed: {{completed_count}}" >> notes/pdca_session_state.md +echo "- Completed: {{completed_count}}" >> notes/pdca_session_state.md echo "- In progress: {{in_progress_count}}" >> notes/pdca_session_state.md echo "- Pending: {{pending_count}}" >> notes/pdca_session_state.md @@ -325,7 +325,7 @@ echo "# Essential Context for Continuation" > pdca_essential_context.md echo "## Current Achievement Level:" >> pdca_essential_context.md echo "- Success rate: {{current_percentage}}%" >> pdca_essential_context.md echo "- Industry standard compliance: {{status}}" >> pdca_essential_context.md -echo "- Critical systems status: {{security_registry_integration_status}}" >> pdca_essential_context.md +echo "- Critical systems status: {{critical_systems_status}}" >> pdca_essential_context.md echo "## Active PDCA Context:" >> pdca_essential_context.md echo "- Cycle: {{N}}, Phase: {{current_phase}}" >> pdca_essential_context.md @@ -353,7 +353,7 @@ cat cycle_*_health_report.md >> PROJECT_STATUS.md 2>/dev/null || true echo "# Coverage Analysis Integration - 
PDCA Cycle {{N}}" > coverage_integration_analysis.md # Run coverage for key modules -pytest --cov=emuses --cov-report=term-missing tests/ 2>&1 | tee comprehensive_coverage.txt +pytest --cov=. --cov-report=term-missing tests/ 2>&1 | tee comprehensive_coverage.txt # Identify critical functions with <80% coverage python -c " @@ -372,7 +372,7 @@ cat critical_coverage_gaps.txt >> coverage_integration_analysis.md echo "## Integration with Current Test Quality:" >> coverage_integration_analysis.md echo "- Current test success rate: {{percentage}}%" >> coverage_integration_analysis.md -echo "- Coverage enhancement opportunities: {{count}} critical gaps" >> coverage_integration_analysis.md +echo "- Coverage enhancement opportunities: {{count}} critical gaps" >> coverage_integration_analysis.md echo "- Resource allocation: {{balance_quality_fixes_vs_coverage}}" >> coverage_integration_analysis.md ``` @@ -418,4 +418,4 @@ echo "- Resource allocation: {{balance_quality_fixes_vs_coverage}}" >> coverage_ **Usage**: Execute PDCA cycles until target success criteria achieved, then proceed to `04d_test_session_management.md` for advanced session continuity and user decision optimization. This phase ensures systematic, measurable improvement toward 100% meaningful test success while maintaining productivity and preventing regressions. - \ No newline at end of file + diff --git a/.lad/copilot_prompts/04_implement_next_task.md b/.lad/copilot_prompts/04_implement_next_task.md index 75a288579..7b45c328d 100755 --- a/.lad/copilot_prompts/04_implement_next_task.md +++ b/.lad/copilot_prompts/04_implement_next_task.md @@ -6,7 +6,7 @@ You are Claude in Agent Mode. - After each task, update context files for subsequent sub-plans (e.g., update `context_0b_*.md` after 0a, etc.). - Track completion and integration for each sub-plan. On sub-plan completion, verify integration points and update the next sub-plan's context. -**Pre-flight Check:** +**Pre-flight Check:** 1. 
**Full regression test**: Run the complete test suite to establish baseline: ```bash pytest -q --tb=short @@ -22,9 +22,9 @@ You are Claude in Agent Mode. 3. **Coverage baseline**: Establish current coverage before changes: ```bash pytest --cov=. --cov-report=term-missing --tb=no -q | grep "TOTAL" - ``` + ``` -**Scope Guard:** Before making any edits, identify the minimal code region needed to satisfy the current failing test. Do **not** modify or delete code outside this region. +**Scope Guard:** Before making any edits, identify the minimal code region needed to satisfy the current failing test. Do **not** modify or delete code outside this region. **Regression Prevention:** 1. **Dependency Analysis**: Before changing any function/class, run: @@ -59,17 +59,17 @@ You are Claude in Agent Mode. Implement the **next unchecked task** only from the current sub-plan. **Workflow** -1. **Write the failing test first.** +1. **Write the failing test first.** **Testing Strategy by Component Type:** - • **API Endpoints & Web Services**: Use integration testing - import the real FastAPI/Django app, mock only external dependencies (databases, APIs, file systems). Test actual HTTP routing, validation, serialization, and error handling. + • **API Endpoints & Web Services**: Use integration testing - import the real application, mock only external dependencies (databases, APIs, file systems). Test actual HTTP routing, validation, serialization, and error handling. • **Business Logic & Algorithms**: Use unit testing - mock all dependencies, test logic in complete isolation, focus on edge cases. • **Data Processing & Utilities**: Use unit testing with minimal dependencies, use test data fixtures. - - • If you need to store intermediate notes or dependency maps, write them to `docs/_scratch/{{FEATURE_SLUG}}.md` and reference this file in subsequent sub-tasks. 
+ + • If you need to store intermediate notes or dependency maps, write them to `docs/_scratch/{{FEATURE_SLUG}}.md` and reference this file in subsequent sub-tasks. • If the next sub-task will touch >200 lines of code or >10 files, break it into 2–5 indented sub-sub-tasks in the plan, commit that plan update, then proceed with implementation. -2. **Modify minimal code** to pass the new test without breaking existing ones. -3. **Ensure NumPy-style docstrings** on all additions. +2. **Modify minimal code** to pass the new test without breaking existing ones. +3. **Ensure NumPy-style docstrings** on all additions. 4. **Run** `pytest -q` **repeatedly until green.** 4.5 **Continuous Regression Check**: After each code change, run a quick regression test: @@ -79,16 +79,16 @@ Implement the **next unchecked task** only from the current sub-plan. ``` If any existing tests fail, fix immediately before continuing. -5. **Update docs & plan**: - • If `SPLIT=true` or SUB_PLAN_ID is set → update any `docs/{{DOC_BASENAME}}_*` or `docs/context_{{SUB_PLAN_ID}}.md` files you previously created. - • Else → update `docs/{{DOC_BASENAME}}.md`. - • **Check the box** in your plan file (`plan_{{SUB_PLAN_ID}}.md` or `plan.md`): change the leading `- [ ]` on the task (and any completed sub-steps) you just implemented to `- [x]`. +5. **Update docs & plan**: + • If `SPLIT=true` or SUB_PLAN_ID is set → update any `docs/{{DOC_BASENAME}}_*` or `docs/context_{{SUB_PLAN_ID}}.md` files you previously created. + • Else → update `docs/{{DOC_BASENAME}}.md`. + • **Check the box** in your plan file (`plan_{{SUB_PLAN_ID}}.md` or `plan.md`): change the leading `- [ ]` on the task (and any completed sub-steps) you just implemented to `- [x]`. • **Update documentation**: - In each modified source file, ensure any new or changed functions/classes have NumPy-style docstrings. - If you've added new public APIs, append their signature/purpose to the Level 2 API table in your context doc(s). 
- Save all doc files (`docs/{{DOC_BASENAME}}.md` or split docs). -5.5 **Quality Gate** - • Run flake8 and quick coverage as described in .copilot-instructions.md. +5.5 **Quality Gate** + • Run flake8 and quick coverage as described in .copilot-instructions.md. • **Final regression test**: Run full test suite to ensure no regressions: ```bash pytest -q --tb=short @@ -96,10 +96,10 @@ Implement the **next unchecked task** only from the current sub-plan. • If violations or test failures, pause and show first 10 issues, ask user whether to fix now. 6. **Draft commit**: - * Header ↠ `feat({{FEATURE_SLUG}}): ` ← **one sub-task only** + * Header ↠ `feat({{FEATURE_SLUG}}): ` ← **one sub-task only** * Body ↠ bullet list of the sub-steps you just did. -7. **Show changes & await approval**: +7. **Show changes & await approval**: Output `git diff --stat --staged` and await user approval. **When you're ready** to commit and push, type **y**. Then run: diff --git a/.lad/copilot_prompts/04_test_quality_systematic.md b/.lad/copilot_prompts/04_test_quality_systematic.md index ff7d55059..9b4c35d8c 100755 --- a/.lad/copilot_prompts/04_test_quality_systematic.md +++ b/.lad/copilot_prompts/04_test_quality_systematic.md @@ -69,16 +69,16 @@ class TestFailure: def execute_test_chunk_with_timeout_prevention(test_category: str) -> Dict[str, any]: """ Execute test category using proven chunking strategy to prevent timeouts - + Args: - test_category: Category like 'security', 'model_registry', 'integration' - + test_category: Category like 'unit', 'functional', 'integration' + Returns: Dict containing test results and execution metadata - + Example usage: - # Test security category with comprehensive error capture - security_results = execute_test_chunk_with_timeout_prevention('security') + # Test a category with comprehensive error capture + results = execute_test_chunk_with_timeout_prevention('unit') """ # [Copilot will generate implementation based on this comment structure] pass @@ -86,19 
+86,19 @@ def execute_test_chunk_with_timeout_prevention(test_category: str) -> Dict[str, def aggregate_failure_patterns_across_categories(test_results: List[Dict]) -> Dict[TestFailureCategory, List[TestFailure]]: """ Perform holistic pattern recognition across ALL test failures - + Instead of analyzing failures sequentially, this function aggregates all failures first to identify: - Cascading failure patterns (one root cause affects multiple tests) - Cross-cutting concerns (similar issues across different modules) - Solution interaction opportunities (single fix resolves multiple issues) - + Args: test_results: List of test execution results from all categories - + Returns: Dictionary mapping failure categories to structured failure objects - + Implementation approach: 1. Extract all FAILED and ERROR entries from test outputs 2. Classify each failure using root cause taxonomy @@ -111,19 +111,19 @@ def aggregate_failure_patterns_across_categories(test_results: List[Dict]) -> Di def validate_test_against_industry_standards(test_failure: TestFailure) -> Dict[str, bool]: """ Multi-tier validation of test justification against industry standards - + Validates each test failure against: - Research Software Standard (30-60% baseline acceptable) - Enterprise Standard (85-95% expectation) - IEEE Testing Standard (industry best practices) - Solo Programmer Context (resource constraints) - + Args: test_failure: Structured test failure object - + Returns: Dictionary with justification status for each standard level - + Example output: { 'research_justified': True, @@ -142,22 +142,22 @@ def validate_test_against_industry_standards(test_failure: TestFailure) -> Dict[ def plan_phase_solution_optimization(failures: Dict[TestFailureCategory, List[TestFailure]]) -> Dict[str, any]: """ PLAN phase: Strategic solution planning with resource optimization - + Performs comprehensive solution interaction analysis: - Identifies fixes that can be batched together (compatible) - Maps 
dependency ordering (Fix A must complete before Fix B) - Assesses risk levels for regression prevention - Optimizes resource allocation for solo programmer context - + Priority Matrix (Enhanced for Solo Programmer): - P1-CRITICAL: Scientific validity + High impact/Low effort - P2-HIGH: System reliability + Quick wins enabling other fixes - P3-MEDIUM: Performance + Moderate effort with clear value - P4-LOW: Cosmetic + High effort/Low value (defer or remove) - + Args: failures: Categorized and structured test failures - + Returns: Implementation plan with optimized fix sequence """ @@ -167,18 +167,18 @@ def plan_phase_solution_optimization(failures: Dict[TestFailureCategory, List[Te def do_phase_systematic_implementation(implementation_plan: Dict) -> List[str]: """ DO phase: Execute fixes using optimized sequence - + Implementation strategy: 1. Quick wins first (high-impact/low-effort for momentum) 2. Dependency resolution (fixes that enable other fixes) 3. Batch compatible fixes (minimize context switching) 4. 
Risk management (high-risk fixes with validation) - + Integrates with TodoWrite-style progress tracking for session continuity - + Args: implementation_plan: Output from plan_phase_solution_optimization - + Returns: List of completed fix descriptions for check phase validation """ @@ -188,21 +188,21 @@ def do_phase_systematic_implementation(implementation_plan: Dict) -> List[str]: def check_phase_comprehensive_validation(completed_fixes: List[str]) -> Dict[str, any]: """ CHECK phase: Validate implementation with regression prevention - + Validation protocol: - Targeted validation for affected test categories - Integration validation (import testing) - Regression prevention for critical modules - Health metrics tracking (baseline vs current) - + Generates comparative health report: - Test collection success rate - Category-wise success rates - Critical system status validation - + Args: completed_fixes: List of fixes implemented in DO phase - + Returns: Comprehensive validation report with success metrics """ @@ -212,18 +212,18 @@ def check_phase_comprehensive_validation(completed_fixes: List[str]) -> Dict[str def act_phase_decision_framework(validation_report: Dict) -> str: """ ACT phase: Generate user decision prompt for next iteration - + Analyzes validation results and presents structured options: A) Continue cycles - Implement next priority fixes - B) Adjust approach - Modify strategy based on findings + B) Adjust approach - Modify strategy based on findings C) Add coverage analysis - Integrate coverage improvement D) Complete current level - Achieve target success threshold - + Provides specific metrics and recommendations for each option - + Args: validation_report: Output from check_phase_comprehensive_validation - + Returns: Formatted decision prompt string for user choice """ @@ -237,21 +237,21 @@ def act_phase_decision_framework(validation_report: Dict) -> str: def integrate_coverage_analysis_with_test_quality(module_name: str) -> Dict[str, any]: """ 
Coverage-driven test improvement using CoverUp-style methodology - + Links test failures to coverage gaps: - Identifies critical functions with <80% coverage requiring tests - Maps uncovered integration points to test failure patterns - Prioritizes test improvements by coverage impact - + Implementation approach: 1. Run coverage analysis for specified module 2. Parse coverage report for low-coverage functions 3. Cross-reference with existing test failures 4. Generate priority list for coverage-driven test creation - + Args: - module_name: Python module to analyze (e.g., 'emuses.model_registry') - + module_name: Python module to analyze (e.g., 'mypackage.submodule') + Returns: Coverage analysis with linked test improvement recommendations """ @@ -261,16 +261,16 @@ def integrate_coverage_analysis_with_test_quality(module_name: str) -> Dict[str, def generate_coverage_driven_tests(coverage_gaps: List[str], test_failures: List[TestFailure]) -> List[str]: """ Generate test code for critical coverage gaps - + Uses iterative improvement approach: - Focus on critical system components with <80% coverage - Prioritize uncovered integration points - Quality over quantity - meaningful tests vs coverage padding - + Args: coverage_gaps: List of functions/methods with insufficient coverage test_failures: Related test failures that might be coverage-related - + Returns: List of generated test code snippets ready for implementation """ @@ -284,15 +284,15 @@ def generate_coverage_driven_tests(coverage_gaps: List[str], test_failures: List def save_session_state_for_resumption(current_pdca_cycle: int, analysis_findings: Dict) -> None: """ Enhanced session state preservation for seamless resumption - + Saves comprehensive session state including: - Current PDCA cycle and phase - TodoWrite progress tracking - Analysis findings and patterns discovered - Critical context for next session - + Uses structured markdown files for human readability and tool parsing - + Args: current_pdca_cycle: 
Which PDCA iteration we're currently in analysis_findings: Key patterns and insights discovered @@ -303,13 +303,13 @@ def save_session_state_for_resumption(current_pdca_cycle: int, analysis_findings def load_session_state_and_resume() -> Dict[str, any]: """ Automatic session resumption with state detection - + Detects current state and determines next action: - Checks for existing TodoWrite tasks - Identifies current PDCA cycle phase - Loads previous analysis findings - Determines optimal resumption point - + Returns: Session state dictionary with resumption context """ @@ -319,16 +319,16 @@ def load_session_state_and_resume() -> Dict[str, any]: def optimize_context_for_token_efficiency(session_data: Dict) -> Dict[str, any]: """ Context optimization strategy for long-running sessions - + Implements equivalent of Claude's /compact command: - Identifies critical context to preserve - Archives resolved issues and outdated analysis - Maintains active analysis context - Saves detailed findings to permanent files - + Args: session_data: Current session context and analysis data - + Returns: Optimized context dictionary with preserved essentials """ @@ -350,13 +350,13 @@ test_analyzer = TestQualityAnalyzer() # Copilot will suggest class structure ### 2. Pattern Recognition ```python # Execute holistic pattern recognition across all test categories -# Aggregate failures from security, model_registry, integration, performance, tools +# Aggregate failures from all discovered test categories # Classify failures using root cause taxonomy: INFRASTRUCTURE, API_COMPATIBILITY, TEST_DESIGN, COVERAGE_GAPS, CONFIGURATION all_failures = aggregate_failure_patterns_across_categories(test_results) ``` -### 3. PDCA Cycle Execution +### 3. 
PDCA Cycle Execution ```python # PLAN: Strategic solution optimization for solo programmer context # Prioritize fixes: P1-CRITICAL (scientific validity), P2-HIGH (system reliability), P3-MEDIUM (performance), P4-LOW (cosmetic) @@ -402,4 +402,4 @@ session_state = load_session_state_and_resume() 5. **Context Provision**: Examples and usage patterns provided in function docstrings 6. **Explicit Parameter Documentation**: Clear argument descriptions help Copilot understand intent -This framework provides the same systematic test improvement capabilities as the Claude version while adapting to GitHub Copilot's strengths in function completion and comment-based prompting. \ No newline at end of file +This framework provides the same systematic test improvement capabilities as the Claude version while adapting to GitHub Copilot's strengths in function completion and comment-based prompting. diff --git a/.lad/copilot_prompts/04a_test_execution_infrastructure.md b/.lad/copilot_prompts/04a_test_execution_infrastructure.md index 803cd2175..f1ea2320f 100755 --- a/.lad/copilot_prompts/04a_test_execution_infrastructure.md +++ b/.lad/copilot_prompts/04a_test_execution_infrastructure.md @@ -40,34 +40,34 @@ class TestChunkSize(Enum): INDIVIDUAL = 1 # Timeout-prone tests def execute_test_chunk_with_timeout_prevention( - test_category: str, + test_category: str, chunk_size: Optional[int] = None, timeout_seconds: int = 120 ) -> TestExecutionResult: """ Execute test category using proven chunking strategy to prevent timeouts - + Implements intelligent chunking based on test category complexity: - - Security tests: 10-20 tests per chunk (fast, stable execution) - - Model registry: Split into logical chunks (local, API, database) + - Simple/unit tests: 10-20 tests per chunk (fast, stable execution) + - Large categories: Split into logical chunks by subgroup - Integration tests: 5-10 tests per chunk (complex setup) - Performance tests: Individual or small groups (timeout-prone) - + Args: 
-        test_category: Category like 'security', 'model_registry', 'integration'
+        test_category: Category like 'unit', 'functional', 'integration'
         chunk_size: Override default chunk size if needed
         timeout_seconds: Maximum execution time per chunk
-    
+
     Returns:
         TestExecutionResult with comprehensive execution metadata
-    
+
     Example usage:
-        # Execute security tests with optimized chunking
-        security_results = execute_test_chunk_with_timeout_prevention('security')
-        
-        # Execute model registry with custom chunking
-        registry_results = execute_test_chunk_with_timeout_prevention(
-            'model_registry',
+        # Execute unit tests with optimized chunking
+        unit_results = execute_test_chunk_with_timeout_prevention('unit')
+
+        # Execute functional tests with custom chunking
+        functional_results = execute_test_chunk_with_timeout_prevention(
+            'functional',
             chunk_size=8
         )
     """
@@ -82,16 +82,16 @@ def execute_test_chunk_with_timeout_prevention(
 def establish_comprehensive_test_baseline() -> Dict[str, TestExecutionResult]:
     """
     Create complete test inventory and execute baseline analysis
-    
+
     Performs comprehensive test discovery and categorization:
     - Test collection with error detection
     - Category-wise execution tracking
     - Health metrics establishment
     - Baseline statistics for comparison
-    
+
     Returns:
         Dictionary mapping test categories to execution results
-    
+
     Implementation approach:
     1. Run pytest --collect-only for complete test discovery
     2. Extract collection statistics and error rates
@@ -107,19 +107,19 @@ def aggregate_test_results_across_categories(
 ) -> Dict[str, any]:
     """
     Aggregate test execution results for comprehensive health analysis
-    
+
     Combines results from all test categories to provide:
     - Overall success rate calculations
     - Category-wise performance comparison
     - Health metrics trending
     - Execution efficiency analysis
-    
+
     Args:
         category_results: Results from all executed test categories
-    
+
     Returns:
         Comprehensive health metrics dictionary
-    
+
     Output structure:
     {
         'total_tests': int,
@@ -138,18 +138,18 @@ def generate_test_health_metrics_report(
 ) -> None:
     """
     Generate comprehensive test health report with baseline statistics
-    
+
     Creates structured markdown report containing:
     - Executive summary of test health
     - Category-wise success rates
     - Collection error analysis
     - Execution efficiency metrics
     - Baseline establishment confirmation
-    
+
     Args:
         aggregated_results: Output from aggregate_test_results_across_categories
         output_file: Path for generated health report
-    
+
     Report sections:
     1. Overall Statistics
     2. Category Performance Analysis
@@ -167,21 +167,21 @@ def optimize_test_execution_for_token_efficiency(
 ) -> Tuple[str, str]:
     """
     Execute tests with token-optimized output handling
-    
+
     Implements proven patterns for large test suite execution:
     - Comprehensive output capture with intelligent filtering
     - Error and warning prioritization
     - Summary extraction and preservation
     - Detailed logging for later analysis
-    
+
     Args:
         test_command: Complete pytest command to execute
         category: Test category for context-specific filtering
         max_output_lines: Maximum lines to return for immediate analysis
-    
+
     Returns:
         Tuple of (filtered_output, full_output_file_path)
-    
+
     Token optimization strategy:
     - Capture full output to file for comprehensive analysis
     - Filter critical information (errors, warnings, failures)
@@ -197,17 +197,17 @@ def save_execution_context_for_analysis_phase(
 ) -> None:
     """
     Preserve execution context for next phase (04b Analysis Framework)
-    
+
     Creates structured context files needed for pattern analysis:
     - test_execution_baseline.md: Category-wise results
     - test_health_metrics.md: Overall statistics
     - comprehensive_test_output.txt: Aggregated results
     - test_context_summary.md: Context preservation
-    
+
     Args:
         execution_results: Results from all test category executions
         health_metrics: Aggregated health analysis
-    
+
     Context preservation strategy:
     1. Structure results for pattern recognition
     2. Preserve baseline for comparison tracking
@@ -230,17 +230,17 @@ test_executor = TestExecutionInfrastructure()  # Copilot will suggest class stru
 ### 2. Category-Specific Execution
 
 ```python
-# Execute security tests with timeout prevention
-# Use proven chunk size for fast, stable security test execution
-# Generate comprehensive results with health metrics
-security_results = execute_test_chunk_with_timeout_prevention('security')
+# Execute unit tests with timeout prevention
+# Use proven chunk size for fast, stable test execution
+# Generate comprehensive results with health metrics
+unit_results = execute_test_chunk_with_timeout_prevention('unit')
 
-# Execute model registry tests with intelligent chunking
-# Split into logical groups: local, API, database tests
-# Handle complex setup requirements with appropriate timeouts
-registry_results = execute_test_chunk_with_timeout_prevention('model_registry')
+# Execute functional tests with intelligent chunking
+# Split into logical groups based on test category complexity
+# Handle complex setup requirements with appropriate timeouts
+functional_results = execute_test_chunk_with_timeout_prevention('functional')
 ```
 
 ### 3. Comprehensive Baseline Establishment
@@ -276,4 +276,4 @@ filtered_output, full_file = optimize_test_execution_for_token_efficiency(
 5. **Token Awareness**: Built-in optimization for large output handling
 6. **Context Preparation**: Structured output preparation for next phase
 
-This module provides the foundation for systematic test improvement while leveraging GitHub Copilot's strengths in function completion and structured development patterns.
\ No newline at end of file
+This module provides the foundation for systematic test improvement while leveraging GitHub Copilot's strengths in function completion and structured development patterns.
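The token-efficiency pattern that `optimize_test_execution_for_token_efficiency` documents above — capture everything to a file, hand back only the high-signal lines — can be sketched as a small pure helper. The function name, the marker list, and the line cap are illustrative assumptions, not an implementation mandated by the prompt:

```python
from pathlib import Path
from typing import List, Tuple


def filter_pytest_output(
    full_output: str,
    log_path: Path,
    max_output_lines: int = 40,
) -> Tuple[str, str]:
    """Persist full pytest output and return only high-signal lines.

    Parameters
    ----------
    full_output : str
        Complete captured stdout/stderr from a pytest run.
    log_path : Path
        File that receives the unfiltered output for later analysis.
    max_output_lines : int
        Cap on the number of lines returned for immediate review.

    Returns
    -------
    Tuple[str, str]
        (filtered_output, path_to_full_log)
    """
    # Keep the comprehensive record on disk for the analysis phase.
    log_path.write_text(full_output)
    # Heuristic markers (an assumption): failures, errors, warnings,
    # and pytest's final summary line ("... failed, ... passed ...").
    markers = ("FAILED", "ERROR", "WARNING", "passed", "failed", "error")
    keep: List[str] = [
        line for line in full_output.splitlines()
        if any(marker in line for marker in markers)
    ]
    return "\n".join(keep[:max_output_lines]), str(log_path)
```

The full log file stays available for deeper pattern analysis, while only the filtered slice is fed back into the token-constrained conversation.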
diff --git a/.lad/copilot_prompts/04c_test_improvement_cycles.md b/.lad/copilot_prompts/04c_test_improvement_cycles.md
index fda1fbea8..7e57db1cd 100755
--- a/.lad/copilot_prompts/04c_test_improvement_cycles.md
+++ b/.lad/copilot_prompts/04c_test_improvement_cycles.md
@@ -48,15 +48,15 @@ class ProgressTracker:
     def __init__(self):
         self.tasks: Dict[str, ImplementationTask] = {}
         self.cycles: List[PDCACycle] = []
-    
+
     def add_task(self, task: ImplementationTask) -> None:
         """Add task to progress tracking"""
         pass
-    
+
     def update_task_status(self, task_id: str, status: str) -> None:
         """Update task status with timestamp"""
         pass
-    
+
     def get_progress_summary(self) -> Dict[str, any]:
         """Generate current progress summary"""
         pass
@@ -67,20 +67,20 @@ def initialize_pdca_cycle_with_prioritized_tasks(
 ) -> Tuple[PDCACycle, ProgressTracker]:
     """
     PLAN Phase: Initialize PDCA cycle with strategic solution planning
-    
+
     Creates systematic implementation plan with TodoWrite-style tracking:
     - Priority-based task selection (P1-CRITICAL first)
     - Solution batching optimization for efficiency
     - Resource allocation and effort estimation
     - Success criteria definition with measurable outcomes
-    
+
     Args:
         implementation_context: Output from analysis framework (04b)
         cycle_number: Current PDCA cycle iteration
-    
+
     Returns:
         Tuple of (PDCACycle object, ProgressTracker instance)
-    
+
     PLAN phase implementation:
     1. Extract P1-CRITICAL and P2-HIGH tasks from context
     2. Identify compatible tasks for batching
@@ -88,7 +88,7 @@ def initialize_pdca_cycle_with_prioritized_tasks(
     4. Estimate effort and set realistic cycle scope
     5. Define success criteria and validation requirements
     6. Initialize TodoWrite progress tracking
-    
+
     Example task organization:
     P1-CRITICAL: Scientific validity + High impact/Low effort
     P2-HIGH: System reliability + Quick wins enabling other fixes
@@ -104,21 +104,21 @@ def execute_systematic_implementation_with_progress_tracking(
 ) -> Dict[str, any]:
     """
     DO Phase: Systematic implementation with real-time progress tracking
-    
+
     Executes fixes using optimized sequence and tracks progress:
     - Mark current task as in_progress before beginning work
     - Implement fixes based on root cause analysis and strategy
     - Document implementation decisions and approach
     - Update progress tracker in real-time
     - Handle dependencies and validation requirements
-    
+
     Args:
         pdca_cycle: Current PDCA cycle with selected tasks
         progress_tracker: TodoWrite-style progress tracking
-    
+
     Returns:
         Implementation results with completed tasks and metadata
-    
+
     DO phase implementation strategy:
     1. Process tasks in dependency order
     2. Mark each task in_progress before starting
@@ -130,7 +130,7 @@ def execute_systematic_implementation_with_progress_tracking(
     4. Document implementation approach and rationale
     5. Mark tasks completed only after successful implementation
     6. Handle blockers by creating new tasks or adjusting approach
-    
+
     Implementation patterns:
     Quick wins first (momentum building)
     Dependency resolution (unblock other work)
@@ -146,33 +146,33 @@ def perform_comprehensive_validation_with_regression_prevention(
 ) -> Dict[str, any]:
     """
     CHECK Phase: Comprehensive validation with regression prevention
-    
+
     Validates implementation results using systematic approach:
     - Targeted validation for affected test categories
     - Integration validation (import testing, basic functionality)
     - Regression prevention for critical systems
     - Health metrics update and comparison with baseline
-    
+
     Args:
         implementation_results: Output from DO phase execution
         pdca_cycle: Current PDCA cycle with success criteria
-    
+
     Returns:
         Comprehensive validation report with health metrics
-    
+
     CHECK phase validation protocol:
     1. Direct test validation: Run tests for implemented fixes
     2. Integration validation: Verify imports and basic functionality
     3. Regression testing: Ensure critical systems remain functional
     4. Health metrics update: Compare current vs baseline success rates
     5. Success criteria evaluation: Assess cycle objectives achievement
-    
+
     Validation levels:
     Immediate: Affected tests pass without errors
     Integration: Related modules import and function correctly
     System: Critical test categories maintain high success rates
     Baseline: Overall health metrics show improvement or stability
-    
+
     Health metrics tracking:
     - Test collection success rate
     - Category-wise success rate improvements
@@ -189,21 +189,21 @@ def generate_user_decision_framework_with_options(
 ) -> str:
     """
     ACT Phase: Generate structured user decision framework
-    
+
     Analyzes validation results and presents strategic options:
     A) Continue cycles - Implement next priority fixes
-    B) Adjust approach - Modify strategy based on findings 
+    B) Adjust approach - Modify strategy based on findings
     C) Add coverage analysis - Integrate coverage improvement
     D) Complete current level - Achieve target success threshold
-    
+
     Args:
         validation_report: Results from CHECK phase validation
         pdca_cycle: Completed PDCA cycle with results
        progress_tracker: Current progress state
-    
+
     Returns:
         Formatted decision prompt with specific recommendations
-    
+
     ACT phase decision framework:
     1. Analyze cycle completion and success metrics
     2. Assess remaining priority tasks and effort required
@@ -211,13 +211,13 @@ def generate_user_decision_framework_with_options(
     4. Present structured options with specific metrics
     5. Provide technical recommendation based on analysis
     6. Consider resource optimization for solo programmer context
-    
+
     Decision option details:
     A) CONTINUE: Next cycle focus, estimated effort, target improvement
     B) ADJUST: Strategy refinement needs, approach modifications
     C) COVERAGE: Coverage gap analysis, integration complexity
     D) COMPLETE: Achievement validation, resource optimization
-    
+
     User decision tracking:
     - Track choice patterns for preference learning
     - Optimize future decision presentations
@@ -233,26 +233,26 @@ def save_comprehensive_session_state_for_resumption(
 ) -> None:
     """
     Enhanced session state preservation for seamless resumption
-    
+
     Saves complete session state including:
     - Current PDCA cycle and phase
     - TodoWrite progress tracking state
     - Analysis findings and patterns discovered
     - Implementation decisions and approaches used
     - Critical context for next session continuation
-    
+
     Args:
         pdca_cycle: Current PDCA cycle state
         progress_tracker: TodoWrite progress tracking
        cycle_findings: Key insights and patterns discovered
-    
+
     Session state preservation:
     1. PDCA cycle progress: Which cycle, phase, tasks status
     2. TodoWrite state: All tasks with current status
     3. Key findings: Successful approaches, patterns discovered
     4. Implementation context: Decision rationale, approaches used
     5. Next session preparation: Immediate actions, context to load
-    
+
     File organization:
     - pdca_session_state.md: Comprehensive session overview
     - essential_context.md: Critical information for resumption
@@ -268,20 +268,20 @@ def integrate_coverage_analysis_with_pdca_cycles(
 ) -> Dict[str, any]:
     """
     Coverage-driven test enhancement integration (Option C)
-    
+
     Links test failures to coverage gaps for comprehensive improvement:
     - Identifies critical functions with <80% coverage
     - Maps uncovered integration points to test failure patterns
     - Prioritizes coverage improvements by impact and effort
     - Integrates coverage tasks into PDCA cycle framework
-    
+
     Args:
         current_implementation_context: Active PDCA cycle context
         coverage_focus_modules: Modules to analyze for coverage gaps
-    
+
     Returns:
         Enhanced implementation context with coverage-driven tasks
-    
+
     Coverage integration approach:
     1. Run coverage analysis for specified modules
     2. Identify critical gaps requiring test creation/improvement
@@ -289,7 +289,7 @@ def integrate_coverage_analysis_with_pdca_cycles(
     4. Prioritize coverage tasks by system criticality
     5. Integrate coverage tasks into existing PDCA framework
     6. Balance test quality fixes vs coverage enhancement
-    
+
     CoverUp-style methodology:
     - Focus on critical system components with low coverage
     - Prioritize uncovered integration points
@@ -305,20 +305,20 @@ def optimize_pdca_cycles_for_solo_programmer_efficiency(
 ) -> Dict[str, any]:
     """
     Resource optimization for solo programmer productivity
-    
+
     Optimizes PDCA cycle execution for individual developer constraints:
     - Time management and session length optimization
     - Context switching minimization through batching
     - Energy management and optimal task sequencing
     - Productivity pattern recognition and adaptation
-    
+
     Args:
         implementation_plan: Current PDCA cycle implementation plan
         resource_constraints: Developer time, energy, focus constraints
-    
+
     Returns:
         Optimized implementation plan for solo programmer efficiency
-    
+
     Solo programmer optimizations:
     1. Batch compatible fixes to minimize context switching
     2. Sequence tasks by complexity and energy requirements
@@ -326,7 +326,7 @@ def optimize_pdca_cycles_for_solo_programmer_efficiency(
     4. Prioritize high-impact/low-effort combinations
     5. Build momentum with quick wins before complex tasks
     6. Plan break timing and energy management
-    
+
     Efficiency strategies:
     - Start sessions with momentum-building quick wins
     - Group similar task types to maintain focus
@@ -418,7 +418,7 @@ save_comprehensive_session_state_for_resumption(
 
 enhanced_context = integrate_coverage_analysis_with_pdca_cycles(
     current_implementation_context,
-    ['emuses.model_registry', 'emuses.analysis', 'emuses.security']
+    ['mypackage.core', 'mypackage.utils', 'mypackage.api']
 )
 ```
 
@@ -432,4 +432,4 @@ enhanced_context = integrate_coverage_analysis_with_pdca_cycles(
 6. **Decision Framework**: Structured user decision support with metrics and recommendations
 7. **Validation Protocols**: Systematic regression prevention and health tracking
 
-This module ensures systematic, measurable improvement toward 100% meaningful test success while maintaining productivity and preventing regressions through structured PDCA cycles optimized for individual developer workflows.
\ No newline at end of file
+This module ensures systematic, measurable improvement toward 100% meaningful test success while maintaining productivity and preventing regressions through structured PDCA cycles optimized for individual developer workflows.
diff --git a/.lad/documentation_standards/MKDOCS_MATERIAL_FORMATTING_GUIDE.md b/.lad/documentation_standards/MKDOCS_MATERIAL_FORMATTING_GUIDE.md
index 67f494c34..b5f0df4ed 100755
--- a/.lad/documentation_standards/MKDOCS_MATERIAL_FORMATTING_GUIDE.md
+++ b/.lad/documentation_standards/MKDOCS_MATERIAL_FORMATTING_GUIDE.md
@@ -1,7 +1,7 @@
 # MkDocs Material Formatting Guide for Claude
 
-**Version**: 1.0 
-**Date**: 2025-08-17 
+**Version**: 1.0
+**Date**: 2025-08-17
 **Purpose**: LAD Framework documentation standards to prevent systematic markdown errors in MkDocs Material projects
 
 ---
@@ -167,7 +167,7 @@ def process_data():
 ```
 
 ```bash
-emuses analyze --input data.csv
+mycommand analyze --input data.csv
 ```
 
 ```yaml
@@ -269,5 +269,5 @@ Technical details for developers.
 
 ---
 
-*LAD Framework Documentation Standards v1.0* 
-*Research-based guidelines for error-free technical documentation*
\ No newline at end of file
+*LAD Framework Documentation Standards v1.0*
+*Research-based guidelines for error-free technical documentation*
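The aggregation step that `aggregate_test_results_across_categories` specifies — combining category-wise results into overall success rates — could take roughly this shape; the helper name and the `{'passed': …, 'failed': …}` input schema are assumptions for illustration, not the framework's required interface:

```python
from typing import Any, Dict


def aggregate_category_results(
    category_results: Dict[str, Dict[str, int]],
) -> Dict[str, Any]:
    """Combine per-category pass/fail counts into overall health metrics.

    Parameters
    ----------
    category_results : Dict[str, Dict[str, int]]
        Mapping of category name to {'passed': int, 'failed': int}.

    Returns
    -------
    Dict[str, Any]
        Overall totals plus a per-category success-rate breakdown.
    """
    total_passed = sum(r["passed"] for r in category_results.values())
    total_failed = sum(r["failed"] for r in category_results.values())
    total = total_passed + total_failed
    return {
        "total_tests": total,
        # Guard against an empty suite rather than dividing by zero.
        "overall_success_rate": total_passed / total if total else 0.0,
        "by_category": {
            name: r["passed"] / (r["passed"] + r["failed"])
            for name, r in category_results.items()
            if r["passed"] + r["failed"] > 0
        },
    }
```

A baseline comparison then reduces to diffing two such dictionaries between PDCA cycles.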