Implementation Plan
mandla-enkosi edited this page Apr 6, 2025
·
7 revisions
-
Objective:
Build the core functionality of the Code Repository Cleaner as a static, front-end–only application using React. This stage focuses on file upload, initial analysis, filtering via gitignore rules, sensitive data scanning with standard regex patterns, basic binary/media filtering, and dual export options (zip and concatenated document). A light ad integration will be embedded using static scripts.
-
Expected Result:
A functional MVP that allows users to upload a code directory, receive an initial analysis, and process files through the filtering and sensitive data scanning pipeline. The tool will output a cleaned repository as a downloadable zip archive or concatenated document. The UI will include basic progress feedback and unobtrusive ad placements.
-
Pre-Stage Requirements:
- Verified file upload capability (including directory selection using webkitdirectory).
- Integrated core libraries: React, Material UI, JSZip, Micromatch, and static ad scripts.
- Environment configuration: Node.js/npm installed, and the project scaffolded with create-vite using the react-ts template.
-
External Dependencies:
- Third-party libraries for zip generation and pattern matching.
- Ad network scripts (e.g., Google AdSense or custom internal ad code).
-
Visual Aids:
- Architecture Diagram:
flowchart TD
A[React UI Components] --> B[File Upload & Analysis Module]
B --> C[File Processing Engine]
C --> D[Gitignore Filtering Module]
C --> E[Sensitive Data Scanner]
C --> F[Media/Binary Filter]
D & E & F --> G[Initial Export List]
G --> H[File Override Panel]
H --> I[Final Export Engine]
I --> J[Output: Cleaned Zip / Concatenated Document]
A --> K[Ad Integration Module]
K --> A
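As a rough illustration of how modules D, E, and F in the diagram might feed the initial export list (G), the sketch below runs three stages over an in-memory file list. The RepoFile shape, the Stage type, and the naive rules inside each stage are assumptions made for this example; they are placeholders, not the project's actual modules.

```typescript
// Hypothetical in-memory file shape; the real app would wrap browser File objects.
interface RepoFile {
  path: string;
  content: string;
}

// A stage takes the current file list and returns the next one.
type Stage = (files: RepoFile[]) => RepoFile[];

// Placeholder gitignore stage: drops anything inside a node_modules directory.
const gitignoreStage: Stage = (files) =>
  files.filter((f) => !f.path.split("/").includes("node_modules"));

// Placeholder media/binary stage: drops a few well-known binary extensions.
const binaryExtensions = [".png", ".jpg", ".zip", ".exe"];
const mediaBinaryStage: Stage = (files) =>
  files.filter((f) => !binaryExtensions.some((ext) => f.path.toLowerCase().endsWith(ext)));

// Placeholder scanner stage: rewrites content instead of removing files.
const sensitiveStage: Stage = (files) =>
  files.map((f) => ({
    ...f,
    content: f.content.replace(/(password\s*=\s*)\S+/gi, "$1[REDACTED]"),
  }));

// Run the stages in order to build the initial export list.
function buildInitialExportList(files: RepoFile[], stages: Stage[]): RepoFile[] {
  return stages.reduce((current, stage) => stage(current), files);
}

const initialExportList = buildInitialExportList(
  [
    { path: "src/app.ts", content: "const password = hunter2" },
    { path: "node_modules/lib/index.js", content: "" },
    { path: "assets/logo.png", content: "" },
  ],
  [gitignoreStage, mediaBinaryStage, sensitiveStage],
);
```

Keeping every stage behind the same function type means the File Override Panel can later operate on one list, regardless of which stage removed or rewrote a file.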
-
Modules/Functionalities/Files:
-
Components: (Leveraging core logic and state management from custom hooks in /src/hooks)
- FileUploader.tsx (file/directory upload)
- AnalysisDashboard.tsx (initial analysis results)
- ConfigurationPanel.tsx (filtering and export options)
- FileOverridePanel.tsx (hierarchical view of all files with toggles/checkboxes for user selection)
- ProgressIndicator.tsx (processing status)
- ExportOptions.tsx (export controls)
- AdBanner.tsx (ad placement)
-
Hooks:
- /hooks/useFileProcessing.ts: Manages the state machine for a single cleaning job (status, files, configuration used, overrides, results) and orchestrates worker interaction via useWorkerManager.
- /hooks/useWorkerManager.ts: Manages the Web Worker lifecycle and provides a typed interface for communication (sending tasks, receiving messages).
- /hooks/useConfiguration.ts: Manages global/persistent user settings (e.g., preferred export format).
-
Modules:
- /modules/fileProcessing.ts (orchestration of processing tasks)
- /modules/gitignoreFilter.ts (applies gitignore rules)
- /modules/sensitiveScanner.ts (scans and obfuscates sensitive data)
- /modules/mediaBinaryFilter.ts (filters binaries, caches, media files)
- /modules/exportEngine.ts (generates zip or concatenated output)
-
Web Workers:
- /workers/processor.worker.ts (offloads heavy processing)
-
Utilities:
- /utils/api.ts (ad integration helpers)
- /utils/storage.ts (IndexedDB/local storage utilities)
-
Entry Points & Styles:
- App.tsx, main.tsx, styles/Theme.ts, and styles/index.css
-
Continuous Integration (CI):
- .github/workflows/ci.yml for GitHub Actions performing linting, testing (vitest --run), and building
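The state machine that useFileProcessing is described as managing could be modelled as a plain reducer, which keeps the transitions testable without rendering anything. The status names, event names, and the JobState shape below are assumptions for illustration only; the plan itself does not fix them.

```typescript
// Hypothetical status values for a single cleaning job.
type JobStatus = "idle" | "processing" | "reviewing" | "exporting" | "done" | "error";

interface JobState {
  status: JobStatus;
  progress: number; // fraction between 0 and 1
  overrides: Record<string, boolean>; // path -> include in export?
  error?: string;
}

// Hypothetical events that move the job between states.
type JobEvent =
  | { type: "START" }
  | { type: "PROGRESS"; value: number }
  | { type: "PROCESSED" }
  | { type: "TOGGLE_FILE"; path: string; include: boolean }
  | { type: "EXPORT" }
  | { type: "EXPORTED" }
  | { type: "FAIL"; message: string };

const initialJobState: JobState = { status: "idle", progress: 0, overrides: {} };

// Events that are not valid for the current status leave the state unchanged.
function jobReducer(state: JobState, event: JobEvent): JobState {
  switch (event.type) {
    case "START":
      return { ...initialJobState, status: "processing" };
    case "PROGRESS":
      return state.status === "processing" ? { ...state, progress: event.value } : state;
    case "PROCESSED":
      return state.status === "processing" ? { ...state, status: "reviewing", progress: 1 } : state;
    case "TOGGLE_FILE":
      return state.status === "reviewing"
        ? { ...state, overrides: { ...state.overrides, [event.path]: event.include } }
        : state;
    case "EXPORT":
      return state.status === "reviewing" ? { ...state, status: "exporting" } : state;
    case "EXPORTED":
      return state.status === "exporting" ? { ...state, status: "done" } : state;
    case "FAIL":
      return { ...state, status: "error", error: event.message };
  }
}

// Example run: upload, process, exclude one file manually, then export.
const events: JobEvent[] = [
  { type: "START" },
  { type: "PROGRESS", value: 0.5 },
  { type: "PROCESSED" },
  { type: "TOGGLE_FILE", path: "secrets.env", include: false },
  { type: "EXPORT" },
  { type: "EXPORTED" },
];
const finalJobState = events.reduce(jobReducer, initialJobState);
```

A reducer of this shape could be handed to React's useReducer inside the hook, with worker messages translated into events by useWorkerManager.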
-
Interface Definitions:
-
File Processing:
- processFiles(files: File[]): Promise<ProcessedData>, where ProcessedData might be { filesToInclude: ProcessedFile[], removedFileCount: number, sensitiveDataFound: boolean }
- applyGitignoreRules(files: File[], rules: string[]): File[]
- overrideFileSelection(files: File[], selections: Record<string, boolean>): File[]
-
Sensitive Scanner:
- scanSensitiveData(content: string): string
-
Export Engine:
- generateZip(processedData: ProcessedData): Blob
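Two of the signatures above are small enough to sketch directly. The version below is a minimal illustration under stated assumptions: the two redaction patterns are examples rather than a vetted rule set, and CandidateFile (with its includedByDefault flag) is a hypothetical stand-in for the browser File objects named in the interface, keyed by relative path.

```typescript
// Illustrative redaction rules only; a real scanner needs a broader, reviewed set.
interface Redaction {
  pattern: RegExp;
  replacement: string;
}

const REDACTIONS: Redaction[] = [
  // Shape of an AWS access key ID: "AKIA" followed by 16 upper-case letters or digits.
  { pattern: /AKIA[0-9A-Z]{16}/g, replacement: "[REDACTED_AWS_KEY]" },
  // Simple key/value assignments such as "password = ..." or "api_key: ...".
  { pattern: /((?:password|secret|api_key)\s*[:=]\s*)\S+/gi, replacement: "$1[REDACTED]" },
];

// Returns the content with every match of every rule replaced.
function scanSensitiveData(content: string): string {
  return REDACTIONS.reduce(
    (text, { pattern, replacement }) => text.replace(pattern, replacement),
    content,
  );
}

// Hypothetical file record: the automatic pipeline's decision plus the path used as key.
interface CandidateFile {
  path: string;
  includedByDefault: boolean;
}

// A manual selection wins; files without one keep the automatic decision.
function overrideFileSelection(
  files: CandidateFile[],
  selections: Record<string, boolean>,
): CandidateFile[] {
  return files.filter((file) => selections[file.path] ?? file.includedByDefault);
}
```

For example, `overrideFileSelection(files, { "dist/app.js": true })` would re-include a build artifact that the automatic filters had dropped, while leaving every other decision untouched.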
-
Local State & Error Handling:
- Define state for file upload status, processing progress, and export readiness.
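- Errors such as FileUploadError, ProcessingError, and ExportError should trigger UI error messages through React Error Boundaries. Because Error Boundaries only catch errors thrown during rendering, errors raised in asynchronous processing or event handlers must first be captured in state and surfaced from there.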
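One possible way to turn these error types into messages held in state is sketched below. The CleanerError base class, the message wording, and the runStep helper are assumptions introduced for this example.

```typescript
// Hypothetical shared base class for the three error types named in the plan.
abstract class CleanerError extends Error {
  constructor(message: string) {
    super(message);
    // Keeps instanceof checks working even if the build targets ES5.
    Object.setPrototypeOf(this, new.target.prototype);
    this.name = new.target.name;
  }
}

class FileUploadError extends CleanerError {}
class ProcessingError extends CleanerError {}
class ExportError extends CleanerError {}

// Map any thrown value to a message that is safe to show in the UI.
function toUserMessage(error: unknown): string {
  if (error instanceof FileUploadError) return `Upload failed: ${error.message}`;
  if (error instanceof ProcessingError) return `Processing failed: ${error.message}`;
  if (error instanceof ExportError) return `Export failed: ${error.message}`;
  return "Something went wrong. Please try again.";
}

// Result type a hook could store in state instead of letting the error escape.
type StepResult<T> = { ok: true; value: T } | { ok: false; message: string };

// Run one step and capture its failure as data.
function runStep<T>(step: () => T): StepResult<T> {
  try {
    return { ok: true, value: step() };
  } catch (error) {
    return { ok: false, message: toUserMessage(error) };
  }
}
```

A component can then render `result.message` directly, or rethrow during render if a boundary should take over.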
Tests will be written and executed using vitest.
Unit tests will be implemented in files named *.test.ts(x) co-located with their corresponding source files in /src.
Component and hook tests will utilize @testing-library/react.
-
Unit Tests:
- Test file upload parsing and validation.
- Test gitignore rule application with multiple scenarios.
- Validate sensitive data scanning with known input strings.
- Verify export engine outputs (zip file and concatenated document) using simulated processed data.
- Test FileOverridePanel component to ensure that manual toggling correctly updates the selection state.
- Test the new overrideFileSelection function with various scenarios (e.g., all files toggled off, mixed selections).
- Test the custom React hooks (useFileProcessing, useWorkerManager, useConfiguration) independently using @testing-library/react's renderHook to verify their state transitions, logic, and interactions (mocking worker communication when testing useFileProcessing, and mocking hook interactions when testing components).
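The gitignore scenarios mentioned in this list could be written as a table of paths run against one rule set. The matcher below is a deliberately simplified stand-in, written so the example has no dependencies; the plan calls for Micromatch, and this sketch does not reproduce its API. It handles only `*`, `**`, leading-slash anchoring, and trailing-slash directory rules; negation (`!`), `?`, and character classes are not supported, and PathEntry is a hypothetical file shape.

```typescript
// Hypothetical minimal file shape; only the relative path matters here.
interface PathEntry {
  path: string;
}

// Convert one simplified gitignore rule into a RegExp over "/"-separated paths.
function ruleToRegExp(rule: string): RegExp {
  // Rules with a leading or inner slash are matched from the repository root.
  const anchored = rule.startsWith("/") || rule.replace(/\/$/, "").includes("/");
  const directoryOnly = rule.endsWith("/");
  const body = rule.replace(/^\//, "").replace(/\/$/, "");
  const source = body
    .replace(/[.+^${}()|[\]\\?]/g, "\\$&") // escape regex metacharacters except "*"
    .replace(/\*\*\//g, "\u0000") // "**/" spans zero or more directories
    .replace(/\*\*/g, "\u0001") // bare "**" spans anything
    .replace(/\*/g, "[^/]*") // "*" stays within one path segment
    .replace(/\u0000/g, "(?:.*/)?")
    .replace(/\u0001/g, ".*");
  const prefix = anchored ? "^" : "^(?:.*/)?";
  // A matched directory also excludes everything beneath it.
  const suffix = directoryOnly ? "/.*$" : "(?:/.*)?$";
  return new RegExp(prefix + source + suffix);
}

// Drop every file whose path matches at least one rule.
function applyGitignoreRules<T extends PathEntry>(files: T[], rules: string[]): T[] {
  const matchers = rules
    .map((rule) => rule.trim())
    // Skip blanks, comments, and (unsupported) negation rules.
    .filter((rule) => rule !== "" && !rule.startsWith("#") && !rule.startsWith("!"))
    .map(ruleToRegExp);
  return files.filter((file) => !matchers.some((matcher) => matcher.test(file.path)));
}

const sampleRules = ["# dependencies", "node_modules/", "*.log", "/build", "docs/**/*.tmp"];
const samplePaths = [
  "src/index.ts",
  "node_modules/react/index.js",
  "packages/a/node_modules/x.js",
  "logs/app.log",
  "build/out.js",
  "src/build/tool.ts",
  "docs/api/draft.tmp",
];
const keptPaths = applyGitignoreRules(
  samplePaths.map((path) => ({ path })),
  sampleRules,
).map((file) => file.path);
```

With these rules, only src/index.ts and src/build/tool.ts are kept: the `/build` rule is anchored to the root, so the nested src/build directory is not affected.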
-
Integration Tests:
- Simulate a complete user flow from file upload to export.
- Use React Testing Library to ensure component interactions (upload → configuration → processing → export) work as expected.
- Validate the end-to-end flow: Upload → Automatic Processing → Manual Override → Export.
- Ensure that the final output respects both the automatic filtering and the manual overrides.
- Use React Testing Library (@testing-library/react) with vitest to ensure component interactions work as expected. Component tests should focus on rendering and user interaction, often mocking the custom hooks they consume (useFileProcessing, etc.) to isolate component logic, especially for hooks managing complex state or side effects.
-
Edge Cases & Negative Testing:
- Test handling of empty directories, extremely large files, and unsupported file types.
- Simulate failure scenarios (e.g., file read errors) and ensure graceful degradation.
-
Internal Module Integration:
- Confirm that the file upload module correctly passes files to the processing engine.
- Validate that outputs from gitignore filtering, sensitive scanning, and media filtering combine seamlessly into the export engine.
-
Interface Integration:
- Ensure that the Export Engine generates correct outputs from processed data.
- Verify that the Ad Integration module loads static ad scripts without affecting core functionality.
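For the concatenated-document output of the Export Engine, a check of "correct outputs" needs a fixed format to compare against. The sketch below assumes one such format (a header line per file, files sorted by path); the separator style, the ExportFile shape, and the function name generateConcatenatedDocument are all assumptions, since the plan only names generateZip.

```typescript
// Hypothetical shape of a file that survived filtering and scanning.
interface ExportFile {
  path: string;
  content: string;
}

// Join all files into one text document, each preceded by a header naming its path.
function generateConcatenatedDocument(files: ExportFile[]): string {
  return [...files]
    // Sort by path so the output is stable regardless of upload order.
    .sort((a, b) => (a.path < b.path ? -1 : a.path > b.path ? 1 : 0))
    // Normalise trailing whitespace so every section ends with exactly one newline.
    .map((file) => `===== ${file.path} =====\n${file.content.replace(/\s+$/, "")}\n`)
    .join("\n");
}

const concatenated = generateConcatenatedDocument([
  { path: "b.ts", content: "B\n" },
  { path: "a.ts", content: "A" },
]);
```

Because the output is deterministic, a unit test can compare it against a literal string instead of parsing it.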
-
Internal Documentation:
- Inline code comments and module-level documentation.
- Developer guide covering module interactions and state/error management.
- Documentation detailing the purpose, state managed, parameters, return values, key interactions (e.g., with workers or other hooks), and usage context for the core custom hooks (useFileProcessing, useWorkerManager, useConfiguration).
-
External Developer Guides:
- User manual explaining how to use the tool.
- Quick-start guide for local project setup.
-
Change Logs & Versioning:
- Maintain a CHANGELOG.md file to document feature additions, fixes, and version releases.
-
Key Considerations:
- Implement chunked processing using Web Workers to maintain UI responsiveness.
- Ensure the tool is compatible with all major browsers.
- Keep ad integration unobtrusive to the core user experience.
- Component tests (using Vitest and React Testing Library) will interact with MUI components; they should query those components by accessible role or label and must avoid testing MUI's internal implementation details.
-
Risks & Mitigation:
- Large repositories may strain browser memory; mitigate with chunked processing and fallback messages.
- Browser inconsistencies with the File API—provide clear user guidance and alternative upload methods.
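- Increased bundle size from Material UI should be mitigated, for example by importing only the components that are actually used so that unused code can be tree-shaken.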
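One way to bound memory during chunked processing is to batch files by cumulative byte size rather than by count, so that a few large files do not end up in the same batch. The generator below is a sketch under that assumption; SizedFile and the maxBytes budget are hypothetical names, and a browser File already exposes a comparable size property.

```typescript
// Hypothetical minimal shape; a browser File has a size property in bytes as well.
interface SizedFile {
  path: string;
  size: number;
}

// Yield batches whose total size stays within maxBytes.
// A single file larger than the budget is emitted alone rather than dropped.
function* chunkBySize<T extends SizedFile>(
  files: T[],
  maxBytes: number,
): Generator<T[], void, undefined> {
  let batch: T[] = [];
  let batchBytes = 0;
  for (const file of files) {
    if (batch.length > 0 && batchBytes + file.size > maxBytes) {
      yield batch;
      batch = [];
      batchBytes = 0;
    }
    batch.push(file);
    batchBytes += file.size;
  }
  if (batch.length > 0) yield batch;
}

const batches = Array.from(
  chunkBySize(
    [
      { path: "a", size: 40 },
      { path: "b", size: 50 },
      { path: "c", size: 30 },
      { path: "d", size: 120 },
      { path: "e", size: 10 },
    ],
    100,
  ),
);
```

Since the generator is lazy, the worker only needs to hold the contents of one batch in memory at a time, and an oversized file is a natural place to show the fallback message.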
-
Other Notes:
- Prioritize privacy by ensuring all processing is client-side.
- Maintain a clean separation between core functionality and ad content.
- CI Pipeline: GitHub Actions will automate checks (lint, test, build) via npm ci, npm run lint, npm run test -- --run, and npm run build.
-
Objective:
Enhance the MVP by improving performance and user feedback, and by introducing basic customization options. This stage will refine the Web Worker integration, add cancelable and chunked processing, and expand configuration settings. Additionally, the ad integration module will be refined based on initial user feedback.
-
Expected Result:
- Improved responsiveness and memory management for processing large codebases.
- Detailed progress reporting and the ability to cancel ongoing operations.
- Enhanced configuration options for users.
- Refined ad module capable of dynamic switching between internal and external ads.
-
Pre-Stage Requirements:
- Successfully deployed and validated MVP functionalities.
- Verified file upload, processing, and export operations.
- Updated library versions as necessary for enhanced processing (e.g., any improved Web Worker libraries).
- Continued use of static site hosting for seamless updates.
-
External Dependencies:
- Continued use of React, JSZip, and Micromatch.
- Optionally integrate IndexedDB utilities for enhanced storage management.
-
Visual Aids:
- Enhanced Workflow Diagram:
flowchart LR
A[Enhanced React UI] --> B[Advanced File Upload & Analysis]
B --> C[Optimized File Processing Engine]
C --> D[Improved Gitignore & Sensitive Scanner Modules]
C --> E[Enhanced Export Engine]
D & E --> F[Refined Output Delivery]
A --> G[Enhanced Ad Integration Module]
-
Modules/Functionalities/Files:
- Refine existing React components: Update ProgressIndicator.tsx for detailed metrics.
- Refactor /modules/fileProcessing.ts for cancelable, chunked processing with Web Worker integration.
- Enhance /workers/processor.worker.ts to support cancelable tasks.
- Extend /utils/storage.ts for potential IndexedDB integration.
- Update ConfigurationPanel.tsx to include additional customization options.
- Modify AdBanner.tsx for dynamic ad switching and performance tracking.
- Refactor /hooks/useFileProcessing.ts and /hooks/useWorkerManager.ts to integrate cancelable operations and handle detailed progress reporting from the worker.
- Extend /hooks/useConfiguration.ts or /hooks/useFileProcessing.ts to manage state for basic custom sensitive data patterns.
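The cancelable tasks and detailed progress reporting described above imply a small message protocol between the hooks and the worker. A typed version of it might look like the sketch below; every message name and field is an assumption, as is the ProgressView shape consumed by the UI.

```typescript
// Messages the main thread might post to the worker (names are assumptions).
type WorkerRequest =
  | { type: "start"; jobId: string; paths: string[] }
  | { type: "cancel"; jobId: string };

// Messages the worker might post back.
type WorkerResponse =
  | { type: "progress"; jobId: string; processed: number; total: number }
  | { type: "done"; jobId: string; keptPaths: string[] }
  | { type: "cancelled"; jobId: string }
  | { type: "error"; jobId: string; message: string };

// Build the request that asks the worker to stop a job.
function cancelRequest(jobId: string): WorkerRequest {
  return { type: "cancel", jobId };
}

// What a progress indicator needs in order to render one update.
interface ProgressView {
  label: string;
  percent: number;
  finished: boolean;
}

// Translate a worker response into UI-facing state; the switch is exhaustive,
// so adding a new response type without handling it is a compile error.
function describeResponse(response: WorkerResponse): ProgressView {
  switch (response.type) {
    case "progress": {
      const percent =
        response.total === 0 ? 100 : Math.round((response.processed / response.total) * 100);
      return {
        label: `Processed ${response.processed} of ${response.total} files`,
        percent,
        finished: false,
      };
    }
    case "done":
      return { label: `Kept ${response.keptPaths.length} files`, percent: 100, finished: true };
    case "cancelled":
      return { label: "Cancelled", percent: 0, finished: true };
    case "error":
      return { label: `Failed: ${response.message}`, percent: 0, finished: true };
  }
}
```

Carrying a jobId on every message lets the hook discard late responses from a job that was already cancelled or replaced.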
-
Interface Definitions:
- Extend functions with additional parameters for progress callbacks: processFiles(files: File[], onProgress: (progress: number) => void): Promise<ProcessedData>
- Define new interfaces for enhanced configuration settings.
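A sketch of how the extended processFiles signature might combine chunking, progress callbacks, and cooperative cancellation follows. The CancelToken object, the chunkSize parameter, the InputFile shape, and the placeholder filtering rules are assumptions added for this example; only the first two parameters and the return type come from the interface above.

```typescript
// Hypothetical stand-in for the browser File objects in the real signature.
interface InputFile {
  path: string;
  content: string;
}

interface ProcessedData {
  filesToInclude: InputFile[];
  removedFileCount: number;
  sensitiveDataFound: boolean;
}

// Hypothetical cancellation flag shared between the caller and the processing loop.
interface CancelToken {
  cancelled: boolean;
}

// Give pending events (such as a cancel message) a chance to run between chunks.
const yieldToEventLoop = (): Promise<void> =>
  new Promise<void>((resolve) => {
    setTimeout(resolve, 0);
  });

async function processFiles(
  files: InputFile[],
  onProgress: (progress: number) => void,
  token: CancelToken = { cancelled: false },
  chunkSize = 50,
): Promise<ProcessedData> {
  const filesToInclude: InputFile[] = [];
  let sensitiveDataFound = false;
  for (let start = 0; start < files.length; start += chunkSize) {
    // Cancellation is only observed at chunk boundaries.
    if (token.cancelled) throw new Error("Processing cancelled");
    for (const file of files.slice(start, start + chunkSize)) {
      if (file.path.endsWith(".png")) continue; // placeholder for the real filters
      if (/password\s*=/i.test(file.content)) sensitiveDataFound = true;
      filesToInclude.push(file);
    }
    onProgress(Math.min(1, (start + chunkSize) / files.length));
    await yieldToEventLoop();
  }
  return {
    filesToInclude,
    removedFileCount: files.length - filesToInclude.length,
    sensitiveDataFound,
  };
}
```

Smaller chunks make cancellation feel more immediate at the cost of more scheduling overhead, so the chunk size is a tuning knob rather than a fixed value.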
-
Local State & Error Handling:
- Introduce additional state variables for Web Worker status and user configuration.
- Update error handling to capture and report Web Worker termination and configuration errors.
-
Unit Tests:
- Extend file processing tests to cover chunking and cancellation.
- Test new configuration options to ensure they correctly influence processing behavior.
-
Integration Tests:
- Run end-to-end tests simulating long-running processes and cancellation.
- Validate that enhanced progress reporting is accurately reflected in the UI.
-
Edge Cases & Negative Testing:
- Simulate interruptions and verify that cancellation recovers gracefully.
- Stress-test large repositories to ensure improved memory management.
-
Internal Module Integration:
- Validate Web Worker integration: ensure that the enhanced file processing updates the UI responsively.
- Confirm that enhanced configuration options integrate properly with the processing engine.
-
Interface Integration:
- Verify that new progress callbacks and error handlers are correctly consumed across modules.
- Ensure compatibility with the MVP’s export outputs.
-
Internal Documentation:
- Update developer guides with enhanced Web Worker and configuration details.
- Document new parameters and state management updates inline.
-
External Developer Guides:
- Revise user guides to include instructions for the enhanced configuration and progress features.
-
Change Logs & Versioning:
- Update CHANGELOG.md with details of performance and UI enhancements.
-
Key Considerations:
- Focus on maintaining a responsive UI even under heavy processing loads.
- Ensure enhancements do not compromise cross-browser compatibility.
-
Risks & Mitigation:
- Complexity of cancelable operations may introduce bugs—extensive testing and fallback strategies are essential.
- Customization features could overwhelm users; use defaults and clear UI guidance.
-
Other Notes:
- Continue to monitor the impact of ad integration as performance is optimized.
-
Objective:
Expand the application with advanced features such as customizable sensitive data detection, selective file retention, and a minimal JavaScript API for integration with development tools and CI/CD pipelines.
-
Expected Result:
- Advanced configuration options that allow users to fine-tune sensitive data scanning.
- Selective retention controls for overriding default exclusion rules.
- A minimal API that exposes key functionality for programmatic access.
- Further export optimizations including LLM-specific formatting (e.g., token count estimation).
-
Pre-Stage Requirements:
- Stable MVP and Enhancement stages in production.
- Positive user feedback and analytics supporting demand for advanced options.
- Updated dependencies to support additional API integrations.
- Environment: Local server setup may be required for API simulation.
-
External Dependencies:
- Additional libraries if necessary for API creation or advanced processing.
-
Visual Aids:
- Future Integration Diagram:
flowchart LR
A[React UI with Advanced Options] --> B[Customizable Sensitive Data Module]
B --> C[Selective Retention Module]
C --> D[Advanced Export Engine]
A --> E[JavaScript API Integration]
-
Modules/Functionalities/Files:
- Add new React component AdvancedConfigPanel.tsx for advanced settings.
- Extend /modules/sensitiveScanner.ts to support custom pattern input.
- Create new module(s) for selective file retention logic.
- Create a new directory /api containing modules that expose a minimal JavaScript API.
- Update /modules/exportEngine.ts for LLM-specific formatting options.
- Extend /hooks/useConfiguration.ts or /hooks/useFileProcessing.ts to manage state for advanced configuration options (custom sensitive patterns, LLM formatting choices).
- Note: Selective retention logic is managed within useFileProcessing as part of the override state established in MVP.
- Note: The JavaScript API in /api will likely interact with the core processing modules directly, not necessarily requiring new UI-focused hooks.
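The LLM-specific formatting planned for the export engine could pair a delimited document with a rough token estimate. The sketch below assumes the commonly cited rule of thumb of about four characters per token for English text; real tokenizers differ by model and by content, so the figure is only an approximation. The file wrapper format and all names here are hypothetical.

```typescript
// Hypothetical shape of a file ready for export.
interface LlmFile {
  path: string;
  content: string;
}

// Rule-of-thumb ratio; an assumption, not a property of any specific tokenizer.
const CHARS_PER_TOKEN = 4;

// Estimate the token count from the character count, rounding up.
function estimateTokens(text: string): number {
  return Math.ceil(text.length / CHARS_PER_TOKEN);
}

interface LlmExport {
  document: string;
  estimatedTokens: number;
}

// Wrap each file in a tag naming its path and report the estimated size.
function formatForLlm(files: LlmFile[]): LlmExport {
  const document = files
    .map((file) => `<file path="${file.path}">\n${file.content}\n</file>`)
    .join("\n");
  return { document, estimatedTokens: estimateTokens(document) };
}
```

Showing the estimate next to the export button would let users see whether a cleaned repository is likely to fit a given context window before downloading it.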
-
Interface Definitions:
- New functions for customizable scanning: setCustomPatterns(patterns: Pattern[]): void and scanWithCustomPatterns(content: string, patterns?: Pattern[]): string
- Define interfaces for the exposed JavaScript API (e.g., cleanRepository(config: Config): ProcessedData).
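The two scanning functions could be sketched as follows. The Pattern shape (a name, a RegExp, and an optional replacement) is an assumption, since the plan names the type without defining it, and the module-level customPatterns store is one possible design among several.

```typescript
// Hypothetical definition of the Pattern type named in the interface above.
interface Pattern {
  name: string;
  regex: RegExp;
  replacement?: string;
}

// Module-level store for user-defined patterns.
let customPatterns: Pattern[] = [];

function setCustomPatterns(patterns: Pattern[]): void {
  // Copy so later mutations of the caller's array do not leak in.
  customPatterns = [...patterns];
}

// Ensure every occurrence is replaced, even if the user omitted the "g" flag.
function withGlobalFlag(regex: RegExp): RegExp {
  return regex.flags.includes("g") ? regex : new RegExp(regex.source, regex.flags + "g");
}

// Explicit patterns take precedence; otherwise the stored custom patterns apply.
function scanWithCustomPatterns(content: string, patterns?: Pattern[]): string {
  const active = patterns ?? customPatterns;
  return active.reduce(
    (text, p) => text.replace(withGlobalFlag(p.regex), p.replacement ?? `[REDACTED:${p.name}]`),
    content,
  );
}
```

For example, after `setCustomPatterns([{ name: "ticket", regex: /JIRA-\d+/ }])`, every ticket reference in scanned content is replaced with `[REDACTED:ticket]`.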
-
Local State & Error Handling:
- Extend state management to store advanced configuration settings.
- Expand error handling for new modules and API integration failures.
-
Unit Tests:
- Develop tests for new advanced sensitive data and selective retention modules.
- Test new API functions with various configurations.
-
Integration Tests:
- Validate end-to-end processing with advanced options enabled.
- Ensure the JavaScript API functions behave as documented.
-
Edge Cases & Negative Testing:
- Test scenarios with conflicting advanced configuration options.
- Simulate API call failures and ensure graceful degradation.
-
Internal Module Integration:
- Confirm that new advanced modules integrate without breaking core functionality.
-
Interface Integration:
- Validate that the API interfaces work as expected when called from external tools.
- Ensure backwards compatibility with outputs from previous stages.
-
Internal Documentation:
- Update developer documentation with details on advanced module contracts and API usage.
-
External Developer Guides:
- Create comprehensive API documentation and user guides for advanced features.
-
Change Logs & Versioning:
- Document all new functionalities and integrations in the versioned changelog.
-
Key Considerations:
- Maintain the core simplicity and performance even with advanced features.
- Provide clear default configurations to avoid overwhelming non-technical users.
-
Risks & Mitigation:
- Increased complexity may lead to integration challenges—extensive testing is required.
- Customization options might confuse some users; provide comprehensive documentation and sensible defaults.
-
Other Notes:
- Regularly review user feedback to decide which advanced features to prioritize.
- Monitor API usage and performance analytics to guide future iterations.