Dynamic Page Monitor - Codebase Overview
Architecture Overview
This Chrome extension monitors user-defined web pages for content changes and provides tools for analyzing those changes in detail. The codebase has moved from Chrome's built-in extension storage (chrome.storage.local) to SQLite for better data management.
Core Components
1. Manifest & Entry Point
- manifest.json: Defines extension permissions, background service worker, and web-accessible resources
- Version 2.0 with comprehensive permissions for storage, tabs, scripting, notifications, and alarms
2. Background Service Worker
- background.js: Main orchestrator handling page monitoring, content sanitization, and change detection
- sqlite_background.js: Enhanced version of the background worker that uses a SQLite database (appears to be the newer implementation)
- database.js: SQLite database module for structured data storage
3. Content Processing Engine
Intelligent Content Sanitization
// Key features of the sanitization system:
- Removes dynamic elements (ads, navigation, timestamps)
- Extracts meaningful content (headings, paragraphs, tables)
- Configurable inclusion/exclusion rules
- Priority-based content extraction
- Normalized text processing for consistent comparison
Change Detection Algorithm
- Hash-based comparison for efficiency
- Word-level change tracking (words added/removed)
- Percentage-based change calculation
- Historical comparison support
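The change detection steps above can be sketched roughly as follows. This is a minimal illustration, not the code in background.js: the function names (normalizeText, hashText, detectChange) are hypothetical, the FNV-1a-style hash is a stand-in for whatever hash the extension actually uses, and the percentage is assumed to be changed words relative to the old word count.

```javascript
// Hypothetical helpers; names and formulas are assumptions for illustration.

// Collapse whitespace and lowercase so cosmetic differences do not count as changes.
function normalizeText(text) {
  return text.replace(/\s+/g, " ").trim().toLowerCase();
}

// FNV-1a-style 32-bit hash over UTF-16 code units (stand-in for the real hash).
function hashText(text) {
  let hash = 0x811c9dc5;
  for (let i = 0; i < text.length; i++) {
    hash ^= text.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193) >>> 0;
  }
  return hash.toString(16).padStart(8, "0");
}

// Count how often each word occurs in already-normalized text.
function wordCounts(text) {
  const counts = new Map();
  for (const word of text.split(" ").filter(Boolean)) {
    counts.set(word, (counts.get(word) || 0) + 1);
  }
  return counts;
}

function detectChange(oldRaw, newRaw) {
  const oldText = normalizeText(oldRaw);
  const newText = normalizeText(newRaw);

  // Cheap path: equal hashes are treated as "no change" without a word diff.
  if (hashText(oldText) === hashText(newText)) {
    return { changed: false, wordsAdded: 0, wordsRemoved: 0, changePercent: 0 };
  }

  const before = wordCounts(oldText);
  const after = wordCounts(newText);
  let wordsAdded = 0;
  let wordsRemoved = 0;
  for (const [word, n] of after) wordsAdded += Math.max(0, n - (before.get(word) || 0));
  for (const [word, n] of before) wordsRemoved += Math.max(0, n - (after.get(word) || 0));

  // Assumed definition: changed words relative to the old word count, one decimal place.
  const oldTotal = Math.max(1, [...before.values()].reduce((a, b) => a + b, 0));
  const changePercent = Math.round(((wordsAdded + wordsRemoved) / oldTotal) * 1000) / 10;
  return { changed: true, wordsAdded, wordsRemoved, changePercent };
}
```

Comparing hashes first keeps the common "nothing changed" case cheap. The word counts are multisets, so reordered text with the same words would report zero added and removed words even though the hash differs.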
4. User Interface Components
Main Popup (popup.html/js)
- URL configuration and management
- Manual check triggers
- Recent changes display
- Quick access to export and reports
Data Export System (data-export.html/js)
- Comprehensive data export (JSON/CSV)
- Statistics dashboard
- Data cleanup utilities
- Storage management
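As a rough illustration of the JSON/CSV export, the sketch below serializes an array of change records. The record fields (url, changePercent, note) and the function names are assumptions made for the example, not the shapes used in data-export.js.

```javascript
// Hypothetical export helpers; record fields are illustrative only.

// Quote a value in the RFC 4180 style: wrap it in double quotes when it
// contains a comma, a quote, or a line break, and double any embedded quotes.
function csvEscape(value) {
  const text = value === null || value === undefined ? "" : String(value);
  return /[",\r\n]/.test(text) ? '"' + text.replace(/"/g, '""') + '"' : text;
}

// Build a CSV document with a header row followed by one row per record.
function toCsv(records, columns) {
  const header = columns.map(csvEscape).join(",");
  const rows = records.map((record) =>
    columns.map((column) => csvEscape(record[column])).join(",")
  );
  return [header, ...rows].join("\r\n");
}

// Pretty-printed JSON export of the same records.
function toJson(records) {
  return JSON.stringify(records, null, 2);
}
```

Escaping matters here because monitored URLs and page content routinely contain commas and quotes, which would otherwise shift columns when the file is opened in a spreadsheet.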
Search & Reports (search-reports.html/js)
- Advanced content search with regex support
- Historical data analysis
- Filter by URL, date range, content type
- Export search results and generate reports
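The search filters listed above could be combined along these lines. This is a sketch under assumptions: the snapshot shape ({ url, capturedAt, content }) and the searchSnapshots function are hypothetical and may not match search-reports.js.

```javascript
// Hypothetical search over in-memory snapshots; field names are assumptions.
function searchSnapshots(snapshots, { pattern, flags = "i", url, from, to } = {}) {
  let regex = null;
  if (pattern) {
    try {
      regex = new RegExp(pattern, flags);
    } catch (err) {
      // Surface invalid user-supplied patterns instead of failing silently.
      throw new Error("Invalid search pattern: " + err.message);
    }
  }

  return snapshots.filter((snapshot) => {
    if (url && snapshot.url !== url) return false;
    const time = new Date(snapshot.capturedAt).getTime();
    if (from && time < new Date(from).getTime()) return false;
    if (to && time > new Date(to).getTime()) return false;
    if (!regex) return true;
    regex.lastIndex = 0; // keep results stable if a "g" or "y" flag was passed
    return regex.test(snapshot.content);
  });
}
```

A call such as `searchSnapshots(all, { pattern: "price:\\s*\\d+", from: "2024-01-05T00:00:00Z" })` would return only snapshots captured on or after that date whose content matches the pattern.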
Diff Viewer (diff.html/js)
- Side-by-side content comparison
- Highlighted changes (additions/removals)
- Raw HTML vs processed content views
- Context-aware difference display
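One common way to produce the additions and removals shown in a diff view is a line-level diff based on the longest common subsequence (LCS). The sketch below uses that technique as an assumption; the source does not say which algorithm diff.js uses, and diffLines is a hypothetical name.

```javascript
// Hypothetical line diff using a longest-common-subsequence table.
function diffLines(oldText, newText) {
  const a = oldText.split("\n");
  const b = newText.split("\n");

  // lcs[i][j] holds the LCS length of a[i..] and b[j..].
  const lcs = Array.from({ length: a.length + 1 }, () => new Array(b.length + 1).fill(0));
  for (let i = a.length - 1; i >= 0; i--) {
    for (let j = b.length - 1; j >= 0; j--) {
      lcs[i][j] = a[i] === b[j]
        ? lcs[i + 1][j + 1] + 1
        : Math.max(lcs[i + 1][j], lcs[i][j + 1]);
    }
  }

  // Walk the table from the start, emitting unchanged, removed, and added lines.
  const result = [];
  let i = 0;
  let j = 0;
  while (i < a.length && j < b.length) {
    if (a[i] === b[j]) {
      result.push({ type: "same", line: a[i] });
      i++;
      j++;
    } else if (lcs[i + 1][j] >= lcs[i][j + 1]) {
      result.push({ type: "removed", line: a[i] });
      i++;
    } else {
      result.push({ type: "added", line: b[j] });
      j++;
    }
  }
  while (i < a.length) result.push({ type: "removed", line: a[i++] });
  while (j < b.length) result.push({ type: "added", line: b[j++] });
  return result;
}
```

A side-by-side view could then render "removed" entries in the left column and "added" entries in the right column, with "same" entries in both. The table needs memory proportional to the product of the two line counts, so very large pages may call for a more economical algorithm.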
Configuration (sanitization_config.html)
- Customizable content processing rules
- Test URL processing capabilities
- Real-time preview of sanitization effects
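To make the idea of customizable processing rules concrete, here is a small sketch of a rule set applied to extracted page text. The rule fields (excludePatterns, minLineLength) and the applyRules function are hypothetical; the actual configuration format behind sanitization_config.html is not described in this overview.

```javascript
// Hypothetical rule set; field names are assumptions for illustration.
const defaultRules = {
  excludePatterns: [
    "\\b\\d{1,2}:\\d{2}(:\\d{2})?\\b", // clock times such as 14:05 or 14:05:33
    "^advertisement$",                 // lines consisting only of an ad label
  ],
  minLineLength: 3, // drop very short fragments left over after cleanup
};

// Apply the rules line by line and return the sanitized text.
function applyRules(text, rules = defaultRules) {
  const patterns = rules.excludePatterns.map((p) => new RegExp(p, "gi"));
  return text
    .split("\n")
    .map((line) => patterns.reduce((acc, re) => acc.replace(re, ""), line))
    .map((line) => line.replace(/\s+/g, " ").trim())
    .filter((line) => line.length >= rules.minLineLength)
    .join("\n");
}
```

A preview could run applyRules on sample text each time a rule is edited and show the input and output next to each other, which is one way to get the real-time effect described above.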
5. Data Storage Evolution
Original System (Chrome Storage)
- Used chrome.storage.local for data persistence
- JSON-based structure for snapshots and history
- Limited by storage quotas and performance
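For context, a snapshot store on top of chrome.storage.local might have looked something like the sketch below. It assumes the promise-based form of the API available to Manifest V3 extensions; the key layout ("page:" plus the URL), the history cap, and the saveSnapshot name are all hypothetical.

```javascript
// Hypothetical layout: one storage entry per URL with the latest snapshot and a capped history.
const MAX_HISTORY = 50; // assumed cap to stay within storage quotas

// `storage` defaults to chrome.storage.local; it is a parameter so the
// function can be exercised outside the extension with a stand-in object.
async function saveSnapshot(url, content, storage = chrome.storage.local) {
  const key = "page:" + url;
  const stored = await storage.get(key);
  const entry = stored[key] || { latest: null, history: [] };

  // Move the previous snapshot into history before replacing it.
  if (entry.latest) entry.history.push(entry.latest);
  entry.history = entry.history.slice(-MAX_HISTORY);
  entry.latest = { content, capturedAt: new Date().toISOString() };

  await storage.set({ [key]: entry });
  return entry;
}
```

Because each entry is read, modified, and written back as a whole JSON value, the cost of every save grows with the size of the stored history, which is consistent with the quota and performance limits noted above.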
SQLite Integration
- sql-wasm.js: WebAssembly SQLite implementation
- sqlite_database.js: Database abstraction layer
- Structured schema with proper indexing
- Better performance for large datasets
Database Schema
-- Key tables:
- monitored_urls: Tracked websites with metadata
- content_snapshots: Stored page content with hashes
- change_history: Change events with metrics
- search_analytics: Search usage tracking
- content_keywords: Content categorization
- performance_metrics: Monitoring