Codebase summary #2

@Chungmaster1337

Dynamic Page Monitor - Codebase Overview

Architecture Overview

This Chrome extension monitors user-defined web pages for content changes and provides tools for analyzing those changes. Storage has moved from chrome.storage.local to SQLite for better data management.

Core Components

1. Manifest & Entry Point

  • manifest.json: Defines extension permissions, background service worker, and web-accessible resources
  • Version 2.0; requests the storage, tabs, scripting, notifications, and alarms permissions

2. Background Service Worker

  • background.js: Main orchestrator handling page monitoring, content sanitization, and change detection
  • sqlite_background.js: Enhanced version of the background script backed by SQLite (appears to be the newer implementation)
  • database.js: SQLite database module for structured data storage

3. Content Processing Engine

Intelligent Content Sanitization

Key features of the sanitization system:

  • Removes dynamic elements (ads, navigation, timestamps)
  • Extracts meaningful content (headings, paragraphs, tables)
  • Configurable inclusion/exclusion rules
  • Priority-based content extraction
  • Normalized text processing for consistent comparison
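
The features listed above could be sketched roughly as follows. This is a minimal illustration and not the extension's actual code: the function name `sanitizeHtml`, the shape of the `rules` object, and the regex-based approach are all assumptions (a real implementation would more likely walk the DOM in a content script).

```javascript
// Hypothetical sketch: names and rule shape are assumptions, not the extension's API.
const DEFAULT_RULES = {
  // Elements whose entire content is dropped before text extraction.
  excludeTags: ["script", "style", "nav", "header", "footer", "aside"],
  // Volatile text such as clock times and ISO dates.
  volatilePatterns: [
    /\b\d{1,2}:\d{2}(:\d{2})?\s?(AM|PM)?\b/gi,
    /\b\d{4}-\d{2}-\d{2}\b/g,
  ],
};

function sanitizeHtml(html, rules = DEFAULT_RULES) {
  let text = html;
  // Remove excluded elements together with everything inside them.
  for (const tag of rules.excludeTags) {
    const re = new RegExp(`<${tag}\\b[^>]*>[\\s\\S]*?<\\/${tag}>`, "gi");
    text = text.replace(re, " ");
  }
  // Strip the remaining tags but keep their text.
  text = text.replace(/<[^>]+>/g, " ");
  // Drop volatile fragments so they do not register as changes.
  for (const pattern of rules.volatilePatterns) {
    text = text.replace(pattern, " ");
  }
  // Normalize whitespace and case for consistent comparison.
  return text.replace(/\s+/g, " ").trim().toLowerCase();
}
```

Two fetches of the same page that differ only in an embedded clock time or date would produce identical output here, which is the property the comparison step relies on.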

Change Detection Algorithm

  • Hash-based comparison for efficiency
  • Word-level change tracking (words added/removed)
  • Percentage-based change calculation
  • Historical comparison support
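
A rough sketch of how these pieces might fit together is shown below. The hash function (FNV-1a over UTF-16 code units), the function names, and the definition of the change percentage (added plus removed words, relative to the previous word count) are assumptions made for illustration; the issue does not say which hash or formula the extension uses.

```javascript
// Hypothetical sketch: hash choice, names, and percentage formula are assumptions.
function fnv1a(text) {
  let hash = 0x811c9dc5; // FNV-1a 32-bit offset basis
  for (let i = 0; i < text.length; i++) {
    hash ^= text.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193) >>> 0; // multiply by the FNV prime, keep 32 bits
  }
  return hash.toString(16).padStart(8, "0");
}

// Count how often each whitespace-separated word occurs.
function wordCounts(text) {
  const counts = new Map();
  for (const word of text.split(/\s+/).filter(Boolean)) {
    counts.set(word, (counts.get(word) || 0) + 1);
  }
  return counts;
}

function detectChange(previousText, currentText) {
  const hash = fnv1a(currentText);
  // Cheap path: equal hashes are treated as "no change".
  if (fnv1a(previousText) === hash) {
    return { changed: false, wordsAdded: 0, wordsRemoved: 0, changePercent: 0, hash };
  }
  const before = wordCounts(previousText);
  const after = wordCounts(currentText);
  let wordsAdded = 0;
  let wordsRemoved = 0;
  let totalBefore = 0;
  for (const [word, n] of after) {
    wordsAdded += Math.max(0, n - (before.get(word) || 0));
  }
  for (const [word, n] of before) {
    wordsRemoved += Math.max(0, n - (after.get(word) || 0));
    totalBefore += n;
  }
  const changePercent = ((wordsAdded + wordsRemoved) / Math.max(totalBefore, 1)) * 100;
  return { changed: true, wordsAdded, wordsRemoved, changePercent, hash };
}
```

Storing the returned `hash` with each snapshot would let later checks skip the word-level comparison whenever the hash is unchanged.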

4. User Interface Components

Main Popup (popup.html/js)

  • URL configuration and management
  • Manual check triggers
  • Recent changes display
  • Quick access to export and reports

Data Export System (data-export.html/js)

  • Comprehensive data export (JSON/CSV)
  • Statistics dashboard
  • Data cleanup utilities
  • Storage management
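
As an illustration of the JSON/CSV export, a minimal serializer might look like the following. The function names and the record fields are hypothetical; only the two output formats come from the description above.

```javascript
// Hypothetical sketch: function names and record fields are assumptions.
function csvEscape(value) {
  const text = value === null || value === undefined ? "" : String(value);
  // Quote fields containing quotes, commas, or line breaks; double embedded quotes.
  return /[",\r\n]/.test(text) ? `"${text.replace(/"/g, '""')}"` : text;
}

function toCsv(records, columns) {
  const header = columns.map(csvEscape).join(",");
  const rows = records.map((record) =>
    columns.map((column) => csvEscape(record[column])).join(",")
  );
  return [header, ...rows].join("\r\n");
}

function toJson(records) {
  // Wrap the records with an export timestamp so the file is self-describing.
  return JSON.stringify({ exportedAt: new Date().toISOString(), records }, null, 2);
}
```

In an extension page, the resulting string could be wrapped in a `Blob` and offered as a download.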

Search & Reports (search-reports.html/js)

  • Advanced content search with regex support
  • Historical data analysis
  • Filter by URL, date range, content type
  • Export search results and generate reports
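
The search filters described above could be combined along these lines. The snapshot shape (`url`, `capturedAt` as epoch milliseconds, `content`) and the option names are assumptions, and the content-type filter is left out for brevity.

```javascript
// Hypothetical sketch: snapshot fields and option names are assumptions.
function searchSnapshots(snapshots, { query, useRegex = false, url, from, to } = {}) {
  let matches = () => true;
  if (query) {
    if (useRegex) {
      // new RegExp throws a SyntaxError on an invalid pattern; callers should catch it.
      const re = new RegExp(query, "i");
      matches = (text) => re.test(text);
    } else {
      const needle = query.toLowerCase();
      matches = (text) => text.toLowerCase().includes(needle);
    }
  }
  return snapshots.filter((snapshot) => {
    if (url !== undefined && snapshot.url !== url) return false;
    if (from !== undefined && snapshot.capturedAt < from) return false;
    if (to !== undefined && snapshot.capturedAt > to) return false;
    return matches(snapshot.content);
  });
}
```

The regex is built without the `g` flag on purpose: a global regex keeps state in `lastIndex` between `test` calls, which would make repeated matching unreliable.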

Diff Viewer (diff.html/js)

  • Side-by-side content comparison
  • Highlighted changes (additions/removals)
  • Raw HTML vs processed content views
  • Context-aware difference display
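
One common way to compute the additions and removals such a viewer highlights is a longest-common-subsequence diff. The sketch below works at word level and is an assumption about the approach, not a description of what diff.js does; the function names and the `[+word]` / `[-word]` rendering are invented for the example.

```javascript
// Hypothetical sketch: an LCS-based word diff, not necessarily the extension's algorithm.
function diffWords(oldText, newText) {
  const a = oldText.split(/\s+/).filter(Boolean);
  const b = newText.split(/\s+/).filter(Boolean);
  // lcs[i][j] = length of the longest common subsequence of a[i..] and b[j..].
  const lcs = Array.from({ length: a.length + 1 }, () => new Array(b.length + 1).fill(0));
  for (let i = a.length - 1; i >= 0; i--) {
    for (let j = b.length - 1; j >= 0; j--) {
      lcs[i][j] = a[i] === b[j]
        ? lcs[i + 1][j + 1] + 1
        : Math.max(lcs[i + 1][j], lcs[i][j + 1]);
    }
  }
  // Walk the table from the start, emitting one operation per word.
  const ops = [];
  let i = 0;
  let j = 0;
  while (i < a.length && j < b.length) {
    if (a[i] === b[j]) {
      ops.push({ type: "same", word: a[i] });
      i++;
      j++;
    } else if (lcs[i + 1][j] >= lcs[i][j + 1]) {
      ops.push({ type: "removed", word: a[i++] });
    } else {
      ops.push({ type: "added", word: b[j++] });
    }
  }
  while (i < a.length) ops.push({ type: "removed", word: a[i++] });
  while (j < b.length) ops.push({ type: "added", word: b[j++] });
  return ops;
}

// Plain-text rendering; a real viewer would emit escaped HTML with styled spans.
function renderDiff(ops) {
  return ops
    .map((op) => (op.type === "added" ? `[+${op.word}]` : op.type === "removed" ? `[-${op.word}]` : op.word))
    .join(" ");
}
```

The table uses memory proportional to the product of the two word counts, so very large pages would need a line-level pass first or a more economical diff algorithm.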

Configuration (sanitization_config.html)

  • Customizable content processing rules
  • Test URL processing capabilities
  • Real-time preview of sanitization effects

5. Data Storage Evolution

Original System (Chrome Storage)

  • Used chrome.storage.local for data persistence
  • JSON-based structure for snapshots and history
  • Limited by storage quotas and performance
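
For context, a chrome.storage.local-based history store might have looked something like this. The key layout (`history:<url>`), the `MAX_HISTORY` cap, and the function name are assumptions; the storage area is passed in as a parameter so the function can be exercised outside the extension.

```javascript
// Hypothetical sketch: key layout and history cap are assumptions.
const MAX_HISTORY = 50;

// storageArea is expected to behave like chrome.storage.local:
// get(key) resolves to an object keyed by that key, set(items) resolves when written.
async function saveSnapshot(storageArea, url, snapshot) {
  const key = `history:${url}`;
  const stored = await storageArea.get(key);
  const history = stored[key] || [];
  history.push(snapshot);
  // Keep only the newest entries to stay within the storage quota.
  const trimmed = history.slice(-MAX_HISTORY);
  await storageArea.set({ [key]: trimmed });
  return trimmed.length;
}
```

Inside the extension this would be called as `saveSnapshot(chrome.storage.local, url, snapshot)`. Every save rewrites the whole history array for that URL, which is one reason this layout slows down as the data grows.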

SQLite Integration

  • sql-wasm.js: WebAssembly SQLite implementation
  • sqlite_database.js: Database abstraction layer
  • Structured schema with proper indexing
  • Better performance for large datasets

Database Schema

Key tables:

  • monitored_urls: Tracked websites with metadata
  • content_snapshots: Stored page content with hashes
  • change_history: Change events with metrics
  • search_analytics: Search usage tracking
  • content_keywords: Content categorization
  • performance_metrics: Monitoring
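
To make the schema more concrete, here is a sketch of how three of these tables might be created. Only the table names come from the list above; every column, the index, and the `applySchema` helper are assumptions made for illustration.

```javascript
// Hypothetical sketch: table names are from the issue, all columns are assumptions.
const SCHEMA_STATEMENTS = [
  `CREATE TABLE IF NOT EXISTS monitored_urls (
     id INTEGER PRIMARY KEY AUTOINCREMENT,
     url TEXT NOT NULL UNIQUE,
     added_at INTEGER NOT NULL
   )`,
  `CREATE TABLE IF NOT EXISTS content_snapshots (
     id INTEGER PRIMARY KEY AUTOINCREMENT,
     url_id INTEGER NOT NULL REFERENCES monitored_urls(id),
     content_hash TEXT NOT NULL,
     content TEXT NOT NULL,
     captured_at INTEGER NOT NULL
   )`,
  `CREATE TABLE IF NOT EXISTS change_history (
     id INTEGER PRIMARY KEY AUTOINCREMENT,
     url_id INTEGER NOT NULL REFERENCES monitored_urls(id),
     words_added INTEGER NOT NULL,
     words_removed INTEGER NOT NULL,
     change_percent REAL NOT NULL,
     detected_at INTEGER NOT NULL
   )`,
  // Index to speed up "latest snapshot for this URL" lookups.
  `CREATE INDEX IF NOT EXISTS idx_snapshots_url_time
     ON content_snapshots (url_id, captured_at)`,
];

// db is expected to expose run(sql), as a sql.js Database object does.
function applySchema(db) {
  for (const statement of SCHEMA_STATEMENTS) {
    db.run(statement);
  }
}
```

With sql.js, `db` would typically be a `new SQL.Database()` obtained after awaiting `initSqlJs()`. Using `IF NOT EXISTS` keeps the function safe to call on every service worker start.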
