Skip to content

wisemanIV/vigil

Repository files navigation

πŸ›‘οΈ Vigil - Enterprise Data Loss Prevention for Chrome

Vigil Logo

Advanced, AI-powered data loss prevention that protects sensitive corporate information from leaking through clipboard operations, file uploads, and screenshots.

License: MIT GitHub Stars GitHub Issues

Features β€’ Quick Start β€’ Architecture β€’ Configuration β€’ Development β€’ Contributing


🎯 Features

πŸ”’ Multi-Layer Protection

  • πŸ“‹ Clipboard Monitoring - Real-time analysis of paste operations with context-aware blocking
  • πŸ“ File Upload Protection - Comprehensive file metadata and content analysis before uploads
  • πŸ“· Screenshot Protection - Automatic detection and prevention of sensitive data capture
  • πŸ” Content Classification - 0-100 point scoring system for PUBLIC/INTERNAL/CONFIDENTIAL/HIGHLY CONFIDENTIAL
  • 🌐 Destination Risk Analysis - Smart blocking based on where data is being shared (HIGH/MEDIUM/LOW risk destinations)

🧠 Advanced AI Analysis

  • Semantic Understanding - Context-aware analysis using machine learning models
  • Pattern Detection - Credit cards, SSNs, API keys, passwords, financial data, and proprietary algorithms
  • Bulk Data Detection - Prevents customer lists, employee databases, and data exports from leaking
  • Company-Specific Intelligence - Configurable detection of internal terminology, project codes, and confidentiality markers
  • File Metadata Analysis - Fast risk assessment using filename patterns, file types, size, and recency without reading content

🏒 Enterprise-Grade Features

  • Configurable Risk Tolerance - Customizable scoring thresholds and blocking policies per organization
  • Company Customization - Adaptable to any organization's specific terminology, domains, and confidentiality markers
  • Destination-Aware Blocking - Different policies for internal vs external sharing destinations
  • Real-time Policy Enforcement - Dynamic blocking based on content sensitivity + destination risk combinations
  • Comprehensive Audit Logging - Complete history of all protection events and policy decisions
  • Privacy-First Design - 100% local processing, no data sent to external servers

πŸŽ›οΈ Flexible Configuration

  • Global Configuration System - Centralized management of all detection patterns and risk thresholds
  • Runtime Customization - Update policies without extension redeployment
  • Multi-Tier Risk Classification - Separate thresholds for content analysis, file metadata, and destination risks
  • Bulk Data Thresholds - Configurable limits for email lists, phone numbers, and PII detection
  • Company-Specific Patterns - Custom detection for internal project names, confidentiality markers, and business terms

πŸš€ Quick Start

Installation

Option 1: Install from Release

  1. Download the latest release from Releases
  2. Extract the ZIP file
  3. Open Chrome and navigate to chrome://extensions/
  4. Enable "Developer mode" (toggle in top-right corner)
  5. Click "Load unpacked" and select the extracted folder
  6. Grant necessary permissions when prompted

Option 2: Build from Source

# Clone the repository
git clone https://github.com/yourusername/vigil.git
cd vigil

# Install dependencies
npm install

# Build the extension
npm run build

# Load in Chrome
# Navigate to chrome://extensions/
# Enable Developer mode
# Click "Load unpacked" and select the dist folder

Initial Setup

  1. Configure Company Settings - Edit /src/config/global-config.js:

    company: {
        name: 'Your Company Name',
        aliases: ['company_alias', 'short_name'],
        domains: ['yourcompany.com', 'internal.company.com'],
        confidentialityMarkers: ['confidential', 'internal', 'yourcompany_confidential']
    }
  2. Adjust Risk Tolerance - Customize scoring thresholds:

    riskTolerance: {
        thresholds: {
            critical: 75,    // Scores >= 75 = CRITICAL risk
            high: 50,        // Scores 50-74 = HIGH risk
            medium: 25,      // Scores 25-49 = MEDIUM risk
            low: 0           // Scores 0-24 = LOW risk
        }
    }
  3. Test Protection - Try pasting sensitive content or uploading files to verify blocking works correctly

πŸ—οΈ Architecture

Core Components

β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”    β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”    β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
β”‚   Content.js    β”‚    β”‚  Background.js  β”‚    β”‚ Global Config   β”‚
β”‚                 β”‚    β”‚                 β”‚    β”‚                 β”‚
β”‚ β€’ Paste Monitor │◄──►│ β€’ Fast Analyzer │◄──►│ β€’ Company Info  β”‚
β”‚ β€’ Upload Monitorβ”‚    β”‚ β€’ Hybrid Analyzerβ”‚   β”‚ β€’ Risk Toleranceβ”‚
β”‚ β€’ Screenshot    β”‚    β”‚ β€’ Decision Logic β”‚   β”‚ β€’ Detection     β”‚
β”‚   Protection    β”‚    β”‚                 β”‚    β”‚   Patterns      β”‚
β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜    β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜    β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜
         β”‚                       β”‚                       β”‚
         β–Ό                       β–Ό                       β–Ό
β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”    β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”    β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
β”‚ Monitors        β”‚    β”‚ Analyzers       β”‚    β”‚ Configuration   β”‚
β”‚                 β”‚    β”‚                 β”‚    β”‚                 β”‚
β”‚ β€’ Upload        β”‚    β”‚ β€’ Content       β”‚    β”‚ β€’ Risk Thresholdsβ”‚
β”‚ β€’ Screenshot    β”‚    β”‚   Classificationβ”‚    β”‚ β€’ Detection     β”‚
β”‚ β€’ Paste Events  β”‚    β”‚ β€’ File Metadata β”‚    β”‚   Patterns      β”‚
β”‚                 β”‚    β”‚ β€’ Destination   β”‚    β”‚ β€’ Bulk Data     β”‚
β”‚                 β”‚    β”‚   Risk Classifierβ”‚   β”‚   Settings      β”‚
β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜    β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜    β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜

Analysis Pipeline

  1. Content Interception - Monitors clipboard, uploads, and screen capture attempts
  2. Fast Pattern Analysis - Immediate detection of obvious sensitive patterns (SSN, credit cards, API keys)
  3. Content Classification - 0-100 point scoring across 5 categories:
    • Content Sensitivity (0-30 points)
    • Identifier Presence (0-25 points)
    • Temporal Sensitivity (0-20 points)
    • Competitive Impact (0-15 points)
    • Legal/Regulatory Risk (0-10 points)
  4. File Metadata Analysis - Fast risk assessment using file properties without reading content
  5. Destination Risk Assessment - Classification of target websites/services (HIGH/MEDIUM/LOW)
  6. Risk Multiplication - Content score adjusted based on destination risk
  7. Policy Decision - Block/allow/warn based on combined risk assessment
  8. User Interaction - Confirmation dialogs for borderline cases with detailed findings

βš™οΈ Configuration

Company Configuration

Customize the DLP system for your organization by editing the global configuration:

// /src/config/global-config.js

export class VigilConfig {
    constructor() {
        this.company = {
            name: 'Your Company Inc',
            aliases: ['yourcompany', 'your_company', 'yc_inc'],
            domains: ['yourcompany.com', 'internal.company.com'],
            confidentialityMarkers: [
                'confidential', 'internal', 'private', 'restricted',
                'yourcompany_confidential', 'yc_internal', 'company_private'
            ]
        };
        
        this.riskTolerance = {
            // Adjust thresholds based on your security posture
            thresholds: {
                critical: 80,    // More strict
                high: 60,        // More strict
                medium: 30,      // More strict
                low: 0
            },
            policies: {
                blockCritical: true,
                blockHighToHighRisk: true,
                warnOnMedium: true,
                logAll: true
            }
        };
    }
}

Detection Patterns

Add custom detection patterns for your organization:

this.detectionPatterns = {
    financial: {
        keywords: [
            'financial', 'revenue', 'profit', 'budget',
            // Add company-specific financial terms
            'fy2024_budget', 'q4_forecast', 'company_financials'
        ],
        patterns: [
            /fy\d{2,4}/i,
            /q[1-4][\s_-]?\d{2,4}/i,
            // Add regex patterns for your financial data formats
            /budget_\d{4}/i
        ],
        score: 25  // Adjust scoring based on sensitivity
    }
}

Risk Tolerance Tuning

Fine-tune blocking behavior:

// Stricter organization
riskTolerance: {
    contentClassification: {
        highlyConfidential: 60,  // Lower threshold = more sensitive
        confidential: 40,
        internal: 20,
        public: 0
    },
    policies: {
        blockCritical: true,
        blockHighToHighRisk: true,
        blockMediumToHighRisk: true,  // Additional blocking
        warnOnMedium: true,
        logAll: true
    }
}

// More permissive organization  
riskTolerance: {
    contentClassification: {
        highlyConfidential: 85,  // Higher threshold = less sensitive
        confidential: 65,
        internal: 45,
        public: 0
    },
    policies: {
        blockCritical: true,
        blockHighToHighRisk: false,  // Only warn
        warnOnMedium: false,         // No warnings for medium risk
        logAll: true
    }
}

πŸ“Š Understanding Risk Scoring

Content Classification Scoring (0-100 points)

Category Points Examples
Content Sensitivity 0-30 Financial data, customer lists, trade secrets, strategic plans
Identifier Presence 0-25 Customer names, employee PII, project codes, confidential markings
Temporal Sensitivity 0-20 Future release dates, current quarter data, upcoming announcements
Competitive Impact 0-15 Competitive advantages, market positioning, proprietary methods
Legal/Regulatory Risk 0-10 GDPR data, SOX compliance, NDA violations, export controls

File Metadata Scoring

Risk Factor Score Examples
Confidentiality Markers 25 "CONFIDENTIAL_report.pdf", "internal_strategy.docx"
Financial Indicators 20 "Q4_2024_Financial_Report.xlsx", "budget_analysis.csv"
Strategic Content 20 "strategic_roadmap_2025.pptx", "competitive_analysis.xlsx"
Customer Data 18 "customer_database_export.csv", "user_analytics_report.pdf"
File Type Risk 5-20 Spreadsheets (15), Databases (20), Presentations (12)
File Recency 5-15 Modified today (15), This week (10), This month (5)
File Size 10-25 >10MB (10), >50MB (15), >100MB (20), Large CSV/Excel (25)

Classification Levels

Score Range Classification Action
71-100 HIGHLY CONFIDENTIAL ❌ Block automatically
51-70 CONFIDENTIAL ❌ Block automatically
31-50 INTERNAL ⚠️ Block to HIGH risk destinations
0-30 PUBLIC βœ… Allow with destination-based warnings

πŸ› οΈ Development

Prerequisites

  • Node.js 18+ and npm
  • Chrome/Chromium browser
  • Git

Setup Development Environment

# Clone and setup
git clone https://github.com/yourusername/vigil.git
cd vigil
npm install

# Run tests
npm test

# Run specific test suites
npm test -- chatgpt-paste.test.js
npm test -- file-metadata.test.js
npm test -- global-config.test.js

# Development build (with source maps)
npm run build:dev

# Production build
npm run build

Project Structure

vigil/
β”œβ”€β”€ src/
β”‚   β”œβ”€β”€ analyzers/           # Core analysis engines
β”‚   β”‚   β”œβ”€β”€ fast-analyzer.js           # Main analyzer orchestrator
β”‚   β”‚   β”œβ”€β”€ content-classification-detector.js  # 0-100 point content scoring
β”‚   β”‚   β”œβ”€β”€ destination-risk-classifier.js      # Website risk classification
β”‚   β”‚   β”œβ”€β”€ file-metadata-analyzer.js          # File metadata analysis
β”‚   β”‚   β”œβ”€β”€ bulk-data-detector.js              # Bulk PII detection
β”‚   β”‚   └── hybrid-analyzer.js                 # Fast + ML analysis
β”‚   β”œβ”€β”€ config/
β”‚   β”‚   └── global-config.js          # Central configuration system
β”‚   β”œβ”€β”€ monitors/            # Data interception
β”‚   β”‚   β”œβ”€β”€ upload-monitor.js         # File upload monitoring
β”‚   β”‚   └── screenshot-monitor.js     # Screen capture prevention
β”‚   β”œβ”€β”€ ui/
β”‚   β”‚   └── confirmation-dialog.js    # User confirmation interface
β”‚   β”œβ”€β”€ content.js          # Content script (main entry)
β”‚   β”œβ”€β”€ background.js       # Service worker
β”‚   └── manifest.json       # Extension manifest
β”œβ”€β”€ tests/
β”‚   β”œβ”€β”€ specs/              # Test suites
β”‚   └── helpers/            # Test utilities
└── icons/                  # Extension icons

Adding New Detection Patterns

  1. Edit Global Configuration:
// In /src/config/global-config.js
this.detectionPatterns = {
    newCategory: {
        keywords: ['keyword1', 'keyword2'],
        patterns: [/pattern1/gi, /pattern2/gi],
        score: 15,
        category: 'New Category',
        reason: 'Description of why this is sensitive'
    }
}
  1. Update Analyzers (if needed):
// In content-classification-detector.js
const newCategoryConfig = this.config.getDetectionPatterns('newCategory');
  1. Add Tests:
// In global-config.test.js
test('should get new category detection patterns', () => {
    const patterns = config.getDetectionPatterns('newCategory');
    expect(patterns.keywords).toContain('keyword1');
    expect(patterns.score).toBe(15);
});

Testing

The extension includes comprehensive test suites covering all major functionality:

  • Unit Tests - Individual analyzer components
  • Integration Tests - Full analysis pipeline
  • Browser Tests - Real Chrome extension environment testing
  • Configuration Tests - Global configuration system validation
# Run all tests
npm test

# Run with coverage
npm run test:coverage

# Run specific test categories
npm test -- --testNamePattern="Content Classification"
npm test -- --testNamePattern="File Metadata"
npm test -- --testNamePattern="Global Configuration"

# Run browser integration tests
npm test -- chatgpt-paste.test.js
npm test -- sheets-paste.test.js

πŸ“ˆ Performance

  • Fast Analysis: Pattern detection completes in <10ms
  • File Metadata: Risk assessment in <5ms without reading file content
  • Memory Efficient: <50MB RAM usage during active monitoring
  • Local Processing: No network calls, all analysis happens locally
  • Minimal CPU Impact: Background analysis optimized for performance

πŸ”§ Troubleshooting

Common Issues

Extension Not Loading

# Check manifest syntax
npm run lint

# Verify all dependencies
npm install

# Check console for errors
# Open chrome://extensions/, click "Errors" button

Tests Failing

# Clear jest cache
npx jest --clearCache

# Run tests with verbose output
npm test -- --verbose

# Check for ES module issues
node --experimental-vm-modules node_modules/jest/bin/jest.js

Configuration Not Working

  • Verify JSON syntax in global-config.js
  • Check that patterns compile as valid RegExp objects
  • Ensure score values are numbers, not strings

🀝 Contributing

We welcome contributions! Please see our Contributing Guidelines for details.

Development Workflow

  1. Fork the repository
  2. Create a feature branch: git checkout -b feature-name
  3. Make changes and add tests
  4. Run test suite: npm test
  5. Commit changes: git commit -m "Description"
  6. Push to branch: git push origin feature-name
  7. Create Pull Request

Code Standards

  • ESNext JavaScript with modules
  • Comprehensive JSDoc comments
  • Test coverage for all new features
  • Configuration-driven design
  • Privacy-first implementation

πŸ“„ License

This project is licensed under the MIT License - see the LICENSE file for details.

πŸ™ Acknowledgments

  • TensorFlow.js team for machine learning capabilities
  • Chrome Extensions documentation and community
  • Security research community for DLP best practices

Vigil - Protecting your data, preserving your privacy
Built with ❀️ for enterprise security

About

Vigil - Enterprise Data Loss Prevention for Chrome

Topics

Resources

License

Contributing

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Contributors 2

  •  
  •