Security: danoliva713/CI-Insight

CI Insight Security Documentation

This document outlines the security model, threat analysis, and best practices for CI Insight.

Security Philosophy

CI Insight is designed with a defense-in-depth approach:

  1. Privacy-first: Redact secrets, minimize data retention
  2. Secure by default: Safe defaults, explicit opt-in for risky features
  3. Transparent: Open source, auditable code, clear logging
  4. Graceful degradation: Security failures are logged, not silent

Threat Model

Assets

  1. CI/CD Logs: May contain secrets, API keys, credentials
  2. Application Data: Failure patterns, repository names, commit SHAs
  3. System Access: API endpoints, webhook receivers

Threat Actors

  1. External Attackers: Attempting to access data or disrupt service
  2. Malicious Insiders: Users with legitimate access abusing privileges
  3. Accidental Exposure: Misconfiguration leading to data leaks

Attack Vectors

Vector              | Threat                                  | Mitigation
Spoofed Webhooks    | Attacker sends fake CI events           | Webhook signature validation
Secret Leakage      | Secrets in logs exposed via API/UI      | Log redaction before storage
Unauthorized Access | External access to internal API         | CORS policy, optional authentication
SQL Injection       | Malicious input in API queries          | SQLAlchemy ORM, parameterized queries
XSS                 | Malicious content in failure messages   | React auto-escaping; CSP headers (planned)
AI Provider Leak    | Secrets sent to OpenAI                  | Redaction before AI processing
SSRF                | Webhook callbacks to internal services  | Validate webhook sources

Security Controls

1. Log Redaction

Purpose: Prevent secrets from being stored or exposed

Implementation (app/services/analysis/redactor.py):

  • Regex-based pattern matching
  • Runs before database storage
  • Runs before sending to AI providers

Patterns Detected:

  • API keys: api_key=..., apikey=...
  • Tokens: token=..., bearer ...
  • GitHub tokens: ghp_..., gho_..., etc.
  • AWS keys: AKIA...
  • Passwords: password=..., passwd=...
  • Database URLs: postgres://user:pass@...
  • Private keys: -----BEGIN PRIVATE KEY-----
  • JWT tokens: eyJ...
  • Email addresses: user@example.com
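
The pattern list above can be sketched as an ordered table of compiled regexes. This is a minimal illustration, not the contents of redactor.py: the exact expressions, the `REDACTION_PATTERNS` name, the `redact` function, and the placeholder strings are all assumptions made for the example.

```python
import re

# Illustrative patterns only; the real list in redactor.py may differ.
# Order matters: specific token formats run before generic key=value rules.
REDACTION_PATTERNS = [
    # PEM private key blocks, header through footer
    (re.compile(r"-----BEGIN [A-Z ]*PRIVATE KEY-----.*?-----END [A-Z ]*PRIVATE KEY-----",
                re.DOTALL), "[REDACTED_PRIVATE_KEY]"),
    # Password part of scheme://user:pass@host URLs
    (re.compile(r"(\b[a-z][a-z0-9+]*://[^:/\s]+:)[^@/\s]+(@)"), r"\1[REDACTED]\2"),
    # GitHub tokens: ghp_, gho_, ghu_, ghs_, ghr_
    (re.compile(r"\bgh[pousr]_[A-Za-z0-9]{20,}\b"), "[REDACTED_GITHUB_TOKEN]"),
    # AWS access key IDs
    (re.compile(r"\bAKIA[0-9A-Z]{16}\b"), "[REDACTED_AWS_KEY]"),
    # JWTs: three base64url segments, the first starting with eyJ
    (re.compile(r"\beyJ[A-Za-z0-9_-]+\.[A-Za-z0-9_-]+\.[A-Za-z0-9_-]+"), "[REDACTED_JWT]"),
    # key=value and key: value secrets
    (re.compile(r"(?i)\b(api_?key|token|password|passwd)(\s*[=:]\s*)\S+"), r"\1\2[REDACTED]"),
    # Bearer tokens in headers
    (re.compile(r"(?i)\b(bearer\s+)[A-Za-z0-9._~+/-]+=*"), r"\1[REDACTED]"),
    # Email addresses
    (re.compile(r"\b[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}\b"), "[REDACTED_EMAIL]"),
]


def redact(text: str) -> str:
    """Apply every pattern in order and return the redacted text."""
    for pattern, replacement in REDACTION_PATTERNS:
        text = pattern.sub(replacement, text)
    return text
```

Keeping the patterns in a single ordered list makes it straightforward to append site-specific formats and to cover each one with a test case.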

Configuration:

ENABLE_LOG_REDACTION=true   # Recommended: always enabled
STORE_RAW_LOGS=false        # Recommended: never store raw logs

Limitations:

  • Not 100% foolproof (custom secret formats may slip through)
  • Trade-off: Aggressive redaction may over-redact useful info
  • Recommendation: Review and enhance patterns for your use case

Testing:

pytest app/tests/test_redactor.py

2. Webhook Signature Validation

Purpose: Verify webhooks are from legitimate sources

GitHub Actions

Algorithm: HMAC-SHA256

Configuration:

GITHUB_WEBHOOK_SECRET=your-secret-here
ENABLE_WEBHOOK_VALIDATION=true

GitHub Setup:

  1. Repository → Settings → Webhooks → Add webhook
  2. Set Secret field
  3. CI Insight validates X-Hub-Signature-256 header

Implementation (app/core/security.py:verify_github_signature):

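import hashlib
import hmac

def verify_github_signature(payload_body: bytes, signature_header: str) -> bool:
    # `secret`: GITHUB_WEBHOOK_SECRET as bytes, loaded from application settings
    expected = hmac.new(secret, msg=payload_body, digestmod=hashlib.sha256).hexdigest()
    # Constant-time comparison avoids leaking the digest through timing
    return hmac.compare_digest(signature_header, f"sha256={expected}")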
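
To exercise this check locally, it can help to sign a test payload the same way GitHub does and confirm that tampering is rejected. The sketch below is self-contained and passes the secret explicitly; `sign_payload` and `verify_signature` are hypothetical helper names for illustration, not functions from app/core/security.py.

```python
import hashlib
import hmac
from typing import Optional


def sign_payload(secret: bytes, payload_body: bytes) -> str:
    """Build a value in the X-Hub-Signature-256 format: 'sha256=<hex digest>'."""
    digest = hmac.new(secret, msg=payload_body, digestmod=hashlib.sha256).hexdigest()
    return f"sha256={digest}"


def verify_signature(secret: bytes, payload_body: bytes,
                     signature_header: Optional[str]) -> bool:
    """Return True only if the header matches the HMAC of the raw body."""
    if not signature_header:
        # A missing header is treated as a failed validation
        return False
    expected = sign_payload(secret, payload_body)
    # Compare as bytes so a non-ASCII header cannot raise TypeError
    return hmac.compare_digest(signature_header.encode(), expected.encode())
```

The signature must be computed over the raw request bytes; re-serializing parsed JSON before hashing can change whitespace or key order and cause valid deliveries to fail.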

Jenkins

Algorithm: Token-based

Configuration:

JENKINS_WEBHOOK_SECRET=your-token-here
ENABLE_WEBHOOK_VALIDATION=true

Jenkins Setup:

  1. Configure → Notification Endpoint
  2. Add ?token=your-token-here to the URL, or send the token in a header (preferable, since query strings are often recorded in access logs)
  3. CI Insight validates token equality
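
The token equality check in step 3 could look like the sketch below. The document does not say how the comparison is performed, so the constant-time comparison here is an assumption about good practice, and `verify_jenkins_token` is a hypothetical name.

```python
import hmac
from typing import Optional


def verify_jenkins_token(expected_token: str, provided_token: Optional[str]) -> bool:
    """Compare the configured token with the one supplied by the caller."""
    if not expected_token or not provided_token:
        # Reject when the secret is unset or the request carries no token
        return False
    # compare_digest takes time independent of where the strings differ
    return hmac.compare_digest(provided_token.encode(), expected_token.encode())
```

Rejecting an empty configured token guards against the case where JENKINS_WEBHOOK_SECRET is unset and an empty token would otherwise compare equal.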

Dev Mode:

ENABLE_WEBHOOK_VALIDATION=false  # Disables validation for testing

Warning: Only use dev mode in trusted environments!

3. CORS Policy

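Purpose: Restrict which browser origins can call the API. CORS is enforced by browsers only, so it does not stop non-browser clients and is not a substitute for authentication.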

Default Configuration:

CORS_ORIGINS=http://localhost:3000,http://localhost,http://frontend

Production Setup:

CORS_ORIGINS=https://ci-insight.example.com

Implementation (app/main.py):

app.add_middleware(
    CORSMiddleware,
    allow_origins=settings.cors_origins_list,
    allow_credentials=True,
    allow_methods=["*"],
    allow_headers=["*"],
)
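
The middleware reads `settings.cors_origins_list`, which is presumably derived from the comma-separated CORS_ORIGINS value. A minimal sketch of that parsing follows; `parse_cors_origins` is a hypothetical helper, and rejecting the wildcard is an assumption based on the checklist advice to configure specific origins.

```python
from typing import List


def parse_cors_origins(raw: str) -> List[str]:
    """Split a comma-separated CORS_ORIGINS value into a clean list."""
    # Drop empty entries and surrounding whitespace; an Origin header never
    # carries a trailing slash, so strip it to avoid silent mismatches
    origins = [item.strip().rstrip("/") for item in raw.split(",") if item.strip()]
    if "*" in origins:
        # Assumed policy: a wildcard should not be combined with credentials
        raise ValueError("CORS_ORIGINS must list specific origins, not '*'")
    return origins
```

With the default configuration this yields `["http://localhost:3000", "http://localhost", "http://frontend"]`.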

4. SQL Injection Prevention

Mitigation: SQLAlchemy ORM with parameterized queries

Safe ✅:

db.query(Failure).filter(Failure.category == category).all()

Unsafe ❌ (not used in codebase):

db.execute(f"SELECT * FROM failures WHERE category = '{category}'")

Testing: API integration tests verify query safety
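
The difference between the two styles can be demonstrated without the application. The sketch below uses the standard-library sqlite3 driver instead of SQLAlchemy so that it runs on its own; the table layout and function names are made up for the example.

```python
import sqlite3

# In-memory database with a toy failures table (illustrative schema)
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE failures (id INTEGER PRIMARY KEY, category TEXT)")
conn.executemany(
    "INSERT INTO failures (category) VALUES (?)",
    [("test",), ("build",), ("deploy",)],
)


def find_unsafe(category: str):
    # String interpolation: the input becomes part of the SQL text
    return conn.execute(
        f"SELECT id FROM failures WHERE category = '{category}'"
    ).fetchall()


def find_safe(category: str):
    # Bound parameter: the driver passes the input as data, never as SQL
    return conn.execute(
        "SELECT id FROM failures WHERE category = ?", (category,)
    ).fetchall()


malicious = "x' OR '1'='1"
print(len(find_unsafe(malicious)))  # every row matches
print(len(find_safe(malicious)))    # no row has that literal category
```

The ORM filter shown above produces bound parameters in the same way, which is why the injected quote characters never reach the SQL parser.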

5. XSS Prevention

Mitigation:

  • React's automatic escaping
  • dangerouslySetInnerHTML is not used anywhere in the frontend
  • Future: Add Content-Security-Policy headers

Example (React auto-escapes):

<div>{failure.error_message}</div>  // ✅ Safe: auto-escaped

6. AI Provider Privacy

Risk: Sending secrets to external AI services

Mitigations:

  1. Redaction First: Logs redacted before AI processing
  2. Provider Choice: User controls where data goes
    • none: No AI, no external data
    • openai: Data sent to OpenAI (user choice)
    • local: Fully offline, no external calls

Configuration:

AI_PROVIDER=local          # Default: offline TF-IDF
OPENAI_API_KEY=            # Only needed if AI_PROVIDER=openai

Best Practice: Use local or none for sensitive environments
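
The "redaction first" rule can be expressed as a single choke point that every AI call passes through. This is a sketch under assumptions: `prepare_ai_input` is a hypothetical function, and the redactor is injected as a callable so the example stays self-contained.

```python
from typing import Callable, Optional

VALID_PROVIDERS = {"none", "local", "openai"}


def prepare_ai_input(log_text: str, provider: str,
                     redact: Callable[[str], str]) -> Optional[str]:
    """Return the text that may be handed to the provider, or None for no AI."""
    provider = provider.strip().lower()
    if provider not in VALID_PROVIDERS:
        # Fail closed on unknown values rather than guessing a destination
        raise ValueError(f"unknown AI_PROVIDER: {provider!r}")
    if provider == "none":
        return None
    # Redact for local and openai alike, so switching providers later
    # cannot bypass redaction
    return redact(log_text)
```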

7. Rate Limiting

Current State: Basic in-app limit (60 req/min)
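
The document does not describe how the in-app limit is implemented. One common approach is a sliding window per client, sketched below; the class name, the choice of client identifier, and the injectable clock are assumptions for illustration.

```python
import time
from collections import defaultdict, deque
from typing import Callable, Deque, Dict


class SlidingWindowLimiter:
    """Allow at most max_requests per client within window_seconds."""

    def __init__(self, max_requests: int = 60, window_seconds: float = 60.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self._clock = clock  # injectable so tests can control time
        self._hits: Dict[str, Deque[float]] = defaultdict(deque)

    def allow(self, client_id: str) -> bool:
        now = self._clock()
        hits = self._hits[client_id]
        # Discard timestamps that have fallen out of the window
        while hits and now - hits[0] >= self.window_seconds:
            hits.popleft()
        if len(hits) >= self.max_requests:
            return False
        hits.append(now)
        return True
```

An in-process limiter like this only sees traffic reaching one worker, which is one reason to also enforce limits at the proxy as recommended below.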

Production Recommendation:

  • Deploy behind reverse proxy (Nginx, Traefik)
  • Implement rate limiting at proxy level
  • Example Nginx config:
    limit_req_zone $binary_remote_addr zone=webhook:10m rate=10r/s;
    location /api/ingest/ {
        limit_req zone=webhook burst=20;
    }

8. Database Security

SQLite (default):

  • File-based, no network exposure
  • Suitable for single-instance deployments
  • Stored in Docker volume (isolated)

PostgreSQL (recommended for production):

  • Network-isolated (Docker network)
  • Strong password required
  • Use connection pooling
  • Regular backups

Configuration:

# PostgreSQL
DATABASE_URL=postgresql://ci_insight:strong_password@db:5432/ci_insight

# Keep DB credentials in .env, and keep .env out of version control
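
Because DATABASE_URL embeds a password, it is worth masking before the value appears in startup logs or error messages. The helper below is a sketch using only the standard library; `mask_database_url` is a hypothetical name, not part of the codebase.

```python
from urllib.parse import urlsplit, urlunsplit


def mask_database_url(url: str) -> str:
    """Replace the password in a database URL with '***' for logging."""
    parts = urlsplit(url)
    if parts.password is None:
        # Nothing to hide, e.g. a SQLite file URL
        return url
    host = parts.hostname or ""
    if parts.port is not None:
        host = f"{host}:{parts.port}"
    netloc = f"{parts.username or ''}:***@{host}"
    return urlunsplit((parts.scheme, netloc, parts.path, parts.query, parts.fragment))
```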

Production Deployment Checklist

Before Deploying

  • Set strong GITHUB_WEBHOOK_SECRET and JENKINS_WEBHOOK_SECRET
  • Enable webhook validation: ENABLE_WEBHOOK_VALIDATION=true
  • Configure CORS to specific origins (not *)
  • Use PostgreSQL instead of SQLite
  • Enable log redaction: ENABLE_LOG_REDACTION=true
  • Disable raw log storage: STORE_RAW_LOGS=false
  • Review AI provider choice: AI_PROVIDER=local for sensitive data
  • Set DEBUG=false
  • Use environment variables (not hardcoded secrets)
  • Run behind HTTPS (reverse proxy terminating TLS)
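
Several of the items above can be verified automatically at startup. The following sketch checks the environment variables named in this document; the function name, the 32-character minimum for secrets, and treating an unset DATABASE_URL as SQLite are assumptions chosen for the example.

```python
from typing import List, Mapping

MIN_SECRET_LENGTH = 32  # assumed threshold, not a project requirement


def check_production_config(env: Mapping[str, str]) -> List[str]:
    """Return a list of checklist violations; empty means all checks passed."""

    def is_true(name: str) -> bool:
        return env.get(name, "").strip().lower() == "true"

    problems: List[str] = []
    if not is_true("ENABLE_WEBHOOK_VALIDATION"):
        problems.append("ENABLE_WEBHOOK_VALIDATION should be true")
    if not is_true("ENABLE_LOG_REDACTION"):
        problems.append("ENABLE_LOG_REDACTION should be true")
    if is_true("STORE_RAW_LOGS"):
        problems.append("STORE_RAW_LOGS should be false")
    if is_true("DEBUG"):
        problems.append("DEBUG should be false")
    for name in ("GITHUB_WEBHOOK_SECRET", "JENKINS_WEBHOOK_SECRET"):
        if len(env.get(name, "")) < MIN_SECRET_LENGTH:
            problems.append(f"{name} is missing or shorter than {MIN_SECRET_LENGTH} characters")
    origins = [item.strip() for item in env.get("CORS_ORIGINS", "").split(",")]
    if "*" in origins:
        problems.append("CORS_ORIGINS should list specific origins, not *")
    # SQLite is the documented default, so an unset URL is treated as SQLite
    if env.get("DATABASE_URL", "sqlite").startswith("sqlite"):
        problems.append("DATABASE_URL points at SQLite; PostgreSQL is recommended")
    return problems
```

Calling `check_production_config(os.environ)` during startup and logging each returned message, or refusing to start when the list is non-empty, turns part of the checklist into an enforced gate.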

Infrastructure

  • Deploy behind reverse proxy (Nginx/Traefik/Cloudflare)
  • Implement rate limiting at proxy level
  • Set up firewall rules (only expose ports 80/443)
  • Use private Docker network
  • Enable container health checks
  • Set up log aggregation (ELK, Splunk, etc.)
  • Configure alerting for errors
  • Regular database backups
  • Patch OS and dependencies regularly

Monitoring

  • Monitor failed authentication attempts
  • Alert on unusual webhook volumes
  • Track API error rates
  • Monitor disk space (logs, database)
  • Set up uptime monitoring
  • Review logs for security events

Incident Response

If Secrets Are Leaked

  1. Immediate: Rotate compromised credentials
  2. Assess: Check logs for unauthorized access
  3. Remediate: Fix redaction patterns if needed
  4. Review: Audit all stored data for other secrets
  5. Test: Verify new patterns catch the leaked format

If Unauthorized Access Detected

  1. Block: Update firewall/CORS rules
  2. Audit: Review all API calls from suspicious IPs
  3. Rotate: Change webhook secrets
  4. Investigate: Check for data exfiltration
  5. Patch: Fix vulnerability if found

If Database Compromised

  1. Disconnect: Take database offline
  2. Assess: Determine scope of breach
  3. Restore: From backup if corrupted
  4. Harden: Review and fix access controls
  5. Notify: Stakeholders if PII/sensitive data affected

Security Reporting

For Production Users:

  • Report security issues to: security@example.com
  • Provide: Description, reproduction steps, impact
  • We'll respond within 48 hours
  • Responsible disclosure appreciated

For Developers:

Compliance Considerations

GDPR (if applicable)

  • Log redaction helps minimize PII
  • Provide data export/deletion endpoints
  • Document data retention policy
  • Add consent management if needed

SOC 2 (if applicable)

  • Structured logging for audit trails
  • Access controls (add authentication)
  • Encryption at rest and in transit
  • Regular security reviews

Remember: Security is a process, not a product. Regularly review and update these controls as threats evolve.
