Project Heimdall - Comprehensive Penetration Testing System

Overview

Project Heimdall is a sophisticated, AI-powered penetration testing framework that integrates multiple security tools and automates vulnerability assessment through intelligent test plan generation and execution.

⚙️ New Orchestration v4 Flow (orchestration4.py)

The primary end-to-end experience now runs through src/heimdall/orchestration/orchestration4.py. It coordinates all agents and tools to plan, execute, and report a security assessment.

Entry points:
- CLI: heimdall-scan
- Module: python -m heimdall.orchestration.orchestration4
- Python API: from heimdall.orchestration.orchestration4 import run_orchestration

Flow:
1. Launch Playwright via WebProxy and capture network traffic
2. Navigate to the base URL and extract page data with PageDataExtractor
3. Summarize content with ContextManagerAgent (token-aware context trimming)
4. Generate plans with PlannerAgent
5. Iteratively select and execute actions with ActionerAgent using Gemini function calling
6. Execute browser/security functions via PlaywrightTools and tools.tool_calls
7. Aggregate findings and generate a PDF via ReporterAgent

Runtime knobs (run_orchestration): expand_scope (bool), max_iterations (int), keep_messages (int)
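
The iteration in steps 5 and 6 can be pictured with a minimal sketch. This is not the code in orchestration4.py. It assumes that max_iterations bounds the number of action rounds and that keep_messages caps how many recent messages are retained as context; `select_action`, `execute`, and the "done" stop signal are hypothetical stand-ins for ActionerAgent and the tool layer. expand_scope is not modeled here.

```python
from typing import Callable, Dict, List


def run_action_loop(
    plan: str,
    select_action: Callable[[str, List[str]], str],
    execute: Callable[[str], str],
    max_iterations: int = 10,
    keep_messages: int = 12,
) -> Dict[str, List[str]]:
    """Run at most max_iterations action rounds for one plan (illustrative only)."""
    messages: List[str] = []
    findings: List[str] = []
    for _ in range(max_iterations):
        action = select_action(plan, messages)
        if action == "done":  # hypothetical stop signal from the agent
            break
        result = execute(action)
        findings.append(result)
        messages.append(f"{action} -> {result}")
        # Keep only the most recent keep_messages entries as context.
        messages = messages[-keep_messages:] if keep_messages > 0 else []
    return {"findings": findings, "messages": messages}
```

With max_iterations=1 and keep_messages=5, as in the usage examples in this README, the loop under these assumptions would perform a single action round.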

🏗️ Architecture of the system

┌─────────────────────────────────────────────────────────┐
│                Project Heimdall                         │
│                                                         │
│  ┌─────────────┐    ┌──────────────────────────────────┐ │
│  │ PlannerAgent│───▶│        ToolCall System           │ │
│  │             │    │                                  │ │
│  │ • Gemini LLM│    │ ┌─────────────┐ ┌──────────────┐ │ │
│  │ • OWASP     │    │ │   Security  │ │   Browser    │ │ │
│  │   Based     │    │ │    Tools    │ │  Automation  │ │ │
│  │ • YAML      │    │ │             │ │              │ │ │
│  │   Parsing   │    │ │ • SQLMap    │ │ • Playwright │ │ │
│  │             │    │ │ • Nmap      │ │ • Dynamic    │ │ │
│  └─────────────┘    │ │ • Nikto     │ │   Testing    │ │ │
│                     │ │ • OWASP ZAP │ │ • Form       │ │ │
│                     │ │ • Hydra     │ │   Automation │ │ │
│                     │ │ • Gobuster  │ │              │ │ │
│                     │ └─────────────┘ └──────────────┘ │ │
│                     └──────────────────────────────────┘ │
│                                   │                      │
│                     ┌─────────────▼──────────────────────┐ │
│                     │      Vulnerability Report         │ │
│                     │ • Severity Classification         │ │
│                     │ • Tool Performance Metrics       │ │
│                     │ • Recommendations                 │ │
│                     │ • JSON Export                     │ │
│                     └───────────────────────────────────┘ │
└─────────────────────────────────────────────────────────┘

🧠 Core Components

1. PlannerAgent (`agents/planner.py`)

Purpose: AI-powered security test plan generation using Google's Gemini 2.0-flash model.

Key Features:

LLM-Powered Analysis: Uses Gemini 2.0-flash for intelligent security assessment planning
OWASP-Based Methodology: Follows industry-standard vulnerability testing frameworks
YAML Parsing: Structured plan extraction with fallback mechanisms
Comprehensive Input Analysis: Processes HTML, forms, APIs, headers, and reconnaissance data
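
The "YAML parsing with fallback" idea can be sketched as follows. This is an assumption about how such a fallback might look, not the logic in agents/planner.py: `parse_plans` and its line-based fallback are hypothetical, and PyYAML is treated as optional so the sketch still runs without it.

```python
import re
from typing import Dict, List

try:
    import yaml  # PyYAML; optional in this sketch
except ImportError:
    yaml = None

_FIELD = re.compile(r"^\s*-?\s*(title|description)\s*:\s*(.+?)\s*$")


def parse_plans(text: str) -> List[Dict[str, str]]:
    """Return a list of {"title", "description"} dicts from LLM output."""
    if yaml is not None:
        try:
            data = yaml.safe_load(text)
            if isinstance(data, list) and all(isinstance(p, dict) for p in data):
                return [
                    {
                        "title": str(p.get("title", "")),
                        "description": str(p.get("description", "")),
                    }
                    for p in data
                ]
        except yaml.YAMLError:
            pass  # fall through to the line-based fallback
    # Fallback: scan line by line for "title:" / "description:" pairs.
    plans: List[Dict[str, str]] = []
    for line in text.splitlines():
        match = _FIELD.match(line)
        if not match:
            continue
        key, value = match.group(1), match.group(2).strip("'\"")
        if key == "title":
            plans.append({"title": value, "description": ""})
        elif plans:
            plans[-1]["description"] = value
    return plans
```

The fallback only recovers single-line fields, which is enough for short titles and descriptions but would truncate multi-line descriptions.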

System Prompt Capabilities:

Security Testing Areas:
  - Authentication & Session Management
  - Input Validation (SQL Injection, XSS, Command Injection)
  - Authorization & Access Control (IDOR, Privilege Escalation)
  - Business Logic Vulnerabilities
  - Information Disclosure
  - File Upload Security
  - API Security Testing
  - Network Infrastructure Assessment

Example Generated Plan:

{
  "title": "SQL Injection via Authentication Bypass",
  "description": "Test the /auth/login endpoint for SQL injection vulnerabilities using various payloads including union-based, boolean-based, and time-based techniques. Focus on username and password parameters with payloads like ' OR '1'='1' -- and UNION SELECT statements."
}

2. Action Execution & Tool Calling (`agents/actioner.py`, `tools/tool_calls.py`)

Purpose: Generate strategic actions from plans and execute the right tool per iteration using Gemini function calling.

Architecture:

Function-calling orchestration: Single tool call per iteration (browser or security)
Security and browser tools catalog: SQLi, XSS, Nmap, API discovery, IDOR, JWT, plus Playwright actions
Error handling: Resilient execution; fallbacks when tool-calling fails
Results aggregation: Findings captured and forwarded to ReporterAgent

Supported Security & Browser Tools (subset)

| Category | Tools | Purpose |
| --- | --- | --- |
| SQL Injection | SQLMap | Automated SQL injection detection and exploitation |
| XSS Testing | XSStrike, Browser Automation | Cross-site scripting vulnerability detection |
| Network Scanning | Nmap, Masscan | Port scanning and service enumeration |
| Web Scanning | OWASP ZAP | Web server vulnerability scanning |
| Directory Enumeration | Gobuster, FFUF, DIRB | Hidden directory and file discovery |
| Authentication | Hydra, Custom Scripts | Brute force and authentication bypass testing |
| Browser Automation | Playwright | Dynamic testing and manual verification |

Action Selection Logic (high-level example)

# The ActionerAgent provides a "Tool Selector Command" such as
# sql_injection_test(url=..., parameter=...).
# If the named function exists in the allowed tools list, it is executed with the provided args.
if function_name == "sql_injection_test":
    result = tools.tool_calls.sql_injection_test(url=url, parameter=parameter)
elif function_name == "xss_test":
    result = tools.tool_calls.xss_test(url=url, parameter=parameter)
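
An if/elif chain grows with every tool, so an allow-list lookup is a common alternative. The sketch below is illustrative and is not taken from agents/actioner.py: the two tool functions are stubs standing in for the real ones in tools.tool_calls, and `ALLOWED_TOOLS` and `dispatch` are hypothetical names. It assumes the command arrives as a string with keyword arguments only.

```python
import ast
from typing import Any, Callable, Dict


def sql_injection_test(url: str, parameter: str) -> Dict[str, Any]:
    """Stub standing in for tools.tool_calls.sql_injection_test."""
    return {"tool": "sql_injection_test", "url": url, "parameter": parameter}


def xss_test(url: str, parameter: str) -> Dict[str, Any]:
    """Stub standing in for tools.tool_calls.xss_test."""
    return {"tool": "xss_test", "url": url, "parameter": parameter}


ALLOWED_TOOLS: Dict[str, Callable[..., Dict[str, Any]]] = {
    "sql_injection_test": sql_injection_test,
    "xss_test": xss_test,
}


def dispatch(command: str) -> Dict[str, Any]:
    """Parse a command such as "xss_test(url='...', parameter='q')" and run it."""
    call = ast.parse(command.strip(), mode="eval").body
    if not isinstance(call, ast.Call) or not isinstance(call.func, ast.Name):
        raise ValueError(f"not a simple function call: {command!r}")
    if call.args or any(kw.arg is None for kw in call.keywords):
        raise ValueError("only explicit keyword arguments are accepted")
    tool = ALLOWED_TOOLS.get(call.func.id)
    if tool is None:
        raise ValueError(f"tool not in allow-list: {call.func.id}")
    # literal_eval accepts only literals, so no arbitrary code is evaluated.
    kwargs = {kw.arg: ast.literal_eval(kw.value) for kw in call.keywords}
    return tool(**kwargs)
```

Parsing with `ast` instead of `eval` means a command that names an unknown function or embeds an expression is rejected before anything runs.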

3. Browser Automation (`tools/browser.py`)

Purpose: Playwright-based dynamic security testing.

Capabilities:

Dynamic Form Testing: Automated form interaction and payload injection
Session Management: Cookie and authentication state handling
JavaScript Execution: Custom security testing scripts
Screenshot Capture: Visual evidence collection
Response Analysis: Content parsing and vulnerability detection

Security Testing Actions:

# Core browser automation methods
goto(page, url)                    # Navigate to target URL
fill(page, selector, payload)      # Inject test payloads
click(page, selector)              # Interact with elements
execute_js(page, script)           # Run custom security scripts
submit(page, form_selector)        # Submit forms with payloads
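
For readers unfamiliar with Playwright, the sketch below shows what a fill-and-check step might look like when written directly against Playwright's synchronous Python API. It does not use the wrappers listed above, whose exact signatures live in tools/browser.py; `payload_reflected` and `probe_form_field` are hypothetical helpers. Run it only against targets you are authorized to test.

```python
import html


def payload_reflected(page_html: str, payload: str) -> bool:
    """True if a payload with HTML-special characters appears unescaped."""
    # A payload without special characters cannot show missing escaping,
    # so it is reported as not reflected.
    return payload in page_html and html.escape(payload) != payload


def probe_form_field(url: str, selector: str, payload: str) -> bool:
    """Fill one field with a payload, submit with Enter, and check reflection."""
    # Imported lazily so the helper above works without Playwright installed.
    from playwright.sync_api import sync_playwright

    with sync_playwright() as p:
        browser = p.chromium.launch(headless=True)
        try:
            page = browser.new_page()
            page.goto(url)
            page.fill(selector, payload)
            page.press(selector, "Enter")
            page.wait_for_load_state("networkidle")
            return payload_reflected(page.content(), payload)
        finally:
            browser.close()
```

A verbatim reflection is only a hint that output encoding is missing; confirming an exploitable XSS still needs the payload to execute in context.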

🔍 Testing Methodologies

SQL Injection Testing

async def sql_injection_testing(target_url, plan):
    # 1. SQLMap automated testing
    sqlmap_result = await run_sqlmap(target_url, plan)

    # 2. Manual browser-based testing
    browser_result = await manual_sql_injection_test(target_url, plan)

    # 3. OWASP ZAP SQL injection scan
    zap_result = await zap_sql_injection_scan(target_url)

    # Return all three results so findings and recommendations can be aggregated
    return [sqlmap_result, browser_result, zap_result]

XSS Testing

async def xss_testing(target_url, plan):
    # 1. XSStrike automated XSS detection
    xsstrike_result = await run_xsstrike(target_url, plan)

    # 2. Browser-based payload injection
    browser_result = await manual_xss_test(target_url, plan)

    # 3. ZAP XSS scanning
    zap_result = await zap_xss_scan(target_url)

    # Return all three results for aggregation
    return [xsstrike_result, browser_result, zap_result]

API Security Testing

async def api_security_testing(target_url, plan):
    # 1. API endpoint enumeration
    ffuf_result = await run_ffuf_api_enum(target_url)

    # 2. IDOR vulnerability testing
    idor_result = await test_idor_vulnerabilities(target_url, plan)

    # 3. Authorization bypass testing
    auth_result = await test_api_authorization(target_url, plan)

    # Return all three results for aggregation
    return [ffuf_result, idor_result, auth_result]

📊 Vulnerability Reporting

ToolResult Structure

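from dataclasses import dataclass
from typing import Dict, List, Optional

@dataclass
class ToolResult:
    success: bool
    tool_name: str
    command: str
    output: str
    error: str = ""
    execution_time: float = 0.0
    vulnerabilities_found: Optional[List[Dict]] = None  # None when nothing was recorded
    recommendations: Optional[List[str]] = None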

Vulnerability Classification

# Severity levels with automatic classification
vulnerability = {
    'type': 'SQL Injection',
    'severity': 'Critical',  # Critical, High, Medium, Low
    'tool': 'SQLMap',
    'evidence': 'Injectable parameter found: username',
    'recommendation': 'Use parameterized queries'
}

Comprehensive Report Generation

{
  "summary": {
    "total_tests_executed": 8,
    "total_execution_time": 45.2,
    "total_vulnerabilities_found": 12,
    "critical_vulnerabilities": 3,
    "high_vulnerabilities": 4,
    "medium_vulnerabilities": 4,
    "low_vulnerabilities": 1
  },
  "vulnerabilities_by_severity": { ... },
  "vulnerabilities_by_type": { ... },
  "tools_performance": { ... },
  "recommendations": [ ... ]
}
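
The counts in the "summary" object can be derived mechanically from the per-tool results. The sketch below is one possible way to do that, not ReporterAgent's implementation: `summarize` is a hypothetical helper that takes ToolResult-like dicts, and treating a missing severity as "Low" is an assumption made for illustration.

```python
from collections import Counter
from typing import Any, Dict, List

SEVERITIES = ("Critical", "High", "Medium", "Low")


def summarize(results: List[Dict[str, Any]]) -> Dict[str, Any]:
    """Build the "summary" part of a report from ToolResult-like dicts."""
    # vulnerabilities_found may be None, so fall back to an empty list.
    vulns = [v for r in results for v in (r.get("vulnerabilities_found") or [])]
    # Assumption: findings without a severity are counted as "Low".
    by_severity = Counter(v.get("severity", "Low") for v in vulns)
    summary: Dict[str, Any] = {
        "total_tests_executed": len(results),
        "total_execution_time": round(
            sum(r.get("execution_time", 0.0) for r in results), 2
        ),
        "total_vulnerabilities_found": len(vulns),
    }
    for level in SEVERITIES:
        summary[f"{level.lower()}_vulnerabilities"] = by_severity.get(level, 0)
    return summary
```

Passing the result to `json.dumps` would produce the "summary" object in the same key order as the report above.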

🛠️ Configuration & Setup

Tool Availability Detection

available_tools = {
    'nmap': check_command('nmap'),
    'sqlmap': check_command('sqlmap'),
    'nikto': check_command('nikto'),
    'zap': ZAP_AVAILABLE,
    'playwright': PLAYWRIGHT_AVAILABLE,
    # ... additional tools
}
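
The source does not show how `check_command` is implemented. A minimal sketch, assuming it only needs to know whether an executable is on PATH, can rely on `shutil.which` from the standard library; `detect_tools` is a hypothetical convenience wrapper.

```python
import shutil
from typing import Dict, Iterable


def check_command(name: str) -> bool:
    """Return True if an executable with this name is found on PATH."""
    return shutil.which(name) is not None


def detect_tools(names: Iterable[str]) -> Dict[str, bool]:
    """Map each tool name to whether it is currently available."""
    return {name: check_command(name) for name in names}
```

Calling `detect_tools(["nmap", "sqlmap", "nikto"])` would return a dict shaped like the `available_tools` mapping above, with values that depend on the local machine.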

Configuration Options

config = {
    'zap_proxy': 'http://127.0.0.1:8080',
    'zap_api_key': None,
    'timeout': 300,  # 5 minutes default
    'max_threads': 5,
    'output_dir': './pentest_results',
    'wordlists': {
        'common': '/usr/share/wordlists/dirb/common.txt',
        'big': '/usr/share/wordlists/dirb/big.txt'
    }
}
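
The usage examples pass a partial config such as `{'timeout': 60}`. Whether ToolCall merges it with the defaults above is not stated in this README; the sketch below shows one way such a merge could work. `DEFAULT_CONFIG` copies the values listed above and `merge_config` is a hypothetical helper.

```python
import copy
from typing import Any, Dict, Optional

DEFAULT_CONFIG: Dict[str, Any] = {
    "zap_proxy": "http://127.0.0.1:8080",
    "zap_api_key": None,
    "timeout": 300,
    "max_threads": 5,
    "output_dir": "./pentest_results",
    "wordlists": {
        "common": "/usr/share/wordlists/dirb/common.txt",
        "big": "/usr/share/wordlists/dirb/big.txt",
    },
}


def merge_config(overrides: Optional[Dict[str, Any]] = None) -> Dict[str, Any]:
    """Return the defaults with user overrides applied; defaults stay untouched."""
    merged = copy.deepcopy(DEFAULT_CONFIG)
    for key, value in (overrides or {}).items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key].update(value)  # merge nested dicts one level deep
        else:
            merged[key] = value
    return merged
```

The deep copy keeps the module-level defaults unchanged, so two callers with different overrides do not affect each other.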

🎯 Usage Examples

Basic Integration

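import asyncio

from agents.planner import PlannerAgent
from tools.tool_calls import ToolCall


async def main(target_data):
    # Initialize components
    planner = PlannerAgent(desc="Security testing planner")
    tool_executor = ToolCall(config={'timeout': 60})

    # Generate test plans
    plans = planner.plan(target_data)

    # Execute each plan (execute_plan_step is a coroutine, so it must be awaited)
    for plan in plans:
        result = await tool_executor.execute_plan_step(plan)
        found = result.vulnerabilities_found or []  # the field defaults to None
        print(f"Tool: {result.tool_name}, Vulnerabilities: {len(found)}")


# target_data: the page data collected for the target (HTML, forms, APIs, headers, recon)
asyncio.run(main(target_data))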

Full Orchestration (v4)

Run from CLI (installed via pyproject.toml script):

heimdall-scan

Run as a Python module:

python -m heimdall.orchestration.orchestration4

Run from Python:

from heimdall.orchestration.orchestration4 import run_orchestration

# knobs: expand_scope=True, max_iterations=10, keep_messages=12
run_orchestration(expand_scope=False, max_iterations=1, keep_messages=5)

Change the target URL:

- Edit base_url in src/heimdall/orchestration/orchestration4.py, or
- Use the provided test wrapper run_local_test.py, which overrides run_orchestration to point to a test site

🔐 Security Testing Capabilities

Authentication & Session Management

✅ Login form bypass testing
✅ Session fixation detection
✅ Session timeout validation
✅ Multi-factor authentication bypass
✅ Credential brute force attacks

Input Validation Testing

✅ SQL injection (Union, Boolean, Time-based)
✅ Cross-site scripting (Reflected, Stored, DOM)
✅ Command injection
✅ LDAP injection
✅ XML injection

Authorization & Access Control

✅ Insecure Direct Object Reference (IDOR)
✅ Privilege escalation testing
✅ Horizontal access control bypass
✅ Vertical access control bypass
✅ Role-based access testing

API Security

✅ REST API enumeration
✅ GraphQL testing
✅ API versioning vulnerabilities
✅ Rate limiting bypass
✅ API key exposure

Infrastructure Security

✅ Network port scanning
✅ Service enumeration
✅ SSL/TLS configuration testing
✅ Web server vulnerability scanning
✅ Directory enumeration

📈 Performance & Scalability

Concurrent Execution

Async/await patterns for parallel tool execution
Configurable timeouts to prevent hanging
Resource pooling for browser instances
Thread management for CPU-intensive tasks
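
As an illustration of the async pattern with per-tool timeouts, the sketch below runs several coroutines in parallel with a concurrency cap. It is not Heimdall's scheduler: `run_with_timeout` and `run_all` are hypothetical names, and reusing the max_threads value of 5 as the default cap is an assumption.

```python
import asyncio
from typing import Any, Awaitable, Callable, Dict, List

Job = Callable[[], Awaitable[Any]]


async def run_with_timeout(name: str, job: Job, timeout: float) -> Dict[str, Any]:
    """Await one job, converting a timeout into a failed result."""
    try:
        output = await asyncio.wait_for(job(), timeout=timeout)
        return {"tool": name, "success": True, "output": output}
    except asyncio.TimeoutError:
        return {"tool": name, "success": False, "output": "", "error": "timed out"}


async def run_all(
    jobs: Dict[str, Job], timeout: float, max_concurrent: int = 5
) -> List[Dict[str, Any]]:
    """Run all jobs in parallel, at most max_concurrent at a time."""
    semaphore = asyncio.Semaphore(max_concurrent)

    async def guarded(name: str, job: Job) -> Dict[str, Any]:
        async with semaphore:
            return await run_with_timeout(name, job, timeout)

    # gather returns results in the same order as the jobs were passed in.
    return await asyncio.gather(*(guarded(n, j) for n, j in jobs.items()))
```

Because a timeout becomes a failed result instead of an exception, one hanging scan does not cancel the others.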

Error Handling

Graceful degradation when tools are unavailable
Fallback mechanisms for failed tests
Comprehensive logging for debugging
Timeout protection for long-running scans
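
Graceful degradation and timeout protection for external tools can be sketched with `subprocess.run`. This is an assumed shape, not the code in tools/tool_calls.py: `run_tool` is a hypothetical helper whose return keys mirror the ToolResult fields shown earlier, and the 300-second default follows the timeout in the configuration section.

```python
import subprocess
import time
from typing import Any, Dict, List


def run_tool(command: List[str], timeout: float = 300.0) -> Dict[str, Any]:
    """Run an external tool and always return a structured result."""
    started = time.monotonic()
    try:
        proc = subprocess.run(
            command, capture_output=True, text=True, timeout=timeout
        )
        success, output, error = proc.returncode == 0, proc.stdout, proc.stderr
    except FileNotFoundError:
        # Tool missing: report it instead of aborting the whole assessment.
        success, output, error = False, "", f"{command[0]} is not installed"
    except subprocess.TimeoutExpired:
        # subprocess.run kills the child process before raising this.
        success, output, error = False, "", f"timed out after {timeout}s"
    return {
        "success": success,
        "tool_name": command[0],
        "command": " ".join(command),
        "output": output,
        "error": error,
        "execution_time": time.monotonic() - started,
    }
```

Passing the command as a list avoids shell interpretation, which matters when arguments contain attacker-style payloads.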

Output Management

Structured JSON reports for programmatic processing
Human-readable summaries for analysts
Evidence preservation with screenshots and logs
Export capabilities for integration with other tools

🚀 Getting Started

Install Dependencies:

pip install playwright zapv2 requests
playwright install

Install Security Tools:

# On Ubuntu/Debian
apt-get install nmap sqlmap nikto gobuster ffuf hydra

# OWASP ZAP (download from official site)
# Configure ZAP API access

Run Orchestration:

# Option A: CLI
heimdall-scan

# Option B: module
python -m heimdall.orchestration.orchestration4

# Option C: Python import
python - <<'PY'
from heimdall.orchestration.orchestration4 import run_orchestration
run_orchestration(expand_scope=False, max_iterations=1, keep_messages=5)
PY


Set Environment Variables:

```bash
# Required for LLM-powered agents
export GEMINI_API_KEY=...        # Planner/Actioner/Reporter when using Gemini
export FIREWORKS_API_KEY=...     # ContextManager when using Fireworks models
```

📋 Key Features Summary

✅ AI-Powered Planning: Gemini 2.0-flash for intelligent test generation
✅ 26+ Security Tools: Integration with industry-standard tools
✅ Automated Tool Selection: Smart mapping based on plan content
✅ Browser Automation: Playwright for dynamic testing
✅ Comprehensive Reporting: Detailed vulnerability analysis
✅ OWASP Methodology: Industry-standard testing approaches
✅ Concurrent Execution: High-performance parallel testing
✅ Extensible Architecture: Easy to add new tools and methods
✅ Production Ready: Error handling and timeout management
✅ Evidence Collection: Screenshots, logs, and detailed output

Heimdall combines AI-driven test planning with automated execution of standard security tools to produce a structured vulnerability report.
