GhostPWN

Autonomous Web Penetration Testing Agent
Multi-provider LLM support · Human-in-the-loop · Lightweight architecture


Overview

GhostPWN is an autonomous web penetration testing agent designed for academic research in offensive security. It orchestrates multiple LLM providers to perform grey-box web application testing through a multi-agent pipeline, with human oversight at every critical decision point.

The core research contribution is comparative analysis of LLM providers (Claude, GPT, Gemini) on offensive security tasks — evaluating reasoning quality, vulnerability detection accuracy, and exploit generation across providers.


Architecture

┌─────────────────────────────────────────────────────┐
│                    GhostPWN Agent                   │
│                                                     │
│  ┌──────────┐  ┌──────────┐  ┌─────────┐  ┌──────┐  │
│  │  Recon   │→ │ Analysis │→ │ Exploit │→ │Report│  │
│  │  Agent   │  │  Agent   │  │  Agent  │  │Agent │  │
│  └──────────┘  └──────────┘  └─────────┘  └──────┘  │
│       │              │             │           │    │
│       └──────────────┴─────────────┴───────────┘    │
│                        │                            │
│              ┌─────────▼─────────┐                  │
│              │  Vercel AI SDK    │                  │
│              │  (Orchestrator)   │                  │
│              └─────────┬─────────┘                  │
│         ┌──────────────┼──────────────┐             │
│         ▼              ▼              ▼             │
│    ┌─────────┐   ┌──────────┐   ┌────────┐          │
│    │ OpenAI  │   │Anthropic │   │ Other  │          │
│    └─────────┘   └──────────┘   └────────┘          │
│                                                     │
│  ┌──────────────────┐  ┌────────────────────────┐   │
│  │      SQLite      │  │  OpenTUI Terminal UI   │   │
│  └──────────────────┘  └────────────────────────┘   │
│                                                     │
│  ┌──────────────────────────────────────────────┐   │
│  │    HTTP: Playwright · Selenium · requests    │   │
│  └──────────────────────────────────────────────┘   │
└─────────────────────────────────────────────────────┘

Pipeline: Recon → Analysis → Exploit → Report

Each agent operates autonomously within its own phase. The pipeline pauses for human approval before the Exploit Agent executes anything against the target.
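
The phase ordering and the approval gate can be illustrated with a minimal sketch. Every name here (`runPipeline`, `PhaseHandler`, `ApprovalGate`, the stub handlers) is hypothetical and not taken from the GhostPWN codebase. The handlers are synchronous stubs so the example runs without an LLM; real phases would presumably be asynchronous.

```typescript
// Hypothetical names throughout; none of these come from the GhostPWN codebase.
type Phase = "recon" | "analysis" | "exploit" | "report";

interface PhaseResult {
  phase: Phase;
  status: "done" | "skipped";
  notes: string;
}

// A phase handler sees the results so far and returns a note for its own phase.
type PhaseHandler = (previous: readonly PhaseResult[]) => string;

// The approval callback stands in for the human operator (for example a TUI prompt).
type ApprovalGate = (previous: readonly PhaseResult[]) => boolean;

const PHASE_ORDER: readonly Phase[] = ["recon", "analysis", "exploit", "report"];

function runPipeline(
  handlers: Record<Phase, PhaseHandler>,
  approveExploit: ApprovalGate,
): PhaseResult[] {
  const results: PhaseResult[] = [];
  for (const phase of PHASE_ORDER) {
    // Only the exploit phase is gated; the other phases run unattended.
    if (phase === "exploit" && !approveExploit(results)) {
      results.push({ phase, status: "skipped", notes: "operator declined" });
      continue;
    }
    results.push({ phase, status: "done", notes: handlers[phase](results) });
  }
  return results;
}

// Stub handlers so the sketch runs without any LLM or network access.
const stubHandlers: Record<Phase, PhaseHandler> = {
  recon: () => "enumerated endpoints",
  analysis: (prev) => `reviewed ${prev.length} prior phase(s)`,
  exploit: () => "ran approved proof of concept",
  report: (prev) =>
    `summarised ${prev.filter((r) => r.status === "done").length} completed phase(s)`,
};

const declined = runPipeline(stubHandlers, () => false);
const approved = runPipeline(stubHandlers, () => true);
```

In this sketch a declined approval skips only the exploit phase, so the report phase still runs and can record that no exploitation took place.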


Tech Stack

| Layer | Technology |
| --- | --- |
| Runtime | Bun |
| Language | TypeScript (strict, no any) |
| Framework | SolidJS (OpenTUI template) |
| LLM Orchestration | Vercel AI SDK |
| Terminal UI | OpenTUI (React-based TUI) |
| Database | SQLite |
| HTTP Clients | Playwright, Selenium, requests |

Core Deliverables

  1. Agent Orchestration Framework — Multi-agent pipeline powered by Vercel AI SDK with phase-based task delegation.
  2. OpenTUI Terminal Interface — Real-time state sync and interactive terminal UI for monitoring agent progress.
  3. Multi-Provider LLM Integration — Comparative execution across OpenAI, Anthropic, and local models.
  4. Human-in-the-Loop Workflow — Mandatory human validation before exploit execution.
  5. SQLite Knowledge Base — Persistent storage for reconnaissance data, vulnerability fingerprints, and session history.
  6. Academic Documentation — Publication-ready analysis and methodology documentation.
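
As a rough illustration of the comparative execution in deliverable 3, the sketch below sends one prompt to several providers and records each outcome. The `ProviderAdapter` interface and the mock providers are assumptions made for this example. In a real build each adapter would presumably wrap a Vercel AI SDK call, which is omitted here so the code runs offline.

```typescript
// Hypothetical adapter interface; a real build would wrap each SDK provider behind it.
interface ProviderAdapter {
  name: string;
  complete: (prompt: string) => string;
}

interface ProviderRun {
  provider: string;
  ok: boolean;
  output: string; // model output on success, error message on failure
}

// Send the identical prompt to every provider so the outputs are directly comparable.
// A failing provider is recorded instead of aborting the whole comparison.
function runAcrossProviders(
  prompt: string,
  providers: readonly ProviderAdapter[],
): ProviderRun[] {
  return providers.map((p) => {
    try {
      return { provider: p.name, ok: true, output: p.complete(prompt) };
    } catch (err) {
      const message = err instanceof Error ? err.message : String(err);
      return { provider: p.name, ok: false, output: message };
    }
  });
}

// Mock providers (placeholders, not real model clients).
const mockProviders: ProviderAdapter[] = [
  { name: "mock-a", complete: (prompt) => `A:${prompt.length}` },
  {
    name: "mock-b",
    complete: () => {
      throw new Error("rate limited");
    },
  },
];

const runs = runAcrossProviders("List likely injection points", mockProviders);
```

Recording failures as data rather than throwing keeps a single provider outage from invalidating a whole comparison run.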

Key Constraints

  • No Docker/Temporal dependencies — Self-contained, single-process architecture.
  • Lightweight — Minimal external dependencies; runs on a single machine.
  • Research-grade — Code quality and documentation suitable for academic publication.
  • Ethical boundaries — Designed for authorized testing against known vulnerable applications (OWASP Juice Shop, DVWA).
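
One way to enforce the authorized-testing constraint in code is an origin allowlist that is checked before any request is sent. The sketch below is an assumption about how such a guard could look, not GhostPWN's actual mechanism. The function name `isInScope` and the allowlisted origins are placeholders for local lab instances.

```typescript
// Hypothetical allowlist; the origins below are placeholders for local lab instances.
const AUTHORIZED_ORIGINS: ReadonlySet<string> = new Set([
  "http://localhost:3000",
  "http://localhost:8080",
]);

// Returns true only when the target parses as a URL and its origin
// (scheme + host + port) exactly matches an allowlisted entry.
function isInScope(
  target: string,
  allowed: ReadonlySet<string> = AUTHORIZED_ORIGINS,
): boolean {
  let url: URL;
  try {
    url = new URL(target);
  } catch {
    return false; // unparseable targets are rejected outright
  }
  return allowed.has(url.origin);
}
```

Comparing the parsed origin rather than doing a string prefix match means a look-alike host such as `http://localhost.evil.example:3000/` is rejected.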

Evaluation Targets

| Target | Type | Purpose |
| --- | --- | --- |
| OWASP Juice Shop | Intentionally vulnerable | Grey-box web app testing |
| DVWA | Intentionally vulnerable | Baseline vulnerability coverage |

Research Goals

  • Comparative LLM Analysis — Benchmark Claude, GPT, and Gemini on vulnerability detection, exploit reasoning, and report generation.
  • Autonomous Agent Evaluation — Measure end-to-end pipeline effectiveness across different provider configurations.
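
Vulnerability detection accuracy could be scored per provider with precision, recall, and F1 against a known list of vulnerabilities for the target. The README does not specify a metric, so this choice is an assumption, and the finding identifiers below are invented for illustration rather than measured results.

```typescript
interface DetectionScore {
  truePositives: number;
  falsePositives: number;
  falseNegatives: number;
  precision: number;
  recall: number;
  f1: number;
}

// Compare the findings one provider reported against the known ground truth.
function scoreFindings(
  reported: readonly string[],
  groundTruth: readonly string[],
): DetectionScore {
  const truth = new Set(groundTruth);
  const found = new Set(reported); // de-duplicate repeated findings
  const truePositives = Array.from(found).filter((id) => truth.has(id)).length;
  const falsePositives = found.size - truePositives;
  const falseNegatives = truth.size - truePositives;
  const precision = found.size === 0 ? 0 : truePositives / found.size;
  const recall = truth.size === 0 ? 0 : truePositives / truth.size;
  const f1 =
    precision + recall === 0 ? 0 : (2 * precision * recall) / (precision + recall);
  return { truePositives, falsePositives, falseNegatives, precision, recall, f1 };
}

// Illustrative identifiers only; these are not results from any real run.
const exampleTruth = ["sqli-login", "xss-search", "idor-basket", "weak-jwt"];
const exampleReported = ["sqli-login", "xss-search", "open-redirect"];
const exampleScore = scoreFindings(exampleReported, exampleTruth);
```

For the illustrative inputs, two of the three reported findings match the ground truth, giving a precision of 2/3 and a recall of 1/2.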

Getting Started

```shell
# Clone
git clone https://github.com/GhostPWN/ghostpwn.git
cd ghostpwn

# Install dependencies
bun install

# Run
bun run start
```

License

MIT License. See LICENSE for details.


Contributing

Contributions are welcome. Please open an issue before submitting a PR to discuss the proposed change.


Built for academic research in offensive security.
