shahdk/fishbone-generator

🐟 Fishbone Threat Modeling System

An AI-powered threat modeling tool that uses a fishbone diagram approach to analyze security risks. The system employs multiple specialized AI agents to conduct interactive interviews, grade responses, and provide security recommendations.

🎯 Overview

This system helps security professionals and teams create comprehensive threat models through:

  • STRIDE-Based Threat Analysis: Systematic evaluation of Spoofing, Tampering, Repudiation, Information Disclosure, Denial of Service, and Elevation of Privilege threats
  • Interactive Interviews: AI agents ask targeted security questions about each system component
  • Intelligent Grading: Automated evaluation of answer quality for specificity and risk information
  • Threat-Specific Recommendations: Context-aware mitigation strategies based on identified vulnerabilities
  • Visual Threat Modeling: Interactive fishbone diagrams showing attack paths and actors
  • Multi-Provider LLM Support: Works with OpenAI, Azure OpenAI, and Anthropic

πŸ—οΈ Architecture

The system uses a multi-agent architecture:

  1. Interviewer Agent: Conducts comprehensive threat modeling interviews using security best practices

    • Asks structured questions for each node (data stores, access points, actors)
    • Applies STRIDE methodology to identify potential threats
    • Probes for security controls, vulnerabilities, and attack vectors
  2. Grading Agent: Evaluates answer quality on multiple dimensions

    • Clarity: How well-structured and understandable is the answer?
    • Specificity: Does it include concrete details (names, configurations, versions)?
    • Risk Information: How well does it inform about security risks and controls?
  3. Recommendation Agent: Generates threat-specific mitigation strategies

    • Maps answers to STRIDE threat categories
    • Provides prioritized recommendations (Critical, High, Medium, Low)
    • Includes detailed implementation steps for each mitigation
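The three-agent flow above can be sketched in a few lines of Python. This is an illustrative mock, not the repository's actual API: the real classes in backend/agents/ call an LLM, and the class and method names here are assumptions.

```python
from dataclasses import dataclass

# Illustrative stand-ins for the three agents; the real implementations
# live in backend/agents/ and delegate scoring to an LLM.
@dataclass
class Grade:
    clarity: float
    specificity: float
    risk_info: float

class InterviewerAgent:
    def next_question(self, node: str) -> str:
        # The real agent applies STRIDE to the node type; here, a template.
        return f"What authentication controls protect '{node}'?"

class GradingAgent:
    def grade(self, answer: str) -> Grade:
        # Crude heuristic in place of LLM scoring: longer, concrete answers
        # that name real controls score higher.
        specificity = min(10.0, len(answer.split()) / 3)
        risk = 8.0 if any(w in answer.lower() for w in ("mfa", "encryption", "rbac")) else 3.0
        return Grade(clarity=7.0, specificity=specificity, risk_info=risk)

class RecommendationAgent:
    def recommend(self, answer: str) -> list:
        recs = []
        if "password" in answer.lower():
            recs.append("High: replace password auth with managed identities")
        if "mfa" not in answer.lower():
            recs.append("Critical: require MFA for all interactive users")
        return recs

# One turn of the interview loop:
interviewer, grader, recommender = InterviewerAgent(), GradingAgent(), RecommendationAgent()
question = interviewer.next_question("prod-db-01")
answer = "Access requires Azure AD with MFA; data is protected by TDE encryption."
grade = grader.grade(answer)
recs = recommender.recommend(answer)
```

In the real system the session manager runs this loop per fishbone node until the completion criteria are met.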

Threat Modeling Methodology

The system follows a structured approach inspired by STRIDE and attack tree analysis:

Starting from a data breach scenario, the system:

  1. Identifies Assets - Maps all data storage locations and sensitive resources
  2. Traces Access Paths - Discovers all entities (users, services, applications) that access resources
  3. Analyzes Threats - For each component, evaluates:
    • Spoofing: Can identities be forged or impersonated?
    • Tampering: Can data or configurations be maliciously modified?
    • Repudiation: Can actions be performed without audit trails?
    • Information Disclosure: Can sensitive data be exposed?
    • Denial of Service: Can the component be made unavailable?
    • Elevation of Privilege: Can attackers gain unauthorized permissions?
  4. Evaluates Controls - Assesses existing security measures (authentication, encryption, monitoring)
  5. Recommends Mitigations - Provides specific, actionable security improvements
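The per-component STRIDE pass in step 3 amounts to a coverage checklist. The category names below come straight from the methodology; the control-to-threat mapping and component data are illustrative assumptions.

```python
# STRIDE questions applied to one component (step 3 above).
STRIDE = {
    "Spoofing": "Can identities be forged or impersonated?",
    "Tampering": "Can data or configurations be maliciously modified?",
    "Repudiation": "Can actions be performed without audit trails?",
    "Information Disclosure": "Can sensitive data be exposed?",
    "Denial of Service": "Can the component be made unavailable?",
    "Elevation of Privilege": "Can attackers gain unauthorized permissions?",
}

def analyze(component: str, controls: set) -> dict:
    """Flag STRIDE categories with no matching documented control."""
    # Hypothetical mapping from control type to the threats it addresses.
    mitigations = {
        "authentication": {"Spoofing", "Elevation of Privilege"},
        "integrity_hashing": {"Tampering"},
        "audit_logging": {"Repudiation"},
        "encryption": {"Information Disclosure"},
        "rate_limiting": {"Denial of Service"},
    }
    covered = set().union(*(mitigations.get(c, set()) for c in controls))
    return {threat: threat in covered for threat in STRIDE}

coverage = analyze("prod-db-01", {"authentication", "encryption"})
open_threats = [t for t, ok in coverage.items() if not ok]
```

Categories left uncovered (here: Tampering, Repudiation, Denial of Service) become the follow-up questions and recommendations for that component.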

The session completes when all paths in the fishbone end at properly secured actor nodes with documented authentication and authorization controls.

🚀 Quick Start

Prerequisites

  • Python 3.9+
  • API keys for at least one LLM provider:
    • OpenAI API key
    • Azure OpenAI credentials
    • Anthropic API key

Installation

Option 1: Using uv (Recommended - Fast!)

uv is an extremely fast Python package installer and resolver.

  1. Install uv (if not already installed)

    # macOS/Linux
    curl -LsSf https://astral.sh/uv/install.sh | sh
    
    # Windows
    powershell -c "irm https://astral.sh/uv/install.ps1 | iex"
    
    # Or with pip
    pip install uv
  2. Clone the repository

    git clone https://github.com/shahdk/fishbone-generator.git
    cd fishbone-generator
  3. Create virtual environment and install dependencies

    uv venv
    source .venv/bin/activate  # On Windows: .venv\Scripts\activate
    uv pip install -r requirements.txt
  4. Configure environment

    cp .env.example .env
    # Edit .env with your API keys
  5. Run the application

    uv run python -m backend.main
    # Or if venv is activated:
    python -m backend.main
  6. Open your browser

    http://localhost:8000
    

Option 2: Using pip (Traditional)

  1. Clone the repository

    git clone https://github.com/shahdk/fishbone-generator.git
    cd fishbone-generator
  2. Create virtual environment

    python -m venv venv
    source venv/bin/activate  # On Windows: venv\Scripts\activate
  3. Install dependencies

    pip install -r requirements.txt
  4. Configure environment

    cp .env.example .env
    # Edit .env with your API keys
  5. Run the application

    python -m backend.main
  6. Open your browser

    http://localhost:8000
    

Option 3: Using the run scripts

We provide convenience scripts that handle setup automatically:

# Auto-detect and use uv if available, otherwise use pip
./run.sh       # Linux/macOS
run.bat        # Windows

# Force use of uv (faster)
./run-uv.sh    # Linux/macOS
run-uv.bat     # Windows

These scripts will:

  • Detect and use uv if available (10-100x faster!)
  • Create a virtual environment if it doesn't exist
  • Install dependencies automatically
  • Validate your configuration
  • Start the application

Quick Reference:

| Script | Tool | Virtual Env | Use Case |
|--------|------|-------------|----------|
| run.sh / run.bat | Auto-detect uv/pip | venv or .venv | General use; adapts to what's installed |
| run-uv.sh / run-uv.bat | uv (required) | .venv | When you want guaranteed fast installs |
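The tool-selection step of run.sh can be approximated as follows. This is a sketch of the auto-detect behaviour, not the script's actual contents (the real script also installs dependencies and starts the app):

```shell
#!/bin/sh
# Prefer uv when it is on PATH; otherwise fall back to pip.
if command -v uv >/dev/null 2>&1; then
    INSTALLER="uv pip"
    VENV_DIR=".venv"
else
    INSTALLER="pip"
    VENV_DIR="venv"
fi
echo "installer: $INSTALLER  venv: $VENV_DIR"
```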

βš™οΈ Configuration

Environment Variables

Edit .env file with your credentials:

# Choose provider: openai, azure_openai, or anthropic
LLM_PROVIDER=openai

# OpenAI Configuration
OPENAI_API_KEY=sk-...
OPENAI_MODEL=gpt-4

# Azure OpenAI Configuration
AZURE_OPENAI_API_KEY=...
AZURE_OPENAI_ENDPOINT=https://your-resource.openai.azure.com/
AZURE_OPENAI_DEPLOYMENT=your-deployment-name
AZURE_OPENAI_API_VERSION=2024-02-01

# Anthropic Configuration
ANTHROPIC_API_KEY=sk-ant-...
ANTHROPIC_MODEL=claude-3-5-sonnet-20241022

Question Configuration

Customize starting questions in data/questions_config.json:

{
  "starting_questions": [
    {
      "question": "Where is the sensitive data stored?",
      "expected_insights": "Identify data storage locations",
      "category": "data_storage"
    }
  ]
}
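Loading and sanity-checking that file might look like this. The field names come from the example above; the validation logic itself is illustrative, not the system's actual loader.

```python
import json

# Example config in the shape shown above; in the app it would be read
# from data/questions_config.json instead of an inline string.
raw = """
{
  "starting_questions": [
    {
      "question": "Where is the sensitive data stored?",
      "expected_insights": "Identify data storage locations",
      "category": "data_storage"
    }
  ]
}
"""

config = json.loads(raw)
for q in config["starting_questions"]:
    # Every entry must carry all three fields used by the interviewer.
    missing = {"question", "expected_insights", "category"} - q.keys()
    if missing:
        raise ValueError(f"question entry missing fields: {missing}")

first = config["starting_questions"][0]
```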

📖 Usage Guide

Starting a Session

  1. Select LLM Provider: Choose your preferred AI provider
  2. Start Session: Click "Start New Session"
  3. Answer Questions: Provide detailed, specific answers about your system
  4. Review Grading: See how well your answer addresses security concerns
  5. Get Recommendations: Receive tailored security mitigation strategies

Understanding STRIDE Threat Categories

The system evaluates each component against the STRIDE threat model:

| Threat | Description | Example Questions |
|--------|-------------|-------------------|
| Spoofing | Can an attacker impersonate users, systems, or services? | "Could someone fake credentials to access this system?" |
| Tampering | Can data or configurations be maliciously modified? | "Could an attacker alter data in transit or at rest?" |
| Repudiation | Can actions be performed without audit trails? | "Can someone deny they performed an action?" |
| Information Disclosure | Can sensitive data be exposed to unauthorized parties? | "Could an attacker access confidential information?" |
| Denial of Service | Can the system be made unavailable? | "Could an attacker overload or crash this component?" |
| Elevation of Privilege | Can attackers gain unauthorized permissions? | "Could someone escalate from user to admin access?" |

Best Practices for Answers

When answering threat modeling questions:

  • Be Specific: Use exact names, versions, and configurations

    • Good: "Azure SQL Database 'prod-db-01' with TDE encryption"
    • Bad: "A SQL database with some encryption"
  • Include Security Details: Mention authentication, encryption, access controls

    • Good: "MFA required via Azure AD with Conditional Access policies"
    • Bad: "Users need to login"
  • Address STRIDE Threats: Explicitly mention how each threat is mitigated

    • "Spoofing is prevented by certificate-based authentication"
    • "Tampering is detected through cryptographic hashing"
    • "All actions are logged in Azure Monitor for non-repudiation"
  • Provide Context: Explain how components interact and data flows

    • "Data flows from ADF to Kusto using managed identity authentication"
  • Mention Risks and Gaps: Acknowledge known vulnerabilities or concerns

    • "Legacy service accounts still use password authentication (planned for deprecation)"

Interactive Fishbone Diagram

  • Click nodes to view questions, answers, and recommendations
  • Drag nodes to rearrange the visualization
  • Zoom in/out for better viewing
  • Export the diagram as SVG

🎨 Features

Multi-Agent System

┌─────────────────┐
│  User Interface │
└────────┬────────┘
         │
    ┌────▼────┐
    │ Session │
    │ Manager │
    └────┬────┘
         │
      ┌──┴──────────┬──────────────┐
      │             │              │
┌─────▼─────┐ ┌─────▼─────┐ ┌──────▼───────┐
│Interviewer│ │  Grading  │ │Recommendation│
│   Agent   │ │   Agent   │ │    Agent     │
└───────────┘ └───────────┘ └──────────────┘

Node Types

  • Root: The breach scenario (data manipulation/exfiltration)
  • Resource: Data stores (databases, storage accounts, data lakes)
  • Access Point: Services that access resources (ADF, Functions, APIs)
  • Actor: Users, service principals, identities
  • Control: Security controls (auth, encryption, monitoring)

Grading Criteria

Answers are scored (0-10) on:

  • Clarity: How well-structured and clear is the answer?
  • Specificity: Does it include concrete details and names?
  • Risk Information: How well does it inform about security risks?
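The three dimension scores might roll up into the single 0-10 grades shown elsewhere in this README like this. The equal weighting is an assumption for illustration, not necessarily the grading agent's actual formula.

```python
def overall_grade(clarity: float, specificity: float, risk_info: float) -> float:
    """Average the three 0-10 dimension scores (equal weights assumed)."""
    for score in (clarity, specificity, risk_info):
        if not 0 <= score <= 10:
            raise ValueError("dimension scores must be between 0 and 10")
    return round((clarity + specificity + risk_info) / 3, 1)

grade = overall_grade(clarity=9.0, specificity=8.0, risk_info=8.5)
```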

Session Completion

Sessions complete when:

  • All fishbone paths end at actor/user nodes
  • Authentication/authorization has been discussed for each actor
  • Minimum grading thresholds are met
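Concretely, the completion check amounts to inspecting every leaf of the fishbone graph. The node shape and the 6.0 threshold below are illustrative assumptions:

```python
MIN_GRADE = 6.0  # assumed minimum average grade per node

def session_complete(leaves: list) -> bool:
    """A session is done when every leaf is an actor whose auth/authz
    has been discussed and whose answers meet the grading threshold."""
    return all(
        node["type"] == "actor"
        and node["auth_discussed"]
        and node["avg_grade"] >= MIN_GRADE
        for node in leaves
    )

leaves = [
    {"type": "actor", "auth_discussed": True, "avg_grade": 8.2},
    {"type": "access_point", "auth_discussed": False, "avg_grade": 7.1},
]
```

Here the second leaf is still an access point, so the interview would continue down that path until it terminates at a secured actor.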

🔌 API Reference

Create Session

POST /api/sessions
Content-Type: application/json

{
  "llm_provider": "openai"
}

Submit Answer

POST /api/sessions/answer
Content-Type: application/json

{
  "session_id": "uuid",
  "answer": "Our data is stored in Azure Kusto..."
}

Get Fishbone Graph

GET /api/sessions/{session_id}/fishbone

Get Recommendations

GET /api/sessions/{session_id}/recommendations
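The request shapes above can be expressed as a tiny helper module. URLs and field names are taken from the reference; actually sending the requests with an HTTP client such as requests is left out so the sketch stays self-contained.

```python
import json

BASE = "http://localhost:8000"  # the Quick Start default

def create_session_request(provider: str):
    """Build URL and JSON body for POST /api/sessions."""
    return f"{BASE}/api/sessions", json.dumps({"llm_provider": provider})

def submit_answer_request(session_id: str, answer: str):
    """Build URL and JSON body for POST /api/sessions/answer."""
    return f"{BASE}/api/sessions/answer", json.dumps(
        {"session_id": session_id, "answer": answer}
    )

def fishbone_url(session_id: str) -> str:
    """URL for GET /api/sessions/{session_id}/fishbone."""
    return f"{BASE}/api/sessions/{session_id}/fishbone"

url, body = create_session_request("openai")
```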

📂 Project Structure

fishbone-generator/
├── backend/
│   ├── agents/               # AI agent implementations
│   │   ├── interviewer_agent.py
│   │   ├── grading_agent.py
│   │   └── recommendation_agent.py
│   ├── models/               # Data models
│   │   ├── fishbone.py       # Fishbone graph structure
│   │   └── session.py        # Session and Q&A models
│   ├── services/             # Business logic
│   │   ├── llm_config.py     # Multi-provider LLM config
│   │   └── session_manager.py # Workflow orchestration
│   ├── api/                  # FastAPI routes
│   │   └── routes.py
│   ├── config.py             # Configuration management
│   └── main.py               # Application entry point
├── frontend/
│   ├── static/
│   │   ├── js/
│   │   │   ├── fishbone.js   # D3.js visualization
│   │   │   └── app.js        # Main UI logic
│   │   └── css/
│   │       └── style.css     # Styling
│   └── templates/
│       └── index.html        # Main page
├── data/
│   ├── sessions/             # Saved sessions (JSON)
│   └── questions_config.json # Question templates
├── requirements.txt
├── .env.example
└── README.md

🔒 Security Considerations

  • API keys are stored in .env (never commit to git)
  • Sessions are saved locally in data/sessions/
  • No data is sent to external services except LLM providers
  • All communication uses HTTPS in production

πŸ› οΈ Development

Using uv for Development

If you're using uv, you get significantly faster dependency management:

# Sync dependencies (faster than pip install)
uv pip sync requirements.txt

# Install a new package and update requirements
uv pip install package-name
uv pip freeze > requirements.txt

# Run the application with uv
uv run python -m backend.main

# Update all dependencies
uv pip compile requirements.txt --upgrade
uv pip sync requirements.txt

Running Tests

# With uv
uv run pytest tests/

# Traditional
pytest tests/

Adding New Agents

  1. Create agent class in backend/agents/
  2. Implement required methods
  3. Register in session_manager.py

Customizing Node Types

Edit backend/models/fishbone.py:

from enum import Enum

class NodeType(str, Enum):
    CUSTOM_TYPE = "custom_type"

📊 Example Workflow

  1. Question: "Where is your sensitive data stored?"

    • Answer: "In Azure Kusto cluster 'prod-kusto-01' with encryption at rest"
    • Grade: 8.5/10 (specific, mentions security)
    • Recommendations: Enable customer-managed keys, implement backup encryption
  2. Question: "Who has access to Azure Kusto?"

    • Answer: "Data engineers via Azure AD and our ADF instance 'prod-adf'"
    • Grade: 7.8/10 (good detail, could improve on access levels)
    • Recommendations: Implement RBAC, enable MFA, audit access logs
  3. Question: "How are data engineers authenticated?"

    • Answer: "Azure AD with MFA required for all users"
    • Grade: 9.2/10 (excellent specificity)
    • Recommendations: Consider conditional access policies, implement PIM

🤝 Contributing

Contributions are welcome! Please:

  1. Fork the repository
  2. Create a feature branch
  3. Make your changes
  4. Submit a pull request

πŸ“ License

MIT License - see LICENSE file for details

πŸ™ Acknowledgments

  • Visualization powered by D3.js
  • Backend framework: FastAPI
  • LLM integration: OpenAI, Azure OpenAI, Anthropic

📞 Support

For issues and questions:

  • Check TROUBLESHOOTING.md for common issues and solutions
  • Open an issue on GitHub
  • Review example sessions in /data/examples

Common Issue: Questions Repeating

If you see the same question repeatedly:

  1. Clear old sessions: Run python clear_sessions.py
  2. Restart the application: Stop and restart the server
  3. Create a new session: Start a fresh threat modeling session in the browser

Old sessions may use cached question logic. Always start a new session after updating the code.

πŸ—ΊοΈ Roadmap

  • Export threat model reports (PDF/Word)
  • Integration with JIRA/Azure DevOps for tracking
  • Pre-built templates for common architectures
  • Collaboration features (multi-user sessions)
  • Integration with cloud provider APIs for auto-discovery
  • Deeper STRIDE framework integration
  • Attack tree visualization option
  • Risk scoring and prioritization

Made with ❤️ for the security community
