vachakb/StrategicSWOT


SEC SWOT Analysis Dashboard

A comprehensive Python application for analyzing SEC 10-K filings and generating automated SWOT (Strengths, Weaknesses, Opportunities, Threats) reports using natural language processing and machine learning techniques.

🌟 Features

  • Automated SEC Filing Analysis: Download and process 10-K filings directly from the SEC EDGAR database
  • AI-Powered SWOT Classification: Uses machine learning to classify text into SWOT categories
  • Interactive Dashboard: Modern Streamlit web interface with dark theme and professional styling
  • Comprehensive Visualizations: Interactive charts and graphs powered by Plotly
  • Export Options: Download results in CSV and JSON formats (PDF export is coming soon)
  • Multi-Company Support: Analyze multiple companies and time periods
  • Real-Time Processing: Live progress tracking during analysis

📊 Dashboard Preview

The application features three main modes:

  • 📈 Quick Analysis: Select a ticker and date range for automated analysis
  • 📋 Upload Documents: Process custom SEC filings (coming soon)
  • 📊 View Results: Browse and visualize previously generated reports

🛠️ Installation

  1. Clone the repository:

    git clone <repository-url>
    cd nlp-project
  2. Install dependencies:

    pip install -r requirements.txt
  3. Run the dashboard:

    streamlit run dashboard.py

📋 Requirements

streamlit>=1.28.0
plotly>=5.15.0
pandas>=1.5.0
datamule
tqdm
pathlib

🏗️ Project Structure

├── dashboard.py                  # Main Streamlit dashboard
├── swot_analysis.ipynb           # Jupyter notebook for SWOT analysis
├── requirements.txt              # Python dependencies
├── sec_10k_sentences.csv         # Raw SEC filing sentences
├── sec_10k_sentences_clean.csv   # Cleaned sentences
├── sec_portfolio/                # Downloaded SEC filings
│   ├── 000032019323000106.tar
│   └── 000032019324000123.tar
└── sec_swot_output/              # Analysis results
    ├── index.json                # Master index of reports
    ├── swot_AAPL_*.csv           # Individual SWOT data
    └── swot_report_AAPL_*.json   # Structured reports
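
The archives in sec_portfolio/ appear to be named after SEC accession numbers with the dashes removed. EDGAR writes accession numbers as a 10-digit filer ID, a 2-digit year, and a 6-digit sequence number. The sketch below converts a file name back to the dashed form, which can help when looking a filing up on EDGAR. The helper names are hypothetical and are not part of this project.

```python
def format_accession(raw: str) -> str:
    """Convert an 18-digit accession number to EDGAR's dashed form."""
    digits = raw.strip()
    if len(digits) != 18 or not digits.isdigit():
        raise ValueError(f"expected 18 digits, got {raw!r}")
    # Layout: 10-digit filer ID, 2-digit year, 6-digit sequence number.
    return f"{digits[:10]}-{digits[10:12]}-{digits[12:]}"


def accession_from_archive(filename: str) -> str:
    """Hypothetical helper: derive the accession number from a .tar file name."""
    stem = filename.rsplit(".", 1)[0]  # drop the extension
    return format_accession(stem)


print(accession_from_archive("000032019324000123.tar"))
# prints 0000320193-24-000123
```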

🚀 Usage

Quick Analysis Mode

  1. Launch the dashboard: streamlit run dashboard.py
  2. Select 📈 Quick Analysis from the sidebar
  3. Choose a company ticker (e.g., AAPL, MSFT, GOOGL)
  4. Set your desired date range
  5. Click 🚀 Run Analysis
  6. View results in the 📊 View Results section

Dashboard Navigation

📈 Quick Analysis

  • Company Selection: Choose from popular tickers (AAPL, MSFT, GOOGL, AMZN, TSLA, META, NVDA) or enter a custom ticker
  • Date Range: Set start and end dates for filing analysis (2020-2025)
  • One-Click Analysis: Automated processing with real-time progress tracking

📋 Upload Documents (Coming Soon)

  • Support for .txt, .pdf, and .html files
  • Custom document processing capabilities
  • Batch upload functionality

📊 View Results

  • Interactive report selector
  • Professional SWOT visualizations
  • Detailed category breakdowns with key themes and insights
  • Export options (CSV and JSON; PDF coming soon)

Jupyter Notebook Analysis

For advanced users and development:

  1. Open swot_analysis.ipynb
  2. Configure parameters:
    TICKERS = ["AAPL"]  # Companies to analyze
    FORMS = ["10-K"]    # SEC form types
    DATE_RANGE = ("2023-01-01", "2024-12-31")
  3. Run all cells to perform analysis

📈 Output Files

The analysis generates several output files:

  • CSV Files: Raw SWOT classifications with confidence scores
  • JSON Reports: Structured reports with key themes and insights
  • Index File: Master catalog of all generated reports
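
Because the CSV files carry a confidence score per classification, a common first step is to keep only the rows above a threshold. The sketch below shows one way to do that with the standard library. The column names (sentence, label, confidence) and the sample rows are made up for illustration, so check the header of a real swot_AAPL_*.csv file before reusing this.

```python
import csv
import io

# Made-up rows with hypothetical column names; the real CSV layout may differ.
SAMPLE_CSV = """sentence,label,confidence
Placeholder sentence one.,Strength,0.91
Placeholder sentence two.,Threat,0.62
Placeholder sentence three.,Weakness,0.48
"""


def high_confidence_rows(csv_text: str, threshold: float = 0.6) -> list:
    """Return the rows whose confidence is at or above the threshold."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [row for row in reader if float(row["confidence"]) >= threshold]


rows = high_confidence_rows(SAMPLE_CSV)
for row in rows:
    print(row["label"], row["confidence"], row["sentence"])
```

To read a file from disk instead, pass the file's text (for example from pathlib.Path.read_text) as csv_text.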

Sample JSON Report Structure

{
  "meta": {
    "ticker": "AAPL",
    "accession": "000032019324000123",
    "filing_date": "2024-11-01"
  },
  "report": {
    "Strength": {
      "count": 25,
      "top_bullets": ["Key strength indicators..."],
      "key_themes": ["company", "growth", "innovation"],
      "key_insights": ["Strategic advantages identified..."],
      "summary": "25 strength indicators found"
    },
    "Weakness": { ... },
    "Opportunity": { ... },
    "Threat": { ... }
  }
}
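
As a rough illustration of consuming a report with this structure, the sketch below parses the JSON and collects the count for each SWOT category. It embeds only the fields shown in the sample above. The categories that are elided there are left out here too and default to 0. The function name category_counts is hypothetical.

```python
import json

SWOT_CATEGORIES = ("Strength", "Weakness", "Opportunity", "Threat")

# Only the fields shown in the sample; elided categories are omitted.
SAMPLE = """
{
  "meta": {
    "ticker": "AAPL",
    "accession": "000032019324000123",
    "filing_date": "2024-11-01"
  },
  "report": {
    "Strength": {
      "count": 25,
      "top_bullets": ["Key strength indicators..."],
      "key_themes": ["company", "growth", "innovation"],
      "key_insights": ["Strategic advantages identified..."],
      "summary": "25 strength indicators found"
    }
  }
}
"""


def category_counts(report_doc: dict) -> dict:
    """Map each SWOT category to its count, using 0 when a category is missing."""
    report = report_doc.get("report", {})
    return {cat: report.get(cat, {}).get("count", 0) for cat in SWOT_CATEGORIES}


doc = json.loads(SAMPLE)
counts = category_counts(doc)
print(doc["meta"]["ticker"], counts)
```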

🎨 Dashboard Features

Professional Dark Theme

  • Gradient backgrounds and modern styling
  • Color-coded SWOT categories:
    • 💪 Strengths: Green gradient
    • ⚠️ Weaknesses: Red gradient
    • 🎯 Opportunities: Blue gradient
    • ⚡ Threats: Orange gradient

Interactive Visualizations

  • Pie Charts: SWOT category distribution with pull-out effects
  • Key Themes: Highlighted tag-style display of common topics
  • Sample Evidence: Expandable sections with filing excerpts
  • Metrics Cards: Professional summary statistics

Export Capabilities

  • CSV Export: Raw classification data for further analysis
  • JSON Export: Complete structured reports
  • PDF Generation: Professional report formatting (coming soon)

🔧 Key Components

dashboard.py

Main Streamlit application featuring:

  • Modern dark theme with professional styling
  • Interactive visualizations with Plotly
  • Real-time analysis progress tracking
  • Multi-format export capabilities
  • Responsive design with custom CSS

swot_analysis.ipynb

Core analysis engine providing:

  • SEC filing download via datamule
  • Text preprocessing and sentence extraction
  • ML-based SWOT classification
  • Report generation and export

🎯 SWOT Classification

The system uses keyword-based weak supervision to classify sentences:

  • Strengths: Competitive advantages, strong performance metrics, market leadership
  • Weaknesses: Risk factors, operational challenges, regulatory concerns
  • Opportunities: Growth potential, market expansion, new technologies
  • Threats: External risks, competitive pressures, economic factors
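
To make the idea concrete, here is a minimal sketch of keyword-based labeling. It is not the notebook's implementation, and the keyword lists are hypothetical and far smaller than a usable lexicon. Each sentence gets the category with the most keyword hits, or no label when nothing matches. In a weak-supervision setup, noisy labels like these would typically serve as training data for a classifier.

```python
import re

# Hypothetical keyword lists, chosen only for illustration.
KEYWORDS = {
    "Strength": {"leadership", "advantage", "strong"},
    "Weakness": {"challenge", "challenges", "dependence"},
    "Opportunity": {"growth", "expansion", "emerging"},
    "Threat": {"competition", "competitive", "downturn"},
}


def label_sentence(sentence: str):
    """Return the category with the most keyword hits, or None if there are none."""
    tokens = set(re.findall(r"[a-z]+", sentence.lower()))
    scores = {cat: len(tokens & words) for cat, words in KEYWORDS.items()}
    # On a tie, max() keeps the first category in KEYWORDS order.
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else None


print(label_sentence("We hold a strong market leadership position."))
print(label_sentence("Expansion into emerging markets."))
print(label_sentence("The weather was nice."))
```

Returning None for unmatched sentences keeps them out of the labeled set rather than forcing a guess.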

📊 Visualization Features

Interactive Charts

  • Distribution Analysis: Pie charts showing SWOT category proportions
  • Theme Analysis: Most frequent topics per category
  • Evidence Display: Representative sentences for each category
  • Executive Metrics: Key performance indicators and summaries
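
One simple way to approximate "most frequent topics per category" is to count word frequencies over the sentences assigned to a category. The sketch below does that with a tiny, hypothetical stop-word list. The project may derive its key_themes differently.

```python
import re
from collections import Counter

# Hypothetical, deliberately small stop-word list.
STOP_WORDS = {"the", "a", "an", "and", "of", "to", "in", "our", "we", "is", "are", "for", "on"}


def key_themes(sentences, top_n=3):
    """Return the top_n most frequent non-stop-word tokens across the sentences."""
    counts = Counter()
    for sentence in sentences:
        for token in re.findall(r"[a-z]+", sentence.lower()):
            if token not in STOP_WORDS and len(token) > 2:
                counts[token] += 1
    return [word for word, _ in counts.most_common(top_n)]


example = [
    "Growth in services",
    "Services growth continues",
    "Innovation drives growth",
]
print(key_themes(example, top_n=2))
# prints ['growth', 'services']
```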

Professional Styling

  • Dark Theme: Modern gradient backgrounds
  • Color Coding: Intuitive category identification
  • Responsive Layout: Adapts to different screen sizes
  • Professional Typography: Clean, readable font choices

🔍 Data Sources

  • SEC EDGAR Database: Official 10-K filings
  • Supported Companies: All publicly traded US companies
  • Time Range: 2020-present (configurable)
  • Filing Types: 10-K annual reports (expandable)

🚧 Roadmap

Near Term

  • PDF document upload support
  • Enhanced text preprocessing
  • Batch analysis capabilities
  • PDF report generation

Medium Term

  • Advanced NLP models (BERT, GPT)
  • Comparative analysis across companies
  • Trend analysis over time
  • Email notification system

Long Term

  • Multi-language support
  • Real-time market integration
  • Automated report scheduling
  • API development

🎮 Getting Started

  1. First Time Setup:

    git clone <repository-url>
    cd nlp-project
    pip install -r requirements.txt
  2. Launch Dashboard:

    streamlit run dashboard.py
  3. Run Your First Analysis:

    • Navigate to "Quick Analysis" mode
    • Select "AAPL" from the ticker dropdown
    • Set date range to last year
    • Click "Run Analysis"
    • View results in "View Results" mode

🙏 Acknowledgments

  • datamule: For SEC filing download and processing
  • Streamlit: For the interactive web interface framework
  • Plotly: For advanced data visualizations
  • SEC EDGAR: For providing public access to corporate filings
