📊 Hedge Fund Tracker: Track SEC 13F/13D filings with AI-Powered Insights for Stock Investments

📊 Hedge Fund Tracker


If this tool is helping you, please ⭐ the repo! It really helps discoverability.

SEC 13F Filing Tracker | Institutional Portfolio Analysis | AI-Powered Stock Research

A comprehensive Python tool for tracking hedge fund portfolios through SEC filings (13F, 13D/G, Form 4). Transform raw SEC EDGAR data into actionable investment insights. Built for financial analysts, quantitative traders, and retail investors seeking to analyze institutional investor strategies, portfolio changes, and discover stock opportunities by following elite fund managers.

Keywords: SEC filings tracker, 13F analysis, hedge fund portfolio, institutional investors, stock research, investment intelligence, CUSIP converter, financial data scraper, AI stock analysis

🚀 Quick Start

# Clone the repository
git clone https://github.com/dokson/hedge-fund-tracker.git
cd hedge-fund-tracker

# Install dependencies
pipenv install

# Set up environment variables
cp .env.example .env
# Add your tokens/API keys (FinnHub, GitHub, Google AI Studio, Groq, HuggingFace, OpenRouter) to the .env file

# Run the application
pipenv run python -m app.main

✨ Key Features

| Feature | Description |
|---|---|
| 🆚 Comparative Analysis | Combines quarterly (13F) and non-quarterly (13D/G, Form 4) filings for an up-to-date view |
| 📋 Detailed Reports | Generates clear, console-based reports with intuitive formatting |
| 🗄️ Curated Database | Includes a list of top hedge funds and AI models, both easily editable via CSV files |
| 🔍 Ticker Resolution | Converts CUSIPs to tickers using a smart fallback system (yfinance, Finnhub, FinanceDatabase) |
| 🤖 Multi-Provider AI Analysis | Leverages different AI models to identify promising stocks based on filings |
| 🔀 Flexible Management | Offers multiple analysis modes: all funds, a single fund, or custom CIKs |
| ⚙️ Automated Data Update | Includes a GitHub Actions workflow to automatically fetch and commit the latest SEC filings |
| 🗃️ GICS Hierarchy | Features an autonomous parser to build a full GICS classification database |

📦 Installation

Prerequisites

  1. 📥 Clone and navigate:

    git clone https://github.com/dokson/hedge-fund-tracker.git
    cd hedge-fund-tracker
  2. 📲 Install dependencies: From the project root, run the following command. It creates a virtual environment and installs all required packages.

    pipenv install

    💡 Tip: If pipenv is not found, try python -m pipenv install instead. This can happen when the user scripts directory is not in your system's PATH.

  3. 🛠️ Configure environment: Create a .env file in the project root and add the API keys for the services you plan to use (all are optional).

    # Create environment file
    cp .env.example .env
    
    # Edit .env file and add your API keys:
    # FINNHUB_API_KEY="your_finnhub_key"
    # GITHUB_TOKEN="your_github_token"
    # GOOGLE_API_KEY="your_google_api_key"
    # GROQ_API_KEY="your_groq_api_key"
    # HF_TOKEN="your_hugging_face_token"
    # OPENROUTER_API_KEY="your_openrouter_api_key"
  4. ▶️ Run the script: Execute within the project's virtual environment:

    pipenv run python -m app.main
  5. 📜 Choose an action: Once the script starts, you'll see the main interactive menu for data analysis:

    ┌───────────────────────────────────────────────────────────────────────────────────┐
    │                                 Hedge Fund Tracker                                │
    ├───────────────────────────────────────────────────────────────────────────────────┤
    │  0. Exit                                                                          │
    │  1. View latest non-quarterly filings activity by funds (from 13D/G, Form 4)      │
    │  2. Analyze overall hedge-funds stock trends for a quarter                        │
    │  3. Analyze a specific fund's quarterly portfolio                                 │
    │  4. Analyze a specific stock's activity for a quarter                             │
    │  5. Run AI Analyst to find most promising stocks                                  │
    │  6. Run AI Due Diligence on a stock                                               │
    └───────────────────────────────────────────────────────────────────────────────────┘

Data Management

The data update operations (downloading and processing filings) live in a dedicated script. This keeps the main application focused on analysis, while the updater handles populating and refreshing the database.

To run them, launch the updater.py script from the project root:

pipenv run python -m database.updater

Database Updater

The updater.py script includes semi-automated maintenance tasks:

  • Sorting: Upon exit (option 0), the script automatically sorts the database/stocks.csv file by ticker to maintain performance and prevent Git diff noise.
  • Auto-Documentation: This README's excluded funds section is synchronized whenever the database is refreshed manually.

Running the updater opens a separate menu for data management:

┌───────────────────────────────────────────────────────────────────────────────┐
│                     Hedge Fund Tracker - Database Updater                     │
├───────────────────────────────────────────────────────────────────────────────┤
│  0. Exit                                                                      │
│  1. Generate latest 13F reports for all known hedge funds                     │
│  2. Fetch latest non-quarterly filings for all known hedge funds              │
│  3. Generate 13F report for a known hedge fund                                │
│  4. Manually enter a hedge fund CIK to generate a 13F report                  │
└───────────────────────────────────────────────────────────────────────────────┘
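The sort-on-exit maintenance task mentioned above can be pictured with a short sketch. This is not the updater's actual code: the column names ("CUSIP", "Ticker", "Name") are assumed from the "CUSIP-Ticker-Name" description of stocks.csv, and the function works on in-memory text rather than on the real file.

```python
import csv
import io


def sort_stocks_csv(text: str, key: str = "Ticker") -> str:
    """Return the CSV text with its data rows sorted by the given column."""
    reader = csv.DictReader(io.StringIO(text))
    rows = sorted(reader, key=lambda row: row[key])
    out = io.StringIO()
    writer = csv.DictWriter(
        out,
        fieldnames=reader.fieldnames,
        quoting=csv.QUOTE_ALL,   # keep every field quoted, as in the README samples
        lineterminator="\n",
    )
    writer.writeheader()
    writer.writerows(rows)
    return out.getvalue()


# Hypothetical sample content; the real file layout may differ.
sample = (
    '"CUSIP","Ticker","Name"\n'
    '"594918104","MSFT","Microsoft Corp"\n'
    '"037833100","AAPL","Apple Inc"\n'
)
print(sort_stocks_csv(sample))
```

Keeping the rows in a stable order means that adding one stock changes one line in the file, which is what keeps Git diffs small.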

GICS Classification

The project includes an autonomous GICS (Global Industry Classification Standard) parser (database/gics/updater.py). GICS was originally developed by MSCI and S&P; the parser scrapes Wikipedia to build the full hierarchy of 163 sub-industries. This gives the AI Analyst granular industry context without depending on third-party libraries.

API Configuration

The tool can utilize API keys for enhanced functionality, but all are optional:

| Service | Purpose | Get Free API Key |
|---|---|---|
| Finnhub | CUSIP to stock ticker conversion | Finnhub Keys |
| GitHub Models | Access to top-tier models (e.g., xAI Grok-3, OpenAI GPT-5, etc.) | GitHub Tokens |
| Google AI Studio | Access to Google Gemini models | AI Studio Keys |
| Groq AI | Access to various LLMs (e.g., OpenAI gpt-oss, Meta Llama, etc.) | Groq Keys |
| Hugging Face | Access to open weights models (e.g., DeepSeek R1, Kimi-Linear-48B, etc.) | HF Tokens |
| OpenRouter | Access to various LLMs (e.g., Claude Opus 4.5, GLM 4.5 Air, etc.) | OpenRouter Keys |

💡 Note: Ticker resolution primarily uses yfinance, which is free and requires no API key. If that fails, the system falls back to Finnhub (if an API key is provided), with the final fallback being FinanceDatabase.
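The fallback order described in this note can be sketched as a simple chain of resolvers. The three lookup functions below are hypothetical stand-ins with hard-coded data, not the project's real yfinance, Finnhub, or FinanceDatabase integrations; only the ordering logic is the point.

```python
from typing import Callable, Optional

Resolver = Callable[[str], Optional[str]]


def resolve_ticker(cusip: str, resolvers: list[Resolver]) -> Optional[str]:
    """Return the first ticker any resolver produces, trying them in order."""
    for resolver in resolvers:
        try:
            ticker = resolver(cusip)
        except Exception:
            # A failing provider should not stop the chain.
            continue
        if ticker:
            return ticker
    return None


# Hypothetical stand-ins for the three data sources (assumptions, not real API calls).
def from_yfinance(cusip: str) -> Optional[str]:
    return None  # pretend the primary source has no match


def from_finnhub(cusip: str) -> Optional[str]:
    return {"037833100": "AAPL"}.get(cusip)


def from_finance_database(cusip: str) -> Optional[str]:
    return {"594918104": "MSFT"}.get(cusip)


chain = [from_yfinance, from_finnhub, from_finance_database]
print(resolve_ticker("037833100", chain))
print(resolve_ticker("594918104", chain))
```

When no Finnhub key is configured, that resolver could simply be left out of the list, so the chain goes straight from the first source to the last.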

💡 Note: You don't need to use all the APIs. For the generative AI models (Google AI Studio, GitHub Models, Groq AI, Hugging Face, and OpenRouter), you only need the API keys for the services you plan to use. For instance, if you want to experiment with models like OpenAI GPT-4o mini, you just need a GitHub Token. Experimenting with different models is encouraged, as the quality of AI-generated analysis, both for identifying promising stocks and for conducting due diligence, can vary. However, top-performing stocks are typically identified consistently across all tested models. All APIs used in this project are currently free (with GitHub Models providing a generous free tier for developers).
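Since every key is optional, a quick check of which services are configured can be handy. The sketch below is only an illustration: the variable names come from the .env template shown in the installation steps, while the helper function itself is hypothetical and not part of the project.

```python
import os

# Environment variable per service, as listed in the .env template.
PROVIDER_KEYS = {
    "Finnhub": "FINNHUB_API_KEY",
    "GitHub Models": "GITHUB_TOKEN",
    "Google AI Studio": "GOOGLE_API_KEY",
    "Groq": "GROQ_API_KEY",
    "Hugging Face": "HF_TOKEN",
    "OpenRouter": "OPENROUTER_API_KEY",
}


def configured_providers(env=None) -> list[str]:
    """List the services whose key is present and non-empty."""
    env = os.environ if env is None else env
    return [name for name, var in PROVIDER_KEYS.items() if env.get(var, "").strip()]


# Example with an explicit mapping instead of the real environment.
print(configured_providers({"GITHUB_TOKEN": "ghp_example"}))
```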

📁 Project Structure

hedge-fund-tracker/
├── 📁 .github/
│   ├── 📁 scripts/
│   │   └── 🐍 fetcher.py           # Script for fetching filings (scheduled by workflows/filings-fetch.yml)
│   └── 📁 workflows/                # GitHub Actions for automation
│       ├── ⚙️ filings-fetch.yml    # GitHub Actions: Filings fetching job
│       └── ⚙️ python-tests.yml     # GitHub Actions: Unit tests
├── 📁 app/                          # Main application logic
│   └── ▶️ main.py                  # Main entry point for Data & AI analysis
├── 📁 database/                     # Data storage
│   ├── 📁 2025Q1/                  # Quarterly reports
│   │   ├── 📊 fund_1.csv           # Individual fund quarterly report
│   │   ├── 📊 fund_2.csv
│   │   └── 📊 fund_n.csv
│   ├── 📁 YYYYQN/
│   ├── 📁 GICS/
│   │   ├── 🗃️ hierarchy.csv        # Full GICS hierarchy
│   │   └── ▶️ updater.py           # GICS updater script
│   ├── 📝 hedge_funds.csv          # Curated hedge funds list -> EDIT THIS to add or remove funds to track
│   ├── 📝 models.csv               # LLMs list to use for AI Financial Analyst -> EDIT THIS to add or remove AI models
│   ├── 📊 non_quarterly.csv        # Stores latest 13D/G and Form 4 filings
│   ├── 📊 stocks.csv               # Master data for stocks (CUSIP-Ticker-Name)
│   └── ▶️ updater.py               # Main entry point for updating the database
├── 📁 tests/                        # Test suite
├── 📝 .env.example                 # Template for your API keys
├── ⛔ .gitignore                   # Git ignore rules
├── 🧾 LICENSE                      # MIT License
├── 🛠️ Pipfile                      # Project dependencies
├── 🔐 Pipfile.lock                 # Locked dependency versions
└── 📖 README.md                    # Project documentation (this file)

📝 Hedge Funds Configuration File: database/hedge_funds.csv contains the list of hedge funds to monitor (CIK, name, manager) and can also be edited at runtime.

📝 LLMs Configuration File: database/models.csv contains the list of available LLMs for AI analysis and can also be edited at runtime.

👨🏻‍💻 How This Tool Tracks Hedge Funds

This tracker leverages the following types of SEC filings to provide a comprehensive view of institutional activity.

  • 📅 Quarterly 13F Filings

    • Required for funds managing $100M+
    • Filed within 45 days of quarter-end
    • Shows portfolio snapshot on last day of quarter
  • 📝 Non-Quarterly 13D/G Filings

    • Required when acquiring 5%+ of company shares
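    • Schedule 13D is due within 5 business days of crossing the threshold (10 calendar days before the SEC's 2024 rule change); 13G deadlines vary by filer type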
    • Provides a timely view of significant investments
  • ✍🏻 Non-Quarterly SEC Form 4 Insider Filings

    • Filed by insiders (executives, directors) or large shareholders (>10%) when they trade company stocks
    • Must be filed within 2 business days of the transaction
    • Offers real-time insight into the actions of key individuals and institutions
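To make the 13F timing concrete, here is a small sketch that derives the quarter label (in the YYYYQN form used for the database folders) and the nominal 45-day deadline. The function names are hypothetical, and the sketch ignores the adjustment that applies when a deadline falls on a weekend or holiday.

```python
from datetime import date, timedelta


def quarter_label(day: date) -> str:
    """Format a date as a YYYYQN label, e.g. 2025Q1."""
    return f"{day.year}Q{(day.month - 1) // 3 + 1}"


def quarter_end(year: int, quarter: int) -> date:
    """Return the last calendar day of the given quarter (1-4)."""
    if quarter == 4:
        return date(year, 12, 31)
    # First day of the next quarter, minus one day.
    return date(year, quarter * 3 + 1, 1) - timedelta(days=1)


def thirteen_f_deadline(year: int, quarter: int) -> date:
    """Nominal 13F deadline: 45 calendar days after quarter-end."""
    return quarter_end(year, quarter) + timedelta(days=45)


print(quarter_label(date(2025, 2, 14)))  # 2025Q1
print(thirteen_f_deadline(2025, 1))      # 2025-05-15
```

A position opened on the first day of a quarter can therefore stay undisclosed in 13F data for more than four months, which is why the non-quarterly filings matter.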

🏢 Hedge Funds Selection

This tool tracks a curated list of what I consider the top-performing institutional investors that file with the U.S. SEC, identified by their performance over the last 3-5 years. The curation is the result of my own methodology, detailed below, which is designed to identify the top percentile of global investment funds.

Selection Methodology

Modern portfolio theory (MPT) offers many methods for quantifying the risk-return trade-off, but they are often ill-suited to the limited data available in public filings. The hedge_funds.csv list was therefore generated using my own custom selection algorithm, designed to identify top-performing funds while managing for volatility.

Note: The selection algorithm is external to this project and was used only to produce the curated hedge_funds.csv list.

My approach prioritizes high cumulative returns but also analyzes the path taken to achieve them. It penalizes volatility, similar to the Sharpe Ratio, but dynamically adjusts this penalty based on performance consistency. Likewise, it penalizes drawdowns, echoing the principle of the Sterling Ratio, but intentionally dampens the penalty to avoid overly punishing funds that recover effectively from temporary downturns.
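The selection algorithm itself is external to this project and is not reproduced here. For readers unfamiliar with the two ingredients it builds on, the sketch below shows the textbook versions only: a Sharpe-style ratio (with the risk-free rate assumed to be zero) and the maximum drawdown of a return series. The sample returns are made up for illustration.

```python
import statistics


def sharpe_like(returns: list[float]) -> float:
    """Mean periodic return divided by its sample standard deviation."""
    return statistics.mean(returns) / statistics.stdev(returns)


def max_drawdown(returns: list[float]) -> float:
    """Largest peak-to-trough decline of cumulative value, as a positive fraction."""
    value = peak = 1.0
    worst = 0.0
    for r in returns:
        value *= 1.0 + r
        peak = max(peak, value)
        worst = max(worst, (peak - value) / peak)
    return worst


# Hypothetical quarterly returns for one fund.
quarterly = [0.10, -0.20, 0.05, 0.15]
print(round(sharpe_like(quarterly), 3))
print(round(max_drawdown(quarterly), 3))
```

A fund that drops 20% and then recovers still shows that 20% drawdown, which is the kind of case the dampened penalty described above is meant to treat more leniently.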

List Management

The list of hedge funds is actively managed to maintain its quality; funds that underperform may be replaced, while new top performers are periodically added.

However, despite their strong performance, several funds with portfolios predominantly focused on Healthcare and Biotech, such as Nextech Invest, Enavate Sciences, Caligan Partners, and Boxer Capital Management, have been intentionally excluded. These funds invest in highly specialized sectors where I lack the necessary expertise. Consequently, I consider them too risky for my personal investment profile, given the complexity and volatility inherent in biotech and healthcare ventures.

Notable Exclusions

The quality of the output analysis is directly tied to the quality of the input data. To enhance the accuracy of the insights and opportunities identified, many popular high-profile funds have been intentionally excluded by design (the list below is automatically managed and capped at 50 funds; the full list is in excluded_hedge_funds.csv):

💡 Note: For convenience, key information for these funds, including their CIKs, is maintained in the database/excluded_hedge_funds.csv file.

Adding Custom Funds

Want to track additional funds? Simply edit database/hedge_funds.csv and add your preferred institutional investors. For example, to add Berkshire Hathaway, Pershing Square, and ARK Invest, add the following lines:

"CIK","Fund","Manager","Denomination","CIKs"
...
"0001067983","Berkshire Hathaway","Warren Buffett","Berkshire Hathaway Inc",""
"0001336528","Pershing Square","Bill Ackman","Pershing Square Capital Management, L.P.",""
"0001697748","ARK Invest","Cathie Wood","ARK Investment Management LLC",""

💡 Note: hedge_funds.csv currently includes not only traditional hedge funds but also other institutional investors (private equity funds, large banks, VCs, pension funds, etc., that file 13F to the SEC) selected from what I consider the top 5% of performers.

If you wish to track any of the Notable Exclusions hedge funds, you can copy the relevant rows from excluded_hedge_funds.csv into hedge_funds.csv.
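As a rough illustration of how rows in this file can be consumed, the sketch below parses two sample rows and collects every CIK associated with a fund. It is an assumption about usage rather than the tool's own loader; the CIKs column it reads is described under Columns for Custom Funds, and the helper name is hypothetical.

```python
import csv
import io

# Two rows in the hedge_funds.csv layout; the second uses the optional CIKs column.
SAMPLE = '''"CIK","Fund","Manager","Denomination","CIKs"
"0001067983","Berkshire Hathaway","Warren Buffett","Berkshire Hathaway Inc",""
"0001418814","ValueAct","Jeffrey Ubben","ValueAct Holdings, L.P.","0001418812"
'''


def all_ciks(row: dict) -> list[str]:
    """Return the primary CIK followed by any additional comma-separated CIKs."""
    extra = [cik.strip() for cik in row["CIKs"].split(",") if cik.strip()]
    return [row["CIK"], *extra]


funds = list(csv.DictReader(io.StringIO(SAMPLE)))
for fund in funds:
    print(fund["Fund"], all_ciks(fund))
```

Reading the CIKs as strings preserves the leading zeros, and the csv module handles the comma inside a quoted Denomination such as "ValueAct Holdings, L.P.".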

Columns for Custom Funds:

  • Denomination: This is the exact legal name used by the fund in its filings. It is essential for accurately processing non-quarterly filings (13D/G, Form 4) as the scraper uses it to identify the fund's specific transactions within complex filing documents.
  • CIKs: A comma-separated list of additional CIKs. This field is used to track filings from related entities or subsidiaries. Some investment firms have complex structures where different legal entities file separately (e.g., a management company and a holding company).
    • Example: Jeffrey Ubben's ValueAct Holdings (CIK = 0001418814) also has filings under ValueAct Capital Management (CIK = 0001418812). By adding 0001418812 to the CIKs column, the tool aggregates non-quarterly filings from both entities for a complete view.
"CIK","Fund","Manager","Denomination","CIKs"
"0001418814","ValueAct","Jeffrey Ubben","ValueAct Holdings, L.P.","0001418812"

🧠 AI Models Selection

The AI Financial Analyst's primary goal is to identify stocks with the highest growth potential based on hedge fund activity. It achieves this by calculating a "Promise Score" for each stock. This score is a weighted average of various metrics derived from 13F filings. The AI's first critical task is to act as a strategist, dynamically defining the heuristic by assigning the optimal weights for these metrics based on the market conditions of the selected quarter. Its second task is to provide quantitative scores (e.g., momentum, risk) for the top-ranked stocks.
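A weighted average of this kind can be sketched in a few lines. The metric names and weights below are invented for illustration; in the tool the metrics come from 13F data and the weights are chosen by the AI for the selected quarter.

```python
def promise_score(metrics: dict[str, float], weights: dict[str, float]) -> float:
    """Weighted average of the metrics, normalized by the sum of the weights."""
    total = sum(weights.values())
    if total == 0:
        raise ValueError("weights must not sum to zero")
    return sum(weights[name] * metrics.get(name, 0.0) for name in weights) / total


# Hypothetical metric names and values, each assumed to be scaled to the 0-1 range.
weights = {"net_buyers": 0.5, "new_positions": 0.3, "portfolio_pct_delta": 0.2}
stock = {"net_buyers": 0.8, "new_positions": 0.6, "portfolio_pct_delta": 0.5}
print(round(promise_score(stock, weights), 3))
```

Dividing by the sum of the weights means the model does not have to return weights that add up to exactly 1 for the scores to stay comparable across stocks.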

The models included in database/models.csv have been selected because they have demonstrated the best performance and reliability for these specific tasks. Through experimentation, they have proven effective at interpreting the prompts and providing insightful, well-structured responses.

💡 Note on Meta's llama-3.3-70b-versatile: while it can occasionally be less precise in defining the heuristic for the "Promise Score" compared to other top-tier models, it remains a valuable option. Its exceptional speed and lightweight nature make it ideal for rapid experimentation and iterative analysis, providing a useful trade-off between accuracy and performance. As the AI landscape evolves, it is expected that this model will eventually be replaced by newer alternatives that offer similar or better speed and efficiency.

💡 Note on xAI's Grok-3: This tool now supports GitHub Models, which provides access to Grok-3 and other next-generation models like GPT-5 and Llama 4. This integration allows for state-of-the-art financial reasoning and due diligence directly through your GitHub account.

💡 Note on OpenRouter: OpenRouter was initially included because it offered free access to top-tier models; while some are no longer free, you can still use it with this tool if you have an existing API key.

Adding Custom AI Models

You can easily add or change the AI models used for analysis by editing the database/models.csv file. This allows you to experiment with different Large Language Models (LLMs) from supported providers.

To add a new model, open database/models.csv and add a new row with the following columns:

  • ID: The specific model identifier as required by the provider's API.
  • Description: A brief, user-friendly description that will be displayed in the selection menu.
  • Client: The provider of the model. Must be one of GitHub, Google, Groq, HuggingFace, or OpenRouter.
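Before running an analysis, it can help to check that every row names a supported client. The sketch below assumes the CSV header uses exactly the three column names listed above, and the sample row (llama-3.3-70b-versatile served through Groq) is given only as a plausible example; the validation helper is hypothetical.

```python
import csv
import io

ALLOWED_CLIENTS = {"GitHub", "Google", "Groq", "HuggingFace", "OpenRouter"}

# Assumed header and one illustrative row.
SAMPLE = (
    '"ID","Description","Client"\n'
    '"llama-3.3-70b-versatile","Meta Llama 3.3 70B (fast)","Groq"\n'
)


def invalid_rows(text: str) -> list[dict]:
    """Return the rows whose Client value is not one of the supported providers."""
    reader = csv.DictReader(io.StringIO(text))
    return [row for row in reader if row["Client"] not in ALLOWED_CLIENTS]


print(invalid_rows(SAMPLE))  # an empty list means every row names a supported client
```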

Here are the official model lists for each provider:

⚠️ Limitations & Considerations

It's crucial to understand the inherent limitations of tracking investment strategies solely through SEC filings:

| Limitation | Impact | Mitigation |
|---|---|---|
| 🕒 Filing Delay | Data can be 45+ days old | Focus on long-term strategies |
| 🧩 Incomplete Picture | Only US long positions shown | Use as part of broader analysis |
| 📉 No Short Positions | Missing hedge information | Consider reported positions carefully |
| 🌎 Limited Scope | No non-US stocks or other assets | Supplement with additional data |

A Truly Up-to-Date View

Many tracking websites rely solely on quarterly 13F filings, which means their data can be over 45 days old and miss many significant trades. Non-quarterly filings like 13D/G and Form 4 are often ignored because they are more complex to process and merge.

This tracker helps overcome that limitation by integrating multiple filing types. When analyzing the most recent quarter, the tool automatically incorporates the latest data from 13D/G and Form 4 filings. As a result, the holdings, deltas, and portfolio percentages reflect not just the static 13F snapshot, but also any significant trades that have occurred since. This provides a more dynamic and complete picture of institutional activity.
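The idea of layering newer filings over the quarterly snapshot can be sketched with plain dictionaries. This is a simplified illustration, not the tool's merge logic: it assumes each non-quarterly entry reports the latest total share count for a ticker, and it defines delta as the change since the 13F snapshot.

```python
def merge_holdings(quarterly: dict[str, int], non_quarterly: dict[str, int]) -> dict[str, dict[str, int]]:
    """Overlay the latest non-quarterly share counts on a 13F snapshot."""
    merged = {}
    for ticker in sorted(set(quarterly) | set(non_quarterly)):
        base = quarterly.get(ticker, 0)
        latest = non_quarterly.get(ticker, base)  # keep the 13F value if nothing newer exists
        merged[ticker] = {"shares": latest, "delta": latest - base}
    return merged


# Hypothetical share counts.
snapshot_13f = {"AAPL": 1_000, "MSFT": 500}
later_filings = {"MSFT": 800, "NVDA": 200}

for ticker, row in merge_holdings(snapshot_13f, later_filings).items():
    print(ticker, row)
```

Positions with no newer filing keep their 13F values, so the merged view never loses information compared with the quarterly snapshot alone.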

⚙️ Automation with GitHub Actions

This repository includes a GitHub Actions workflow (.github/workflows/filings-fetch.yml) designed to keep your data effortlessly up-to-date by automatically fetching the latest SEC filings.

How It Works

  • Scheduled Runs: The workflow runs automatically to check for new 13F, 13D/G, and Form 4 filings from the funds you are tracking (hedge_funds.csv). It runs four times a day from Monday to Friday (at 01:30, 13:30, 17:30, and 21:30 UTC) and once on Saturday (at 04:00 UTC).
  • Safe Branching Strategy: Instead of committing directly to your main branch, the workflow pushes all new data to a dedicated branch named automated/filings-fetch.
  • User-Controlled Merging: This approach gives you full control. You can review the changes committed by the bot and then merge them into your main branch whenever you're ready. This prevents unexpected changes and allows you to manage updates at your own pace.
  • Automated Alerts: If the script encounters a non-quarterly filing where it cannot identify the fund owner based on your hedge_funds.csv configuration, it will automatically open a GitHub Issue in your repository, alerting you to a potential data mismatch that needs investigation.

How to Enable It

  1. Fork the Repository: Create your own fork of this project on GitHub.
  2. Enable Actions: GitHub Actions are typically enabled by default on forked repositories. You can verify this under the Actions tab of your fork.
  3. Configure Secrets: For the workflow to resolve tickers and create issues, add your FINNHUB_API_KEY as a repository secret. In your fork, go to Settings > Secrets and variables > Actions to add it.

🗃️ Technical Stack

| 🗂️ Category | 🦾 Technology |
|---|---|
| Core | Python 3.13+, pipenv |
| Web Scraping | Requests, Beautiful Soup, lxml |
| Reliability | Tenacity (Smart retries for API rate-limiting and AI responses) |
| Config | python-dotenv |
| Data Processing | pandas, csv |
| Stocks Libraries | Finnhub-Stock-API, FinanceDatabase |
| Gen AI | python-toon, Google Gen AI SDK, OpenAI |

🤝🏼 Contributing & Support

💬 Loved it? Help it grow

✍🏻 Feedback

This tool is in active development, and your input is valuable. If you have any suggestions or ideas for new features, please feel free to get in touch.

📚 References

🙏🏼 Acknowledgments

This project began as a fork of sec-web-scraper-13f by Gary Pang. The original tool provided a solid foundation for scraping 13F filings from the SEC's EDGAR database. It has since been significantly re-architected and expanded into a comprehensive analysis platform, incorporating multiple filing types, AI-driven insights, and automated data management.

📄 License

This project is released under the MIT License, an open-source license that grants you the freedom to use, modify, and distribute the software. For the full terms, please see the LICENSE file.
