A modern Python web application for aggregating and browsing AI tools. Built with FastHTML for the frontend and featuring AI-powered search capabilities.
Live at: drose.io/aitools
The AI Tools Website aggregates various AI tools and presents them in a responsive, searchable interface. Recent refactoring has separated core functionalities into distinct modules (web, search, logging, data management, and storage) to better organize and scale the application.
- Modular Architecture: Separation of concerns across data processing, logging, search, and storage.
- Modern Tech Stack: Built with FastHTML for server-side rendering.
- Enhanced Search: AI-powered search with support for OpenAI and Tavily integrations.
- Flexible Storage: Supports both local storage and Minio S3-compatible storage.
- Robust Logging: Improved logging configuration for easier debugging and monitoring.
- Efficient Dependency Management: UV used for dependency synchronization and task execution.
.
├── ai_tools_website/      # Main application package
│   ├── __init__.py        # Package initializer
│   ├── config.py          # Application configuration
│   ├── data_manager.py    # Data processing and validation
│   ├── logging_config.py  # Logging configuration
│   ├── search.py          # AI-powered search implementation
│   ├── storage.py         # Storage interfaces (local/Minio)
│   ├── web.py             # FastHTML web server
│   ├── utils/             # Utility functions
│   └── static/            # Client-side assets
│
├── scripts/               # Automation scripts
│   ├── crontab            # Scheduled task configuration
│   └── run-update.sh      # Tool update script
│
├── data/                  # Data storage directory
├── logs/                  # Application logs
│
├── docker-compose.yml     # Docker Compose configuration
├── Dockerfile             # Web service container
├── Dockerfile.updater     # Update service container
├── pyproject.toml         # Python project configuration
└── uv.lock                # UV dependency lock file
- Web Interface:
  - The FastHTML server (web.py) renders a responsive UI with real-time client-side search.
- Data Management & Search:
  - Data is processed and validated in data_manager.py.
  - search.py leverages AI integrations to provide enhanced search functionality.
- Storage & Logging:
  - storage.py handles file storage, supporting local and Minio backends.
  - logging_config.py sets up comprehensive logging for monitoring and debugging.
The system uses a multi-stage pipeline for discovering and validating AI tools:
- Search Integration
  - Uses the Tavily API for initial tool discovery
  - Focuses on high-quality domains (github.com, producthunt.com, huggingface.co, replicate.com)
  - Implements caching in development mode for faster iteration
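For illustration, a minimal discovery sketch using the tavily-python client; the query string here is a hypothetical example, not one of the pipeline's actual queries:

```python
from tavily import TavilyClient

client = TavilyClient(api_key="tvly-...")

# Restrict discovery to the high-quality domains listed above.
results = client.search(
    "new open source AI image generation tools",  # hypothetical query
    include_domains=["github.com", "producthunt.com", "huggingface.co", "replicate.com"],
    max_results=10,
)
for hit in results["results"]:
    print(hit["title"], hit["url"])
```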
- Validation Pipeline
  - Multi-stage verification using LLMs:
    - Initial filtering of search results (confidence threshold: 80%)
    - Page content analysis and verification (confidence threshold: 90%)
    - Category assignment based on existing tool context
  - URL validation to filter out listing/search pages
  - Async processing for improved performance
- Deduplication System
  - Two-pass deduplication:
    - Quick URL-based matching
    - LLM-based semantic comparison for similar tools
  - Confidence-based decision making for updates vs. new entries
  - Smart merging of tool information when duplicates found
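A sketch of the first, URL-based pass; the helper names are hypothetical. Only candidates that survive this cheap filter would proceed to the LLM-based semantic comparison:

```python
from urllib.parse import urlparse

def normalize_url(url: str) -> str:
    """Reduce a URL to a comparison key: lowercase host, no scheme,
    no www prefix, no trailing slash."""
    parsed = urlparse(url)
    host = parsed.netloc.lower().removeprefix("www.")
    return f"{host}{parsed.path.rstrip('/')}"

def quick_duplicates(new_tool: dict, existing: list[dict]) -> list[dict]:
    """First pass: exact match on normalized URL."""
    key = normalize_url(new_tool["url"])
    return [t for t in existing if normalize_url(t["url"]) == key]
```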
- Data Models
  - ToolUpdate: Tracks tool verification decisions
  - SearchAnalysis: Manages search result analysis
  - DuplicateStatus: Handles deduplication decisions
  - Strong typing with Pydantic for data validation
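The three model names come from the repo; the exact fields below are illustrative assumptions showing how Pydantic typing and the confidence gates described above could fit together:

```python
from pydantic import BaseModel, Field

class ToolUpdate(BaseModel):
    """Verification decision for one candidate tool (fields assumed)."""
    name: str
    url: str
    is_ai_tool: bool
    confidence: float = Field(ge=0.0, le=1.0)

class SearchAnalysis(BaseModel):
    """LLM analysis of a batch of search results (fields assumed)."""
    updates: list[ToolUpdate]

class DuplicateStatus(BaseModel):
    """Deduplication decision for a candidate vs. an existing entry (fields assumed)."""
    is_duplicate: bool
    confidence: float = Field(ge=0.0, le=1.0)

def passes_initial_filter(update: ToolUpdate, threshold: float = 0.80) -> bool:
    """Initial pass keeps candidates at >= 80% confidence; page-level
    verification applies the stricter 90% threshold."""
    return update.is_ai_tool and update.confidence >= threshold
```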
- Categorization
  - Dynamic category management
  - LLM-powered category suggestions
  - Supported categories:
    - Language Models
    - Image Generation
    - Audio & Speech
    - Video Generation
    - Developer Tools
    - Other
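One way to keep LLM suggestions inside the supported set is to validate against a fixed Literal; a sketch under that assumption (the model and helper are hypothetical, not the repo's actual code):

```python
from typing import Literal, get_args
from pydantic import BaseModel

Category = Literal[
    "Language Models", "Image Generation", "Audio & Speech",
    "Video Generation", "Developer Tools", "Other",
]

class CategorySuggestion(BaseModel):
    """Structured-output target for an LLM category prompt (shape assumed)."""
    category: Category
    reasoning: str

def coerce_category(raw: str) -> str:
    """Fall back to 'Other' when the model proposes an off-list category."""
    return raw if raw in get_args(Category) else "Other"
```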
The updater service (Dockerfile.updater) implements:
- Scheduled tool discovery using supercronic
- Automatic deduplication of new entries
- Health monitoring of the update process
- Configurable update frequency via crontab
- Weekly supercronic job calls `run-enhancement.sh`, which executes `uv run python -m ai_tools_website.v1.content_enhancer_v2` inside the updater container.
- The V2 enhancer uses a multi-stage pipeline (Tavily search + LLM analysis) to enrich tool records with detailed information, installation commands, and feature lists.
- Quality Tiering: Tools are automatically assigned to tiers (Tier 1, Tier 2, Tier 3, or noindex) based on importance signals like GitHub stars and HuggingFace downloads. This ensures resources are focused on high-value tools.
- Regeneration limits (`CONTENT_ENHANCER_MAX_PER_RUN`) are configurable; `CONTENT_ENHANCER_MODEL` must be set in the environment.
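A sketch of how tier assignment from importance signals could look; the thresholds and scoring rule here are assumptions, not the enhancer's actual cutoffs:

```python
def assign_tier(github_stars: int = 0, hf_downloads: int = 0) -> str:
    """Map importance signals (GitHub stars, HuggingFace downloads) to a
    quality tier. Thresholds below are illustrative only."""
    score = max(github_stars, hf_downloads // 100)
    if score >= 10_000:
        return "tier1"
    if score >= 1_000:
        return "tier2"
    if score >= 100:
        return "tier3"
    return "noindex"  # too little signal to spend enhancement budget on
```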
The application implements a flexible storage layer:
- Minio Integration
  - S3-compatible object storage
  - Automatic bucket creation and management
  - LRU caching for improved read performance
  - Graceful handling of initialization (empty data)
  - Content-type aware storage (application/json)
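A minimal sketch of this behavior using the minio client API; the endpoint, credentials, bucket, and object names are hypothetical:

```python
import json
from functools import lru_cache
from io import BytesIO

from minio import Minio
from minio.error import S3Error

client = Minio("localhost:9000", access_key="...", secret_key="...", secure=False)
BUCKET, OBJECT = "ai-tools", "tools.json"  # hypothetical names

if not client.bucket_exists(BUCKET):
    client.make_bucket(BUCKET)  # automatic bucket creation

@lru_cache(maxsize=1)
def load_tools() -> dict:
    try:
        response = client.get_object(BUCKET, OBJECT)
        return json.loads(response.read())
    except S3Error:
        return {"tools": [], "last_updated": ""}  # graceful empty-data init

def save_tools(data: dict) -> None:
    payload = json.dumps(data).encode()
    client.put_object(BUCKET, OBJECT, BytesIO(payload), len(payload),
                      content_type="application/json")
    load_tools.cache_clear()  # invalidate the read cache after a write
```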
- Data Format
  - JSON-based storage for flexibility
  - Schema:
    {
      "tools": [
        {"name": "string", "description": "string", "url": "string", "category": "string"}
      ],
      "last_updated": "string"
    }
  - Atomic updates with cache invalidation
  - Error handling for storage operations
- Development Features
  - Local filesystem fallback
  - Development mode caching
  - Configurable secure/insecure connections
  - Comprehensive logging of storage operations
The frontend is built with FastHTML for efficient server-side rendering:
- Architecture
  - Server-side rendering with FastHTML components
  - Async request handling with uvicorn
  - In-memory caching with background refresh
  - Health check endpoint for monitoring
- UI Components
  - Responsive grid layout for tool cards
  - Real-time client-side search filtering
  - Category-based organization
  - Dynamic tool count display
  - GitHub integration corner
- Performance Features
  - Background cache refresh mechanism
  - Efficient DOM updates via client-side JS
  - Static asset serving (CSS, JS, images)
  - Optimized search with data attributes
- Development Mode
  - Hot reload support
  - Configurable port via environment
  - Static file watching
  - Detailed request logging
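A stripped-down sketch of the server shape, assuming FastHTML's fast_app helper; the route bodies, card markup, and in-memory store are illustrative, not the repo's actual components in web.py:

```python
from fasthtml.common import fast_app, serve, Div, H3, P, A, Titled

app, rt = fast_app()

TOOLS = {"tools": []}  # refreshed in the background in the real app

@rt("/health")
def health():
    return "OK"  # health check endpoint for monitoring

@rt("/")
def index():
    cards = [
        Div(H3(A(t["name"], href=t["url"])), P(t["description"]),
            cls="card", data_category=t["category"])  # data attributes drive client-side search
        for t in TOOLS["tools"]
    ]
    # Dynamic tool count in the title, cards in a responsive grid.
    return Titled(f"AI Tools ({len(TOOLS['tools'])})", Div(*cards, cls="grid"))

serve()  # runs uvicorn; honors the configured port
```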
# Install UV if you haven't already
pip install uv
# Install dependencies
uv sync
# Set up environment variables (copy from .env.example)
cp .env.example .env
# Run the web server
uv run python -m ai_tools_website.web
# Run background search/updater
uv run python -m ai_tools_website.search

Visit https://drose.io/aitools or http://localhost:8000 (for local development) in your browser.
The application uses environment variables for configuration. Copy .env.example to .env and configure the following:
- `WEB_PORT`: Web server port (default: 8000)
- `LOG_LEVEL`: Logging verbosity (default: INFO)
When running the search module:

- `--cache-searches`: Cache Tavily search results for faster iteration
- `--dry-run`: Run without saving any changes

API keys and models:

- `OPENAI_API_KEY`: OpenAI API key for enhanced search
- `TAVILY_API_KEY`: Tavily API key for additional search features
- `CONTENT_ENHANCER_MODEL`: Model for content enhancement (required, no default)
- `SEARCH_MODEL`: Model for search and deduplication (required, no default)
- `MAINTENANCE_MODEL`: Model for maintenance tasks (required, no default)
- `WEB_SEARCH_MODEL`: Model for web search API calls (required, no default)
- `LANGCHAIN_API_KEY`: Optional LangChain integration
- `LANGCHAIN_TRACING_V2`: Enable LangChain tracing (default: false)
- `LANGCHAIN_PROJECT`: LangChain project name
- `AITOOLS_STORAGE_BACKEND`: Storage backend (`minio` or `local`, default: `minio`)
- `AITOOLS_LOCAL_DATA_DIR`: Local storage directory for tools/comparisons (default: `dev_cache`)
- `TOOLS_FILE`: Path to tools data file when using local storage
- `AITOOLS_SLUG_REGISTRY_FILE`: Optional slug registry path for local storage
If using Minio for storage, configure:
- `MINIO_ENDPOINT`: Minio server endpoint
- `MINIO_ACCESS_KEY`: Minio access key
- `MINIO_SECRET_KEY`: Minio secret key
- `MINIO_BUCKET_NAME`: Bucket name for tool storage
See .env.example for a template with default values.
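A sketch of reading required-versus-defaulted variables consistent with the list above; the `require_env` helper is hypothetical, not the repo's config.py:

```python
import os

def require_env(name: str) -> str:
    """Fail fast on required settings that have no default."""
    value = os.environ.get(name)
    if not value:
        raise RuntimeError(f"{name} must be set in the environment")
    return value

WEB_PORT = int(os.environ.get("WEB_PORT", "8000"))   # defaulted
LOG_LEVEL = os.environ.get("LOG_LEVEL", "INFO")      # defaulted
SEARCH_MODEL = require_env("SEARCH_MODEL")           # required, no default
CONTENT_ENHANCER_MODEL = require_env("CONTENT_ENHANCER_MODEL")
```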
- Refactored the codebase to separate concerns:
  - data_manager.py now handles data processing and validation.
  - search.py was refactored for clarity and integration with AI services.
- Improved logging configuration in logging_config.py.
- Enhanced storage interface in storage.py to support multiple backends.
- Adopted UV for dependency management and task execution best practices.
The application is containerized using Docker with two services:
- Web Service
  - Serves the main web application
  - Built from `Dockerfile`
  - Exposes the configured web port
  - Includes health checks for reliability
- Updater Service
  - Runs scheduled tool updates using supercronic
  - Built from `Dockerfile.updater`
  - Automatically keeps tool data fresh
  - Includes health monitoring
  - Nightly sitemap exports run inside the updater container via `run-sitemaps.sh` (scheduled in `scripts/crontab` at 05:00 UTC).
  - You can generate the XML bundle manually with `uv run python -m ai_tools_website.v1.sitemap_builder --dry-run`, or omit `--dry-run` to publish directly to MinIO.
  - Sitemaps are stored under the `sitemaps/` prefix in object storage and served through `/sitemap.xml` plus `/sitemaps/<file>.xml` routes.
To deploy using Docker Compose:
# Build and start all services
docker compose up -d
# View logs
docker compose logs -f
# Stop services
docker compose down

Make sure to configure your .env file before deployment. See the Configuration section above for required variables.
This project is licensed under the Apache License 2.0.