A modern Python web application for aggregating and browsing AI tools. Built with FastHTML for the frontend and featuring AI-powered search capabilities.
Live at: drose.io/aitools
The AI Tools Website aggregates various AI tools and presents them in a responsive, searchable interface. Recent refactoring has separated core functionalities into distinct modules (web, search, logging, data management, and storage) to better organize and scale the application.
- Modular Architecture: Separation of concerns across data processing, logging, search, and storage.
- Modern Tech Stack: Built with FastHTML for server-side rendering.
- Enhanced Search: AI-powered search with support for OpenAI and Tavily integrations.
- Flexible Storage: Supports both local storage and Minio S3-compatible storage.
- Robust Logging: Improved logging configuration for easier debugging and monitoring.
- Efficient Dependency Management: UV used for dependency synchronization and task execution.
.
├── ai_tools_website/      # Main application package
│   ├── __init__.py        # Package initializer
│   ├── config.py          # Application configuration
│   ├── data_manager.py    # Data processing and validation
│   ├── logging_config.py  # Logging configuration
│   ├── search.py          # AI-powered search implementation
│   ├── storage.py         # Storage interfaces (local/Minio)
│   ├── web.py             # FastHTML web server
│   ├── utils/             # Utility functions
│   └── static/            # Client-side assets
│
├── scripts/               # Automation scripts
│   ├── crontab            # Scheduled task configuration
│   └── run-update.sh      # Tool update script
│
├── data/                  # Data storage directory
├── logs/                  # Application logs
│
├── docker-compose.yml     # Docker Compose configuration
├── Dockerfile             # Web service container
├── Dockerfile.updater     # Update service container
├── pyproject.toml         # Python project configuration
└── uv.lock                # UV dependency lock file
- Web Interface:
  - The FastHTML server (web.py) renders a responsive UI with real-time client-side search.
- Data Management & Search:
  - Data is processed and validated in data_manager.py.
  - search.py leverages AI integrations to provide enhanced search functionality.
- Storage & Logging:
  - storage.py handles file storage, supporting local and Minio backends.
  - logging_config.py sets up comprehensive logging for monitoring and debugging.
The system uses a multi-stage pipeline for discovering and validating AI tools:
- Search Integration
  - Uses the Tavily API for initial tool discovery
  - Focuses on high-quality domains (github.com, producthunt.com, huggingface.co, replicate.com)
  - Implements caching in development mode for faster iteration
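For illustration, a minimal discovery sketch using the tavily-python client; the query string here is a hypothetical example, not one of the pipeline's actual queries:

```python
from tavily import TavilyClient

client = TavilyClient(api_key="tvly-...")

# Restrict discovery to the high-quality domains listed above.
results = client.search(
    "new open source AI image generation tools",  # hypothetical query
    include_domains=["github.com", "producthunt.com", "huggingface.co", "replicate.com"],
    max_results=10,
)
for hit in results["results"]:
    print(hit["title"], hit["url"])
```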
- Validation Pipeline
  - Multi-stage verification using LLMs:
    - Initial filtering of search results (confidence threshold: 80%)
    - Page content analysis and verification (confidence threshold: 90%)
    - Category assignment based on existing tool context
  - URL validation to filter out listing/search pages
  - Async processing for improved performance
- Deduplication System
  - Two-pass deduplication:
    - Quick URL-based matching
    - LLM-based semantic comparison for similar tools
  - Confidence-based decision making for updates vs. new entries
  - Smart merging of tool information when duplicates found
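A sketch of the first, URL-based pass; the helper names are hypothetical. Only candidates that survive this cheap filter would proceed to the LLM-based semantic comparison:

```python
from urllib.parse import urlparse

def normalize_url(url: str) -> str:
    """Reduce a URL to a comparison key: lowercase host, no scheme,
    no www prefix, no trailing slash."""
    parsed = urlparse(url)
    host = parsed.netloc.lower().removeprefix("www.")
    return f"{host}{parsed.path.rstrip('/')}"

def quick_duplicates(new_tool: dict, existing: list[dict]) -> list[dict]:
    """First pass: exact match on normalized URL."""
    key = normalize_url(new_tool["url"])
    return [t for t in existing if normalize_url(t["url"]) == key]
```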
- Data Models
  - ToolUpdate: Tracks tool verification decisions
  - SearchAnalysis: Manages search result analysis
  - DuplicateStatus: Handles deduplication decisions
  - Strong typing with Pydantic for data validation
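The three model names come from the repo; the exact fields below are illustrative assumptions showing how Pydantic typing and the confidence gates described above could fit together:

```python
from pydantic import BaseModel, Field

class ToolUpdate(BaseModel):
    """Verification decision for one candidate tool (fields assumed)."""
    name: str
    url: str
    is_ai_tool: bool
    confidence: float = Field(ge=0.0, le=1.0)

class SearchAnalysis(BaseModel):
    """LLM analysis of a batch of search results (fields assumed)."""
    updates: list[ToolUpdate]

class DuplicateStatus(BaseModel):
    """Deduplication decision for a candidate vs. an existing entry (fields assumed)."""
    is_duplicate: bool
    confidence: float = Field(ge=0.0, le=1.0)

def passes_initial_filter(update: ToolUpdate, threshold: float = 0.80) -> bool:
    """Initial pass keeps candidates at >= 80% confidence; page-level
    verification applies the stricter 90% threshold."""
    return update.is_ai_tool and update.confidence >= threshold
```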
- Categorization
  - Dynamic category management
  - LLM-powered category suggestions
  - Supported categories:
    - Language Models
    - Image Generation
    - Audio & Speech
    - Video Generation
    - Developer Tools
    - Other
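One way to keep LLM suggestions inside the supported set is to validate against a fixed Literal; a sketch under that assumption (the model and helper are hypothetical, not the repo's actual code):

```python
from typing import Literal, get_args
from pydantic import BaseModel

Category = Literal[
    "Language Models", "Image Generation", "Audio & Speech",
    "Video Generation", "Developer Tools", "Other",
]

class CategorySuggestion(BaseModel):
    """Structured-output target for an LLM category prompt (shape assumed)."""
    category: Category
    reasoning: str

def coerce_category(raw: str) -> str:
    """Fall back to 'Other' when the model proposes an off-list category."""
    return raw if raw in get_args(Category) else "Other"
```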
The updater service (Dockerfile.updater) implements:
- Scheduled tool discovery using supercronic
- Automatic deduplication of new entries
- Health monitoring of the update process
- Configurable update frequency via crontab
- Weekly supercronic job calls `run-enhancement.sh`, which executes `uv run python -m ai_tools_website.v1.content_enhancer_v2` inside the updater container.
- The V2 enhancer uses a multi-stage pipeline (Tavily search + LLM analysis) to enrich tool records with detailed information, installation commands, and feature lists.
- Quality Tiering: Tools are automatically assigned to tiers (Tier 1, Tier 2, Tier 3, or noindex) based on importance signals like GitHub stars and HuggingFace downloads. This ensures resources are focused on high-value tools.
- Regeneration limits (`CONTENT_ENHANCER_MAX_PER_RUN`) are configurable; `CONTENT_ENHANCER_MODEL` must be set in the environment.
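A sketch of how tier assignment from importance signals could look; the thresholds and scoring rule here are assumptions, not the enhancer's actual cutoffs:

```python
def assign_tier(github_stars: int = 0, hf_downloads: int = 0) -> str:
    """Map importance signals (GitHub stars, HuggingFace downloads) to a
    quality tier. Thresholds below are illustrative only."""
    score = max(github_stars, hf_downloads // 100)
    if score >= 10_000:
        return "tier1"
    if score >= 1_000:
        return "tier2"
    if score >= 100:
        return "tier3"
    return "noindex"  # too little signal to spend enhancement budget on
```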
The application implements a flexible storage layer:
- Minio Integration
  - S3-compatible object storage
  - Automatic bucket creation and management
  - LRU caching for improved read performance
  - Graceful handling of initialization (empty data)
  - Content-type aware storage (application/json)
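A minimal sketch of this behavior using the minio client API; the endpoint, credentials, bucket, and object names are hypothetical:

```python
import json
from functools import lru_cache
from io import BytesIO

from minio import Minio
from minio.error import S3Error

client = Minio("localhost:9000", access_key="...", secret_key="...", secure=False)
BUCKET, OBJECT = "ai-tools", "tools.json"  # hypothetical names

if not client.bucket_exists(BUCKET):
    client.make_bucket(BUCKET)  # automatic bucket creation

@lru_cache(maxsize=1)
def load_tools() -> dict:
    try:
        response = client.get_object(BUCKET, OBJECT)
        return json.loads(response.read())
    except S3Error:
        return {"tools": [], "last_updated": ""}  # graceful empty-data init

def save_tools(data: dict) -> None:
    payload = json.dumps(data).encode()
    client.put_object(BUCKET, OBJECT, BytesIO(payload), len(payload),
                      content_type="application/json")
    load_tools.cache_clear()  # invalidate the read cache after a write
```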
- Data Format
  - JSON-based storage for flexibility
  - Schema:
    {
      "tools": [
        {"name": "string", "description": "string", "url": "string", "category": "string"}
      ],
      "last_updated": "string"
    }
  - Atomic updates with cache invalidation
  - Error handling for storage operations
- Development Features
  - Local filesystem fallback
  - Development mode caching
  - Configurable secure/insecure connections
  - Comprehensive logging of storage operations
The frontend is built with FastHTML for efficient server-side rendering:
- Architecture
  - Server-side rendering with FastHTML components
  - Async request handling with uvicorn
  - In-memory caching with background refresh
  - Health check endpoint for monitoring
- UI Components
  - Responsive grid layout for tool cards
  - Real-time client-side search filtering
  - Category-based organization
  - Dynamic tool count display
  - GitHub integration corner
- Performance Features
  - Background cache refresh mechanism
  - Efficient DOM updates via client-side JS
  - Static asset serving (CSS, JS, images)
  - Optimized search with data attributes
- Development Mode
  - Hot reload support
  - Configurable port via environment
  - Static file watching
  - Detailed request logging
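A stripped-down sketch of the server shape, assuming FastHTML's fast_app helper; the route bodies, card markup, and in-memory store are illustrative, not the repo's actual components in web.py:

```python
from fasthtml.common import fast_app, serve, Div, H3, P, A, Titled

app, rt = fast_app()

TOOLS = {"tools": []}  # refreshed in the background in the real app

@rt("/health")
def health():
    return "OK"  # health check endpoint for monitoring

@rt("/")
def index():
    cards = [
        Div(H3(A(t["name"], href=t["url"])), P(t["description"]),
            cls="card", data_category=t["category"])  # data attributes drive client-side search
        for t in TOOLS["tools"]
    ]
    # Dynamic tool count in the title, cards in a responsive grid.
    return Titled(f"AI Tools ({len(TOOLS['tools'])})", Div(*cards, cls="grid"))

serve()  # runs uvicorn; honors the configured port
```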
# Install UV if you haven't already
pip install uv
# Install dependencies
uv sync
# Set up environment variables (copy from .env.example)
cp .env.example .env
# Run the web server
uv run python -m ai_tools_website.web
# Run background search/updater
uv run python -m ai_tools_website.search

Visit https://drose.io/aitools or http://localhost:8000 (for local development) in your browser.
The application uses environment variables for configuration. Copy .env.example to .env and configure the following:
- `WEB_PORT`: Web server port (default: 8000)
- `LOG_LEVEL`: Logging verbosity (default: INFO)
When running the search module:

- `--cache-searches`: Cache Tavily search results for faster iteration
- `--dry-run`: Run without saving any changes

API keys and models:

- `OPENAI_API_KEY`: OpenAI API key for enhanced search
- `TAVILY_API_KEY`: Tavily API key for additional search features
- `CONTENT_ENHANCER_MODEL`: Model for content enhancement (required, no default)
- `SEARCH_MODEL`: Model for search and deduplication (required, no default)
- `MAINTENANCE_MODEL`: Model for maintenance tasks (required, no default)
- `WEB_SEARCH_MODEL`: Model for web search API calls (required, no default)
- `LANGCHAIN_API_KEY`: Optional LangChain integration
- `LANGCHAIN_TRACING_V2`: Enable LangChain tracing (default: false)
- `LANGCHAIN_PROJECT`: LangChain project name
- `AITOOLS_STORAGE_BACKEND`: Storage backend (`minio` or `local`, default: `minio`)
- `AITOOLS_LOCAL_DATA_DIR`: Local storage directory for tools/comparisons (default: `dev_cache`)
- `TOOLS_FILE`: Path to tools data file when using local storage
- `AITOOLS_SLUG_REGISTRY_FILE`: Optional slug registry path for local storage
If using Minio for storage, configure:
- `MINIO_ENDPOINT`: Minio server endpoint
- `MINIO_ACCESS_KEY`: Minio access key
- `MINIO_SECRET_KEY`: Minio secret key
- `MINIO_BUCKET_NAME`: Bucket name for tool storage
See .env.example for a template with default values.
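A sketch of reading required-versus-defaulted variables consistent with the list above; the `require_env` helper is hypothetical, not the repo's config.py:

```python
import os

def require_env(name: str) -> str:
    """Fail fast on required settings that have no default."""
    value = os.environ.get(name)
    if not value:
        raise RuntimeError(f"{name} must be set in the environment")
    return value

WEB_PORT = int(os.environ.get("WEB_PORT", "8000"))   # defaulted
LOG_LEVEL = os.environ.get("LOG_LEVEL", "INFO")      # defaulted
SEARCH_MODEL = require_env("SEARCH_MODEL")           # required, no default
CONTENT_ENHANCER_MODEL = require_env("CONTENT_ENHANCER_MODEL")
```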
- Refactored the codebase to separate concerns:
  - data_manager.py now handles data processing and validation.
  - search.py was refactored for clarity and integration with AI services.
- Improved logging configuration in logging_config.py.
- Enhanced storage interface in storage.py to support multiple backends.
- Adopted UV for dependency management and task execution best practices.
The application is containerized using Docker with two services:
- Web Service
  - Serves the main web application
  - Built from `Dockerfile`
  - Exposes the configured web port
  - Includes health checks for reliability
- Updater Service
  - Runs scheduled tool updates using supercronic
  - Built from `Dockerfile.updater`
  - Automatically keeps tool data fresh
  - Includes health monitoring
  - Nightly sitemap exports run inside the updater container via `run-sitemaps.sh` (scheduled in `scripts/crontab` at 05:00 UTC).
  - You can generate the XML bundle manually with `uv run python -m ai_tools_website.v1.sitemap_builder --dry-run`, or omit `--dry-run` to publish directly to MinIO.
  - Sitemaps are stored under the `sitemaps/` prefix in object storage and served through `/sitemap.xml` plus `/sitemaps/<file>.xml` routes.
To deploy using Docker Compose:
# Build and start all services
docker compose up -d
# View logs
docker compose logs -f
# Stop services
docker compose down

Make sure to configure your .env file before deployment. See the Configuration section above for required variables.
This project is licensed under the Apache License 2.0.