A Python-based SEO auditing tool that analyzes websites for Generative Engine Optimization (GEO) and Answer Engine Optimization (AEO) — the 2025 standards for AI search visibility.
- Schema Markup Analysis: Detects and validates JSON-LD structured data
- llms.txt Detection: Checks for the new AI discovery standard at
/llms.txt - Content Structure Analysis: Evaluates AI-friendly formatting (lists, tables, FAQs, headings)
- E-E-A-T Assessment: Analyzes experience, expertise, authoritativeness, and trustworthiness signals
- Semantic Clarity Scoring: Measures topic focus and context richness
- AI Citation Potential: Estimates likelihood of being cited by AI search engines
- Deep Crawling: Supports multi-page website analysis with configurable depth
- CSV Export: Generates detailed reports for tracking progress over time
AI search engines (ChatGPT, Perplexity, Claude, Google AI Overviews, Gemini) are changing how content is discovered. This tool helps optimize for:
- Direct answer citations in AI responses
- Structured data that AI models can parse
- Clear, scannable content formats
- Authority signals that build trust
Reference: For more on SEO in the AI era, see article SEO for the AI Era: A 2025 Quick Guide
-
Clone the repository:
git clone https://github.com/ngstcf/ai-seo-auditor.git cd ai-seo-auditor -
Install dependencies:
pip install -r requirements.txt
-
Install crawl4ai browser setup:
crawl4ai-setup
Create a .env file in the project root:
# Required: OpenAI-compatible API key
LLM_API_KEY=your_api_key_here
# Optional: Custom API endpoint (defaults to OpenAI)
LLM_BASE_URL=https://api.openai.com/v1
# Optional: Model to use (defaults to gpt-4o-mini)
LLM_MODEL=gpt-4o-mini
# Optional: Crawl depth and page limits
MAX_DEPTH=1
MAX_PAGES=5A .env.example file is provided for reference.
Run the auditor:
python seo_audit.pyFollow the prompts to enter a URL. The tool will:
- Check for
/llms.txtat the domain root - Crawl the specified page(s)
- Analyze schema markup, content structure, and E-E-A-T signals
- Generate an AI-powered SEO report
- Save results to
ai_seo_report_2025.csv
The tool generates:
- Console Report: Real-time analysis with scores and recommendations
- CSV Report: Detailed metrics for historical tracking
| Category | What It Measures |
|---|---|
| Schema Markup | JSON-LD presence and quality |
| llms.txt | AI discovery file presence |
| Content Structure | Lists, tables, FAQs, headings |
| E-E-A-T Signals | Trust and authority indicators |
| Semantic Clarity | Topic focus and context |
- Python 3.10+
- OpenAI API key (or compatible endpoint)
- Schema Validation: Detects presence of JSON-LD but does not perform deep validation against Schema.org specifications
- llms.txt Content: Checks for file existence only; does not validate syntax or completeness of the file contents
- E-E-A-T Heuristics: Trust signals are inferred from content patterns; cannot verify real-world credentials or authority
- Single-Language: Optimized for English content analysis; other languages may have reduced accuracy
- LLM Dependency: Analysis quality depends on the model used; different LLMs may produce varying scores
- Schema Validation: Integrate Schema.org validator to check markup correctness
- llms.txt Parser: Full parsing and validation of llms.txt file structure and content
- Multi-language Support: Extend LLM prompts and content analysis for non-English sites
- Historical Tracking: Built-in comparison of previous audits to show progress over time
- Competitor Analysis: Compare multiple sites side-by-side
- Export Formats: Add JSON and HTML report options
- CI/CD Integration: GitHub Action for automated SEO checks on PRs
- Batch Mode: Analyze multiple URLs from a file without interactive prompts
- Core Web Vitals: Integrate page speed and UX metrics from Lighthouse/PageSpeed Insights
MIT License - see LICENSE for details.
Contributions are welcome! Feel free to open issues or submit pull requests.
Created by Ng Chong @ngstcf