Conversation
Pull request overview
Updates crawling and MkDocs configuration to allow LLM user-agents to access /ai/ markdown resources while providing a way to disable LLM-related MkDocs plugins via an environment toggle.
Changes:
- Removed the `/ai/` disallow rule from `robots.txt`.
- Added `ENABLED_LLMS_PLUGINS`-gated enablement for LLM-related MkDocs plugins.
- Documented the new local-development toggle in `README.md`.
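The env-gated enablement can be sketched as below. The plugin name is illustrative (the PR does not list the exact plugins), but the `!ENV [VARIABLE, default]` pattern is standard MkDocs configuration:

```yaml
plugins:
  # Hypothetical LLM-related plugin; the gating pattern is the point here.
  - llmstxt:
      # Enabled by default; set ENABLED_LLMS_PLUGINS=false to skip it locally.
      enabled: !ENV [ENABLED_LLMS_PLUGINS, true]
```

With this in place, a faster local build is just `ENABLED_LLMS_PLUGINS=false mkdocs serve`.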
Reviewed changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated 1 comment.
| File | Description |
|---|---|
| robots.txt | Allows crawlers (including LLM user-agents) to access /ai/ paths by removing the disallow rule. |
| mkdocs.yml | Adds env-controlled enabled flags for LLM-related MkDocs plugins. |
| README.md | Documents how to disable git revision and LLM plugins locally for faster builds. |
This updated version works as follows: it allows user-driven LLM bots to access both the /ai/ directory and our regular web pages, blocks search-engine indexers from grabbing the /ai/ files (preventing duplicate content and the wrong file format from appearing in search results), and blocks all LLM crawlers that scrape for training data from both /ai/ and the web pages.
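One way to express that combination in robots.txt (the user-agent names below are common examples, not necessarily the exact ones used in this PR):

```
# User-driven LLM agents may fetch /ai/ and the regular pages
User-agent: ChatGPT-User
User-agent: Claude-User
Allow: /

# Keep search indexers out of the raw /ai/ markdown files
User-agent: Googlebot
User-agent: Bingbot
Disallow: /ai/

# Block crawlers scraping for LLM training data from everything
User-agent: GPTBot
User-agent: ClaudeBot
Disallow: /
```

Per the Robots Exclusion Protocol, each crawler matches the most specific group naming its user-agent, so the user-driven agents fall through to the permissive group rather than the training-data block.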
Pull request overview
Copilot reviewed 3 out of 3 changed files in this pull request and generated 2 comments.
Using Disallow: /ai/ was keeping user-driven agent bots like Claude and ChatGPT from accessing the LLM markdown files intended for their use. For now, we can remove all Disallow statements, since Google doesn't index Markdown files at all, so there is no concern about search results serving the MD rather than the HTML version of pages to users.