
fix(seo): apply critical and high-priority SEO fixes#248

Open
pbrissaud wants to merge 2 commits into main from seo/critical-and-high-fixes

Conversation

@pbrissaud
Member

Summary

  • Add robots.txt (was missing) — no /_next/* block, AI crawler rules for GPTBot/ClaudeBot/PerplexityBot
  • Add public/llms.txt for AI search readiness (ChatGPT, Perplexity, Claude)
  • Fix relative URLs in generateLearningResourceSchema and generateCourseSchema — now absolute
  • Fix generateMetadata returning {} on 404 in challenges/[slug] and themes/[slug] (now returns noIndex: true)
  • Add noindex metadata to login page
  • Guard ReactQueryDevtools behind NODE_ENV=development (~150KB JS saved in prod)
  • Fix schema spec violations in lib/seo.ts: SearchAction.target, applicationCategory, SoftwareApplication.url, hasCourseInstance, remove invalid timeRequired from BlogPosting, add @id to Organization and WebSite
  • Move SEO audit docs to docs/seo/
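For illustration, a robots.txt matching the description above might look like the following sketch. The exact Allow/Disallow rules and the sitemap URL are assumptions, not the file as committed:

```text
# Default rules for all crawlers — note: no Disallow: /_next/*
User-agent: *
Allow: /

# AI crawler rules (GPTBot / ClaudeBot / PerplexityBot)
User-agent: GPTBot
Allow: /

User-agent: ClaudeBot
Allow: /

User-agent: PerplexityBot
Allow: /

Sitemap: https://kubeasy.dev/sitemap.xml
```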

Test plan

  • Verify https://kubeasy.dev/robots.txt is accessible and doesn't block /_next/
  • Verify https://kubeasy.dev/llms.txt is accessible
  • Check Google Rich Results Test on a challenge page — LearningResource URL should be absolute
  • Check Google Rich Results Test on a theme page — Course URL should be absolute
  • Confirm /login has noindex in page source
  • Confirm ReactQueryDevtools is absent from production bundle
  • Validate schemas via schema.org validator

🤖 Generated with Claude Code

- Add robots.txt (was missing) without /_next/* block; include AI crawler rules
- Add public/llms.txt for AI search readiness (ChatGPT, Perplexity, Claude)
- Fix relative URLs in generateLearningResourceSchema and generateCourseSchema
- Fix generateMetadata returning {} on 404 in challenges/[slug] and themes/[slug]
- Add noindex metadata to login page
- Guard ReactQueryDevtools behind NODE_ENV=development (~150KB JS saved in prod)
- Fix schema spec violations in lib/seo.ts:
  - SearchAction.target: remove EntryPoint wrapper, use plain string
  - applicationCategory: "DeveloperApplication" → "Developer Tools"
  - Add url to SoftwareApplication schema
  - Add hasCourseInstance to Course schema
  - Remove invalid timeRequired from BlogPosting schema
  - Add @id to Organization and WebSite schemas
- Move SEO audit docs to docs/seo/

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@vercel

vercel bot commented Mar 13, 2026

The latest updates on your projects.

  • Project: website | Deployment: Ready | Actions: Preview, Comment | Updated (UTC): Mar 13, 2026 6:54pm


@claude

claude bot commented Mar 13, 2026

Claude finished @pbrissaud's task in 2m 11s


PR Review: fix(seo): apply critical and high-priority SEO fixes

Tasks

  • Gather context and understand the PR
  • Read changed files
  • Analyze code quality, bugs, security, performance
  • Post review feedback

Overall Assessment

Good set of targeted SEO fixes. The absolute URL fix for structured data schemas and the noIndex on 404s are the most impactful changes. A few issues worth addressing before merging.


Issues Found

🔴 applicationCategory regression — lib/seo.ts:121

The PR changes applicationCategory from "DeveloperApplication" to "Developer Tools". This is actually a step backward: "DeveloperApplication" maps to the schema:DeveloperApplication enumeration recognized by structured data validators, while "Developer Tools" is an arbitrary string. Google's Rich Results documentation recommends using the schema.org enum URL (https://schema.org/DeveloperApplication) or its short form. The original value was closer to spec.
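As a sketch, the recommended shape would be the following (the name value is an assumption for illustration):

```typescript
// Sketch of a SoftwareApplication schema using the schema.org enum value.
const softwareApplicationSchema = {
  "@context": "https://schema.org",
  "@type": "SoftwareApplication",
  name: "Kubeasy",
  url: "https://kubeasy.dev",
  // Enum value recognized by validators — not the free-form "Developer Tools"
  applicationCategory: "DeveloperApplication",
};

console.log(softwareApplicationSchema.applicationCategory);
```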



🟡 robots.txt — Host: directive is Yandex-only — public/robots.txt:20

Host: https://kubeasy.dev

The Host: directive is only understood by Yandex (and even there it's legacy). Google, Bing, and all other major crawlers ignore it. It also incorrectly includes the https:// prefix — Yandex's spec expects just the hostname (kubeasy.dev). Safe to remove.



🟡 robots.txt — inconsistent User-agent casing — public/robots.txt

The default block uses User-agent: (lowercase a), while GPTBot, OAI-SearchBot, ClaudeBot and PerplexityBot blocks use User-Agent: (uppercase A). The spec is case-insensitive, but consistency makes the file easier to read and maintain.


🟡 robots.txt — /login not disallowed despite being noindex — public/robots.txt

The login page is now noindex, but it's not in robots.txt. The noindex meta tag is sufficient for Google, but best practice is to keep both in sync. Consider adding:

Disallow: /login

🟡 ReactQueryDevtools — static import may not be fully eliminated from the prod bundle — app/layout.tsx:16

import { ReactQueryDevtools } from "@tanstack/react-query-devtools";
// ...
{process.env.NODE_ENV === "development" && <ReactQueryDevtools initialIsOpen={false} />}

Next.js/webpack does perform dead code elimination on process.env.NODE_ENV constants, so this likely works correctly in practice. However, a static import at module level means the bundler must analyze the module before it can tree-shake it. For a guaranteed ~150KB saving, a lazy import is more reliable:

// Lazy-load only in development; next/dynamic splits the devtools into
// a separate chunk that is never requested in production
import dynamic from "next/dynamic";

const ReactQueryDevtools =
  process.env.NODE_ENV === "development"
    ? dynamic(() =>
        import("@tanstack/react-query-devtools").then(
          (m) => m.ReactQueryDevtools
        )
      )
    : () => null;

Or simply wrap in a dedicated DevTools component with a dynamic import. The current approach is acceptable if CI build validation confirms the production bundle doesn't include it.


🟡 hasCourseInstance.courseWorkload hardcoded — lib/seo.ts:201-206

hasCourseInstance: {
  "@type": "CourseInstance",
  courseMode: "online",
  courseWorkload: "PT30M",  // Hardcoded 30 min for ALL themes
  inLanguage: "en",
},

courseWorkload: "PT30M" is applied to every theme regardless of its actual challenge count. A theme with 15 challenges will still show 30 min. This will fail Google's Rich Results Test with incorrect data. Consider either passing courseWorkload as a parameter to generateCourseSchema, calculating it from the total estimated time, or omitting the field entirely until accurate data is available.
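One way to derive an accurate value instead of hardcoding it, sketched below under the assumption that each challenge carries an estimated duration in minutes (the helper name and the estimatedMinutes field are hypothetical, not part of the codebase):

```typescript
// Hypothetical helper: build an ISO 8601 duration from a theme's challenges
// so courseWorkload reflects the actual workload rather than a fixed PT30M.
function toIsoDuration(totalMinutes: number): string {
  const hours = Math.floor(totalMinutes / 60);
  const minutes = totalMinutes % 60;
  if (hours === 0) return `PT${minutes}M`;
  if (minutes === 0) return `PT${hours}H`;
  return `PT${hours}H${minutes}M`;
}

const challenges = [
  { estimatedMinutes: 30 },
  { estimatedMinutes: 45 },
  { estimatedMinutes: 15 },
];
const courseWorkload = toIsoDuration(
  challenges.reduce((sum, c) => sum + c.estimatedMinutes, 0)
);
console.log(courseWorkload); // "PT1H30M"
```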


🟢 llms.txt — hardcoded URLs may go stale — public/llms.txt

The blog URLs and challenge URLs are hardcoded static content. This is fine for now, but if posts are removed or challenge slugs change, the file will contain broken links. No action required immediately, but worth tracking as the content library grows.


Positive Changes ✅

  • Absolute URLs in structured data — Critical fix. Relative URLs in @type: LearningResource and @type: Course would fail validation.
  • noIndex: true on 404 generateMetadata — Previously returning {} meant Next.js would use inherited/default metadata; now correctly suppresses indexing of 404 pages.
  • noIndex on login page — Correct; login has no SEO value and should not appear in search results.
  • @id on Organization and WebSite — Enables proper entity linking in Google's Knowledge Graph.
  • SearchAction.target as plain string — Correct per Google's current Sitelinks Searchbox spec (EntryPoint object was deprecated).
  • aggregateRating: undefined removal — Eliminates an invalid null-ish property from the schema output.
  • hasCourseInstance addition — Makes Course schema more complete (minus the hardcoded workload noted above).
  • robots.txt AI crawler rules — Forward-looking addition for GPTBot/ClaudeBot/PerplexityBot.
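The 404 metadata fix noted above can be sketched as a pure function; the names here are hypothetical stand-ins for the generateMetadata logic in challenges/[slug], not the actual implementation:

```typescript
// Hypothetical stand-in for the generateMetadata 404 branch:
// a missing entity now yields noindex robots metadata instead of {}.
type PageMeta = {
  title?: string;
  robots?: { index: boolean; follow: boolean };
};

function metadataFor(challenge: { title: string } | null): PageMeta {
  if (challenge === null) {
    // Previously {} was returned here, so 404 pages inherited
    // the default (indexable) metadata.
    return { robots: { index: false, follow: false } };
  }
  return { title: challenge.title };
}

console.log(metadataFor(null)); // { robots: { index: false, follow: false } }
```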

- Revert applicationCategory to "DeveloperApplication" (valid schema.org enum)
- Use dynamic import for ReactQueryDevtools — guaranteed exclusion from prod bundle
- Remove courseWorkload from hasCourseInstance (was hardcoded 30min for all themes)
- Fix robots.txt: remove Yandex-only Host: directive, normalize User-agent casing,
  add Disallow: /login to match noindex meta tag

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@claude

claude bot commented Mar 13, 2026

Claude finished @pbrissaud's task in 1m 50s


PR Review: fix(seo): apply critical and high-priority SEO fixes

Tasks

  • Gather context and understand the PR
  • Read changed files and diff
  • Analyze code quality, bugs, security, performance
  • Post review feedback

Follow-up Review (after review feedback commit)

All 6 issues raised in the initial review have been addressed. Here's a verification and the one remaining item I spotted.


Previous Issues — All Resolved ✅

  • applicationCategory: "Developer Tools" (schema regression) → ✅ Reverted to "DeveloperApplication" (lib/seo.ts:120)
  • Host: https://kubeasy.dev (Yandex-only directive) → ✅ Removed from public/robots.txt
  • Inconsistent User-agent casing → ✅ All entries now use User-agent: (lowercase a)
  • /login not in robots.txt Disallow → ✅ Added at line 7
  • ReactQueryDevtools static import (prod bundle risk) → ✅ Replaced with next/dynamic (guaranteed code splitting)
  • Hardcoded courseWorkload: "PT30M" on all themes → ✅ Field removed from hasCourseInstance

New Observation

🟡 generateBlogPostSchema is dead code — lib/seo.ts:266-321

The generateBlogPostSchema function is defined in lib/seo.ts but is no longer imported or called anywhere in the codebase (the blog page only imports generateMetadata and stringifyJsonLd). This leaves ~55 lines of unused code. It should either be removed or re-wired to the blog post page to actually emit the BlogPosting JSON-LD schema (which would be a meaningful SEO improvement for blog posts).
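If re-wiring is the chosen route, the shape would be roughly as follows. The function bodies are simplified stand-ins for the lib/seo.ts exports, and the "<" escaping is a common precaution for inlined JSON-LD rather than confirmed behavior of stringifyJsonLd:

```typescript
// Simplified stand-ins for the lib/seo.ts exports, to show the wiring shape.
function generateBlogPostSchema(post: { title: string; url: string }) {
  return {
    "@context": "https://schema.org",
    "@type": "BlogPosting",
    headline: post.title,
    url: post.url,
  };
}

function stringifyJsonLd(schema: object): string {
  // Escape "<" so inlined JSON-LD cannot close the <script> tag early.
  return JSON.stringify(schema).replace(/</g, "\\u003c");
}

// The blog post page would feed this into a <script type="application/ld+json"> tag.
const jsonLd = stringifyJsonLd(
  generateBlogPostSchema({ title: "Hello", url: "https://kubeasy.dev/blog/hello" })
);
console.log(jsonLd.includes('"@type":"BlogPosting"')); // true
```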



Overall Assessment

The PR is in good shape. The absolute URL fix for structured data, noIndex on 404s, dynamic ReactQueryDevtools, and the corrected robots.txt are all solid improvements. The dead code in generateBlogPostSchema is the only remaining item — whether to remove or re-wire it is a judgement call. Ready to merge once that's decided.

