Switch from Algolia DocSearch to Lunr offline search#168

Merged
alamb merged 4 commits into apache:production from vinooganesh:vinooganesh/switch-to-lunr-offline
Feb 24, 2026

Conversation

@vinooganesh
Collaborator

cc @alamb
Algolia DocSearch loads from cdn.jsdelivr.net which is blocked by Apache's CSP, so search is completely broken on production.

This switches to Docsy's built-in Lunr offline search — fully client-side, no CDN, no API keys. Hugo generates a search index JSON at build time and Lunr searches it in the browser. Follows the Docsy search docs.

Changes

  • hugo.toml — enable offlineSearch, remove Algolia config, bump max results to 25 and summary length to 200
  • layouts/partials/scripts.html — override removing Algolia JS and markmap CDN blocks
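
One way to confirm the CDN blocks are really gone is to search the built site for the blocked host. This is only a sketch: the function name find_cdn_refs is made up for illustration, and it assumes the site has already been built into Hugo's default public/ directory.

```shell
# Hypothetical helper (not part of this PR): print built files that still
# reference the blocked CDN host.
# Assumes the site was built into ./public, Hugo's default output directory.
find_cdn_refs() {
  # -r: recurse, -l: print only the names of matching files.
  # "|| true" keeps the exit status at 0 when nothing matches.
  grep -rl 'cdn\.jsdelivr\.net' "${1:-public}" || true
}
```

Running find_cdn_refs from the site root after a build should print nothing if no page pulls scripts from that host. Pages that enable other CDN-backed features would still show up here.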

Note: after cloning, run hugo once before hugo server to generate the search index.
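
For a fresh clone, the two steps in the note could be wrapped in a small helper. This is a sketch, not something this PR adds: the name serve_with_search is made up, and it assumes the hugo binary is on PATH and that it is run from the site root.

```shell
# Hypothetical helper: build once so the search index is generated,
# then start the development server.
# Assumes `hugo` is on PATH and the current directory is the site root.
serve_with_search() {
  hugo || return 1     # one-off build; stop here if the build fails
  hugo server "$@"     # any extra flags are passed through to `hugo server`
}
```

Calling serve_with_search with no arguments is equivalent to running hugo followed by hugo server.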

Part of #163

Collaborator

@alamb alamb left a comment

Thanks (again) @vinooganesh

I tested this locally and it seems like a step forward to me

And it worked great

It is strange that some of the search results don't have anything to click on 🤔

Pages with only linkTitle (no title) had empty .Title in Hugo,
causing Lunr search results to render with no visible link text.
@vinooganesh vinooganesh force-pushed the vinooganesh/switch-to-lunr-offline branch from f92dccf to eb7dec6 Compare February 22, 2026 21:37
@vinooganesh
Collaborator Author

@alamb fixed! The fix was adding a title: to each page that only had a linkTitle.

Collaborator

@alamb alamb left a comment

Thanks @vinooganesh

This is looking much better. I merged up from main

One thing I noticed is that some pages still seem to be missing their titles. That said, this change seems like an improvement to me, and we can continue to improve things in follow-on PRs.
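
A quick way to look for any remaining pages of this kind might be a small shell check like the one below. It is only a sketch under a few assumptions: pages live under content/, they use YAML front matter with top-level linkTitle: and title: keys, and the function name pages_missing_title is made up for illustration.

```shell
# Hypothetical helper: list Markdown files that declare linkTitle: but no title:.
# Assumes YAML front matter (key: value at the start of a line) and that
# pages live under ./content unless another directory is given.
# Note: this greps the whole file, so a body line starting with "title:"
# would hide a page that is actually missing its front matter title.
pages_missing_title() {
  dir="${1:-content}"
  find "$dir" -type f -name '*.md' | sort | while IFS= read -r f; do
    if grep -q '^linkTitle:' "$f" && ! grep -q '^title:' "$f"; then
      printf '%s\n' "$f"
    fi
  done
}
```

Running pages_missing_title from the site root prints one path per affected page, or nothing if every page with a linkTitle also has a title.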

@@ -0,0 +1,66 @@
{{/*
Collaborator

Here is the diff compared to the template

Run this command to compute the diff

diff -du  ~/go/pkg/mod/github.com/google/docsy\@v0.12.0/layouts/_partials/scripts.html layouts/partials/scripts.html

Here is the actual diff:

--- /Users/andrewlamb/go/pkg/mod/github.com/google/docsy@v0.12.0/layouts/_partials/scripts.html	2026-02-11 07:27:43
+++ layouts/partials/scripts.html	2026-02-21 15:07:02
@@ -1,26 +1,18 @@
+{{/*
+  Project-level override of Docsy's layouts/_partials/scripts.html
+
+  Why this file exists:
+  Removes the Algolia DocSearch JS block (cdn.jsdelivr.net) which is blocked
+  by Apache's CSP. Also removes the markmap-autoloader CDN reference since
+  markmap is not enabled on this site.
+
+  See: https://github.com/apache/parquet-site/issues/163
+*/ -}}
 {{ $needKaTeX  := or .Params.math .Site.Params.katex.enable .Params.chem .Site.Params.chem (.Page.Store.Get "hasKaTeX") (.Page.Store.Get "hasmhchem") -}}
 {{ $needmhchem := or .Params.chem .Site.Params.katex.mhchem.enable (.Page.Store.Get "hasmhchem") -}}

-{{ if .Site.Params.markmap.enable -}}
-<style>
-.markmap > svg {
-  width: 100%;
-  height: 300px;
-}
-</style>
-<script>
-window.markmap = {
-  autoLoader: {
-      manual: true,
-      onReady() {
-        const { autoLoader, builtInPlugins } = window.markmap;
-        autoLoader.transformPlugins = builtInPlugins.filter(plugin => plugin.name !== 'prism');
-      },
-  },
-};
-</script>
-<script src="https://cdn.jsdelivr.net/npm/markmap-autoloader"></script>
-{{ end -}}
+{{/* markmap block removed — it loads from cdn.jsdelivr.net which is blocked
+     by Apache's CSP, and markmap is not enabled on this site anyway. */ -}}

 {{ if .Site.Params.plantuml.enable -}}
   <script src='{{ "js/deflate.js" | relURL }}'></script>
@@ -70,25 +62,5 @@
     crossorigin="anonymous"></script>
 {{ end -}}

-{{ if and .Site.Params.search (isset .Site.Params.search "algolia") -}}
-  {{ template "algolia/scripts" .Site.Params.search.algolia -}}
-{{ end -}}
 <script src='{{ "js/tabpane-persist.js" | relURL }}'></script>
 {{ partial "hooks/body-end.html" . -}}
-
-{{ define "algolia/scripts" -}}
-<script src="https://cdn.jsdelivr.net/npm/@docsearch/js@3.8.2"
-  integrity="sha512-lsD+XVzdBI6ZquXc8gqbw0/bgrfIsMJwY/8xvmvbN+U3gZSeG7BXQoCq4zv/yCmntR2GLHtgB+bD4ESPsKIbIA=="
-  crossorigin="anonymous" ></script>
-<script type="text/javascript">
-const containers = ['#docsearch-0', '#docsearch-1'];
-for (let c of containers) {
-  docsearch({
-    container: c,
-    appId: {{ .appId | default "R2IYF7ETH7" }},
-    apiKey: {{ .apiKey | default "599cec31baffa4868cae4e79f180729b" }},
-    indexName: {{ .indexName | default "docsearch" }},
-  });
-}
-</script>
-{{ end -}}

@vinooganesh
Collaborator Author

Hey @alamb, hmm, I'm not able to repro. Did you re-run hugo serve after the merge? That should recreate the index.

@alamb
Collaborator

alamb commented Feb 24, 2026

I thought I did but maybe not. I'll merge this and see how it goes

@alamb alamb merged commit 90ea0a7 into apache:production Feb 24, 2026
1 check passed
@alamb
Collaborator

alamb commented Feb 24, 2026

Thanks again @vinooganesh

@alamb
Collaborator

alamb commented Feb 24, 2026

Nice -- it is live and appears to be working well

https://parquet.apache.org/docs/
