Switch from Algolia DocSearch to Lunr offline search #168
alamb merged 4 commits into apache:production
alamb
left a comment
Thanks (again) @vinooganesh
I tested this locally and it seems like a step forward to me
And it worked great
It is strange that some of the pages don't have any way to click on them 🤔
Pages with only linkTitle (no title) had empty .Title in Hugo, causing Lunr search results to render with no visible link text.
f92dccf to eb7dec6
@alamb fixed! fix was adding a |
…tch-to-lunr-offline
alamb
left a comment
Thanks @vinooganesh
This is looking much better. I merged up from main
One thing I noticed is that some pages still seem to be missing their titles. That being said, this change seems like an improvement to me (and we can perhaps continue to improve things in follow-on PRs).
@@ -0,0 +1,66 @@
{{/*
Here is the diff compared to the template
Run this command to compute the diff
diff -du ~/go/pkg/mod/github.com/google/docsy\@v0.12.0/layouts/_partials/scripts.html layouts/partials/scripts.html
Here is the actual diff:
--- /Users/andrewlamb/go/pkg/mod/github.com/google/docsy@v0.12.0/layouts/_partials/scripts.html 2026-02-11 07:27:43
+++ layouts/partials/scripts.html 2026-02-21 15:07:02
@@ -1,26 +1,18 @@
+{{/*
+ Project-level override of Docsy's layouts/_partials/scripts.html
+
+ Why this file exists:
+ Removes the Algolia DocSearch JS block (cdn.jsdelivr.net) which is blocked
+ by Apache's CSP. Also removes the markmap-autoloader CDN reference since
+ markmap is not enabled on this site.
+
+ See: https://github.com/apache/parquet-site/issues/163
+*/ -}}
{{ $needKaTeX := or .Params.math .Site.Params.katex.enable .Params.chem .Site.Params.chem (.Page.Store.Get "hasKaTeX") (.Page.Store.Get "hasmhchem") -}}
{{ $needmhchem := or .Params.chem .Site.Params.katex.mhchem.enable (.Page.Store.Get "hasmhchem") -}}
-{{ if .Site.Params.markmap.enable -}}
-<style>
-.markmap > svg {
- width: 100%;
- height: 300px;
-}
-</style>
-<script>
-window.markmap = {
- autoLoader: {
- manual: true,
- onReady() {
- const { autoLoader, builtInPlugins } = window.markmap;
- autoLoader.transformPlugins = builtInPlugins.filter(plugin => plugin.name !== 'prism');
- },
- },
-};
-</script>
-<script src="https://cdn.jsdelivr.net/npm/markmap-autoloader"></script>
-{{ end -}}
+{{/* markmap block removed — it loads from cdn.jsdelivr.net which is blocked
+ by Apache's CSP, and markmap is not enabled on this site anyway. */ -}}
{{ if .Site.Params.plantuml.enable -}}
<script src='{{ "js/deflate.js" | relURL }}'></script>
@@ -70,25 +62,5 @@
crossorigin="anonymous"></script>
{{ end -}}
-{{ if and .Site.Params.search (isset .Site.Params.search "algolia") -}}
- {{ template "algolia/scripts" .Site.Params.search.algolia -}}
-{{ end -}}
<script src='{{ "js/tabpane-persist.js" | relURL }}'></script>
{{ partial "hooks/body-end.html" . -}}
-
-{{ define "algolia/scripts" -}}
-<script src="https://cdn.jsdelivr.net/npm/@docsearch/js@3.8.2"
- integrity="sha512-lsD+XVzdBI6ZquXc8gqbw0/bgrfIsMJwY/8xvmvbN+U3gZSeG7BXQoCq4zv/yCmntR2GLHtgB+bD4ESPsKIbIA=="
- crossorigin="anonymous" ></script>
-<script type="text/javascript">
-const containers = ['#docsearch-0', '#docsearch-1'];
-for (let c of containers) {
- docsearch({
- container: c,
- appId: {{ .appId | default "R2IYF7ETH7" }},
- apiKey: {{ .apiKey | default "599cec31baffa4868cae4e79f180729b" }},
- indexName: {{ .indexName | default "docsearch" }},
- });
-}
-</script>
-{{ end -}}
Hey @alamb hmm I'm not able to repro - did you re-run |
I thought I did but maybe not. I'll merge this and see how it goes.
Thanks again @vinooganesh
Nice -- it is live and appears to be working well: https://parquet.apache.org/docs/


cc @alamb
Algolia DocSearch loads from cdn.jsdelivr.net, which is blocked by Apache's CSP, so search is completely broken on production.
This switches to Docsy's built-in Lunr offline search — fully client-side, no CDN, no API keys. Hugo generates a search index JSON at build time and Lunr searches it in the browser. Follows the Docsy search docs.
Changes
- hugo.toml: enable offlineSearch, remove Algolia config, bump max results to 25 and summary length to 200
- layouts/partials/scripts.html: override removing Algolia JS and markmap CDN blocks

Note: after cloning, run hugo once before hugo server to generate the search index.

Part of #163
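
For readers who want to see what the hugo.toml change might look like, here is a minimal sketch. It assumes the parameter names from Docsy's search documentation (offlineSearch, offlineSearchMaxResults, offlineSearchSummaryLength) placed under [params]. The values 25 and 200 come from the description above. The surrounding layout of this site's hugo.toml is an assumption, since the actual config diff is not shown in this conversation.

```toml
# Sketch only: parameter names follow Docsy's search docs; their placement
# in this site's hugo.toml is assumed, not copied from the PR.
[params]
offlineSearch = true               # enable Docsy's built-in Lunr offline search
offlineSearchMaxResults = 25       # maximum number of results shown (value from the PR description)
offlineSearchSummaryLength = 200   # summary length per result (value from the PR description)

# Any existing Algolia configuration (for example, a [params.search.algolia]
# table, if the site had one) would be removed as part of the same change.
```

With settings like these, Hugo writes the search index during the build and Lunr queries it in the browser, so no CDN request or API key is involved.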