feat: Playwright PDF engine (allow the PDF engine to be configured in the mkdocs.yml) by hkato · Pull Request #58 · domWalters/mkdocs-to-pdf

hkato · 2025-03-22T11:20:10Z

ref. #51

Experimental feature

This change has no side effects on the WeasyPrint PDF engine.
The headless_chrome_path option defaults to None.
- None: Playwright uses its own bundled Chromium binary
- To use an external Chromium, set the absolute path to its binary

plugins:
  - to-pdf:
      pdf_engine: chromium                                 # default: weasyprint
      headless_chrome_path: /usr/bin/chromium-browser      # default: None
      render_js: true                                      # default: None
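
For reviewers, here is a minimal sketch of how the headless_chrome_path option could map onto Playwright for Python. It is not the code in this PR: build_launch_options and render_pdf are hypothetical names used only for illustration. The Playwright calls themselves (sync_playwright, chromium.launch with executable_path, page.pdf with prefer_css_page_size) are part of the public API. prefer_css_page_size is set because Chrome falls back to Letter when no page size is given.

```python
from pathlib import Path
from typing import Optional


def build_launch_options(headless_chrome_path: Optional[str] = None) -> dict:
    """Translate the plugin option into keyword arguments for chromium.launch().

    Hypothetical helper: the name and return shape are assumptions.
    """
    options = {"headless": True}
    if headless_chrome_path is None:
        # None: let Playwright use its own bundled Chromium.
        return options
    if not Path(headless_chrome_path).is_absolute():
        raise ValueError(
            f"headless_chrome_path must be an absolute path: {headless_chrome_path!r}"
        )
    # External Chromium/Chrome/Edge binary.
    options["executable_path"] = headless_chrome_path
    return options


def render_pdf(html_path: str, pdf_path: str,
               headless_chrome_path: Optional[str] = None) -> None:
    """Print a local HTML file to PDF with headless Chromium (illustrative only)."""
    # Imported lazily so the option handling above works without Playwright installed.
    from playwright.sync_api import sync_playwright

    with sync_playwright() as p:
        browser = p.chromium.launch(**build_launch_options(headless_chrome_path))
        try:
            page = browser.new_page()
            # Waiting for network idle gives page scripts a chance to finish.
            page.goto(Path(html_path).resolve().as_uri(), wait_until="networkidle")
            page.pdf(path=pdf_path, print_background=True, prefer_css_page_size=True)
        finally:
            browser.close()
```

With the bundled browser this would be called as render_pdf("site/print.html", "out.pdf"); the file names are placeholders.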

Install

python -m venv .venv
source .venv/bin/activate
pip install mkdocs-material git+https://github.com/domWalters/mkdocs-to-pdf.git@impl-playwright-pdf-engine

By default, Playwright requires its own Chromium build to be installed:

root@1719ac0fad24:/docs# playwright install chromium --only-shell
Downloading Chromium Headless Shell 134.0.6998.35 (playwright build v1161) from https://cdn.playwright.dev/dbazure/download/playwright/builds/chromium/1161/chromium-headless-shell-linux.zip
100.9 MiB [====================] 100% 0.0s
Chromium Headless Shell 134.0.6998.35 (playwright build v1161) downloaded to /root/.cache/ms-playwright/chromium_headless_shell-1161
Downloading FFMPEG playwright build v1011 from https://cdn.playwright.dev/dbazure/download/playwright/builds/ffmpeg/1011/ffmpeg-linux.zip
2.3 MiB [====================] 100% 0.0s
FFMPEG playwright build v1011 downloaded to /root/.cache/ms-playwright/ffmpeg-1011

Otherwise, please specify the absolute path to the Chromium binary.

plugins:
  - to-pdf:
      pdf_engine: chromium
      # Chromium on Ubuntu/Alma/Alpine
      headless_chrome_path: /usr/bin/chromium-browser
      # Chromium on Debian/Arch
      #headless_chrome_path: /usr/bin/chromium
      # Google Chrome on Linux(deb/rpm)
      #headless_chrome_path: /usr/bin/google-chrome
      # Google Chrome on macOS
      #headless_chrome_path: '/Applications/Google Chrome.app/Contents/MacOS/Google Chrome'
      # Google Chrome on Windows
      #headless_chrome_path: 'C:/Program Files/Google/Chrome/Application/chrome.exe'
      # Microsoft Edge on Windows
      #headless_chrome_path: 'C:/Program Files (x86)/Microsoft/Edge/Application/msedge.exe'
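
If it helps when picking a value for headless_chrome_path, the candidate locations above can be probed with a few lines of Python. This is only a sketch and is not part of the plugin: find_chromium and CANDIDATE_PATHS are hypothetical names, and the list simply repeats the paths from the example.

```python
import os
from typing import Iterable, Optional

# Paths copied from the configuration example; adjust for your system.
CANDIDATE_PATHS = [
    "/usr/bin/chromium-browser",   # Ubuntu/Alma/Alpine
    "/usr/bin/chromium",           # Debian/Arch
    "/usr/bin/google-chrome",      # Google Chrome on Linux (deb/rpm)
    "/Applications/Google Chrome.app/Contents/MacOS/Google Chrome",
    "C:/Program Files/Google/Chrome/Application/chrome.exe",
    "C:/Program Files (x86)/Microsoft/Edge/Application/msedge.exe",
]


def find_chromium(candidates: Iterable[str] = CANDIDATE_PATHS) -> Optional[str]:
    """Return the first candidate that is an executable file, else None."""
    for candidate in candidates:
        if os.path.isfile(candidate) and os.access(candidate, os.X_OK):
            return os.path.abspath(candidate)
    return None


if __name__ == "__main__":
    print(find_chromium())
```

A result of None means none of the listed binaries exist, in which case the bundled Playwright Chromium (headless_chrome_path left unset) is the fallback.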

Code changed

Added: Chromium-based headless browser PDF engine implementation.
- using Playwright for Python
Changed: render_js implementation switched from subprocess to Playwright
Changed: WeasyPrint dependencies removed from the main routine
- WeasyPrint's URL/IRI utilities replaced with the plugin's own urllib-based utilities.
  - weasyprint.urls.url_is_absolute -> preprocessor.link.util.is_absolute_url
  - weasyprint.urls.iri_to_uri -> preprocessor.link.util.iri_to_uri
  - Unit tests for the utilities
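
To illustrate what urllib-based replacements for these two helpers might look like, here is a rough sketch. It is not copied from the PR; only the function names come from the list above. The bodies, the safe-character set, and the rule that single-letter schemes (Windows drive letters) do not count as absolute are assumptions.

```python
from urllib.parse import quote, urlparse

# Characters that may appear unescaped in a full URI: reserved characters,
# '~', and '%' (kept so existing escapes are not double-encoded).
_URI_SAFE = "/:?#[]@!$&'()*+,;=~%"


def is_absolute_url(url: str) -> bool:
    """True if the URL carries a scheme such as https: or mailto:.

    Single-letter "schemes" are rejected so that a path like C:/docs
    is not mistaken for an absolute URL (an assumption of this sketch).
    """
    return len(urlparse(url).scheme) > 1


def iri_to_uri(iri: str) -> str:
    """Percent-encode characters that are not allowed in a URI (UTF-8)."""
    if not iri or iri.startswith("data:"):
        # data: URLs are passed through untouched.
        return iri
    return quote(iri, safe=_URI_SAFE)


if __name__ == "__main__":
    print(is_absolute_url("https://example.com/"))
    print(iri_to_uri("https://example.com/caf\u00e9"))
```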

Test

pytest

$ uv run pytest . -vv
============================================================= test session starts =============================================================
platform darwin -- Python 3.9.21, pytest-8.3.5, pluggy-1.5.0 -- /Users/hideyuki/Workspaces/github.com/domWalters/mkdocs-to-pdf/.venv/bin/python3
cachedir: .pytest_cache
rootdir: /Users/hideyuki/Workspaces/github.com/domWalters/mkdocs-to-pdf
configfile: pyproject.toml
collected 3 items                                                                                                                             

tests/mkdocs_to_pdf/preprocessor/links/test_util.py::test_is_absolute_url PASSED                                                        [ 33%]
tests/mkdocs_to_pdf/preprocessor/links/test_util.py::test_iri_to_uri PASSED                                                             [ 66%]
tests/test_links.py::TransformHrefTestCase::test_transform_href PASSED                                                                  [100%]

============================================================== 3 passed in 0.24s ==============================================================

PDF check (a little bit)

macOS 15.3.2 24D81
Windows 10 22H2
Docker container:
- Debian GNU/Linux 12 (bookworm)
- Ubuntu 24.04.1 LTS
- AlmaLinux release 9.5 (Teal Serval) (Playwright does not support this)
- Arch Linux 20250316.0.322463 (Playwright does not support this)
- Alpine Linux v3.21 (Playwright does not support this) with Node.js binary replaced

Note

Font issues probably remain
Documentation still needs to be added, as was done for the chrome option
To run pytest, configuration needs to be added to pyproject.toml
The mkdocs-material sample issues are fixed
- Bug: samples/mkdocs-material: PDF cuts off at Section 1.3.1 #34
- Bug: samples/mkdocs-material: Table of Contents indexes incorrectly when the PDF truncates. #35

Operating systems that install Chromium via snap, such as Ubuntu, cannot open files from /tmp
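
One way to work around the snap restriction is to write the intermediate HTML somewhere other than /tmp before handing it to the browser. The sketch below shows the idea using only the standard library; write_temp_html and the work_dir parameter are hypothetical names, and the directory this PR actually uses is not shown here.

```python
import tempfile
from pathlib import Path


def write_temp_html(html: str, work_dir: str) -> Path:
    """Write HTML to a uniquely named file inside work_dir instead of /tmp.

    work_dir is an assumption of this sketch, for example a folder under
    the user's home directory or the MkDocs site directory.
    """
    directory = Path(work_dir)
    directory.mkdir(parents=True, exist_ok=True)
    # delete=False keeps the file on disk after the handle is closed,
    # so the browser process can open it afterwards.
    with tempfile.NamedTemporaryFile(
        "w", suffix=".html", dir=str(directory), delete=False, encoding="utf-8"
    ) as handle:
        handle.write(html)
    return Path(handle.name)
```

The caller is responsible for deleting the file once rendering has finished, for example with Path.unlink().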


hkato · 2025-03-23T05:18:45Z

@domWalters

Please test whether the generated PDFs are fine or not.
Could you review and merge this into the 'develop' branch as an experimental feature?

domWalters · 2025-04-29T22:31:44Z

This is gonna be the next thing I look at.

Hopefully, I'll have Sunday afternoon to do this.

hkato · 2025-05-01T02:24:37Z

@domWalters

I've noticed a critical issue: page numbers are not appearing in the TOC, and neither are the author and copyright information. This seems to be a limitation of Chrome.

Looking at other implementations (Vivliostyle), it appears they perform the PDF conversion using Chrome/Playwright and implement these features as a post-processing step.

Given that this implementation is incomplete, should we consider withdrawing it for now? Or should we proceed with it as an experimental implementation, with the aim of implementing post-processing in the future?

I'd like to integrate the incomplete Chrome-based PDF conversion as a hidden option, and then implement post-processing (adding page numbers to the TOC) as the next step.

luminoso · 2025-12-05T16:30:17Z

Thank you. I've been testing this branch and, other than a few glitches, it is working very well, a lot better than the default engine. Any expectations of a merge?

jernejfrank · 2025-12-10T15:01:35Z

Thanks, +1 on getting this merged. Have also been using this branch instead of the pip package for a while and can confirm it behaves nicer than the default engine.

hkato added 6 commits March 22, 2025 16:19

change: replace URL/IRI utils from weasyprint.urls to built-in urllib

520345e

fix: parameter name

282dde0

add: unit test for URL/IRI utilities

e9def5e

feat: WeasyPrint PDF engine has been modularized

b9570e9

add: Chromium PDF engine as a prototype

4f904e1

feat: add Playwright Chromium PDF engine

8d9dadb

hkato requested a review from domWalters March 22, 2025 11:20

hkato self-assigned this Mar 22, 2025

hkato added 3 commits March 22, 2025 23:51

feat: render_js impl. has been changed from subprocess to Playwright

e9908f0

fix: render_js temporary file path changed

11abfc9

Operating systems that install Chromium via snap, such as Ubuntu, cannot open files from /tmp

change: headless_chrome_path option defaults to None

eafe834

- Playwright bundles its own Chromium binary: None
- When Playwright uses an external Chromium, it requires an absolute path

domWalters added this to the v0.11.0 milestone Mar 22, 2025

domWalters marked this pull request as ready for review March 22, 2025 18:49

refactor: Playwright/Chromium processing move to HeadlessChromeDriver

050bbe2

hkato mentioned this pull request Mar 23, 2025

Proposal: Enable PDF engine switching #48

Open

hkato added the enhancement New feature or request label Mar 26, 2025

domWalters linked an issue Apr 29, 2025 that may be closed by this pull request

Proposal: Enable PDF engine switching #48

Open

hkato and others added 10 commits May 1, 2025 16:51

fix: Use css for page size (Chrome defaults to Letter without option)

e928207

feat: add shell.nix for nixos developers

c48cf08

Merge branch 'develop' into impl-playwright-pdf-engine

a83213c

Merge branch 'nixos' into impl-playwright-pdf-engine

29fc32b

feat: swap to chromium renderer

c258283

Merge branch 'develop' into impl-playwright-pdf-engine

d5529e6

chore: update requirements*.txt

d5b4a4d

docs: add new settings to docs, remove from migrate list

34130a2

fix: address #81

e5db0f7

docs: add FAQ for chromium snap issue

6eb388c

domWalters linked an issue Jul 22, 2025 that may be closed by this pull request

Documentation for toc_level gives the wrong default #81

Open

domWalters added 2 commits July 22, 2025 23:32

docs: change samples to generate in chromium

f986e4b

test: add test framework

ceb982a

This was linked to issues Jul 22, 2025

Bug: samples/mkdocs-material: PDF cuts off at Section 1.3.1 #34

Open

Bug: samples/mkdocs-material: Table of Contents indexes incorrectly when the PDF truncates. #35

Open

domWalters added 2 commits July 22, 2025 22:49

docs: add mathjax example

2e535c2

fix: change README to stable docs

c9f1a4e

domWalters linked an issue Jul 22, 2025 that may be closed by this pull request

Change README links to use the stable docs #78

Open

domWalters added 5 commits July 22, 2025 23:03

fix: add chromium to readthedocs runner

b0ef6fd

fix: ensure that readthedocs can use the playwright downloaded chromium

4f94148

fix: manually install chromium with playwright

9ebf9ba

fix: readthedocs shell is sh

2d95490

fix: readthedocs venv appears to be globally available

76cc5d2

domWalters mentioned this pull request Jul 28, 2025

PDF rendered is incomplete #74

Closed

hkato wants to merge 29 commits into develop from impl-playwright-pdf-engine


4 participants
