A workflow for converting LaTeX documents to accessible HTML with PDF output.
This project converts .tex files into web-ready HTML pages using LaTeXML, while also generating PDF versions via pdflatex. The HTML output is customised with additional CSS, JavaScript, and accessibility enhancements.
latexport/
├── static/                  # Shared CSS/JS assets (source of truth)
│   ├── css/
│   │   └── custom.css
│   └── js/
│       ├── custom.js
│       └── mathjax-config.js
├── latexml/                 # Custom LaTeXML bindings (.ltxml); all loaded automatically
│   ├── amsmath-compat.ltxml
│   └── emph-in-math.ltxml
├── output/                  # Generated output (seeded from static/ on each run)
│   ├── css/                 # Copied from static/css/
│   ├── js/                  # Copied from static/js/
│   ├── index.html           # Generated by latexport-index
│   └── {document}/          # Per-document output
│       ├── index.html
│       └── {document}.pdf
├── templates/               # HTML templates
├── main.py                  # Main processing script
├── create_main_index.py     # Index page generator
├── embed_assets.py          # Self-contained HTML bundler
└── config.py                # Configuration settings
Install uv, which manages the Python version and dependencies:
curl -LsSf https://astral.sh/uv/install.sh | sh
LaTeXML converts .tex files to HTML5.
macOS:
brew install latexml
Ubuntu / Debian:
sudo apt install latexml
Other: see the LaTeXML installation docs.
A TeX distribution provides pdflatex, used to produce PDF output.
macOS:
brew install --cask mactex-no-gui
Ubuntu / Debian:
sudo apt install texlive-latex-base
Already have TeX Live? Install only pdflatex via tlmgr:
tlmgr install pdftex
bibtex is included with most TeX distributions. For biber (used with biblatex):
tlmgr install biber
Both are optional: latexport auto-detects whether they are needed based on the source file.
# Clone the repository
git clone <repository-url>
cd latexport
# Install Python dependencies and register CLI commands
uv sync
uv pip install -e .
Convert one or more .tex files to HTML and PDF:
# Process a single file
uv run latexport tex_src/example.tex
# Process multiple files
uv run latexport tex_src/file1.tex tex_src/file2.tex
# Write output to a custom directory instead of output/
uv run latexport -o ./public tex_src/example.tex
# Override the output subdirectory name (single file only)
uv run latexport --name lecture-notes tex_src/example.tex
# → output goes to output/lecture-notes/ instead of output/example/
# Dry run (preview without changes)
uv run latexport -n tex_src/example.tex
This will:
- Seed the output directory with shared assets from static/
- Auto-detect whether bibliography processing (bibtex/biber) is needed
- If \cite commands are present: run bibtex/biber before LaTeXML so citations resolve in HTML
- Generate HTML at {output}/{stem}/index.html (via LaTeXML, with all latexml/*.ltxml bindings)
- Generate PDF at {output}/{stem}/{stem}.pdf (via pdflatex, with bibtex/biber if needed)
- Clean up auxiliary files (.aux, .log, .out, .bbl, .blg, .bcf, .run.xml)
- Remove empty subdirectories left by pdflatex's \include handling
- Inject custom CSS and JavaScript references
- Replace QED symbols with accessible HTML
- Consolidate local CSS files to the shared css/ folder
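The bibliography auto-detection step can be pictured roughly as follows. This is a minimal sketch, not latexport's actual code: the function name detect_bib_tool and both regular expressions are hypothetical, and it assumes that loading biblatex means biber while any other document containing a \cite-style command uses bibtex. It does not strip LaTeX comments, so a commented-out \cite still counts.

```python
import re
from typing import Optional

# Hypothetical patterns; the real detection in main.py may differ.
CITE_RE = re.compile(r"\\[A-Za-z]*cite")  # \cite, \citep, \textcite, \parencite, ...
BIBLATEX_RE = re.compile(
    r"\\usepackage\s*(?:\[[^\]]*\])?\s*\{[^}]*\bbiblatex\b[^}]*\}"
)


def detect_bib_tool(tex_source: str) -> Optional[str]:
    """Return "biber", "bibtex", or None for a LaTeX source string."""
    if not CITE_RE.search(tex_source):
        return None  # no citation commands, so no bibliography pass is needed
    if BIBLATEX_RE.search(tex_source):
        return "biber"  # assumption: biblatex implies the biber backend
    return "bibtex"


print(detect_bib_tool(r"\usepackage{biblatex} See \textcite{knuth1984}."))  # prints: biber
```

A document that loads biblatex with backend=bibtex would be misclassified by this sketch; a fuller check would have to inspect the package options.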
Create an index page listing all documents:
# Use the default output directory (from config.py)
uv run latexport-index
# Use a custom output directory
uv run latexport-index -o examples/output
This scans the output directory for index.html files and generates a main index with links to each document (and PDF if available).
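A minimal sketch of such a scan, assuming each document sits in a direct subdirectory of the output root, as in the project layout. The function name find_documents is hypothetical; the real logic lives in create_main_index.py and may differ, for example by searching deeper.

```python
from pathlib import Path
from typing import List, Optional, Tuple


def find_documents(root: Path) -> List[Tuple[str, Optional[str]]]:
    """List (document name, PDF file name or None) for each converted document."""
    documents = []
    # Only look one level down, so the main index at root/index.html is skipped.
    for index_file in sorted(root.glob("*/index.html")):
        doc_dir = index_file.parent
        pdfs = sorted(doc_dir.glob("*.pdf"))
        documents.append((doc_dir.name, pdfs[0].name if pdfs else None))
    return documents


if __name__ == "__main__":
    for name, pdf in find_documents(Path("output")):
        print(name, pdf or "(no PDF)")
```

Sorting the matches keeps the order of entries stable between runs, which avoids spurious diffs when the generated index is committed or deployed.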
Remove latexml.log files left behind by LaTeXML:
uv run latexport-clean
This removes latexml.log from the current directory and recursively from the output directory. During a normal latexport run these are cleaned up automatically; latexport-clean handles any leftovers from previous runs.
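The cleanup described here amounts to one direct check plus a recursive search. The sketch below illustrates that idea only; remove_latexml_logs is a hypothetical name and is not taken from the project's source.

```python
from pathlib import Path
from typing import List


def remove_latexml_logs(cwd: Path, output_dir: Path) -> List[Path]:
    """Delete latexml.log in cwd and anywhere below output_dir; return what was removed."""
    candidates = [cwd / "latexml.log", *output_dir.rglob("latexml.log")]
    removed = []
    for log_file in candidates:
        if log_file.is_file():  # skips missing files and already-deleted duplicates
            log_file.unlink()
            removed.append(log_file)
    return removed


if __name__ == "__main__":
    for path in remove_latexml_logs(Path.cwd(), Path("output")):
        print("removed", path)
```

Returning the removed paths makes it easy to report what happened, and would also let a dry-run variant list the files without deleting them.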
Inline all CSS and JS into a single portable file:
# Bundle with all assets inlined (CSS + JS) — default behaviour
uv run embed_assets.py output/example/index.html
# Bundle but skip remote assets (they remain as external references)
uv run embed_assets.py --skip-remote output/example/index.html
# Bundle CSS only — leave <script src> tags untouched
uv run embed_assets.py --skip-js output/example/index.html
# Write the bundled file to a custom path
uv run embed_assets.py output/example/index.html dist/standalone.html
Edit config.py to customise paths and settings:
OUTPUT_DIR = Path("./output") # Root directory for generated output
STATIC_DIR = Path("./static") # Shared CSS/JS source; copied into output on each run
LATEXML_DIR = Path(__file__).parent / "latexml" # LaTeXML binding files (absolute path)
SRC_QED_SYMBOL = "∎" # QED symbol to replace in HTML
ENCODING = "utf-8" # File encoding
# Index generator settings
ROOT_DIR = OUTPUT_DIR
PATTERN = "index.html"
TEMPLATE_PATH = Path("templates/main_index_template.html")
Live demos are published at https://kalv25.github.io/latexport/.
Source: latex3/latex2e — © American Mathematical Society / LaTeX Project, LPPL 1.3c.
A self-contained file with no \include dependencies. The --name option overrides the output
folder name so that it is descriptive rather than the generic testmath; the PDF keeps the source file's stem.
uv run latexport \
-o examples/output \
--name latex2e-testmath \
examples/tex_src/testmath.tex
Output:
examples/output/latex2e-testmath/index.html
examples/output/latex2e-testmath/testmath.pdf
Live: https://kalv25.github.io/latexport/latex2e-testmath/
Source: hermish/proofs-notes — CS70 lecture notes by Hermish Mehta.
A document split across multiple files via \include. latexport creates the
required subdirectories for pdflatex, then removes them once they are empty
after aux file cleanup.
uv run latexport \
-o examples/output \
--name hermish-proofs-notes \
examples/tex_src/hermish-proofs-notes/main.tex
Output:
examples/output/hermish-proofs-notes/index.html
examples/output/hermish-proofs-notes/main.pdf
Live: https://kalv25.github.io/latexport/hermish-proofs-notes/
After converting one or more documents, build the navigable index page:
uv run latexport-index -o examples/output
This scans examples/output/ and writes examples/output/index.html with
links to each document (and its PDF where available).
- Write LaTeX: create or edit .tex files in tex_src/
- Convert to HTML/PDF: run uv run latexport tex_src/yourfile.tex
- Regenerate index: run uv run latexport-index
- Deploy: upload output/ to your web server
Edit static/css/custom.css. This file is automatically copied into the output directory and injected into every processed HTML file.
Edit files in static/js/. The following are automatically injected:
- custom.js: page-width slider, MathJax toggle, go-to-top button
- mathjax-config.js: MathJax configuration
All user-visible strings in the toolbar are read from window.latexportI18n.
To override them for another language, add a <script> block before custom.js loads:
<script>
window.latexportI18n = {
widthLabel: 'Breite',
widthAriaLabel: 'Seitenbreite in ch-Einheiten',
mathOn: 'Formel ✓',
mathOff: 'Formel ✗',
mathAriaOn: 'MathJax-Darstellung ein',
mathAriaOff: 'MathJax-Darstellung aus',
goToTopAria: 'Zum Seitenanfang',
};
</script>
Only the keys you want to change need to be provided; omitted keys fall back to the English defaults.
Custom LaTeXML behaviour is defined in .ltxml files inside latexml/. These are Perl modules loaded via --preload on every latexmlc invocation. All .ltxml files in latexml/ are loaded automatically (alphabetical order) — no changes to main.py needed when adding new ones.
Currently included:
- amsmath-compat.ltxml: no-op stubs for amsmath internal commands (e.g. \ctagsplit@true) that would otherwise cause "undefined macro" errors.
- emph-in-math.ltxml: redefines \emph{…} as \mathit{…} inside math environments and as \textit{…} elsewhere.
To add a new binding, simply create a .ltxml file in latexml/.
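The automatic loading can be thought of as turning every binding file into one --preload flag. The sketch below shows that idea under stated assumptions: the helper name preload_args is hypothetical, and passing each binding by its file path (rather than a bare module name plus a search path) is a guess at how main.py builds the latexmlc command.

```python
from pathlib import Path
from typing import List


def preload_args(latexml_dir: Path) -> List[str]:
    """Build one --preload flag per .ltxml file, in alphabetical order by file name."""
    bindings = sorted(latexml_dir.glob("*.ltxml"), key=lambda p: p.name)
    return [f"--preload={binding}" for binding in bindings]


if __name__ == "__main__":
    # Flags for the bindings shipped in latexml/, ready to append to a latexmlc command.
    print(preload_args(Path("latexml")))
```

Because the list comes from a directory glob, dropping a new .ltxml file into latexml/ is enough for it to be picked up on the next run.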
Edit templates/main_index_template.html. The template uses Python str.format-style placeholders:
| Placeholder | Default | Description |
|---|---|---|
| {lang} | en | <html lang> attribute |
| {title} | Documents | <title> and <meta name="description"> |
| {description} | Document index | Meta description content |
| {heading} | Documents | <h1> text |
| {contents_label} | Contents | <h3> section label |
| {links} | (generated) | Rendered <li> elements, filled automatically |
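To show how str.format-style placeholders behave, here is a small stand-alone sketch. The TEMPLATE string is an illustrative stand-in, not the project's real templates/main_index_template.html, and render_index is a hypothetical helper; only the placeholder names and defaults come from the table.

```python
# Stand-in template; literal braces (e.g. in inline CSS) must be doubled for str.format.
TEMPLATE = """<html lang="{lang}">
<head>
<title>{title}</title>
<meta name="description" content="{description}">
<style>body {{ max-width: 70ch; }}</style>
</head>
<body>
<h1>{heading}</h1>
<h3>{contents_label}</h3>
<ul>
{links}
</ul>
</body>
</html>"""

DEFAULTS = {
    "lang": "en",
    "title": "Documents",
    "description": "Document index",
    "heading": "Documents",
    "contents_label": "Contents",
}


def render_index(links, **overrides):
    """Fill the template, letting keyword arguments override the English defaults."""
    values = {**DEFAULTS, **overrides}
    return TEMPLATE.format(links="\n".join(links), **values)


page = render_index(
    ['<li><a href="example/">example</a></li>'],
    lang="de",
    heading="Dokumente",
)
print(page)
```

When editing the template, keep in mind that any literal { or } in CSS or JavaScript has to be written as {{ or }}; otherwise str.format treats it as a placeholder and raises an error.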
To generate the index in another language, pass keyword arguments to create_main_index_page:
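from pathlib import Path  # create_main_index_page is assumed to be imported from create_main_index.py

create_main_index_page(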
root_dir=Path("output"),
lang="de",
title="Dokumente",
description="Dokumentenindex",
heading="Dokumente",
contents_label="Inhalt",
)
SVG images (e.g., diagrams generated by TikZ) use a simple CSS filter to invert colours in dark mode. This works well for simple black-and-white diagrams but may produce unexpected results when multiple colours are used. Always test your documents in dark mode to verify SVG rendering.
LaTeXML does not support all LaTeX packages and document structures. Known cases where HTML conversion fails or produces degraded output:
Multi-part documents — Projects where the root .tex file relies on a custom build system, non-standard \include chaining, or shared preamble files split across multiple directories may not convert correctly. LaTeXML resolves includes relative to --sourcedirectory; files outside that tree are not found.
memoir class — Documents using the memoir document class are not reliably converted. LaTeXML has limited support for memoir's extended sectioning, captioning, and page-layout commands. For example, the UiO Introduction to LaTeX repository uses memoir and fails to produce usable HTML output.
In these cases pdflatex still produces a correct PDF; only the HTML output is affected. Consider restructuring such documents to use a standard class (article, report, book) for full LaTeXML compatibility.
Resources that informed this project:
- Using LaTeXML to convert your LaTeX files to accessible HTML — a practical guide to LaTeXML from the University of London's Inclusive Working Group.
- Make Your LaTeX Documents Accessible — Volker RH Sorge, Notices of the AMS, January 2023. Motivation for accessible LaTeX output.
- Tagged and Accessible PDF with LaTeX — revisited — PDF Association presentation on producing tagged, accessible PDFs from LaTeX.
Contributions are welcome — see CONTRIBUTING.md for setup instructions, code style, and how to submit a pull request.
MIT — see LICENSE.