heseber/perplexity-tools


Perplexity Tools

A collection of command-line tools for processing and converting Perplexity-generated Markdown content, with a special focus on correct citation and math formatting and on PDF generation.

Overview

This repository contains tools designed to handle the specific formatting challenges that arise when working with markdown content generated by Perplexity AI, particularly when converting to PDF with proper academic-style citations.

Why These Tools Are Needed

While Perplexity AI offers a built-in PDF export feature for conversation threads, the exported PDFs have a significant limitation: mathematical expressions are not rendered properly. Instead of displaying formatted equations, the raw LaTeX commands are shown verbatim in the PDF, making mathematical content unreadable.

To generate PDFs with correctly rendered mathematical expressions, you need to:

  1. Export your Perplexity conversation as Markdown (not PDF)
  2. Use the tools provided in this repository to preprocess and convert the markdown to a properly formatted PDF

This workflow ensures that mathematical expressions, citations, and other academic formatting elements are rendered correctly in the final PDF output.

Tools

1. perplexity-preprocess-md.py

A Python script that preprocesses markdown content to make it compatible with Pandoc's PDF conversion pipeline.

Features:

  • Converts footnotes to proper Pandoc citations ([^1] → [@ref1])
  • Consolidates duplicate references automatically
  • Fixes math expressions by converting dollar signs to proper LaTeX math delimiters (handles escaped \$, unescaped $, and mixed combinations)
  • Converts HTML center divs to LaTeX centering commands
  • Adds proper YAML front matter with bibliography entries
  • Supports multiple languages for citations

Usage:

# Basic usage (reads from stdin, outputs to stdout)
cat input.md | python3 perplexity-preprocess-md.py > output.md

# With language specification
cat input.md | python3 perplexity-preprocess-md.py -l de-DE > output.md

# Skip font fallback configuration
cat input.md | python3 perplexity-preprocess-md.py --no-fallback-fonts > output.md

# Available language options: en-US, de-DE, or shortcuts: en, de

2. perplexity-md-to-md

A bash function that performs basic markdown preprocessing, specifically fixing escaped math expressions.

Usage:

# Convert a markdown file
perplexity-md-to-md input.md
# Creates input-fixed.md in the same directory
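
To illustrate what "fixing escaped math expressions" can mean, here is a minimal Python sketch. It assumes the export wraps inline math in escaped dollar signs (`\$...\$`) and rewrites each such pair to plain `$...$`, which Pandoc reads as inline math. The name `unescape_math` and the regular expression are illustrative assumptions, not the code used by perplexity-md-to-md.

```python
import re

# Matches a pair of escaped dollar signs on one line, e.g. \$E = mc^2\$.
# Hypothetical pattern for illustration; the real tool may cover more cases.
ESCAPED_MATH = re.compile(r"\\\$(.+?)\\\$")


def unescape_math(line: str) -> str:
    r"""Rewrite \$...\$ pairs as $...$ and leave a lone \$ untouched."""
    # Pandoc expects no whitespace directly inside the dollar signs,
    # so the captured expression is stripped before it is re-wrapped.
    return ESCAPED_MATH.sub(lambda m: "$" + m.group(1).strip() + "$", line)


if __name__ == "__main__":
    print(unescape_math(r"Energy: \$E = mc^2\$"))
    print(unescape_math(r"The price is \$5"))
```

A single escaped dollar sign (a price, for example) is left alone because the pattern needs a pair. A line with two separate currency amounts would be misread as math, which is one reason a real implementation needs more care than this sketch.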

3. perplexity-md-to-pdf

A bash function that converts markdown files directly to PDF using Pandoc with optimized settings for academic documents.

Features:

  • Automatic preprocessing using perplexity-preprocess-md.py
  • Two-column landscape layout to save paper, unless the Markdown contains tables (documents with tables are rendered in a single column)
  • Option to force single-column layout even when no tables are present
  • Option to force landscape orientation even when tables are present
  • Proper citation processing with --citeproc
  • Uses LuaLaTeX engine for better Unicode support with fallback fonts
  • Professional document formatting using the Eisvogel template
  • Syntax highlighting for code blocks
  • Configurable font selection (default: FreeSans)
  • Configurable language support

Usage:

# Convert to PDF with default settings (English)
perplexity-md-to-pdf document.md

# Convert with German language support
perplexity-md-to-pdf -l de document.md

# Convert with custom font
perplexity-md-to-pdf -f "Times New Roman" document.md

# Skip font fallback configuration
perplexity-md-to-pdf --no-fallback-fonts document.md

# Force single column layout
perplexity-md-to-pdf --single-column document.md

# Force landscape orientation even with tables
perplexity-md-to-pdf --landscape document.md

# Combine multiple options
perplexity-md-to-pdf -l de -f "DejaVu Serif" --no-fallback-fonts --single-column document.md

# Show help
perplexity-md-to-pdf --help

Installation

Prerequisites

  • Python 3.6+
  • Pandoc
  • LuaLaTeX (usually comes with TeX Live or MiKTeX)
  • Eisvogel template (see Eisvogel Template Installation below)
  • Bash shell (for the shell functions)
  • Font requirements (see Font Configuration section below)
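
Before installing, it can help to check which command-line prerequisites are already on your PATH. The following is a small sketch using only the Python standard library. The executable names in `REQUIRED_TOOLS` are assumptions derived from the list above, and the script does not check the Eisvogel template or fonts.

```python
import shutil
from typing import Callable, Iterable, List, Optional

# Executable names assumed for the prerequisites listed above.
REQUIRED_TOOLS = ("python3", "pandoc", "lualatex", "bash")


def missing_tools(
    tools: Iterable[str] = REQUIRED_TOOLS,
    which: Callable[[str], Optional[str]] = shutil.which,
) -> List[str]:
    """Return the tools that cannot be found on PATH."""
    return [tool for tool in tools if which(tool) is None]


if __name__ == "__main__":
    missing = missing_tools()
    if missing:
        print("Missing prerequisites:", ", ".join(missing))
    else:
        print("All command-line prerequisites found.")
```

The `which` parameter exists only so the lookup can be replaced in tests. In normal use the default `shutil.which` is sufficient.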

Eisvogel Template Installation

The perplexity-md-to-pdf tool requires the Eisvogel Pandoc LaTeX template for PDF generation. This template provides professional document formatting with support for syntax highlighting, tables, and other advanced features.

Installation:

  1. Download the template:

    # Download the latest eisvogel.latex template
    curl -L https://raw.githubusercontent.com/Wandmalfarbe/pandoc-latex-template/master/eisvogel.latex -o eisvogel.latex
  2. Install to Pandoc templates directory:

    # Create pandoc templates directory if it doesn't exist
    mkdir -p ~/.pandoc/templates
    
    # Move the template to the pandoc templates directory
    mv eisvogel.latex ~/.pandoc/templates/

Alternative installation methods:

Note: The template must be accessible to Pandoc. Common locations where Pandoc looks for templates:

  • ~/.pandoc/templates/eisvogel.latex
  • ~/.local/share/pandoc/templates/eisvogel.latex
  • /usr/share/pandoc/data/templates/eisvogel.latex
  • Current directory (./eisvogel.latex)
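
If Pandoc reports that it cannot find the template, a quick check of the locations listed above can narrow things down. This is a sketch under the assumption that those four paths are the relevant ones on your system. `find_template` is a hypothetical helper, not part of this repository.

```python
from pathlib import Path
from typing import Iterable, Optional

# Candidate locations taken from the list above.
CANDIDATES = (
    Path.home() / ".pandoc" / "templates" / "eisvogel.latex",
    Path.home() / ".local" / "share" / "pandoc" / "templates" / "eisvogel.latex",
    Path("/usr/share/pandoc/data/templates/eisvogel.latex"),
    Path("eisvogel.latex"),
)


def find_template(candidates: Iterable[Path] = CANDIDATES) -> Optional[Path]:
    """Return the first candidate that exists as a file, or None."""
    for path in candidates:
        if path.is_file():
            return path
    return None


if __name__ == "__main__":
    found = find_template()
    print(found if found else "eisvogel.latex not found in the listed locations")
```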

Setup

  1. Clone this repository:

    git clone <repository-url>
    cd perplexity-tools
  2. Run the automated installation script:

    ./install.sh

    The installation script will:

      • Install the Python script (perplexity-preprocess-md.py) to your local binary directory
      • Install the shell functions (perplexity-md-to-md and perplexity-md-to-pdf) to your shell function directory
      • Detect your shell (bash/zsh) and provide instructions for loading the functions
  3. Follow the instructions provided by the installer to add the shell functions to your shell configuration file (~/.bashrc or ~/.zshrc).

  4. Restart your shell or source your configuration file:

# For bash
source ~/.bashrc

# For zsh
source ~/.zshrc

Font Configuration

Required Fonts

By default, the tools add font fallback configuration to ensure proper rendering of emojis and special characters. The following fonts need to be installed on your system for PDF conversion to work without errors:

Required fonts:

  • FreeSans - Default main font (required unless using -f option with a different font)
  • Noto Emoji - For emoji and Unicode symbol support
  • DejaVu Serif - Primary fallback font for serif text
  • FreeSerif - Secondary fallback font
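
One way to see which of these fonts are missing is to compare the list against the families known to fontconfig. The sketch below assumes the `fc-list` utility is installed (common on Linux, optional on macOS) and that its view of installed fonts matches what LuaLaTeX can find, which is usually but not always the case. The function names are illustrative.

```python
import subprocess
from typing import Iterable, List

REQUIRED_FONTS = ("FreeSans", "Noto Emoji", "DejaVu Serif", "FreeSerif")


def missing_fonts(
    installed_families: str, required: Iterable[str] = REQUIRED_FONTS
) -> List[str]:
    """Return required families that do not appear in fc-list style output."""
    installed = set()
    for line in installed_families.splitlines():
        # fc-list may print several comma-separated names for one font.
        for name in line.split(","):
            installed.add(name.strip().lower())
    return [font for font in required if font.lower() not in installed]


def list_installed_families() -> str:
    """Ask fontconfig for family names (assumes fc-list is available)."""
    result = subprocess.run(
        ["fc-list", ":", "family"], capture_output=True, text=True, check=True
    )
    return result.stdout
```

On a machine with fontconfig, `missing_fonts(list_installed_families())` returns the required families that still need to be installed. An empty list means all four were found.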

Installing Fonts

macOS

# Install via Homebrew
brew install font-noto-emoji font-dejavu font-freefont

# Or download manually from:
# - FreeSans: https://www.gnu.org/software/freefont/
# - Noto Emoji: https://fonts.google.com/noto/specimen/Noto+Emoji
# - DejaVu: https://dejavu-fonts.github.io/
# - FreeSerif: https://www.gnu.org/software/freefont/

Linux (Ubuntu/Debian)

# Install via package manager
sudo apt-get install fonts-noto-emoji fonts-dejavu fonts-freefont-ttf

Windows

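Download and install the fonts manually from the same sources listed for macOS above:

  • FreeSans and FreeSerif: https://www.gnu.org/software/freefont/
  • Noto Emoji: https://fonts.google.com/noto/specimen/Noto+Emoji
  • DejaVu: https://dejavu-fonts.github.io/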

Alternative: Disable Font Fallbacks

If you prefer not to install the additional fonts or want to use your own font configuration, you can disable the font fallback feature:

# Skip font fallback configuration
perplexity-md-to-pdf --no-fallback-fonts document.md

# Or when preprocessing manually
cat input.md | python3 perplexity-preprocess-md.py --no-fallback-fonts > output.md

Note: When using --no-fallback-fonts, you may encounter warnings about missing special symbols (emojis, Unicode characters) that cannot be rendered with the standard fonts. These symbols will not appear in the final PDF output.

Important: Even with --no-fallback-fonts, the default font FreeSans must still be installed on your system, unless you specify a different font using the -f|--font option.

Workflow Examples

Basic Markdown to PDF Conversion

# Simple conversion
perplexity-md-to-pdf my-document.md

Advanced Workflow with Custom Processing

# Preprocess only
cat input.md | python3 perplexity-preprocess-md.py -l de > processed.md

# Manual Pandoc conversion with custom options
pandoc processed.md -o output.pdf --pdf-engine=lualatex --citeproc
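
The two manual steps above can also be scripted. The following Python sketch runs the same commands with the same options. It assumes perplexity-preprocess-md.py is in the current directory, and the `-processed.md` suffix for the intermediate file is an arbitrary choice made for this example.

```python
import subprocess
from pathlib import Path
from typing import List


def preprocess_command(lang: str = "en-US") -> List[str]:
    """Command line for the preprocessing step (reads stdin, writes stdout)."""
    return ["python3", "perplexity-preprocess-md.py", "-l", lang]


def pandoc_command(processed: Path, output: Path) -> List[str]:
    """Pandoc call with the same options as the manual example above."""
    return ["pandoc", str(processed), "-o", str(output),
            "--pdf-engine=lualatex", "--citeproc"]


def convert(source: Path, lang: str = "en-US") -> Path:
    """Preprocess the source file, then let Pandoc build a PDF next to it."""
    processed = source.with_name(source.stem + "-processed.md")
    output = source.with_suffix(".pdf")
    with source.open("rb") as src, processed.open("wb") as dst:
        subprocess.run(preprocess_command(lang), stdin=src, stdout=dst, check=True)
    subprocess.run(pandoc_command(processed, output), check=True)
    return output
```

Calling `convert(Path("input.md"), lang="de")` would write `input-processed.md` and `input.pdf`. Because `check=True` is set, a failure in either step raises `subprocess.CalledProcessError` instead of continuing silently.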

Batch Processing

# Convert multiple files
for file in *.md; do
    perplexity-md-to-pdf "$file"
done

Citation Format

The tools automatically convert Perplexity's footnote format to Pandoc's citation system:

Input (Perplexity format):

This is a claim[^1].

[^1]: https://example.com/source

Output (Pandoc format):

This is a claim[@ref1].

---
references:
  - id: ref1
    type: webpage
    URL: https://example.com/source
csl: https://raw.githubusercontent.com/citation-style-language/styles/master/nature.csl
lang: en-US
---
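
The conversion shown above can be approximated in a few lines of Python. This sketch is not the code from perplexity-preprocess-md.py. It assumes each footnote definition consists of a single URL and that "duplicate" means two footnotes pointing to the same URL. The function name and the regular expressions are illustrative.

```python
import re
from typing import Dict, List, Tuple

# "[^1]: https://example.com/source" (definition) and "[^1]" (use in text).
FOOTNOTE_DEF = re.compile(r"^\[\^(\w+)\]:\s*(\S+)\s*$")
FOOTNOTE_REF = re.compile(r"\[\^(\w+)\]")


def convert_footnotes(markdown: str) -> Tuple[str, List[Dict[str, str]]]:
    """Return (body with [@refN] citations, list of reference entries)."""
    url_to_id: Dict[str, str] = {}    # one reference id per distinct URL
    label_to_id: Dict[str, str] = {}  # footnote label -> reference id
    body_lines: List[str] = []

    # First pass: collect definitions and drop them from the body.
    for line in markdown.splitlines():
        match = FOOTNOTE_DEF.match(line)
        if match:
            label, url = match.groups()
            ref_id = url_to_id.setdefault(url, f"ref{len(url_to_id) + 1}")
            label_to_id[label] = ref_id
        else:
            body_lines.append(line)

    def replace(match):
        ref_id = label_to_id.get(match.group(1))
        # Footnote uses without a matching definition are left unchanged.
        return f"[@{ref_id}]" if ref_id else match.group(0)

    # Second pass: rewrite the footnote uses as Pandoc citations.
    body = FOOTNOTE_REF.sub(replace, "\n".join(body_lines))
    references = [
        {"id": ref_id, "type": "webpage", "URL": url}
        for url, ref_id in url_to_id.items()
    ]
    return body, references
```

Each dictionary in the returned list corresponds to one entry under `references:` in the YAML front matter. Footnotes that share a URL map to the same `refN` identifier, so the bibliography lists each source once.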

Configuration

Language Support

The tools support multiple languages for citation formatting:

  • en-US (default) - English (United States)
  • de-DE - German (Germany)
  • en - Short form for English
  • de - Short form for German
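
The relationship between the short forms and the full language codes can be expressed as a simple lookup. The sketch below is an illustration only. How the actual tools react to an unsupported value is not documented here, so raising `ValueError` is an assumption.

```python
LANGUAGE_SHORTCUTS = {"en": "en-US", "de": "de-DE"}
SUPPORTED_LANGUAGES = ("en-US", "de-DE")


def normalize_language(value: str = "en-US") -> str:
    """Expand a shortcut and reject anything outside the supported list."""
    code = LANGUAGE_SHORTCUTS.get(value, value)
    if code not in SUPPORTED_LANGUAGES:
        # Assumed behavior for this sketch; the real tools may differ.
        raise ValueError(f"unsupported language: {value!r}")
    return code
```

With this mapping, `-l de` and `-l de-DE` select the same language, and omitting the option yields `en-US`.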

Font Configuration

The perplexity-md-to-pdf function allows you to specify custom fonts:

# Use different fonts
perplexity-md-to-pdf -f "Times New Roman" document.md
perplexity-md-to-pdf -f "DejaVu Serif" document.md
perplexity-md-to-pdf -f "Liberation Sans" document.md

# Font names with spaces need quotes
perplexity-md-to-pdf -f "Computer Modern" document.md

Note: The specified font must be installed on your system. If the font is not available, LuaLaTeX will fall back to the default font (FreeSans) or show warnings.

Default Font Requirement: The default font FreeSans must be installed on your system unless you specify a different font with the -f|--font option. This requirement applies even when using --no-fallback-fonts.

PDF Output Settings

The perplexity-md-to-pdf function uses these default settings:

  • Layout: Two-column landscape (single-column portrait when tables are present or --single-column is used, unless --landscape is specified)
  • Paper: A4
  • Margins: 2.5cm
  • Column separation: 1cm
  • Font: FreeSans (configurable with -f|--font option)
  • Engine: LuaLaTeX
  • Font fallbacks: Noto Emoji, DejaVu Serif, FreeSerif (can be disabled with --no-fallback-fonts)
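
The layout rule in the first bullet can be written as a small decision function. This Python sketch rests on two assumptions that the text above does not spell out: a table is detected by a pipe-table delimiter row (such as `| --- | --- |`), and `--landscape` changes only the orientation while a document with tables stays single-column. The names `has_table` and `choose_layout` are hypothetical.

```python
import re
from typing import Tuple

# A pipe-table delimiter row such as "| --- | :---: |" (assumed detection rule).
TABLE_DELIMITER = re.compile(
    r"^\s*\|?\s*:?-{3,}:?\s*(\|\s*:?-{3,}:?\s*)+\|?\s*$"
)


def has_table(markdown: str) -> bool:
    """Return True if any line looks like a pipe-table delimiter row."""
    return any(TABLE_DELIMITER.match(line) for line in markdown.splitlines())


def choose_layout(
    markdown: str, single_column: bool = False, landscape: bool = False
) -> Tuple[int, str]:
    """Return (number of columns, page orientation) for a document."""
    if single_column or has_table(markdown):
        # Single column; portrait unless landscape is requested explicitly.
        return (1, "landscape" if landscape else "portrait")
    # Default: two columns on a landscape page.
    return (2, "landscape")
```

A plain horizontal rule or a YAML delimiter (`---`) does not count as a table because the pattern requires at least one column separator.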

Troubleshooting

Common Issues

  1. LuaLaTeX not found: Install TeX Live or MiKTeX
  2. Pandoc not found: Install Pandoc from pandoc.org
  3. Eisvogel template not found: Install the eisvogel template (see Eisvogel Template Installation section above)
  4. Font issues:
    • Ensure the required fallback fonts are installed (see Font Configuration section)
    • FreeSans must be installed unless using -f|--font with a different font
    • Or use --no-fallback-fonts to skip font fallback configuration
    • Check that your system has the fonts specified in the YAML front matter
    • If using custom fonts with -f|--font, ensure the specified font is installed
  5. Permission denied: Make sure the installed Python script is executable (for example, by running chmod +x on perplexity-preprocess-md.py)
  6. PDF conversion fails with font errors: Install the required fonts or use --no-fallback-fonts

Debug Mode

For troubleshooting, you can run the preprocessing step separately:

# Check the preprocessing output and save it to processed.md
cat input.md | python3 perplexity-preprocess-md.py -l en-US | tee processed.md

# Then manually run Pandoc
pandoc processed.md -o output.pdf --pdf-engine=lualatex --citeproc

Testing

The project includes a comprehensive test suite to ensure all tools work correctly. To run the tests:

# Run all tests with dependency checking
make test

# Or run tests directly
cd tests
python3 run_tests.py

# Run tests quickly (skip dependency checks)
make test-quick

# Check dependencies only
make test-deps

# Clean up test output files
make clean

The test suite includes:

  • Unit tests for each tool
  • Integration tests for the full pipeline
  • Tests with various document types (simple, with tables, complex, German)
  • Error handling tests
  • Dependency validation

See tests/README.md for detailed information about the test suite.

Contributing

Contributions are welcome! Please feel free to submit issues, feature requests, or pull requests.

Before submitting changes, please run the test suite to ensure everything works correctly:

make test

License

This project is licensed under the BSD 3-Clause License - see the LICENSE file for details.

Acknowledgments
