52 changes: 52 additions & 0 deletions .github/workflows/validate.yml
@@ -0,0 +1,52 @@
name: Validate

on:
  pull_request:
  push:
    branches:
      - main
      - "codex/**"

jobs:
  validate:
    runs-on: ubuntu-latest
    steps:
      - name: Check out repository
        uses: actions/checkout@v4

      - name: Set up Python
        uses: actions/setup-python@v5
        with:
          python-version: "3.12"

      - name: Install YAML parser
        run: python -m pip install --disable-pip-version-check pyyaml

      - name: Validate repo structure
        run: python tools/validate-repo.py

      - name: Validate Codex metadata
        run: |
          python - <<'PY'
          import pathlib
          import yaml

          path = pathlib.Path("nutrient-document-processing/agents/openai.yaml")
          data = yaml.safe_load(path.read_text())
          interface = data.get("interface", {})
          required = ["display_name", "short_description", "default_prompt"]
          missing = [key for key in required if key not in interface]
          if missing:
              raise SystemExit(f"openai.yaml missing interface keys: {missing}")
          print("openai.yaml parsed successfully")
          PY

      - name: Compile Python scripts
        run: |
          python -m py_compile nutrient-document-processing/scripts/*.py nutrient-document-processing/scripts/lib/common.py

      - name: Smoke test script help
        run: |
          for script in nutrient-document-processing/scripts/*.py; do
            python "$script" --help > /dev/null
          done
22 changes: 18 additions & 4 deletions README.md
@@ -2,13 +2,14 @@

<p align="center">
<a href="https://www.nutrient.io/api/"><img src="https://img.shields.io/badge/Nutrient-DWS%20API-blue" alt="Nutrient DWS API"></a>
<a href="LICENSE"><img src="https://img.shields.io/badge/license-Apache--2.0-green" alt="License"></a>
<a href="https://www.npmjs.com/package/@nutrient-sdk/dws-mcp-server"><img src="https://img.shields.io/npm/v/@nutrient-sdk/dws-mcp-server" alt="npm version"></a>
<a href="nutrient-document-processing/LICENSE.txt"><img src="https://img.shields.io/badge/license-Apache--2.0-green" alt="License"></a>
<a href="https://agentskills.io"><img src="https://img.shields.io/badge/Agent%20Skills-compatible-purple" alt="Agent Skills"></a>
</p>

<p align="center">
<strong>Give your AI agent PDF superpowers — in one command.</strong><br>
Convert, extract, OCR, redact, sign, and fill documents from any coding agent.
Generate, convert, extract, OCR, redact, sign, archive, and optimize documents from any coding agent.
</p>

<p align="center">
@@ -131,13 +132,17 @@ patient-records.pdf (contains PII)

| Capability | Description | Example prompt |
|------------|-------------|----------------|
| ✨ **Generate** | Create PDFs from HTML templates, uploaded assets, or remote URLs | *"Generate a PDF proposal from this HTML template"* |
| 📄 **Convert** | PDF ↔ DOCX/XLSX/PPTX, HTML → PDF, images → PDF | *"Convert report.docx to PDF"* |
| 🧩 **Assemble** | Merge, split, reorder, rotate, and flatten PDF packets before delivery | *"Merge these PDFs, rotate the landscape pages, and keep only pages 1-5"* |
| 📝 **Extract** | Text, tables, and key-value pairs from PDFs | *"Extract all tables from invoice.pdf as Excel"* |
| 🔍 **OCR** | Multi-language OCR for scanned documents | *"OCR this German scan and extract the text"* |
| 🔒 **Redact** | Pattern-based + AI-powered PII redaction | *"Redact all SSNs and emails from records.pdf"* |
| 💧 **Watermark** | Text or image watermarks with full styling | *"Add a DRAFT watermark to proposal.pdf"* |
| ✍️ **Sign** | CMS and CAdES digital signatures | *"Digitally sign contract.pdf"* |
| 📋 **Fill Forms** | Programmatic PDF form filling | *"Fill the tax form with these values…"* |
| 🗂️ **Compliance** | Convert PDFs for archival or accessibility targets like PDF/A and PDF/UA | *"Convert this PDF to PDF/A-2a"* |
| ⚡ **Optimize** | Optimize and linearize PDFs for web delivery and download performance | *"Linearize this PDF for fast web viewing"* |
| 📊 **Credits** | Monitor API usage and balance | *"How many API credits do I have left?"* |

---
@@ -188,28 +193,37 @@ cp -r nutrient-agent-skill/nutrient-document-processing ~/.claude/skills/
```
nutrient-document-processing/
├── SKILL.md # Main instructions (loaded by agents)
├── agents/
│ └── openai.yaml # Optional Codex App metadata
├── references/
│ ├── REFERENCE.md # Reference index
│ └── *.md # Focused cookbooks by workflow type
├── scripts/
│ ├── *.py # Single-operation scripts
│ └── lib/common.py # Shared utilities
├── assets/
│ ├── nutrient.svg # Skill icon
│ └── templates/
│ └── custom-workflow-template.py # Runtime pipeline template
├── tests/
│ └── testing-guide.md
└── LICENSE # Apache-2.0
└── LICENSE.txt # Apache-2.0
```

### Script Model

- `scripts/*.py` are single-operation scripts only.
- Multi-step workflows are generated at runtime in a temporary script from `assets/templates/custom-workflow-template.py`.
- Do not commit runtime pipeline scripts.
- Use `references/` for HTML/URL generation, compliance outputs, and other workflows that are easier to express as direct API payloads or temporary pipelines.

## Documentation

- **[SKILL.md](nutrient-document-processing/SKILL.md)** — Agent instructions with setup and operation examples
- **[Reference Index](nutrient-document-processing/references/REFERENCE.md)** — Modular cookbook for generation, conversion, extraction, security, compliance, and workflow sequencing
- **[Testing Guide](nutrient-document-processing/tests/testing-guide.md)** — Manual test procedures
- **[Custom Workflow Template](nutrient-document-processing/assets/templates/custom-workflow-template.py)** — Runtime pipeline starting point
- **[Codex App Metadata](nutrient-document-processing/agents/openai.yaml)** — Optional manifest for Codex App packaging
- **[API Playground](https://dashboard.nutrient.io/processor-api/playground/)** — Interactive API testing
- **[Official API Docs](https://www.nutrient.io/guides/dws-processor/)** — Nutrient documentation

@@ -219,4 +233,4 @@ Built by [Nutrient](https://www.nutrient.io/) (formerly PSPDFKit) — document S

## License

[Apache-2.0](nutrient-document-processing/LICENSE)
[Apache-2.0](nutrient-document-processing/LICENSE.txt)
142 changes: 88 additions & 54 deletions nutrient-document-processing/SKILL.md
@@ -1,83 +1,117 @@
---
name: nutrient-document-processing
description: >-
Process documents with the Nutrient DWS API. Use this skill when the user wants to convert documents
(PDF, DOCX, XLSX, PPTX, HTML, images), extract text or tables from PDFs, OCR scanned documents,
redact sensitive information (PII, SSN, emails, credit cards), add watermarks, digitally sign PDFs,
fill PDF forms, or check API credit usage. Activates on keywords: PDF, document, convert, extract,
OCR, redact, watermark, sign, merge, compress, form fill, document processing.
Process documents with Nutrient DWS. Use when the user wants to generate PDFs from HTML or URLs,
convert Office/images/PDFs, assemble or split packets, OCR scans, extract text/tables/key-value
pairs, redact PII, watermark, sign, fill forms, optimize PDFs, or produce compliance outputs like
PDF/A or PDF/UA. Triggers include convert to PDF, merge these PDFs, OCR this scan, extract tables,
redact PII, sign this PDF, make this PDF/A, or linearize for web delivery.
license: Apache-2.0
metadata:
author: nutrient-sdk
version: "1.0"
homepage: "https://www.nutrient.io/api/"
repository: "https://github.com/PSPDFKit-labs/nutrient-agent-skill"
compatibility: "Requires Node.js 18+ and internet. Works with Claude Code, Codex CLI, Gemini CLI, OpenCode, Cursor, Windsurf, GitHub Copilot, Amp, or any Agent Skills-compatible product."
compatibility: "Requires Python 3.10+, uv, and internet. Works with Claude Code, Codex CLI, Gemini CLI, OpenCode, Cursor, Windsurf, GitHub Copilot, Amp, or any Agent Skills-compatible product."
short-description: "Generate, convert, assemble, OCR, redact, sign, archive, and optimize documents"
---

# Nutrient Document Processing

Process, convert, extract, redact, sign, and manipulate documents using the [Nutrient DWS Processor API](https://www.nutrient.io/api/).
Use Nutrient DWS for managed document workflows where fidelity, compliance, or multi-step processing matters more than local-tool convenience.

## Setup

You need a Nutrient DWS API key. Get one free at <https://dashboard.nutrient.io/sign_up/?product=processor>.

Export the API key before running scripts:

```bash
export NUTRIENT_API_KEY="nutr_sk_..."
```

Scripts live in `scripts/` relative to this SKILL.md. Use the directory containing this SKILL.md as the working directory when running scripts:

```bash
cd <directory containing this SKILL.md> && uv run scripts/<script>.py --help
```

Page ranges use `start:end` (0-based, end-exclusive). Negative indices count from the end. Use comma-separated ranges like `0:2,3:5,-2:-1`.

## PDF Requirements

Some operations require specific document characteristics:

- **split.py**: Requires multi-page PDFs (2+ pages). Cannot extract a range from a single-page document.
- **delete-pages.py**: Must retain at least one page. Cannot delete all pages in a document.
- **sign.py**: Only accepts local file paths (not URLs).

## Single-Operation Scripts

- Convert format: `uv run scripts/convert.py --input doc.pdf --format docx --out doc.docx`
- Merge files: `uv run scripts/merge.py --inputs a.pdf,b.pdf --out merged.pdf`
- Split by ranges: `uv run scripts/split.py --input doc.pdf --ranges 0:2,2: --out-dir out --prefix part`
- OCR: `uv run scripts/ocr.py --input scan.pdf --languages english --out scan-ocr.pdf`
- Rotate pages: `uv run scripts/rotate.py --input doc.pdf --angle 90 --out rotated.pdf`
- Optimize: `uv run scripts/optimize.py --input doc.pdf --out optimized.pdf`
- Extract text: `uv run scripts/extract-text.py --input doc.pdf --out text.json`
- Extract tables: `uv run scripts/extract-table.py --input doc.pdf --out tables.json`
- Extract key-value pairs: `uv run scripts/extract-key-value-pairs.py --input doc.pdf --out kvp.json`
- Add text watermark: `uv run scripts/watermark-text.py --input doc.pdf --text CONFIDENTIAL --out watermarked.pdf`
- AI redact: `uv run scripts/redact-ai.py --input doc.pdf --criteria "Remove all SSNs" --mode apply --out redacted.pdf`
- Sign: `uv run scripts/sign.py --input doc.pdf --out signed.pdf`
- Password protect: `uv run scripts/password-protect.py --input doc.pdf --user-password upass --owner-password opass --out protected.pdf`
- Add pages: `uv run scripts/add-pages.py --input doc.pdf --count 2 --out with-pages.pdf`
- Delete pages: `uv run scripts/delete-pages.py --input doc.pdf --pages 0,2,-1 --out trimmed.pdf`
- Duplicate/reorder pages: `uv run scripts/duplicate-pages.py --input doc.pdf --pages 2,0,1,1 --out reordered.pdf`
- Get a Nutrient DWS API key at <https://dashboard.nutrient.io/sign_up/?product=processor>.
- Direct API calls use `Authorization: Bearer $NUTRIENT_API_KEY`.
```bash
export NUTRIENT_API_KEY="nutr_sk_..."
```
- MCP setups commonly use `@nutrient-sdk/dws-mcp-server` with `NUTRIENT_DWS_API_KEY`.
- Scripts live in `scripts/` relative to this SKILL.md. Use the directory containing this SKILL.md as the working directory:
```bash
cd <directory containing this SKILL.md> && uv run scripts/<script>.py --help
```
- Page ranges use `start:end` with 0-based indexes and end-exclusive semantics. Negative indexes count from the end.
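
As a rough illustration of the page-range convention above, the sketch below resolves `start:end` specs into concrete 0-based page indexes. `resolve_range` and `resolve_ranges` are hypothetical helpers written for this example, not part of the skill's scripts, and the sketch assumes the convention mirrors Python slicing (including open-ended forms such as `2:`).

```python
def resolve_range(spec: str, page_count: int) -> list:
    """Resolve one 'start:end' spec into 0-based page indexes (end-exclusive)."""
    if ":" not in spec:
        raise ValueError(f"expected 'start:end', got {spec!r}")
    start_text, _, end_text = spec.partition(":")
    start = int(start_text) if start_text else None
    end = int(end_text) if end_text else None
    # slice.indices() clamps the bounds and resolves negative indexes from the end.
    lo, hi, _ = slice(start, end).indices(page_count)
    return list(range(lo, hi))


def resolve_ranges(specs: str, page_count: int) -> list:
    """Resolve a comma-separated list of specs, one index list per range."""
    return [resolve_range(part.strip(), page_count) for part in specs.split(",")]


print(resolve_ranges("0:2,3:5,-2:-1", page_count=6))  # [[0, 1], [3, 4], [4]]
```

For a 6-page document, `-2:-1` selects only the second-to-last page because the end index is exclusive.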

## When to use
- Generate PDFs from HTML templates, uploaded assets, or remote URLs.
- Convert Office, HTML, image, and PDF files between supported formats.
- OCR scans and extract text, tables, or key-value pairs.
- Redact PII, watermark, sign, fill forms, merge, split, rotate, flatten, or encrypt PDFs.
- Produce delivery targets like PDF/A, PDF/UA, optimized PDFs, or linearized PDFs.
- Check credits before large, batch, or AI-heavy runs.

## Tool preference
1. Prefer `scripts/*.py` for covered single-operation workflows.
2. Use `assets/templates/custom-workflow-template.py` for multi-step jobs that should still run through the Python client.
3. Use the modular `references/` docs and direct API payloads for capabilities that do not yet have a dedicated helper script, especially HTML/URL generation and compliance tuning.
4. Use local PDF utilities only for lightweight inspection. Use Nutrient when output fidelity or compliance matters.

## Single-operation scripts
- `convert.py` -> convert between `pdf`, `pdfa`, `pdfua`, `docx`, `xlsx`, `pptx`, `png`, `jpeg`, `webp`, `html`, and `markdown`
- `merge.py` -> merge multiple files into one PDF
- `split.py` -> split one PDF into multiple PDFs by page ranges
- `add-pages.py` -> append blank pages
- `delete-pages.py` -> remove specific pages
- `duplicate-pages.py` -> reorder or duplicate pages into a new PDF
- `rotate.py` -> rotate selected pages
- `ocr.py` -> OCR scanned PDFs or images
- `extract-text.py` -> extract text to JSON
- `extract-table.py` -> extract tables
- `extract-key-value-pairs.py` -> extract key-value pairs
- `watermark-text.py` -> apply a text watermark
- `redact-ai.py` -> detect and apply AI-powered redactions
- `sign.py` -> digitally sign a local PDF
- `password-protect.py` -> write encrypted output PDFs
- `optimize.py` -> apply optimization and linearization-style options via JSON

## Multi-Step Workflow Rule

Do not add new committed pipeline scripts under `scripts/`.

When the user asks for multiple operations in one run:

1. Copy `assets/templates/custom-workflow-template.py` to a temporary location (for example `/tmp/ndp-workflow-<task>.py`).
1. Copy `assets/templates/custom-workflow-template.py` to a temporary location such as `/tmp/ndp-workflow-<task>.py`.
2. Implement the combined workflow in that temporary script.
3. Run it with `uv run /tmp/ndp-workflow-<task>.py ...`.
4. Return generated output files.
5. Delete the temporary script unless the user explicitly asks to keep it.
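
The copy, run, and clean-up steps above can be sketched in Python as follows. This is a minimal sketch rather than the skill's own tooling: `stage_workflow` and `run_workflow` are hypothetical names, and it assumes `uv` is on the `PATH` and that the temporary directory is an acceptable place for the script.

```python
import shutil
import subprocess
import tempfile
from pathlib import Path

# Template path from the rule above, relative to the SKILL.md directory.
TEMPLATE = Path("assets/templates/custom-workflow-template.py")


def stage_workflow(task: str, template: Path = TEMPLATE) -> Path:
    """Step 1: copy the template to <tmp>/ndp-workflow-<task>.py and return the path."""
    target = Path(tempfile.gettempdir()) / f"ndp-workflow-{task}.py"
    shutil.copyfile(template, target)
    return target


def run_workflow(script: Path, args: list, keep: bool = False) -> int:
    """Steps 3 and 5: run the script with `uv run`, then delete it unless keep=True."""
    try:
        return subprocess.run(["uv", "run", str(script), *args], check=False).returncode
    finally:
        if not keep:
            script.unlink(missing_ok=True)
```

Step 2 (implementing the combined workflow) happens between the two calls, by editing the staged file. Deleting in a `finally` block means the temporary script is removed even when the run fails.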

## Rules
## PDF Requirements
- `split.py` requires a multi-page PDF and cannot extract ranges from a single-page document.
- `delete-pages.py` must retain at least one page and cannot delete the entire document.
- `sign.py` only accepts local file paths for the main PDF.
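
These constraints can be checked before any API call is made. The sketch below is illustrative only: the function names are hypothetical, the page count is assumed to come from whatever local inspection tool is available, and treating "multi-page" as two or more pages is this example's reading of the requirement.

```python
def check_split(page_count: int) -> None:
    """split.py needs a multi-page PDF (taken here to mean 2+ pages)."""
    if page_count < 2:
        raise ValueError("split requires a PDF with at least 2 pages")


def check_delete_pages(page_count: int, pages: list) -> set:
    """Resolve 0-based (possibly negative) indexes; refuse to delete every page."""
    resolved = set()
    for page in pages:
        index = page + page_count if page < 0 else page
        if not 0 <= index < page_count:
            raise ValueError(f"page {page} is out of range for a {page_count}-page PDF")
        resolved.add(index)
    if len(resolved) >= page_count:
        raise ValueError("delete-pages must retain at least one page")
    return resolved


def check_sign_input(path: str) -> None:
    """sign.py takes a local path for the main PDF, so reject URL-like inputs."""
    if "://" in path:
        raise ValueError("sign requires a local file path, not a URL")


print(sorted(check_delete_pages(3, [0, -1])))  # [0, 2]
```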

## Decision rules
- Prefer a helper script when one already covers the requested operation cleanly.
- If you control the source markup, prefer HTML generation over browser print workflows.
- Use remote `file.url` inputs when the source already lives at a stable URL and you want to avoid local uploads.
- Use `output.type` for conversion and finalization targets. Use `actions` for transformations when building direct API payloads.
- OCR before text extraction, key-value extraction, or semantic redaction on scans.
- Prefer preset or regex redaction when the target is explicit. Use AI redaction only for contextual or natural-language requests.
- Use the PDF manipulation reference for merge, split, rotate, flatten, and page-range workflows instead of inferring those payloads from conversion examples.
- Treat PDF/A and PDF/UA as compliance targets, not cosmetic export formats. Choose the target up front and validate final artifacts when requirements are contractual.
- For PDF/UA, clean born-digital inputs and structured HTML usually tag better than rasterized or flattened source PDFs.
- For delivery optimization, linearize or optimize unsigned output artifacts instead of mutating already signed files.
- When the user asks for multiple steps, keep destructive or final steps late in the sequence. Use the workflow recipes when ordering is ambiguous.
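
To make the `file.url`, `output.type`, and `actions` rules above concrete, here is a hedged sketch of assembling a direct API payload. Only those three field names and the `Authorization: Bearer` header come from this document; the `parts` wrapper, the OCR action shape, and the `pdfa` output value used in the demo are assumptions that should be checked against `references/request-basics.md` before use.

```python
import json
import os
from typing import Optional


def build_instructions(source_url: str, output_type: str,
                       actions: Optional[list] = None) -> dict:
    """Assemble an instructions payload (field layout partly assumed, see lead-in)."""
    payload = {
        # Remote input: the service fetches the file, so nothing is uploaded locally.
        "parts": [{"file": {"url": source_url}}],  # "parts" wrapper is an assumption
        # Conversion or finalization target goes in output.type.
        "output": {"type": output_type},
    }
    if actions:
        # Transformations are listed separately from the output target.
        payload["actions"] = actions
    return payload


def auth_headers() -> dict:
    """Build the bearer header from NUTRIENT_API_KEY without printing the key."""
    return {"Authorization": f"Bearer {os.environ['NUTRIENT_API_KEY']}"}


demo = build_instructions(
    "https://example.com/scan.pdf",                    # placeholder URL
    output_type="pdfa",                                # assumed target name
    actions=[{"type": "ocr", "language": "english"}],  # assumed action shape
)
print(json.dumps(demo, indent=2))
```

Keeping the output target and the action list as separate arguments mirrors the rule that conversion targets and transformations are expressed in different parts of the payload.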

## Anti-patterns
- Do not OCR born-digital PDFs just because the task mentions extraction. Extract first and OCR only if the text layer is missing.
- Do not flatten forms or annotations until the user confirms the artifact no longer needs to stay editable.
- Do not sign, archive, or linearize intermediate working files. Keep those as final-delivery steps.
- Do not promise PDF/A or PDF/UA compliance without a validation step when the requirement is contractual.
- Do not commit temporary workflow scripts under `scripts/`.
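
One way to apply the sequencing guidance (OCR early, destructive and final-delivery steps late) is to sort requested steps by a stage rank. The step names and the exact ranks below are assumptions made for this sketch, not values defined by the skill; in particular, placing `archive` before `sign` is this example's own choice.

```python
# Hypothetical stage ranks: lower runs earlier. Ties keep the user's order.
STAGE = {
    "ocr": 0,
    "extract": 1,
    "redact": 1,
    "watermark": 2,
    "fill": 2,
    "flatten": 3,    # destructive: only after editing is finished
    "optimize": 4,   # optimize/linearize unsigned output artifacts
    "linearize": 4,
    "archive": 5,
    "sign": 6,       # signing last so no later step mutates a signed file
}


def order_steps(steps: list) -> list:
    """Return steps sorted by stage rank; sorted() is stable, so ties keep input order."""
    unknown = [step for step in steps if step not in STAGE]
    if unknown:
        raise ValueError(f"no stage rank defined for: {unknown}")
    return sorted(steps, key=STAGE.__getitem__)


print(order_steps(["sign", "extract", "ocr", "linearize"]))
# ['ocr', 'extract', 'linearize', 'sign']
```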

## Reference map
Read only what you need:

- `references/request-basics.md` -> endpoint model, auth, multipart vs JSON, credits, limits, and errors
- `references/generation-and-conversion.md` -> HTML/URL generation and format conversion
- `references/pdf-manipulation.md` -> merge, split, page-range, rotate, and flatten workflows
- `references/extraction-and-ocr.md` -> OCR, text extraction, tables, and key-value workflows
- `references/security-signing-and-forms.md` -> redaction, watermarking, signatures, forms, and passwords
- `references/compliance-and-optimization.md` -> PDF/A, PDF/UA, optimization, and linearization
- `references/workflow-recipes.md` -> end-to-end sequencing patterns for common business document workflows

## Rules
- Fail fast when required arguments are missing.
- Write outputs to explicit paths and print created files.
- Do not log secrets.
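
A minimal sketch of a script skeleton that follows these three rules is shown below. It is not one of the skill's actual scripts: the flag names mirror the `--input`/`--out` style used elsewhere in this document, and the file copy stands in for a real API call.

```python
import argparse
import os
import sys
from pathlib import Path
from typing import Optional


def main(argv: Optional[list] = None) -> int:
    parser = argparse.ArgumentParser(description="Hypothetical single-operation script")
    parser.add_argument("--input", required=True, help="source file path")
    parser.add_argument("--out", required=True, help="explicit output path")
    # Fail fast: argparse exits with status 2 when a required flag is missing.
    args = parser.parse_args(argv)

    if not os.environ.get("NUTRIENT_API_KEY"):
        # Report that the key is missing without ever echoing its value.
        print("NUTRIENT_API_KEY is not set", file=sys.stderr)
        return 1

    source = Path(args.input)
    if not source.is_file():
        print(f"input not found: {source}", file=sys.stderr)
        return 1

    out = Path(args.out)
    out.parent.mkdir(parents=True, exist_ok=True)
    out.write_bytes(source.read_bytes())  # placeholder for the real API call
    print(f"created: {out}")  # print every file the script creates
    return 0
```

A script built this way would typically end with `raise SystemExit(main())` so the return code reaches the shell.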
6 changes: 6 additions & 0 deletions nutrient-document-processing/agents/openai.yaml
@@ -0,0 +1,6 @@
interface:
  display_name: "Nutrient Document Processing"
  short_description: "Generate, convert, assemble, OCR, redact, sign, archive, and optimize documents"
  icon_small: "./assets/nutrient.svg"
  icon_large: "./assets/nutrient.svg"
  default_prompt: "Use $nutrient-document-processing to generate, convert, assemble, OCR, extract, redact, sign, fill, archive, optimize, or linearize this document, then return the output files and a concise summary."
4 changes: 4 additions & 0 deletions nutrient-document-processing/assets/nutrient.svg