unportfolio

Extract embedded files from Adobe PDF Portfolios.

Try the demo — works in your browser, no upload required.

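import fs from 'node:fs';
import { Portfolio, setDecompressor } from 'unportfolio';
import pako from 'pako';

// Register a FlateDecode handler (pako here).
setDecompressor('FlateDecode', data => pako.inflate(data));

// pdfData is a Uint8Array holding the PDF bytes.
const portfolio = Portfolio.open(pdfData);

// Write each embedded file to the current directory.
for (const { name, data } of portfolio.extractAll()) {
  fs.writeFileSync(name, data);
}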

Most portfolios use FlateDecode compression, so you'll need pako (or Node's zlib). ASCII85 and ASCIIHex are built in.

Install

npm install unportfolio pako

API

Portfolio.isPortfolio(data: Uint8Array): boolean — Quick check. Doesn't fully parse.

Portfolio.open(data: Uint8Array): Portfolio — Parse and open. Throws NotAPortfolioError, EncryptedPDFError, or MalformedPDFError.

portfolio.listFiles() — Returns { name, size?, created?, modified?, checksum?, mimeType? }[]

portfolio.extract(filename) — Returns Uint8Array. Throws FileNotFoundError.

portfolio.extractAll() — Generator yielding { name, data }. Memory-efficient for large portfolios.
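Since extractAll() yields one file at a time, each file can be written to disk as it arrives. The names come from inside the PDF, though, so they may contain path separators or `..` segments. Below is a minimal sketch of a guard for that case. `safeOutputPath` is a hypothetical helper, not part of unportfolio, and the `unnamed` fallback is an arbitrary choice.

```typescript
import * as path from 'node:path';

// Hypothetical helper (not part of unportfolio): map an embedded file name
// to a path that stays inside outDir.
function safeOutputPath(outDir: string, embeddedName: string): string {
  // Treat both "/" and "\" as separators and keep only the last segment.
  const base = embeddedName.split(/[\\/]/).pop() ?? '';
  // Fall back to a placeholder for empty names or pure directory references.
  const cleaned = base === '' || base === '.' || base === '..' ? 'unnamed' : base;
  return path.join(outDir, cleaned);
}

console.log(safeOutputPath('out', '../../etc/passwd'));
console.log(safeOutputPath('out', 'reports\\q1.xlsx'));
```

Inside the extraction loop this would replace the bare name, for example `fs.writeFileSync(safeOutputPath('out', name), data)`. Two embedded files that share a base name would still overwrite each other, so add a counter or suffix if that matters.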

portfolio.fileCount / portfolio.hasFile(name) — The number of embedded files, and whether a file with the given name exists.

setDecompressor(filter, fn) — Register your own decompressor for a filter. fn has the signature (data: Uint8Array) => Uint8Array.

Node.js with zlib

import { inflateSync } from 'zlib';
import { setDecompressor } from 'unportfolio';

setDecompressor('FlateDecode', data => inflateSync(data));
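It can be worth sanity-checking a decompressor on its own before registering it. The sketch below round-trips a small buffer through Node's zlib. `inflateToUint8Array` is a hypothetical wrapper name. Re-wrapping the result is optional: `inflateSync` returns a Buffer, which is already a Uint8Array subclass.

```typescript
import { deflateSync, inflateSync } from 'node:zlib';

// Candidate FlateDecode handler with the (data: Uint8Array) => Uint8Array shape.
function inflateToUint8Array(data: Uint8Array): Uint8Array {
  const out = inflateSync(data); // Buffer, a Uint8Array subclass
  // Re-wrap as a plain Uint8Array view over the same memory (no copy).
  return new Uint8Array(out.buffer, out.byteOffset, out.byteLength);
}

// Round-trip check using zlib-wrapped deflate data.
const original = new TextEncoder().encode('hello portfolio');
const restored = inflateToUint8Array(deflateSync(original));
console.log(new TextDecoder().decode(restored)); // hello portfolio
```

The same function can then be passed as the second argument to setDecompressor in place of the inline arrow function.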

Limitations

  • Encrypted PDFs not supported (throws EncryptedPDFError)
  • LZWDecode requires you to bring your own decompressor
  • Read-only — this extracts files, doesn't create portfolios
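For the LZWDecode case, here is a rough sketch of what a bring-your-own decompressor could look like. It assumes the PDF defaults (MSB-first codes of 9 to 12 bits, EarlyChange = 1) and no /Predictor, since the callback only receives the raw bytes. `lzwDecode` is a hypothetical name, and the sample bytes are the worked example from the PDF specification's LZWDecode section. Treat it as a starting point, not a hardened implementation.

```typescript
// Sketch of a PDF-style LZW decoder.
// Codes: 0-255 literal bytes, 256 = clear table, 257 = end of data.
function lzwDecode(data: Uint8Array): Uint8Array {
  const CLEAR = 256;
  const EOD = 257;

  const freshTable = (): number[][] => {
    const t: number[][] = [];
    for (let i = 0; i < 256; i++) t.push([i]);
    t.push([], []); // placeholders for codes 256 and 257
    return t;
  };

  const out: number[] = [];
  let table = freshTable();
  let width = 9; // current code width in bits
  let prev: number[] | null = null; // sequence decoded from the previous code
  let bitBuf = 0; // unread bits, most significant first
  let bitCount = 0;
  let pos = 0;

  while (true) {
    // Pull in bytes until a full code is available.
    while (bitCount < width) {
      if (pos >= data.length) return Uint8Array.from(out); // input ended without EOD
      bitBuf = (bitBuf << 8) | data[pos++];
      bitCount += 8;
    }
    const code = (bitBuf >>> (bitCount - width)) & ((1 << width) - 1);
    bitCount -= width;
    bitBuf &= (1 << bitCount) - 1; // keep only the unread bits

    if (code === EOD) break;
    if (code === CLEAR) {
      table = freshTable();
      width = 9;
      prev = null;
      continue;
    }

    let entry: number[];
    if (code < table.length) {
      entry = table[code];
    } else if (code === table.length && prev !== null) {
      // Code not in the table yet: it must be prev plus prev's first byte.
      entry = [...prev, prev[0]];
    } else {
      throw new Error(`Invalid LZW code ${code}`);
    }
    for (const b of entry) out.push(b);

    // Grow the table with prev plus the first byte of the current entry.
    if (prev !== null && table.length < 4096) table.push([...prev, entry[0]]);
    prev = entry;

    // EarlyChange = 1: widen one code before the table strictly requires it.
    if (table.length + 1 >= (1 << width) && width < 12) width++;
  }
  return Uint8Array.from(out);
}

// Worked example from the PDF specification.
const sample = new Uint8Array([0x80, 0x0b, 0x60, 0x50, 0x22, 0x0c, 0x0c, 0x85, 0x01]);
console.log(Array.from(lzwDecode(sample)).join(' ')); // 45 45 45 45 45 65 45 45 45 66
```

Registering it would presumably look like `setDecompressor('LZWDecode', lzwDecode)`, following the FlateDecode examples. The sample only exercises 9-bit codes, so test against real LZW streams before relying on the width-change logic.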

License

MIT
