feat: add interlinear PDF export pipeline by delgado-jacob · Pull Request #134 · globalbibletools/platform

delgado-jacob · 2025-12-17T15:51:03Z

closes: #109

Feature

For Translators:

Export interlinear PDFs directly from the Language Settings page
Choose specific books and chapter ranges, or export everything
Select between "Standard" (word-by-word) and "Parallel" (two-column) layouts
Download links with automatic expiration tracking

For Snapshots:

Automatic PDF generation when language snapshots are created
PDFs are archived alongside snapshot data for historical record

Architecture

New export Module (src/modules/export/)

InterlinearPdfGenerator - Core PDF rendering using PDFKit with SBL Hebrew/Greek fonts
ExportRequestRepository - Tracks export requests, status, and download URLs
ExportStorageRepository - S3/LocalStack integration for PDF storage
Server actions for requesting exports and polling status
React components for the export UI with real-time status updates

Background Jobs

export_interlinear_pdf - Generates PDFs asynchronously, merges multi-book exports
cleanup_exports - Purges expired exports from storage
create_snapshot_interlinear_pdf - Generates archival PDFs during snapshot creation
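
A minimal sketch of the expiry check a job like cleanup_exports could run, assuming each export records when it completed. The ExportRecord shape and both function names are hypothetical and not taken from this PR; the 7-day default mirrors the retention described under Key Design Decisions.

```typescript
// Hypothetical record shape; the real export_request columns are not shown here.
interface ExportRecord {
  id: string;
  completedAt: Date;
}

const MS_PER_DAY = 24 * 60 * 60 * 1000;

// Expiry is the completion time plus a configurable retention window (default 7 days).
function expiresAt(completedAt: Date, retentionDays: number = 7): Date {
  return new Date(completedAt.getTime() + retentionDays * MS_PER_DAY);
}

// Returns the exports that a cleanup pass running at `now` would purge.
function findExpired(
  records: ExportRecord[],
  now: Date,
  retentionDays: number = 7,
): ExportRecord[] {
  return records.filter(
    (record) => expiresAt(record.completedAt, retentionDays).getTime() <= now.getTime(),
  );
}
```

Passing `now` in as a parameter, rather than reading the clock inside the function, keeps the check easy to unit test.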

Database Changes

New export_request table tracking request lifecycle (pending → in progress → complete/failed)
New export_request_book junction table for multi-book exports
Three new job types registered
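
The request lifecycle above can be sketched as a small transition table. The status names come from the export_request_status enum in the migration; canTransition and transition are hypothetical helpers, and the rule that COMPLETE and FAILED are terminal is an assumption read off the arrow diagram.

```typescript
type ExportRequestStatus = "PENDING" | "IN_PROGRESS" | "COMPLETE" | "FAILED";

// Allowed next states: pending → in progress → complete/failed.
// COMPLETE and FAILED are treated as terminal (assumption).
const NEXT: Record<ExportRequestStatus, ReadonlyArray<ExportRequestStatus>> = {
  PENDING: ["IN_PROGRESS"],
  IN_PROGRESS: ["COMPLETE", "FAILED"],
  COMPLETE: [],
  FAILED: [],
};

function canTransition(from: ExportRequestStatus, to: ExportRequestStatus): boolean {
  return NEXT[from].indexOf(to) !== -1;
}

// Returns the new status, or throws if the move is not in the table.
function transition(from: ExportRequestStatus, to: ExportRequestStatus): ExportRequestStatus {
  if (!canTransition(from, to)) {
    throw new Error(`Invalid export status transition: ${from} -> ${to}`);
  }
  return to;
}
```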

Key Design Decisions

Async Generation: PDFs are generated in background jobs rather than blocking the UI, with polling for status updates
Per-Book Chunking with Merge: Multi-book exports generate individual PDFs per book, then merge them using pdf-lib—this prevents memory issues with large exports
Script Detection: Automatically detects Hebrew vs Greek source text to apply correct fonts and RTL/LTR layout
Expiring Downloads: Export URLs expire (configurable, default 7 days) with automatic cleanup job
Snapshot Integration: Snapshot creation now triggers interlinear PDF generation, archived at snapshots/{languageId}/{snapshotId}/interlinear.pdf
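
As an illustration of the script detection point, here is one way it could work: count characters in the Hebrew and Greek Unicode blocks and pick the majority. This is a sketch, not the code in this PR. detectScript and directionFor are hypothetical names, and the block ranges (Hebrew U+0590 to U+05FF, Greek and Coptic U+0370 to U+03FF, Greek Extended U+1F00 to U+1FFF) deliberately ignore presentation forms.

```typescript
type SourceScript = "hebrew" | "greek";
type Direction = "ltr" | "rtl";

// Counts characters per script and returns the majority, or undefined
// when the text contains neither Hebrew nor Greek characters.
function detectScript(text: string): SourceScript | undefined {
  let hebrew = 0;
  let greek = 0;
  for (let i = 0; i < text.length; i++) {
    const code = text.charCodeAt(i);
    if (code >= 0x0590 && code <= 0x05ff) {
      hebrew++;
    } else if ((code >= 0x0370 && code <= 0x03ff) || (code >= 0x1f00 && code <= 0x1fff)) {
      greek++;
    }
  }
  if (hebrew === 0 && greek === 0) return undefined;
  return hebrew >= greek ? "hebrew" : "greek";
}

// Hebrew is laid out right-to-left; Greek left-to-right.
function directionFor(script: SourceScript): Direction {
  return script === "hebrew" ? "rtl" : "ltr";
}
```

Returning undefined for text with no Hebrew or Greek characters leaves the fallback choice to the caller instead of guessing a font.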

Testing

Unit tests for PDF generator, jobs, actions, and React components
Integration tests for repository layer
LocalStack integration tests for S3 operations
Test coverage for error paths and edge cases

arrocke

Wow, thanks for the contribution! There is a lot of good stuff in here, but a PR this size is very difficult to review. I'm going to treat this like a prototype to work through product/design decisions together. Then let's chop the PR up into smaller PRs for easier technical review. There will be a lot to work through and this will make it less overwhelming to review and faster to get changes shipped for testing.

In the future, I encourage you to seek clarity on vague tickets before starting work and open PRs with smaller changes. We have a feature flag system to be able to do this with incomplete features.

We can use this PR to discuss the bigger picture of this feature, so I will leave some high level feedback here to start that discussion. From there, here is a way we can break this work up into smaller PRs for easier review:

Introduce localstack: I'm thrilled you are introducing this. There are a number of other places where we could take advantage of this. This PR could simply introduce localstack and its initialization scripts.
Basic export job infrastructure: Set up the UI to trigger the new job and report its status. At this stage the job is a noop. This new UI can be behind a feature flag.
Upload logic: Add the logic to upload an empty PDF in the job.
PDF generation: Add the MVP PDF generation logic to the job. This will be the meat of the work, but we'll be able to focus on just that logic now that all of the infrastructure is taken care of.

Please don't be discouraged by this feedback. My goal is to clarify how we can best work together. I'm grateful for the work you've put in thus far to develop this feature. Please let me know if you have any questions.

arrocke · 2025-12-18T01:05:49Z

src/modules/snapshots/jobs/createSnapshotJob.ts

+  await enqueueJob(SNAPSHOT_JOB_TYPES.CREATE_SNAPSHOT_INTERLINEAR_PDF, {
+    languageId: snapshot.languageId,
+    languageCode: language.code,
+    snapshotId: snapshot.id,
+  });


The snapshot system is meant for backup/restore purposes, not to export for general consumption. We can remove PDF generation from the snapshot module.

arrocke · 2025-12-18T01:13:48Z

db/migrations/25-12-06-add-export-request.sql

@@ -0,0 +1,36 @@
+create type export_request_status as enum ('PENDING', 'IN_PROGRESS', 'COMPLETE', 'FAILED');
+
+create table export_request (


I'd prefer to track all of this in the payload and data columns of the job table. Both of those columns can hold arbitrary JSON. Jobs are ephemeral so I find referential integrity of job data to be less important and having a single mechanism for tracking job progress is useful. Do you have any concerns with this approach?

delgado-jacob · 2025-12-22

Sounds good, no concerns.
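
For discussion, tracking an export in the job row's JSON columns might look roughly like the sketch below. Only the column names payload and data come from the comment above; every field and function name here is hypothetical, since the job table schema is not shown in this thread.

```typescript
// Hypothetical input stored in the job's `payload` column.
interface ExportJobPayload {
  languageCode: string;
}

// Hypothetical result stored in the job's `data` column once the job finishes.
interface ExportJobData {
  downloadUrl?: string;
  expiresAt?: string; // ISO 8601 timestamp
}

// Hypothetical view of a job row; both JSON columns arrive untyped.
interface JobRow {
  id: string;
  payload: unknown;
  data: unknown;
}

// Validates the untyped JSON payload before the job uses it.
function parseExportPayload(raw: unknown): ExportJobPayload {
  if (typeof raw !== "object" || raw === null) {
    throw new Error("payload must be a JSON object");
  }
  const languageCode = (raw as Record<string, unknown>).languageCode;
  if (typeof languageCode !== "string" || languageCode.length === 0) {
    throw new Error("payload.languageCode must be a non-empty string");
  }
  return { languageCode };
}

// Returns a copy of the row with the export result recorded in `data`.
function withExportResult(row: JobRow, result: ExportJobData): JobRow {
  return { id: row.id, payload: row.payload, data: result };
}
```

Because the columns hold arbitrary JSON, validating at the point of use stands in for the checks a dedicated table would have enforced.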

arrocke · 2025-12-18T01:20:47Z

vitest.localstack.config.mts

+import { defineConfig } from "vitest/config";
+import tsconfigPaths from "vite-tsconfig-paths";
+
+export default defineConfig({


I'm not sure we want to require localstack to be running for integration tests. I think we can get away with stubbing the S3 API and asserting it is called with the right data.
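
A rough sketch of that stubbing approach, assuming the upload code depends on a small storage interface rather than on the S3 client directly. ObjectStore, RecordingStore, uploadExport, and the object key format are all hypothetical and not taken from this PR.

```typescript
// Hypothetical minimal storage port the export code would depend on.
interface PutObjectInput {
  bucket: string;
  key: string;
  body: Uint8Array;
  contentType: string;
}

interface ObjectStore {
  putObject(input: PutObjectInput): Promise<void>;
}

// Test double: records every call instead of talking to S3 or localstack.
class RecordingStore implements ObjectStore {
  readonly calls: PutObjectInput[] = [];

  async putObject(input: PutObjectInput): Promise<void> {
    this.calls.push(input);
  }
}

// Code under test: uploads the PDF and returns the object key it used.
// The key layout is an assumption made for this example.
async function uploadExport(
  store: ObjectStore,
  bucket: string,
  languageCode: string,
  pdf: Uint8Array,
): Promise<string> {
  const key = `exports/${languageCode}/interlinear.pdf`;
  await store.putObject({ bucket, key, body: pdf, contentType: "application/pdf" });
  return key;
}
```

A test would pass a RecordingStore to uploadExport and assert on store.calls, so no container needs to be running.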

arrocke · 2025-12-18T01:22:04Z

compose.local.yaml

@@ -0,0 +1,57 @@
+services:


Is there a reason to break this out into its own file? Our compose.yaml is for local development only since everything is deployed separately in production environments.

delgado-jacob · 2025-12-22

No, just a precaution to avoid messing with any other workflows. I can consolidate it.

arrocke · 2025-12-18T01:28:45Z

src/modules/languages/react/LanguageSettingsPage.tsx

+      <InterlinearExportPanel
+        languageCode={languageSettings.code}
+        books={books}
+      />


I'd like to move this to its own page in the admin language view. The export module would own that view. The settings page is reserved for configuration on the language.

arrocke · 2025-12-18T01:44:46Z

src/modules/export/pdf/InterlinearPdfGenerator.ts

+export interface PdfGeneratorOptions {
+  pageSize?: "letter" | "a4";
+  layout: "standard" | "parallel";
+  direction?: "ltr" | "rtl";
+  header?: { title?: string; subtitle?: string };
+  footer?: { generatedAt?: Date; pageOffset?: number; pageTotal?: number };
+  sourceScript?: "hebrew" | "greek";
+}


For this first version, I'd like to simplify this to a single button for the user. That button converts to a loading UI while the job is in progress and then reports when the job is done.

For the PDF, let's use your standard layout and export all books by default even if there are no glosses for that book yet. This is a true MVP of this feature and we can expand options later.
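
To make the single-button flow concrete, here is a small sketch of how the button could derive its appearance from the job state. The status names, labels, and exportButtonView are all hypothetical; the real job statuses on the platform may differ.

```typescript
// Hypothetical UI-level status for the export job.
type ExportUiStatus = "idle" | "running" | "done" | "failed";

interface ExportButtonView {
  label: string;
  disabled: boolean;
  showSpinner: boolean;
}

// Maps the job state to what the single export button should display.
function exportButtonView(status: ExportUiStatus): ExportButtonView {
  switch (status) {
    case "idle":
      return { label: "Export PDF", disabled: false, showSpinner: false };
    case "running":
      return { label: "Exporting...", disabled: true, showSpinner: true };
    case "done":
      return { label: "Export complete", disabled: false, showSpinner: false };
    case "failed":
      return { label: "Export failed, try again", disabled: false, showSpinner: false };
  }
}
```

Keeping this mapping as a pure function means the React component only has to render the returned view.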

delgado-jacob · 2025-12-22T19:40:01Z

There is a lot of good stuff in here, but a PR this size is very difficult to review. I'm going to treat this like a prototype to work through product/design decisions together. Then let's chop the PR up into smaller PRs for easier technical review.

Makes sense, I'll work on splitting things up. Are you ok with stacked branches or would you prefer a single branch at a time in PR?

Please don't be discouraged by this feedback.

Not at all. My main objective was to start getting familiar with the platform and prototype something for discussion/iteration. I realize it's a large change set and have no issues breaking it up.

arrocke · 2025-12-23T02:11:45Z

Stacked branches are completely fine. Since we squash a PR into a single commit, the branch history can be a little messy and it will all consolidate to a single commit in the main branch. Open stacked PRs whenever you'd like, just leave them in draft state until you have something you want to merge. PRs are the best place to ask questions about the changes you are making so opening them early and mentioning me with specific questions will speed up the review time later rather than addressing all of that at the end. I'll try to be prompt in responding.

feat: add interlinear PDF export pipeline

dae6a00

github-project-automation bot added this to Global Bible Tools Dec 17, 2025

arrocke reviewed Dec 18, 2025


arrocke force-pushed the main branch from df12799 to 0132535 Compare December 22, 2025 19:30

delgado-jacob marked this pull request as draft December 22, 2025 19:32

arrocke force-pushed the main branch from 06bbf98 to d7d7876 Compare January 4, 2026 19:26

This was referenced Jan 11, 2026

chore: add localstack S3 to docker compose #145

Merged

feat: interlinear export job ui #148

Merged

delgado-jacob mentioned this pull request Mar 18, 2026

feat: upload placeholder interlinear export PDF #209

Open

delgado-jacob wants to merge 1 commit into globalbibletools:main from delgado-jacob:feature_pdf_export
