Issue: PDF Upload and Parse Failing #1

@Lycan-Xx

Description

Issue Summary

  • Affected Feature: PDF file upload and text extraction
  • Severity: Medium – core AI functionality remains operational
  • Discovery Date: September 17 2025 (2 days post‑deployment)
  • Estimated Fix: Immediately after hackathon judging

Timeline

  • Sept 15: Initial deployment to Netlify
  • Sept 17: Issue discovered during user testing
  • Sept 18: Root cause identified and documented
  • Post‑Judging: Fix implementation scheduled

What Happened

After deploying StudyWise AI to production on Netlify (Sept 15 2025), PDF uploads began failing with a version‑compatibility error. The feature worked perfectly in local development.

Technical Details

Error Message

PDF processing error: The API version "5.4.54" does not match the Worker version "3.11.174"

Root Cause

A version mismatch between the two PDF.js components:

| Component | Source | Version |
| --- | --- | --- |
| PDF.js display layer (installed package) | npm ("pdfjs-dist": "^5.4.54") | 5.4.54 |
| PDF.js worker script (CDN) | https://cdnjs.cloudflare.com/ajax/libs/pdf.js/3.11.174/pdf.worker.min.js | 3.11.174 |

Why This Happened

  • Development used a local worker that matched the library version.
  • Production pointed to a CDN fallback (3.11.174).
  • The library was later upgraded to 5.4.54, but that version isn’t available on CDNJS (only newer builds like 5.4.149).
  • With no matching build on the CDN, production kept loading the 3.11.174 worker against the 5.4.54 API, and PDF.js rejected the pair with the error above.
  • Mistake: I didn’t verify that the CDN actually provides the fallback version I was referencing.
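
A minimal sketch of a guard that could surface this kind of mismatch before PDF.js does, assuming the worker is loaded from a CDNJS-style URL with the version in the path. The helper names `versionFromWorkerUrl` and `assertWorkerMatchesApi` are hypothetical and are not part of StudyWise AI or PDF.js.

```typescript
// Hypothetical helpers for illustration; not part of PDF.js or the app.

/** Extract the "x.y.z" segment from a CDNJS-style PDF.js worker URL, or null if absent. */
function versionFromWorkerUrl(url: string): string | null {
  const match = /\/pdf\.js\/(\d+\.\d+\.\d+)\//.exec(url);
  return match?.[1] ?? null;
}

/** Throw a descriptive error when the worker URL does not target the API version. */
function assertWorkerMatchesApi(apiVersion: string, workerUrl: string): void {
  const workerVersion = versionFromWorkerUrl(workerUrl);
  if (workerVersion !== apiVersion) {
    throw new Error(
      `Worker URL targets ${workerVersion ?? "an unknown version"}, but the API is ${apiVersion}`
    );
  }
}

// Example with the values from this issue (this call would throw):
// assertWorkerMatchesApi(
//   "5.4.54",
//   "https://cdnjs.cloudflare.com/ajax/libs/pdf.js/3.11.174/pdf.worker.min.js"
// );
```

Running a check like this at startup or in a unit test would fail fast with both version numbers in the message, instead of failing on the first user upload.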

Impact Assessment

  • ✅ Still works: All features except PDF upload.
  • ❌ Affected: PDF file uploads and text extraction.
  • 🔄 Workaround:
    1. Convert PDFs to plain text with external tools.
    2. Copy‑paste text directly into the app.
    3. Use markdown (.md) files for formatted notes.

Why Not Fixed Immediately?

  • Submission integrity: Hackathon rules forbid code changes during judging.
  • OAuth complexity: Separate branches would break Google OAuth callbacks configured in Supabase.
  • Risk management: Changes during judging could introduce new issues to already‑working features.

Technical Solution (Post‑Judging)

Quick Fix

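// client/src/utils/documentProcessor.ts
// Use a version that CDNJS actually hosts. PDF.js requires the API and worker versions to
// match exactly, so "pdfjs-dist" in package.json must be pinned to this same version.
const RELIABLE_CDN_VERSION = '5.4.149';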

Long‑Term Solution

Implement dynamic version matching (e.g., fetch the latest matching worker version at runtime) to prevent future mismatches.
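
One possible reading of dynamic version matching is to derive the worker URL from the library's own version string instead of a hard-coded constant. The sketch below assumes the CDNJS path layout seen in this issue and a `pdf.worker.min.mjs` file name for 5.x builds; both should be verified against the CDN. `workerUrlFor` is a hypothetical helper.

```typescript
// Hypothetical helper; the CDNJS path layout and ".mjs" file name are assumptions to verify.
function workerUrlFor(apiVersion: string): string {
  // Reject anything that is not a plain "x.y.z" version, so a bad value cannot build a bad URL.
  if (!/^\d+\.\d+\.\d+$/.test(apiVersion)) {
    throw new Error(`Unexpected PDF.js version string: ${apiVersion}`);
  }
  return `https://cdnjs.cloudflare.com/ajax/libs/pdf.js/${apiVersion}/pdf.worker.min.mjs`;
}

// Intended wiring (commented out so the sketch runs without pdfjs-dist installed):
// import * as pdfjsLib from "pdfjs-dist";
// pdfjsLib.GlobalWorkerOptions.workerSrc = workerUrlFor(pdfjsLib.version);
```

This keeps the API and worker versions in lockstep, but it still depends on the CDN hosting that exact version, which was not the case for 5.4.54.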

Lessons Learned

  1. Verify external dependencies: Ensure CDN versions referenced in code actually exist before deployment.
  2. Production environment testing: Test all features in the real production environment, not just locally.
  3. Dependency management: Bundle critical libraries locally instead of relying on external CDNs for core functionality.
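
Lesson 1 could be automated with a small pre-deployment check along these lines. This is a sketch under assumptions: `findMissingUrls` and the `HeadRequest` type are hypothetical, and the request function is injected so the logic can run without network access.

```typescript
// Hypothetical pre-deploy check; names are illustrative only.
type HeadRequest = (url: string) => Promise<{ ok: boolean }>;

/** Return the URLs that could not be confirmed to exist. */
async function findMissingUrls(urls: string[], head: HeadRequest): Promise<string[]> {
  const missing: string[] = [];
  for (const url of urls) {
    try {
      const res = await head(url);
      if (!res.ok) missing.push(url); // non-2xx response, e.g. 404 for an unhosted version
    } catch {
      missing.push(url); // a network failure also counts as "not verified"
    }
  }
  return missing;
}

// In a real script, the request function could wrap the global fetch:
// const missing = await findMissingUrls(cdnUrls, (u) => fetch(u, { method: "HEAD" }));
// if (missing.length > 0) process.exit(1);
```

Failing the build when the list is non-empty would block a deployment that references a CDN file that does not exist.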
