Open
Labels
enhancement (New feature or request), priority:medium (Medium priority task)
Overview
Extract embedded JavaScript code from PDF documents for security analysis.
Parent Epic
Part of #91 - Document & Office Format Awareness
Description
PDFs can contain JavaScript in various locations (actions, annotations, form fields). Extract this code for analysis.
Implementation Details
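- Parse JavaScript actions (action dictionaries with /S /JavaScript, script held in /JS)
- Extract document-level scripts (the /JavaScript name tree under /Names)
- Extract from the document open action (/OpenAction) and page actions (/AA)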
- Extract from form field actions
- Handle both string and stream JavaScript
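As a rough illustration of the inline-string case above, the following Python sketch scans raw PDF bytes for `/JS` keys whose value is a literal `(...)` or hex `<...>` string. The function names (`read_literal`, `decode_text`, `find_inline_js`) are hypothetical and not part of this project. It assumes uncompressed objects, treats non-UTF-16 text as Latin-1 (an approximation of PDFDocEncoding), and skips `/JS` values that are indirect references to streams, which a real implementation would need to resolve and decompress.

```python
import re

# Matches a /JS key whose value starts inline as a literal "(" or hex "<" string.
JS_KEY = re.compile(rb"/JS\s*(?=[(<])")
OCTAL = re.compile(rb"[0-7]{1,3}")
ESCAPES = {b"n": b"\n", b"r": b"\r", b"t": b"\t", b"b": b"\b", b"f": b"\f"}


def read_literal(data: bytes, pos: int) -> bytes:
    """Read a PDF literal string whose opening "(" is at data[pos]."""
    out = bytearray()
    depth = 0
    i = pos
    while i < len(data):
        c = data[i:i + 1]
        if c == b"\\":
            octal = OCTAL.match(data, i + 1)
            if octal:  # \ddd octal escape
                out.append(int(octal.group(), 8) & 0xFF)
                i = octal.end()
            else:  # single-character escapes such as \n, \( and \\
                nxt = data[i + 1:i + 2]
                out += ESCAPES.get(nxt, nxt)
                i += 2
            continue
        if c == b"(":
            depth += 1
            if depth > 1:  # nested, balanced parentheses belong to the string
                out += c
        elif c == b")":
            depth -= 1
            if depth == 0:
                return bytes(out)
            out += c
        else:
            out += c
        i += 1
    return bytes(out)  # unterminated string: return what was read


def decode_text(raw: bytes) -> str:
    """Decode a PDF text string: UTF-16BE with BOM, else Latin-1 (assumption)."""
    if raw.startswith(b"\xfe\xff"):
        return raw[2:].decode("utf-16-be", errors="replace")
    return raw.decode("latin-1")


def find_inline_js(data: bytes) -> list[str]:
    """Return scripts stored directly as string values of /JS keys."""
    scripts = []
    for m in JS_KEY.finditer(data):
        pos = m.end()
        if data[pos:pos + 2] == b"<<":
            continue  # a dictionary, not a hex string
        if data[pos:pos + 1] == b"(":
            raw = read_literal(data, pos)
        else:
            end = data.find(b">", pos)
            if end == -1:
                continue
            digits = re.sub(rb"\s", b"", data[pos + 1:end])
            digits += b"0" * (len(digits) % 2)  # odd length: pad final digit
            try:
                raw = bytes.fromhex(digits.decode("ascii"))
            except ValueError:
                continue  # not valid hex; skip rather than guess
        scripts.append(decode_text(raw))
    return scripts
```

Scanning raw bytes keeps the sketch dependency-free, but it will miss scripts inside object streams or behind name obfuscation such as `/J#53`; a full PDF object parser would be the more robust route.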
String Sources
- JavaScript code
- Function names
- Variable names
- String literals within JavaScript
- API calls (app., doc., etc.)
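One possible way to pull these string sources out of an extracted script is a set of regular expressions, sketched below in Python. The name `extract_js_strings` and the list of API prefixes (`app`, `doc`, `this`, `util`) are illustrative assumptions; the issue only names `app.` and `doc.` explicitly. Regex matching does not understand JavaScript syntax, so it will also match inside comments and string bodies; a tokenizer would be more precise.

```python
import re

# Quoted string literals, honoring backslash escapes inside the quotes.
STRING_LIT = re.compile(r"""(["'])((?:\\.|(?!\1).)*)\1""", re.S)
FUNC_NAME = re.compile(r"\bfunction\s+([A-Za-z_$][\w$]*)")
VAR_NAME = re.compile(r"\b(?:var|let|const)\s+([A-Za-z_$][\w$]*)")
# Calls on assumed API root objects, e.g. app.alert( or this.getField(
API_CALL = re.compile(r"\b((?:app|doc|this|util)\.[A-Za-z_$][\w$.]*)\s*\(")


def extract_js_strings(source: str) -> dict[str, list[str]]:
    """Collect candidate strings of interest from JavaScript source text."""
    return {
        "functions": FUNC_NAME.findall(source),
        "variables": VAR_NAME.findall(source),
        "literals": [m.group(2) for m in STRING_LIT.finditer(source)],
        "api_calls": API_CALL.findall(source),
    }
```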
Acceptance Criteria
- Extract document-level JavaScript
- Extract page-level JavaScript
- Extract form field scripts
- Handle obfuscated JavaScript
- Pretty-print JavaScript output
- Tests with JS-enabled PDFs
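The "obfuscated JavaScript" criterion is open-ended. As one narrow example of what it might cover, the Python sketch below folds two common tricks back into readable text: `String.fromCharCode(...)` calls with constant arguments, and `\xNN` / `\uNNNN` escapes that encode plain ASCII characters. The function name `normalize_js` is hypothetical, and this is a static text rewrite under simplifying assumptions (for example, it does not distinguish an escaped backslash followed by `x41` from a real `\x41` escape), not a JavaScript evaluator.

```python
import json
import re

NUM = r"(?:0[xX][0-9a-fA-F]+|\d+)"
FROM_CHAR_CODE = re.compile(
    r"String\.fromCharCode\(\s*(" + NUM + r"(?:\s*,\s*" + NUM + r")*)\s*\)"
)
HEX_ESCAPE = re.compile(r"\\x([0-9a-fA-F]{2})|\\u([0-9a-fA-F]{4})")
SAFE_PUNCT = " ._-:/"


def _fold_char_codes(m: re.Match) -> str:
    """Replace String.fromCharCode(97, 0x70) with the literal "ap"."""
    codes = []
    for part in re.split(r"\s*,\s*", m.group(1)):
        base = 16 if part.lower().startswith("0x") else 10
        codes.append(int(part, base) & 0xFFFF)  # JS truncates to 16 bits
    return json.dumps("".join(chr(c) for c in codes))


def _fold_escape(m: re.Match) -> str:
    """Decode an escape only when the result is a harmless ASCII character."""
    ch = chr(int(m.group(1) or m.group(2), 16))
    if ch.isascii() and (ch.isalnum() or ch in SAFE_PUNCT):
        return ch
    return m.group(0)  # keep quotes, backslashes, control chars escaped


def normalize_js(source: str) -> str:
    """Apply both rewrites; the output remains JavaScript source text."""
    source = FROM_CHAR_CODE.sub(_fold_char_codes, source)
    return HEX_ESCAPE.sub(_fold_escape, source)
```

Keeping the original script alongside the normalized form is probably worthwhile, since the obfuscation itself can be a useful signal during analysis.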
Security Note
This is important for malware analysis as malicious PDFs often contain JavaScript exploits.
Related
Project: #76