feat: add AI redaction tool with stage/apply options#15
Conversation
There was a problem hiding this comment.
Pull request overview
This PR introduces an AI-powered document redaction tool (ai_redactor) that integrates with the Nutrient DWS /ai/redact endpoint. The feature was split from PR #12 per review feedback and specifically addresses Nick's Review Point #4 regarding support for stage and apply options. The implementation follows established codebase patterns for API integration, file handling, and error management, providing a clean interface for AI-driven PII detection and permanent removal from documents.
Changes:
- Added
ai_redactorMCP tool with AI-powered PII detection supporting customizable criteria and stage/apply workflow options - Implemented path validation with overwrite protection and mutually exclusive stage/apply flag validation
- Added schema validation tests for default values, flag acceptance, and empty criteria rejection
Reviewed changes
Copilot reviewed 4 out of 4 changed files in this pull request and generated 6 comments.
| File | Description |
|---|---|
src/dws/ai-redact.ts |
New implementation for AI redaction API calls with file validation, stage/apply logic, and FormData payload construction |
src/schemas.ts |
Added AiRedactArgsSchema with optional stage/apply boolean flags and type export |
src/index.ts |
Registered ai_redactor tool with descriptive documentation of capabilities and detected content types |
tests/unit.test.ts |
Added 3 schema validation tests covering default criteria, flag acceptance, and empty criteria rejection |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| describe('AiRedactArgsSchema', () => { | ||
| it('should apply default criteria when omitted', () => { | ||
| const result = AiRedactArgsSchema.parse({ filePath: '/input.pdf', outputPath: '/output.pdf' }) | ||
|
|
||
| expect(result.criteria).toBe('All personally identifiable information') | ||
| expect(result.stage).toBeUndefined() | ||
| expect(result.apply).toBeUndefined() | ||
| }) | ||
|
|
||
| it('should accept stage and apply flags when provided', () => { | ||
| const stageResult = AiRedactArgsSchema.parse({ | ||
| filePath: '/input.pdf', | ||
| outputPath: '/output.pdf', | ||
| stage: true, | ||
| }) | ||
|
|
||
| const applyResult = AiRedactArgsSchema.parse({ | ||
| filePath: '/input.pdf', | ||
| outputPath: '/output.pdf', | ||
| apply: true, | ||
| }) | ||
|
|
||
| expect(stageResult.stage).toBe(true) | ||
| expect(applyResult.apply).toBe(true) | ||
| }) | ||
|
|
||
| it('should reject empty criteria', () => { | ||
| expect(() => | ||
| AiRedactArgsSchema.parse({ | ||
| filePath: '/input.pdf', | ||
| outputPath: '/output.pdf', | ||
| criteria: '', | ||
| }), | ||
| ).toThrow() | ||
| }) | ||
| }) |
There was a problem hiding this comment.
The test coverage for performAiRedactCall is incomplete. While the schema validation tests are present (lines 661-696), there are no integration tests for the actual performAiRedactCall function. Looking at the existing patterns in this file, performBuildCall (lines 83-239) and performSignCall (lines 241-365) both have comprehensive integration tests that verify error handling, API interactions, file operations, and success cases. The performAiRedactCall function should have similar tests, including: testing file not found errors, API key validation, testing stage/apply flag handling, verifying the form data sent to the API, testing output file writing, and handling API errors. This ensures consistent test coverage across all API functions.
src/index.ts
Outdated
| • Protected health information (PHI) | ||
| • Any custom criteria you specify | ||
|
|
||
| Use stage to create redactions without applying them. Use apply to apply staged redactions.`, |
There was a problem hiding this comment.
The description for the stage and apply options is incomplete. It should clarify what happens when neither option is set (the default behavior when redaction_state is not included in the API payload). Based on the code at line 45 in ai-redact.ts, when both stage and apply are undefined, no redaction_state is sent to the API. The description should explain: (1) The default behavior when neither flag is set, (2) Whether stage creates new staged redactions or requires a previously staged document, and (3) Whether apply requires a previously staged document. Consider updating the description to something like: "By default (when neither flag is set), redactions are detected and immediately applied. Set stage to true to detect and stage redactions without applying them. Set apply to true to apply previously staged redactions."
| Use stage to create redactions without applying them. Use apply to apply staged redactions.`, | |
| By default (when neither stage nor apply is set), redactions are detected and immediately applied. Set stage to true to detect and stage redactions without applying them. Set apply to true to apply previously staged redactions.`, |
src/schemas.ts
Outdated
| .describe('Whether to stage redactions instead of applying them immediately.'), | ||
| apply: z.boolean().optional().describe('Whether to apply staged redactions.'), |
There was a problem hiding this comment.
The descriptions for the stage and apply options don't explain the default behavior or their relationship. Currently, the descriptions just state what each flag does in isolation. They should clarify: (1) What happens when neither is set (default behavior - immediate detection and application), (2) That stage and apply are mutually exclusive (validated at line 31-33 in ai-redact.ts), and (3) The typical workflow (first call with stage=true, then second call with apply=true on the staged document). Consider updating the descriptions to be more informative about the complete redaction workflow.
| .describe('Whether to stage redactions instead of applying them immediately.'), | |
| apply: z.boolean().optional().describe('Whether to apply staged redactions.'), | |
| .describe( | |
| 'Whether to stage redactions instead of applying them immediately. If both stage and apply are omitted (or false), the AI will detect and immediately apply redactions to the output file in a single step. ' + | |
| 'This option is mutually exclusive with apply: only one of stage or apply should be true in a single call. A typical workflow is to first call with stage=true to create a staged document, then make a second call with apply=true on that staged document.', | |
| ), | |
| apply: z | |
| .boolean() | |
| .optional() | |
| .describe( | |
| 'Whether to apply previously staged redactions. This option is mutually exclusive with stage: only one of stage or apply should be true in a single call. ' + | |
| 'If both stage and apply are omitted (or false), redactions are detected and applied immediately without staging. Use apply=true in a follow-up call that references a document where redactions were previously staged.', | |
| ), |
| stage?: boolean, | ||
| apply?: boolean, | ||
| ): Promise<CallToolResult> { | ||
| try { |
There was a problem hiding this comment.
Missing the "fail early" comment that is present in similar code in build.ts (line 23) and sign.ts (line 21). Consider adding a comment like "// Resolve paths first to fail early" before line 21 to maintain consistency with other API functions and clarify the intent of resolving paths before file operations.
| try { | |
| try { | |
| // Resolve paths first to fail early |
src/dws/ai-redact.ts
Outdated
| } | ||
|
|
||
| // Guard against output overwriting input | ||
| if (path.resolve(resolvedInputPath) === path.resolve(resolvedOutputPath)) { |
There was a problem hiding this comment.
The path.resolve() calls on line 36 are redundant. Both resolvedInputPath and resolvedOutputPath are already resolved absolute paths returned from resolveReadFilePath (line 21) and resolveWriteFilePath (line 22). You can simplify this to: if (resolvedInputPath === resolvedOutputPath). This maintains the same functionality while being more efficient and clearer about the fact that the paths are already resolved.
| if (path.resolve(resolvedInputPath) === path.resolve(resolvedOutputPath)) { | |
| if (resolvedInputPath === resolvedOutputPath) { |
src/dws/ai-redact.ts
Outdated
| // Verify input file exists | ||
| try { | ||
| await fs.promises.access(resolvedInputPath, fs.constants.R_OK) | ||
| } catch { | ||
| return createErrorResponse(`Error: Input file not found or not readable: ${filePath}`) | ||
| } | ||
|
|
There was a problem hiding this comment.
This file access check is redundant. The resolveReadFilePath function at line 21 already performs access checks (fs.promises.access and fs.promises.stat) to verify the file exists and is readable. This creates unnecessary I/O operations. You can safely remove this try-catch block since any file access issues will be caught by resolveReadFilePath and will throw an error that gets handled by the outer catch block at line 63.
| // Verify input file exists | |
| try { | |
| await fs.promises.access(resolvedInputPath, fs.constants.R_OK) | |
| } catch { | |
| return createErrorResponse(`Error: Input file not found or not readable: ${filePath}`) | |
| } |
…ts for AI redaction
|
Addressed all 6 Copilot review comments in a180a74: Code cleanup (src/dws/ai-redact.ts):
Description improvements: Integration tests (tests/unit.test.ts):
Build clean, all tests passing (93 insertions, 11 deletions). |
Summary
Adds AI-powered document redaction using the Nutrient DWS
/ai/redactendpoint.Features
ai_redactorMCP Tool:stageoption — stage redactions without applying (per Nick's review feedback)applyoption — apply staged redactionsFiles
src/dws/ai-redact.tssrc/schemas.tsAiRedactArgsSchemawith stage/apply optionssrc/index.tstests/unit.test.tsTesting
Addresses Nick's Review Point #4
The schema now accepts optional
stageandapplyboolean parameters that map to the API'sredaction_statefield.Split from #12 per review feedback to keep PRs focused and reviewable.