@tupizz tupizz commented Dec 12, 2025

Summary

Add support for converting legacy .doc files to .docx format using a conversion server powered by LibreOffice. This enables SuperDoc to open and edit legacy Word documents seamlessly.

  • Add DOC type constant to document-types.ts
  • Create documentConverter.js helper module with conversion logic
  • Update SuperDoc.js with automatic .doc detection and conversion
  • Add conversion events (onConversionStart, onConversionComplete, onConversionError)
  • Add modules.conversion config option for server URL
  • Update file.js to detect .doc files by extension
  • Update BasicUpload.vue to accept .doc files
  • Add conversion-server example with Docker support

Demo

CleanShot.2025-12-12.at.10.26.58.mp4

Usage

const superdoc = new SuperDoc({
  document: docFile, // Can be .doc or .docx
  modules: {
    conversion: {
      serverUrl: 'http://localhost:3001',
      timeout: 60000, // optional
    },
  },
  onConversionStart: ({ fileName }) => console.log(`Converting: ${fileName}`),
  onConversionComplete: ({ convertedFile }) => console.log(`Done: ${convertedFile.name}`),
  onConversionError: ({ error }) => console.error(error),
});

Conversion Server

A Docker-based conversion server is included in examples/conversion-server/:

cd examples/conversion-server
docker-compose up -d

The server uses LibreOffice for reliable .doc to .docx conversion.

Test plan

  • Upload a .doc file and verify it converts and loads correctly
  • Verify conversion events fire appropriately
  • Test error handling when conversion server is unavailable
  • Test with various .doc files (different Word versions)
  • Verify .docx files continue to work normally (no regression)

Closes #1019

Add support for converting legacy .doc files to .docx format using a
conversion server powered by LibreOffice.

Changes:
- Add DOC type constant to document-types.ts
- Create documentConverter.js helper module for conversion logic
- Update SuperDoc.js with automatic .doc detection and conversion
- Add conversion events (onConversionStart, onConversionComplete, onConversionError)
- Add modules.conversion config option for server URL
- Update file.js to detect .doc files
- Update BasicUpload.vue to accept .doc files
- Add conversion-server example with Docker support

Usage:
```javascript
const superdoc = new SuperDoc({
  document: docFile,
  modules: {
    conversion: {
      serverUrl: 'http://localhost:3001',
    },
  },
});
```

Closes superdoc-dev#1019
Copilot AI review requested due to automatic review settings December 12, 2025 13:26

Copilot AI left a comment

Pull request overview

This PR adds support for converting legacy .doc files to .docx format, enabling SuperDoc to work with older Word document formats. The implementation includes a LibreOffice-based conversion server and seamless client-side integration with automatic conversion detection and event handling.

Key Changes:

  • Adds .doc MIME type constant and file detection logic throughout the codebase
  • Implements a conversion helper module with timeout handling and error management
  • Integrates automatic .doc to .docx conversion into SuperDoc's initialization flow
  • Provides a Docker-based conversion server using LibreOffice for reliable document conversion
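The timeout handling mentioned in the key changes can be sketched roughly as follows. This is a hypothetical illustration, not the code in documentConverter.js: the function name `convertDocToDocx`, the `file` form field, the `fetchImpl` parameter, and the returned `{ name, blob }` shape are all assumptions made for the example. It aborts the request with an AbortController once the timeout elapses and always clears the timer.

```javascript
// Hypothetical sketch: names, the 'file' form field, and the return shape are assumptions.
async function convertDocToDocx(file, { serverUrl, timeout = 60000, fetchImpl = fetch } = {}) {
  const controller = new AbortController();
  // Abort the in-flight request once the timeout elapses.
  const timer = setTimeout(() => controller.abort(), timeout);
  const fileName = file.name ?? 'document.doc';

  try {
    const formData = new FormData();
    formData.append('file', file, fileName);

    const response = await fetchImpl(`${serverUrl}/convert`, {
      method: 'POST',
      body: formData,
      signal: controller.signal,
    });
    if (!response.ok) {
      throw new Error(`Conversion failed with HTTP ${response.status}`);
    }

    return {
      name: fileName.replace(/\.doc$/i, '.docx'),
      blob: await response.blob(),
    };
  } catch (error) {
    // fetch rejects with an AbortError when the signal fires.
    if (error.name === 'AbortError') {
      throw new Error(`Conversion timed out after ${timeout} ms`);
    }
    throw error;
  } finally {
    // Clear the timer on success and on failure alike.
    clearTimeout(timer);
  }
}
```

Injecting `fetchImpl` is only there to make the sketch testable without a running server; a real helper would likely call `fetch` directly.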

Reviewed changes

Copilot reviewed 13 out of 13 changed files in this pull request and generated 12 comments.

| File | Description |
| --- | --- |
| shared/common/document-types.ts | Adds DOC MIME type constant and includes it in DocumentType union |
| shared/common/components/BasicUpload.vue | Updates file upload component to accept .doc files |
| packages/superdoc/src/dev/components/SuperdocDev.vue | Adds conversion UI dialogs, state management, and event handlers for development |
| packages/superdoc/src/core/helpers/file.js | Adds .doc extension detection in file type inference |
| packages/superdoc/src/core/helpers/documentConverter.js | New module providing conversion logic, server communication, and utilities |
| packages/superdoc/src/core/SuperDoc.js | Integrates automatic conversion into document initialization with dual event emission pattern |
| examples/conversion-server/server.js | Express server implementing LibreOffice-based conversion with file upload handling |
| examples/conversion-server/package.json | Dependencies and scripts for the conversion server |
| examples/conversion-server/docker-compose.yml | Docker Compose configuration with health checks and resource limits |
| examples/conversion-server/README.md | Comprehensive documentation for setup, usage, and troubleshooting |
| examples/conversion-server/Dockerfile | Multi-stage Docker build with LibreOffice and security best practices |
| examples/conversion-server/.gitignore | Standard ignore patterns for Node.js projects |
| examples/conversion-server/.dockerignore | Excludes unnecessary files from Docker build context |

<div class="doc-conversion-error">
{{ conversionError }}
</div>
<p style="margin-top: 12px">Make sure the conversion server is running at {{ CONVERSION_SERVER_URL }}</p>

Copilot AI Dec 12, 2025

The CONVERSION_SERVER_URL is hardcoded and displayed in user-facing error messages. Vue's {{ }} text interpolation escapes HTML, so the template on line 680 is safe as written. If this value is later derived from user input or environment variables, keep rendering it through text interpolation (not v-html) to avoid XSS.

Copilot uses AI. Check for mistakes.
if (!file) return false;

// Check by MIME type
if (file.type === DOC || file.type === 'application/msword') {

Copilot AI Dec 12, 2025

The MIME type check on line 27 redundantly checks for both DOC constant (which equals 'application/msword') and the literal string 'application/msword'. Since DOC is already defined as 'application/msword', the second condition is unnecessary. Remove the redundant check for cleaner code.

Suggested change
- if (file.type === DOC || file.type === 'application/msword') {
+ if (file.type === DOC) {

Comment on lines +109 to +111
const originalName = req.file.originalname.replace(/\.doc$/i, '.docx');
res.setHeader('Content-Type', 'application/vnd.openxmlformats-officedocument.wordprocessingml.document');
res.setHeader('Content-Disposition', `attachment; filename="${originalName}"`);

Copilot AI Dec 12, 2025

The filename derived from req.file.originalname is interpolated into the Content-Disposition header without escaping. A name containing quotes, newline characters, or other special characters could corrupt the header or enable HTTP header injection. Escape the value with a proper header-encoding function, or validate the filename before using it.
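One way to build a safe header value is sketched below. The helper name `contentDisposition` is hypothetical and not part of the PR. It emits a sanitized ASCII `filename` fallback plus a percent-encoded `filename*` parameter in the RFC 6266 / RFC 5987 style, so control characters and quotes from the upload name never reach the header verbatim.

```javascript
// Hypothetical helper: builds a Content-Disposition value from an untrusted upload name.
function contentDisposition(originalName) {
  const docxName = originalName.replace(/\.doc$/i, '.docx');

  // ASCII fallback: replace control characters, non-ASCII characters,
  // double quotes, and backslashes with underscores.
  const fallback = docxName.replace(/[^\x20-\x7e]|["\\]/g, '_');

  // Extended parameter: percent-encode the UTF-8 name. encodeURIComponent
  // leaves ' ( ) * unescaped, so encode those explicitly.
  const encoded = encodeURIComponent(docxName).replace(
    /['()*]/g,
    (c) => `%${c.charCodeAt(0).toString(16).toUpperCase()}`
  );

  return `attachment; filename="${fallback}"; filename*=UTF-8''${encoded}`;
}
```

In the handler this would be used as `res.setHeader('Content-Disposition', contentDisposition(req.file.originalname))`. Express's built-in `res.attachment(filename)` is an alternative that sets the same header.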

memory: 512M
# Health check
healthcheck:
test: ["CMD", "node", "-e", "fetch('http://localhost:3001/health').then(r => process.exit(r.ok ? 0 : 1))"]

Copilot AI Dec 12, 2025

The health check command relies on the global fetch, which Node.js enables by default only from v18 onward (experimental in v18, stable in v21). The Dockerfile uses node:20-slim, so the check works there, but it would fail on any base image older than Node 18. Consider a more portable approach or document the Node version requirement. Note that the curl-based alternative suggested here only works if curl is installed in the image, which the slim Node images do not provide by default.

Suggested change
- test: ["CMD", "node", "-e", "fetch('http://localhost:3001/health').then(r => process.exit(r.ok ? 0 : 1))"]
+ test: ["CMD", "curl", "-f", "http://localhost:3001/health"]

Comment on lines +37 to +38
HEALTHCHECK --interval=30s --timeout=10s --start-period=5s --retries=3 \
CMD node -e "fetch('http://localhost:3001/health').then(r => process.exit(r.ok ? 0 : 1))" || exit 1

Copilot AI Dec 12, 2025

The health check command relies on the global fetch, which Node.js enables by default only from v18 onward (experimental in v18, stable in v21). This image uses node:20-slim, so the check works there, but it would fail on any base image older than Node 18. Consider a more portable approach or document the Node version requirement clearly.
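A more portable approach, sketched here under assumptions, is a small script built on Node's `http` module, which does not depend on the global fetch or on curl being installed. The function name `checkHealth` and the idea of shipping it as a separate script are hypothetical and not part of the PR.

```javascript
const http = require('node:http');

// Hypothetical helper: resolves true for a 2xx response, false otherwise.
function checkHealth(url, timeoutMs = 5000) {
  return new Promise((resolve) => {
    const req = http.get(url, (res) => {
      res.resume(); // discard the body so the socket is released
      resolve(res.statusCode >= 200 && res.statusCode < 300);
    });
    // Destroying the request on timeout triggers the 'error' handler below.
    req.setTimeout(timeoutMs, () => req.destroy());
    req.on('error', () => resolve(false));
  });
}

// Usage in a health check script:
// checkHealth('http://localhost:3001/health').then((ok) => process.exit(ok ? 0 : 1));
```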

Comment on lines +93 to +96
console.log(`Converting: ${inputPath}`);
const command = `"${libreOfficePath}" --headless --convert-to docx --outdir "${outputDir}" "${inputPath}"`;

await execAsync(command, { timeout: 60000 }); // 60 second timeout

Copilot AI Dec 12, 2025

The command construction is vulnerable to command injection. The inputPath and outputDir come from user-controlled file uploads and are not properly sanitized before being used in shell commands. An attacker could craft a malicious filename to execute arbitrary commands. Use proper argument escaping or pass arguments as an array to spawn instead of using string concatenation with exec.
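The array-based approach could look like the sketch below. The helper names `buildConvertArgs` and `convertWithLibreOffice` are hypothetical. `execFile` runs the binary directly without a shell, so quotes, `$()`, or semicolons in a path are passed through as literal argument text rather than interpreted.

```javascript
const { execFile } = require('node:child_process');
const { promisify } = require('node:util');

const execFileAsync = promisify(execFile);

// Hypothetical helper: each value is its own argv entry, so no quoting is needed.
function buildConvertArgs(inputPath, outputDir) {
  return ['--headless', '--convert-to', 'docx', '--outdir', outputDir, inputPath];
}

// Hypothetical helper: same flags and 60 second timeout as the original command string.
async function convertWithLibreOffice(libreOfficePath, inputPath, outputDir) {
  await execFileAsync(libreOfficePath, buildConvertArgs(inputPath, outputDir), {
    timeout: 60000,
  });
}
```

This assumes `inputPath` is an absolute path, as it would be under `os.tmpdir()`. A path starting with `-` could still be read as an option by the program itself, which avoiding the shell does not prevent.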

Comment on lines +70 to +74
const response = await fetch(`${serverUrl}/convert`, {
method: 'POST',
body: formData,
signal: controller.signal,
});

Copilot AI Dec 12, 2025

The serverUrl is not validated before being used in fetch requests. This could allow Server-Side Request Forgery (SSRF) attacks if an attacker can control the conversion config. Consider validating that the URL uses an allowed protocol (http/https) and optionally checking against an allowlist of domains.
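A minimal validation sketch, assuming a hypothetical helper named `normalizeServerUrl` and an optional `allowedHosts` list (neither is part of the PR), might look like this. It rejects unparsable URLs and non-HTTP(S) schemes, and strips trailing slashes so that appending `/convert` does not produce a double slash.

```javascript
// Hypothetical validator for the modules.conversion.serverUrl option.
function normalizeServerUrl(serverUrl, allowedHosts = null) {
  let url;
  try {
    url = new URL(serverUrl);
  } catch {
    throw new Error(`Invalid conversion server URL: ${serverUrl}`);
  }

  // Only plain HTTP(S) endpoints are accepted.
  if (url.protocol !== 'http:' && url.protocol !== 'https:') {
    throw new Error(`Unsupported protocol for conversion server: ${url.protocol}`);
  }

  // Optional allowlist check; url.host includes the port when one is given.
  if (allowedHosts && !allowedHosts.includes(url.host)) {
    throw new Error(`Conversion server host is not allowed: ${url.host}`);
  }

  // Drop trailing slashes so `${base}/convert` is well formed.
  return url.origin + url.pathname.replace(/\/+$/, '');
}
```

Whether an allowlist is worth having depends on who controls the SuperDoc config in a given deployment; the protocol check is cheap either way.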

// Configure multer for file uploads
const storage = multer.diskStorage({
destination: async (req, file, cb) => {
const tempDir = path.join(os.tmpdir(), 'superdoc-conversions');

Copilot AI Dec 12, 2025

There's a potential race condition where multiple concurrent conversions could attempt to write to the same temp directory. While the unique filename generation helps, consider adding file locking or ensuring the temp directory can handle concurrent writes safely. This is especially important if the conversion server will be used in production.

Suggested change
- const tempDir = path.join(os.tmpdir(), 'superdoc-conversions');
+ const baseTempDir = path.join(os.tmpdir(), 'superdoc-conversions');
+ const uniqueSubdir = crypto.randomBytes(8).toString('hex');
+ const tempDir = path.join(baseTempDir, uniqueSubdir);
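An alternative to generating the subdirectory name by hand is `fs.mkdtemp`, which creates a uniquely named directory in a single call. The wrapper below is a hypothetical sketch (`withTempDir` is not part of the PR). It gives each conversion its own directory and removes it in a `finally` block, so temporary files are cleaned up even when the work throws.

```javascript
const fs = require('node:fs/promises');
const os = require('node:os');
const path = require('node:path');

// Hypothetical helper: runs `work(dir)` inside a fresh temp directory
// and removes the directory afterwards, whether or not `work` fails.
async function withTempDir(work) {
  const dir = await fs.mkdtemp(path.join(os.tmpdir(), 'superdoc-conversion-'));
  try {
    return await work(dir);
  } finally {
    await fs.rm(dir, { recursive: true, force: true });
  }
}
```

With multer's `diskStorage`, the directory would need to be created in the `destination` callback and removed once the response has finished, so this shape fits best when the upload is held in memory or moved into place by the handler.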

Comment on lines +105 to +119
// Read the converted file
const convertedFile = await fs.readFile(expectedOutputPath);

// Set response headers
const originalName = req.file.originalname.replace(/\.doc$/i, '.docx');
res.setHeader('Content-Type', 'application/vnd.openxmlformats-officedocument.wordprocessingml.document');
res.setHeader('Content-Disposition', `attachment; filename="${originalName}"`);
res.setHeader('Content-Length', convertedFile.length);

// Send the file
res.send(convertedFile);

// Cleanup files
await cleanupFiles([inputPath, expectedOutputPath]);

Copilot AI Dec 12, 2025

The converted file is read entirely into memory before sending. For large files (up to the 50MB limit), this could cause memory issues under high load. Consider streaming instead, either with res.sendFile() or by piping fs.createReadStream() directly to the response, for better memory efficiency.

Suggested change
- // Read the converted file
- const convertedFile = await fs.readFile(expectedOutputPath);
- // Set response headers
- const originalName = req.file.originalname.replace(/\.doc$/i, '.docx');
- res.setHeader('Content-Type', 'application/vnd.openxmlformats-officedocument.wordprocessingml.document');
- res.setHeader('Content-Disposition', `attachment; filename="${originalName}"`);
- res.setHeader('Content-Length', convertedFile.length);
- // Send the file
- res.send(convertedFile);
- // Cleanup files
- await cleanupFiles([inputPath, expectedOutputPath]);
+ // Set response headers
+ const originalName = req.file.originalname.replace(/\.doc$/i, '.docx');
+ res.setHeader('Content-Type', 'application/vnd.openxmlformats-officedocument.wordprocessingml.document');
+ res.setHeader('Content-Disposition', `attachment; filename="${originalName}"`);
+ // Stream the file to the response
+ res.sendFile(expectedOutputPath, {}, (err) => {
+   // Cleanup files after response is sent or on error
+   cleanupFiles([inputPath, expectedOutputPath]).catch(() => {});
+   if (err) {
+     console.error('Error sending file:', err);
+     if (!res.headersSent) {
+       res.status(500).json({ error: 'Failed to send file' });
+     }
+   }
+ });

Comment on lines +115 to +119
res.send(convertedFile);

// Cleanup files
await cleanupFiles([inputPath, expectedOutputPath]);

Copilot AI Dec 12, 2025

If the client disconnects during the response (after line 115 but before line 118), the cleanup at line 118 won't execute, leaving temporary files on disk. Consider adding a response event listener or using a try-finally block to ensure cleanup always occurs.

Suggested change
- res.send(convertedFile);
- // Cleanup files
- await cleanupFiles([inputPath, expectedOutputPath]);
+ let cleanedUp = false;
+ const doCleanup = async () => {
+   if (!cleanedUp) {
+     cleanedUp = true;
+     try {
+       await cleanupFiles([inputPath, expectedOutputPath]);
+     } catch (e) {
+       // Optionally log cleanup error
+     }
+   }
+ };
+ res.on('close', doCleanup);
+ res.send(convertedFile);
+ // Ensure cleanup after send (in case 'close' hasn't fired yet)
+ await doCleanup();

tupizz commented Dec 12, 2025

@edoversb 🙏🏻
I'd love feedback on this suggestion.

Comment on lines +252 to +258
findLibreOffice()
.then(path => console.log(` LibreOffice found at: ${path}\n`))
.catch(() => {
console.log(` WARNING: LibreOffice not found!`);
console.log(` Install instructions:`, getInstallInstructions());
console.log('');
});

remove please

@harbournick harbournick force-pushed the main branch 2 times, most recently from 84f8ea5 to 2a443d8 Compare December 23, 2025 02:17