Conversation

@yash12991 commented Jan 5, 2026

  • Add support for PDF, TXT, JSON, and TSV file formats in addition to CSV
  • Implement FileLoaderUtils class for handling multiple file formats
  • Add file upload functionality using multer middleware
  • Create comprehensive validation for uploaded files (type, size, format)
  • Update DataSet controller with automatic file format detection
  • Add new endpoint GET /api/v1/dataset/formats to list supported formats
  • Change upload endpoint from GET to POST with multipart/form-data
  • Add TypeScript types and interfaces for better type safety
  • Include test sample files for each supported format
  • Add comprehensive documentation in README.md
  • Implement automatic file cleanup after processing
Closes #91: feat: Add multi-format file upload support for RAG (PDF, TXT, JSON, TSV)

Dependencies added:

  • pdf-parse: For PDF document parsing
  • multer: For handling file uploads
  • @types/multer: TypeScript types for multer

Breaking changes:

  • Upload endpoint changed from GET /upload to POST /upload
  • Now requires a file upload via multipart/form-data instead of a hardcoded path (a client-side example sketch follows below)
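For reviewers trying the new flow locally, here is a minimal client-side sketch using Node 18+ (global fetch, FormData, and Blob). The base URL is an assumption; the `file` field name and the two routes come from this PR, and `sample-qa.json` refers to one of the sample files added here.

    import { readFile } from 'node:fs/promises'

    // Assumed host/port; adjust to your LocalMind-Backend instance.
    const BASE_URL = 'http://localhost:3000/api/v1/dataset'

    async function listFormats(): Promise<void> {
      // New endpoint added in this PR.
      const res = await fetch(`${BASE_URL}/formats`)
      console.log(await res.json())
    }

    async function uploadDataset(filePath: string): Promise<void> {
      const form = new FormData()
      // The field name must be 'file' to match upload.single('file') on the server.
      const fileName = filePath.split('/').pop() ?? 'dataset'
      form.append('file', new Blob([await readFile(filePath)]), fileName)

      const res = await fetch(`${BASE_URL}/upload`, { method: 'POST', body: form })
      console.log(res.status, await res.json())
    }

    listFormats().then(() => uploadDataset('./sample-qa.json'))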

@gemini-code-assist

Summary of Changes

Hello @yash12991, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request significantly enhances the RAG system's data ingestion capabilities by introducing support for multiple file formats. Previously limited to CSV, the system can now process PDF, TXT, JSON, and TSV files, broadening its applicability for various knowledge bases. The changes involve a complete overhaul of the file upload and processing pipeline, incorporating robust validation, dynamic format detection, and automatic cleanup, all while maintaining a clear and extensible architecture.

Highlights

  • Multi-Format File Upload: The RAG system now supports uploading and processing datasets in PDF, TXT, JSON, and TSV formats, in addition to the existing CSV support. This significantly expands the types of data that can be used for knowledge retrieval.
  • Modular File Handling: A new FileLoaderUtils class has been implemented to centralize file format detection and loading logic, making the system extensible for future formats.
  • Robust File Upload Mechanism: The pull request integrates multer middleware for handling multipart/form-data file uploads, ensuring secure and efficient file reception.
  • Comprehensive File Validation: Uploaded files undergo thorough validation for type, size (max 10MB, not empty), and format, preventing invalid or malicious uploads.
  • Dynamic Format Detection: The DataSetController now automatically detects the file format based on its extension and MIME type, streamlining the processing workflow (a rough detection sketch follows this list).
  • New API Endpoint for Supported Formats: A new GET /api/v1/dataset/formats endpoint has been added, allowing clients to query the currently supported file formats and their descriptions.
  • Automatic File Cleanup: Uploaded files are automatically removed from the server after successful processing or in case of an error, ensuring system hygiene and preventing accumulation of temporary files.
  • Breaking Change: Upload Endpoint: The dataset upload endpoint has changed from GET /upload to POST /upload and now requires file submission via multipart/form-data instead of a hardcoded path.
  • Enhanced Type Safety: New TypeScript types and interfaces (FileFormat, UploadedFileMetadata, etc.) have been introduced to improve code clarity and maintainability.
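The detection code itself is not quoted in this conversation, so the following is only a rough sketch of extension-based detection. It assumes the FileFormat enum values are the lowercase extensions (as suggested by the formats endpoint response), and the detectFormat helper name is hypothetical.

    import * as path from 'path'

    // Assumed shape of the FileFormat enum referenced in this PR.
    enum FileFormat {
      CSV = 'csv',
      XLSX = 'xlsx',
      TSV = 'tsv',
      JSON = 'json',
      PDF = 'pdf',
      TXT = 'txt',
    }

    // Hypothetical helper: map a filename's extension to a FileFormat, or null if unsupported.
    function detectFormat(originalname: string): FileFormat | null {
      const ext = path.extname(originalname).toLowerCase().replace('.', '')
      return (Object.values(FileFormat) as string[]).includes(ext)
        ? (ext as FileFormat)
        : null
    }

    // detectFormat('sample-qa.tsv') === FileFormat.TSV; detectFormat('notes.docx') === null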


@gemini-code-assist bot left a comment

Code Review

This pull request introduces significant and valuable functionality by adding support for multiple file formats (PDF, TXT, JSON, TSV) for dataset uploads. The implementation is well-structured, with new utility classes for file loading, multer configuration for uploads, and validation, which greatly enhances the RAG system's flexibility.

My review focuses on improving error handling, security, and performance. I've suggested changes to:

  • Prevent potential information leaks in error responses.
  • Use asynchronous file operations to avoid blocking the event loop.
  • Refine the API response for supported formats to ensure it accurately reflects the current implementation.
  • Improve error wrapping to preserve stack traces for easier debugging.
  • Consolidate validation logic to avoid redundancy and ensure consistent error handling.

Overall, this is a great addition to the project. Addressing these points will make the new functionality more robust and secure.

Comment on lines +33 to +35
if (fs.existsSync(filePath)) {
fs.unlinkSync(filePath)
}

Severity: medium

Using fs.unlinkSync is synchronous and blocks the Node.js event loop. In a server environment, this can degrade performance, especially under load. It's recommended to use the asynchronous version, fs.promises.unlink, to avoid blocking. This comment also applies to the other uses of unlinkSync in this file (lines 49, 60, and 71).

Suggested change
if (fs.existsSync(filePath)) {
fs.unlinkSync(filePath)
}
if (fs.existsSync(filePath)) {
await fs.promises.unlink(filePath)
}

Comment on lines +73 to +78
SendResponse.error(
res,
'Failed to upload and process dataset',
500,
error.message
)

Severity: medium

Exposing raw error messages to the client can be a security risk, as it might leak sensitive information about the application's internals (e.g., file paths, library issues). It's better to log the detailed error on the server for debugging and send a generic error message to the client.

Suggested change
SendResponse.error(
res,
'Failed to upload and process dataset',
500,
error.message
)
console.error('Failed to upload and process dataset:', error);
SendResponse.error(
res,
'Failed to upload and process dataset',
500
)

Comment on lines +90 to +101
const formats = Object.values(FileFormat)
SendResponse.success(res, 'Supported file formats', {
formats,
description: {
csv: 'Comma-separated values file',
xlsx: 'Excel spreadsheet (not yet fully supported)',
tsv: 'Tab-separated values file',
json: 'JSON file with Q&A pairs',
pdf: 'PDF document',
txt: 'Plain text file',
},
})

Severity: medium

The getSupportedFormats endpoint currently lists 'xlsx' as a supported format, but the implementation in DataSet.fileLoader.ts throws an error because it's not yet implemented. This can be misleading for API consumers. It's better to remove 'xlsx' from the list of supported formats until it is fully functional.

      const formats = Object.values(FileFormat).filter(
        (f) => f !== FileFormat.XLSX
      )
      SendResponse.success(res, 'Supported file formats', {
        formats,
        description: {
          csv: 'Comma-separated values file',
          tsv: 'Tab-separated values file',
          json: 'JSON file with Q&A pairs',
          pdf: 'PDF document',
          txt: 'Plain text file',
        },
      })

Comment on lines +193 to +195
} catch (error) {
throw new Error(`Failed to load file: ${error}`)
}

Severity: medium

Wrapping the caught error object directly in a new Error constructor will stringify it (often to [object Object]), losing valuable information like the original error's stack trace and type. To preserve this information for better debugging, you should re-throw the original error or create a new error that includes the original error's message.

Suggested change
} catch (error) {
throw new Error(`Failed to load file: ${error}`)
}
} catch (error: any) {
throw new Error(`Failed to load file: ${error.message}`)
}

Comment on lines +35 to +94
public validateFileUpload = (
req: Request,
res: Response,
next: NextFunction
): void => {
try {
// Check if file exists in request
if (!req.file) {
SendResponse.error(res, 'No file uploaded', 400)
return
}

const file = req.file

// Validate file size
if (file.size > this.MAX_FILE_SIZE) {
SendResponse.error(
res,
`File size exceeds maximum limit of ${this.MAX_FILE_SIZE / (1024 * 1024)}MB`,
400
)
return
}

// Validate file size (minimum)
if (file.size === 0) {
SendResponse.error(res, 'Uploaded file is empty', 400)
return
}

// Validate file extension
const fileExtension = file.originalname
.toLowerCase()
.substring(file.originalname.lastIndexOf('.'))

if (!this.ALLOWED_EXTENSIONS.includes(fileExtension)) {
SendResponse.error(
res,
`Invalid file extension. Allowed extensions: ${this.ALLOWED_EXTENSIONS.join(', ')}`,
400
)
return
}

// Validate MIME type
if (!this.ALLOWED_MIME_TYPES.includes(file.mimetype)) {
SendResponse.error(
res,
`Invalid file type. Allowed types: ${this.ALLOWED_MIME_TYPES.join(', ')}`,
400
)
return
}

// All validations passed
next()
} catch (error: any) {
SendResponse.error(res, 'File validation failed', 400, error.message)
}
}

Severity: medium

There appears to be redundant validation logic here. The multer configuration in DataSet.multer.ts already handles file size and MIME type validation through its limits and fileFilter options. When multer rejects a file, it raises an error that will likely bypass this middleware, preventing these custom error messages from being sent and leading to inconsistent error handling.

It's recommended to centralize this validation. A good approach is to let multer perform the checks and then use a dedicated error-handling middleware for multer errors to format and send user-friendly responses. This would remove the need for the redundant checks in this validateFileUpload function.
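If the team takes that approach, a minimal Express error-handling middleware for multer errors might look like the sketch below. The SendResponse import path is an assumption, and the MulterError codes should be double-checked against the multer documentation.

    import { NextFunction, Request, Response } from 'express'
    import multer from 'multer'
    import { SendResponse } from '../utils/SendResponse' // assumed import path

    // Mounted after the dataset routes, e.g. app.use(handleMulterError)
    export function handleMulterError(
      err: unknown,
      req: Request,
      res: Response,
      next: NextFunction
    ): void {
      if (err instanceof multer.MulterError) {
        // Translate multer's own rejections into the API's usual error shape.
        const message =
          err.code === 'LIMIT_FILE_SIZE'
            ? 'File size exceeds maximum limit of 10MB'
            : `File upload rejected: ${err.code}`
        SendResponse.error(res, message, 400)
        return
      }
      next(err)
    }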

Copilot AI left a comment

Pull request overview

This PR adds comprehensive multi-format file upload support to the LocalMind RAG system, expanding beyond CSV to include PDF, TXT, JSON, and TSV formats. The implementation introduces a robust file handling architecture with validation, automatic format detection, and clean separation of concerns.

Key Changes:

  • Implemented FileLoaderUtils class with format-specific document loaders for CSV, PDF, TXT, JSON, and TSV files
  • Added multer middleware for handling multipart/form-data file uploads with size limits and type validation
  • Introduced comprehensive TypeScript type definitions for file formats and processing results

Reviewed changes

Copilot reviewed 12 out of 13 changed files in this pull request and generated 14 comments.

Summary per file:

  • DataSet.type.ts: Defines TypeScript enums and interfaces for file formats, Q&A pairs, validation errors, and upload metadata
  • DataSet.fileLoader.ts: Implements format detection and file loading logic for all supported formats using LangChain loaders
  • DataSet.multer.ts: Configures multer with disk storage, file filtering, and 10MB size limits for secure file uploads
  • DataSet.validator.ts: Provides middleware for validating file uploads by extension, MIME type, and size constraints
  • DataSet.controller.ts: Updates upload handler to support multiple formats with automatic detection and adds formats endpoint
  • DataSet.routes.ts: Changes upload endpoint from GET to POST with multer and validation middleware chain
  • README.md: Comprehensive documentation covering usage, formats, validation rules, and testing examples
  • sample-qa.*: Test sample files demonstrating expected format for each supported file type
  • package.json: Adds pdf-parse, multer, and @types/multer dependencies
Files not reviewed (1)
  • LocalMind-Backend/package-lock.json: Language not supported


Comment on lines +8 to +18
/**
* @route POST /api/v1/dataset/upload
* @desc Upload and process a dataset file (CSV, PDF, TXT, JSON, TSV)
* @access Public (add authentication if needed)
*/
router.post(
'/upload',
upload.single('file'), // 'file' is the field name for the uploaded file
DataSetValidator.validateFileUpload,
DataSetController.uploadDataSet
)
Copilot AI Jan 9, 2026

The route is marked as "Public" in the comment, but file upload endpoints typically require authentication to prevent abuse. Consider adding authentication middleware to protect this endpoint from unauthorized uploads, DoS attacks, or malicious file uploads.
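If the project does not have an auth layer yet, even a simple shared-key middleware in the route chain would help; the header name and environment variable below are assumptions, not part of this PR.

    import { NextFunction, Request, Response } from 'express'

    // Hypothetical guard: require a shared API key before accepting uploads.
    export function requireApiKey(req: Request, res: Response, next: NextFunction): void {
      const provided = req.header('x-api-key')
      if (!provided || provided !== process.env.DATASET_UPLOAD_API_KEY) {
        res.status(401).json({ success: false, message: 'Unauthorized' })
        return
      }
      next()
    }

    // router.post('/upload', requireApiKey, upload.single('file'), DataSetValidator.validateFileUpload, DataSetController.uploadDataSet)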

Comment on lines +65 to +87
// Validate file extension
const fileExtension = file.originalname
.toLowerCase()
.substring(file.originalname.lastIndexOf('.'))

if (!this.ALLOWED_EXTENSIONS.includes(fileExtension)) {
SendResponse.error(
res,
`Invalid file extension. Allowed extensions: ${this.ALLOWED_EXTENSIONS.join(', ')}`,
400
)
return
}

// Validate MIME type
if (!this.ALLOWED_MIME_TYPES.includes(file.mimetype)) {
SendResponse.error(
res,
`Invalid file type. Allowed types: ${this.ALLOWED_MIME_TYPES.join(', ')}`,
400
)
return
}
Copilot AI Jan 9, 2026

The file validation only checks MIME type and extension, but doesn't validate file content. Malicious users could upload files with correct extensions but harmful content. Consider adding content validation, especially for PDF and JSON files, to ensure they are well-formed and don't contain malicious payloads.
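As an illustration of the idea (not code from this PR), JSON uploads could be checked for well-formedness and PDFs for the %PDF- magic bytes before the heavier loading step:

    import * as fs from 'fs'

    // Hypothetical check: reject .json uploads whose content is not parseable JSON.
    async function isWellFormedJson(filePath: string): Promise<boolean> {
      try {
        JSON.parse(await fs.promises.readFile(filePath, 'utf-8'))
        return true
      } catch {
        return false
      }
    }

    // Hypothetical check: a real PDF starts with the '%PDF-' magic bytes.
    async function looksLikePdf(filePath: string): Promise<boolean> {
      const handle = await fs.promises.open(filePath, 'r')
      try {
        const header = Buffer.alloc(5)
        await handle.read(header, 0, 5, 0)
        return header.toString('utf-8') === '%PDF-'
      } finally {
        await handle.close()
      }
    }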

Comment on lines +16 to +21
filename: (req, file, cb) => {
// Generate unique filename with timestamp
const uniqueSuffix = Date.now() + '-' + Math.round(Math.random() * 1e9)
const ext = path.extname(file.originalname)
const basename = path.basename(file.originalname, ext)
cb(null, `${basename}-${uniqueSuffix}${ext}`)
Copilot AI Jan 9, 2026

The filename generation uses Math.random() which is not cryptographically secure. For file uploads that may contain sensitive data, consider using crypto.randomBytes() or crypto.randomUUID() to generate more secure unique identifiers that are harder to predict.
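A possible variant of the stored-filename logic using crypto.randomUUID() (built into node:crypto since Node 14.17) is sketched below; the helper name is hypothetical.

    import * as crypto from 'crypto'
    import * as path from 'path'

    // Sketch: build an unpredictable stored filename for an uploaded file.
    function secureStoredFilename(originalname: string): string {
      const ext = path.extname(originalname)
      const basename = path.basename(originalname, ext)
      return `${basename}-${Date.now()}-${crypto.randomUUID()}${ext}`
    }

    // In the multer config: filename: (req, file, cb) => cb(null, secureStoredFilename(file.originalname))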

Comment on lines +95 to +100
xlsx: 'Excel spreadsheet (not yet fully supported)',
tsv: 'Tab-separated values file',
json: 'JSON file with Q&A pairs',
pdf: 'PDF document',
txt: 'Plain text file',
},
Copilot AI Jan 9, 2026

The description states "Excel spreadsheet (not yet fully supported)" which is misleading. The XLSX format is included in the enum and allowed MIME types, but the implementation explicitly throws an error. Either remove XLSX from the supported formats list until it's implemented, or provide partial support with clear documentation about limitations.

Comment on lines +69 to +72
// Clean up uploaded file on error
if (req.file && fs.existsSync(req.file.path)) {
fs.unlinkSync(req.file.path)
}
Copilot AI Jan 9, 2026

If the file cleanup fails (e.g., due to permissions or file locks), the error is silently ignored. Consider logging cleanup failures so administrators can identify and resolve issues with orphaned files that couldn't be deleted.

import * as fs from 'fs'

// Ensure uploads directory exists
const uploadsDir = path.join(process.cwd(), 'uploads', 'datasets')
Copilot AI Jan 9, 2026

The uploads directory path is created using process.cwd() which can be fragile if the working directory changes at runtime. Consider using __dirname or a configuration-based approach to ensure the uploads directory is always relative to the application root, regardless of where the process is started from.

Suggested change
const uploadsDir = path.join(process.cwd(), 'uploads', 'datasets')
const uploadsDir = path.resolve(
__dirname,
'..',
'..',
'..',
'..',
'..',
'uploads',
'datasets'
)

Comment on lines +33 to +72
if (fs.existsSync(filePath)) {
fs.unlinkSync(filePath)
}
SendResponse.error(
res,
'Unsupported file format. Supported formats: CSV, PDF, TXT, JSON, TSV',
400
)
return
}

// Load documents from file
const documents = await FileLoaderUtils.loadFile(filePath, fileFormat)

if (!documents || documents.length === 0) {
if (fs.existsSync(filePath)) {
fs.unlinkSync(filePath)
}
SendResponse.error(res, 'No data found in the uploaded file', 400)
return
}

const loader = new CSVLoader(filePath)
const documents = await loader.load()
// Process the dataset
const processedData = await DataSetService.Prepate_DataSet(documents)

const Prepare_dataSet = await DataSetService.Prepate_DataSet(documents)
// Clean up uploaded file after processing
if (fs.existsSync(filePath)) {
fs.unlinkSync(filePath)
}

SendResponse.success(
res,
'Dataset uploaded and processed successfully',
JSON.parse(Prepare_dataSet)
`Dataset uploaded and processed successfully (Format: ${fileFormat.toUpperCase()})`,
JSON.parse(processedData)
)
} catch (error: any) {
SendResponse.error(res, 'Failed to upload dataset', 500, error)
// Clean up uploaded file on error
if (req.file && fs.existsSync(req.file.path)) {
fs.unlinkSync(req.file.path)
}
Copilot AI Jan 9, 2026

The file cleanup logic is duplicated in multiple places (lines 33-35, 48-50, 59-61, 70-72). Consider extracting this into a helper function like cleanupFile(filePath: string) to reduce code duplication and ensure consistent cleanup behavior across all code paths.
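A possible shape for such a helper (a sketch, not code from the PR), which also logs failures instead of silently ignoring them:

    import * as fs from 'fs'

    // Hypothetical helper: best-effort removal of an uploaded file, with logging on failure.
    export async function cleanupFile(filePath: string): Promise<void> {
      try {
        if (fs.existsSync(filePath)) {
          await fs.promises.unlink(filePath)
        }
      } catch (err) {
        console.error(`Failed to clean up uploaded file ${filePath}:`, err)
      }
    }

    // Usage in the controller: await cleanupFile(req.file.path)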

Comment on lines +6 to +10
const uploadsDir = path.join(process.cwd(), 'uploads', 'datasets')
if (!fs.existsSync(uploadsDir)) {
fs.mkdirSync(uploadsDir, { recursive: true })
}

Copilot AI Jan 9, 2026

The uploaded files are stored in uploads/datasets directory but there's no mechanism to clean up old files. If files fail during processing or users upload many files, this could lead to disk space issues. Consider implementing a cleanup job or using temporary storage with TTL (time-to-live) to automatically remove old uploaded files.

Suggested change
const uploadsDir = path.join(process.cwd(), 'uploads', 'datasets')
if (!fs.existsSync(uploadsDir)) {
fs.mkdirSync(uploadsDir, { recursive: true })
}
const uploadsDir = path.join(process.cwd(), 'uploads', 'datasets')
// TTL (time-to-live) for uploaded files (e.g., 24 hours)
const UPLOAD_TTL_MS = 24 * 60 * 60 * 1000
// Interval at which to run cleanup (e.g., every hour)
const CLEANUP_INTERVAL_MS = 60 * 60 * 1000
async function cleanupOldUploads() {
try {
// If the directory does not exist, nothing to clean up
if (!fs.existsSync(uploadsDir)) {
return
}
const now = Date.now()
const files = await fs.promises.readdir(uploadsDir)
await Promise.all(
files.map(async (file) => {
const filePath = path.join(uploadsDir, file)
try {
const stat = await fs.promises.stat(filePath)
if (stat.isFile() && now - stat.mtimeMs > UPLOAD_TTL_MS) {
await fs.promises.unlink(filePath)
}
} catch {
// Ignore errors for individual files to avoid impacting the main process
}
})
)
} catch {
// Swallow errors from the cleanup job to avoid crashing the application
}
}
if (!fs.existsSync(uploadsDir)) {
fs.mkdirSync(uploadsDir, { recursive: true })
}
// Run an initial cleanup on startup
cleanupOldUploads()
// Schedule periodic cleanup of old uploaded files
const cleanupInterval = setInterval(cleanupOldUploads, CLEANUP_INTERVAL_MS)
// In Node.js environments, unref prevents the timer from keeping the event loop alive
if (typeof (cleanupInterval as any).unref === 'function') {
;(cleanupInterval as any).unref()
}

export interface ValidationError {
row?: number
fieldName: string
error_message: string
Copilot AI Jan 9, 2026

The field name error_message uses snake_case which is inconsistent with the TypeScript convention of using camelCase for property names. Consider renaming to errorMessage for consistency with other fields like fieldName.

Suggested change
error_message: string
errorMessage: string

Comment on lines +201 to +202
public getFileMetadata(filePath: string): UploadedFileMetadata {
const stats = fs.statSync(filePath)
Copilot AI Jan 9, 2026

Using synchronous file system operations (fs.readFileSync, fs.statSync) can block the Node.js event loop. Consider using async alternatives (fs.promises.readFile, fs.promises.stat) with await for better performance in the async methods loadTXT, loadJSON, and getFileMetadata.

Suggested change
public getFileMetadata(filePath: string): UploadedFileMetadata {
const stats = fs.statSync(filePath)
public async getFileMetadata(filePath: string): Promise<UploadedFileMetadata> {
const stats = await fs.promises.stat(filePath)
