added script to carry out sanity in elevate-prod#615
added script to carry out sanity in elevate-prod#615Prajwal17Tunerlabs wants to merge 1 commit intomainfrom
Conversation
WalkthroughA new Node.js migration script normalizes orgId values across MongoDB collections. The script handles both single orgId fields and orgIds arrays, applying transformations like trimming, lowercasing, and replacing whitespace with underscores. It includes proper MongoDB connection management and error handling. Changes
Sequence Diagram(s)sequenceDiagram
participant Script as Migration Script
participant DB as MongoDB
Script->>Script: Load environment variables from .env
Script->>DB: Connect to MongoDB
loop For each collection
alt Collection has single orgId
Script->>DB: Query all documents
DB-->>Script: Return documents
Script->>Script: Normalize orgId (trim, lowercase, replace spaces)
Script->>DB: Update documents with original orgId value
else Collection has orgIds array
Script->>DB: Query documents with orgIds array
DB-->>Script: Return documents
Script->>Script: Normalize each orgId in array
alt Changes detected
Script->>DB: Update document with normalized orgIds
end
end
end
Script->>DB: Close connection
Script->>Script: Log completion
Estimated code review effort🎯 2 (Simple) | ⏱️ ~12 minutes
Poem
Pre-merge checks and finishing touches❌ Failed checks (1 warning, 1 inconclusive)
✅ Passed checks (1 passed)
✨ Finishing touches
🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 4
🧹 Nitpick comments (4)
migrations/correctOrgIdValuesInCollections/correctOrgIdValuesInCollections.js (4)
40-40: Remove unnecessary semicolon.The leading semicolon before the IIFE is unnecessary.
-;(async () => { +(async () => {As per static analysis hints.
84-89: Improve error handling and exit codes.The error handling is generic and doesn't provide context about which operation failed. Additionally, the script doesn't set exit codes for success/failure, making it harder to use in automated workflows.
console.log('\n Normalization completed!') - connection.close() + await connection.close() + process.exit(0) } catch (error) { - console.error('Error occurred:', error) - connection.close() + console.error('Error occurred during migration:', error) + try { + await connection.close() + } catch (closeError) { + console.error('Error closing connection:', closeError) + } + process.exit(1) } })()
46-62: Consider batch operations for better performance.The script performs individual
updateManyoperations for each distinct orgId value. For collections with many distinct orgIds, this could be slow.Consider using
bulkWritefor better performance:// Collect all updates first const bulkOps = [] for (const originalOrgId of orgIds) { if (typeof originalOrgId !== 'string') continue const normalizedOrgId = normalizeOrgId(originalOrgId) if (normalizedOrgId !== originalOrgId) { bulkOps.push({ updateMany: { filter: { orgId: originalOrgId }, update: { $set: { orgId: normalizedOrgId } } } }) } } // Execute in batch if (bulkOps.length > 0) { const result = await db.collection(collectionName).bulkWrite(bulkOps) console.log(`Updated ${result.modifiedCount} documents in ${collectionName}`) }
70-81: Add progress logging for long-running operations.For collections with many documents, the script provides no progress indication. This makes it difficult to determine if the script is stalled or still processing.
const cursor = db.collection(collectionName).find({ orgIds: { $exists: true, $type: 'array' } }) + let processedCount = 0 + let updatedCount = 0 try { while (await cursor.hasNext()) { const doc = await cursor.next() + processedCount++ const originalOrgIds = doc.orgIds const normalizedOrgIds = originalOrgIds.map((id) => (typeof id === 'string' ? normalizeOrgId(id) : id)) if (!_.isEqual(originalOrgIds, normalizedOrgIds)) { await db .collection(collectionName) .updateOne({ _id: doc._id }, { $set: { orgIds: normalizedOrgIds } }) + updatedCount++ - console.log(`Updated _id: ${doc._id} in ${collectionName}`) + if (updatedCount % 100 === 0) { + console.log(`Progress: Processed ${processedCount} documents, updated ${updatedCount}`) + } } } + console.log(`Completed ${collectionName}: Processed ${processedCount} documents, updated ${updatedCount}`) } finally { await cursor.close() }
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
migrations/correctOrgIdValuesInCollections/correctOrgIdValuesInCollections.js(1 hunks)
🧰 Additional context used
🧬 Code graph analysis (1)
migrations/correctOrgIdValuesInCollections/correctOrgIdValuesInCollections.js (1)
documentation/1.0.0/native/scripts/linux/import_forms.js (1)
collectionName(6-6)
🪛 ESLint
migrations/correctOrgIdValuesInCollections/correctOrgIdValuesInCollections.js
[error] 40-40: Unnecessary semicolon.
(no-extra-semi)
🔇 Additional comments (2)
migrations/correctOrgIdValuesInCollections/correctOrgIdValuesInCollections.js (2)
35-38: Normalization logic is correct.The transformation (trim, lowercase, replace spaces with underscores) is deterministic and appropriate for normalizing organization identifiers. Callers correctly check
typeofbefore invoking this function.
11-12: Update MongoDB driver to a supported version for better security and maintainability.The current latest version is 6.20.0, while the project uses 3.7.4 from 2021. Although the current driver supports 4.2+ servers, the older version limits access to modern features and security updates. Consider upgrading to a newer major version—at minimum to a v4 or v5 release—to align with maintained versions of the driver. No critical vulnerabilities were found for 3.7.4, but newer versions receive ongoing security patches and compatibility improvements.
| const path = require('path') | ||
| const rootPath = path.join(__dirname, '../../') | ||
| require('dotenv').config({ path: rootPath + '/.env' }) | ||
|
|
||
| const MongoClient = require('mongodb').MongoClient | ||
| const _ = require('lodash') | ||
|
|
||
| const mongoUrl = process.env.MONGODB_URL | ||
| const dbName = mongoUrl.split('/').pop() | ||
| const url = mongoUrl.split(dbName)[0] |
There was a problem hiding this comment.
Add validation for required environment variables.
The script proceeds to parse MONGODB_URL without verifying it exists or is valid. If the environment variable is missing or malformed, the error messages will be cryptic.
Apply this diff to add validation:
const mongoUrl = process.env.MONGODB_URL
+if (!mongoUrl) {
+ console.error('Error: MONGODB_URL environment variable is required')
+ process.exit(1)
+}
+
const dbName = mongoUrl.split('/').pop()
+if (!dbName) {
+ console.error('Error: Unable to parse database name from MONGODB_URL')
+ process.exit(1)
+}
+
const url = mongoUrl.split(dbName)[0]📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| const path = require('path') | |
| const rootPath = path.join(__dirname, '../../') | |
| require('dotenv').config({ path: rootPath + '/.env' }) | |
| const MongoClient = require('mongodb').MongoClient | |
| const _ = require('lodash') | |
| const mongoUrl = process.env.MONGODB_URL | |
| const dbName = mongoUrl.split('/').pop() | |
| const url = mongoUrl.split(dbName)[0] | |
| const path = require('path') | |
| const rootPath = path.join(__dirname, '../../') | |
| require('dotenv').config({ path: rootPath + '/.env' }) | |
| const MongoClient = require('mongodb').MongoClient | |
| const _ = require('lodash') | |
| const mongoUrl = process.env.MONGODB_URL | |
| if (!mongoUrl) { | |
| console.error('Error: MONGODB_URL environment variable is required') | |
| process.exit(1) | |
| } | |
| const dbName = mongoUrl.split('/').pop() | |
| if (!dbName) { | |
| console.error('Error: Unable to parse database name from MONGODB_URL') | |
| process.exit(1) | |
| } | |
| const url = mongoUrl.split(dbName)[0] |
🤖 Prompt for AI Agents
In migrations/correctOrgIdValuesInCollections/correctOrgIdValuesInCollections.js
around lines 7 to 16, the code uses process.env.MONGODB_URL and immediately
parses it without verifying it's present or well-formed; add validation that
MONGODB_URL is defined and non-empty, check it matches a basic MongoDB URL
pattern (e.g., starts with mongodb:// or mongodb+srv://), and if invalid log a
clear error message and exit with non-zero status; also defensively guard the
dbName/url split (e.g., parse via URL-like logic or verify split results) and
log the original value when failing so developers see the bad input.
| const mongoUrl = process.env.MONGODB_URL | ||
| const dbName = mongoUrl.split('/').pop() | ||
| const url = mongoUrl.split(dbName)[0] |
There was a problem hiding this comment.
URL parsing is fragile and error-prone.
The manual string splitting to extract the database name assumes a specific URL format and will fail with query parameters, URL-encoded characters, or complex connection strings (e.g., replica sets, SRV records).
Use the MongoDB driver's built-in URL parsing instead:
-const mongoUrl = process.env.MONGODB_URL
-const dbName = mongoUrl.split('/').pop()
-const url = mongoUrl.split(dbName)[0]
+const mongoUrl = process.env.MONGODB_URLThen update the connection code (line 41-42):
-const connection = await MongoClient.connect(url, { useNewUrlParser: true, useUnifiedTopology: true })
-const db = connection.db(dbName)
+const connection = await MongoClient.connect(mongoUrl, { useNewUrlParser: true, useUnifiedTopology: true })
+const db = connection.db()When called without arguments, connection.db() uses the database specified in the connection string.
Committable suggestion skipped: line range outside the PR's diff.
| try { | ||
| // Process single orgId collections | ||
| for (const collectionName of singleOrgIdCollections) { | ||
| console.log(`\nProcessing collection: ${collectionName}`) | ||
| const orgIds = await db.collection(collectionName).distinct('orgId') | ||
| console.log('orgids:------------------->', orgIds) | ||
| for (const originalOrgId of orgIds) { | ||
| if (typeof originalOrgId !== 'string') continue | ||
| const normalizedOrgId = normalizeOrgId(originalOrgId) | ||
| if (normalizedOrgId !== originalOrgId) { | ||
| const result = await db | ||
| .collection(collectionName) | ||
| .updateMany({ orgId: originalOrgId }, { $set: { orgId: normalizedOrgId } }) | ||
| console.log( | ||
| `Updated ${result.modifiedCount} documents in ${collectionName} from '${originalOrgId}' to '${normalizedOrgId}'` | ||
| ) | ||
| } | ||
| } | ||
| } |
There was a problem hiding this comment.
Critical: Non-idempotent writes without safeguards on production data.
This migration script directly modifies production data with several risks:
- No dry-run mode - Changes are applied immediately without preview
- No transaction support - Partial failures leave data in an inconsistent state
- Not safely re-runnable - While the normalization itself is idempotent, the script could fail mid-execution
- No backup mechanism - No way to rollback if issues occur
- Performance concern - Individual
updateManycalls instead of bulk operations
Consider these improvements:
- Add a dry-run mode:
+const DRY_RUN = process.env.DRY_RUN === 'true'
+
;(async () => {
const connection = await MongoClient.connect(url, { useNewUrlParser: true, useUnifiedTopology: true })
const db = connection.db(dbName)
+
+ if (DRY_RUN) {
+ console.log('Running in DRY-RUN mode. No changes will be made.\n')
+ }- Add safety confirmation and document the need for backup:
// At the start of the script, add:
console.log('WARNING: This script will modify data in the following collections:')
console.log([...singleOrgIdCollections, ...multiOrgIdsCollections].join(', '))
console.log('\nEnsure you have a backup before proceeding.')
console.log('Set DRY_RUN=true to preview changes without applying them.\n')- Make updates conditional in dry-run mode:
if (normalizedOrgId !== originalOrgId) {
+ if (DRY_RUN) {
+ const count = await db.collection(collectionName).countDocuments({ orgId: originalOrgId })
+ console.log(
+ `[DRY-RUN] Would update ${count} documents in ${collectionName} from '${originalOrgId}' to '${normalizedOrgId}'`
+ )
+ } else {
const result = await db
.collection(collectionName)
.updateMany({ orgId: originalOrgId }, { $set: { orgId: normalizedOrgId } })
console.log(
`Updated ${result.modifiedCount} documents in ${collectionName} from '${originalOrgId}' to '${normalizedOrgId}'`
)
+ }
}🤖 Prompt for AI Agents
In migrations/correctOrgIdValuesInCollections/correctOrgIdValuesInCollections.js
around lines 44 to 62, the migration performs immediate, potentially
non-idempotent writes to production without safeguards; add a DRY_RUN mode that
logs intended updates instead of applying them, require an explicit confirmation
flag (or env var) to run destructive actions and print a clear backup warning at
startup, wrap modifications in MongoDB transactions where supported to avoid
partial commits, convert per-org updateMany calls into batched bulkWrite
operations to improve performance and atomicity, and ensure each update is
conditional (skip when normalizedOrgId equals original) so the script is safely
re-runnable and produces no-op results on subsequent runs.
| const cursor = db.collection(collectionName).find({ orgIds: { $exists: true, $type: 'array' } }) | ||
| while (await cursor.hasNext()) { | ||
| const doc = await cursor.next() | ||
| const originalOrgIds = doc.orgIds | ||
| const normalizedOrgIds = originalOrgIds.map((id) => (typeof id === 'string' ? normalizeOrgId(id) : id)) | ||
|
|
||
| if (!_.isEqual(originalOrgIds, normalizedOrgIds)) { | ||
| await db | ||
| .collection(collectionName) | ||
| .updateOne({ _id: doc._id }, { $set: { orgIds: normalizedOrgIds } }) | ||
| console.log(`Updated _id: ${doc._id} in ${collectionName}`) | ||
| } | ||
| } |
There was a problem hiding this comment.
Resource leak: Cursor is never closed.
The cursor created on line 69 is never explicitly closed. If an error occurs during iteration or after the loop completes, the cursor remains open, potentially leaking resources.
Use try-finally to ensure cursor cleanup:
// Process orgIds array in userExtension collection
for (const collectionName of multiOrgIdsCollections) {
console.log(`\nProcessing array orgIds in collection: ${collectionName}`)
// Get all documents
const cursor = db.collection(collectionName).find({ orgIds: { $exists: true, $type: 'array' } })
- while (await cursor.hasNext()) {
- const doc = await cursor.next()
- const originalOrgIds = doc.orgIds
- const normalizedOrgIds = originalOrgIds.map((id) => (typeof id === 'string' ? normalizeOrgId(id) : id))
-
- if (!_.isEqual(originalOrgIds, normalizedOrgIds)) {
- await db
- .collection(collectionName)
- .updateOne({ _id: doc._id }, { $set: { orgIds: normalizedOrgIds } })
- console.log(`Updated _id: ${doc._id} in ${collectionName}`)
+ try {
+ while (await cursor.hasNext()) {
+ const doc = await cursor.next()
+ const originalOrgIds = doc.orgIds
+ const normalizedOrgIds = originalOrgIds.map((id) => (typeof id === 'string' ? normalizeOrgId(id) : id))
+
+ if (!_.isEqual(originalOrgIds, normalizedOrgIds)) {
+ await db
+ .collection(collectionName)
+ .updateOne({ _id: doc._id }, { $set: { orgIds: normalizedOrgIds } })
+ console.log(`Updated _id: ${doc._id} in ${collectionName}`)
+ }
}
+ } finally {
+ await cursor.close()
}
}📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| const cursor = db.collection(collectionName).find({ orgIds: { $exists: true, $type: 'array' } }) | |
| while (await cursor.hasNext()) { | |
| const doc = await cursor.next() | |
| const originalOrgIds = doc.orgIds | |
| const normalizedOrgIds = originalOrgIds.map((id) => (typeof id === 'string' ? normalizeOrgId(id) : id)) | |
| if (!_.isEqual(originalOrgIds, normalizedOrgIds)) { | |
| await db | |
| .collection(collectionName) | |
| .updateOne({ _id: doc._id }, { $set: { orgIds: normalizedOrgIds } }) | |
| console.log(`Updated _id: ${doc._id} in ${collectionName}`) | |
| } | |
| } | |
| const cursor = db.collection(collectionName).find({ orgIds: { $exists: true, $type: 'array' } }) | |
| try { | |
| while (await cursor.hasNext()) { | |
| const doc = await cursor.next() | |
| const originalOrgIds = doc.orgIds | |
| const normalizedOrgIds = originalOrgIds.map((id) => (typeof id === 'string' ? normalizeOrgId(id) : id)) | |
| if (!_.isEqual(originalOrgIds, normalizedOrgIds)) { | |
| await db | |
| .collection(collectionName) | |
| .updateOne({ _id: doc._id }, { $set: { orgIds: normalizedOrgIds } }) | |
| console.log(`Updated _id: ${doc._id} in ${collectionName}`) | |
| } | |
| } | |
| } finally { | |
| await cursor.close() | |
| } |
🤖 Prompt for AI Agents
In migrations/correctOrgIdValuesInCollections/correctOrgIdValuesInCollections.js
around lines 69 to 81, the Mongo cursor opened with .find(...) is never closed
which can leak resources; wrap the cursor iteration in a try-finally block and
call await cursor.close() in the finally (or check cursor && !cursor.closed then
close) so the cursor is always cleaned up even on errors; keep the existing
logic inside the try and perform the update/console.log as before, ensuring any
thrown error still results in cursor.close() being invoked.
Description
These recommendations are intended to promote code quality and team communication during software development. They cover a variety of topics, including ensuring that pull requests are submitted to the correct branch, documenting new methods, preserving consistency across many services, and avoiding typical blunders like accessing APIs or DB queries within loops. Sensitive data should not be uploaded, and changes to environment variables or database models should be executed consistently. Teams may work more effectively and develop higher-quality software by adhering to these standards.
Type of change
Please choose appropriate options.
Checklist
Summary by CodeRabbit