Skip to content

added script to carry out sanity in elevate-prod#615

Open
Prajwal17Tunerlabs wants to merge 1 commit intomainfrom
chore/add-script
Open

added script to carry out sanity in elevate-prod#615
Prajwal17Tunerlabs wants to merge 1 commit intomainfrom
chore/add-script

Conversation

@Prajwal17Tunerlabs
Copy link
Collaborator

@Prajwal17Tunerlabs Prajwal17Tunerlabs commented Oct 29, 2025

Description

These recommendations are intended to promote code quality and team communication during software development. They cover a variety of topics, including ensuring that pull requests are submitted to the correct branch, documenting new methods, preserving consistency across many services, and avoiding typical blunders like accessing APIs or DB queries within loops. Sensitive data should not be uploaded, and changes to environment variables or database models should be executed consistently. Teams may work more effectively and develop higher-quality software by adhering to these standards.

Type of change

Please choose appropriate options.

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • Enhancement (additive changes to improve performance)
  • This change requires a documentation update

Checklist

  • It's critical to avoid making needless file modifications in contributions, such as adding new lines, console logs, or additional spaces, to guarantee cleaner and more efficient code. Furthermore, eliminating unnecessary imports from a file might enhance code readability and efficiency.
  • Ensure that the pull request is assigned to the right base branch and that the development branch name contains the JIRA Task Id. Furthermore, each commit message should include the JIRA Task Id in the manner "ED-100: message".
  • Only update packages if it is mentioned and authorized in the design document, and make sure that you have the required permissions.
  • Avoid making API and database queries inside a loop as it can lead to performance issues and slow down the system.
  • When calling another function inside a given function, add comments explaining the purpose and meaning of the passed arguments and expected return values.
  • If adding a blank argument in a function, add a comment explaining the reason for the blank argument.
  • Before submitting a pull request, do a self-review of your code to ensure there are no conflicts with the base branch and all comments have been addressed.
  • Before merging a pull request, it's important to have other team members review it to catch any potential errors or issues
  • To maintain code integrity, it's important to remove all related changes when removing code during a code review.
  • If new constants, endpoints, or utility functions are introduced, it is important to check if they already exist in the service to avoid any duplication.
  • Whenever a new environment variable is added to a service, it's important to ensure that the necessary changes are made to related files such as ".env.sample" and "envVariables.js" to maintain consistency and avoid errors. Additionally, the new environment variable should be added to the devops repository to ensure that it is properly documented and accessible to the team.
  • When adding a new function to a service, it is important to document it with relevant information such as the name, parameters, and return value in a consistent format across all services. Additionally, if there are any changes to the API response, ensure that the documentation in the controllers is updated accordingly.
  • Write a clear and concise commit message that describes the changes made.
  • Maintain consistent function signature and code across all services when adding a function to multiple services. Implement changes to database models in all services that use the same model.
  • Use only let and const. Do not use var.
  • Make common functions for repetitive code blocks.
  • Avoid uploading sensitive information such as secret tokens or passwords in pull requests to ensure data security.
  • Maintain consistent indentation and spacing throughout the code.

Summary by CodeRabbit

  • Chores
    • Standardized organization ID formatting across collections, including normalization of spacing and casing for improved consistency.

@coderabbitai
Copy link

coderabbitai bot commented Oct 29, 2025

Walkthrough

A new Node.js migration script normalizes orgId values across MongoDB collections. The script handles both single orgId fields and orgIds arrays, applying transformations like trimming, lowercasing, and replacing whitespace with underscores. It includes proper MongoDB connection management and error handling.

Changes

Cohort / File(s) Summary
New migration script
migrations/correctOrgIdValuesInCollections/correctOrgIdValuesInCollections.js
Adds migration script that normalizes orgId and orgIds values by trimming, lowercasing, and replacing whitespace with underscores across multiple collections. Distinguishes between single orgId fields and orgIds arrays, iterates through documents, and updates those with changes. Includes MongoDB connection lifecycle management and error handling.

Sequence Diagram(s)

sequenceDiagram
    participant Script as Migration Script
    participant DB as MongoDB
    
    Script->>Script: Load environment variables from .env
    Script->>DB: Connect to MongoDB
    
    loop For each collection
        alt Collection has single orgId
            Script->>DB: Query all documents
            DB-->>Script: Return documents
            Script->>Script: Normalize orgId (trim, lowercase, replace spaces)
            Script->>DB: Update documents with original orgId value
        else Collection has orgIds array
            Script->>DB: Query documents with orgIds array
            DB-->>Script: Return documents
            Script->>Script: Normalize each orgId in array
            alt Changes detected
                Script->>DB: Update document with normalized orgIds
            end
        end
    end
    
    Script->>DB: Close connection
    Script->>Script: Log completion
Loading

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~12 minutes

  • Verify normalization logic (trim, lowercase, replace spaces) is consistently applied to both single orgId and orgIds array cases
  • Confirm MongoDB query patterns and update operations are correct for each collection type
  • Review error handling and connection cleanup logic

Poem

A rabbit hops through data fields so wide,
Trimming spaces, case unified with pride,
Underscores dance where whitespace did hide,
OrgIds normalized, standing side by side, 🐰✨

Pre-merge checks and finishing touches

❌ Failed checks (1 warning, 1 inconclusive)
Check name Status Explanation Resolution
Description Check ⚠️ Warning The PR description is identical to the description template and has not been customized for this specific pull request. The template consists entirely of generic contributing guidelines and recommendations, with no actual description of the changes introduced in this PR. Additionally, no type of change has been selected (all checkboxes remain unchecked), and there is no information about the new migration script being added. A proper description should explain what the script does, why it was added, and whether any of the checklist items apply to this specific change. The current submission is essentially an empty template rather than a substantive description.
Title Check ❓ Inconclusive The PR title "added script to carry out sanity in elevate-prod" uses vague terminology that lacks clarity. While the title does reference that a script was added, the phrase "carry out sanity" is generic and non-descriptive, failing to communicate what the script actually does. The raw summary indicates the script's purpose is to normalize orgId fields across MongoDB collections by trimming, lowercasing, and replacing whitespace, but the title does not convey this specific functionality. The vague nature of the language makes it difficult for teammates scanning history to understand the primary purpose of the change.
✅ Passed checks (1 passed)
Check name Status Explanation
Docstring Coverage ✅ Passed Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.
✨ Finishing touches
  • 📝 Generate docstrings
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment
  • Commit unit tests in branch chore/add-script

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 4

🧹 Nitpick comments (4)
migrations/correctOrgIdValuesInCollections/correctOrgIdValuesInCollections.js (4)

40-40: Remove unnecessary semicolon.

The leading semicolon before the IIFE is unnecessary.

-;(async () => {
+(async () => {

As per static analysis hints.


84-89: Improve error handling and exit codes.

The error handling is generic and doesn't provide context about which operation failed. Additionally, the script doesn't set exit codes for success/failure, making it harder to use in automated workflows.

 		console.log('\n Normalization completed!')
-		connection.close()
+		await connection.close()
+		process.exit(0)
 	} catch (error) {
-		console.error('Error occurred:', error)
-		connection.close()
+		console.error('Error occurred during migration:', error)
+		try {
+			await connection.close()
+		} catch (closeError) {
+			console.error('Error closing connection:', closeError)
+		}
+		process.exit(1)
 	}
 })()

46-62: Consider batch operations for better performance.

The script performs individual updateMany operations for each distinct orgId value. For collections with many distinct orgIds, this could be slow.

Consider using bulkWrite for better performance:

// Collect all updates first
const bulkOps = []
for (const originalOrgId of orgIds) {
	if (typeof originalOrgId !== 'string') continue
	const normalizedOrgId = normalizeOrgId(originalOrgId)
	if (normalizedOrgId !== originalOrgId) {
		bulkOps.push({
			updateMany: {
				filter: { orgId: originalOrgId },
				update: { $set: { orgId: normalizedOrgId } }
			}
		})
	}
}

// Execute in batch
if (bulkOps.length > 0) {
	const result = await db.collection(collectionName).bulkWrite(bulkOps)
	console.log(`Updated ${result.modifiedCount} documents in ${collectionName}`)
}

70-81: Add progress logging for long-running operations.

For collections with many documents, the script provides no progress indication. This makes it difficult to determine if the script is stalled or still processing.

 			const cursor = db.collection(collectionName).find({ orgIds: { $exists: true, $type: 'array' } })
+			let processedCount = 0
+			let updatedCount = 0
 			try {
 				while (await cursor.hasNext()) {
 					const doc = await cursor.next()
+					processedCount++
 					const originalOrgIds = doc.orgIds
 					const normalizedOrgIds = originalOrgIds.map((id) => (typeof id === 'string' ? normalizeOrgId(id) : id))
 
 					if (!_.isEqual(originalOrgIds, normalizedOrgIds)) {
 						await db
 							.collection(collectionName)
 							.updateOne({ _id: doc._id }, { $set: { orgIds: normalizedOrgIds } })
+						updatedCount++
-						console.log(`Updated _id: ${doc._id} in ${collectionName}`)
+						if (updatedCount % 100 === 0) {
+							console.log(`Progress: Processed ${processedCount} documents, updated ${updatedCount}`)
+						}
 					}
 				}
+				console.log(`Completed ${collectionName}: Processed ${processedCount} documents, updated ${updatedCount}`)
 			} finally {
 				await cursor.close()
 			}
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 18bc79e and e22730e.

📒 Files selected for processing (1)
  • migrations/correctOrgIdValuesInCollections/correctOrgIdValuesInCollections.js (1 hunks)
🧰 Additional context used
🧬 Code graph analysis (1)
migrations/correctOrgIdValuesInCollections/correctOrgIdValuesInCollections.js (1)
documentation/1.0.0/native/scripts/linux/import_forms.js (1)
  • collectionName (6-6)
🪛 ESLint
migrations/correctOrgIdValuesInCollections/correctOrgIdValuesInCollections.js

[error] 40-40: Unnecessary semicolon.

(no-extra-semi)

🔇 Additional comments (2)
migrations/correctOrgIdValuesInCollections/correctOrgIdValuesInCollections.js (2)

35-38: Normalization logic is correct.

The transformation (trim, lowercase, replace spaces with underscores) is deterministic and appropriate for normalizing organization identifiers. Callers correctly check typeof before invoking this function.


11-12: Update MongoDB driver to a supported version for better security and maintainability.

The current latest version is 6.20.0, while the project uses 3.7.4 from 2021. Although the current driver supports 4.2+ servers, the older version limits access to modern features and security updates. Consider upgrading to a newer major version—at minimum to a v4 or v5 release—to align with maintained versions of the driver. No critical vulnerabilities were found for 3.7.4, but newer versions receive ongoing security patches and compatibility improvements.

Comment on lines +7 to +16
const path = require('path')
const rootPath = path.join(__dirname, '../../')
require('dotenv').config({ path: rootPath + '/.env' })

const MongoClient = require('mongodb').MongoClient
const _ = require('lodash')

const mongoUrl = process.env.MONGODB_URL
const dbName = mongoUrl.split('/').pop()
const url = mongoUrl.split(dbName)[0]
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Add validation for required environment variables.

The script proceeds to parse MONGODB_URL without verifying it exists or is valid. If the environment variable is missing or malformed, the error messages will be cryptic.

Apply this diff to add validation:

 const mongoUrl = process.env.MONGODB_URL
+if (!mongoUrl) {
+	console.error('Error: MONGODB_URL environment variable is required')
+	process.exit(1)
+}
+
 const dbName = mongoUrl.split('/').pop()
+if (!dbName) {
+	console.error('Error: Unable to parse database name from MONGODB_URL')
+	process.exit(1)
+}
+
 const url = mongoUrl.split(dbName)[0]
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
const path = require('path')
const rootPath = path.join(__dirname, '../../')
require('dotenv').config({ path: rootPath + '/.env' })
const MongoClient = require('mongodb').MongoClient
const _ = require('lodash')
const mongoUrl = process.env.MONGODB_URL
const dbName = mongoUrl.split('/').pop()
const url = mongoUrl.split(dbName)[0]
const path = require('path')
const rootPath = path.join(__dirname, '../../')
require('dotenv').config({ path: rootPath + '/.env' })
const MongoClient = require('mongodb').MongoClient
const _ = require('lodash')
const mongoUrl = process.env.MONGODB_URL
if (!mongoUrl) {
console.error('Error: MONGODB_URL environment variable is required')
process.exit(1)
}
const dbName = mongoUrl.split('/').pop()
if (!dbName) {
console.error('Error: Unable to parse database name from MONGODB_URL')
process.exit(1)
}
const url = mongoUrl.split(dbName)[0]
🤖 Prompt for AI Agents
In migrations/correctOrgIdValuesInCollections/correctOrgIdValuesInCollections.js
around lines 7 to 16, the code uses process.env.MONGODB_URL and immediately
parses it without verifying it's present or well-formed; add validation that
MONGODB_URL is defined and non-empty, check it matches a basic MongoDB URL
pattern (e.g., starts with mongodb:// or mongodb+srv://), and if invalid log a
clear error message and exit with non-zero status; also defensively guard the
dbName/url split (e.g., parse via URL-like logic or verify split results) and
log the original value when failing so developers see the bad input.

Comment on lines +14 to +16
const mongoUrl = process.env.MONGODB_URL
const dbName = mongoUrl.split('/').pop()
const url = mongoUrl.split(dbName)[0]
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

URL parsing is fragile and error-prone.

The manual string splitting to extract the database name assumes a specific URL format and will fail with query parameters, URL-encoded characters, or complex connection strings (e.g., replica sets, SRV records).

Use the MongoDB driver's built-in URL parsing instead:

-const mongoUrl = process.env.MONGODB_URL
-const dbName = mongoUrl.split('/').pop()
-const url = mongoUrl.split(dbName)[0]
+const mongoUrl = process.env.MONGODB_URL

Then update the connection code (line 41-42):

-const connection = await MongoClient.connect(url, { useNewUrlParser: true, useUnifiedTopology: true })
-const db = connection.db(dbName)
+const connection = await MongoClient.connect(mongoUrl, { useNewUrlParser: true, useUnifiedTopology: true })
+const db = connection.db()

When called without arguments, connection.db() uses the database specified in the connection string.

Committable suggestion skipped: line range outside the PR's diff.

Comment on lines +44 to +62
try {
// Process single orgId collections
for (const collectionName of singleOrgIdCollections) {
console.log(`\nProcessing collection: ${collectionName}`)
const orgIds = await db.collection(collectionName).distinct('orgId')
console.log('orgids:------------------->', orgIds)
for (const originalOrgId of orgIds) {
if (typeof originalOrgId !== 'string') continue
const normalizedOrgId = normalizeOrgId(originalOrgId)
if (normalizedOrgId !== originalOrgId) {
const result = await db
.collection(collectionName)
.updateMany({ orgId: originalOrgId }, { $set: { orgId: normalizedOrgId } })
console.log(
`Updated ${result.modifiedCount} documents in ${collectionName} from '${originalOrgId}' to '${normalizedOrgId}'`
)
}
}
}
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🔴 Critical

Critical: Non-idempotent writes without safeguards on production data.

This migration script directly modifies production data with several risks:

  1. No dry-run mode - Changes are applied immediately without preview
  2. No transaction support - Partial failures leave data in an inconsistent state
  3. Not safely re-runnable - While the normalization itself is idempotent, the script could fail mid-execution
  4. No backup mechanism - No way to rollback if issues occur
  5. Performance concern - Individual updateMany calls instead of bulk operations

Consider these improvements:

  1. Add a dry-run mode:
+const DRY_RUN = process.env.DRY_RUN === 'true'
+
 ;(async () => {
 	const connection = await MongoClient.connect(url, { useNewUrlParser: true, useUnifiedTopology: true })
 	const db = connection.db(dbName)
+	
+	if (DRY_RUN) {
+		console.log('Running in DRY-RUN mode. No changes will be made.\n')
+	}
  1. Add safety confirmation and document the need for backup:
// At the start of the script, add:
console.log('WARNING: This script will modify data in the following collections:')
console.log([...singleOrgIdCollections, ...multiOrgIdsCollections].join(', '))
console.log('\nEnsure you have a backup before proceeding.')
console.log('Set DRY_RUN=true to preview changes without applying them.\n')
  1. Make updates conditional in dry-run mode:
 				if (normalizedOrgId !== originalOrgId) {
+					if (DRY_RUN) {
+						const count = await db.collection(collectionName).countDocuments({ orgId: originalOrgId })
+						console.log(
+							`[DRY-RUN] Would update ${count} documents in ${collectionName} from '${originalOrgId}' to '${normalizedOrgId}'`
+						)
+					} else {
 						const result = await db
 							.collection(collectionName)
 							.updateMany({ orgId: originalOrgId }, { $set: { orgId: normalizedOrgId } })
 						console.log(
 							`Updated ${result.modifiedCount} documents in ${collectionName} from '${originalOrgId}' to '${normalizedOrgId}'`
 						)
+					}
 				}
🤖 Prompt for AI Agents
In migrations/correctOrgIdValuesInCollections/correctOrgIdValuesInCollections.js
around lines 44 to 62, the migration performs immediate, potentially
non-idempotent writes to production without safeguards; add a DRY_RUN mode that
logs intended updates instead of applying them, require an explicit confirmation
flag (or env var) to run destructive actions and print a clear backup warning at
startup, wrap modifications in MongoDB transactions where supported to avoid
partial commits, convert per-org updateMany calls into batched bulkWrite
operations to improve performance and atomicity, and ensure each update is
conditional (skip when normalizedOrgId equals original) so the script is safely
re-runnable and produces no-op results on subsequent runs.

Comment on lines +69 to +81
const cursor = db.collection(collectionName).find({ orgIds: { $exists: true, $type: 'array' } })
while (await cursor.hasNext()) {
const doc = await cursor.next()
const originalOrgIds = doc.orgIds
const normalizedOrgIds = originalOrgIds.map((id) => (typeof id === 'string' ? normalizeOrgId(id) : id))

if (!_.isEqual(originalOrgIds, normalizedOrgIds)) {
await db
.collection(collectionName)
.updateOne({ _id: doc._id }, { $set: { orgIds: normalizedOrgIds } })
console.log(`Updated _id: ${doc._id} in ${collectionName}`)
}
}
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Resource leak: Cursor is never closed.

The cursor created on line 69 is never explicitly closed. If an error occurs during iteration or after the loop completes, the cursor remains open, potentially leaking resources.

Use try-finally to ensure cursor cleanup:

 		// Process orgIds array in userExtension collection
 		for (const collectionName of multiOrgIdsCollections) {
 			console.log(`\nProcessing array orgIds in collection: ${collectionName}`)
 
 			// Get all documents
 			const cursor = db.collection(collectionName).find({ orgIds: { $exists: true, $type: 'array' } })
-			while (await cursor.hasNext()) {
-				const doc = await cursor.next()
-				const originalOrgIds = doc.orgIds
-				const normalizedOrgIds = originalOrgIds.map((id) => (typeof id === 'string' ? normalizeOrgId(id) : id))
-
-				if (!_.isEqual(originalOrgIds, normalizedOrgIds)) {
-					await db
-						.collection(collectionName)
-						.updateOne({ _id: doc._id }, { $set: { orgIds: normalizedOrgIds } })
-					console.log(`Updated _id: ${doc._id} in ${collectionName}`)
+			try {
+				while (await cursor.hasNext()) {
+					const doc = await cursor.next()
+					const originalOrgIds = doc.orgIds
+					const normalizedOrgIds = originalOrgIds.map((id) => (typeof id === 'string' ? normalizeOrgId(id) : id))
+
+					if (!_.isEqual(originalOrgIds, normalizedOrgIds)) {
+						await db
+							.collection(collectionName)
+							.updateOne({ _id: doc._id }, { $set: { orgIds: normalizedOrgIds } })
+						console.log(`Updated _id: ${doc._id} in ${collectionName}`)
+					}
 				}
+			} finally {
+				await cursor.close()
 			}
 		}
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
const cursor = db.collection(collectionName).find({ orgIds: { $exists: true, $type: 'array' } })
while (await cursor.hasNext()) {
const doc = await cursor.next()
const originalOrgIds = doc.orgIds
const normalizedOrgIds = originalOrgIds.map((id) => (typeof id === 'string' ? normalizeOrgId(id) : id))
if (!_.isEqual(originalOrgIds, normalizedOrgIds)) {
await db
.collection(collectionName)
.updateOne({ _id: doc._id }, { $set: { orgIds: normalizedOrgIds } })
console.log(`Updated _id: ${doc._id} in ${collectionName}`)
}
}
const cursor = db.collection(collectionName).find({ orgIds: { $exists: true, $type: 'array' } })
try {
while (await cursor.hasNext()) {
const doc = await cursor.next()
const originalOrgIds = doc.orgIds
const normalizedOrgIds = originalOrgIds.map((id) => (typeof id === 'string' ? normalizeOrgId(id) : id))
if (!_.isEqual(originalOrgIds, normalizedOrgIds)) {
await db
.collection(collectionName)
.updateOne({ _id: doc._id }, { $set: { orgIds: normalizedOrgIds } })
console.log(`Updated _id: ${doc._id} in ${collectionName}`)
}
}
} finally {
await cursor.close()
}
🤖 Prompt for AI Agents
In migrations/correctOrgIdValuesInCollections/correctOrgIdValuesInCollections.js
around lines 69 to 81, the Mongo cursor opened with .find(...) is never closed
which can leak resources; wrap the cursor iteration in a try-finally block and
call await cursor.close() in the finally (or check cursor && !cursor.closed then
close) so the cursor is always cleaned up even on errors; keep the existing
logic inside the try and perform the update/console.log as before, ensuring any
thrown error still results in cursor.close() being invoked.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant