-
Notifications
You must be signed in to change notification settings - Fork 0
[FEAT] Generic Testing Harness + "Take All" tests #44
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
📝 WalkthroughWalkthroughReplaces single-run integration flow with a TEST_CASES-driven sequential runner, adds shared Playwright/VSCode test harness and helpers, introduces a "Take All" E2E test and bulk-accept UI buttons, removes the 'both' resolution option, and adds uniform "take all" resolver behavior. Changes
Sequence DiagramssequenceDiagram
actor User
participant UI as ConflictResolver UI
participant State as Component State
participant FS as Notebook File System
User->>UI: Click "All Base"/"All Current"/"All Incoming"
UI->>State: handleAcceptAll(choice)
loop each conflict row
State->>State: set resolvedContent & choice (or delete)
end
UI->>User: render updated resolved rows
User->>UI: Click "Apply Resolution"
UI->>FS: write resolved notebook
FS-->>User: notebook updated
sequenceDiagram
participant Runner as Integration Runner
participant FS as File System
participant Server as Merge-NB Server
participant VS as VS Code Test Runner
participant Browser as Playwright Browser
Runner->>Runner: iterate TEST_CASES
loop per test case
Runner->>FS: createMergeConflictRepo(base,current,incoming)
Runner->>Server: waitForServer(portFile)
Server-->>Runner: port available
Runner->>VS: launch test module with per-case config
VS->>Browser: run Playwright test (open workspace, UI interactions)
Browser->>FS: read/write notebook files
VS-->>Runner: report pass/fail
Runner->>FS: cleanup workspace & config
end
Runner->>Runner: exit with aggregated result
Estimated code review effort🎯 4 (Complex) | ⏱️ ~60 minutes Possibly related PRs
Poem
🚥 Pre-merge checks | ✅ 2 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing touches
🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 3
🤖 Fix all issues with AI agents
In `@src/tests/takeAllButtons.test.ts`:
- Around line 270-312: The capture logic assumes "current" was chosen when
building expectedCells for conflict rows; instead inspect the resolved-cell DOM
to derive which side was applied (e.g., check for an active/selected class or
which resolved-choice button is toggled on the resolvedCell element) and set a
variable like resolvedSide = 'current' | 'incoming' (default to 'current' or
'code' if not found); then replace direct calls to getColumnCellType(row,
'current') and getColumnCell(row, 'current', i) with getColumnCellType(row,
resolvedSide) and getColumnCell(row, resolvedSide, i) so metadata and cellType
come from the actual chosen side before pushing expectedCells. Ensure you still
treat deleted/textarea branches the same and handle absence of a referenceCell
safely.
In `@src/tests/testHelpers.ts`:
- Around line 51-59: The checkHealth function currently reads only
res.statusCode and never consumes the HTTP response body, which can keep the
socket open; modify checkHealth (the Promise created in checkHealth) so that in
the http.get callback you consume or drain the response stream (e.g., call
res.resume() or read data and listen for 'end') before resolving based on
res.statusCode, and ensure existing error and timeout handlers still resolve
false and cleanup the request; this guarantees the response body is drained and
the socket is freed.
- Around line 161-177: The mtime check in waitForFileWrite is hardcoded to 10
seconds which can miss a file written before polling begins; update
waitForFileWrite to record an anchor timestamp at function start (e.g., const
start = Date.now()) and replace the fixed 10000 mtime window with a comparison
using the provided timeoutMs (for example treat the file as recent if
stat.mtimeMs >= start - 0 or Date.now() - stat.mtimeMs <= timeoutMs), so the
mtime window is relative to when waitForFileWrite was called and honors the
timeoutMs parameter; adjust logic in waitForFileWrite (and keep maxAttempts
behavior) and ensure you reference the start timestamp and timeoutMs when
checking stat.mtimeMs.
🧹 Nitpick comments (8)
src/web/client/ConflictResolver.tsx (2)
272-298:handleAcceptAllsilently overwrites user-edited resolutions.The
MergeRowcomponent (seeMergeRow.tsxlines 106–116) shows a warning when a user has modified content and tries to switch branches.handleAcceptAllbypasses that guard by directly settingchoicesstate, discarding any per-row edits without confirmation. Consider showing a confirmation dialog when any row has been manually edited before applying the bulk action.
461-504: Consider extracting inline button styles to CSS classes.All three "Accept All" buttons share the same structure with only
backgroundandbordervarying by side. Extracting to CSS classes (e.g.,.btn-accept-all-base,.btn-accept-all-current,.btn-accept-all-incoming) would reduce JSX verbosity and keep styling consistent with the rest of the component.src/tests/testHelpers.ts (1)
84-88:getCellSourceduplicatesnormalizeCellSourcefromsrc/notebookUtils.ts.
normalizeCellSource(insrc/notebookUtils.tslines 14–19) already handlesstring | string[]normalization.getCellSourceadds a null-cell guard on top. Consider reusingnormalizeCellSourcehere to avoid divergence (e.g., if join separator ever changes).src/tests/runIntegrationTest.ts (1)
85-153: Shared config file path could collide if multiple runners execute in parallel.
configPathat Line 88 uses a fixed well-known path inos.tmpdir(). If two runner processes ever execute simultaneously (e.g., parallel CI jobs on the same machine), they would overwrite each other's config. Consider including a unique identifier (e.g., PID or random suffix) in the config filename.This is fine for the current sequential design but worth noting for future-proofing.
src/tests/vscodeIntegration.test.ts (2)
237-242:getColumnCellis duplicated between this file andtakeAllButtons.test.ts.This local function (lines 237–242) has an identical implementation extracted as a module-level function in
takeAllButtons.test.ts(lines 30–35). Consider moving it intotestHelpers.tsalongside the other shared utilities to keep the DRY principle.
49-49:executeCommandis fire-and-forget — command failure won't surface clearly.
vscode.commands.executeCommand('merge-nb.findConflicts')is not awaited. If the command fails (e.g., extension not activated), the only symptom is awaitForServertimeout with a generic message. Consider awaiting the promise and wrapping in a try/catch for a clearer failure diagnostic.src/tests/takeAllButtons.test.ts (2)
432-471: Cross-check against04_current.ipynb— hardcoded path depends on test runner convention.The path
../../test(Line 435) and filename04_current.ipynb(Line 436) are derived from__dirname, which assumes the compiled output structure matches the source layout. This is fragile if the build output directory changes. Consider reading the notebook path from the test config or theTestCaseDefinitionto keep it in sync with the runner.Also, the cross-check treats mismatches as warnings (Line 464–468), which is appropriate given cell-matching may rearrange sources.
106-162: Setup boilerplate is nearly identical tovscodeIntegration.test.ts.Lines 112–162 (config read, port file cleanup, command execution, server/session wait, browser launch, header check) are duplicated almost verbatim from
vscodeIntegration.test.tslines 31–72. Consider extracting a sharedsetupTestBrowser(configPath)helper intotestHelpers.tsthat returns{ page, browser, serverPort, conflictFile }.
| /** | ||
| * Wait for the conflict file to be written (mtime within last 10 seconds). | ||
| * Returns true if confirmed, false otherwise. | ||
| */ | ||
| export async function waitForFileWrite(filePath: string, fs: typeof import('fs'), timeoutMs = 10000): Promise<boolean> { | ||
| const maxAttempts = Math.ceil(timeoutMs / 500); | ||
| for (let attempt = 0; attempt < maxAttempts; attempt++) { | ||
| await new Promise(r => setTimeout(r, 500)); | ||
| try { | ||
| const stat = fs.statSync(filePath); | ||
| if (Date.now() - stat.mtimeMs < 10000) { | ||
| return true; | ||
| } | ||
| } catch { /* continue */ } | ||
| } | ||
| return false; | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
waitForFileWrite mtime window is hardcoded to 10 seconds regardless of timeoutMs.
The 10-second mtime check on Line 171 means if the file was written more than 10 seconds before the first poll, it will never be detected even though the overall timeout hasn't expired. Consider making the mtime window relative to when waitForFileWrite was first called, or document this constraint.
🛡️ Proposed fix — anchor to call time
export async function waitForFileWrite(filePath: string, fs: typeof import('fs'), timeoutMs = 10000): Promise<boolean> {
+ const callTime = Date.now();
const maxAttempts = Math.ceil(timeoutMs / 500);
for (let attempt = 0; attempt < maxAttempts; attempt++) {
await new Promise(r => setTimeout(r, 500));
try {
const stat = fs.statSync(filePath);
- if (Date.now() - stat.mtimeMs < 10000) {
+ if (stat.mtimeMs >= callTime) {
return true;
}
} catch { /* continue */ }
}
return false;
}🤖 Prompt for AI Agents
In `@src/tests/testHelpers.ts` around lines 161 - 177, The mtime check in
waitForFileWrite is hardcoded to 10 seconds which can miss a file written before
polling begins; update waitForFileWrite to record an anchor timestamp at
function start (e.g., const start = Date.now()) and replace the fixed 10000
mtime window with a comparison using the provided timeoutMs (for example treat
the file as recent if stat.mtimeMs >= start - 0 or Date.now() - stat.mtimeMs <=
timeoutMs), so the mtime window is relative to when waitForFileWrite was called
and honors the timeoutMs parameter; adjust logic in waitForFileWrite (and keep
maxAttempts behavior) and ensure you reference the start timestamp and timeoutMs
when checking stat.mtimeMs.
Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
🤖 Fix all issues with AI agents
In `@src/tests/testHelpers.ts`:
- Around line 62-83: In getHealthInfo, when the response status is not 200 the
response body is not consumed which leaks the socket; update the non-200 branch
inside the http.get response handler (the callback in getHealthInfo) to drain
the response by calling res.resume() (or otherwise consume the stream) before
resolving null, mirroring the fix used in checkHealth, so the socket is
released.
| /** Get detailed health info from the web server */ | ||
| export function getHealthInfo(port: number): Promise<HealthResponse | null> { | ||
| return new Promise((resolve) => { | ||
| const req = http.get(`http://127.0.0.1:${port}/health`, { timeout: 1000 }, (res) => { | ||
| if (res.statusCode !== 200) { | ||
| resolve(null); | ||
| return; | ||
| } | ||
| let data = ''; | ||
| res.on('data', (chunk: string) => data += chunk); | ||
| res.on('end', () => { | ||
| try { | ||
| resolve(JSON.parse(data)); | ||
| } catch { | ||
| resolve(null); | ||
| } | ||
| }); | ||
| }); | ||
| req.on('error', () => resolve(null)); | ||
| req.on('timeout', () => { req.destroy(); resolve(null); }); | ||
| }); | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Missing res.resume() on the non-200 path leaks the socket.
The same drain fix applied to checkHealth (Line 54) is missing here. When statusCode !== 200, the response body is never consumed, keeping the underlying socket allocated.
🐛 Proposed fix — drain on non-200
const req = http.get(`http://127.0.0.1:${port}/health`, { timeout: 1000 }, (res) => {
if (res.statusCode !== 200) {
+ res.resume();
resolve(null);
return;
}🤖 Prompt for AI Agents
In `@src/tests/testHelpers.ts` around lines 62 - 83, In getHealthInfo, when the
response status is not 200 the response body is not consumed which leaks the
socket; update the non-200 branch inside the http.get response handler (the
callback in getHealthInfo) to drain the response by calling res.resume() (or
otherwise consume the stream) before resolving null, mirroring the fix used in
checkHealth, so the socket is released.
- Updated the conflict resolution logic in resolver.ts to prioritize uniform "take all" actions based on the preferred side (base, current, or incoming). - Added new test cases in runIntegrationTest.ts to cover scenarios for each preferred side action. - Refactored takeAllButtons.test.ts to streamline the testing process and ensure proper verification of resolved cells. - Introduced integrationUtils.ts for shared utility functions related to conflict resolution and UI interactions. - Improved testHelpers.ts to accommodate new parameters for test configurations. - Updated .gitignore to exclude output.txt.
|
@coderabbitai a lot of things have changed since the last time you reviewed the code. Could you review it again? Also, please ensure the resolver.ts change is working as expected or if that needs to be documented better or what |
🧩 Analysis chain
🏁 Script executed: Length of output: 17076 Let me also check the ConflictResolver.tsx to understand how the UI sends resolutions: 🏁 Script executed: Length of output: 9941 🏁 Script executed: Length of output: 6056
Issues Found in resolver.ts1. Type Inconsistency (Line 274)const preferredSide = (uniqueChoices.size === 1
? [...uniqueChoices][0]
: undefined) as ('base' | 'current' | 'incoming' | 'both' | undefined);The type assertion includes
Recommendation: Remove 2. Unclear Fallback Behavior (Lines 323-327)} else if (preferredSide === 'base' || preferredSide === 'current' || preferredSide === 'incoming') {
// For uniform "take all", only include cells that exist on the preferred side.
if (preferredSide === 'base') cellToUse = baseCell;
else if (preferredSide === 'current') cellToUse = currentCell;
else if (preferredSide === 'incoming') cellToUse = incomingCell;
}This fallback triggers for rows without explicit resolutions when a
Questions:
3. Missing DocumentationThe code lacks explanation for:
4. Suggested Documentation ImprovementsHere's how I'd document this section: // Detect uniform "take all" action (e.g., when user clicks "All Base", "All Current", or "All Incoming").
// If all non-delete resolutions choose the same side, we:
// 1. Reorder rows to preserve that side's original cell ordering
// 2. For rows without explicit resolutions, only include cells that exist on that side
const nonDeleteChoices = resolvedRows
.map(r => r.resolution?.choice)
.filter((c): c is 'base' | 'current' | 'incoming' => !!c && c !== 'delete');
const uniqueChoices = new Set(nonDeleteChoices);
const preferredSide = (uniqueChoices.size === 1
? [...uniqueChoices][0]
: undefined) as ('base' | 'current' | 'incoming' | undefined);
let rowsForResolution = resolvedRows;
if (preferredSide === 'base' || preferredSide === 'current' || preferredSide === 'incoming') {
// Reorder rows to match the preferred side's original cell order.
// This ensures the final notebook preserves the structure of the chosen side.
const indexKey = preferredSide === 'base'
? 'baseCellIndex'
: preferredSide === 'current'
? 'currentCellIndex'
: 'incomingCellIndex';
const withIndex = resolvedRows
.filter(r => (r as any)[indexKey] !== undefined)
.sort((a, b) => (a as any)[indexKey] - (b as any)[indexKey]);
const withoutIndex = resolvedRows.filter(r => (r as any)[indexKey] === undefined);
rowsForResolution = [...withIndex, ...withoutIndex];
}Recommendations
I'll now trigger a full review of the PR as requested. 🧠 Learnings used✅ Actions performedFull review triggered. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 2
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
src/resolver.ts (1)
347-349:⚠️ Potential issue | 🟡 MinorEmpty
resolvedContentsilently drops the cell — is this intentional?If a user edits a resolved cell's content to be empty (but did not choose "delete"), this
continuedrops the cell from the output. This could surprise users who intended to keep an empty code cell. The explicitdeletechoice already handles deletion at Line 344. Consider limiting this guard to only thedeletecase, or documenting that empty content is treated as deletion.
🤖 Fix all issues with AI agents
In `@src/resolver.ts`:
- Around line 293-303: preferredSide can become 'both' because nonDeleteChoices
includes 'both', but downstream code only handles 'base'|'current'|'incoming';
update the detection to exclude 'both' or explicitly handle it. Fix by changing
the filter that builds nonDeleteChoices to remove 'both' (so nonDeleteChoices
only contains 'base'|'current'|'incoming') or, if you prefer to support 'both',
add handling for preferredSide === 'both' in the reordering and fallback logic
used with resolvedRows (the same branches that currently check preferredSide for
'base'|'current'|'incoming'); refer to nonDeleteChoices, preferredSide, and the
downstream reorder/fallback code that consumes preferredSide and ensure
consistency with handleAcceptAll which only sends base|current|incoming.
In `@src/tests/takeAllButtons.test.ts`:
- Around line 200-204: The log replacement is using the regex /\\n/g which
matches a literal backslash+n instead of real newline characters; update the two
replace calls where expected.source and actual.source are logged (in the failing
branch that increments sourceMismatches) to use /\n/g so actual newline
characters are escaped in the output (keep the replacement string as '\\n' so
the log shows the two-character sequence). This change should be made in the
test code in takeAllButtons.test.ts wherever expected.source.replace(/\\n/g,
'\\\\n') and actual.source.replace(/\\n/g, '\\\\n') are used.
🧹 Nitpick comments (7)
.gitignore (1)
6-6: Consider a more specific pattern if output.txt is generated in limited locations.The pattern
output.txtwill ignore any file with that name anywhere in the repository tree. If the test output files are only generated at the root level, consider using/output.txt(matches root only), or if they're only in test directories, use a pattern like**/test/**/output.txtorintegration/**/output.txt. This reduces the risk of accidentally ignoring important files with the same name elsewhere.That said, the current pattern is acceptable if output.txt may be generated in multiple locations throughout the repository.
src/web/client/ConflictResolver.tsx (1)
461-504: Consider extracting repeated inline button styles into a shared object or CSS class.All three buttons share the same
fontSize,padding,color, and differ only inbackground/border. This is a minor readability nit — the inline approach is consistent with the rest of the component.src/resolver.ts (1)
313-316:as anybypasses type safety onResolvedRow— use a typed accessor instead.The dynamic property access via
(r as any)[indexKey]is fragile. IfResolvedRowfield names change, the compiler won't catch this. A small helper or inline accessor eliminates the cast.♻️ Proposed refactor — typed index accessor
+ const getCellIndex = (r: import('./web/webTypes').ResolvedRow): number | undefined => { + if (preferredSide === 'base') return r.baseCellIndex; + if (preferredSide === 'current') return r.currentCellIndex; + return r.incomingCellIndex; + }; + - const withIndex = resolvedRows - .filter(r => (r as any)[indexKey] !== undefined) - .sort((a, b) => (a as any)[indexKey] - (b as any)[indexKey]); - const withoutIndex = resolvedRows.filter(r => (r as any)[indexKey] === undefined); + const withIndex = resolvedRows + .filter(r => getCellIndex(r) !== undefined) + .sort((a, b) => getCellIndex(a)! - getCellIndex(b)!); + const withoutIndex = resolvedRows.filter(r => getCellIndex(r) === undefined); rowsForResolution = [...withIndex, ...withoutIndex];src/tests/integrationUtils.ts (2)
15-21:getColumnCellTypedefaults to'code'for absent cells and doesn't account for'raw'cells.If a
rawcell type is present in the notebook, theclassList.contains('code-cell')check would returnfalse, and this function would classify it as'markdown'. This may or may not matter depending on whetherrawcells appear in practice — just something to be aware of.
104-120: Redundant initial poll before the loop body.Line 111 reads the count, then Line 113 immediately reads it again at the start of the first iteration. The initial read is never checked against the exit condition. Consider either checking the initial read or removing it.
♻️ Minor simplification
export async function waitForAllConflictsResolved( page: Page, timeoutMs = 5000, pollMs = 200, ): Promise<{ resolved: number; total: number }> { const start = Date.now(); - let last = await getResolvedCount(page); + let last = { resolved: 0, total: 0 }; while (Date.now() - start < timeoutMs) { last = await getResolvedCount(page); if (last.total > 0 && last.resolved === last.total) { return last; } await new Promise(r => setTimeout(r, pollMs)); } return last; }src/tests/takeAllButtons.test.ts (1)
93-93:executeCommandresult is not awaited — errors from the command are silently lost.If
merge-nb.findConflictsthrows, the error won't propagate. The test will wait for the server timeout and fail with a less informative error. This matches the existing pattern invscodeIntegration.test.ts, so it's consistent, but worth noting.src/tests/runIntegrationTest.ts (1)
40-42: Git config values have extraneous double-quote wrapping.
'"test@mergenb.test"'passes the literal string"test@mergenb.test"(with double-quote characters) toexecSync's shell. This stores the email/name with embedded quotes in git config. Benign for tests but unexpected.♻️ Proposed fix — remove inner quotes
- git(tmpDir, 'config', 'user.email', '"test@mergenb.test"'); - git(tmpDir, 'config', 'user.name', '"MergeNB Test"'); + git(tmpDir, 'config', 'user.email', 'test@mergenb.test'); + git(tmpDir, 'config', 'user.name', 'MergeNB Test');Same applies to commit messages at Lines 47, 55, 62:
- git(tmpDir, 'commit', '-m', '"base"'); + git(tmpDir, 'commit', '-m', 'base');
src/tests/takeAllButtons.test.ts
Outdated
| if (expected.source !== actual.source) { | ||
| sourceMismatches++; | ||
| console.log(`Source mismatch at cell ${i}:`); | ||
| console.log(` Expected: "${expected.source.substring(0, 80).replace(/\\n/g, '\\\\n')}..."`); | ||
| console.log(` Actual: "${actual.source.substring(0, 80).replace(/\\n/g, '\\\\n')}..."`); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Regex escapes literal \n two-char sequences instead of actual newline characters.
/\\n/g matches a literal backslash followed by n in the string, not newline characters (\n). The logged mismatch output will still contain raw newlines making it hard to read. Compare with vscodeIntegration.test.ts Line 390 which correctly uses /\n/g.
🐛 Proposed fix
- console.log(` Expected: "${expected.source.substring(0, 80).replace(/\\n/g, '\\\\n')}..."`);
- console.log(` Actual: "${actual.source.substring(0, 80).replace(/\\n/g, '\\\\n')}..."`);
+ console.log(` Expected: "${expected.source.substring(0, 80).replace(/\n/g, '\\n')}..."`);
+ console.log(` Actual: "${actual.source.substring(0, 80).replace(/\n/g, '\\n')}..."`);🤖 Prompt for AI Agents
In `@src/tests/takeAllButtons.test.ts` around lines 200 - 204, The log replacement
is using the regex /\\n/g which matches a literal backslash+n instead of real
newline characters; update the two replace calls where expected.source and
actual.source are logged (in the failing branch that increments
sourceMismatches) to use /\n/g so actual newline characters are escaped in the
output (keep the replacement string as '\\n' so the log shows the two-character
sequence). This change should be made in the test code in takeAllButtons.test.ts
wherever expected.source.replace(/\\n/g, '\\\\n') and
actual.source.replace(/\\n/g, '\\\\n') are used.
- Removed 'both' from possible cell resolutions
- Tests prior resolutions hold
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 2
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
src/types.ts (1)
103-111:⚠️ Potential issue | 🟡 MinorStale JSDoc:
bothis still documented but no longer in the union.Line 108 describes
both: Include both current and incoming content, but'both'was removed fromResolutionChoice. Update the doc block to match the current union.Proposed fix
/** * Resolution choices for conflict resolution. * - base: Use the base version (pre-merge common ancestor) * - current: Use the current branch version * - incoming: Use the incoming branch version - * - both: Include both current and incoming content * - delete: Remove the cell entirely */ export type ResolutionChoice = 'base' | 'current' | 'incoming' | 'delete';
🤖 Fix all issues with AI agents
In `@src/tests/testHarness.ts`:
- Around line 69-70: The temp-directory used to build portFilePath is
inconsistent with readTestConfig: replace the manual TMPDIR/TMP fallback used
when computing tmpDir/portFilePath with os.tmpdir() so both use the same
platform API; update the code that sets tmpDir (and thus portFilePath) to call
os.tmpdir() directly (keeping options.portFileName or default
'mergenb-server-port') so portFilePath and readTestConfig reference the same
directory.
In `@src/tests/vscodeIntegration.test.ts`:
- Around line 95-97: The parseInt result for testId can be NaN if data-testid is
missing/malformed; update the logic around row.getAttribute('data-testid'),
testId and rowIndex so that after computing rowIndex = parseInt(..., 10) you
check Number.isNaN(rowIndex) and if so set rowIndex = conflictIdx (or another
safe fallback) before using modulo checks (% 7, % 2); ensure the guard handles
empty strings and non-numeric suffixes so downstream branch logic (odd: prefer
current) does not silently get selected for every row.
🧹 Nitpick comments (11)
src/types.ts (1)
179-182:SemanticConflictResolution.choiceduplicates theResolutionChoiceunion inline.This should use the
ResolutionChoicetype alias for consistency, especially since the rest of this PR centralizes that type.Proposed fix
export interface SemanticConflictResolution { conflict: SemanticConflict; - choice: 'base' | 'current' | 'incoming' | 'delete'; + choice: ResolutionChoice; /** The resolved content from the editable text area (source of truth) */ resolvedContent?: string; }src/web/webTypes.ts (2)
184-188:ResolutionChoice | 'base'is redundant —'base'is already inResolutionChoice.
ResolutionChoiceis'base' | 'current' | 'incoming' | 'delete', soResolutionChoice | 'base'simplifies to justResolutionChoice. This looks like a leftover from before the type centralization.Proposed fix
export interface WebResolutionData { type: 'semantic'; resolutions: Array<{ index: number; - choice: ResolutionChoice | 'base'; + choice: ResolutionChoice; customContent?: string; }>;
162-175:BrowserToExtensionMessageuseschoice: stringinstead ofResolutionChoice.Line 168 types the resolution
choiceas a rawstring, bypassing the type safety you've just established everywhere else. Consider narrowing this toResolutionChoicefor consistency and compile-time safety.Proposed fix
export type BrowserToExtensionMessage = | { command: 'resolve'; type: 'semantic'; resolutions: Array<{ index: number; - choice: string; + choice: ResolutionChoice; customContent?: string; }>;src/web/client/types.ts (1)
65-71: Verboseimport('../../types').ResolutionChoicewhen the type is already re-exported in this file.
ResolutionChoiceis re-exported on line 19, so you can reference it directly in these interfaces instead of using the inlineimport(...)syntax. Same applies toNotebookCellusages on lines 29-31, 79-83.Example for ConflictChoice and ResolvedRow
export interface ConflictChoice { index: number; - choice: import('../../types').ResolutionChoice; + choice: ResolutionChoice; resolvedContent: string; } export interface ResolvedRow { - baseCell?: import('../../types').NotebookCell; - currentCell?: import('../../types').NotebookCell; - incomingCell?: import('../../types').NotebookCell; + baseCell?: NotebookCell; + currentCell?: NotebookCell; + incomingCell?: NotebookCell; baseCellIndex?: number; currentCellIndex?: number; incomingCellIndex?: number; resolution?: { - choice: import('../../types').ResolutionChoice; + choice: ResolutionChoice; resolvedContent: string; }; }Also applies to: 77-95
src/web/client/ConflictResolver.tsx (2)
271-301:handleAcceptAllskips already-resolved conflicts — consider a UX hint or override option.The logic is correct and the "respect prior resolutions" behavior matches the commit intent. However, if a user manually resolves a few conflicts, then clicks "All Incoming", the already-resolved ones silently remain unchanged. This could be confusing since the button label suggests all conflicts are affected.
Consider one of:
- A visual indicator (e.g., toast/badge) showing how many conflicts were skipped.
- A modifier behavior (e.g., holding Shift to override all, or a separate "Reset & Accept All" option).
This is not blocking — just a UX polish consideration.
464-507: Bulk-accept buttons look good; minor accessibility note.The buttons properly use CSS variables for theming and include
titleattributes. For improved accessibility, consider addingaria-labelattributes in addition totitle, since screen readers may not always announcetitlecontent.src/tests/testHarness.ts (2)
49-56:readTestConfiglacks error context on failure.If the config file is missing or contains invalid JSON, the raw
fs.readFileSync/JSON.parseerrors won't indicate what the test harness was trying to do. A small wrapper message would save debugging time.💡 Suggested improvement
export function readTestConfig(): TestConfig { const configPath = path.join(os.tmpdir(), 'mergenb-test-config.json'); - return JSON.parse(fs.readFileSync(configPath, 'utf8')); + try { + return JSON.parse(fs.readFileSync(configPath, 'utf8')); + } catch (err) { + throw new Error(`Failed to read test config from ${configPath}: ${err}`); + } }
211-218:JSON.stringifymetadata comparison is key-order sensitive.If the resolver emits metadata with keys in a different insertion order than the reference cell,
JSON.stringifywill produce different strings even when the contents are semantically equal. This is unlikely to bite in practice (both sides typically come from the same JSON parse), but worth noting for future-proofing.src/tests/integrationUtils.ts (3)
5-6:ConflictChoicename shadows the interface insrc/web/client/types.ts.
src/web/client/types.tsexports aConflictChoiceinterface (withindex,choice,resolvedContentfields), while this file exports a simple union typeMergeSide | 'delete'under the same name. They serve different purposes and live in different modules, but the naming overlap could confuse developers navigating the codebase. Consider a more specific name likeResolutionActionif this causes confusion down the line.
117-132: Redundant initialgetResolvedCountcall on Line 123.The
lastvariable is initialized on Line 123 but immediately overwritten on Line 125 in the first loop iteration. The initial call can be removed.💡 Minor cleanup
const start = Date.now(); - let last = await getResolvedCount(page); + let last = { resolved: 0, total: 0 }; while (Date.now() - start < timeoutMs) { last = await getResolvedCount(page);
134-238:collectExpectedCellsFromUIis comprehensive but lengthy.The function correctly handles identical rows, resolved conflicts, deleted conflicts, and unresolved (skipped) rows. The
conflictIdxtracking is properly decoupled from the overall row index.One thing to keep in mind: Line 229 hardcodes
hasOutputs: falsefor all conflict cells, meaning output content is never verified for resolved conflicts. This aligns with a known gap noted in prior learnings.Consider extracting the identical-row block (Lines 155–174) and the conflict-row block (Lines 176–234) into named helpers to reduce cognitive load, but this is optional.
| const testId = await row.getAttribute('data-testid') || ''; | ||
| const rowIndex = parseInt(testId.replace('conflict-row-', '').replace('row-', ''), 10); | ||
|
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
parseInt on a missing/malformed data-testid silently produces NaN.
If the data-testid attribute is absent or doesn't contain a numeric suffix, parseInt returns NaN. The downstream modulo checks (% 7, % 2) all evaluate to false/NaN, silently funneling every such row into the "odd: prefer current" branch. Consider adding a guard or falling back to conflictIdx when parsing fails.
🛡️ Suggested guard
const testId = await row.getAttribute('data-testid') || '';
- const rowIndex = parseInt(testId.replace('conflict-row-', '').replace('row-', ''), 10);
+ const parsed = parseInt(testId.replace('conflict-row-', '').replace('row-', ''), 10);
+ const rowIndex = Number.isNaN(parsed) ? conflictIdx : parsed;🤖 Prompt for AI Agents
In `@src/tests/vscodeIntegration.test.ts` around lines 95 - 97, The parseInt
result for testId can be NaN if data-testid is missing/malformed; update the
logic around row.getAttribute('data-testid'), testId and rowIndex so that after
computing rowIndex = parseInt(..., 10) you check Number.isNaN(rowIndex) and if
so set rowIndex = conflictIdx (or another safe fallback) before using modulo
checks (% 7, % 2); ensure the guard handles empty strings and non-numeric
suffixes so downstream branch logic (odd: prefer current) does not silently get
selected for every row.
Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 2
🤖 Fix all issues with AI agents
In `@src/tests/testHarness.ts`:
- Around line 128-134: The current code reads and JSON-parses the notebook even
when waitForFileWrite(conflictFile, fs, options.writeTimeoutMs) returns false,
risking stale data; update the function (the block handling fileWritten in
testHarness.ts) to not proceed on false — either throw a descriptive Error
(e.g., "Timed out waiting for file write for {conflictFile}") or return an
explicit failure value (e.g., null/undefined or a result object with
success:false) so callers can detect and handle the timeout instead of parsing
stale notebookContent; ensure references to conflictFile, fs,
options.writeTimeoutMs and waitForFileWrite are used so the change is applied in
the correct location.
- Around line 82-106: After launching Chromium with chromium.launch (creating
the browser variable), wrap all subsequent post-launch steps (page = await
browser.newPage(); page.goto(sessionUrl); page.waitForSelector('.header-title');
the header check, sleeps, etc.) in a try/catch so that any thrown error will
first await browser.close() before rethrowing; specifically, keep the
successful-return path returning the browser and page as-is, but on any failure
in the sequence (page.goto, page.waitForSelector, title check, etc.) call await
browser.close() in the catch and then throw the original error so the browser
process is not leaked.
🧹 Nitpick comments (4)
src/tests/testHarness.ts (4)
137-154: Outputs content is not captured for comparison.
buildExpectedCellsFromNotebookrecordshasOutputs(boolean) but does not capture the actual outputs content. This meansassertNotebookMatchescan only verify whether outputs exist, not whether they are correct. This aligns with a known verification gap in the test suite. Consider capturing and comparing output content if output correctness matters for "Take All" scenarios.Based on learnings: "The integration test has verification gaps: it never directly checks the outputs array content (only execution_count), uses getCellSource() normalization which masks source format inconsistencies (string vs array)."
211-218:JSON.stringifymetadata comparison is property-order–dependent.If the expected and actual metadata objects have the same keys/values but in different property order (common after serialization round-trips), this comparison produces a false mismatch. Use a deep-equal utility or sort keys before comparing.
💡 Quick fix — sort keys for stable comparison
if (options.compareMetadata) { const expectedMetadata = expected.metadata || {}; const actualMetadata = actual.metadata || {}; - if (JSON.stringify(expectedMetadata) !== JSON.stringify(actualMetadata)) { + const stableStringify = (obj: Record<string, unknown>) => + JSON.stringify(obj, Object.keys(obj).sort()); + if (stableStringify(expectedMetadata) !== stableStringify(actualMetadata)) { metadataMismatches++; console.log(`Metadata mismatch at cell ${i}`); } }Note: This only sorts top-level keys. For deeply nested metadata, consider a recursive key-sort or a library like
deep-equal.
16-24: Consider adding ateardown/cleanuphelper forConflictSession.Callers must remember to close the browser (and potentially the page) in a
finallyblock. A smallcloseSession(session: ConflictSession)helper would centralize cleanup and prevent browser process leaks across multiple test files.💡 Example helper
export async function closeSession(session: ConflictSession): Promise<void> { try { await session.page.close(); } catch { /* ignore */ } try { await session.browser.close(); } catch { /* ignore */ } }
58-67: Hardcodedconflict.ipynbfilename limits flexibility.Line 63 assumes every test workspace has a file named
conflict.ipynb. If future test cases use different filenames, this will need changing. Consider accepting the filename viaSetupOptionsorTestConfig.params.
| const fileWritten = await waitForFileWrite(conflictFile, fs, options.writeTimeoutMs); | ||
| if (!fileWritten) { | ||
| console.log('Warning: Could not confirm file write, proceeding anyway'); | ||
| } | ||
|
|
||
| const notebookContent = fs.readFileSync(conflictFile, 'utf8'); | ||
| return JSON.parse(notebookContent); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Proceeding after unconfirmed file write can silently produce wrong test results.
When waitForFileWrite returns false (Line 129), the function logs a warning but still reads the file. This reads stale (pre-resolution) notebook content, causing downstream assertions to pass or fail for the wrong reasons. Consider throwing instead, or at minimum returning a flag so callers can decide.
💡 Suggested alternative — fail explicitly
const fileWritten = await waitForFileWrite(conflictFile, fs, options.writeTimeoutMs);
if (!fileWritten) {
- console.log('Warning: Could not confirm file write, proceeding anyway');
+ throw new Error(`File write not confirmed for ${conflictFile} within timeout`);
}📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| const fileWritten = await waitForFileWrite(conflictFile, fs, options.writeTimeoutMs); | |
| if (!fileWritten) { | |
| console.log('Warning: Could not confirm file write, proceeding anyway'); | |
| } | |
| const notebookContent = fs.readFileSync(conflictFile, 'utf8'); | |
| return JSON.parse(notebookContent); | |
| const fileWritten = await waitForFileWrite(conflictFile, fs, options.writeTimeoutMs); | |
| if (!fileWritten) { | |
| throw new Error(`File write not confirmed for ${conflictFile} within timeout`); | |
| } | |
| const notebookContent = fs.readFileSync(conflictFile, 'utf8'); | |
| return JSON.parse(notebookContent); |
🤖 Prompt for AI Agents
In `@src/tests/testHarness.ts` around lines 128 - 134, The current code reads and
JSON-parses the notebook even when waitForFileWrite(conflictFile, fs,
options.writeTimeoutMs) returns false, risking stale data; update the function
(the block handling fileWritten in testHarness.ts) to not proceed on false —
either throw a descriptive Error (e.g., "Timed out waiting for file write for
{conflictFile}") or return an explicit failure value (e.g., null/undefined or a
result object with success:false) so callers can detect and handle the timeout
instead of parsing stale notebookContent; ensure references to conflictFile, fs,
options.writeTimeoutMs and waitForFileWrite are used so the change is applied in
the correct location.
|
@coderabbitai the CI keeps failing whilst all my tests work locally. Could you check that out for yourself? |
🧩 Analysis chain
🏁 Script executed: Length of output: 1121 🏁 Script executed: Length of output: 2216 Let me check the logs from the failed CI runs to identify the specific error: 🏁 Script executed: Length of output: 178946 🏁 Script executed: Length of output: 267607
In the last green run (today ~18:05 UTC), the sequence included “Session created …” and “WebSocket connected …” quickly, and 02_perCellResolution + All Base + All Current passed. So the code path works; the flake is the Playwright-to-server handshake racing against a 30s window on CI. Concrete stabilization plan:
If you’re good with that, I’ll prep a tiny patch touching just the timeouts and Playwright launch, plus set the env in the workflow.
Would you like me to proceed with this patch now? I can also split it: just the workflow/env + test harness change first (no behavior change in extension), if you prefer. If you’d rather not change behavior, an alternative is simply increasing the harness wait and setting the CI env vars — that should already reduce flakes. 🧠 Learnings used💡 Use ✅ Actions performedReview triggered.
|
|
@coderabbitai branch off this pr and make a PR to main (if possible) for that task, I will happily review it, I didn't know those were options. The current implementation and last fix did seem to solve it though. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
🤖 Fix all issues with AI agents
In `@src/web/WebConflictPanel.ts`:
- Around line 85-98: The .then() callback after server.openSession can run after
the panel has been disposed, causing _sendConflictData() to send to a closed
session; add an _isDisposed boolean on WebConflictPanel, set it true in
dispose(), and at the top of the .then() callback check if (this._isDisposed)
return; before calling this._sendConflictData() and logging; ensure dispose()
clears any session state as currently done for _sessionId/_conflict to fully
guard against post-dispose execution.
🧹 Nitpick comments (3)
src/tests/testHarness.ts (3)
204-209: Trailing...is always appended, even for short sources.
substring(0, 80)+"..."prints"hello..."for a 5-char source. This is a cosmetic nit but can be confusing when debugging short cells.Optional fix
- console.log(` Expected: "${expected.source.substring(0, 80).replace(/\n/g, '\\n')}..."`); - console.log(` Actual: "${actualSource.substring(0, 80).replace(/\n/g, '\\n')}..."`); + const trunc = (s: string) => s.length > 80 ? s.substring(0, 80).replace(/\n/g, '\\n') + '...' : s.replace(/\n/g, '\\n'); + console.log(` Expected: "${trunc(expected.source)}"`); + console.log(` Actual: "${trunc(actualSource)}"`);
216-223:JSON.stringifymetadata comparison is key-order dependent.If the expected and actual metadata objects have identical keys/values but in a different insertion order, this comparison produces a false mismatch. This is unlikely with notebook files parsed from JSON (V8 preserves insertion order), but a deep-equal utility would be more robust.
58-111: Consider providing ateardown/ cleanup helper alongsidesetupConflictResolver.The harness creates a Playwright browser in
setupConflictResolverbut doesn't export a corresponding cleanup function. Every caller must remember to callbrowser.close()(and potentially clean up the VS Code documents). A thin helper would centralize teardown and prevent resource leaks in test code:export async function teardownConflictSession(session: ConflictSession): Promise<void> { await session.browser.close(); }
| // Do not await the WebSocket connection here to avoid deadlocking tests | ||
| // that need to open the session after the command returns. | ||
| void server.openSession( | ||
| this._sessionId, | ||
| '', // No HTML content needed - server generates shell | ||
| (message: unknown) => this._handleMessage(message) | ||
| ).then(() => { | ||
| // Send conflict data to browser once connected | ||
| this._sendConflictData(); | ||
|
|
||
| logger.info(`[WebConflictPanel] Opened conflict resolver in browser, session: ${this._sessionId}`); | ||
| } catch (error) { | ||
| }).catch((error) => { | ||
| logger.error('[WebConflictPanel] Failed to open browser session:', error); | ||
| vscode.window.showErrorMessage(`Failed to open conflict resolver in browser: ${error}`); | ||
| } | ||
| }); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Guard against post-dispose execution in the .then() callback.
Since openSession is now fire-and-forget, there's a window where dispose() can run before .then() fires (e.g., rapid cancel or re-creation via createOrShow). When that happens, _sendConflictData() still executes because it only checks _sessionId and _conflict—neither of which is cleared in dispose(). This sends data to an already-closed session.
Add an _isDisposed guard at the top of the .then() callback:
Proposed fix
void server.openSession(
this._sessionId,
'', // No HTML content needed - server generates shell
(message: unknown) => this._handleMessage(message)
).then(() => {
+ if (this._isDisposed) return;
// Send conflict data to browser once connected
this._sendConflictData();
logger.info(`[WebConflictPanel] Opened conflict resolver in browser, session: ${this._sessionId}`);
}).catch((error) => {
+ if (this._isDisposed) return;
logger.error('[WebConflictPanel] Failed to open browser session:', error);
vscode.window.showErrorMessage(`Failed to open conflict resolver in browser: ${error}`);
});📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| // Do not await the WebSocket connection here to avoid deadlocking tests | |
| // that need to open the session after the command returns. | |
| void server.openSession( | |
| this._sessionId, | |
| '', // No HTML content needed - server generates shell | |
| (message: unknown) => this._handleMessage(message) | |
| ).then(() => { | |
| // Send conflict data to browser once connected | |
| this._sendConflictData(); | |
| logger.info(`[WebConflictPanel] Opened conflict resolver in browser, session: ${this._sessionId}`); | |
| } catch (error) { | |
| }).catch((error) => { | |
| logger.error('[WebConflictPanel] Failed to open browser session:', error); | |
| vscode.window.showErrorMessage(`Failed to open conflict resolver in browser: ${error}`); | |
| } | |
| }); | |
| // Do not await the WebSocket connection here to avoid deadlocking tests | |
| // that need to open the session after the command returns. | |
| void server.openSession( | |
| this._sessionId, | |
| '', // No HTML content needed - server generates shell | |
| (message: unknown) => this._handleMessage(message) | |
| ).then(() => { | |
| if (this._isDisposed) return; | |
| // Send conflict data to browser once connected | |
| this._sendConflictData(); | |
| logger.info(`[WebConflictPanel] Opened conflict resolver in browser, session: ${this._sessionId}`); | |
| }).catch((error) => { | |
| if (this._isDisposed) return; | |
| logger.error('[WebConflictPanel] Failed to open browser session:', error); | |
| vscode.window.showErrorMessage(`Failed to open conflict resolver in browser: ${error}`); | |
| }); |
🤖 Prompt for AI Agents
In `@src/web/WebConflictPanel.ts` around lines 85 - 98, The .then() callback after
server.openSession can run after the panel has been disposed, causing
_sendConflictData() to send to a closed session; add an _isDisposed boolean on
WebConflictPanel, set it true in dispose(), and at the top of the .then()
callback check if (this._isDisposed) return; before calling
this._sendConflictData() and logging; ensure dispose() clears any session state
as currently done for _sessionId/_conflict to fully guard against post-dispose
execution.
|
Tip For best results, initiate chat on the files or code changes. 🧩 Analysis chain
CI Stability Improvements PlanThis plan addresses the WebSocket connection timeout flakes in CI by:
This plan will create all the necessary changes to stabilize the CI tests. Once executed, it should eliminate the WebSocket connection timeout flakes you're seeing. 🏁 Script executed: Length of output: 494 🏁 Script executed: Length of output: 823 🏁 Script executed: Length of output: 78 🏁 Script executed: Length of output: 5578 🏁 Script executed: Length of output: 2279 Perfect! I now have all the information needed. Here's the complete plan with specific code changes: CI Stability Improvements PlanThis plan addresses the WebSocket connection timeout flakes in CI by making timeouts configurable and adding CI-friendly Playwright settings. This plan should resolve the CI flakiness by:
The changes are minimal, focused, and backwards-compatible (defaults work locally, extended timeouts only activate in CI). 🧠 Learnings used💡 Use |
|
Tip For best results, initiate chat on the files or code changes. 🧩 Analysis chain🏁 Script executed: Length of output: 1326
✓ Latest run: 21761915415 succeeded (completed ~8 minutes ago) Your fixes:
clearly resolved the WebSocket handshake timeout issues! Regarding creating a separate PR: I can't directly commit code or create branches (per my constraints), but I can help in two ways: Option 1: Plan optional robustness improvements
You'd then review the plan and another agent would implement it. Option 2: Move forward as-is Which would you prefer? If you want Option 1, just say "plan the improvements" and I'll prepare the requirements. Otherwise, we're good to go! 🚀 🐰✨ 🧠 Learnings used💡 Use |
Summary by CodeRabbit
New Features
Refactor
Tests
Chores