-
Notifications
You must be signed in to change notification settings - Fork 0
feat(column-operation): Implement common pattern for column operations #171
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
This change introduces a common pattern for column operations in the `ColumnOperationService` class. The key changes are: 1. Adds a `executeColumnOperation` method that handles the common steps: - Get the original column type - Ensure the column is a string-like type if needed - Count the affected rows - Perform the operation if there are affected rows - Revert the column type if no rows were affected and the type was converted 2. Adds helper methods to change the column type, escape regex characters, and get the column type from the table schema. 3. Refactors the `ReplaceOperationService` to use the new `executeColumnOperation` method. 4. Adds a new `TrimWhitespaceSchema` for the trim whitespace operation API endpoint. 5. Adds tests for the new trim whitespace operation API endpoint.
📝 WalkthroughSummary by CodeRabbit
WalkthroughAdds a shared ColumnOperationService and a TrimWhitespaceService; refactors ReplaceOperationService to use performOperation; exposes a new POST /:projectId/trim_whitespace endpoint and schema; updates backend and service tests; wires frontend ColumnHeaderMenu and project store to call the new API. Changes
Sequence Diagram(s)sequenceDiagram
autonumber
actor User
participant UI as ColumnHeaderMenu (FE)
participant API as Project API
participant Svc as TrimWhitespaceService
participant Base as ColumnOperationService
participant DB as DuckDB
User->>UI: Click "Trim whitespace" for column
UI->>API: POST /projects/:projectId/trim_whitespace { column }
API->>Svc: performOperation({ table: project_<id>, column })
Svc->>Base: executeColumnOperation(table, column, buildUpdateQuery, countFn)
Base->>DB: PRAGMA table_info(table) -> column type
alt column not string-like
Base->>DB: ALTER TABLE ... ALTER COLUMN ... TYPE VARCHAR
end
Base->>DB: SELECT count(*) WHERE column IS NOT NULL AND column != trimmed(column)
alt count == 0
opt type was changed
Base->>DB: ALTER TABLE ... ALTER COLUMN ... TYPE original_type
end
Base-->>Svc: 0
else count > 0
Base->>DB: UPDATE table SET col = trimmed(col) WHERE condition
Base-->>Svc: affectedRows
end
Svc-->>API: affectedRows
API-->>UI: 200 { affectedRows }
UI-->>User: show result / emit replaceCompleted if affectedRows > 0
sequenceDiagram
autonumber
participant API as Project API
participant Replace as ReplaceOperationService
participant Base as ColumnOperationService
participant DB as DuckDB
API->>Replace: performOperation(params)
Replace->>Base: executeColumnOperation(table, column, buildReplaceQuery, countReplaceFn)
Base->>DB: type handling, pre-count, UPDATE, optional rollback
Base-->>Replace: affectedRows
Replace-->>API: affectedRows
Estimated code review effort🎯 4 (Complex) | ⏱️ ~60 minutes Possibly related PRs
Pre-merge checks and finishing touches✅ Passed checks (3 passed)
✨ Finishing touches
🧪 Generate unit tests
📜 Recent review detailsConfiguration used: CodeRabbit UI Review profile: CHILL Plan: Pro 📒 Files selected for processing (1)
🧰 Additional context used🧬 Code graph analysis (1)backend/src/services/column-operation.service.ts (2)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
🔇 Additional comments (9)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 7
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
frontend/src/features/project-management/stores/project.store.ts (1)
58-69: ResetprojectIdwhen clearing the store
clearProjectnow leavesprojectIdpointing at the previously opened project. Any consumer that checksprojectIdto decide whether a project is loaded (or that immediately invokes project-scoped operations) will keep using the stale UUID even though the rest of the store has been reset. Please clear the ref alongside the other state so the store truly represents the “no project loaded” state.const clearProject = () => { + projectId.value = undefined data.value = [] meta.value = { name: '',
🧹 Nitpick comments (6)
backend/tests/api/project/project.trim-whitespace.test.ts (2)
53-54: Use unique temp file and clean it up after testsStatic filename risks collisions across parallel runs and leaves artifacts. Generate a unique path and delete it in an afterEach.
-const tempFilePath = tmpdir() + '/test-trim-whitespace-data.json' +const tempFilePath = + tmpdir() + + `/test-trim-whitespace-data-${Date.now()}-${Math.random().toString(36).slice(2)}.json`Add cleanup:
@@ beforeEach(async () => { @@ await importTestData() }) + + afterEach(async () => { + try { + await Bun.file(tempFilePath).delete() + } catch {} + })
149-159: Assert column type reversion when 0 rows are affected (optional)Since the base service reverts type on 0 updates, assert schema for int_col remains INTEGER to lock in behavior.
expect(data).toEqual({ affectedRows: 0, }) + + // Optional: verify type reverted + const { data: after } = await api.project({ projectId }).get({ query: { limit: 1, offset: 0 } }) + const intCol = (after as any).meta.schema.find((c: any) => c.name === 'int_col') + expect(intCol.type.toUpperCase()).toContain('INTEGER')backend/src/api/project/index.ts (1)
604-664: New trim_whitespace endpoint looks solid; consider route naming consistency and DRY helper
- Consider hyphen-case (trim-whitespace) to match typical REST path style alongside replace.
- Column existence check logic duplicates the replace route; factor into a small helper to reduce drift.
Please confirm frontend/docs already reference /trim_whitespace; if you’d like, I can submit a small follow-up to rename to /trim-whitespace and update callers.
backend/tests/services/trim-whitespace.service.test.ts (1)
44-61: Also assert the returned affected row count (optional)Capture and assert the service’s return value to guard against regressions.
- await service.performOperation({ table: 'test', column: 'text_col' }) + const affected = await service.performOperation({ table: 'test', column: 'text_col' }) + expect(affected).toBeGreaterThan(0)backend/src/services/column-operation.service.ts (1)
89-98: Avoid redundant schema roundtrip (optional)You already fetched the original type; pass it into ensureColumnIsStringType to avoid a second PRAGMA.
backend/src/services/trim-whitespace.service.ts (1)
33-42: Use transactional update and retrieve affected‐row count via client API
Wrap your UPDATE in a transaction and derive the number of affected rows from DuckDB’s client API (e.g. duckdb_rows_changed or your language binding’s rows_changed) instead of pre-counting. DuckDB’s UPDATE…RETURNING support is undocumented and has known issues; the rows_changed API is the recommended way.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (11)
backend/src/api/project/index.ts(4 hunks)backend/src/api/project/schemas.ts(1 hunks)backend/src/services/column-operation.service.ts(1 hunks)backend/src/services/replace-operation.service.ts(3 hunks)backend/src/services/trim-whitespace.service.ts(1 hunks)backend/tests/api/project/project.trim-whitespace.test.ts(1 hunks)backend/tests/services/replace-operation.alter-table.service.test.ts(1 hunks)backend/tests/services/replace-operation.service.test.ts(10 hunks)backend/tests/services/trim-whitespace.service.test.ts(1 hunks)frontend/src/features/data-processing/components/ColumnHeaderMenu.vue(3 hunks)frontend/src/features/project-management/stores/project.store.ts(3 hunks)
🧰 Additional context used
🧬 Code graph analysis (7)
frontend/src/features/project-management/stores/project.store.ts (2)
frontend/auto-imports.d.ts (1)
UUID(372-372)frontend/src/core/plugins/api.ts (1)
api(34-37)
backend/tests/api/project/project.trim-whitespace.test.ts (2)
backend/src/api/project/index.ts (1)
projectRoutes(23-664)backend/src/plugins/database.ts (1)
initializeDb(8-48)
backend/src/services/trim-whitespace.service.ts (1)
backend/src/services/column-operation.service.ts (1)
ColumnOperationParams(3-6)
backend/src/api/project/index.ts (4)
backend/src/types/error-handler.ts (1)
ApiErrorHandler(6-211)backend/src/services/trim-whitespace.service.ts (1)
TrimWhitespaceService(4-43)backend/src/api/project/schemas.ts (1)
TrimWhitespaceSchema(75-80)backend/src/types/error-schemas.ts (2)
ApiErrors(20-26)ApiErrors(27-27)
backend/src/services/column-operation.service.ts (2)
backend/src/services/replace-operation.service.ts (1)
countAffectedRows(87-127)backend/src/services/trim-whitespace.service.ts (1)
countAffectedRows(33-42)
backend/tests/services/trim-whitespace.service.test.ts (2)
backend/src/services/trim-whitespace.service.ts (1)
TrimWhitespaceService(4-43)backend/src/plugins/database.ts (2)
initializeDb(8-48)getDb(50-55)
backend/src/services/replace-operation.service.ts (1)
backend/src/services/column-operation.service.ts (1)
ColumnOperationParams(3-6)
🔇 Additional comments (3)
backend/src/api/project/index.ts (2)
563-571: Rename to performOperation looks goodEndpoint now calls performOperation on ReplaceOperationService; response type updated to Integer. No issues spotted.
589-589: Use Integer for affectedRowsSwitch to t.Integer() is appropriate for row counts and tightens the contract.
backend/src/services/replace-operation.service.ts (1)
12-22: Great use of the shared column operation template.Delegating through
executeColumnOperationkeeps the replace workflow consistent with the new column-operation pattern and removes the duplicated bookkeeping that lived here before. Nice refactor.
| protected async changeColumnType(table: string, column: string, newType: string): Promise<void> { | ||
| await this.db.run(`ALTER TABLE "${table}" ALTER "${column}" TYPE ${newType}`) | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🛠️ Refactor suggestion | 🟠 Major
Quote identifiers via a helper to prevent edge-case injection and quoting bugs
Centralize quoting and reuse everywhere identifiers are interpolated.
- protected async changeColumnType(table: string, column: string, newType: string): Promise<void> {
- await this.db.run(`ALTER TABLE "${table}" ALTER "${column}" TYPE ${newType}`)
- }
+ protected ident(name: string): string {
+ return `"${name.replace(/"/g, '""')}"`
+ }
+
+ protected async changeColumnType(table: string, column: string, newType: string): Promise<void> {
+ await this.db.run(`ALTER TABLE ${this.ident(table)} ALTER ${this.ident(column)} TYPE ${newType}`)
+ }📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| protected async changeColumnType(table: string, column: string, newType: string): Promise<void> { | |
| await this.db.run(`ALTER TABLE "${table}" ALTER "${column}" TYPE ${newType}`) | |
| } | |
| protected ident(name: string): string { | |
| return `"${name.replace(/"/g, '""')}"` | |
| } | |
| protected async changeColumnType(table: string, column: string, newType: string): Promise<void> { | |
| await this.db.run( | |
| `ALTER TABLE ${this.ident(table)} ALTER ${this.ident(column)} TYPE ${newType}` | |
| ) | |
| } |
🤖 Prompt for AI Agents
In backend/src/services/column-operation.service.ts around lines 59 to 61 the
ALTER TABLE statement interpolates table and column identifiers directly; create
and use a helper (e.g., quoteIdentifier(name: string): string) that returns the
identifier wrapped in double quotes with any internal double quotes escaped by
doubling them, then replace the inline interpolation with calls to that helper
for both table and column (and ensure this helper is used consistently across
the codebase wherever identifiers are interpolated).
| private async getColumnType(table: string, column: string): Promise<string> { | ||
| const result = await this.db.runAndReadAll(`PRAGMA table_info("${table}")`) | ||
| const columns = result.getRowObjectsJson() as Array<{ | ||
| name: string | ||
| type: string | ||
| }> | ||
|
|
||
| const columnInfo = columns.find((col) => col.name === column) | ||
| if (!columnInfo) { | ||
| throw new Error(`Column '${column}' not found in table '${table}'`) | ||
| } | ||
|
|
||
| return columnInfo.type.toUpperCase() | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🛠️ Refactor suggestion | 🟠 Major
Use identifier helper in PRAGMA
Apply the same quoting strategy when reading schema.
- const result = await this.db.runAndReadAll(`PRAGMA table_info("${table}")`)
+ const result = await this.db.runAndReadAll(`PRAGMA table_info(${this.ident(table)})`)📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| private async getColumnType(table: string, column: string): Promise<string> { | |
| const result = await this.db.runAndReadAll(`PRAGMA table_info("${table}")`) | |
| const columns = result.getRowObjectsJson() as Array<{ | |
| name: string | |
| type: string | |
| }> | |
| const columnInfo = columns.find((col) => col.name === column) | |
| if (!columnInfo) { | |
| throw new Error(`Column '${column}' not found in table '${table}'`) | |
| } | |
| return columnInfo.type.toUpperCase() | |
| } | |
| private async getColumnType(table: string, column: string): Promise<string> { | |
| const result = await this.db.runAndReadAll( | |
| `PRAGMA table_info(${this.ident(table)})` | |
| ) | |
| const columns = result.getRowObjectsJson() as Array<{ | |
| name: string | |
| type: string | |
| }> | |
| const columnInfo = columns.find((col) => col.name === column) | |
| if (!columnInfo) { | |
| throw new Error(`Column '${column}' not found in table '${table}'`) | |
| } | |
| return columnInfo.type.toUpperCase() | |
| } |
🤖 Prompt for AI Agents
In backend/src/services/column-operation.service.ts around lines 104-117, the
PRAGMA statement currently interpolates the table name directly; update it to
use the project’s identifier-quoting helper (the same helper used elsewhere in
this service) when building the SQL (e.g. construct PRAGMA
table_info(<quotedIdentifier>)) so the table name is quoted/escaped
consistently; replace the string interpolation with a call to that helper and
use the returned value in the PRAGMA call.
| private buildParameterizedUpdateQuery(table: string, column: string) { | ||
| const query = ` | ||
| UPDATE "${table}" | ||
| SET "${column}" = regexp_replace("${column}", '^\\s+|\\s+$', '', 'g') | ||
| WHERE "${column}" IS NOT NULL | ||
| AND "${column}" != regexp_replace("${column}", '^\\s+|\\s+$', '', 'g') | ||
| ` | ||
|
|
||
| return { query, params: [] } | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🛠️ Refactor suggestion | 🟠 Major
Defensively quote identifiers in generated SQL
Even with prior existence checks, quote/escape identifiers to harden against edge cases. Consider a shared ident() helper in the base class and use it here.
- private buildParameterizedUpdateQuery(table: string, column: string) {
+ private buildParameterizedUpdateQuery(table: string, column: string) {
+ const t = this.ident(table)
+ const c = this.ident(column)
const query = `
- UPDATE "${table}"
- SET "${column}" = regexp_replace("${column}", '^\\s+|\\s+$', '', 'g')
- WHERE "${column}" IS NOT NULL
- AND "${column}" != regexp_replace("${column}", '^\\s+|\\s+$', '', 'g')
+ UPDATE ${t}
+ SET ${c} = regexp_replace(${c}, '^\\s+|\\s+$', '', 'g')
+ WHERE ${c} IS NOT NULL
+ AND ${c} != regexp_replace(${c}, '^\\s+|\\s+$', '', 'g')
`
return { query, params: [] }
}📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| private buildParameterizedUpdateQuery(table: string, column: string) { | |
| const query = ` | |
| UPDATE "${table}" | |
| SET "${column}" = regexp_replace("${column}", '^\\s+|\\s+$', '', 'g') | |
| WHERE "${column}" IS NOT NULL | |
| AND "${column}" != regexp_replace("${column}", '^\\s+|\\s+$', '', 'g') | |
| ` | |
| return { query, params: [] } | |
| } | |
| private buildParameterizedUpdateQuery(table: string, column: string) { | |
| const t = this.ident(table) | |
| const c = this.ident(column) | |
| const query = ` | |
| UPDATE ${t} | |
| SET ${c} = regexp_replace(${c}, '^\\s+|\\s+$', '', 'g') | |
| WHERE ${c} IS NOT NULL | |
| AND ${c} != regexp_replace(${c}, '^\\s+|\\s+$', '', 'g') | |
| ` | |
| return { query, params: [] } | |
| } |
🤖 Prompt for AI Agents
In backend/src/services/trim-whitespace.service.ts around lines 19 to 28, the
generated SQL concatenates table and column names directly; change it to use the
shared ident() helper from the base class (or add a local ident(name) that
returns '"' + name.replace(/"/g, '""') + '"') to defensively quote/escape
identifiers, and update the template to use ident(table) and ident(column)
instead of raw values so identifiers are always safely quoted; keep the same
regexp string and params return.
| test('should handle non-existent column', () => { | ||
| expect( | ||
| service.performOperation({ table: 'test', column: 'nonexistent_col' }), | ||
| ).rejects.toThrowError(`Column 'nonexistent_col' not found in table 'test'`) | ||
| }) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Await the rejected promise
Without awaiting, the test may pass spuriously. Await or return the promise.
- expect(
- service.performOperation({ table: 'test', column: 'nonexistent_col' }),
- ).rejects.toThrowError(`Column 'nonexistent_col' not found in table 'test'`)
+ await expect(
+ service.performOperation({ table: 'test', column: 'nonexistent_col' }),
+ ).rejects.toThrowError(`Column 'nonexistent_col' not found in table 'test'`)Committable suggestion skipped: line range outside the PR's diff.
🤖 Prompt for AI Agents
In backend/tests/services/trim-whitespace.service.test.ts around lines 63 to 67,
the test calls expect(...).rejects.toThrowError(...) but does not await or
return the promise, which can cause a false positive; fix by awaiting the
expect(...) assertion (i.e., prepend await) or returning the promise from the
test so the test runner waits for the rejected promise to be asserted.
| test('should handle non-existent table', () => { | ||
| expect( | ||
| service.performOperation({ table: 'nonexistent_table', column: 'text_col' }), | ||
| ).rejects.toThrowError(/Table.*nonexistent_table.*does not exist/) | ||
| }) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Await the rejected promise
Same issue here.
- expect(
- service.performOperation({ table: 'nonexistent_table', column: 'text_col' }),
- ).rejects.toThrowError(/Table.*nonexistent_table.*does not exist/)
+ await expect(
+ service.performOperation({ table: 'nonexistent_table', column: 'text_col' }),
+ ).rejects.toThrowError(/Table.*nonexistent_table.*does not exist/)Committable suggestion skipped: line range outside the PR's diff.
🤖 Prompt for AI Agents
In backend/tests/services/trim-whitespace.service.test.ts around lines 69 to 73,
the test calls expect(...).rejects.toThrowError on a promise but neither returns
nor awaits the expectation, so the test may falsely pass; fix by either marking
the test async and prefixing the expectation with await (e.g., make the callback
async and await expect(...).rejects.toThrowError(...)) or by returning the
expect(...) promise from the test (e.g., return
expect(...).rejects.toThrowError(...)).
| test('should handle INTEGER column (0 affected rows)', () => { | ||
| expect(service.performOperation({ table: 'test', column: 'int_col' })).resolves.toBe(0) | ||
| }) | ||
|
|
||
| test('should handle DOUBLE column (0 affected rows)', () => { | ||
| expect(service.performOperation({ table: 'test', column: 'double_col' })).resolves.toBe(0) | ||
| }) | ||
|
|
||
| test('should handle BOOLEAN column (0 affected rows)', () => { | ||
| expect(service.performOperation({ table: 'test', column: 'bool_col' })).resolves.toBe(0) | ||
| }) | ||
|
|
||
| test('should handle DATE column (0 affected rows)', () => { | ||
| expect(service.performOperation({ table: 'test', column: 'date_col' })).resolves.toBe(0) | ||
| }) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Await resolves for 0-affected assertions
Ensure the assertions actually run.
- expect(service.performOperation({ table: 'test', column: 'int_col' })).resolves.toBe(0)
+ await expect(service.performOperation({ table: 'test', column: 'int_col' })).resolves.toBe(0)
@@
- expect(service.performOperation({ table: 'test', column: 'double_col' })).resolves.toBe(0)
+ await expect(service.performOperation({ table: 'test', column: 'double_col' })).resolves.toBe(0)
@@
- expect(service.performOperation({ table: 'test', column: 'bool_col' })).resolves.toBe(0)
+ await expect(service.performOperation({ table: 'test', column: 'bool_col' })).resolves.toBe(0)
@@
- expect(service.performOperation({ table: 'test', column: 'date_col' })).resolves.toBe(0)
+ await expect(service.performOperation({ table: 'test', column: 'date_col' })).resolves.toBe(0)Committable suggestion skipped: line range outside the PR's diff.
🤖 Prompt for AI Agents
In backend/tests/services/trim-whitespace.service.test.ts around lines 88 to
102, the tests use expect(service.performOperation(...)).resolves.toBe(0) but do
not await or return the promise so the assertions may not run; update each test
to either return the expectation (return expect(...).resolves.toBe(0)) or use
await (await expect(...).resolves.toBe(0)) and mark the test callback async if
using await.
This change introduces a common pattern for column operations in the
ColumnOperationServiceclass. The key changes are:Adds a
executeColumnOperationmethod that handles the common steps:Adds helper methods to change the column type, escape regex characters, and
get the column type from the table schema.
Refactors the
ReplaceOperationServiceto use the newexecuteColumnOperationmethod.Adds a new
TrimWhitespaceSchemafor the trim whitespace operation APIendpoint.
Adds tests for the new trim whitespace operation API endpoint.