Skip to content

Conversation

@PsiACE
Copy link
Member

@PsiACE PsiACE commented Jan 12, 2026

Summary

  • Centralize SQL query detection and parameter rendering to reduce duplication.
  • Reuse shared execution helpers across embedded and remote client implementations.
  • Keep behavior consistent for non-query statements.

Summary by CodeRabbit

  • New Features

    • Added database management APIs: create, retrieve, delete, and list databases with tenant support.
  • Utilities

    • Added SQL utilities to detect query statements and safely render parameterized SQL strings.
  • Refactor

    • Simplified client initialization and centralized password/connection handling to reduce duplication.
  • Bug Fixes

    • More consistent parameter handling and result-fetching across embedded and remote clients.

✏️ Tip: You can customize this high-level summary in your review settings.

@coderabbitai
Copy link

coderabbitai bot commented Jan 12, 2026

📝 Walkthrough

Walkthrough

Centralized client initialization and password handling; added BaseClient database management APIs (create/get/delete/list); moved SQL parameter rendering and query-fetch decision logic to shared utilities; embedded and server clients now delegate database operations to BaseClient and use unified SQL rendering/fetch logic.

Changes

Cohort / File(s) Summary
Client initialization
src/pyseekdb/client/__init__.py
Added internal helpers _resolve_password, _default_seekdb_path, _create_server_client; refactored Client() and AdminClient() to use them and centralize embedded vs remote creation and password resolution.
Base client — DB management & execution
src/pyseekdb/client/client_base.py
Added public DB APIs: create_database, get_database, delete_database, list_databases. Added helpers _database_tenant, _database_context, _parse_schema_row, _should_fetch_results, and _execute. Integrated fetch-decision logic and surfaced Database type.
Embedded client adjustments
src/pyseekdb/client/client_seekdb_embedded.py
Removed local _execute; switched to render_sql_with_params() for embedding params; use _should_fetch_results() to decide fetching; delegate DB management methods to super().
Server client adjustments
src/pyseekdb/client/client_seekdb_server.py
Removed local _execute; delegate DB management methods to BaseClient; added _database_tenant() helper; removed duplicated in-method SQL construction and tenant warnings.
SQL utilities
src/pyseekdb/client/sql_utils.py
Removed SqlStringifier class. Added is_query_sql(sql: str) and render_sql_with_params(sql: str, params: Sequence[Any]) with strict placeholder validation and escaping rules; simplified typing imports.

Sequence Diagram(s)

mermaid
sequenceDiagram
participant User
participant ClientFactory as Client()/AdminClient()
participant BaseClient
participant Impl as EmbeddedClient / ServerClient
participant Storage as DB file / Remote Server

User->>ClientFactory: instantiate(Client/Admin, host/seekdb_path, password?)
ClientFactory->>ClientFactory: _resolve_password()
ClientFactory->>ClientFactory: _default_seekdb_path() (if embedded)
ClientFactory->>Impl: _create_server_client(is_admin?)
User->>BaseClient: call create_database(name, tenant)
BaseClient->>Impl: prepare SQL (render_sql_with_params) / decide fetch (_should_fetch_results)
Impl->>Storage: execute SQL (local file or remote server)
Storage-->>Impl: result/ack
Impl-->>BaseClient: raw result
BaseClient-->>User: normalized Database object / confirmation

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Possibly related PRs

Suggested reviewers

  • hnwyllmm

Poem

🐇
I hop through code with tiny paws,
Helpers gather tidy laws,
Queries quoted, tenants known,
Base and children share one throne,
I twitch my nose — the paths are sewn! 🥕

🚥 Pre-merge checks | ✅ 2 | ❌ 1
❌ Failed checks (1 warning)
Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 72.22% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (2 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title 'refactor: add helpers for clients' accurately describes the main change—introducing shared helper functions and centralizing logic across client implementations to reduce duplication.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing touches
  • 📝 Generate docstrings

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
src/pyseekdb/client/sql_utils.py (1)

60-66: Risky auto-detection of hex strings.

The heuristic at line 64 treats any even-length string containing only hex characters as a hex value for UNHEX. This can incorrectly transform legitimate string data (e.g., UUIDs without hyphens like "abc123def456...", short hex-like words like "DEAD", "CAFE").

Consider requiring an explicit marker or separate method for hex values rather than auto-detecting based on content:

# Example false positive: user's document ID "DEADBEEF" would be treated as hex
>>> stringifier.stringify_value("DEADBEEF")
"UNHEX('DEADBEEF')"  # Wrong! User wanted the literal string
🤖 Fix all issues with AI agents
In @src/pyseekdb/client/client_base.py:
- Around line 389-393: The get_database function constructs sql by interpolating
the name variable directly into the SQL string (sql) causing SQL injection;
change it to use a parameterized query or properly escape the identifier before
building the query (e.g., use the DB API parameter binding on the cursor.execute
call or an escape function provided by the connection) so that the SCHEMA_NAME
comparison is passed as a bound parameter instead of f"…{name}…"; update the
code around get_database where sql is defined and where the query is executed to
use the parameter tuple (or escape_string) rather than string interpolation.

In @src/pyseekdb/client/sql_utils.py:
- Around line 27-41: In render_sql_with_params, handle bytes explicitly and
avoid fragile sequential replace: detect bytes in params (e.g., convert to a
safe SQL literal by decoding with errors='replace' or converting to hex and then
escaping) instead of using str(param) which yields b'...'; then perform
substitution by splitting the original sql on "%s" and interleaving the
escaped/replaced parameter literals (or validate that count of "%s" matches
len(params)) to produce the final string, rather than repeatedly calling
rendered.replace("%s", ..., 1) which can be corrupted if a replacement contains
the substring "%s".
🧹 Nitpick comments (2)
src/pyseekdb/client/client_seekdb_embedded.py (1)

22-22: Unused import: is_query_sql.

is_query_sql is imported but not directly used in this file. The query detection is handled via self._should_fetch_results() which is inherited from the base class.

Proposed fix
-from .sql_utils import is_query_sql, render_sql_with_params
+from .sql_utils import render_sql_with_params
src/pyseekdb/client/client_base.py (1)

442-446: Consider using parameterized queries for LIMIT and OFFSET values.

While type hints indicate Optional[int] for both parameters, they are not enforced at runtime. Using f-string interpolation with these values works correctly for integers but would be more robust with parameterized queries. If type checking is added to the project later, consider refactoring to use parameter placeholders instead of f-string interpolation.

📜 Review details

Configuration used: defaults

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 9e3f34d and 35d2650.

📒 Files selected for processing (5)
  • src/pyseekdb/client/__init__.py
  • src/pyseekdb/client/client_base.py
  • src/pyseekdb/client/client_seekdb_embedded.py
  • src/pyseekdb/client/client_seekdb_server.py
  • src/pyseekdb/client/sql_utils.py
🧰 Additional context used
🧠 Learnings (2)
📚 Learning: 2025-12-22T12:44:57.424Z
Learnt from: longbingljw
Repo: oceanbase/pyseekdb PR: 91
File: tests/integration_tests/test_collection_dml.py:36-36
Timestamp: 2025-12-22T12:44:57.424Z
Learning: In tests/integration_tests/test_collection_dml.py and similar test files, using `db_client._server._execute()` to run SQL commands directly (e.g., CREATE TABLE statements) is acceptable for integration tests that need custom table setup with specific vector index configurations.

Applied to files:

  • src/pyseekdb/client/client_seekdb_embedded.py
  • src/pyseekdb/client/client_seekdb_server.py
📚 Learning: 2025-12-22T12:33:52.528Z
Learnt from: longbingljw
Repo: oceanbase/pyseekdb PR: 91
File: tests/integration_tests/test_admin_database_management.py:26-39
Timestamp: 2025-12-22T12:33:52.528Z
Learning: In tests/integration_tests/test_admin_database_management.py, using `isinstance(admin_client._server, pyseekdb.SeekdbEmbeddedClient)` to detect client type is acceptable. AdminClient should not expose a public `mode` property.

Applied to files:

  • src/pyseekdb/client/client_seekdb_embedded.py
  • src/pyseekdb/client/__init__.py
  • src/pyseekdb/client/client_seekdb_server.py
🧬 Code graph analysis (5)
src/pyseekdb/client/sql_utils.py (1)
src/pyseekdb/client/base_connection.py (1)
  • _execute (31-33)
src/pyseekdb/client/client_seekdb_embedded.py (2)
src/pyseekdb/client/sql_utils.py (2)
  • is_query_sql (15-24)
  • render_sql_with_params (27-41)
src/pyseekdb/client/client_base.py (1)
  • _should_fetch_results (1817-1821)
src/pyseekdb/client/__init__.py (3)
src/pyseekdb/client/collection.py (1)
  • get (364-428)
src/pyseekdb/client/client_seekdb_embedded.py (1)
  • SeekdbEmbeddedClient (27-284)
src/pyseekdb/client/client_seekdb_server.py (1)
  • RemoteServerClient (19-207)
src/pyseekdb/client/client_base.py (4)
src/pyseekdb/client/base_connection.py (3)
  • BaseConnection (7-65)
  • _execute (31-33)
  • _ensure_connection (16-18)
src/pyseekdb/client/admin_client.py (9)
  • AdminAPI (34-94)
  • create_database (41-49)
  • create_database (116-118)
  • get_database (52-63)
  • get_database (120-122)
  • delete_database (66-74)
  • delete_database (124-126)
  • list_databases (77-94)
  • list_databases (128-135)
src/pyseekdb/client/database.py (1)
  • Database (7-54)
src/pyseekdb/client/sql_utils.py (1)
  • is_query_sql (15-24)
src/pyseekdb/client/client_seekdb_server.py (1)
src/pyseekdb/client/client_base.py (5)
  • create_database (365-377)
  • get_database (379-408)
  • delete_database (410-422)
  • list_databases (424-463)
  • _database_tenant (341-343)
🪛 Ruff (0.14.10)
src/pyseekdb/client/__init__.py

123-126: Avoid specifying long messages outside the exception class

(TRY003)

src/pyseekdb/client/client_base.py

341-341: Unused method argument: tenant

(ARG002)


390-392: Possible SQL injection vector through string-based query construction

(S608)


397-397: Avoid specifying long messages outside the exception class

(TRY003)


401-401: Avoid specifying long messages outside the exception class

(TRY003)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (3)
  • GitHub Check: integration-test (server)
  • GitHub Check: integration-test (oceanbase)
  • GitHub Check: integration-test (embedded)
🔇 Additional comments (12)
src/pyseekdb/client/client_seekdb_server.py (2)

133-196: LGTM! Clean delegation to base class.

The database management methods now properly delegate to the superclass, reducing code duplication while preserving the tenant context through the _database_tenant override below.


198-203: LGTM! Proper tenant resolution for remote server mode.

The _database_tenant override correctly enforces the client's tenant for all database operations, with an appropriate warning when the caller specifies a different tenant.

src/pyseekdb/client/sql_utils.py (1)

15-24: LGTM! Query detection helper.

The function correctly identifies common query statement types. The empty string guard at line 16-17 prevents exceptions from strip() on None.

src/pyseekdb/client/__init__.py (3)

57-64: LGTM! Clean helper functions for password and path resolution.

These helpers centralize the environment variable fallback and default path logic, reducing duplication between Client() and AdminClient().


67-126: LGTM! Well-structured factory consolidation.

The _create_server_client helper cleanly centralizes the embedded vs. remote client selection logic. The late import of _PYLIBSEEKDB_AVAILABLE at line 113 appropriately avoids circular import issues while keeping the fallback behavior localized.


221-232: LGTM! Clean delegation to shared helpers.

Both Client() and AdminClient() now route through _resolve_password and _create_server_client, reducing code duplication while maintaining existing behavior.

Also applies to: 293-305

src/pyseekdb/client/client_seekdb_embedded.py (2)

145-153: LGTM! Clean delegation to shared SQL utilities.

The use of render_sql_with_params and _should_fetch_results properly centralizes the SQL handling logic, removing duplication from the embedded client.


236-280: LGTM! Consistent delegation pattern.

The database management methods correctly delegate to the superclass, matching the pattern used in RemoteServerClient.

src/pyseekdb/client/client_base.py (4)

341-346: LGTM! Base tenant resolution with override hook.

The base implementation returning None allows subclasses to provide tenant-specific behavior while _database_context handles the logging format cleanly.


348-363: LGTM! Robust schema row parsing.

The _parse_schema_row helper handles both dict and tuple/list row formats, providing flexibility for different cursor implementations.


1571-1584: LGTM! Unified fetch decision logic.

The _should_fetch_results integration in both context-manager and try/finally branches ensures consistent behavior for query vs non-query statements.


1817-1841: LGTM! Well-structured execution helpers.

The _should_fetch_results method provides a sensible fallback chain (cursor.description → SQL prefix detection), and _execute properly handles both cursor management styles while returning results only for queries.

@PsiACE PsiACE changed the title Refactor SQL execution helpers for embedded/remote clients refactor: add helpers for clients Jan 12, 2026
Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Fix all issues with AI agents
In @src/pyseekdb/client/client_base.py:
- Around line 365-377: The create_database method interpolates the raw database
name into SQL and is vulnerable to identifier injection; fix it by validating
and/or escaping identifier characters before building the SQL (e.g., disallow
unsafe characters or replace any backtick with doubled backticks), add a
reusable helper like _escape_identifier(name: str) and use it when composing SQL
in create_database and the other affected routines noted (lines ~412-424), and
continue to call self._execute(sql) with the sanitized identifier to prevent
injection.
🧹 Nitpick comments (2)
src/pyseekdb/client/sql_utils.py (1)

36-38: Consider handling bytes decode errors more explicitly.

Using errors="replace" silently converts invalid UTF-8 sequences to replacement characters (U+FFFD). This could mask data corruption issues. If bytes parameters are expected to be valid UTF-8 text, consider using errors="strict" (the default) and letting the exception propagate to surface encoding issues early.

♻️ Suggested change
         elif isinstance(param, (bytes, bytearray, memoryview)):
-            text = bytes(param).decode("utf-8", errors="replace")
+            text = bytes(param).decode("utf-8")  # Let UnicodeDecodeError propagate
             replacement = f"'{escape_string(text)}'"
src/pyseekdb/client/client_base.py (1)

444-448: Consider validating LIMIT/OFFSET parameters.

While limit and offset are typed as Optional[int], adding runtime validation would provide defense-in-depth against unexpected values (e.g., negative numbers).

♻️ Suggested validation
         if limit is not None:
+            if not isinstance(limit, int) or limit < 0:
+                raise ValueError("limit must be a non-negative integer")
             if offset is not None:
+                if not isinstance(offset, int) or offset < 0:
+                    raise ValueError("offset must be a non-negative integer")
                 sql += f" LIMIT {offset}, {limit}"
             else:
                 sql += f" LIMIT {limit}"
📜 Review details

Configuration used: defaults

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 35d2650 and 1a2d469.

📒 Files selected for processing (3)
  • src/pyseekdb/client/client_base.py
  • src/pyseekdb/client/client_seekdb_embedded.py
  • src/pyseekdb/client/sql_utils.py
🧰 Additional context used
🧠 Learnings (2)
📚 Learning: 2025-12-22T12:44:57.424Z
Learnt from: longbingljw
Repo: oceanbase/pyseekdb PR: 91
File: tests/integration_tests/test_collection_dml.py:36-36
Timestamp: 2025-12-22T12:44:57.424Z
Learning: In tests/integration_tests/test_collection_dml.py and similar test files, using `db_client._server._execute()` to run SQL commands directly (e.g., CREATE TABLE statements) is acceptable for integration tests that need custom table setup with specific vector index configurations.

Applied to files:

  • src/pyseekdb/client/client_seekdb_embedded.py
📚 Learning: 2025-12-22T12:33:52.528Z
Learnt from: longbingljw
Repo: oceanbase/pyseekdb PR: 91
File: tests/integration_tests/test_admin_database_management.py:26-39
Timestamp: 2025-12-22T12:33:52.528Z
Learning: In tests/integration_tests/test_admin_database_management.py, using `isinstance(admin_client._server, pyseekdb.SeekdbEmbeddedClient)` to detect client type is acceptable. AdminClient should not expose a public `mode` property.

Applied to files:

  • src/pyseekdb/client/client_seekdb_embedded.py
🧬 Code graph analysis (2)
src/pyseekdb/client/client_seekdb_embedded.py (2)
src/pyseekdb/client/sql_utils.py (1)
  • render_sql_with_params (23-47)
src/pyseekdb/client/client_base.py (5)
  • _should_fetch_results (1819-1823)
  • create_database (365-377)
  • get_database (379-410)
  • delete_database (412-424)
  • list_databases (426-465)
src/pyseekdb/client/client_base.py (6)
src/pyseekdb/client/base_connection.py (3)
  • BaseConnection (7-65)
  • _execute (31-33)
  • _ensure_connection (16-18)
src/pyseekdb/client/admin_client.py (9)
  • AdminAPI (34-94)
  • create_database (41-49)
  • create_database (116-118)
  • get_database (52-63)
  • get_database (120-122)
  • delete_database (66-74)
  • delete_database (124-126)
  • list_databases (77-94)
  • list_databases (128-135)
src/pyseekdb/client/database.py (1)
  • Database (7-54)
src/pyseekdb/client/sql_utils.py (1)
  • is_query_sql (11-20)
src/pyseekdb/client/client_seekdb_server.py (6)
  • _database_tenant (198-203)
  • create_database (134-145)
  • get_database (147-161)
  • _ensure_connection (69-85)
  • delete_database (163-174)
  • list_databases (176-196)
src/pyseekdb/client/client_seekdb_embedded.py (7)
  • create_database (236-244)
  • get_database (246-254)
  • _ensure_connection (72-95)
  • _use_context_manager_for_cursor (116-121)
  • _execute_query_with_cursor (123-206)
  • delete_database (256-264)
  • list_databases (266-280)
🪛 Ruff (0.14.10)
src/pyseekdb/client/client_base.py

341-341: Unused method argument: tenant

(ARG002)


399-399: Avoid specifying long messages outside the exception class

(TRY003)


403-403: Avoid specifying long messages outside the exception class

(TRY003)

src/pyseekdb/client/sql_utils.py

29-31: Avoid specifying long messages outside the exception class

(TRY003)


33-33: zip() without an explicit strict= parameter

Add explicit value for parameter strict=

(B905)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (3)
  • GitHub Check: integration-test (server)
  • GitHub Check: integration-test (oceanbase)
  • GitHub Check: integration-test (embedded)
🔇 Additional comments (11)
src/pyseekdb/client/sql_utils.py (2)

11-20: LGTM!

The is_query_sql function correctly identifies query statements by checking for common SQL prefixes. The logic handles empty strings and uses case-insensitive matching via .upper().


23-47: The split-based placeholder approach is acceptable for this use case, as SQL strings are internally controlled.

The sql.split("%s") implementation is not a practical concern here because render_sql_with_params receives only internally-constructed SQL—never user-supplied input. All SQL construction in the codebase uses programmatic building (FilterBuilder, f-strings) and does not include literal %s strings. No changes needed for the placeholder detection logic.

The suggestion to add strict=True to the zip() call on line 33 remains valid but minor, given that placeholder counts are already validated beforehand.

src/pyseekdb/client/client_seekdb_embedded.py (3)

143-157: LGTM!

The refactored _execute_query_with_cursor cleanly integrates with the new shared utilities:

  • Uses render_sql_with_params for parameter embedding
  • Uses _should_fetch_results to determine fetch behavior
  • Correctly returns an empty list for non-query statements

This maintains behavior consistency while reducing code duplication.


236-280: LGTM!

Clean delegation to base class implementations. The docstrings appropriately document that the tenant parameter is ignored for embedded mode, maintaining API consistency while deferring implementation to the shared base class.


8-8: Verify if escape_string import is still needed.

With parameter rendering now delegated to render_sql_with_params from sql_utils, the escape_string import on line 8 may be unused in this file. If confirmed, removing it would reduce unnecessary imports.

#!/bin/bash
# Check if escape_string is used elsewhere in client_seekdb_embedded.py
rg -n "escape_string" src/pyseekdb/client/client_seekdb_embedded.py

Also applies to: 22-22

src/pyseekdb/client/client_base.py (6)

9-9: LGTM!

Import additions are appropriate for the new functionality:

  • Sequence for type hints
  • Database class for return types
  • is_query_sql for fetch decision logic

Also applies to: 16-16, 34-35


341-363: LGTM!

The helper methods are well-designed:

  • _database_tenant provides a base implementation for subclasses to override (the unused tenant parameter is intentional for API consistency)
  • _database_context generates context strings for logging
  • _parse_schema_row handles both dict and tuple row formats robustly

The static analysis hint ARG002 is a false positive since the parameter enables subclass override patterns.


379-410: LGTM!

The get_database method correctly uses parameterized queries (line 392-396) to prevent SQL injection. The error handling appropriately raises ValueError when the database is not found.


1819-1823: LGTM!

The _should_fetch_results method uses a robust two-tier approach:

  1. First checks cursor.description (the most reliable indicator)
  2. Falls back to is_query_sql() when description is unavailable

This correctly handles both standard DB-API cursors and embedded cursors that may not populate description.


1825-1843: LGTM!

The new _execute method provides a clean, centralized implementation:

  • Consolidates connection and cursor management
  • Uses _should_fetch_results for consistent fetch behavior
  • Supports both context manager and manual cursor patterns
  • Returns None for non-query statements, avoiding unnecessary fetch attempts

1571-1594: LGTM!

The _execute_query_with_cursor method now consistently uses _should_fetch_results in both the context manager and try/finally paths. Returning an empty list for non-query statements aligns with the List[Dict[str, Any]] return type and prevents errors when callers iterate over results.

Signed-off-by: Chojan Shang <psiace@apache.org>
Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
src/pyseekdb/client/client_seekdb_embedded.py (1)

8-8: Remove unused import.

escape_string from pymysql.converters is not used in this file. Parameter escaping is handled entirely by render_sql_with_params from sql_utils.py (line 145).

🤖 Fix all issues with AI agents
In @src/pyseekdb/client/sql_utils.py:
- Around line 39-40: The branch handling numeric parameters currently converts
floats with replacement = str(param) which yields 'inf'/'nan' for special float
values; update the numeric handling (the isinstance(param, (int, float)) branch
that sets replacement) to check float finiteness (use math.isfinite(param) or
math.isinf/math.isnan) and raise a clear error (e.g., ValueError) when param is
not a finite number, otherwise continue using str(param) for valid ints/floats
so only finite numeric literals are emitted.
🧹 Nitpick comments (2)
src/pyseekdb/client/sql_utils.py (1)

33-46: Add strict=True to zip for defensive consistency.

While parameter count is validated above, adding strict=True provides an additional safety net and satisfies linter B905.

Suggested fix
-    for param, part in zip(params, parts[1:]):
+    for param, part in zip(params, parts[1:], strict=True):
src/pyseekdb/client/client_seekdb_embedded.py (1)

164-199: Regex-based column extraction has known edge cases.

The parentheses-depth tracking doesn't account for:

  • String literals containing commas (e.g., SELECT 'a,b' AS col)
  • Nested subqueries in SELECT clause
  • CASE expressions with commas

This is acceptable as a fallback when cursor.description is unavailable, but consider adding a comment documenting these limitations for future maintainers.

📜 Review details

Configuration used: defaults

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 1a2d469 and 27f7a4a.

📒 Files selected for processing (2)
  • src/pyseekdb/client/client_seekdb_embedded.py
  • src/pyseekdb/client/sql_utils.py
🧰 Additional context used
🧠 Learnings (2)
📚 Learning: 2025-12-22T12:44:57.424Z
Learnt from: longbingljw
Repo: oceanbase/pyseekdb PR: 91
File: tests/integration_tests/test_collection_dml.py:36-36
Timestamp: 2025-12-22T12:44:57.424Z
Learning: In tests/integration_tests/test_collection_dml.py and similar test files, using `db_client._server._execute()` to run SQL commands directly (e.g., CREATE TABLE statements) is acceptable for integration tests that need custom table setup with specific vector index configurations.

Applied to files:

  • src/pyseekdb/client/client_seekdb_embedded.py
📚 Learning: 2025-12-22T12:33:52.528Z
Learnt from: longbingljw
Repo: oceanbase/pyseekdb PR: 91
File: tests/integration_tests/test_admin_database_management.py:26-39
Timestamp: 2025-12-22T12:33:52.528Z
Learning: In tests/integration_tests/test_admin_database_management.py, using `isinstance(admin_client._server, pyseekdb.SeekdbEmbeddedClient)` to detect client type is acceptable. AdminClient should not expose a public `mode` property.

Applied to files:

  • src/pyseekdb/client/client_seekdb_embedded.py
🧬 Code graph analysis (1)
src/pyseekdb/client/client_seekdb_embedded.py (2)
src/pyseekdb/client/sql_utils.py (1)
  • render_sql_with_params (23-47)
src/pyseekdb/client/client_base.py (1)
  • _should_fetch_results (1817-1821)
🪛 Ruff (0.14.10)
src/pyseekdb/client/sql_utils.py

29-31: Avoid specifying long messages outside the exception class

(TRY003)


33-33: zip() without an explicit strict= parameter

Add explicit value for parameter strict=

(B905)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (3)
  • GitHub Check: integration-test (embedded)
  • GitHub Check: integration-test (oceanbase)
  • GitHub Check: integration-test (server)
🔇 Additional comments (3)
src/pyseekdb/client/sql_utils.py (1)

11-20: LGTM!

The query detection logic correctly identifies SELECT, SHOW, DESCRIBE, and DESC statements. The .strip().upper() handling ensures case-insensitivity and whitespace tolerance.

src/pyseekdb/client/client_seekdb_embedded.py (2)

145-153: LGTM!

Clean delegation to the shared utilities render_sql_with_params and _should_fetch_results. This aligns well with the PR objective to centralize SQL handling and reduce duplication.


236-280: LGTM!

Clean delegation to the base class for database management operations. The docstrings appropriately document that the tenant parameter is not meaningful for embedded mode while still forwarding it to maintain a consistent interface.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant