Skip to content

refactor init process#316

Merged
liunyl merged 2 commits intomainfrom
eloqdb
Dec 3, 2025
Merged

refactor init process#316
liunyl merged 2 commits intomainfrom
eloqdb

Conversation

@liunyl
Copy link
Contributor

@liunyl liunyl commented Dec 2, 2025

Summary by CodeRabbit

  • Refactor
    • Switched to instance-based, multi-step startup/shutdown for safer lifecycle management.
  • New Features
    • Added a second-phase service Start step to attach runtime subsystems (metrics, slow-log, vector index) after the core substrate is ready.
  • Bug Fixes
    • Improved failure and shutdown paths to reduce resource leaks and startup race conditions.
  • Chores
    • Updated an internal submodule pointer to a newer commit.

✏️ Tip: You can customize this high-level summary in your review settings.

@coderabbitai
Copy link
Contributor

coderabbitai bot commented Dec 2, 2025

Warning

Rate limit exceeded

@liunyl has exceeded the limit for the number of commits or files that can be reviewed per hour. Please wait 24 minutes and 54 seconds before requesting another review.

⌛ How to resolve this issue?

After the wait time has elapsed, a review can be triggered using the @coderabbitai review command as a PR comment. Alternatively, push new commits to this PR.

We recommend that you space out your commits to avoid hitting the rate limit.

🚦 How do rate limits work?

CodeRabbit enforces hourly rate limits for each developer per organization.

Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout.

Please see our FAQ for further information.

📥 Commits

Reviewing files that changed from the base of the PR and between be4dda9 and 63dc8cd.

📒 Files selected for processing (4)
  • data_substrate (1 hunks)
  • include/redis_service.h (1 hunks)
  • src/redis_server.cpp (4 hunks)
  • src/redis_service.cpp (4 hunks)

Walkthrough

Introduce a two-phase startup: initialize DataSubstrate instance, enable and start substrate engines, then call a new RedisServiceImpl::Start(brpc::Server&) to attach TxService, DataStoreHandler, metrics, slow-log, and vector components after substrate start.

Changes

Cohort / File(s) Summary
Submodule Update
data_substrate
Submodule pointer updated from 83622e6eee55ea703f1092d9dd53b204c4e63700 to 247a1d06a9db9a08934351c0891a80f7cf382ec2. No functional code changes.
Redis Service Header
include/redis_service.h
Added public member bool Start(brpc::Server &brpc_server) to RedisServiceImpl for second-phase initialization after DataSubstrate startup.
Server Initialization
src/redis_server.cpp
Reworked global init to instance-based flow (DataSubstrate::Instance().Init), added engine enablement (EnableEngine), reordered startup (EnableEngine → Start substrate → RedisServiceImpl::Start), switched shutdown to DataSubstrate::Instance().Shutdown(), and manage Redis service via std::unique_ptr injected into brpc server options.
Redis Service Implementation
src/redis_service.cpp
Moved runtime bindings from Init() into new RedisServiceImpl::Start(brpc::Server &); Start attaches to TxService/DataStore via DataSubstrate, registers engine via ds.RegisterEngine, initializes prebuilt tables, metrics, slow-log, vector handler, and starts metrics collector threads. Updated internal queries to use DataSubstrate::Instance().

Sequence Diagram(s)

sequenceDiagram
    participant Admin as Startup script
    participant App as redis_server (main)
    participant DS as DataSubstrate::Instance
    participant RS as RedisServiceImpl
    participant BRPC as brpc::Server
    participant Tx as TxService / StoreHandler

    Admin->>App: launch with config
    App->>DS: Init(config_file)
    DS-->>App: initialized
    App->>DS: EnableEngine(EloqKv)
    DS-->>App: engine enabled
    App->>DS: Start()
    DS-->>App: substrate running
    App->>RS: construct RedisServiceImpl (unique_ptr)
    App->>RS: Start(brpc::Server&)
    RS->>DS: ds.RegisterEngine / Attach TxService & StoreHandler
    DS-->>RS: TxService & StoreHandler available
    RS->>RS: init prebuilt tables, metrics, slow-log, vector handler
    RS-->>BRPC: register service (via options)
    BRPC-->>Admin: server ready
Loading

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

  • Inspect moved initialization in src/redis_service.cpp for readiness checks and error handling when DataSubstrate components are absent.
  • Verify shutdown/error paths and resource cleanup in src/redis_server.cpp, ensuring no leaks and consistent ordering.
  • Review metrics collector and threading lifecycle in RedisServiceImpl::Start/Stop.
  • Confirm engine registration sequence and interactions with DataSubstrate APIs (EnableEngine / RegisterEngine).

Poem

🐇 I hopped from globals to an instance bright,
Init the substrate, engines wake at night.
Then Redis joins, metrics hum along,
Slow-logs murmur, vectors find their song.
A rabbit nods — startup steps all right!

Pre-merge checks and finishing touches

❌ Failed checks (1 warning, 1 inconclusive)
Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. You can run @coderabbitai generate docstrings to improve docstring coverage.
Title check ❓ Inconclusive The title 'refactor init process' is vague and generic, using non-descriptive terms that don't convey meaningful information about the specific changes made to the codebase. Consider a more specific title that describes the key change, such as 'Refactor DataSubstrate and RedisService initialization to use instance-based flow' or 'Implement two-phase initialization for Redis and DataSubstrate services'.
✅ Passed checks (1 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🧹 Nitpick comments (4)
include/redis_service.h (1)

168-170: Clarify Start() lifecycle expectations in class-level docs

The second-phase Start(brpc::Server &) fits the new init flow. Consider adding a brief class comment (or doxygen on Start) stating it must be called exactly once, after DataSubstrate::Start() and before brpc::Server::Start(), and is not thread-safe, to avoid misuse by future callers.

src/redis_server.cpp (1)

452-501: Verify EnableEngine vs RegisterEngine ordering

The new startup sequence is clear, but note that DataSubstrate::Instance().EnableEngine(TableEngine::EloqKv) is called before RedisServiceImpl::Init(), which in turn calls DataSubstrate::Instance().RegisterEngine(...). If data_substrate expects engines to be registered before being enabled, this ordering could be problematic.

If the library contract prefers “register → enable”, consider moving EnableEngine after a successful RedisServiceImpl::Init() (or moving it into Init after RegisterEngine), so the sequence is:

  1. DataSubstrate::Instance().Init(...)
  2. RedisServiceImpl::Init(...) (does RegisterEngine)
  3. DataSubstrate::Instance().EnableEngine(...)
  4. DataSubstrate::Instance().Start()
src/redis_service.cpp (2)

446-569: Harden slow-log ID initialization and vector sizing in Start()

The refactored Start() correctly defers DS-dependent work (TxService/DataStore attach, slow-log, metrics, vector handler) until after DataSubstrate::Start(). Two nits around slow-log state, though:

  1. next_slow_log_unique_id_ is declared as std::vector<uint32_t> but seeded as if it were 64‑bit:
// high 16 bits: core_id
// low 48 bits: counter
next_slow_log_unique_id_.push_back(core_idx << 48);

Storing (core_idx << 48) into a uint32_t silently truncates the high bits, so the “high 16 bits = core_id” invariant never actually holds. If you care about that layout, consider changing the member to std::vector<uint64_t> and initializing via resize/index assignment, e.g.:

- std::vector<uint32_t> next_slow_log_unique_id_;
+ std::vector<uint64_t> next_slow_log_unique_id_;
...
- next_slow_log_unique_id_.push_back(core_idx << 48);
+ next_slow_log_unique_id_[core_idx] = (uint64_t(core_idx) << 48);
  1. Unlike the other slow‑log vectors, next_slow_log_unique_id_ is not cleared/resized; you only push_back per core. If Start() were ever called more than once (or core_num_ changed), this vector would keep growing and drift from core_num_. For robustness and symmetry with slow_log_, next_slow_log_idx_, and slow_log_len_, it would be safer to clear() or resize(core_num_) before seeding values.

Neither issue breaks the current single‑start flow, but tightening this makes the slow‑log bookkeeping match its documented 64‑bit ID semantics and avoids surprises if the lifecycle ever changes.


551-566: Vector handler Start() path looks correct, but depends on DS & config parse

The #ifdef VECTOR_INDEX_ENABLED block that initializes EloqVec::VectorHandler in Start() (after tx_service_ is available) is a good move and lines up with the InitHandlerInstance signature.

Since you re-parse config_file_ here just for vector configuration, you pay the INI parse cost twice (once in Init, once here). Not a big deal, but if startup parsing ever becomes a bottleneck, you could consider sharing a parsed config or extracting just the minimal vector config needed into a struct and passing that instead.

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between e21f17c and 1f3f5fc.

📒 Files selected for processing (4)
  • data_substrate (1 hunks)
  • include/redis_service.h (1 hunks)
  • src/redis_server.cpp (4 hunks)
  • src/redis_service.cpp (4 hunks)
🧰 Additional context used
🧬 Code graph analysis (2)
include/redis_service.h (1)
src/redis_service.cpp (2)
  • Start (446-570)
  • Start (446-446)
src/redis_service.cpp (2)
src/vector/vector_handler.cpp (4)
  • Instance (125-132)
  • Instance (125-125)
  • InitHandlerInstance (92-118)
  • InitHandlerInstance (92-96)
src/redis_server.cpp (1)
  • config_reader (441-441)
🔇 Additional comments (2)
src/redis_service.cpp (2)

186-259: Engine registration in Init() looks consistent with prebuilt tables & metrics

The new block that:

  • builds prebuilt_tables for databases Redis DBs,
  • defines engine_metrics from EloqKV::command_types, and
  • calls DataSubstrate::Instance().RegisterEngine(TableEngine::EloqKv, ...)

is coherent with the rest of the class state (redis_table_names_, metrics names) and keeps Redis-specific engine wiring localized in RedisServiceImpl::Init().

Given only one RedisServiceImpl is constructed in main(), repeated registration is not a concern in the current flow.


6051-6055: MaxConnectionCount() now depends on DataSubstrate core config

Switching MaxConnectionCount() to:

auto &ds = DataSubstrate::Instance();
return ds.GetCoreConfig().maxclients;

is consistent with the broader DS‑driven configuration approach and should keep connection limits aligned with DS settings, as long as this is only called after DataSubstrate::Init() (which main() currently guarantees).

No change requested; just noting the dependency so future call sites don’t invoke this before DS initialization.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

♻️ Duplicate comments (1)
src/redis_server.cpp (1)

534-535: CRITICAL: Reverse shutdown order to prevent use-after-free.

The current shutdown order calls DataSubstrate::Instance().Shutdown() (line 534) before redis_service_ptr->Stop() (line 535). This is incorrect because:

  1. The metrics collector thread is stopped and joined in RedisServiceImpl::Stop()
  2. This thread accesses metrics::redis_meter, which is initialized and managed by DataSubstrate (as noted in Start())
  3. Tearing down DataSubstrate first can cause the metrics thread to access freed/invalid metrics state

This matches the concern raised in the past review comment. All error paths in this file correctly call redis_service_ptr->Stop() before DataSubstrate::Instance().Shutdown(). The normal shutdown path should follow the same pattern.

Apply this fix:

-    DataSubstrate::Instance().Shutdown();
-    redis_service_ptr->Stop();
+    redis_service_ptr->Stop();
+    DataSubstrate::Instance().Shutdown();
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 1f3f5fc and b6f8723.

📒 Files selected for processing (4)
  • data_substrate (1 hunks)
  • include/redis_service.h (1 hunks)
  • src/redis_server.cpp (4 hunks)
  • src/redis_service.cpp (4 hunks)
✅ Files skipped from review due to trivial changes (1)
  • data_substrate
🚧 Files skipped from review as they are similar to previous changes (1)
  • include/redis_service.h
🧰 Additional context used
🧬 Code graph analysis (1)
src/redis_service.cpp (2)
src/vector/vector_handler.cpp (4)
  • Instance (125-132)
  • Instance (125-125)
  • InitHandlerInstance (92-118)
  • InitHandlerInstance (92-96)
src/redis_server.cpp (1)
  • config_reader (441-441)
🔇 Additional comments (3)
src/redis_service.cpp (2)

6053-6054: LGTM!

The transition from global DataSubstrate access to instance-based access via DataSubstrate::Instance() is correct and aligns with the refactoring objectives.


186-258: The engine registration timing is correct as implemented. EnableEngine() (redis_server.cpp:461) enables the engine capability in DataSubstrate, and RegisterEngine() (redis_service.cpp:249) is subsequently called within Init() to provide the actual implementation (tables, metrics, factory). This two-step pattern is intentional and logical—no conflict exists.

src/redis_server.cpp (1)

452-500: LGTM! Multi-step startup sequence is well-structured.

The new phased initialization sequence clearly separates concerns:

  1. Initialize DataSubstrate instance
  2. Enable and register the EloqKv engine
  3. Start DataSubstrate (brings up substrate engines)
  4. Start Redis service (binds runtime dependencies)

This provides better error handling and makes the dependencies explicit. Each step properly cleans up on failure by calling Stop() and Shutdown() in the correct order.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

♻️ Duplicate comments (2)
src/redis_service.cpp (1)

446-570: Start() sequencing is good, but eloq_data_path usage is still unresolved and will likely break the build

The new RedisServiceImpl::Start(brpc::Server &) generally follows a solid pattern:

  • Attaches to TxService and (optionally) DataStoreHandler only after DataSubstrate::Start().
  • Sizes slow‑log structures off core_num_, and uses DS network config to derive redis_ip_port.
  • Starts the metrics collector thread only when metrics are enabled.
  • Handles bootstrap mode by calling Stop(), ds.Shutdown(), and exiting.

However, in the vector‑index initialization block (roughly Line 560), you still pass eloq_data_path into EloqVec::VectorHandler::InitHandlerInstance without any visible declaration in this translation unit:

if (!EloqVec::VectorHandler::InitHandlerInstance(
        tx_service_,
        vector_index_worker_pool_.get(),
        eloq_data_path,
        &vector_cloud_config))

Unless eloq_data_path is declared in one of the included headers (as an extern or similar), this will be a compile‑time error. Even if it is declared elsewhere, you need to ensure that it is initialized before Start() is called and actually represents the correct vector‑index data path.

You probably want either:

  • To obtain the data path from DataSubstrate (e.g., a core or storage config field), or
  • To store the required path as a member of RedisServiceImpl during Init() (derived from the config file) and use that member here instead of a free variable.

For example, if you decide to store it as a member:

-        if (!EloqVec::VectorHandler::InitHandlerInstance(
-            tx_service_,
-            vector_index_worker_pool_.get(),
-            eloq_data_path,
-            &vector_cloud_config))
+        if (!EloqVec::VectorHandler::InitHandlerInstance(
+                tx_service_,
+                vector_index_worker_pool_.get(),
+                eloq_data_path_,  // member set in Init() from config
+                &vector_cloud_config))

and add an appropriate std::string eloq_data_path_; member plus initialization.

#!/bin/bash
# Verify that `eloq_data_path` is properly declared and defined.

echo "=== Search for any declaration/definition of eloq_data_path ==="
rg -n "eloq_data_path" --type cpp -C3 || true

echo
echo "If you only see uses inside redis_service.cpp and no declaration such as"
echo "\"extern std::string eloq_data_path;\" in a header or a definition in another"
echo "translation unit, you'll need to introduce a proper member or config-based"
echo "source for this path as suggested in the review comment."
src/redis_server.cpp (1)

534-535: Normal shutdown should stop Redis before tearing down DataSubstrate (reopens prior concern)

Here the shutdown order is reversed compared to your error paths:

DataSubstrate::Instance().Shutdown();
redis_service_ptr->Stop();

Since RedisServiceImpl::Stop() is what stops and joins the metrics collector thread, which in turn uses metrics state initialized/owned by DataSubstrate, calling Shutdown() first risks the metrics thread running against torn‑down DataSubstrate state during shutdown.

To keep shutdown symmetric and avoid that race, stop Redis (and its background threads) before shutting down DataSubstrate:

-    DataSubstrate::Instance().Shutdown();
-    redis_service_ptr->Stop();
+    redis_service_ptr->Stop();
+    DataSubstrate::Instance().Shutdown();

This matches the error paths above and the bootstrap path in RedisServiceImpl::Start().

🧹 Nitpick comments (1)
src/redis_service.cpp (1)

186-257: Engine registration and metrics wiring via DataSubstrate look good; consider guarding databases for reserve

The move to build prebuilt_tables and engine_metrics and register the EloqKv engine through DataSubstrate::RegisterEngine is clean and keeps Redis‑specific setup localized in Init().

One minor robustness point: prebuilt_tables.reserve(databases); uses an int that comes from config. If databases were misconfigured to a negative value, it would be converted to a huge size_t and cause reserve() to throw instead of simply resulting in zero tables as before. You could defensively clamp or guard it, e.g.:

-    std::vector<std::pair<txservice::TableName, std::string>> prebuilt_tables;
-    prebuilt_tables.reserve(databases);
+    std::vector<std::pair<txservice::TableName, std::string>> prebuilt_tables;
+    if (databases > 0)
+    {
+        prebuilt_tables.reserve(databases);
+    }

Not critical, but would make behavior under bad config inputs more predictable.

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between b6f8723 and 55cd0b9.

📒 Files selected for processing (4)
  • data_substrate (1 hunks)
  • include/redis_service.h (1 hunks)
  • src/redis_server.cpp (4 hunks)
  • src/redis_service.cpp (4 hunks)
✅ Files skipped from review due to trivial changes (1)
  • data_substrate
🧰 Additional context used
🧬 Code graph analysis (1)
include/redis_service.h (1)
src/redis_service.cpp (2)
  • Start (446-570)
  • Start (446-446)
🔇 Additional comments (4)
include/redis_service.h (1)

168-170: Two‑phase Start API for RedisServiceImpl looks consistent with usage

The added Start(brpc::Server &) and its docstring match the new startup sequence in redis_server.cpp; the lifecycle surface here looks good.

src/redis_server.cpp (2)

452-500: Startup refactor (DataSubstrate Init/Start + Redis Init/Start) reads clean

Sequencing InitEnableEngine(EloqKv)DataSubstrate::Start()RedisServiceImpl::Start(server) with consistent failure cleanup looks correct and matches the intended two‑phase initialization.


507-512: Error path on server.Start preserves safe teardown order

On listen failure, calling redis_service_ptr->Stop() before DataSubstrate::Instance().Shutdown() keeps the Redis metrics thread and other Redis runtime work from touching DataSubstrate after it's torn down. This ordering is appropriate.

src/redis_service.cpp (1)

6053-6055: Using DataSubstrate core config for MaxConnectionCount is appropriate

MaxConnectionCount() now delegating to DataSubstrate::Instance().GetCoreConfig().maxclients keeps the limit centralized in the substrate’s configuration and is consistent with the broader refactor. Given the startup sequence in redis_server.cpp, this should always be called after DataSubstrate is initialized and started.

@liunyl liunyl force-pushed the eloqdb branch 2 times, most recently from abb339f to 92ade76 Compare December 3, 2025 09:53
@liunyl liunyl merged commit 3fbc407 into main Dec 3, 2025
4 of 5 checks passed
@liunyl liunyl deleted the eloqdb branch December 3, 2025 09:58
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants