@lqvp (Owner) commented Jul 19, 2025

Migrate database from LokiJS (memory.json) to SQLite while maintaining API compatibility.

This migration addresses issue #134 by replacing the in-memory LokiJS database with a more robust SQLite solution using better-sqlite3. The DatabaseManager has been rewritten to maintain the existing LokiJS API, minimizing changes to other modules. It also includes automatic migration from memory.json on first run, performance optimizations (WAL mode, prepared statements, transactions), and new database management features like backup and optimization.

Co-authored-by: rei.mizushima <rei.mizushima@go.sfcollege.edu>

@coderabbitai (bot, Contributor) commented Jul 19, 2025

Warning

Rate limit exceeded

@lqvp has exceeded the limit for the number of commits or files that can be reviewed per hour. Please wait 25 minutes and 6 seconds before requesting another review.

⌛ How to resolve this issue?

After the wait time has elapsed, a review can be triggered using the @coderabbitai review command as a PR comment. Alternatively, push new commits to this PR.

We recommend that you space out your commits to avoid hitting the rate limit.

📥 Commits

Reviewing files that changed from the base of the PR and between 6aab9ec and 077148b.

📒 Files selected for processing (18)
  • .gitignore (1 hunks)
  • MIGRATION-GUIDE.md (1 hunks)
  • README.md (1 hunks)
  • SQLITE-MIGRATION.md (1 hunks)
  • example.config.toml (1 hunks)
  • package.json (3 hunks)
  • src/ai.ts (3 hunks)
  • src/database/DatabaseManager.test.ts (1 hunks)
  • src/database/DatabaseManager.ts (4 hunks)
  • src/database/performance.test.ts (1 hunks)
  • src/index.ts (2 hunks)
  • src/migrate-to-sqlite.ts (1 hunks)
  • src/modules/aichat/index.ts (2 hunks)
  • src/modules/check-custom-emojis/index.ts (1 hunks)
  • src/modules/guessing-game/index.ts (1 hunks)
  • src/modules/kazutori/index.ts (2 hunks)
  • src/modules/keyword/index.ts (1 hunks)
  • src/modules/reminder/index.ts (1 hunks)

@gemini-code-assist (bot) left a comment

Summary of Changes

Hello @lqvp, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request significantly upgrades the application's data persistence by transitioning from an in-memory LokiJS database to a more robust and scalable SQLite solution. The change aims to improve overall performance and reliability, while introducing new database management capabilities like backup and optimization, all without breaking existing module integrations.

Highlights

  • Database Migration: The core database persistence layer has been migrated from LokiJS (using memory.json) to SQLite (using ai.db). This provides a more robust and performant solution.
  • API Compatibility: The DatabaseManager has been completely rewritten to use better-sqlite3 while maintaining a LokiJS-compatible API. This minimizes changes required in other modules that interact with the database.
  • Automatic and Manual Migration: The system now supports automatic migration of existing memory.json data to SQLite on the first run. A new pnpm migrate command has also been added for manual migration.
  • Performance Enhancements: The new SQLite implementation includes performance optimizations such as WAL mode for better concurrency, prepared statement caching, and transaction support for bulk operations.
  • New Database Management Features: Added new functionalities for database management, including backup, optimization (VACUUM), and retrieval of database statistics. Graceful shutdown has also been implemented to ensure proper database closure.
  • Comprehensive Testing: New unit tests (DatabaseManager.test.ts) and performance benchmarks (performance.test.ts) have been added to ensure the reliability and efficiency of the new database layer.

@gemini-code-assist (bot) left a comment

Code Review

This PR migrates the database from LokiJS to SQLite while maintaining API compatibility. The new DatabaseManager is well-structured, and the added database backup, optimization, and graceful shutdown features are excellent improvements. This review focuses on type safety, performance, robustness, and API clarity to further improve the new implementation.

Comment on lines +440 to +451
find(query?: any): T[] {
  const stmt = this.stmtCache.get('find')!;
  const rows = stmt.all(this.name) as any[];

  let results = rows.map(row => JSON.parse(row.data) as T);

  if (query && Object.keys(query).length > 0) {
    results = results.filter(doc => this.matchesQuery(doc, query));
  }

  return results;
}

critical

The find method fetches all documents for a collection into memory and then filters them using JavaScript. This approach does not scale and will cause significant performance degradation and high memory usage as collections grow. It undermines one of the key benefits of moving to a real database like SQLite.

To fix this, you should translate the query object into a SQL WHERE clause to let the database do the filtering. The current matchesQuery is also very limited and doesn't support the range of operators LokiJS does (e.g., $gt, $in). A proper implementation is critical for both performance and API compatibility.
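
A rough sketch of what pushing the filter into SQL could look like. The helper name `buildWhere` and the supported operator subset are assumptions made for illustration and are not part of this PR; the `data` column name is taken from the snippets quoted in this review. Field names are validated because they are interpolated into the JSON path, and all values are bound as parameters.

```typescript
// Hypothetical helper (not in the PR): translates a small subset of
// LokiJS-style queries into a parameterized SQL fragment.
type Scalar = string | number | boolean | null;
type Operators = { $gt?: number; $gte?: number; $lt?: number; $lte?: number; $in?: Scalar[] };
type Query = Record<string, Scalar | Operators>;
type Bindable = string | number | null;

// json_extract yields 1/0 for JSON true/false, so booleans are bound as integers.
function toBindable(value: Scalar): Bindable {
  return typeof value === "boolean" ? (value ? 1 : 0) : value;
}

function comparisonFor(op: string): string | undefined {
  switch (op) {
    case "$gt": return ">";
    case "$gte": return ">=";
    case "$lt": return "<";
    case "$lte": return "<=";
    default: return undefined;
  }
}

function buildWhere(query: Query): { sql: string; params: Bindable[] } {
  const clauses: string[] = [];
  const params: Bindable[] = [];

  for (const field of Object.keys(query)) {
    // The field name ends up inside the SQL text, so only plain identifiers are allowed.
    if (!/^[A-Za-z_][A-Za-z0-9_]*$/.test(field)) {
      throw new Error(`Unsupported field name: ${field}`);
    }
    const column = `json_extract(data, '$.${field}')`;
    const condition = query[field];

    if (condition === null) {
      clauses.push(`${column} IS NULL`);
    } else if (typeof condition !== "object") {
      clauses.push(`${column} = ?`);
      params.push(toBindable(condition));
    } else {
      const operators = condition as Record<string, unknown>;
      for (const op of Object.keys(operators)) {
        const operand = operators[op];
        const comparison = comparisonFor(op);
        if (op === "$in" && Array.isArray(operand)) {
          const marks = operand.map(() => "?").join(", ");
          // An empty $in list can never match, so emit a constant false.
          clauses.push(operand.length > 0 ? `${column} IN (${marks})` : "0");
          params.push(...(operand as Scalar[]).map(toBindable));
        } else if (comparison !== undefined && typeof operand === "number") {
          clauses.push(`${column} ${comparison} ?`);
          params.push(operand);
        } else {
          throw new Error(`Unsupported operator: ${op}`);
        }
      }
    }
  }
  // An empty query matches everything.
  return { sql: clauses.length > 0 ? clauses.join(" AND ") : "1", params };
}
```

The resulting fragment could be appended to the existing `collection_name = ?` condition and the `params` array spread into the statement call. Comparison semantics in SQLite may differ from LokiJS for mixed types, so the existing tests would need to confirm equivalence.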

Comment on lines +379 to 380
public getCollection(name: string, opts?: any): any {
  return this.dbManager.getCollection(name, opts);

high

The return type of getCollection has been changed to any, which weakens type safety for all modules. Once SQLiteCollection is exported from DatabaseManager.ts, you should use it here as the return type.

Suggested change
- public getCollection(name: string, opts?: any): any {
-   return this.dbManager.getCollection(name, opts);
+ public getCollection<T extends object = any>(name: string, opts?: any): SQLiteCollection<T> {
+   return this.dbManager.getCollection<T>(name, opts);
+ }

Comment on lines +12 to +26
interface SQLiteCollection<T extends object = any> {
  find(query?: any): T[];
  findOne(query?: any): T | null;
  insertOne(doc: T): T;
  insert(docs: T[]): T[];
  update(doc: T): void;
  updateWhere(filterFunction: (obj: T) => boolean, updateFunction: (obj: T) => T): void;
  remove(doc: T): void;
  removeWhere(query: any): void;
  clear(): void;
  count(): number;
  chain(): any;
  data: T[];
  name: string;
}

high

The SQLiteCollection interface is a great way to maintain compatibility with the LokiJS API. However, it's not exported from this module. This forces consumers like ai.ts and all the modules to use any as the type for collections, which sacrifices type safety across the application. Please export this interface so it can be used throughout the codebase.

Suggested change
- interface SQLiteCollection<T extends object = any> {
+ export interface SQLiteCollection<T extends object = any> {
    find(query?: any): T[];
    findOne(query?: any): T | null;
    insertOne(doc: T): T;
    insert(docs: T[]): T[];
    update(doc: T): void;
    updateWhere(filterFunction: (obj: T) => boolean, updateFunction: (obj: T) => T): void;
    remove(doc: T): void;
    removeWhere(query: any): void;
    clear(): void;
    count(): number;
    chain(): any;
    data: T[];
    name: string;
  }

Comment on lines 294 to +297
  public getCollection<T extends object = any>(
    name: string,
    opts?: any
- ): loki.Collection<T> {
-   let collection = this.db.getCollection<T>(name);
-
-   if (collection === null) {
-     collection = this.db.addCollection<T>(name, opts);
+ ): SQLiteCollection<T> {

high

The getCollection method accepts an opts parameter, which in LokiJS is used to define indices. However, the SQLiteCollectionImpl implementation completely ignores this parameter. No SQL indexes are created based on the provided options.

This is misleading because developers will assume they are creating performance-enhancing indexes when they are not. This can lead to unexpected performance issues.

Please either:

  1. Implement logic to parse the indices option and create the corresponding SQL indexes on the JSON data (e.g., using json_extract).
  2. Remove the opts parameter and document that this feature is not supported to avoid confusion.
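
For the first option, a minimal sketch of how an `indices` array might be turned into SQLite expression indexes. The helper name `buildIndexStatements` and the index naming scheme are assumptions, not part of this PR; the `collections` table and its `collection_name` and `data` columns are taken from the snippets quoted in this review.

```typescript
// Hypothetical helper (not in the PR): turns LokiJS-style `indices` field
// names into CREATE INDEX statements over the JSON `data` column.
function buildIndexStatements(indices: string[]): string[] {
  return indices.map((field) => {
    // The field name is interpolated into SQL, so only plain identifiers are allowed.
    if (!/^[A-Za-z_][A-Za-z0-9_]*$/.test(field)) {
      throw new Error(`Unsupported index field: ${field}`);
    }
    return (
      `CREATE INDEX IF NOT EXISTS idx_collections_${field} ` +
      `ON collections (collection_name, json_extract(data, '$.${field}'))`
    );
  });
}
```

Each statement could then be executed once when the collection is first opened. SQLite only considers an expression index when a query uses the same expression as the index definition, so lookups would need to filter on the identical `json_extract(data, '$.field')` form.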

Comment on lines +401 to +403
const maxId = this.db.prepare(
  'SELECT MAX(CAST(json_extract(data, "$.id") AS INTEGER)) as max_id FROM collections WHERE collection_name = ?'
).get(name) as any;

high

The query that initializes idCounter assumes the id field in the JSON data is always an integer, because it uses CAST(... AS INTEGER). When SQLite casts text to INTEGER it keeps only the longest leading integer prefix and returns 0 if there is none, so a non-integer id such as a UUID is cast to 0 or to whatever digits it happens to start with. This can lead to an incorrect idCounter and subsequent ID collisions when new documents are inserted.

This logic needs to be more robust. If ids can be non-integers, you should reconsider how idCounter is managed or ensure you only handle integer IDs.
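
One possible way to handle only integer IDs is to filter on the JSON type before taking the maximum. The constant `MAX_INTEGER_ID_SQL`, the helper `nextIdCounter`, and the choice to start at 1 for an empty collection are assumptions for illustration, not the PR's code.

```typescript
// Hypothetical query (not in the PR): json_type restricts MAX to documents
// whose id is stored as a JSON integer, so UUID-style ids are skipped.
const MAX_INTEGER_ID_SQL = `
  SELECT MAX(json_extract(data, '$.id')) AS max_id
  FROM collections
  WHERE collection_name = ?
    AND json_type(data, '$.id') = 'integer'
`;

// Derives the next counter value from the row returned by the query above.
// Assumption: counters start at 1 when no integer id exists yet.
function nextIdCounter(row: { max_id: number | null } | undefined): number {
  const maxId = row?.max_id;
  return typeof maxId === "number" && isFinite(maxId) ? maxId + 1 : 1;
}
```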

Comment on lines 1 to -6
node_modules
npm-debug.log
config.toml
memory.json
font.ttf

medium

It appears serifs.toml and font.ttf were removed from the .gitignore file in this change. These files seem to be user-provided or user-configurable and likely should not be tracked in version control. Was this intentional? If they should be ignored, please add them back.

Comment on lines +28 to +36
afterEach(() => {
  // Clean up
  if (fs.existsSync(testDbPath)) {
    fs.unlinkSync(testDbPath);
  }
  if (fs.existsSync(testMigrationMarker)) {
    fs.unlinkSync(testMigrationMarker);
  }
});

medium

The afterEach hook cleans up the test database files but doesn't close the database connection held by dbManager. This can lead to resource leaks during the test run and may cause issues with file cleanup on some operating systems (like Windows, which can lock open files).

Please call dbManager.close() in afterEach to ensure the connection is properly terminated. The performance.test.ts file already does this, so it would be good to be consistent.

Suggested change
  afterEach(() => {
+   dbManager.close();
    // Clean up
    if (fs.existsSync(testDbPath)) {
      fs.unlinkSync(testDbPath);
    }
    if (fs.existsSync(testMigrationMarker)) {
      fs.unlinkSync(testMigrationMarker);
    }
  });

Comment on lines +154 to +162
if (collection.data && Array.isArray(collection.data)) {
  for (const doc of collection.data) {
    // Remove LokiJS metadata
    const cleanDoc = { ...doc };
    delete cleanDoc.$loki;
    delete cleanDoc.meta;
    sqliteCollection.insertOne(cleanDoc);
  }
}

medium

The migrateFromLokiJS method inserts documents one by one in a loop. For a large memory.json file, this will be slow because each insertOne call is a separate database transaction.

To significantly improve migration performance, you should batch the documents for each collection and use the insert method, which correctly uses a transaction for bulk insertion.
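
A minimal sketch of the batched variant, assuming the target collection exposes the `insert(docs)` method from the SQLiteCollection interface. The `BulkInsertable` interface and the function names are hypothetical stand-ins so the example stays self-contained.

```typescript
// Hypothetical stand-in for the part of SQLiteCollection used here.
interface BulkInsertable<T> {
  insert(docs: T[]): T[];
}

type Doc = Record<string, unknown>;

// Copies a document and drops the LokiJS bookkeeping fields.
function stripLokiMetadata(doc: Doc): Doc {
  const clean: Doc = { ...doc };
  delete clean.$loki;
  delete clean.meta;
  return clean;
}

// Cleans every document first, then hands the whole batch to a single
// insert() call instead of calling insertOne() once per document.
function migrateCollection(source: { data?: Doc[] }, target: BulkInsertable<Doc>): number {
  if (!Array.isArray(source.data) || source.data.length === 0) return 0;
  const cleanDocs = source.data.map(stripLokiMetadata);
  target.insert(cleanDocs);
  return cleanDocs.length;
}
```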

Comment on lines +104 to +134
  const stmt = this.sqliteDb.prepare(
    'INSERT OR REPLACE INTO collections (collection_name, document_id, data, created_at, updated_at) VALUES (?, ?, ?, ?, ?)'
  );

  let count = 0;
  for (const doc of collection.data) {
    // Clean up LokiJS metadata
    const cleanDoc = { ...doc };
    delete cleanDoc.$loki;
    delete cleanDoc.meta;

    // Determine document ID
    const documentId = cleanDoc.id || cleanDoc._id || cleanDoc.userId || count;

    // Get timestamps from LokiJS metadata if available
    const createdAt = doc.meta?.created || Date.now();
    const updatedAt = doc.meta?.updated || Date.now();

    stmt.run(
      collection.name,
      String(documentId),
      JSON.stringify(cleanDoc),
      createdAt,
      updatedAt
    );

    count++;
  }

  this.log(chalk.green(` Migrated ${count} documents`));
}

medium

The manual migration script inserts documents one by one, which will be slow for large databases. You should wrap the loop that inserts documents for a collection within a single database transaction to dramatically improve performance.
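
A sketch of wrapping the per-collection loop in one transaction. The `Statement` and `TransactionalDb` interfaces are hypothetical stand-ins shaped after the `prepare` and `transaction` methods of better-sqlite3 so the example stays self-contained; the `Row` type and function name are also assumptions.

```typescript
// Hypothetical structural types modelled on better-sqlite3's prepare/transaction.
interface Statement {
  run(...params: unknown[]): unknown;
}
interface TransactionalDb {
  prepare(sql: string): Statement;
  transaction<A extends unknown[], R>(fn: (...args: A) => R): (...args: A) => R;
}
interface Row {
  documentId: string;
  data: string;
  createdAt: number;
  updatedAt: number;
}

// Prepares the statement once and runs every insert for a collection inside
// a single transaction, so there is one commit per collection, not per row.
function insertAllInOneTransaction(db: TransactionalDb, collectionName: string, rows: Row[]): number {
  const stmt = db.prepare(
    "INSERT OR REPLACE INTO collections (collection_name, document_id, data, created_at, updated_at) VALUES (?, ?, ?, ?, ?)"
  );
  const insertAll = db.transaction((batch: Row[]) => {
    for (const row of batch) {
      stmt.run(collectionName, row.documentId, row.data, row.createdAt, row.updatedAt);
    }
    return batch.length;
  });
  return insertAll(rows);
}
```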

delete cleanDoc.meta;

// Determine document ID
const documentId = cleanDoc.id || cleanDoc._id || cleanDoc.userId || count;

medium

The logic for determining documentId falls back to the loop counter count. This is risky and can cause ID collisions if some documents have an explicit numeric ID that happens to match a counter value.

A more robust approach is to generate a new unique ID when a reliable one (id, _id, userId) isn't found. You can use the uuid library, which is already a dependency.

Suggested change
- const documentId = cleanDoc.id || cleanDoc._id || cleanDoc.userId || count;
+ const documentId = cleanDoc.id || cleanDoc._id || cleanDoc.userId || uuid();
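
A small sketch of the same idea with nullish checks, so that an explicit id of 0 is also kept. To stay self-contained it uses Node's built-in `randomUUID` from `node:crypto` rather than the `uuid` package named in the suggestion; the function name `resolveDocumentId` is hypothetical.

```typescript
import { randomUUID } from "node:crypto";

// Hypothetical helper (not in the PR): picks the first id-like field that is
// neither null nor undefined, and otherwise generates a fresh UUID.
function resolveDocumentId(doc: { id?: unknown; _id?: unknown; userId?: unknown }): string {
  // `??` only falls through on null/undefined, unlike `||`, which also skips 0 and "".
  return String(doc.id ?? doc._id ?? doc.userId ?? randomUUID());
}
```

Note that with `??` an empty-string id would also be kept as-is, which differs from the `||` chain in the original line.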

@cursor (bot) left a comment

Bug: Incorrect API Key Check in AI Module

The aichat module's install method incorrectly checks config.openAiApiKey. Since this is a Gemini AI module, it should check config.gemini?.apiKey or config.gemini?.enabled instead. This prevents the module from installing and functioning even when Gemini is properly configured.

src/modules/aichat/index.ts#L146-L147

public install() {
  if (config.openAiApiKey == null) return {};

Bug: Data Migration Issue Causes Inaccessibility

The guessing game module's data collection name was changed from 'guessingGame' to '_guesses'. However, the system's migration process preserves original collection names. This causes existing guessing game data to become inaccessible, as the new code expects data in the '_guesses' collection while migrated data remains under 'guessingGame', leading to data loss for existing users.

src/modules/guessing-game/index.ts#L12-L13

public install() {
  this.guesses = this.ai.getCollection('_guesses', {

Bug: Emoji Data Loss Due to Collection Name Change

The database collection name for emoji tracking was changed from 'lastEmoji' to '_lastEmoji' without corresponding migration logic. This causes existing emoji tracking data to be lost for users migrating from memory.json, as the application will create a new empty '_lastEmoji' collection instead of using the previously stored 'lastEmoji' data.

src/modules/check-custom-emojis/index.ts#L15-L16

if (!config.checkEmojisEnabled) return {};
this.lastEmoji = this.ai.getCollection('_lastEmoji', {});
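
Both collection-rename reports above could be addressed by mapping legacy names to the new ones while migrating. This is only a sketch: the `COLLECTION_RENAMES` table and `targetCollectionName` helper are hypothetical, and the two name pairs are the ones mentioned in these reports.

```typescript
// Hypothetical rename table (not in the PR): legacy LokiJS collection name
// mapped to the name the updated modules now request.
const COLLECTION_RENAMES: Record<string, string> = {
  guessingGame: "_guesses",
  lastEmoji: "_lastEmoji",
};

// Returns the name a migrated collection should be stored under;
// collections without an entry keep their original name.
function targetCollectionName(legacyName: string): string {
  return Object.prototype.hasOwnProperty.call(COLLECTION_RENAMES, legacyName)
    ? COLLECTION_RENAMES[legacyName]
    : legacyName;
}
```

The migration step would call `targetCollectionName(collection.name)` when choosing where to write each collection's documents.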

Fix in CursorFix in Web


Bug: ID Overwrite for Valid Zero IDs

The ID generation logic in SQLiteCollectionImpl (methods insertOne and insert) incorrectly assigns new IDs to documents that have a valid id or _id of 0. This occurs because the code uses loose truthy/falsy evaluations (e.g., !docWithId.id or docWithId.id || ...) instead of strict checks for undefined or null, causing 0 to be treated as a missing ID and subsequently overwritten.

src/database/DatabaseManager.ts#L458-L467

insertOne(doc: T): T {
  const docWithId = { ...doc } as any;
  if (!docWithId.id && !docWithId._id) {
    docWithId.id = this.idCounter++;
  }
  const documentId = docWithId.id || docWithId._id || this.idCounter++;
  const stmt = this.stmtCache.get('insert')!;

src/database/DatabaseManager.ts#L476-L490

const insertMany = this.db.transaction((documents: T[]) => {
  const results: T[] = [];
  for (const doc of documents) {
    const docWithId = { ...doc } as any;
    if (!docWithId.id && !docWithId._id) {
      docWithId.id = this.idCounter++;
    }
    const documentId = docWithId.id || docWithId._id || this.idCounter++;
    insertStmt.run(this.name, String(documentId), JSON.stringify(docWithId), Date.now());
    results.push(docWithId);
  }
  return results;
});
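
A sketch of the strict check described in this report. The helper `withDocumentId` and its `nextId` callback are hypothetical and stand in for the `idCounter` logic in insertOne and insert.

```typescript
type Doc = Record<string, unknown>;

// Hypothetical helper (not in the PR): assigns a generated id only when both
// id and _id are null or undefined, so a valid id of 0 is left untouched.
function withDocumentId(doc: Doc, nextId: () => number): { doc: Doc; documentId: string } {
  const copy: Doc = { ...doc };
  // `== null` matches only null and undefined, not 0 or "".
  if (copy.id == null && copy._id == null) {
    copy.id = nextId();
  }
  return { doc: copy, documentId: String(copy.id ?? copy._id) };
}
```

Because the id is resolved once, the counter advances at most one step per inserted document.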
