Database migration from LokiJS to SQLite #190
Co-authored-by: rei.mizushima <rei.mizushima@go.sfcollege.edu>
Summary of Changes
Hello @lqvp, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!
This pull request significantly upgrades the application's data persistence by transitioning from an in-memory LokiJS database to a more robust and scalable SQLite solution. The change aims to improve overall performance and reliability, while introducing new database management capabilities like backup and optimization, all without breaking existing module integrations.
Highlights
- Database Migration: The core database persistence layer has been migrated from LokiJS (using memory.json) to SQLite (using ai.db). This provides a more robust and performant solution.
- API Compatibility: The DatabaseManager has been completely rewritten to use better-sqlite3 while maintaining a LokiJS-compatible API. This minimizes changes required in other modules that interact with the database.
- Automatic and Manual Migration: The system now supports automatic migration of existing memory.json data to SQLite on the first run. A new pnpm migrate command has also been added for manual migration.
- Performance Enhancements: The new SQLite implementation includes performance optimizations such as WAL mode for better concurrency, prepared statement caching, and transaction support for bulk operations.
- New Database Management Features: Added new functionalities for database management, including backup, optimization (VACUUM), and retrieval of database statistics. Graceful shutdown has also been implemented to ensure proper database closure.
- Comprehensive Testing: New unit tests (DatabaseManager.test.ts) and performance benchmarks (performance.test.ts) have been added to ensure the reliability and efficiency of the new database layer.
Code Review
The code changes migrate the database from LokiJS to SQLite while maintaining API compatibility. The new DatabaseManager is well-structured, and the added features, such as database backup, optimization, and graceful shutdown, are excellent improvements. The review focuses on type safety, performance, robustness, and API clarity to further enhance the new implementation.
    find(query?: any): T[] {
      const stmt = this.stmtCache.get('find')!;
      const rows = stmt.all(this.name) as any[];

      let results = rows.map(row => JSON.parse(row.data) as T);

      if (query && Object.keys(query).length > 0) {
        results = results.filter(doc => this.matchesQuery(doc, query));
      }

      return results;
    }
The find method fetches all documents for a collection into memory and then filters them using JavaScript. This approach does not scale and will cause significant performance degradation and high memory usage as collections grow. It undermines one of the key benefits of moving to a real database like SQLite.
To fix this, you should translate the query object into a SQL WHERE clause to let the database do the filtering. The current matchesQuery is also very limited and doesn't support the range of operators LokiJS does (e.g., $gt, $in). A proper implementation is critical for both performance and API compatibility.
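As a rough illustration of the translation suggested here, the sketch below turns a small subset of LokiJS-style queries (plain equality, $gt, $gte, $lt, $lte, $ne, $in) into a WHERE fragment using json_extract. The helper name buildWhere and its types are hypothetical and not part of this PR; it also assumes documents live in a JSON `data` column, as the other queries in this PR suggest, and it only handles top-level fields with string, number, or null values.

```typescript
// Hypothetical helper, not part of the PR: translates a small subset of
// LokiJS-style queries into a WHERE fragment over the JSON `data` column.
type Scalar = string | number | null;
type Operators = {
  $gt?: number; $gte?: number; $lt?: number; $lte?: number;
  $ne?: Scalar; $in?: Scalar[];
};
type Query = Record<string, Scalar | Operators>;

const COMPARISONS = new Map<string, string>([
  ['$gt', '>'], ['$gte', '>='], ['$lt', '<'], ['$lte', '<='],
]);

function buildWhere(query: Query): { where: string; params: Scalar[] } {
  const clauses: string[] = [];
  const params: Scalar[] = [];

  for (const [field, cond] of Object.entries(query)) {
    // The field name is interpolated into the JSON path, so restrict it.
    if (!/^[A-Za-z_][A-Za-z0-9_]*$/.test(field)) {
      throw new Error(`Unsupported field name: ${field}`);
    }
    const expr = `json_extract(data, '$.${field}')`;

    if (cond === null) {
      clauses.push(`${expr} IS NULL`);
    } else if (typeof cond !== 'object') {
      // Plain value: equality match.
      clauses.push(`${expr} = ?`);
      params.push(cond);
    } else {
      for (const [op, value] of Object.entries(cond)) {
        const sqlOp = COMPARISONS.get(op);
        if (sqlOp !== undefined) {
          clauses.push(`${expr} ${sqlOp} ?`);
          params.push(value as number);
        } else if (op === '$ne') {
          // IS NOT also behaves sensibly when either side is NULL.
          clauses.push(`${expr} IS NOT ?`);
          params.push(value as Scalar);
        } else if (op === '$in' && Array.isArray(value)) {
          // An empty $in matches nothing.
          clauses.push(value.length > 0
            ? `${expr} IN (${value.map(() => '?').join(', ')})`
            : '0');
          params.push(...(value as Scalar[]));
        } else {
          throw new Error(`Unsupported operator: ${op}`);
        }
      }
    }
  }
  // An empty query matches every row.
  return { where: clauses.length > 0 ? clauses.join(' AND ') : '1', params };
}
```

The returned fragment would be appended to the existing `collection_name = ?` condition, with `params` bound after the collection name. Values are always passed as bound parameters; only the validated field name is interpolated into the SQL text.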
    public getCollection(name: string, opts?: any): any {
      return this.dbManager.getCollection(name, opts);
The return type of getCollection has been changed to any, which weakens type safety for all modules. Once SQLiteCollection is exported from DatabaseManager.ts, you should use it here as the return type.
Suggested change:

    - public getCollection(name: string, opts?: any): any {
    -   return this.dbManager.getCollection(name, opts);
    + public getCollection<T extends object = any>(name: string, opts?: any): SQLiteCollection<T> {
    +   return this.dbManager.getCollection<T>(name, opts);
      }
    interface SQLiteCollection<T extends object = any> {
      find(query?: any): T[];
      findOne(query?: any): T | null;
      insertOne(doc: T): T;
      insert(docs: T[]): T[];
      update(doc: T): void;
      updateWhere(filterFunction: (obj: T) => boolean, updateFunction: (obj: T) => T): void;
      remove(doc: T): void;
      removeWhere(query: any): void;
      clear(): void;
      count(): number;
      chain(): any;
      data: T[];
      name: string;
    }
The SQLiteCollection interface is a great way to maintain compatibility with the LokiJS API. However, it's not exported from this module. This forces consumers like ai.ts and all the modules to use any as the type for collections, which sacrifices type safety across the application. Please export this interface so it can be used throughout the codebase.
Suggested change:

    - interface SQLiteCollection<T extends object = any> {
    + export interface SQLiteCollection<T extends object = any> {
        find(query?: any): T[];
        findOne(query?: any): T | null;
        insertOne(doc: T): T;
        insert(docs: T[]): T[];
        update(doc: T): void;
        updateWhere(filterFunction: (obj: T) => boolean, updateFunction: (obj: T) => T): void;
        remove(doc: T): void;
        removeWhere(query: any): void;
        clear(): void;
        count(): number;
        chain(): any;
        data: T[];
        name: string;
      }
      public getCollection<T extends object = any>(
        name: string,
        opts?: any
    - ): loki.Collection<T> {
    -   let collection = this.db.getCollection<T>(name);
    -
    -   if (collection === null) {
    -     collection = this.db.addCollection<T>(name, opts);
    + ): SQLiteCollection<T> {
The getCollection method accepts an opts parameter, which in LokiJS is used to define indices. However, the SQLiteCollectionImpl implementation completely ignores this parameter. No SQL indexes are created based on the provided options.
This is misleading because developers will assume they are creating performance-enhancing indexes when they are not. This can lead to unexpected performance issues.
Please either:
- Implement logic to parse the indices option and create the corresponding SQL indexes on the JSON data (e.g., using json_extract).
- Remove the opts parameter and document that this feature is not supported to avoid confusion.
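For the first option, a minimal sketch of how an indices list could be mapped to SQLite expression indexes is shown below. The function name indexStatements and the index naming scheme are hypothetical; the sketch assumes the single `collections` table with `collection_name` and `data` columns seen elsewhere in this PR, and that `opts.indices` is an array of top-level field names.

```typescript
// Hypothetical helper, not part of the PR: builds one CREATE INDEX statement
// per requested field, indexing the value extracted from the JSON document.
function indexStatements(collection: string, indices: string[]): string[] {
  // Both names end up inside SQL text, so only plain identifiers are accepted.
  const identifier = /^[A-Za-z_][A-Za-z0-9_]*$/;
  if (!identifier.test(collection)) {
    throw new Error(`Unsupported collection name: ${collection}`);
  }
  return indices.map((field) => {
    if (!identifier.test(field)) {
      throw new Error(`Unsupported index field: ${field}`);
    }
    return (
      `CREATE INDEX IF NOT EXISTS idx_${collection}_${field} ` +
      `ON collections (collection_name, json_extract(data, '$.${field}'))`
    );
  });
}
```

Each returned statement could be executed once when the collection is first requested. SQLite only uses an expression index when the query contains the same expression, so lookups would need to use an identical json_extract call for the index to help.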
    const maxId = this.db.prepare(
      'SELECT MAX(CAST(json_extract(data, "$.id") AS INTEGER)) as max_id FROM collections WHERE collection_name = ?'
    ).get(name) as any;
The query to initialize idCounter assumes that the id field in the JSON data is always an integer by using CAST(... AS INTEGER). If an id is a non-integer string (e.g., a UUID), this cast will produce 0 in SQLite for non-numeric prefixes. This can lead to an incorrect idCounter and subsequent ID collisions when new documents are inserted.
This logic needs to be more robust. If ids can be non-integers, you should reconsider how idCounter is managed or ensure you only handle integer IDs.
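One way to handle only integer IDs is to ignore every id that is not stored as an integer when seeding the counter. The sketch below shows that rule twice: as a SQL query that filters with typeof, and as a plain function over already loaded ids. Both names are hypothetical, and starting the counter at max + 1 (or 1 for an empty collection) is an assumption, not something taken from the PR.

```typescript
// Assumption: numeric ids are stored as JSON numbers and string ids
// (for example UUIDs) as JSON strings, so typeof() can tell them apart.
const MAX_INTEGER_ID_SQL = `
  SELECT MAX(json_extract(data, '$.id')) AS max_id
  FROM collections
  WHERE collection_name = ?
    AND typeof(json_extract(data, '$.id')) = 'integer'`;

// The same rule in JavaScript: only safe integers influence the counter,
// so strings such as "12abc" or UUIDs can never shift it.
function nextIdCounter(ids: unknown[]): number {
  let max = 0;
  for (const id of ids) {
    if (typeof id === 'number' && Number.isSafeInteger(id) && id > max) {
      max = id;
    }
  }
  return max + 1;
}
```

With this rule a collection that mixes integer and string ids still gets a counter derived from the real integer ids only, instead of from whatever numeric prefix a CAST happens to find.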
    node_modules
    npm-debug.log
    config.toml
    memory.json
    font.ttf
    afterEach(() => {
      // Clean up
      if (fs.existsSync(testDbPath)) {
        fs.unlinkSync(testDbPath);
      }
      if (fs.existsSync(testMigrationMarker)) {
        fs.unlinkSync(testMigrationMarker);
      }
    });
The afterEach hook cleans up the test database files but doesn't close the database connection held by dbManager. This can lead to resource leaks during the test run and may cause issues with file cleanup on some operating systems (like Windows, which can lock open files).
Please call dbManager.close() in afterEach to ensure the connection is properly terminated. The performance.test.ts file already does this, so it would be good to be consistent.
Suggested change:

      afterEach(() => {
    +   dbManager.close();
        // Clean up
        if (fs.existsSync(testDbPath)) {
          fs.unlinkSync(testDbPath);
        }
        if (fs.existsSync(testMigrationMarker)) {
          fs.unlinkSync(testMigrationMarker);
        }
      });
    if (collection.data && Array.isArray(collection.data)) {
      for (const doc of collection.data) {
        // Remove LokiJS metadata
        const cleanDoc = { ...doc };
        delete cleanDoc.$loki;
        delete cleanDoc.meta;
        sqliteCollection.insertOne(cleanDoc);
      }
    }
The migrateFromLokiJS method inserts documents one by one in a loop. For a large memory.json file, this will be slow because each insertOne call is a separate database transaction.
To significantly improve migration performance, you should batch the documents for each collection and use the insert method, which correctly uses a transaction for bulk insertion.
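A minimal sketch of the batched variant follows. It assumes only that the target collection exposes the bulk insert method from the SQLiteCollection interface; the names PlainDoc, BulkInsertable, stripLokiMetadata, and migrateCollectionData are hypothetical and introduced for illustration.

```typescript
type PlainDoc = Record<string, unknown>;

// Minimal shape of the target collection; only the bulk `insert` method
// from the SQLiteCollection interface is needed here.
interface BulkInsertable {
  insert(docs: PlainDoc[]): PlainDoc[];
}

// Copies a LokiJS document without its `$loki` and `meta` bookkeeping fields.
function stripLokiMetadata(doc: PlainDoc): PlainDoc {
  const clean: PlainDoc = { ...doc };
  delete clean['$loki'];
  delete clean['meta'];
  return clean;
}

// Hypothetical replacement for the per-document loop: clean everything first,
// then hand the whole batch to `insert` in a single call.
function migrateCollectionData(data: unknown, target: BulkInsertable): number {
  if (!Array.isArray(data) || data.length === 0) return 0;
  const cleaned = (data as PlainDoc[]).map(stripLokiMetadata);
  target.insert(cleaned);
  return cleaned.length;
}
```

Because the whole batch goes through one insert call, the cost of opening and committing a transaction is paid once per collection rather than once per document.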
      const stmt = this.sqliteDb.prepare(
        'INSERT OR REPLACE INTO collections (collection_name, document_id, data, created_at, updated_at) VALUES (?, ?, ?, ?, ?)'
      );

      let count = 0;
      for (const doc of collection.data) {
        // Clean up LokiJS metadata
        const cleanDoc = { ...doc };
        delete cleanDoc.$loki;
        delete cleanDoc.meta;

        // Determine document ID
        const documentId = cleanDoc.id || cleanDoc._id || cleanDoc.userId || count;

        // Get timestamps from LokiJS metadata if available
        const createdAt = doc.meta?.created || Date.now();
        const updatedAt = doc.meta?.updated || Date.now();

        stmt.run(
          collection.name,
          String(documentId),
          JSON.stringify(cleanDoc),
          createdAt,
          updatedAt
        );

        count++;
      }

      this.log(chalk.green(` Migrated ${count} documents`));
    }
    delete cleanDoc.meta;

    // Determine document ID
    const documentId = cleanDoc.id || cleanDoc._id || cleanDoc.userId || count;
The logic for determining documentId falls back to the loop counter count. This is risky and can cause ID collisions if some documents have an explicit numeric ID that happens to match a counter value.
A more robust approach is to generate a new unique ID when a reliable one (id, _id, userId) isn't found. You can use the uuid library, which is already a dependency.
Suggested change:

    - const documentId = cleanDoc.id || cleanDoc._id || cleanDoc.userId || count;
    + const documentId = cleanDoc.id || cleanDoc._id || cleanDoc.userId || uuid();
Bug: Incorrect API Key Check in AI Module
The aichat module's install method incorrectly checks config.openAiApiKey. Since this is a Gemini AI module, it should check config.gemini?.apiKey or config.gemini?.enabled instead. This prevents the module from installing and functioning even when Gemini is properly configured.
src/modules/aichat/index.ts#L146-L147
ai/src/modules/aichat/index.ts
Lines 146 to 147 in 077148b
    public install() {
      if (config.openAiApiKey == null) return {};
Bug: Data Migration Issue Causes Inaccessibility
The guessing game module's data collection name was changed from 'guessingGame' to '_guesses'. However, the system's migration process preserves original collection names. This causes existing guessing game data to become inaccessible, as the new code expects data in the '_guesses' collection while migrated data remains under 'guessingGame', leading to data loss for existing users.
src/modules/guessing-game/index.ts#L12-L13
ai/src/modules/guessing-game/index.ts
Lines 12 to 13 in 077148b
    public install() {
      this.guesses = this.ai.getCollection('_guesses', {
Bug: Emoji Data Loss Due to Collection Name Change
The database collection name for emoji tracking was changed from 'lastEmoji' to '_lastEmoji' without corresponding migration logic. This causes existing emoji tracking data to be lost for users migrating from memory.json, as the application will create a new empty '_lastEmoji' collection instead of using the previously stored 'lastEmoji' data.
src/modules/check-custom-emojis/index.ts#L15-L16
ai/src/modules/check-custom-emojis/index.ts
Lines 15 to 16 in 077148b
    if (!config.checkEmojisEnabled) return {};
    this.lastEmoji = this.ai.getCollection('_lastEmoji', {});
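Both renamed collections ('guessingGame' to '_guesses' and 'lastEmoji' to '_lastEmoji') could be handled by mapping legacy names to their new names while migrating. The sketch below is one possible shape for that mapping; COLLECTION_RENAMES and targetCollectionName are hypothetical names, and only the two pairs named in these reports are included.

```typescript
// Hypothetical rename table: legacy LokiJS collection name -> new name.
// Only the two renames reported in this review are listed.
const COLLECTION_RENAMES: ReadonlyMap<string, string> = new Map([
  ['guessingGame', '_guesses'],
  ['lastEmoji', '_lastEmoji'],
]);

// Returns the name a legacy collection should be stored under after
// migration; names without an entry are kept unchanged.
function targetCollectionName(legacyName: string): string {
  return COLLECTION_RENAMES.get(legacyName) ?? legacyName;
}
```

The migration would call targetCollectionName(collection.name) when choosing the destination collection, so existing data ends up where the updated modules look for it.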
Bug: ID Overwrite for Valid Zero IDs
The ID generation logic in SQLiteCollectionImpl (methods insertOne and insert) incorrectly assigns new IDs to documents that have a valid id or _id of 0. This occurs because the code uses loose truthy/falsy evaluations (e.g., !docWithId.id or docWithId.id || ...) instead of strict checks for undefined or null, causing 0 to be treated as a missing ID and subsequently overwritten.
src/database/DatabaseManager.ts#L458-L467
ai/src/database/DatabaseManager.ts
Lines 458 to 467 in 077148b
    insertOne(doc: T): T {
      const docWithId = { ...doc } as any;
      if (!docWithId.id && !docWithId._id) {
        docWithId.id = this.idCounter++;
      }
      const documentId = docWithId.id || docWithId._id || this.idCounter++;
      const stmt = this.stmtCache.get('insert')!;
src/database/DatabaseManager.ts#L476-L490
ai/src/database/DatabaseManager.ts
Lines 476 to 490 in 077148b
    const insertMany = this.db.transaction((documents: T[]) => {
      const results: T[] = [];
      for (const doc of documents) {
        const docWithId = { ...doc } as any;
        if (!docWithId.id && !docWithId._id) {
          docWithId.id = this.idCounter++;
        }
        const documentId = docWithId.id || docWithId._id || this.idCounter++;
        insertStmt.run(this.name, String(documentId), JSON.stringify(docWithId), Date.now());
        results.push(docWithId);
      }
      return results;
    });
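A small sketch of the stricter check is shown below. It uses the nullish coalescing operator so that only undefined and null count as a missing ID, while 0 is preserved. The helper name resolveDocumentId and the generate callback are hypothetical; in the collection code the callback would stand in for this.idCounter++ (or a UUID generator in the migration script).

```typescript
type DocId = string | number;

interface MaybeIdentified {
  id?: DocId | null;
  _id?: DocId | null;
}

// Hypothetical helper: picks the first ID that is neither undefined nor null.
// `generate` is only called when both `id` and `_id` are missing, so the
// counter is not advanced for documents that already carry an ID.
function resolveDocumentId(doc: MaybeIdentified, generate: () => DocId): DocId {
  return doc.id ?? doc._id ?? generate();
}
```

Note that an empty string is also kept as a valid ID under this rule; if that is not wanted, it needs its own explicit check.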
Migrate database from LokiJS (memory.json) to SQLite while maintaining API compatibility.
This migration addresses issue #134 by replacing the in-memory LokiJS database with a more robust SQLite solution using better-sqlite3. The DatabaseManager has been rewritten to maintain the existing LokiJS API, minimizing changes to other modules. It also includes automatic migration from memory.json on first run, performance optimizations (WAL mode, prepared statements, transactions), and new database management features like backup and optimization.