improve: speed up pull-db and add restore-db command#187
Conversation
pull-db: リモートで sqlite3 .backup + tar.gz を1回のSSHで実行し、 1回のSFTPで取得する方式に変更。SSH接続17回→3回に削減。 .backup によりアトミックで一貫性のあるコピーを保証。 restore-db: バックアップ一覧表示・指定バックアップからのリストア機能を追加。 db:setup 後でも pull したデータに即座に戻せるようになった。 Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
|
Warning Rate limit exceeded
⌛ How to resolve this issue?After the wait time has elapsed, a review can be triggered using the We recommend that you space out your commits to avoid hitting the rate limit. 🚦 How do rate limits work?CodeRabbit enforces hourly rate limits for each developer per organization. Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout. Please see our FAQ for further information. ℹ️ Review info⚙️ Run configurationConfiguration used: defaults Review profile: CHILL Plan: Pro Run ID: 📒 Files selected for processing (2)
📝 WalkthroughWalkthroughAdds atomic remote SQLite backup + archive retrieval and a restore command. Introduces a remote prepare-pull.sh, refactors pull-db to pull a single tar.gz of DBs, adds restore-db CLI command, a small data-dir utility, and includes remote scripts in the production Docker image. Changes
Sequence Diagram(s)sequenceDiagram
actor User
participant CLI
participant LocalFS as Local FS
participant FlySSH as Fly / SSH/SFTP
participant RemoteScript as Remote Script
participant SQLite as SQLite DB
User->>CLI: pull-db --app myapp
activate CLI
CLI->>LocalFS: create local backup dir
CLI->>FlySSH: ssh run prepare-pull.sh
activate FlySSH
FlySSH->>RemoteScript: execute prepare-pull.sh
activate RemoteScript
RemoteScript->>SQLite: sqlite3 .backup (atomic)
RemoteScript->>RemoteScript: create tar.gz of backups
RemoteScript-->>FlySSH: print ---FILES--- list ---DONE---
deactivate RemoteScript
FlySSH->>LocalFS: sftp pull tar.gz
deactivate FlySSH
CLI->>LocalFS: extract tar.gz and sanitize files
CLI->>User: report N dbs pulled
deactivate CLI
sequenceDiagram
actor User
participant CLI
participant LocalFS as Local FS
participant SQLite as SQLite DB
User->>CLI: restore-db --name backup_YYYYMMDD
activate CLI
CLI->>LocalFS: list backup_* directories
CLI->>LocalFS: validate selected backup exists
CLI->>LocalFS: enumerate *.db, *.db-wal, *.db-shm in backup
CLI->>LocalFS: remove current non-backup db files from data/
CLI->>LocalFS: copy backup files into data/
CLI->>User: Success (N dbs restored)
deactivate CLI
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~25 minutes Possibly related PRs
Poem
🚥 Pre-merge checks | ✅ 2 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
📝 Coding Plan
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
- Extract DATA_DIR, BACKUP_PREFIX, isDbFile() to ops/lib/data-dir.ts
- Remove remote cleanup SSH call (prepare-pull.sh self-cleans)
- Use withFileTypes in listBackups() to avoid per-entry statSync
- Use statSync with throwIfNoEntry instead of existsSync+statSync
- Remove dead !startsWith('backup_') guard (db files never match)
- Remove redundant existsSync before sanitization
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
There was a problem hiding this comment.
Actionable comments posted: 2
🧹 Nitpick comments (3)
ops/commands/restore-db.ts (1)
11-22: Consider handling stat errors in listBackups.If a backup directory is deleted between
readdirSyncandstatSync, an uncaught exception would be thrown. This is unlikely in practice for a CLI tool, but wrapping in try-catch would be more robust.♻️ Optional: handle potential race
function listBackups(): string[] { if (!fs.existsSync(DATA_DIR)) return [] return fs .readdirSync(DATA_DIR) - .filter( - (f) => - f.startsWith('backup_') && - fs.statSync(path.join(DATA_DIR, f)).isDirectory(), - ) + .filter((f) => { + if (!f.startsWith('backup_')) return false + try { + return fs.statSync(path.join(DATA_DIR, f)).isDirectory() + } catch { + return false + } + }) .sort() .reverse() }🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@ops/commands/restore-db.ts` around lines 11 - 22, The listBackups function can throw if a file is removed between fs.readdirSync and fs.statSync; update listBackups to catch errors around the fs.statSync (or use fs.lstatSync) when checking each entry so a missing/removed entry is skipped instead of bubbling an exception. In practice wrap the fs.statSync(path.join(DATA_DIR, f)).isDirectory() check in a try-catch (inside the .filter callback), return false on error, and keep the existing startsWith('backup_') logic so deleted entries are ignored safely.ops/commands/pull-db.ts (2)
60-71: Consider wrapping remote execution in try-catch for friendlier errors.If the remote script fails (e.g., SSH connection drops, script error),
execFileSyncthrows with a raw error. A try-catch with a user-friendly message would improve the CLI experience.♻️ Optional: add error handling
consola.start('Remote: checkpointing databases and creating archive...') + let output: string + try { - const output = execFileSync( + output = execFileSync( 'fly', [ 'ssh', 'console', '-a', app, '-C', 'sh /upflow/ops/remote/prepare-pull.sh', ], { encoding: 'utf-8', stdio: ['pipe', 'pipe', 'pipe'] }, ) + } catch (err) { + consola.error('Failed to execute remote backup script') + consola.error(err instanceof Error ? err.message : String(err)) + process.exit(1) + }🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@ops/commands/pull-db.ts` around lines 60 - 71, Wrap the execFileSync call that assigns output in pull-db.ts in a try-catch to handle thrown errors from remote execution; catch the error from execFileSync, log a friendlier message including contextual info (e.g., the app variable and that the remote prepare-pull.sh failed) and the underlying error.message, and then rethrow or exit with a non-zero code as appropriate so the CLI fails gracefully; reference the execFileSync invocation and the output variable when adding the try-catch.
99-105: Tar extraction path traversal consideration.The tar archive is created by the remote
prepare-pull.shusing-C "$STAGING_DIR" ., which should produce relative paths. However, if the remote archive were compromised, it could contain paths like../malicious. The extraction usescwd: DATA_DIRwithout--strip-componentsor path validation.Given this is an internal ops tool pulling from your own Fly app, this is low risk, but for defense-in-depth you could add safeguards.
🛡️ Optional: restrict extraction to DATA_DIR
consola.start('Extracting archive...') - execFileSync('tar', ['xzf', path.resolve(localTar)], { + execFileSync('tar', ['xzf', path.resolve(localTar), '--no-absolute-names'], { cwd: DATA_DIR, stdio: 'inherit', })
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@ops/commands/restore-db.ts`:
- Around line 64-79: The restore routine currently removes files from DATA_DIR
and copies backups with fs.unlinkSync and fs.copyFileSync
(variables/currentDbFiles, filesToRestore, backupDir) which is unsafe; change it
to: prompt the user for an explicit confirmation before any destructive action,
verify no related processes are running (or require a --force flag), copy backup
files into a temporary directory (e.g., DATA_DIR + '.tmp' or os.tmpdir()/staging
path) and only after all copies succeed atomically rename/swap the temp
directory into DATA_DIR (or rename individual files into place) to avoid partial
restores, and make sure to handle/rollback failures and surface clear error
messages rather than deleting originals up front.
In `@ops/remote/prepare-pull.sh`:
- Around line 16-18: The for-loop "for f in *.db; do" in prepare-pull.sh can
iterate the literal "*.db" when no files exist; update the script to detect/skip
the empty-glob case before calling sqlite3 by either enabling bash nullglob
(e.g., shopt -s nullglob at top) or by collecting matches into an array and
exiting/echoing a message when the array is empty, and ensure the loop or a
guard like "[[ -e $f ]]" is used so the sqlite3 call that writes to
"$STAGING_DIR/$f" never runs with a non-existent filename.
---
Nitpick comments:
In `@ops/commands/pull-db.ts`:
- Around line 60-71: Wrap the execFileSync call that assigns output in
pull-db.ts in a try-catch to handle thrown errors from remote execution; catch
the error from execFileSync, log a friendlier message including contextual info
(e.g., the app variable and that the remote prepare-pull.sh failed) and the
underlying error.message, and then rethrow or exit with a non-zero code as
appropriate so the CLI fails gracefully; reference the execFileSync invocation
and the output variable when adding the try-catch.
In `@ops/commands/restore-db.ts`:
- Around line 11-22: The listBackups function can throw if a file is removed
between fs.readdirSync and fs.statSync; update listBackups to catch errors
around the fs.statSync (or use fs.lstatSync) when checking each entry so a
missing/removed entry is skipped instead of bubbling an exception. In practice
wrap the fs.statSync(path.join(DATA_DIR, f)).isDirectory() check in a try-catch
(inside the .filter callback), return false on error, and keep the existing
startsWith('backup_') logic so deleted entries are ignored safely.
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 39f13240-d1a3-4f40-a9c5-cb86db4e59cb
📒 Files selected for processing (5)
Dockerfileops/cli.tsops/commands/pull-db.tsops/commands/restore-db.tsops/remote/prepare-pull.sh
- prepare-pull.sh: guard against empty *.db glob with [ -e ] check - restore-db: add confirmation prompt before destructive restore Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
There was a problem hiding this comment.
🧹 Nitpick comments (1)
ops/commands/pull-db.ts (1)
55-66: Consider capturing stderr for debugging remote failures.With
stdio: ['pipe', 'pipe', 'pipe'], stderr is piped but not captured byexecFileSync(which only returns stdout). If the remote script outputs diagnostic messages to stderr before failing, they'll be lost.♻️ Optional: capture stderr for better diagnostics
- const output = execFileSync( + let output: string + try { + output = execFileSync( 'fly', [ 'ssh', 'console', '-a', app, '-C', 'sh /upflow/ops/remote/prepare-pull.sh', ], { encoding: 'utf-8', stdio: ['pipe', 'pipe', 'pipe'] }, ) + } catch (err) { + if (err instanceof Error && 'stderr' in err) { + consola.error('Remote stderr:', (err as { stderr: string }).stderr) + } + throw err + }🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@ops/commands/pull-db.ts` around lines 55 - 66, The execFileSync call that assigns to output currently loses stderr because execFileSync only returns stdout; to preserve remote diagnostic messages, wrap the execFileSync(...) call in a try/catch and, on error, read and surface the error.stderr (and error.stdout) properties (or change to spawnSync and capture both stdout/stderr) so you can log or include stderr in your failure message; update the code around the const output = execFileSync(...) invocation to catch the thrown Error from execFileSync and log/return error.stderr for debugging.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Nitpick comments:
In `@ops/commands/pull-db.ts`:
- Around line 55-66: The execFileSync call that assigns to output currently
loses stderr because execFileSync only returns stdout; to preserve remote
diagnostic messages, wrap the execFileSync(...) call in a try/catch and, on
error, read and surface the error.stderr (and error.stdout) properties (or
change to spawnSync and capture both stdout/stderr) so you can log or include
stderr in your failure message; update the code around the const output =
execFileSync(...) invocation to catch the thrown Error from execFileSync and
log/return error.stderr for debugging.
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 9d9747da-4633-4e2a-839d-5010eb4c4118
📒 Files selected for processing (5)
ops/cli.tsops/commands/pull-db.tsops/commands/restore-db.tsops/lib/data-dir.tsops/remote/prepare-pull.sh
🚧 Files skipped from review as they are similar to previous changes (2)
- ops/cli.ts
- ops/commands/restore-db.ts
Summary
sqlite3 .backup+tar czfを1回のSSHで実行し、1回のSFTPでアーカイブを取得する方式に変更。SSH接続17回→3回に削減PRAGMA wal_checkpointからsqlite3 .backupに変更。アトミックで一貫性のあるコピーを保証し、本番の書き込みもブロックしないpnpm ops restore-dbでバックアップ一覧表示、--nameで指定バックアップからリストア。db:setup後でもpullしたデータに即座に戻せるTest plan
pnpm ops pull-dbで本番DBの取得を確認pnpm ops restore-dbでバックアップ一覧が表示されるpnpm ops restore-db --name <backup>でリストアできるprepare-pull.sh経由で動作確認🤖 Generated with Claude Code
Summary by CodeRabbit
New Features
Enhancements