diff --git a/docs/ops/DISASTER_RECOVERY_RUNBOOK.md b/docs/ops/DISASTER_RECOVERY_RUNBOOK.md new file mode 100644 index 000000000..53c78820c --- /dev/null +++ b/docs/ops/DISASTER_RECOVERY_RUNBOOK.md @@ -0,0 +1,371 @@ +# Disaster Recovery Runbook + +Last Updated: 2026-04-01 +Issue: `#86` OPS-08 backup/restore automation and disaster-recovery drill playbook + +--- + +## Overview + +Taskdeck is a local-first application backed by a single SQLite database file. All boards, +cards, columns, audit records, and automation state live in that file. This runbook covers: + +- Backup automation (what the scripts do and when to run them) +- Manual restore procedure (step-by-step) +- RTO and RPO targets +- DR drill schedule and evidence requirements +- Access controls for backup artefacts + +--- + +## RTO and RPO Targets + +| Tier | Target | Notes | +| --- | --- | --- | +| RTO (local SQLite instance) | **< 30 minutes** | Time from decision-to-restore to API serving healthy requests | +| RTO (Docker / hosted instance) | **< 60 minutes** | Includes container restart and volume reattachment | +| RPO (default daily rotation) | **< 24 hours** | Maximum data loss under the default 7-backup daily schedule | +| RPO (high-frequency rotation) | **< 1 hour** | Achievable by scheduling `backup.sh` hourly via cron | + +These are targets for a single-operator local-first deployment. Cloud/multi-user deployments +should tighten RPO by increasing backup frequency and consider continuous WAL shipping if +periodic snapshots leave too large a data-loss window. 
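One consequence of the hourly option is worth spelling out: retention is counted in files, not in days, so an hourly schedule with the default `--retain 7` keeps only about seven hours of history. The sketch below shows the arithmetic. `retention_window_hours` is a hypothetical helper written for this illustration and is not part of the repository scripts.

```bash
#!/usr/bin/env bash
# Hypothetical helper (not part of scripts/backup.sh): upper bound on how far
# back the oldest retained backup reaches for a given schedule.
#   $1 = backup interval in hours
#   $2 = value passed to --retain
retention_window_hours() {
  local interval_hours="$1" retain="$2"
  # Just before the next run, the oldest kept file is interval * retain hours old.
  echo $(( interval_hours * retain ))
}

echo "daily,  --retain 7:   $(retention_window_hours 24 7) hours"   # 168
echo "hourly, --retain 7:   $(retention_window_hours 1 7) hours"    # 7
echo "hourly, --retain 168: $(retention_window_hours 1 168) hours"  # 168
```

Under these assumptions, an hourly schedule that should still cover a full week would pass `--retain 168` to `backup.sh`.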
+ +--- + +## Backup Automation + +### Scripts + +| Script | Platform | Location | +| --- | --- | --- | +| `backup.sh` | Linux / macOS / WSL | `scripts/backup.sh` | +| `backup.ps1` | Windows PowerShell | `scripts/backup.ps1` | +| `restore.sh` | Linux / macOS / WSL | `scripts/restore.sh` | +| `restore.ps1` | Windows PowerShell | `scripts/restore.ps1` | + +### How backups work + +`backup.sh` (and the PS1 equivalent) uses `sqlite3 .backup` — SQLite's online backup API. +It reads the database under a shared lock and copies a consistent snapshot to the +destination page by page, including committed changes still held in the WAL (write-ahead +log). It is **safe while the API is running and writing**. The fallback (`cp`) is +explicitly unsafe with active writers and should only be used in development. + +### Quick start + +```bash +# Default paths (~/.taskdeck/taskdeck.db -> ~/.taskdeck/backups/) +bash scripts/backup.sh + +# Explicit paths +bash scripts/backup.sh \ + --db-path /app/data/taskdeck.db \ + --output-dir /backups/taskdeck + +# Keep 14 backups instead of the default 7 +bash scripts/backup.sh --retain 14 +``` + +PowerShell (Windows): + +```powershell +.\scripts\backup.ps1 +.\scripts\backup.ps1 -DbPath "C:\app\data\taskdeck.db" -OutputDir "D:\backups" -Retain 14 +``` + +### Scheduling (cron / Task Scheduler) + +**Linux / macOS — daily at 02:00:** + +```cron +# cron has no backslash line continuation, so the whole entry stays on one line. +# The script is run through bash because it is committed without the executable bit. +0 2 * * * bash /path/to/repo/scripts/backup.sh --db-path /app/data/taskdeck.db --output-dir /backups/taskdeck >> /var/log/taskdeck-backup.log 2>&1 +``` + +**Windows — Task Scheduler (run as the app-service account):** + +```powershell +# Create a daily backup task +$action = New-ScheduledTaskAction -Execute "pwsh.exe" ` + -Argument "-NonInteractive -File C:\taskdeck\scripts\backup.ps1" +$trigger = New-ScheduledTaskTrigger -Daily -At "02:00" +Register-ScheduledTask -TaskName "Taskdeck-Daily-Backup" ` + -Action $action -Trigger $trigger -RunLevel Highest +``` + +### Docker volume backups + +The Docker Compose deployment mounts `taskdeck-db:/app/data`. 
To back up from the host: + +```bash +# Option A: exec into the container and run the backup script +docker compose -f deploy/docker-compose.yml --profile baseline exec api \ + bash /repo/scripts/backup.sh \ + --db-path /app/data/taskdeck.db \ + --output-dir /app/data/backups + +# Option B: copy the volume contents to the host (requires API to be stopped or paused) +docker compose -f deploy/docker-compose.yml --profile baseline stop api +docker run --rm \ + -v taskdeck_taskdeck-db:/data \ + -v "$(pwd)/local-backups:/backup" \ + alpine:3 \ + sh -c "cp /data/taskdeck.db /backup/taskdeck-$(date +%Y%m%d-%H%M%S).db" +docker compose -f deploy/docker-compose.yml --profile baseline start api + +# Option C: add a dedicated backup sidecar (extend docker-compose.yml): +# +# backup: +# profiles: ["backup"] +# image: alpine:3 +# volumes: +# - taskdeck-db:/data:ro +# - ./backups:/backup +# command: > +# sh -c "cp /data/taskdeck.db /backup/taskdeck-$(date +%Y%m%d-%H%M%S).db +# && echo 'Backup done.'" +# +# Run one-off: docker compose --profile backup run --rm backup +``` + +--- + +## Restore Procedure + +Use this procedure whenever a database restore is required (corruption, accidental deletion, +or rollback after a bad migration). + +### Pre-conditions + +- You have a known-good backup file (`taskdeck-backup-YYYY-MM-DD-HHmmss.db`). +- The Taskdeck API is stopped (or you are willing to restart it after restore). +- You have write access to the directory containing the live database. + +### Step 1 — Stop the API (recommended) + +Stopping the API avoids any writes racing with the restore. It is not strictly required +(`restore.sh` uses `sqlite3 .restore` which acquires an exclusive lock), but stopping first +eliminates all risk. 
+ +```bash +# Docker Compose deployment +docker compose -f deploy/docker-compose.yml --profile baseline stop api + +# Local dotnet run — send SIGTERM / Ctrl+C +# systemd +sudo systemctl stop taskdeck-api +``` + +### Step 2 — Choose the backup to restore + +```bash +# List available backups, newest first +ls -lt ~/.taskdeck/backups/taskdeck-backup-*.db + +# Or for Docker volume backups +ls -lt ./local-backups/ +``` + +Select the most recent backup before the incident, or a specific point-in-time backup if +you know the target date. + +### Step 3 — Run the restore script + +```bash +bash scripts/restore.sh \ + --backup-file ~/.taskdeck/backups/taskdeck-backup-2026-04-01-120000.db + +# With explicit DB path (required for Docker or non-default paths) +bash scripts/restore.sh \ + --backup-file /backups/taskdeck/taskdeck-backup-2026-04-01-120000.db \ + --db-path /app/data/taskdeck.db + +# Skip interactive confirmation (for automation) +bash scripts/restore.sh \ + --backup-file /backups/taskdeck-backup-2026-04-01-120000.db \ + --yes +``` + +PowerShell (Windows): + +```powershell +.\scripts\restore.ps1 ` + -BackupFile "$env:USERPROFILE\.taskdeck\backups\taskdeck-backup-2026-04-01-120000.db" + +.\scripts\restore.ps1 ` + -BackupFile "D:\backups\taskdeck-backup-2026-04-01-120000.db" ` + -DbPath "C:\app\data\taskdeck.db" ` + -Yes +``` + +The script will: +1. Verify the backup is a valid SQLite file (magic bytes + `PRAGMA integrity_check`). +2. Check that the backup contains a `Boards` table (Taskdeck schema sanity check). +3. Prompt for confirmation (skip with `--yes` / `-Yes`). +4. Create a timestamped safety copy of the current live database. +5. Restore the backup into the live path. +6. Run a post-restore `PRAGMA integrity_check`. 
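The first item in the list above, the magic-bytes check followed by `PRAGMA integrity_check`, can be sketched as follows. This is an illustration only and the actual implementation in `scripts/restore.sh` may differ. `is_sqlite_file` and `verify_backup` are hypothetical names. The one fact relied on is that every SQLite 3 database file begins with the header string `SQLite format 3` followed by a NUL byte.

```bash
#!/usr/bin/env bash
# Illustrative sketch only; the real check in scripts/restore.sh may differ.

# Succeeds when the file starts with the 15 printable bytes of the SQLite 3
# header string. The 16th header byte is a NUL terminator and is not compared.
is_sqlite_file() {
  local header
  # tr strips NUL bytes so the shell never sees them in the captured string.
  header="$(head -c 15 "$1" 2>/dev/null | tr -d '\0')"
  [ "$header" = "SQLite format 3" ]
}

# Cheap header check first, then the full integrity check if sqlite3 is installed.
verify_backup() {
  is_sqlite_file "$1" || { echo "not a SQLite file: $1" >&2; return 1; }
  if command -v sqlite3 >/dev/null 2>&1; then
    [ "$(sqlite3 "$1" 'PRAGMA integrity_check;' 2>&1)" = "ok" ] || {
      echo "integrity check failed: $1" >&2
      return 1
    }
  fi
}
```

A caller could gate the restore on it, for example `verify_backup "$BACKUP" && bash scripts/restore.sh --backup-file "$BACKUP" --yes`. Running the header test first means a truncated or mis-named file is rejected before `sqlite3` is invoked at all.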
+ +### Step 4 — Verify row counts + +After restore, spot-check that the data volume is plausible: + +```bash +sqlite3 /path/to/taskdeck.db <<'SQL' +SELECT 'Boards' AS tbl, COUNT(*) AS rows FROM Boards +UNION ALL +SELECT 'Columns', COUNT(*) FROM Columns +UNION ALL +SELECT 'Cards', COUNT(*) FROM Cards +UNION ALL +SELECT 'Users', COUNT(*) FROM Users; +SQL +``` + +Compare against your last known-good row counts (see evidence log if available). + +### Step 5 — Start the API and verify health + +```bash +# Docker Compose deployment +docker compose -f deploy/docker-compose.yml --profile baseline start api + +# Wait for health +for i in $(seq 1 30); do + STATUS=$(curl -s -o /dev/null -w "%{http_code}" http://localhost:5000/health/ready 2>/dev/null || true) + if [[ "$STATUS" == "200" ]]; then echo "API healthy."; break; fi + echo "Waiting... ($i/30)" + sleep 2 +done + +# Detailed health response +curl -s http://localhost:5000/health/ready | python3 -m json.tool +``` + +### Step 6 — Record the restore in the evidence log + +File an evidence entry in `docs/ops/rehearsals/` using the template in +`docs/ops/EVIDENCE_TEMPLATE.md`. Tag it with `restore-event` rather than `rehearsal` if +this was a real recovery. + +--- + +## Backup Verification + +Run these checks after every backup to confirm it is usable for recovery. They can be +automated in CI or a monitoring cron job. + +```bash +BACKUP_FILE="/path/to/latest.db" + +# 1. Integrity check +sqlite3 "$BACKUP_FILE" 'PRAGMA integrity_check;' +# Expected: ok + +# 2. Page count / file size sanity +sqlite3 "$BACKUP_FILE" 'PRAGMA page_count; PRAGMA page_size;' +# Should match or exceed the previous backup + +# 3. Schema presence +sqlite3 "$BACKUP_FILE" '.tables' +# Should contain: Boards Columns Cards Users AuditLogs AutomationProposals ... + +# 4. Row count spot check +sqlite3 "$BACKUP_FILE" 'SELECT COUNT(*) FROM Boards;' +# Should be >= 0 (positive for non-empty deployments) + +# 5. 
Last write recency (check that the backup is not stale) +sqlite3 "$BACKUP_FILE" " + SELECT MAX(UpdatedAt) AS last_write + FROM ( + SELECT UpdatedAt FROM Boards + UNION ALL SELECT UpdatedAt FROM Cards + ); +" +``` + +--- + +## Access Controls + +| Artefact | Required permission | How enforced | +| --- | --- | --- | +| Backup directory (`~/.taskdeck/backups/`) | Owner read/write only | `chmod 700` (bash) / restricted ACL (PowerShell) | +| Backup files (`taskdeck-backup-*.db`) | Owner read/write only | `chmod 600` (bash) / restricted ACL (PowerShell) | +| Pre-restore safety copies | Owner read/write only | Same as backup files | +| Live database (`taskdeck.db`) | Owner read/write only | Set after restore by restore scripts | + +On Linux/macOS: the scripts set `chmod 700` on the backup directory and `chmod 600` on each +file. Verify with `ls -la ~/.taskdeck/backups/`. + +On Windows: the scripts apply a restricted ACL granting FullControl to the current user only +and removing inherited permissions. Verify with `Get-Acl | Format-List`. + +**For Docker deployments**: ensure the Docker volume is not world-readable. The named volume +`taskdeck-db` is accessible only to containers with the volume mounted. Restrict host-level +access to the volume directory if the host filesystem is shared. + +--- + +## DR Drill Schedule + +| Drill type | Cadence | Scope | Evidence required | +| --- | --- | --- | --- | +| Backup verification | Monthly (automated preferred) | Run `PRAGMA integrity_check` and row-count spot-check on the latest backup | Log entry in backup cron output | +| Manual restore drill | Monthly | Full restore to a separate test directory; verify health | Evidence package in `docs/ops/rehearsals/` | +| Full DR drill | Quarterly | Restore + API restart + user acceptance test | Evidence package + retrospective | + +Drill dates align with the cadence defined in `docs/ops/INCIDENT_REHEARSAL_CADENCE.md`. +The backup-restore scenario should be added to the monthly rotation. 
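The automated backup-verification drill could also confirm that a recent backup exists at all, since a passing integrity check on a week-old file says nothing about the RPO target. A minimal sketch, assuming the default backup directory and the 24-hour RPO from the targets table. `has_fresh_backup` is a hypothetical helper and is not part of the repository scripts.

```bash
#!/usr/bin/env bash
# Hypothetical helper, not part of the repository scripts.
# Succeeds when DIR holds at least one taskdeck-backup-*.db file modified
# within the last MAX_AGE_MINUTES minutes (default 1440, i.e. 24 hours).
has_fresh_backup() {
  local dir="$1" max_age_minutes="${2:-1440}"
  [ -n "$(find "$dir" -maxdepth 1 -name 'taskdeck-backup-*.db' \
            -mmin "-${max_age_minutes}" 2>/dev/null)" ]
}

if has_fresh_backup "${HOME}/.taskdeck/backups"; then
  echo "OK: a backup newer than 24 hours exists."
else
  echo "STALE: no backup newer than 24 hours." >&2
fi
```

The check uses file modification time rather than the timestamp embedded in the filename, so it stays correct even though the filenames are written in UTC.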
+ +--- + +## DR Drill Evidence Template + +For each manual restore drill, file an evidence package at: + +``` +docs/ops/rehearsals/YYYY-MM-DD_backup-restore-drill.md +``` + +Use this table as a minimum record: + +| Date | Operator | Backup Age | Backup File | Restore Duration | `integrity_check` | Row Count Match | Pass/Fail | Notes | +| --- | --- | --- | --- | --- | --- | --- | --- | --- | +| 2026-04-01 | @operator | 3h | taskdeck-backup-2026-04-01-090000.db | 4m 12s | ok | yes | Pass | Docker volume restore | +| YYYY-MM-DD | @username | Xh | taskdeck-backup-YYYY-MM-DD-HHmmss.db | Xm Xs | ok/fail | yes/no | Pass/Fail | | + +Attach or inline: +- `PRAGMA integrity_check` output +- Row count query results (before and after restore) +- API `/health/ready` response after restart +- Any deviations from expected state + +--- + +## Escalation Path + +| Condition | Action | +| --- | --- | +| `PRAGMA integrity_check` returns anything other than `ok` | Do NOT restore this backup. Try the next-oldest backup. File an issue tagged `P1`. | +| Restore script fails with permission error | Check file ownership, ACLs, and whether the API process holds an exclusive lock. | +| All available backups fail integrity check | Escalate to the project owner immediately. Check the live database — it may still be intact. | +| Post-restore API health check returns non-200 | Inspect `/health/ready` response for which subsystem failed. Check for EF migration drift between backup schema and current binary. | +| Data loss confirmed after restore | File a P1 incident issue. Document the RPO gap in the evidence package. Increase backup frequency. | + +For this project, escalation means: create a GitHub issue with label `incident` and +`data-loss` (or `data-risk`) and assign it to `@Chris0Jeky`. 
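The first and third rows of the escalation table (try the next-oldest backup, and escalate when every backup fails) amount to a small search loop. The sketch below is one way to express it. `integrity_ok` and `newest_good_backup` are hypothetical helpers, not functions from `scripts/restore.sh`, and the check is passed in by name so that it can be swapped out.

```bash
#!/usr/bin/env bash
# Hypothetical helpers, not part of the repository scripts.

# Succeeds only when sqlite3 reports exactly "ok" for the given file.
integrity_ok() {
  [ "$(sqlite3 "$1" 'PRAGMA integrity_check;' 2>/dev/null)" = "ok" ]
}

# Print the newest backup in DIR that passes CHECK (default: integrity_ok).
# Returns non-zero when no backup passes, which is the "escalate" case.
newest_good_backup() {
  local dir="$1" check="${2:-integrity_ok}" candidate
  # ls -1t lists newest first. The here-document keeps the loop in the
  # current shell, so `return` leaves the function as intended.
  while IFS= read -r candidate; do
    [ -n "$candidate" ] || continue
    if "$check" "$candidate"; then
      printf '%s\n' "$candidate"
      return 0
    fi
    echo "skipping (failed check): $candidate" >&2
  done <<EOF
$(ls -1t "$dir"/taskdeck-backup-*.db 2>/dev/null)
EOF
  return 1
}
```

For example, `BACKUP="$(newest_good_backup ~/.taskdeck/backups)" || echo "no usable backup, escalate" >&2` selects a restore candidate or signals that escalation is needed. Each skipped file is reported on stderr so the failures can be copied into the P1 issue.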
+ +--- + +## Related Documents + +- `scripts/backup.sh` / `scripts/backup.ps1` — backup automation +- `scripts/restore.sh` / `scripts/restore.ps1` — restore automation +- `docs/ops/EVIDENCE_TEMPLATE.md` — evidence package format +- `docs/ops/INCIDENT_REHEARSAL_CADENCE.md` — rehearsal schedule +- `docs/ops/FAILURE_INJECTION_DRILLS.md` — automated failure-injection drills +- `docs/ops/REHEARSAL_BACKOFF_RULES.md` — issue filing rules for drill findings +- `docs/ops/rehearsal-scenarios/` — scenario library diff --git a/docs/ops/INCIDENT_REHEARSAL_CADENCE.md b/docs/ops/INCIDENT_REHEARSAL_CADENCE.md index aa14fb988..186e4f37b 100644 --- a/docs/ops/INCIDENT_REHEARSAL_CADENCE.md +++ b/docs/ops/INCIDENT_REHEARSAL_CADENCE.md @@ -79,6 +79,7 @@ Available scenarios in `docs/ops/rehearsal-scenarios/`: - `missing-telemetry-signal.md` -- Correlation ID missing from OpenTelemetry traces - `mcp-server-startup-regression.md` -- Optional MCP server fails at boot - `deployment-readiness-failure.md` -- Docker Compose startup fails readiness checks +- `backup-restore-drill.md` -- Full backup and restore loop; validates scripts, integrity checks, and RTO target New scenarios should follow the same template structure (pre-conditions, injection, diagnosis, recovery, evidence checklist). File them in the `rehearsal-scenarios/` directory with a descriptive kebab-case filename. diff --git a/docs/ops/rehearsal-scenarios/backup-restore-drill.md b/docs/ops/rehearsal-scenarios/backup-restore-drill.md new file mode 100644 index 000000000..7d204d9c8 --- /dev/null +++ b/docs/ops/rehearsal-scenarios/backup-restore-drill.md @@ -0,0 +1,223 @@ +# Scenario: Backup and Restore + +Last Updated: 2026-04-01 +Issue: `#86` OPS-08 backup/restore automation and disaster-recovery drill playbook + +## Overview + +Verify that the Taskdeck database can be backed up and fully restored using the automation +scripts. 
This drill validates the complete backup-restore loop: create a backup, simulate +data loss (or schema drift), restore from backup, and confirm the API returns to a healthy +state with the expected data intact. + +## Pre-Conditions + +- Repository checked out at a known commit. +- Backend builds successfully: `dotnet build backend/Taskdeck.sln -c Release` +- `sqlite3` CLI available on PATH (strongly recommended — enables hot backup and integrity + checking; install via `apt install sqlite3` / `brew install sqlite3` / scoop/choco on Windows). +- No other Taskdeck API instance running on port 5000 (or the target port). +- SQLite database exists with at least one board and one card (seed via the API or use a + development database). + +## Injection Method + +### Option A: Simulate data loss (delete the live database) + +This is the most realistic scenario: the live database file is accidentally deleted or the +volume is lost. + +```bash +# 1. Create a backup first +bash scripts/backup.sh --db-path backend/src/Taskdeck.Api/taskdeck.db \ + --output-dir /tmp/taskdeck-dr-drill + +# Note the backup filename printed by the script, e.g.: +# Backup written: /tmp/taskdeck-dr-drill/taskdeck-backup-2026-04-01-120000.db + +# 2. Record current row counts BEFORE deletion +sqlite3 backend/src/Taskdeck.Api/taskdeck.db \ + "SELECT 'Boards', COUNT(*) FROM Boards UNION ALL SELECT 'Cards', COUNT(*) FROM Cards;" + +# 3. Simulate data loss +rm backend/src/Taskdeck.Api/taskdeck.db + +# 4. Verify the API is degraded (or start it to observe the auto-create behavior) +``` + +### Option B: Simulate accidental destructive query + +This tests restore after bad data mutation — more realistic for operational accidents. + +```bash +# 1. Create a backup +bash scripts/backup.sh --db-path backend/src/Taskdeck.Api/taskdeck.db \ + --output-dir /tmp/taskdeck-dr-drill + +# 2. 
Record baseline row counts +sqlite3 backend/src/Taskdeck.Api/taskdeck.db \ + "SELECT 'Boards', COUNT(*) FROM Boards UNION ALL SELECT 'Cards', COUNT(*) FROM Cards;" + +# 3. Simulate accidental deletion of all cards +sqlite3 backend/src/Taskdeck.Api/taskdeck.db "DELETE FROM Cards;" +sqlite3 backend/src/Taskdeck.Api/taskdeck.db "SELECT COUNT(*) FROM Cards;" +# Expected: 0 (data lost) +``` + +### Option C: Docker volume restore + +```bash +# 1. Exec backup into the container +docker compose -f deploy/docker-compose.yml --profile baseline exec api \ + bash /repo/scripts/backup.sh \ + --db-path /app/data/taskdeck.db \ + --output-dir /app/data/backups + +# 2. Stop the API +docker compose -f deploy/docker-compose.yml --profile baseline stop api + +# 3. Corrupt or delete the volume database (from host): +docker run --rm -v taskdeck_taskdeck-db:/data alpine:3 rm /data/taskdeck.db + +# 4. Restore via the restore script (exec into a temp container with bash + sqlite3) +docker run --rm \ + -v taskdeck_taskdeck-db:/data \ + -v "$(pwd):/repo" \ + --workdir /repo \ + alpine:3 \ + sh -c "apk add --no-cache bash sqlite && bash scripts/restore.sh \ + --backup-file /data/backups/taskdeck-backup-.db \ + --db-path /data/taskdeck.db --yes" +``` + +## Expected Diagnosis Path + +1. **Observe the fault**: API returns degraded health or the database is missing. + + ```bash + curl -s http://localhost:5000/health/ready | python3 -m json.tool + # Expected for missing DB: checks.database.status = "Unhealthy" + # For empty DB after auto-create: checks.database.status = "Healthy" but + # checks.queue.depth = 0 and row counts will be 0 + ``` + +2. **Identify the backup to use**: + + ```bash + ls -lt /tmp/taskdeck-dr-drill/taskdeck-backup-*.db + # Select the most recent backup before the incident + ``` + +3. 
**Verify the backup**: + + ```bash + sqlite3 /tmp/taskdeck-dr-drill/taskdeck-backup-2026-04-01-120000.db \ + 'PRAGMA integrity_check;' + # Expected: ok + + sqlite3 /tmp/taskdeck-dr-drill/taskdeck-backup-2026-04-01-120000.db \ + "SELECT 'Boards', COUNT(*) FROM Boards UNION ALL SELECT 'Cards', COUNT(*) FROM Cards;" + # Should match pre-incident row counts + ``` + +## Recovery Steps + +### Step 1 — Stop the API + +```bash +# Local process: Ctrl+C or kill +# Docker Compose: +docker compose -f deploy/docker-compose.yml --profile baseline stop api +# systemd: +sudo systemctl stop taskdeck-api +``` + +### Step 2 — Restore from backup + +```bash +bash scripts/restore.sh \ + --backup-file /tmp/taskdeck-dr-drill/taskdeck-backup-2026-04-01-120000.db \ + --db-path backend/src/Taskdeck.Api/taskdeck.db +``` + +Expected output: +``` +Verifying backup file: /tmp/taskdeck-dr-drill/taskdeck-backup-2026-04-01-120000.db +File type check: SQLite magic bytes verified +Running integrity check on backup... +Integrity check: ok +Safety copy created: .../taskdeck-pre-restore-2026-04-01-120001.db +Restored: /tmp/.../taskdeck-backup-... -> backend/src/Taskdeck.Api/taskdeck.db +Post-restore integrity check: ok +Done. Restart the Taskdeck API to pick up the restored database. +``` + +### Step 3 — Verify row counts match pre-incident baseline + +```bash +sqlite3 backend/src/Taskdeck.Api/taskdeck.db \ + "SELECT 'Boards', COUNT(*) FROM Boards UNION ALL SELECT 'Cards', COUNT(*) FROM Cards;" +# Counts should match baseline recorded in Step 2 of injection +``` + +### Step 4 — Start the API and verify health + +```bash +dotnet run --project backend/src/Taskdeck.Api/Taskdeck.Api.csproj & +API_PID=$! 
+ +# Wait for health +for i in $(seq 1 30); do + STATUS=$(curl -s -o /dev/null -w "%{http_code}" http://localhost:5000/health/ready 2>/dev/null || true) + if [[ "$STATUS" == "200" ]]; then echo "API healthy."; break; fi + sleep 2 +done + +curl -s http://localhost:5000/health/ready | python3 -m json.tool +``` + +### Step 5 — Smoke-test data access via API + +```bash +TOKEN=$(curl -s -X POST http://localhost:5000/api/auth/login \ + -H "Content-Type: application/json" \ + -d '{"username":"","password":""}' | python3 -c "import sys,json; print(json.load(sys.stdin)['token'])") + +curl -s http://localhost:5000/api/boards \ + -H "Authorization: Bearer $TOKEN" | python3 -m json.tool +# Should return boards matching the pre-incident state +``` + +## Evidence Checklist + +After completing the rehearsal, the evidence package must include: + +- [ ] Backup command and output (including backup filename and integrity result) +- [ ] Pre-incident row counts (Boards, Cards minimum) +- [ ] Fault injection command and confirmation that data was lost/corrupt +- [ ] `PRAGMA integrity_check` output on the chosen backup file +- [ ] Restore script output (full stdout) +- [ ] Post-restore row counts confirming match with pre-incident baseline +- [ ] API `/health/ready` response after restart (200 OK expected) +- [ ] API smoke-test result (boards list with expected data) +- [ ] Elapsed wall-clock time from decision-to-restore to API healthy (RTO measurement) +- [ ] Any deviations, findings, or gaps observed + +## Pass Criteria + +| Check | Expected | +| --- | --- | +| Backup script exits 0 | Yes | +| `PRAGMA integrity_check` on backup | `ok` | +| Restore script exits 0 | Yes | +| Post-restore row counts match baseline | Yes | +| API `/health/ready` returns 200 after restart | Yes | +| Total elapsed time (decision to healthy API) | < 30 minutes | + +## Related Documents + +- `docs/ops/DISASTER_RECOVERY_RUNBOOK.md` — full backup/restore reference +- `scripts/backup.sh` / `scripts/backup.ps1` — 
backup automation +- `scripts/restore.sh` / `scripts/restore.ps1` — restore automation +- `docs/ops/EVIDENCE_TEMPLATE.md` — evidence package format +- `docs/ops/INCIDENT_REHEARSAL_CADENCE.md` — drill schedule diff --git a/scripts/backup.ps1 b/scripts/backup.ps1 new file mode 100644 index 000000000..125fa3903 --- /dev/null +++ b/scripts/backup.ps1 @@ -0,0 +1,170 @@ +#Requires -Version 5.1 +<# +.SYNOPSIS + Create a timestamped hot backup of the Taskdeck SQLite database. + +.DESCRIPTION + Uses sqlite3.exe's .backup command for a consistent online backup (safe while the DB is + being written). Falls back to Copy-Item with a warning if sqlite3.exe is not on PATH. + + Retention: keeps the N most-recent backup files and deletes older ones. + +.PARAMETER DbPath + Path to the SQLite database file. + Default: resolved from ConnectionStrings__DefaultConnection env var, then + "$env:USERPROFILE\.taskdeck\taskdeck.db". + +.PARAMETER OutputDir + Directory to write backup files into. + Default: "$env:USERPROFILE\.taskdeck\backups". + +.PARAMETER Retain + Number of most-recent backups to keep. Default: 7. 
+ +.EXAMPLE + .\scripts\backup.ps1 + .\scripts\backup.ps1 -DbPath "C:\app\data\taskdeck.db" -OutputDir "D:\backups" + .\scripts\backup.ps1 -Retain 14 +#> + +[CmdletBinding()] +param( + [string]$DbPath = "", + [string]$OutputDir = "", + [int] $Retain = 7 +) + +Set-StrictMode -Version Latest +$ErrorActionPreference = "Stop" + +# --------------------------------------------------------------------------- +# Resolve DB path +# --------------------------------------------------------------------------- +if ([string]::IsNullOrWhiteSpace($DbPath)) { + $csEnv = $env:ConnectionStrings__DefaultConnection + if (-not [string]::IsNullOrWhiteSpace($csEnv)) { + # Parse "Data Source=/path/to/taskdeck.db" (handles extra parameters like ";Pooling=true") + $DbPath = ($csEnv -split ';' | Where-Object { $_ -match 'Data Source=' } | ForEach-Object { $_ -replace '.*Data Source=', '' }).Trim() + } elseif (-not [string]::IsNullOrWhiteSpace($env:TASKDECK_DB_PATH)) { + $DbPath = $env:TASKDECK_DB_PATH + } else { + $DbPath = Join-Path $env:USERPROFILE ".taskdeck\taskdeck.db" + } +} + +if ([string]::IsNullOrWhiteSpace($OutputDir)) { + $OutputDir = Join-Path $env:USERPROFILE ".taskdeck\backups" +} + +# --------------------------------------------------------------------------- +# Validate inputs +# --------------------------------------------------------------------------- +if (-not (Test-Path $DbPath -PathType Leaf)) { + Write-Error "Database file not found: $DbPath`n Set -DbPath, TASKDECK_DB_PATH, or ConnectionStrings__DefaultConnection." 
+} + +if ($Retain -lt 1) { + Write-Error "-Retain must be >= 1 (got: $Retain)" +} + +# --------------------------------------------------------------------------- +# Create output directory with restricted ACL (owner-only) +# --------------------------------------------------------------------------- +if (-not (Test-Path $OutputDir)) { + New-Item -ItemType Directory -Path $OutputDir -Force | Out-Null +} + +# Restrict directory to current user only +try { + $acl = Get-Acl $OutputDir + $acl.SetAccessRuleProtection($true, $false) + $acl.Access | ForEach-Object { $acl.RemoveAccessRule($_) | Out-Null } + $rule = New-Object System.Security.AccessControl.FileSystemAccessRule( + [System.Security.Principal.WindowsIdentity]::GetCurrent().Name, + "FullControl", "ContainerInherit,ObjectInherit", "None", "Allow" + ) + $acl.AddAccessRule($rule) + Set-Acl $OutputDir $acl +} catch { + Write-Warning "Could not restrict backup directory ACL: $_" +} + +# --------------------------------------------------------------------------- +# Build backup filename +# --------------------------------------------------------------------------- +$Timestamp = (Get-Date).ToUniversalTime().ToString("yyyy-MM-dd-HHmmss") +$BackupFile = Join-Path $OutputDir "taskdeck-backup-$Timestamp.db" + +Write-Host "Backing up: $DbPath" +Write-Host " to: $BackupFile" + +# --------------------------------------------------------------------------- +# Perform backup +# --------------------------------------------------------------------------- +$Sqlite3 = Get-Command sqlite3.exe -ErrorAction SilentlyContinue + +if ($Sqlite3) { + # sqlite3 .backup is a hot backup: copies pages under a shared lock, + # flushing WAL frames first. Safe with active readers and writers. + $SafeBackupFile = $BackupFile -replace "'", "''" + & $Sqlite3.Source $DbPath ".backup '$SafeBackupFile'" + if ($LASTEXITCODE -ne 0) { + Write-Error "sqlite3 .backup failed (exit code $LASTEXITCODE)." 
+ } + Write-Host "Method: sqlite3 hot backup (safe with active writers)" +} else { + Write-Warning "sqlite3.exe not found on PATH. Falling back to Copy-Item." + Write-Warning "Copy-Item is NOT safe if the database has active writers." + Write-Warning "Install sqlite3 for production use." + Copy-Item $DbPath $BackupFile +} + +# Restrict backup file to current user only +try { + $acl = Get-Acl $BackupFile + $acl.SetAccessRuleProtection($true, $false) + $acl.Access | ForEach-Object { $acl.RemoveAccessRule($_) | Out-Null } + $rule = New-Object System.Security.AccessControl.FileSystemAccessRule( + [System.Security.Principal.WindowsIdentity]::GetCurrent().Name, + "FullControl", "None", "None", "Allow" + ) + $acl.AddAccessRule($rule) + Set-Acl $BackupFile $acl +} catch { + Write-Warning "Could not restrict backup file ACL: $_" +} + +# --------------------------------------------------------------------------- +# Quick integrity check on the backup +# --------------------------------------------------------------------------- +if ($Sqlite3) { + $Integrity = & $Sqlite3.Source $BackupFile "PRAGMA integrity_check;" 2>&1 + if ($Integrity -ne "ok") { + Remove-Item $BackupFile -Force -ErrorAction SilentlyContinue + Write-Error "Backup integrity check failed: $Integrity" + } + Write-Host "Integrity: ok" +} + +Write-Host "Backup written: $BackupFile" + +# --------------------------------------------------------------------------- +# Retention: keep only the N most-recent backups; delete older ones +# --------------------------------------------------------------------------- +$AllBackups = @(Get-ChildItem -Path $OutputDir -Filter "taskdeck-backup-*.db" | + Sort-Object LastWriteTime -Descending) + +$Total = $AllBackups.Count +if ($Total -gt $Retain) { + $DeleteCount = $Total - $Retain + $ToDelete = $AllBackups | Select-Object -Skip $Retain + foreach ($File in $ToDelete) { + Remove-Item $File.FullName -Force + Write-Host "Removed old backup: $($File.FullName)" + } + Write-Host 
"Retention: kept $Retain of $Total backups, removed $DeleteCount." +} else { + Write-Host "Retention: $Total backup(s) kept (limit $Retain)." +} + +Write-Host "Done." diff --git a/scripts/backup.sh b/scripts/backup.sh new file mode 100644 index 000000000..7ad10a771 --- /dev/null +++ b/scripts/backup.sh @@ -0,0 +1,198 @@ +#!/usr/bin/env bash +# scripts/backup.sh +# +# Create a timestamped hot backup of the Taskdeck SQLite database. +# Uses sqlite3's .backup command for a consistent online backup (safe while +# the DB is being written). Falls back to cp with a warning if sqlite3 is +# not available (cp is NOT safe with an active writer — avoid in production). +# +# Usage: +# bash scripts/backup.sh [OPTIONS] +# +# Options: +# --db-path PATH Path to the SQLite database file. +# Default: resolves from ConnectionStrings env var, +# then ~/.taskdeck/taskdeck.db +# --output-dir DIR Directory to write backup files into. +# Default: ~/.taskdeck/backups/ +# --retain N Number of most-recent backups to keep (delete older). +# Default: 7 +# --help Show this help message and exit. 
+# +# Examples: +# bash scripts/backup.sh +# bash scripts/backup.sh --db-path /app/data/taskdeck.db --output-dir /backups +# bash scripts/backup.sh --retain 14 + +set -euo pipefail + +# --------------------------------------------------------------------------- +# Defaults +# --------------------------------------------------------------------------- +DEFAULT_DB_PATH="${HOME}/.taskdeck/taskdeck.db" +DEFAULT_OUTPUT_DIR="${HOME}/.taskdeck/backups" +DEFAULT_RETAIN=7 + +DB_PATH="" +OUTPUT_DIR="" +RETAIN="" + +# --------------------------------------------------------------------------- +# Argument parsing +# --------------------------------------------------------------------------- +usage() { + sed -n '/^# Usage:/,/^[^#]/p' "$0" | sed '$d' | sed 's/^# \{0,1\}//' + exit 0 +} + +while [[ $# -gt 0 ]]; do + case "$1" in + --db-path) + DB_PATH="$2" + shift 2 + ;; + --output-dir) + OUTPUT_DIR="$2" + shift 2 + ;; + --retain) + RETAIN="$2" + shift 2 + ;; + --help|-h) + usage + ;; + *) + echo "ERROR: unknown argument: $1" >&2 + exit 1 + ;; + esac +done + +# --------------------------------------------------------------------------- +# Resolve DB path +# --------------------------------------------------------------------------- +if [[ -z "$DB_PATH" ]]; then + # Try to extract from the ConnectionStrings env var (e.g. "Data Source=/app/data/taskdeck.db;Pooling=true") + if [[ -n "${ConnectionStrings__DefaultConnection:-}" ]]; then + DB_PATH=$(echo "${ConnectionStrings__DefaultConnection}" | sed -n 's/.*Data Source=\([^;]*\).*/\1/p') + elif [[ -n "${TASKDECK_DB_PATH:-}" ]]; then + DB_PATH="$TASKDECK_DB_PATH" + else + DB_PATH="$DEFAULT_DB_PATH" + fi +fi + +OUTPUT_DIR="${OUTPUT_DIR:-$DEFAULT_OUTPUT_DIR}" +RETAIN="${RETAIN:-$DEFAULT_RETAIN}" + +# --------------------------------------------------------------------------- +# Validate inputs +# --------------------------------------------------------------------------- +if [[ ! 
-f "$DB_PATH" ]]; then + echo "ERROR: database file not found: $DB_PATH" >&2 + echo " Set --db-path, TASKDECK_DB_PATH, or ConnectionStrings__DefaultConnection." >&2 + exit 1 +fi + +# Reject paths containing single quotes: sqlite3 dot-commands pass paths +# as string literals delimited by single quotes; an embedded quote would +# truncate or misroute the command. This is a deliberate hard stop — paths +# with single quotes are unusual and the risk of silent data mishandling +# outweighs the convenience of supporting them. +if [[ "$DB_PATH" == *"'"* ]]; then + echo "ERROR: --db-path must not contain single-quote characters: $DB_PATH" >&2 + exit 1 +fi + +if [[ "$OUTPUT_DIR" == *"'"* ]]; then + echo "ERROR: --output-dir must not contain single-quote characters: $OUTPUT_DIR" >&2 + exit 1 +fi + +if ! [[ "$RETAIN" =~ ^[0-9]+$ ]] || [[ "$RETAIN" -lt 1 ]]; then + echo "ERROR: --retain must be a positive integer (got: $RETAIN)" >&2 + exit 1 +fi + +# --------------------------------------------------------------------------- +# Create output directory +# --------------------------------------------------------------------------- +mkdir -p "$OUTPUT_DIR" +# Restrict backup directory permissions (owner read/write only) +chmod 700 "$OUTPUT_DIR" 2>/dev/null || true + +# --------------------------------------------------------------------------- +# Build backup filename +# --------------------------------------------------------------------------- +TIMESTAMP="$(date -u '+%Y-%m-%d-%H%M%S')" +BACKUP_FILE="${OUTPUT_DIR}/taskdeck-backup-${TIMESTAMP}.db" + +# --------------------------------------------------------------------------- +# Perform backup +# --------------------------------------------------------------------------- +echo "Backing up: $DB_PATH" +echo " to: $BACKUP_FILE" + +if command -v sqlite3 &>/dev/null; then + # sqlite3 .backup is a hot backup: it copies pages under an SQLite shared + # lock, flushing any pending WAL frames first. 
Safe with active readers and + # writers — the output is a consistent snapshot. + SAFE_BACKUP_FILE="${BACKUP_FILE//\'/\'\'}" + sqlite3 "$DB_PATH" ".backup '${SAFE_BACKUP_FILE}'" + echo "Method: sqlite3 hot backup (safe with active writers)" +else + echo "WARNING: sqlite3 not found. Falling back to cp." >&2 + echo "WARNING: cp is NOT safe if the database has active writers." >&2 + echo "WARNING: For WAL-mode databases the backup may be incomplete without the -wal file." >&2 + echo " Install sqlite3 for production use." >&2 + cp "$DB_PATH" "$BACKUP_FILE" + # Copy WAL file if it exists so the backup is usable for WAL-mode databases + [ -f "${DB_PATH}-wal" ] && cp "${DB_PATH}-wal" "${BACKUP_FILE}-wal" +fi + +# Restrict backup file permissions (owner read/write only) +chmod 600 "$BACKUP_FILE" + +# --------------------------------------------------------------------------- +# Quick integrity check on the backup +# --------------------------------------------------------------------------- +if command -v sqlite3 &>/dev/null; then + INTEGRITY="$(sqlite3 "$BACKUP_FILE" 'PRAGMA integrity_check;' 2>&1)" + if [[ "$INTEGRITY" != "ok" ]]; then + echo "ERROR: backup integrity check failed: $INTEGRITY" >&2 + rm -f "$BACKUP_FILE" + exit 1 + fi + echo "Integrity: ok" +fi + +echo "Backup written: $BACKUP_FILE" + +# --------------------------------------------------------------------------- +# Retention: keep only the N most-recent backups; delete older ones +# --------------------------------------------------------------------------- +# List backups sorted newest-first (ls -t); delete all but the first $RETAIN entries. +# The glob is intentionally narrow (taskdeck-backup-*.db) to avoid touching +# files not managed by this script. +# Uses a while-read loop instead of mapfile for macOS Bash 3.2 compatibility. 
+ALL_BACKUPS=()
+while IFS= read -r line; do
+  ALL_BACKUPS+=("$line")
+done < <(ls -1t "${OUTPUT_DIR}/taskdeck-backup-"*.db 2>/dev/null)
+
+TOTAL="${#ALL_BACKUPS[@]}"
+if [[ "$TOTAL" -gt "$RETAIN" ]]; then
+  DELETE_COUNT=$(( TOTAL - RETAIN ))
+  # The array is newest-first (ls -t); trim from the end (oldest entries)
+  for (( i = RETAIN; i < TOTAL; i++ )); do
+    VICTIM="${ALL_BACKUPS[$i]}"
+    rm -f "$VICTIM"
+    echo "Removed old backup: $VICTIM"
+  done
+  echo "Retention: kept $RETAIN of $TOTAL backups, removed $DELETE_COUNT."
+else
+  echo "Retention: $TOTAL backup(s) kept (limit $RETAIN)."
+fi
+
+echo "Done."
diff --git a/scripts/restore.ps1 b/scripts/restore.ps1
new file mode 100644
index 000000000..2fda61a34
--- /dev/null
+++ b/scripts/restore.ps1
@@ -0,0 +1,248 @@
+#Requires -Version 5.1
+<#
+.SYNOPSIS
+    Restore the Taskdeck SQLite database from a backup file.
+
+.DESCRIPTION
+    Before overwriting the live database the script:
+      1. Verifies the backup is a valid SQLite file (magic bytes + PRAGMA integrity_check).
+      2. Creates a timestamped safety copy of the current live database.
+      3. Replaces the live database with the backup.
+      4. Runs a post-restore integrity check.
+
+.PARAMETER BackupFile
+    Path to the backup .db file to restore from. REQUIRED.
+
+.PARAMETER DbPath
+    Path to the live database to overwrite.
+    Default: resolved from the ConnectionStrings__DefaultConnection env var,
+    then TASKDECK_DB_PATH, then "$env:USERPROFILE\.taskdeck\taskdeck.db".
+
+.PARAMETER SafetyDir
+    Directory to write the pre-restore safety copy into.
+    Default: same directory as -DbPath, or "$env:USERPROFILE\.taskdeck\backups".
+
+.PARAMETER Yes
+    Skip the interactive confirmation prompt.
+
+.EXAMPLE
+    .\scripts\restore.ps1 -BackupFile "$env:USERPROFILE\.taskdeck\backups\taskdeck-backup-2026-04-01-120000.db"
+    .\scripts\restore.ps1 -BackupFile "D:\backups\taskdeck-backup-2026-04-01-120000.db" `
+        -DbPath "C:\app\data\taskdeck.db" -Yes
+#>
+
+[CmdletBinding(SupportsShouldProcess)]
+param(
+    [Parameter(Mandatory)]
+    [string]$BackupFile,
+
+    [string]$DbPath = "",
+    [string]$SafetyDir = "",
+    [switch]$Yes
+)
+
+Set-StrictMode -Version Latest
+$ErrorActionPreference = "Stop"
+
+# ---------------------------------------------------------------------------
+# Validate backup file exists
+# ---------------------------------------------------------------------------
+if (-not (Test-Path $BackupFile -PathType Leaf)) {
+    Write-Error "Backup file not found: $BackupFile"
+}
+
+# ---------------------------------------------------------------------------
+# Resolve DB path
+# ---------------------------------------------------------------------------
+if ([string]::IsNullOrWhiteSpace($DbPath)) {
+    $csEnv = $env:ConnectionStrings__DefaultConnection
+    if (-not [string]::IsNullOrWhiteSpace($csEnv)) {
+        $DbPath = ($csEnv -split ';' | Where-Object { $_ -match 'Data Source=' } | ForEach-Object { $_ -replace '.*Data Source=', '' }).Trim()
+    } elseif (-not [string]::IsNullOrWhiteSpace($env:TASKDECK_DB_PATH)) {
+        $DbPath = $env:TASKDECK_DB_PATH
+    } else {
+        $DbPath = Join-Path $env:USERPROFILE ".taskdeck\taskdeck.db"
+    }
+}
+
+# ---------------------------------------------------------------------------
+# Resolve safety copy directory
+# ---------------------------------------------------------------------------
+if ([string]::IsNullOrWhiteSpace($SafetyDir)) {
+    $DbDir = Split-Path $DbPath -Parent
+    if (Test-Path $DbDir -PathType Container) {
+        $SafetyDir = $DbDir
+    } else {
+        $SafetyDir = Join-Path $env:USERPROFILE ".taskdeck\backups"
+    }
+}
+
+# ---------------------------------------------------------------------------
+# Step 1: Verify backup is a valid SQLite database
+# ---------------------------------------------------------------------------
+Write-Host "Verifying backup file: $BackupFile"
+
+# Check SQLite magic bytes: first 15 bytes must be "SQLite format 3" (the full
+# header is 16 bytes including the null terminator, but we compare the text only)
+$SqliteMagic = [System.Text.Encoding]::ASCII.GetBytes("SQLite format 3")
+
+if ((Get-Item $BackupFile).Length -lt 16) {
+    Write-Error "Backup file is too small to be a valid SQLite database."
+}
+
+# Read only the first 16 bytes instead of the entire file
+$HeaderBytes = [byte[]]::new(16)
+$stream = [System.IO.File]::OpenRead((Resolve-Path -LiteralPath $BackupFile).ProviderPath)
+try { [void]$stream.Read($HeaderBytes, 0, 16) } finally { $stream.Close() }
+
+$MagicMatch = $true
+for ($i = 0; $i -lt $SqliteMagic.Length; $i++) {
+    if ($HeaderBytes[$i] -ne $SqliteMagic[$i]) {
+        $MagicMatch = $false
+        break
+    }
+}
+
+if (-not $MagicMatch) {
+    Write-Error "Backup file does not have the SQLite magic header. This is not a valid SQLite database."
+}
+Write-Host "File type check: SQLite magic bytes verified"
+
+$Sqlite3 = Get-Command sqlite3.exe -ErrorAction SilentlyContinue
+
+if ($Sqlite3) {
+    Write-Host "Running integrity check on backup..."
+    $Integrity = & $Sqlite3.Source $BackupFile "PRAGMA integrity_check;" 2>&1
+    if ($Integrity -ne "ok") {
+        Write-Error "Backup integrity check failed: $Integrity"
+    }
+    Write-Host "Integrity check: ok"
+
+    # Sanity-check the schema: verify the backup looks like a Taskdeck database
+    $Tables = (& $Sqlite3.Source $BackupFile ".tables" 2>&1) -join " "
+    if ([string]::IsNullOrWhiteSpace($Tables)) {
+        Write-Warning "Backup database is empty (no tables found)."
+        if (-not $Yes) {
+            $Confirm = Read-Host "Restore an empty database? [y/N]"
+            if ($Confirm -notmatch '^[Yy]$') { Write-Host "Aborted."; exit 1 }
+        }
+    } elseif ($Tables -notmatch 'Boards') {
+        Write-Warning "Backup does not contain a 'Boards' table. Tables: $Tables"
+        Write-Warning "This may not be a Taskdeck database."
+        if (-not $Yes) {
+            $Confirm = Read-Host "Restore anyway? [y/N]"
+            if ($Confirm -notmatch '^[Yy]$') { Write-Host "Aborted."; exit 1 }
+        }
+    }
+} else {
+    Write-Warning "sqlite3.exe not found. Skipping PRAGMA integrity_check."
+    Write-Warning "Install sqlite3 for full validation."
+}
+
+# ---------------------------------------------------------------------------
+# Step 2: Interactive confirmation (unless -Yes)
+# ---------------------------------------------------------------------------
+Write-Host ""
+Write-Host "  Backup file : $BackupFile"
+Write-Host "  Live DB     : $DbPath"
+Write-Host "  Safety copy : $SafetyDir\"
+Write-Host ""
+
+if (-not $Yes) {
+    $Confirm = Read-Host "WARNING: this will overwrite the live database. Proceed? [y/N]"
+    if ($Confirm -notmatch '^[Yy]$') { Write-Host "Aborted."; exit 1 }
+}
+
+# ---------------------------------------------------------------------------
+# Step 3: Create safety copy of the current live database
+# ---------------------------------------------------------------------------
+if (-not (Test-Path $SafetyDir)) {
+    New-Item -ItemType Directory -Path $SafetyDir -Force | Out-Null
+}
+
+$Timestamp = (Get-Date).ToUniversalTime().ToString("yyyy-MM-dd-HHmmss")
+$SafetyFile = $null
+
+if (Test-Path $DbPath -PathType Leaf) {
+    $SafetyFile = Join-Path $SafetyDir "taskdeck-pre-restore-$Timestamp.db"
+    if ($Sqlite3) {
+        $SafeSafetyFile = $SafetyFile -replace "'", "''"
+        & $Sqlite3.Source $DbPath ".backup '$SafeSafetyFile'"
+        if ($LASTEXITCODE -ne 0) {
+            Write-Error "Failed to create safety copy (sqlite3 exit code $LASTEXITCODE)."
+        }
+    } else {
+        Copy-Item $DbPath $SafetyFile
+    }
+    # Restrict safety copy permissions
+    try {
+        $acl = Get-Acl $SafetyFile
+        $acl.SetAccessRuleProtection($true, $false)
+        $acl.Access | ForEach-Object { $acl.RemoveAccessRule($_) | Out-Null }
+        $rule = New-Object System.Security.AccessControl.FileSystemAccessRule(
+            [System.Security.Principal.WindowsIdentity]::GetCurrent().Name,
+            "FullControl", "None", "None", "Allow"
+        )
+        $acl.AddAccessRule($rule)
+        Set-Acl $SafetyFile $acl
+    } catch {
+        Write-Warning "Could not restrict safety copy ACL: $_"
+    }
+    Write-Host "Safety copy created: $SafetyFile"
+} else {
+    Write-Host "INFO: no existing database at $DbPath; skipping safety copy."
+}
+
+# ---------------------------------------------------------------------------
+# Step 4: Restore
+# ---------------------------------------------------------------------------
+$DbDir = Split-Path $DbPath -Parent
+if (-not (Test-Path $DbDir)) {
+    New-Item -ItemType Directory -Path $DbDir -Force | Out-Null
+}
+
+# Remove stale WAL/SHM files to prevent replay against restored DB.
+# EF Core uses WAL mode by default; leftover -wal/-shm from the previous
+# database would be replayed on first open, silently corrupting the restore.
+Remove-Item -Force -ErrorAction SilentlyContinue "${DbPath}-wal"
+Remove-Item -Force -ErrorAction SilentlyContinue "${DbPath}-shm"
+
+if ($Sqlite3) {
+    $SafeBackupFile = $BackupFile -replace "'", "''"
+    & $Sqlite3.Source $DbPath ".restore '$SafeBackupFile'"
+    if ($LASTEXITCODE -ne 0) {
+        Write-Error "sqlite3 .restore failed (exit code $LASTEXITCODE). Safety copy: $SafetyFile"
+    }
+} else {
+    Copy-Item $BackupFile $DbPath -Force
+}
+
+# Restrict restored DB permissions
+try {
+    $acl = Get-Acl $DbPath
+    $acl.SetAccessRuleProtection($true, $false)
+    $acl.Access | ForEach-Object { $acl.RemoveAccessRule($_) | Out-Null }
+    $rule = New-Object System.Security.AccessControl.FileSystemAccessRule(
+        [System.Security.Principal.WindowsIdentity]::GetCurrent().Name,
+        "FullControl", "None", "None", "Allow"
+    )
+    $acl.AddAccessRule($rule)
+    Set-Acl $DbPath $acl
+} catch {
+    Write-Warning "Could not restrict restored DB ACL: $_"
+}
+
+Write-Host "Restored: $BackupFile -> $DbPath"
+
+# ---------------------------------------------------------------------------
+# Step 5: Post-restore integrity verification
+# ---------------------------------------------------------------------------
+if ($Sqlite3) {
+    $Integrity = & $Sqlite3.Source $DbPath "PRAGMA integrity_check;" 2>&1
+    if ($Integrity -ne "ok") {
+        Write-Error "Post-restore integrity check FAILED: $Integrity`n Safety copy is at: $SafetyFile"
+    }
+    Write-Host "Post-restore integrity check: ok"
+}
+
+Write-Host "Done. Restart the Taskdeck API to pick up the restored database."
diff --git a/scripts/restore.sh b/scripts/restore.sh
new file mode 100644
index 000000000..5b8093e72
--- /dev/null
+++ b/scripts/restore.sh
@@ -0,0 +1,273 @@
+#!/usr/bin/env bash
+# scripts/restore.sh
+#
+# Restore the Taskdeck SQLite database from a backup file.
+# Before overwriting the live database the script:
+#   1. Verifies the backup is a valid SQLite file (magic bytes check + PRAGMA integrity_check).
+#   2. Creates a timestamped safety copy of the current database.
+#   3. Replaces the live database with the backup, then re-runs the integrity check.
+#
+# Usage:
+#   bash scripts/restore.sh --backup-file FILE [OPTIONS]
+#
+# Options:
+#   --backup-file FILE   Path to the backup .db file to restore from. REQUIRED.
+#   --db-path PATH       Path to the live database to overwrite.
+#                        Default: resolves from ConnectionStrings env var,
+#                        then TASKDECK_DB_PATH, then ~/.taskdeck/taskdeck.db
+#   --safety-dir DIR     Directory to write the pre-restore safety copy into.
+#                        Default: same directory as --db-path, or
+#                        ~/.taskdeck/backups/
+#   --yes                Skip the interactive confirmation prompt.
+#   --help               Show this help message and exit.
+#
+# Examples:
+#   bash scripts/restore.sh --backup-file ~/.taskdeck/backups/taskdeck-backup-2026-04-01-120000.db
+#   bash scripts/restore.sh --backup-file /backups/taskdeck-backup-2026-04-01-120000.db \
+#     --db-path /app/data/taskdeck.db --yes
+
+set -euo pipefail
+
+# ---------------------------------------------------------------------------
+# Defaults
+# ---------------------------------------------------------------------------
+DEFAULT_DB_PATH="${HOME}/.taskdeck/taskdeck.db"
+DEFAULT_SAFETY_DIR="${HOME}/.taskdeck/backups"
+
+BACKUP_FILE=""
+DB_PATH=""
+SAFETY_DIR=""
+YES=0
+
+# ---------------------------------------------------------------------------
+# Argument parsing
+# ---------------------------------------------------------------------------
+usage() {
+  sed -n '/^# Usage:/,/^[^#]/p' "$0" | sed '$d' | sed 's/^# \{0,1\}//'
+  exit 0
+}
+
+while [[ $# -gt 0 ]]; do
+  case "$1" in
+    --backup-file)
+      BACKUP_FILE="$2"
+      shift 2
+      ;;
+    --db-path)
+      DB_PATH="$2"
+      shift 2
+      ;;
+    --safety-dir)
+      SAFETY_DIR="$2"
+      shift 2
+      ;;
+    --yes|-y)
+      YES=1
+      shift
+      ;;
+    --help|-h)
+      usage
+      ;;
+    *)
+      echo "ERROR: unknown argument: $1" >&2
+      exit 1
+      ;;
+  esac
+done
+
+# ---------------------------------------------------------------------------
+# Validate required args
+# ---------------------------------------------------------------------------
+if [[ -z "$BACKUP_FILE" ]]; then
+  echo "ERROR: --backup-file is required." >&2
+  echo "       Run: bash scripts/restore.sh --help" >&2
+  exit 1
+fi
+
+if [[ ! -f "$BACKUP_FILE" ]]; then
+  echo "ERROR: backup file not found: $BACKUP_FILE" >&2
+  exit 1
+fi
+
+# ---------------------------------------------------------------------------
+# Resolve DB path
+# ---------------------------------------------------------------------------
+if [[ -z "$DB_PATH" ]]; then
+  if [[ -n "${ConnectionStrings__DefaultConnection:-}" ]]; then
+    DB_PATH=$(echo "${ConnectionStrings__DefaultConnection}" | sed -n 's/.*Data Source=\([^;]*\).*/\1/p')
+  elif [[ -n "${TASKDECK_DB_PATH:-}" ]]; then
+    DB_PATH="$TASKDECK_DB_PATH"
+  else
+    DB_PATH="$DEFAULT_DB_PATH"
+  fi
+fi
+
+# ---------------------------------------------------------------------------
+# Resolve safety copy directory
+# ---------------------------------------------------------------------------
+if [[ -z "$SAFETY_DIR" ]]; then
+  DB_DIR="$(dirname "$DB_PATH")"
+  if [[ -w "$DB_DIR" ]]; then
+    SAFETY_DIR="$DB_DIR"
+  else
+    SAFETY_DIR="$DEFAULT_SAFETY_DIR"
+  fi
+fi
+
+# ---------------------------------------------------------------------------
+# Reject paths containing single quotes (sqlite3 dot-command safety)
+# ---------------------------------------------------------------------------
+if [[ "$BACKUP_FILE" == *"'"* ]]; then
+  echo "ERROR: --backup-file must not contain single-quote characters: $BACKUP_FILE" >&2
+  exit 1
+fi
+if [[ "$DB_PATH" == *"'"* || "$SAFETY_DIR" == *"'"* ]]; then
+  echo "ERROR: --db-path and --safety-dir must not contain single-quote characters: $DB_PATH, $SAFETY_DIR" >&2
+  exit 1
+fi
+
+# ---------------------------------------------------------------------------
+# Step 1: Verify backup is a valid SQLite database
+# ---------------------------------------------------------------------------
+echo "Verifying backup file: $BACKUP_FILE"
+
+# Check SQLite magic bytes: first 15 bytes must be "SQLite format 3" (the full
+# header is 16 bytes including the null terminator, but we compare the text only).
+# Prefer exact byte comparison (dd+xxd) over heuristic `file` output.
+MAGIC_EXPECTED="53514c69746520666f726d61742033"
+MAGIC_ACTUAL="$(dd if="$BACKUP_FILE" bs=1 count=15 2>/dev/null | xxd -p 2>/dev/null || true)"
+
+if [[ -n "$MAGIC_ACTUAL" ]]; then
+  # Exact magic-byte check via dd + xxd (most precise)
+  if [[ "$MAGIC_ACTUAL" != "$MAGIC_EXPECTED" ]]; then
+    echo "ERROR: backup file SQLite magic bytes do not match." >&2
+    echo "       Expected: $MAGIC_EXPECTED" >&2
+    echo "       Actual:   $MAGIC_ACTUAL" >&2
+    exit 1
+  fi
+  echo "File type check: SQLite magic bytes verified"
+elif command -v file &>/dev/null; then
+  # Fallback: heuristic check via file command
+  FILE_TYPE="$(file -b "$BACKUP_FILE")"
+  if [[ "$FILE_TYPE" != *"SQLite"* ]]; then
+    echo "ERROR: backup file does not appear to be a SQLite database." >&2
+    echo "       file: $FILE_TYPE" >&2
+    exit 1
+  fi
+  echo "File type check: $FILE_TYPE"
+else
+  echo "WARNING: could not verify SQLite magic bytes (xxd and file not available)." >&2
+  echo "         Proceeding with integrity_check only." >&2
+fi
+
+# Run PRAGMA integrity_check if sqlite3 is available
+if command -v sqlite3 &>/dev/null; then
+  echo "Running integrity check on backup..."
+  INTEGRITY="$(sqlite3 "$BACKUP_FILE" 'PRAGMA integrity_check;' 2>&1 || true)"
+  if [[ "$INTEGRITY" != "ok" ]]; then
+    echo "ERROR: backup integrity check failed." >&2
+    echo "       PRAGMA integrity_check returned: $INTEGRITY" >&2
+    exit 1
+  fi
+  echo "Integrity check: ok"
+
+  # Also verify the schema looks like a Taskdeck database by checking for
+  # at least one expected table (Boards). This catches accidentally restoring
+  # a wrong SQLite file.
+  TABLES="$(sqlite3 "$BACKUP_FILE" ".tables" 2>/dev/null || true)"
+  if [[ -z "$TABLES" ]]; then
+    echo "WARNING: backup database is empty (no tables found)." >&2
+    echo "         If this is intentional (blank slate restore), add --yes to skip." >&2
+    if [[ "$YES" -ne 1 ]]; then
+      read -r -p "Restore an empty database? [y/N] " CONFIRM
+      [[ "$CONFIRM" =~ ^[Yy]$ ]] || { echo "Aborted."; exit 1; }
+    fi
+  elif [[ "$TABLES" != *"Boards"* ]]; then
+    echo "WARNING: backup does not contain a 'Boards' table." >&2
+    echo "         Tables found: $TABLES" >&2
+    echo "         This may not be a Taskdeck database." >&2
+    if [[ "$YES" -ne 1 ]]; then
+      read -r -p "Restore anyway? [y/N] " CONFIRM
+      [[ "$CONFIRM" =~ ^[Yy]$ ]] || { echo "Aborted."; exit 1; }
+    fi
+  fi
+else
+  echo "WARNING: sqlite3 not available; skipping PRAGMA integrity_check." >&2
+  echo "         Install sqlite3 for full validation." >&2
+fi
+
+# ---------------------------------------------------------------------------
+# Step 2: Interactive confirmation (unless --yes)
+# ---------------------------------------------------------------------------
+echo ""
+echo "  Backup file : $BACKUP_FILE"
+echo "  Live DB     : $DB_PATH"
+echo "  Safety copy : $SAFETY_DIR/"
+echo ""
+
+if [[ "$YES" -ne 1 ]]; then
+  echo "WARNING: this will overwrite the live database."
+  read -r -p "Proceed with restore? [y/N] " CONFIRM
+  [[ "$CONFIRM" =~ ^[Yy]$ ]] || { echo "Aborted."; exit 1; }
+fi
+
+# ---------------------------------------------------------------------------
+# Step 3: Create safety copy of the current live database
+# ---------------------------------------------------------------------------
+mkdir -p "$SAFETY_DIR"
+chmod 700 "$SAFETY_DIR" 2>/dev/null || true
+
+TIMESTAMP="$(date -u '+%Y-%m-%d-%H%M%S')"
+
+if [[ -f "$DB_PATH" ]]; then
+  SAFETY_FILE="${SAFETY_DIR}/taskdeck-pre-restore-${TIMESTAMP}.db"
+  if command -v sqlite3 &>/dev/null; then
+    SAFE_SAFETY_FILE="${SAFETY_FILE//\'/\'\'}"
+    sqlite3 "$DB_PATH" ".backup '${SAFE_SAFETY_FILE}'"
+  else
+    cp "$DB_PATH" "$SAFETY_FILE"
+  fi
+  chmod 600 "$SAFETY_FILE"
+  echo "Safety copy created: $SAFETY_FILE"
+else
+  echo "INFO: no existing database at $DB_PATH; skipping safety copy."
+fi
+
+# ---------------------------------------------------------------------------
+# Step 4: Restore
+# ---------------------------------------------------------------------------
+DB_DIR="$(dirname "$DB_PATH")"
+mkdir -p "$DB_DIR"
+
+# Remove stale WAL/SHM files to prevent replay against restored DB.
+# EF Core uses WAL mode by default; leftover -wal/-shm from the previous
+# database would be replayed on first open, silently corrupting the restore.
+rm -f "${DB_PATH}-wal" "${DB_PATH}-shm"
+
+if command -v sqlite3 &>/dev/null; then
+  # Use sqlite3 .restore to write a clean, consistent database image
+  SAFE_BACKUP_FILE="${BACKUP_FILE//\'/\'\'}"
+  sqlite3 "$DB_PATH" ".restore '${SAFE_BACKUP_FILE}'"
+else
+  cp "$BACKUP_FILE" "$DB_PATH"
+fi
+
+chmod 600 "$DB_PATH" 2>/dev/null || true
+
+echo "Restored: $BACKUP_FILE -> $DB_PATH"
+
+# ---------------------------------------------------------------------------
+# Step 5: Post-restore integrity verification
+# ---------------------------------------------------------------------------
+if command -v sqlite3 &>/dev/null; then
+  INTEGRITY="$(sqlite3 "$DB_PATH" 'PRAGMA integrity_check;' 2>&1 || true)"
+  if [[ "$INTEGRITY" != "ok" ]]; then
+    echo "ERROR: post-restore integrity check failed." >&2
+    echo "       PRAGMA integrity_check returned: $INTEGRITY" >&2
+    echo "       The safety copy is at: ${SAFETY_FILE:-}" >&2
+    exit 1
+  fi
+  echo "Post-restore integrity check: ok"
+fi
+
+echo "Done. Restart the Taskdeck API to pick up the restored database."