feat: Add sync & check Commands for End-to-End Data Mirroring #58

@QuakeWang

Why It Matters

  • Multi-cloud migrations, scheduled backups, and CI/CD deployments need a “make dest look like source” primitive instead of manual put loops.
  • Post-transfer validation (hash / size comparison) is table stakes for preventing silent corruption; competitors like rclone expose sync + check as core commands.
  • Having both commands lets users follow a reliable “sync → check → promote” workflow inside one CLI.

Feature Requirements

storify sync <source> <dest>

  • Walk source and destination simultaneously, skip identical files (size+mtime or hash), upload/overwrite changed files, and optionally delete extra destination files (--delete=never|during|after).
  • Optional knobs: --dry-run, --concurrency N, --include/--exclude pattern, --track-renames, --create-empty-src-dirs.
  • Example:
    storify sync ./datasets oss://team-data/datasets --delete=after --dry-run
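
A minimal sketch of the "skip identical files (size+mtime or hash)" decision, using only the Rust standard library. `EntryMeta`, `needs_copy`, and the `mtime_tolerance` parameter are hypothetical names for illustration and are not existing storify or OpenDAL APIs; the rule that a hash takes precedence over mtime when both sides provide one is also an assumption.

```rust
use std::time::{Duration, SystemTime};

/// Hypothetical metadata snapshot for one entry (not a storify or OpenDAL type).
#[derive(Debug, Clone, PartialEq)]
pub struct EntryMeta {
    pub size: u64,
    pub mtime: Option<SystemTime>,
    pub hash: Option<String>,
}

/// Returns true when the destination copy should be uploaded/overwritten.
pub fn needs_copy(src: &EntryMeta, dst: &EntryMeta, mtime_tolerance: Duration) -> bool {
    // Different sizes always mean different content.
    if src.size != dst.size {
        return true;
    }
    // Assumption: when both sides expose a hash, it decides the outcome.
    if let (Some(a), Some(b)) = (&src.hash, &dst.hash) {
        return a != b;
    }
    // Otherwise fall back to mtime; a missing mtime is treated as "unknown, copy".
    match (src.mtime, dst.mtime) {
        (Some(a), Some(b)) => {
            // Absolute difference between the two timestamps.
            let delta = match a.duration_since(b) {
                Ok(d) => d,
                Err(e) => e.duration(),
            };
            delta > mtime_tolerance
        }
        _ => true,
    }
}
```

The tolerance is there because backends may store modification times at different precisions; a real implementation would pick the value per backend.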

storify check <source> <dest>

  • Compare source/dest without modifying either side. Report matches, differences, and missing entries; support --one-way, --download (byte-for-byte when hashes are unavailable), and report files (--differ, --missing-on-src, etc.).
  • Example:
    storify check ./datasets oss://team-data/datasets --differ differ.log --missing-on-dst missing.log
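
The byte-for-byte fallback behind --download could look roughly like the sketch below, which compares two streams chunk by chunk without loading either fully into memory. It works on any `std::io::Read`; `streams_equal` and `read_full` are hypothetical helper names, and wiring them to actual backend readers is left out.

```rust
use std::io::{self, Read};

/// Fills `buf` as far as possible; returns fewer bytes than `buf.len()` only at EOF.
fn read_full<R: Read>(r: &mut R, buf: &mut [u8]) -> io::Result<usize> {
    let mut filled = 0;
    while filled < buf.len() {
        match r.read(&mut buf[filled..]) {
            Ok(0) => break, // EOF
            Ok(n) => filled += n,
            Err(e) if e.kind() == io::ErrorKind::Interrupted => continue,
            Err(e) => return Err(e),
        }
    }
    Ok(filled)
}

/// Returns Ok(true) when both readers yield identical byte streams.
pub fn streams_equal<A: Read, B: Read>(mut a: A, mut b: B) -> io::Result<bool> {
    let mut buf_a = [0u8; 8192];
    let mut buf_b = [0u8; 8192];
    loop {
        let n_a = read_full(&mut a, &mut buf_a)?;
        let n_b = read_full(&mut b, &mut buf_b)?;
        // A length mismatch or any differing byte means the contents differ.
        if n_a != n_b || buf_a[..n_a] != buf_b[..n_b] {
            return Ok(false);
        }
        // Both streams ended at the same point with no mismatch.
        if n_a == 0 {
            return Ok(true);
        }
    }
}
```

Filling each buffer completely before comparing matters because two readers can return the same data in differently sized chunks.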

Implementation Sketch

  1. CLI wiring: add sync/check modules under src/cli, parse flags, and reuse the existing Context + Operator setup.
  2. Dual traversal: build a comparer that iterates source/dest listings (via OpenDAL Lister), producing diff records (Match, SrcOnly, DstOnly) to feed sync or check pipelines.
  3. Sync executor: copy SrcOnly entries; for Match entries, compare size+mtime or hash and copy only when they differ; queue deletes for DstOnly entries according to the --delete mode; reuse storage::operations::{get, put, delete} for actual transfers.
  4. Check reporter: implement a lightweight reporter (stdout or file) emitting = / + / - / * markers or JSON; optionally fall back to streaming comparison when hashes are missing.
  5. Testing: extend tests/behavior with fixtures covering added/modified/deleted/missing cases on the local fs backend and at least one S3-compatible backend.
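
Steps 2 and 3 can be sketched as a merge-join over two path listings followed by a planning pass. This is an illustration only: it assumes both listings are already sorted by path (which a real Lister may not guarantee), and `Diff`, `DeleteMode`, `Action`, `diff_listings`, and `plan` are hypothetical names rather than existing storify types. The `changed` callback stands in for the size+mtime or hash comparison.

```rust
use std::cmp::Ordering;

/// Hypothetical diff record using the Match / SrcOnly / DstOnly names from step 2.
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum Diff {
    Match(String),
    SrcOnly(String),
    DstOnly(String),
}

/// Merge-joins two listings that are assumed to be sorted by path.
pub fn diff_listings(src: &[&str], dst: &[&str]) -> Vec<Diff> {
    let (mut i, mut j) = (0, 0);
    let mut out = Vec::new();
    while i < src.len() && j < dst.len() {
        match src[i].cmp(dst[j]) {
            Ordering::Equal => {
                out.push(Diff::Match(src[i].to_string()));
                i += 1;
                j += 1;
            }
            Ordering::Less => {
                out.push(Diff::SrcOnly(src[i].to_string()));
                i += 1;
            }
            Ordering::Greater => {
                out.push(Diff::DstOnly(dst[j].to_string()));
                j += 1;
            }
        }
    }
    // Whatever remains exists on one side only.
    out.extend(src[i..].iter().map(|p| Diff::SrcOnly(p.to_string())));
    out.extend(dst[j..].iter().map(|p| Diff::DstOnly(p.to_string())));
    out
}

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum DeleteMode {
    Never,
    During,
    After,
}

#[derive(Debug, Clone, PartialEq, Eq)]
pub enum Action {
    Copy(String),
    Delete(String),
}

/// Turns diff records into an ordered action list for the sync executor.
pub fn plan(diffs: &[Diff], mode: DeleteMode, changed: impl Fn(&str) -> bool) -> Vec<Action> {
    let mut actions = Vec::new();
    let mut deferred = Vec::new();
    for d in diffs {
        match d {
            Diff::SrcOnly(p) => actions.push(Action::Copy(p.clone())),
            Diff::Match(p) if changed(p.as_str()) => actions.push(Action::Copy(p.clone())),
            Diff::Match(_) => {} // unchanged, skip
            Diff::DstOnly(p) => match mode {
                DeleteMode::Never => {}
                DeleteMode::During => actions.push(Action::Delete(p.clone())),
                DeleteMode::After => deferred.push(Action::Delete(p.clone())),
            },
        }
    }
    // --delete=after: deletions run only once every copy has been queued.
    actions.extend(deferred);
    actions
}
```

Keeping the plan as plain data also gives --dry-run a natural implementation: print the actions instead of executing them.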

Success Criteria

  • storify sync can mirror a directory tree, respecting delete modes and skipping unchanged files.
  • storify check surfaces differences with clear exit codes.
  • sync + check are documented in README/CLI help.
  • Behaviour tests prove that new commands work end-to-end for local + remote backends.
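
One possible shape for the reporter markers and the "clear exit codes" criterion is sketched below. The marker meanings follow rclone's combined report convention (`=` identical, `+` only in source, `-` only in destination, `*` differing); this issue does not pin them down, so treat that mapping as an assumption. The exit-code rule (0 when everything matches, 1 otherwise) and the names `Outcome`, `report_line`, and `exit_code` are likewise hypothetical.

```rust
/// Hypothetical per-path outcome of a check run.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Outcome {
    Identical,
    MissingOnDst,
    MissingOnSrc,
    Differ,
}

impl Outcome {
    /// Marker character; mapping assumed from rclone's combined report format.
    pub fn marker(self) -> char {
        match self {
            Outcome::Identical => '=',
            Outcome::MissingOnDst => '+',
            Outcome::MissingOnSrc => '-',
            Outcome::Differ => '*',
        }
    }
}

/// Formats one report line, e.g. for stdout or a --differ file.
pub fn report_line(outcome: Outcome, path: &str) -> String {
    format!("{} {}", outcome.marker(), path)
}

/// Assumed convention: 0 when every path is identical, 1 otherwise.
pub fn exit_code(outcomes: &[Outcome]) -> i32 {
    if outcomes.iter().all(|o| *o == Outcome::Identical) {
        0
    } else {
        1
    }
}
```

A single non-zero code is enough for a "sync → check → promote" pipeline to stop before promoting; finer-grained codes (for example, separating I/O errors from differences) would be a later refinement.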
