Releases: gekap/fast-copy

Structured JSON Logging, Permission Preservation, Security Hardening

04 Apr 14:49

What's New

New Features

  • --version / -V — Show current version
  • --update — Self-update from GitHub releases with size verification, SHA-256 audit hash, and atomic file replacement (Linux/macOS/Windows)
  • --log-file — Write a structured JSON log recording every file action (copied, linked, skipped, error) with summary stats, transfer method, link targets, and error messages
  • Permission preservation — File permissions (chmod) are now preserved on individual copy (local→local large files) and remote-to-local SFTP transfers, including zero-byte files

Security Fixes

  • Cross-run dedup path validation — mount-relative paths from the SQLite DB are now validated against path traversal (../) and verified to resolve within the mount point boundary
  • SQLite DB symlink protection — Refuses to open the dedup database if the path is a symlink (prevents write-to-arbitrary-location attacks on shared media)
  • R2R tar relay hardening — Post-relay check removes any symlinks injected by a compromised source server
  • Manifest HMAC salt — HMAC key now includes a persistent random salt (~/.fast_copy_salt), preventing key prediction from public username/hostname
  • Remote verify hash fix: verify_copy_remote now correctly re-hashes with SHA-256 before comparing to remote hashes (previously compared xxh128 vs sha256, causing false mismatches)
  • Tar stream size fix — Remote tar uploads now use actual file size at write time instead of stale scan-time size (prevents silent corruption if files change during copy)
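
The cross-run dedup path validation listed above can be approximated as follows. This is a minimal sketch, not the code shipped in fast_copy.py; the function name `resolve_within_mount` and its parameters are hypothetical.

```python
import os


def resolve_within_mount(mount_point: str, rel_path: str) -> str:
    """Return the resolved path for rel_path, or raise ValueError if it
    could escape mount_point. (Hypothetical helper, for illustration only.)"""
    if os.path.isabs(rel_path):
        raise ValueError(f"absolute path not allowed: {rel_path!r}")

    # Reject any ".." component, whichever separator was stored in the DB.
    parts = rel_path.replace("\\", "/").split("/")
    if ".." in parts:
        raise ValueError(f"path traversal not allowed: {rel_path!r}")

    # Resolve symlinks on both sides, then confirm the candidate is still
    # inside the mount point boundary.
    root = os.path.realpath(mount_point)
    candidate = os.path.realpath(os.path.join(root, rel_path))
    if os.path.commonpath([root, candidate]) != root:
        raise ValueError(f"path resolves outside mount point: {rel_path!r}")
    return candidate
```

The realpath check is what catches a symlink inside the mount that points elsewhere; the `..` check alone would not.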

Improvements

  • SFTP prefetch cap: prefetch() is capped at 256 MB to prevent excessive memory usage on very large files
  • Partial file cleanup — Interrupted or failed copies now remove the partial destination file instead of leaving truncated files
  • Symlink scan warnings — Source scanner now warns when followed symlinks point outside the source tree
  • IPv6 SSH support — Remote paths now accept bracket notation (user@[::1]:/path)
  • Thread-safe logging — Log entry list protected by a lock for non-CPython implementations
  • Truncation warning — SSH command output warns when hitting the 100 MB output cap
  • DedupDB safe close — Database close now acquires the lock to prevent concurrent access errors
  • Progress bar stability — Minimum 10ms elapsed time before displaying speed values
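
The partial file cleanup behaviour listed above might look roughly like this. It is a sketch under assumptions: `copy_stream_with_cleanup` is a hypothetical name, and the real tool may structure its copy loop differently.

```python
import os
import shutil


def copy_stream_with_cleanup(fin, dst_path, chunk_size=1024 * 1024):
    """Copy a readable binary stream to dst_path. If the copy fails or is
    interrupted, remove the partial destination file and re-raise."""
    opened = False
    try:
        with open(dst_path, "wb") as fout:
            opened = True
            shutil.copyfileobj(fin, fout, chunk_size)
    except BaseException:
        # BaseException also covers KeyboardInterrupt (Ctrl+C).
        # Only delete the file if this call actually opened it for writing.
        if opened:
            try:
                os.remove(dst_path)
            except OSError:
                pass
        raise
```

The `with` block has already closed the destination by the time the handler runs, so the removal also works on platforms that refuse to delete open files.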

Self-Update

# Check version
fast_copy --version

# Update to latest release
fast_copy --update

Example: JSON Log Output

python fast_copy.py /data /mnt/usb/backup --log-file copy.json
{
  "timestamp": "2026-04-04T13:25:48.680170+00:00",
  "summary": {
    "source": "/data", "destination": "/mnt/usb/data",
    "mode": "local_to_local", "total_files": 3,
    "copied": 2, "linked": 1, "skipped": 0, "errors": 0,
    "total_bytes": 18, "bytes_written": 12, "dedup_saved": 6,
    "elapsed_sec": 0.03, "avg_speed_bps": 400, "hash_algo": "xxh128"
  },
  "files": [
    {"action": "copied", "path": "data.bin", "size": 6, "method": "block_stream"},
    {"action": "linked", "path": "data_copy.bin", "size": 6, "method": "hardlink", "link_target": "data.bin"}
  ]
}
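
A log in the shape shown above can be post-processed with a few lines of Python. The sketch below relies only on the `files`, `action`, and `path` keys that appear in the example; the helper names are hypothetical, and other fields (such as the one carrying error messages) are not assumed.

```python
import json
from collections import Counter


def count_actions(log):
    """Count log entries per action (copied, linked, skipped, error)."""
    return Counter(entry["action"] for entry in log.get("files", []))


def paths_with_action(log, action):
    """Return the paths of all entries recorded with the given action."""
    return [e["path"] for e in log.get("files", []) if e["action"] == action]


# Inline sample in the same shape as the example output above.
sample = json.loads("""
{"files": [
  {"action": "copied", "path": "data.bin", "size": 6},
  {"action": "linked", "path": "data_copy.bin", "size": 6}
]}
""")
print(count_actions(sample))
print(paths_with_action(sample, "linked"))
```

For a real run, load the file written by --log-file with `json.load()` in place of the inline sample.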

SHA-256 Checksums

017fea2231159558b9915e40292bb689198df818589f8713df7b695d2f017996  fast_copy-linux
29d2d466e22bfaead6a26a900628448a9dd5ab3c39d271f04b5656961b69f470  fast_copy-macos
0ad364a894fabe0fb16602dc33b9c9cfde581d8406a696733c6e6bf452aed8e7  fast_copy-windows.exe
c5ca22289e351a0be3456243dd96de37cb1a1619e7dad110659bd01a84181c61  fast_copy.py
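
A downloaded artifact can be checked against one of the lines above with the standard library. This is a minimal sketch; the function names are hypothetical and the local file path is whatever you saved the download as.

```python
import hashlib


def sha256_of_file(path, chunk_size=1024 * 1024):
    """Hash a file in chunks so large binaries are not loaded into memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_checksum_line(line, path):
    """Compare a file against a '<hex digest>  <filename>' line."""
    expected = line.split(None, 1)[0].lower()
    return sha256_of_file(path) == expected
```

For example, `matches_checksum_line(line, "fast_copy-linux")` should be true only when the local file's digest equals the published one.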

Full Changelog

See CHANGELOG.md for details on all versions.

v2.3.0 — Chunked Tar Streaming, Security Hardening, Windows Long Paths

02 Apr 21:00

What's New

Performance

  • Raw SSH tar streaming replaces SFTP for all remote transfers — 3-5x faster (4-5 MB/s on 100 Mbps LAN vs 1.2 MB/s with SFTP)
  • Chunked 100 MB tar batches with streaming extraction — files extracted as data arrives, no temp files
  • Per-byte progress for large files during tar extraction (e.g., a 4.8 GB file shows smooth progress instead of freezing)
  • Threaded stdin/stdout on SSH channels to prevent deadlocks with large file lists
  • Batched remote hashing — 5,000 files per SSH command to avoid channel timeouts on large repos (91k+ files)
  • Remote link creation via batched Python script (5,000 links/batch) instead of individual shell commands
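
The batching used for remote hashing and link creation comes down to splitting a long file list into fixed-size groups. A sketch, with the hypothetical helper name `batched` and the 5,000 default taken from the notes above:

```python
from itertools import islice


def batched(items, size=5000):
    """Yield successive lists of at most `size` items from any iterable."""
    if size < 1:
        raise ValueError("size must be at least 1")
    iterator = iter(items)
    while True:
        batch = list(islice(iterator, size))
        if not batch:
            return
        yield batch
```

Each yielded batch would then be sent in its own SSH command, so no single command runs long enough to hit a channel timeout.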

Security

  • Hardened tar extraction — explicitly blocks symlinks, hard links, device files, FIFOs, null bytes in filenames
  • 50 GB per-file size limit during tar extraction to prevent tar bomb attacks
  • SSH host key warning with SHA256 fingerprint and MITM attack guidance
  • SFTP-free manifest — read/write manifests via exec commands with SFTP fallback (works on servers with SFTP disabled, e.g., Synology NAS)
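
The hardened tar extraction rules above can be expressed as a per-member check using the standard `tarfile` module. This is an illustrative sketch, not the project's implementation; `rejection_reason` is a hypothetical name.

```python
import tarfile

MAX_FILE_SIZE = 50 * 1024 ** 3  # 50 GB per-file limit, as stated above


def rejection_reason(member):
    """Return a short reason string if the tar member must be skipped,
    or None if it is an ordinary file or directory within the size limit."""
    if "\x00" in member.name:
        return "null byte in filename"
    if member.issym() or member.islnk():
        return "symlink or hard link"
    if member.isdev():
        # isdev() is true for character devices, block devices, and FIFOs.
        return "device file or FIFO"
    if not (member.isreg() or member.isdir()):
        return "unsupported member type"
    if member.size > MAX_FILE_SIZE:
        return "exceeds per-file size limit"
    return None
```

A streaming extractor would call this for each `TarInfo` it reads and extract only members for which the result is `None`.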

Windows

  • Long path support (>260 characters) via \\?\ prefix for scan, copy, extract, verify, and hard link creation
  • Path separator fix in verification — forward/backslash mismatch no longer causes false "MISSING" reports
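
Adding the \\?\ prefix is mostly string handling. The sketch below assumes an absolute Windows path as input; the helper name `to_long_path` and the UNC branch are assumptions, since the notes only mention the prefix itself.

```python
import ntpath

LONG_PREFIX = "\\\\?\\"          # the literal text \\?\
UNC_PREFIX = "\\\\?\\UNC\\"      # the literal text \\?\UNC\


def to_long_path(abs_path):
    """Return abs_path in extended-length form so Windows APIs accept
    paths longer than 260 characters. Expects an absolute path."""
    if abs_path.startswith(LONG_PREFIX):
        return abs_path  # already prefixed
    # The prefix disables normalization by Windows, so normalize here:
    # forward slashes become backslashes and "." segments are removed.
    path = ntpath.normpath(abs_path)
    if path.startswith("\\\\"):
        # UNC share: \\server\share\... becomes \\?\UNC\server\share\...
        return UNC_PREFIX + path[2:]
    return LONG_PREFIX + path
```

Because `ntpath` is pure string manipulation, the function can be exercised on any platform.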

Reliability

  • Auth retry — prompts for password up to 3 times on failure; catches "No authentication methods available" errors
  • Graceful Ctrl+C — clean "Interrupted." message, no tracebacks
  • Remote space check walks parent directories when destination doesn't exist yet
  • Incremental check fallback — if remote check fails (e.g., SFTP disabled), reconnects and copies all files instead of crashing

CLI

  • Renamed SSH arguments for clarity: --ssh-src-port, --ssh-src-password, --ssh-dst-port, --ssh-dst-password
  • Professional README with architecture docs, diagrams, and real-world benchmarks

Benchmarks

Mode              Files    Data       Speed
Local → Local     59,925   500.7 MB   41.2 MB/s
Remote → Local    91,669   509.8 MB   619.5 KB/s
Local → Remote    91,663   509.8 MB   4.0 MB/s
Remote → Remote   3        1.7 GB     5.2 MB/s

SHA-256 Checksums

8d31d13f45c81b80ea298a19f94c9ce74299623bbc4551159724803227e3c194  fast_copy-linux
d7b96de0a940a24a32ccc2da50047b2abf7ff36b5de8a8e7d5f5830857ca9b21  fast_copy-macos
fd5788cbecde3fab361eb07b14fb6f74c78d97bb1fd5456a560e4d6f697f86df  fast_copy-windows.exe
6ab6eec4e073209ba279999857decd1b757adae19014dc7d7c85d38b89f4d515  fast_copy.py

v2.2.0 — Bug fixes and security hardening

30 Mar 20:38

What's Changed

Bug Fixes

  • Dedup cache key collision — Single-file mode used only the filename as the cache key, causing false cache hits when different files shared the same name and size. Now uses the full source path.
  • Verification "N more" count — When both missing and size-mismatched files were reported, the "... and N more" count was wrong (subtracted 10 instead of the actual number of items shown).

Security Fixes

  • Tar path traversal (Python < 3.12) — On Python versions before 3.12, tar extraction lacked the filter='data' safety net. Added explicit validation to block member names containing .. or absolute paths.
  • Dedup database permissions — The SQLite hash cache at the drive mount root was created with default permissions. Now restricted to owner-only (0600) to prevent leaking file paths and hashes on shared systems.
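
Restricting the dedup database to owner-only access can be done by creating the file with mode 0600 before SQLite touches it. A minimal sketch with a hypothetical helper name; on Windows the mode bits have little effect.

```python
import os
import sqlite3


def open_private_db(path):
    """Open (creating if needed) a SQLite database readable and writable
    by the owner only."""
    # Create the file with 0600 up front so it never exists with wider
    # default permissions, even briefly.
    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o600)
    os.close(fd)
    # Also tighten a database left behind by an older version.
    os.chmod(path, 0o600)
    return sqlite3.connect(path)
```

SQLite accepts an existing empty file as a new database, so pre-creating it this way does not interfere with `sqlite3.connect()`.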

Documentation

  • xxHash install instructions — Added platform-specific install instructions for the optional xxHash library (~10x faster hashing) on Linux (Debian, Fedora, Arch), macOS (Homebrew), and Windows.

Notes

  • Requires Python 3.8+. xxHash is optional (falls back to SHA-256).

Full Changelog: v2.1.0...v2.2.0

SHA-256 Checksums

d036878d458ca5c042836d0254461e3251dcaaec82a2f3c57e1a0050ad0da52a  fast_copy.py
28c60abb1819e3beeed34120f11aa78e496aacf4b780e91420cd6abee01dc56b  build.py

v2.1.0 — Cross-run Dedup Database + xxh128 Hashing

24 Mar 17:56

What's New

Stronger Hash Algorithm

  • xxh64 → xxh128 (128-bit): the birthday-bound collision threshold rises from roughly 2³² hashed files to roughly 2⁶⁴
  • Fallback upgraded from MD5 to SHA-256

Cross-Run Dedup Database

  • Persistent SQLite hash cache at the drive root (.fast_copy_dedup.db)
  • Shared across all destination folders on the same drive
  • Copying the same source to test8/ and then to test9/ results in zero bytes copied on the second run, with all files hard-linked
  • Reports which existing folders matched:
    Cross-run dups:  3696 files (324.9 MB) — already on drive, will link instead of copy
      → test8/: 3696 files matched
    
  • Source hash caching — subsequent runs skip re-hashing (100% cache hits)
  • --no-cache flag to disable

Faster Verification

  • Single os.walk() pass replaces thousands of individual stat() calls on USB
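
One way to read the single-pass idea: walk the destination once, build an index of what exists, and answer every "is this file there?" question from memory. The sketch below covers only the existence check; the helper names are hypothetical, and how the real tool compares sizes is not shown here.

```python
import os


def index_tree(root):
    """Return the set of file paths under root (relative, '/'-separated),
    collected in one os.walk() pass."""
    found = set()
    for dirpath, _dirnames, filenames in os.walk(root):
        rel_dir = os.path.relpath(dirpath, root)
        for name in filenames:
            rel = name if rel_dir == "." else os.path.join(rel_dir, name)
            found.add(rel.replace(os.sep, "/"))
    return found


def find_missing(expected_rel_paths, dest_root):
    """Return the expected paths that are absent from dest_root, sorted."""
    present = index_tree(dest_root)
    return sorted(p for p in expected_rel_paths if p not in present)
```

Set lookups replace one filesystem round trip per file, which matters most on slow media such as USB drives.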

NTFS Symlink Fix

  • Broken symlinks on NTFS (via Linux) now fall back to real copies automatically

Performance Example

Run   Destination   Data Written       Time
1st   test8/        302.4 MB           5.2s
2nd   test9/        0 B (all linked)   1.6s

SHA-256 Checksums

d2dad98eab06e93c09da1bcf0d9a2c70a321d8d90e9ec419e335d4388fed274b  fast_copy.py
28c60abb1819e3beeed34120f11aa78e496aacf4b780e91420cd6abee01dc56b  build.py