rgigasync is a high-performance, Rust-based command-line tool that enhances the capabilities of rsync by enabling efficient mirroring of large directory trees. It is designed to overcome the limitations of rsync when dealing with massive file sets, providing speed improvements and greater resilience in handling large-scale file synchronization tasks.
When synchronizing vast amounts of data across networks or between storage systems, rsync can struggle with performance issues, particularly in scenarios involving:
- Large Numbers of Files: rsync's memory usage grows with the number of files, which can lead to excessive memory consumption and slowdowns.
- Network Instability: If a network connection fails during a large sync operation, rsync needs to restart the entire process, leading to inefficiencies and potential data transfer interruptions.
- Complex Directory Structures: Deep and complex directory structures can cause rsync to perform suboptimally, especially when it attempts to determine the full set of changes before starting the transfer.
rgigasync addresses these challenges by breaking down the synchronization process into manageable batches, allowing for:
- Optimized Memory Usage: By processing files in smaller batches, rgigasync reduces the memory footprint, enabling rsync to handle large directories without consuming excessive system resources.
- Increased Resilience: With its built-in retry mechanism, rgigasync can automatically retry failed rsync operations, ensuring that data synchronization continues even in the face of network instability.
- Faster Synchronizations: By parallelizing and batching file transfers, rgigasync can significantly speed up the synchronization process, especially when dealing with large numbers of small files.
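To make the batching idea concrete, here is a minimal shell sketch. It illustrates the general technique only and is not rgigasync's actual implementation: the helper names `make_batches` and `sync_batches` are hypothetical, batching by file count is an assumption, and file names containing newlines are not handled. It relies on rsync's `--files-from` option to transfer one list of paths at a time.

```shell
# Hypothetical helper, not part of rgigasync: write lists of at most
# BATCH_SIZE relative paths each into OUT_DIR as batch.aa, batch.ab, ...
# Usage: make_batches SRC_DIR BATCH_SIZE OUT_DIR
make_batches() {
  _mb_src=$1
  _mb_size=$2
  _mb_out=$3
  mkdir -p "$_mb_out"
  # List files relative to the source root, then cut the list into chunks.
  ( cd "$_mb_src" && find . -type f | sort ) |
    split -l "$_mb_size" - "$_mb_out/batch."
}

# Hypothetical helper: run one rsync per batch list, stopping on the first failure.
# Usage: sync_batches SRC_DIR DEST_DIR OUT_DIR
sync_batches() {
  _sb_src=$1
  _sb_dest=$2
  _sb_out=$3
  for _sb_list in "$_sb_out"/batch.*; do
    # --files-from limits this rsync run to the paths named in the list.
    rsync -a --files-from="$_sb_list" "$_sb_src" "$_sb_dest" || return 1
  done
}
```

Because each rsync run only sees one list, its memory use is bounded by the batch size rather than by the size of the whole tree.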
- Batch Processing: Files are processed in batches to prevent rsync from using too much memory, making it suitable for syncing millions of files.
- Retry Mechanism: Automatically retries rsync operations up to 5 times in case of failure, improving reliability in unstable network environments.
- Customizable: Allows passing custom rsync options and specifying batch sizes, offering flexibility for different use cases.
- Speed Optimization: Designed to maximize the efficiency of rsync by reducing overhead and improving throughput, particularly in large-scale operations.
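The retry behavior described above can be approximated in plain shell. The sketch below is an illustration under assumptions, not rgigasync's own code: the `retry` function and the `RETRY_DELAY` variable are hypothetical names, and the fixed pause between attempts is a design choice made for this example.

```shell
# Hypothetical helper, not part of rgigasync: run a command and retry on failure.
# Usage: retry MAX_ATTEMPTS COMMAND [ARGS...]
retry() {
  _retry_max=$1
  shift
  _retry_attempt=1
  # Keep running the command until it exits successfully.
  until "$@"; do
    if [ "$_retry_attempt" -ge "$_retry_max" ]; then
      echo "giving up after $_retry_attempt attempts: $*" >&2
      return 1
    fi
    _retry_attempt=$((_retry_attempt + 1))
    # Pause between attempts; RETRY_DELAY (seconds) is an illustrative knob.
    sleep "${RETRY_DELAY:-2}"
  done
  return 0
}
```

For instance, `retry 5 rsync -av /Volumes/SrcDir/ /Users/userName/DestDir/` would run rsync at most five times and stop at the first successful run.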
- Rust: The Rust toolchain, including Cargo (Rust's package manager), for building the project.
- rsync: Installed on your system, as rgigasync leverages rsync for file synchronization.
To make rgigasync available globally in your terminal, you can copy the compiled binary to a directory that's included in your system's PATH, or you can add the binary's location to your PATH.
This option copies the rgigasync binary to /usr/local/bin, a common directory in the PATH:
- Build the Project (if you haven't already):
    cargo build --release
- Create /usr/local/bin (if it doesn't exist): On macOS, if the /usr/local/bin directory does not exist, you can create it with:
    sudo mkdir -p /usr/local/bin
- Set Permissions: Ensure the directory is writable by your user account:
    sudo chown -R $(whoami) /usr/local/bin
- Copy the Binary:
    cp ./target/release/rgigasync /usr/local/bin/
- Verify Installation: After copying, you can verify the installation by running:
    rgigasync --version
  If the installation was successful, this should display the version of the tool.
If you prefer not to use /usr/local/bin, you can create a bin directory in your home directory and add it to your PATH:
- Create a Custom Bin Directory:
    mkdir -p ~/bin
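One way to put that directory on your PATH is sketched below. The `path_prepend` helper is a hypothetical name introduced for this example, and which startup file to place it in (for example `~/.zshrc` or `~/.bashrc`) is an assumption that depends on your shell. The guard keeps the entry from being added more than once when the file is sourced repeatedly.

```shell
# Hypothetical helper: prepend a directory to PATH only if it is not already listed.
path_prepend() {
  case ":$PATH:" in
    *":$1:"*) ;;            # already present, leave PATH unchanged
    *) PATH="$1:$PATH" ;;   # otherwise put the directory first
  esac
}

path_prepend "$HOME/bin"
export PATH
```

After copying the release binary into `~/bin` and opening a new terminal, `rgigasync --version` should resolve from any directory.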
Common usage examples:
- Basic Synchronization with Verbose Output:
    rgigasync -- "-av" /Volumes/SrcDir/ /Users/userName/DestDir/
- Synchronization with Progress and Ignoring Existing Files:
    rgigasync -- "-av --ignore-existing --info=progress2" /Volumes/SrcDir/ /Users/userName/DestDir/
- Specifying a Custom Batch Size:
    rgigasync -- "-av --ignore-existing --info=progress2" /Volumes/SrcDir/ /Users/userName/DestDir/ 512
- Excluding Specific File Types:
    rgigasync -- "-av --exclude='*.tmp' --exclude='*.log'" /Volumes/SrcDir/ /Users/userName/DestDir/
- Synchronization Over SSH:
    rgigasync -- "-avz -e ssh" /Volumes/SrcDir/ user@remote-server:/home/user/DestDir/
- Deleting Files at Destination That Are Not Present at Source:
    rgigasync -- "-av --delete" /Volumes/SrcDir/ /Users/userName/DestDir/
- Limiting Bandwidth Usage:
    rgigasync -- "-av --bwlimit=10240" /Volumes/SrcDir/ /Users/userName/DestDir/
- Dry Run to Preview Changes:
    rgigasync -- "-av --dry-run" /Volumes/SrcDir/ /Users/userName/DestDir/
- Parallel Synchronization for Faster Execution:
    rgigasync --parallel -- "-av --ignore-existing --info=progress2" /Volumes/SrcDir/ /Users/userName/DestDir/ 512

    ## macOS: Use all cores
    # rgigasync --parallel -- "-av --ignore-existing --info=progress2" /Volumes/SrcDir/ /Users/userName/DestDir/ 2048

    ## Override Cores
    # RAYON_NUM_THREADS=8 rgigasync --parallel -- "-av --ignore-existing --info=progress2" /Volumes/tuf/TBD/ /Users/josh/TBD/ 2048