The rsync command is one of the most widely used tools in the Linux and Unix world for synchronizing and transferring files efficiently. Its versatility and speed have made it a go-to option for system administrators, developers, and anyone who needs to manage file backups or perform remote file transfers.
In this article, we’ll delve into what rsync is, how it works, and how to use it effectively for different file synchronization tasks.
What is rsync?
rsync is a fast and versatile command-line utility used for transferring and synchronizing files and directories between two locations. The locations can be on the same machine, across different machines on a network, or between a local machine and a remote server.
One of the key advantages of rsync over other file transfer tools is its delta-transfer algorithm: when a file already exists at the destination, rsync sends only the parts that differ (the “deltas”) rather than the entire file, saving time and bandwidth. The benefit is largest over a network; for copies between two local paths, rsync copies whole files by default.
Key Features of rsync:
- Efficient Synchronization: It only transfers the differences between the source and the destination, not the entire file.
- Local and Remote Transfers: You can use rsync to copy files locally, to or from another machine over SSH, or to or from a server running the rsync daemon.
- Compression: It supports compression during transfer, reducing the amount of data that needs to be sent.
- Preserving Attributes: It can preserve file permissions, ownership, timestamps, and symbolic links.
- Incremental Backups: It is commonly used for making backups, especially incremental ones, by copying only changed files.
- Bandwidth Limiting: You can cap the bandwidth rsync uses during transfers with the --bwlimit option.
Basic Syntax of rsync
The general syntax for using rsync is as follows:
rsync [options] source destination
Where:
- source is the file or directory you want to copy.
- destination is where you want to copy the file or directory.
- [options] are optional flags that control the behavior of rsync (e.g., preserving file attributes, enabling compression, or excluding files).
Common Flags
1. -r (recursive): Copies directories recursively, including subdirectories.
rsync -r dir1/ dir2
2. -a (archive mode): This option recurses into directories and preserves symbolic links, permissions, timestamps, owner, group, and device files. It is shorthand for -rlptgoD.
rsync -a source/ destination/
3. -v (verbose): This option provides detailed information about the files being transferred.
rsync -av source/ destination/
4. -z (compress): This option compresses the data during transfer, which can save bandwidth, especially for large files over slow connections.
rsync -az source/ destination/
5. --delete: This option deletes files at the destination that are no longer present at the source. It’s useful for mirroring directories.
rsync -av --delete source/ destination/
6. --exclude: Excludes files or directories from being copied. This can be helpful when you want to avoid syncing certain files (like temporary or log files).
rsync -av --exclude='*.log' source/ destination/
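7. --progress / -P: Displays per-file progress during the transfer. -P is shorthand for --partial --progress, so it also keeps partially transferred files, which helps when resuming an interrupted transfer.
rsync -av --progress source/ destination/
# alternatively, with partial files kept as well
rsync -avP source/ destination/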
8. -n or --dry-run: This option shows what files would be copied or deleted without actually performing the operation. It’s a safe way to preview the results.
rsync -avn source/ destination/
9. -h (human-readable): This option makes file sizes in the output more human-readable (e.g., displaying 1K, 2M, 1G instead of raw byte values). This can be useful to understand the size of transferred files.
rsync -ah /path/to/source/ /path/to/destination/
10. rsync over SSH
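To use rsync over SSH, you can use the -e option to specify the remote shell command. The -a flag is included so that the directory is copied recursively; without -r or -a, rsync skips directories:
rsync -ahvze ssh /local/dir user@remote:/remote/dir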
Practical examples
Sync Files Locally: To copy or sync files from one directory to another on the same machine, run:
rsync -av /path/to/source/ /path/to/destination/
# Note the trailing slash (/) at the end of the source directory.
# With the trailing slash, rsync copies the contents of the source directory. Without it, the source directory itself is created inside the destination directory.
# To confirm what will happen, use the dry-run flag:
rsync -avn /path/to/source/ /path/to/destination/
This will recursively copy all files and subdirectories from the source to the destination, preserving their attributes.
Synchronizing Directories Between Local and Remote Systems: To copy files from a local directory to a remote server using ssh, you would run:
# This copies the local directory itself into the remote directory (creating /path/to/remote/dir/dir)
rsync -avz /path/to/local/dir user@remote_host:/path/to/remote/dir
# This copies only the contents of the local directory
rsync -avz /path/to/local/dir/ user@remote_host:/path/to/remote/dir
# To pull/copy from the remote directory to the local directory
rsync -avz user@remote_host:/path/to/remote/dir /path/to/local/dir
Automatically delete files from local-host after successful data transfer
rsync -avhe ssh --remove-source-files /local/dir user@backup-server:/remote/dir
Update the remote pc/server only if there is a newer version of files on the local filesystem
rsync -avh --progress --update /local/dir <userName>@remote-host:/remote/dir
# The --update option skips any file whose copy in the destination is newer than the source file. Files that are newer in the source, or that do not yet exist in the destination, are transferred.
# The -h option prints sizes in human-readable units (e.g., 1K, 2M, 1G).
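Sync with particular file ownership
rsync -avhe ssh --chown=<USER>:<GROUP> /local/file user@remote-host:/remote/dir
# --chown sets the owner and group of the copied files on the destination. Changing the owner usually requires root privileges on the receiving side.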
Sync while ignoring existing files
Only files that do not yet exist on the destination will be copied over.
rsync --ignore-existing -avhe ssh /local/file user@remote-host:/remote/dir
# The --ignore-existing option tells rsync not to overwrite files that already exist in the destination directory (in this case, /remote/dir on the remote server), even if the source copy has changed. Note that -e needs its argument (ssh) right after it; otherwise rsync treats the next word, here the source path, as the remote shell.
Delete files on the remote machine that have been deleted on the local machine
rsync -avhe ssh --delete /local/dir user@remote-host:/remote/dir
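Include only files with a particular extension
# --include alone changes nothing, because everything is included by default. To copy only .png files, include directories (so rsync can descend into them), include the pattern, then exclude everything else.
rsync -avz --include='*/' --include='*.png' --exclude='*' /local/dir user@remote:/remote/dir
Exclude files with a particular extension
rsync -avz --exclude='*.jpg' /local/dir user@remote:/remote/dir
# Combined include and exclude: rules are checked in order and the first match wins; anything that matches no rule is transferred. Here .jpg files are skipped and everything else, including .png files, is copied.
rsync -avz --exclude='*.jpg' --include='*.png' /local/dir user@remote:/remote/dir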
Limit synchronization by file size
To limit synchronization to files no larger than a specific size, use the --max-size option followed by the size limit.
# Only synchronize files of at most 10 MiB (the M suffix is 1024-based):
rsync -av --max-size=10M /local/dir user@remote:/remote/dir
Get the Deleted File Back from the Remote Machine
If you deleted a file on your local machine but it still exists on the remote machine, you can restore it by pulling it back with rsync, using the remote path as the source and the local directory as the destination.
rsync -avz user@remote_host:/home/user/documents/report.txt /home/user/documents/
# For the complete directory
rsync -avz user@remote_host:/home/user/documents/ /home/user/documents/
Conclusion
The rsync command is an essential tool for anyone who needs to copy or synchronize files efficiently. With its powerful features like delta-transfer, compression, and the ability to preserve file attributes, it is well-suited for tasks ranging from basic file copying to complex backup solutions.
By mastering the rsync command and understanding its options, you can ensure that your file synchronization tasks are quick, reliable, and efficient, saving both time and resources.