Author | Nejat Hakan |
nejat.hakan@outlook.de | |
Backup Server - Restic
Introduction to Restic
Welcome to this comprehensive guide on Restic, a modern, secure, efficient, and easy-to-use backup program. In the world of self-hosting, reliable backups are not just a good idea; they are an absolute necessity. Data loss can be catastrophic, ranging from losing precious personal memories to critical operational data for your self-hosted services. Restic provides a robust solution to safeguard your digital assets.
This guide will take you from the fundamental concepts of Restic, through practical basic usage, into intermediate repository management and remote backends, and finally to advanced topics like automation, internals, and integration with other tools. Each section is designed to build upon the last, providing you with the knowledge and skills to confidently implement and manage your own Restic-based backup strategy.
What is Restic?
Restic is an open-source backup program written in Go. It's designed to be fast, efficient, and secure. It operates on the principle of client-side encryption, meaning your data is encrypted before it leaves your machine, ensuring that even if your backup storage is compromised, your data remains unreadable without the decryption key (your password).
Key Features:
- Secure: Restic uses strong, authenticated encryption (AES-256 in counter mode, Poly1305-AES for MAC) for all data and metadata. The encryption keys are derived from your repository password. This means your data is protected both in transit and at rest.
- Efficient: Restic employs content-defined chunking and deduplication. This means it breaks files into smaller pieces (chunks) and only stores unique chunks. If multiple files share the same content, or if a file is only slightly modified between backups, Restic only uploads the new or changed chunks, saving significant storage space and bandwidth.
- Verifiable: Restic allows you to verify your backups to ensure data integrity and that your backed-up data can actually be restored.
- Easy to Use: Restic has a straightforward command-line interface (CLI). While powerful, its basic operations are intuitive.
- Cross-Platform: It runs on Linux, macOS, Windows, BSD, and other operating systems.
- Multiple Backends: Restic can store backups in various locations:
- Local directories.
- SFTP servers (via SSH).
- HTTP REST servers (like Restic's own `rest-server` or rclone's `rclone serve restic`).
- Cloud storage services like Amazon S3, Backblaze B2, Wasabi, Google Cloud Storage, Azure Blob Storage, and many others (often via rclone).
- Snapshots: Restic creates point-in-time snapshots of your data. You can easily browse and restore files from any snapshot.
- Free and Open Source: Restic is licensed under the BSD 2-Clause License, giving you the freedom to use, modify, and distribute it.
Philosophy behind Restic:
The core philosophy of Restic revolves around simplicity, security, and efficiency. The developers aimed to create a backup tool that "just works" and gives users peace of mind. There's a strong emphasis on doing one thing (backing up data) and doing it well, without unnecessary complexity. The design ensures that the user is always in control of their encryption keys.
Brief Comparison with Other Backup Tools:
While tools like `tar`, `rsync`, or BorgBackup have their places, Restic offers a compelling combination of features:
- Compared to `tar` and `rsync`: Restic provides versioning (snapshots) and strong encryption natively. `rsync` is great for mirroring, but not for historical backups.
- Compared to BorgBackup: Both Restic and Borg are excellent modern backup tools with encryption and deduplication. Restic's design is often considered simpler for some remote backends (e.g., it talks to S3 natively, whereas Borg has traditionally needed SSH access to a remote host that runs `borg serve`). Restic is written in Go, which makes single-file cross-platform binaries easy to ship; Borg is Python-based. The choice often comes down to specific backend needs, performance characteristics in particular environments, or personal preference for the CLI or internal architecture.
Why choose Restic for self-hosting?
For self-hosters, Restic offers several compelling advantages:
- Client-Side Encryption: This is paramount. When self-hosting, you might use various storage solutions, some potentially less secure than enterprise-grade offerings or even public cloud storage. Restic encrypts data on your machine before it's sent anywhere. You hold the keys, ensuring privacy even if the storage backend is breached or untrusted.
- Storage Efficiency: Deduplication means you can store many snapshots over long periods without consuming excessive disk space. This is crucial when you're managing your own storage resources.
- Flexibility in Storage Backends: Whether you have a dedicated NAS, an old computer acting as an SFTP server, or you decide to use a cheap cloud storage provider for off-site backups, Restic can likely support it. This allows you to tailor your backup strategy to your budget and infrastructure.
- Open-Source Transparency: You can inspect the code, understand how it works, and be part of a community. This builds trust, which is essential for a critical tool like a backup program.
- No Server-Side Daemon Required (for many backends): For backends like SFTP or S3, Restic doesn't need special software running on the server (beyond the standard SFTP or S3 service). This simplifies setup and maintenance.
Core Concepts
Understanding these core concepts is key to using Restic effectively:
- Repository: This is the storage location where Restic keeps your backup data. It's a specially structured directory (or S3 bucket, etc.) that Restic initializes and manages. All your backed-up data, for all snapshots, from all sources (if you back up multiple directories or machines to the same repository), resides here, encrypted and deduplicated.
- Snapshots: A snapshot is an immutable, point-in-time record of the files and directories you backed up. Each time you run `restic backup`, a new snapshot is created. You can think of it as a "photo" of your data at that specific moment. You can list, browse, and restore data from any existing snapshot.
- Deduplication: This is Restic's magic for saving space.
- Content-Defined Chunking: Restic first breaks down your files into smaller pieces called "chunks." Instead of fixed-size chunks, it uses an algorithm (related to the Rabin fingerprint) to find natural boundaries within the file content. This means if you insert data in the middle of a file, only the chunks around the insertion point change, while the rest of the file's chunks remain identical.
- Hashing and Storage: Each chunk is then hashed with SHA-256, and the resulting hash serves as the chunk's fingerprint and ID. Restic stores each unique chunk (identified by its hash) only once in the repository.
- How it Works: When you back up a file, Restic chunks it. For each chunk, it checks if a chunk with the same hash already exists in the repository. If yes, it just references the existing chunk. If no, it encrypts and stores the new chunk. This applies across all files and all snapshots in the repository.
- Encryption: Restic encrypts everything it stores in the repository.
- Client-Side: Encryption happens on the machine running Restic before data is sent to the repository.
- Password-Based Key Derivation: You provide a password when initializing a repository. Restic uses this password to derive strong encryption keys. This password is the only way to access the data. If you lose this password, your backup data is irrecoverably lost.
- Authenticated Encryption: Restic uses cryptographic techniques that ensure both confidentiality (data is unreadable) and integrity/authenticity (data cannot be tampered with undetected).
- Internal Data Structures (Brief Overview):
- Blobs: The fundamental unit of data. There are two types:
- Data Blobs: These store the (encrypted) content of your file chunks.
- Tree Blobs: These store (encrypted) metadata, like directory listings, file names, permissions, and pointers (hashes) to the data blobs or other tree blobs that make up a file or directory.
- Trees: These are hierarchical structures made of tree blobs, representing the directory structure of your backup. A snapshot points to a root tree.
- Packs: For efficiency, Restic groups multiple blobs together into larger files called "pack files" within the repository's `data` subdirectory.
- Index: Restic maintains an index that maps blob hashes to the pack files where they are stored, allowing for quick lookups.
- Snapshots (files): These are small files in the repository that store metadata about a specific backup operation, including a pointer to the root tree blob for that backup, the time, host, tags, and paths backed up.
- Keys: Files in the repository storing the encrypted master keys. These are themselves encrypted using keys derived from your repository password.
- Config: A file in the repository storing its configuration, such as the repository version and chunking parameters.
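The chunk, hash, and store-once steps described under Deduplication can be sketched with standard shell tools. This is only a toy model: it cuts the file into fixed 1 KiB chunks instead of using content-defined chunking, it does not encrypt anything, and the scratch layout (`chunks/`, `store/`) is made up for the illustration rather than taken from Restic's repository format.

```shell
#!/usr/bin/env bash
# Toy model of chunk-level deduplication. NOT Restic's real algorithm:
# fixed 1 KiB chunks stand in for content-defined chunks, and nothing is encrypted.
set -euo pipefail

work=$(mktemp -d)                 # scratch directory (hypothetical layout)
mkdir "$work/chunks" "$work/store"

# Build a 4 KiB input whose 1st/3rd and 2nd/4th KiB are identical.
for byte in A B A B; do
  head -c 1024 /dev/zero | tr '\0' "$byte"
done > "$work/input.bin"

# 1. Chunk the file.
split -b 1024 "$work/input.bin" "$work/chunks/chunk_"

# 2. Hash each chunk. 3. Store it only if that hash is not in the store yet.
total=0
for chunk in "$work/chunks"/chunk_*; do
  total=$((total + 1))
  hash=$(sha256sum "$chunk" | cut -d' ' -f1)   # on macOS: shasum -a 256
  [ -e "$work/store/$hash" ] || cp "$chunk" "$work/store/$hash"
done
stored=$(find "$work/store" -type f | wc -l)

echo "chunks seen: $total, chunks stored: $((stored))"
```

Four chunks are seen but only two distinct ones are stored, which is the same effect that lets a repository hold many snapshots of mostly unchanged data cheaply.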
Workshop Introduction Preparing Your Environment and Installing Restic
This first workshop will guide you through installing Restic and setting up a very basic local environment. We'll use a simple folder on your local machine as the initial Restic repository.
Goals:
- Install Restic on your system.
- Verify the installation.
- Understand how Restic refers to repositories.
Prerequisites:
- A Linux, macOS, or Windows computer.
- Internet access to download Restic.
- Basic familiarity with the command line/terminal.
Steps:
- Create a Working Directory (Optional but Recommended): It's good practice to have a dedicated directory for your Restic experiments. This guide uses `~/restic_workshop` on Linux/macOS and `~\restic_workshop` in Windows PowerShell; create that directory and change into it. For the rest of this guide, commands will often assume you are in a suitable working directory.
- Download and Install Restic: Restic offers pre-compiled binaries, which are the easiest way to get started. Visit the Restic releases page on GitHub to find the latest version.
  - Linux:
    - Download the correct archive (e.g., `restic_<version>_linux_amd64.bz2`).
    - Extract it with `bunzip2`.
    - Make it executable with `chmod +x` and move it to a directory in your PATH, such as `/usr/local/bin`. (If you don't have `sudo` rights or prefer a user-local installation, you can create `~/bin` or `~/.local/bin`, add it to your PATH, and move `restic` there.)
  - macOS (using Homebrew is easiest): Run `brew install restic`. If not using Homebrew, download `restic_<version>_darwin_amd64.bz2` (for Intel Macs) or `restic_<version>_darwin_arm64.bz2` (for Apple Silicon Macs), extract it, and move it to `/usr/local/bin/restic` as described for Linux.
  - Windows (using Scoop or Chocolatey is easiest, or manual download):
    - Using Scoop: `scoop install restic`
    - Using Chocolatey: `choco install restic`
    - Manual:
      - Download `restic_<version>_windows_amd64.zip`.
      - Extract `restic.exe`.
      - Move `restic.exe` to a folder that is in your system's PATH (e.g., `C:\Windows\System32`, or create a dedicated folder like `C:\Program Files\Restic` and add it to your PATH environment variable).
- Verify the Installation: Open a new terminal window (to ensure PATH changes are recognized) and run `restic version`. It should print a single line with the installed Restic version, the Go version it was compiled with, and your platform. If you see this, Restic is installed correctly and accessible from your command line.
- Understanding Repository Paths: Restic needs to know where its repository is. This is specified with the `-r` flag or the `RESTIC_REPOSITORY` environment variable. For this introduction, we'll create a local directory to serve as our first repository: make a directory named `my_first_repo` inside `~/restic_workshop` (Linux/macOS) or `~\restic_workshop` (Windows PowerShell). The path to this repository would then be, for example, `~/restic_workshop/my_first_repo` (Linux/macOS) or `~\restic_workshop\my_first_repo` (Windows). Restic will populate this directory with its own structure once initialized.
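For Linux/macOS, the non-download steps of this workshop might look like the sketch below. `WORKSHOP_DIR` is an illustrative variable introduced here so the location can be overridden; its default is the `~/restic_workshop` path used throughout this guide.

```shell
#!/usr/bin/env bash
# Sketch of the workshop preparation on Linux/macOS.
set -euo pipefail

# WORKSHOP_DIR is a hypothetical helper variable; the default matches this guide.
WORKSHOP_DIR="${WORKSHOP_DIR:-$HOME/restic_workshop}"

# Working directory plus the (still empty) directory that will hold the repository.
mkdir -p "$WORKSHOP_DIR/my_first_repo"
cd "$WORKSHOP_DIR"

# Verify that the restic binary is reachable through PATH.
if command -v restic >/dev/null 2>&1; then
  restic version
else
  echo "restic not found in PATH; revisit the installation step" >&2
fi

echo "repository path: $WORKSHOP_DIR/my_first_repo"
```

Using `mkdir -p` keeps the script safe to re-run if the directories already exist.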
This workshop has prepared your system by installing Restic. You are now ready to initialize your first repository and start backing up data, which we will cover in the next section.
1. Getting Started with Restic
With Restic installed, you're ready to dive into its core functionalities: creating a secure place for your backups (a repository), making your first backup, and learning how to see what you've backed up. This section will cover these fundamental steps, focusing on using a local directory as your repository for simplicity.
Installation
This sub-section details the installation process across common operating systems. If you've already completed the "Workshop Introduction" and successfully installed Restic, you can skim or skip this part. However, it provides more detailed explanations and alternatives.
Restic is distributed as a single executable file, making installation straightforward.
- Linux:
  - Using Package Managers (Recommended for ease of updates): Many distributions include Restic in their official repositories.
    - Debian/Ubuntu: `sudo apt install restic`
    - Fedora: `sudo dnf install restic`
    - Arch Linux: `sudo pacman -S restic`
    - Note: Package manager versions might sometimes lag behind the latest Restic release. If you need the absolute latest version, the binary download is preferable.
  - Binary Download (for latest version or if not in package manager):
    - Go to the Restic GitHub releases page.
    - Download the appropriate archive for your system architecture (usually `amd64` for 64-bit Intel/AMD CPUs, or `arm64` for 64-bit ARM CPUs). For example, `restic_<version>_linux_amd64.bz2`.
    - Open a terminal and navigate to your downloads directory.
    - Extract the archive. If it's a `.bz2` file, decompress it with `bunzip2`. This will leave you with an executable file, e.g., `restic_<version>_linux_amd64`.
    - Make the binary executable with `chmod +x`.
    - Move the binary to a directory in your system's PATH, renaming it to `restic`. A common choice is `/usr/local/bin`, which requires `sudo`. If you don't have `sudo` access or prefer a user-local installation, you can create a directory like `~/.local/bin` (if it doesn't exist), add it to your PATH environment variable (by editing `~/.bashrc`, `~/.zshrc`, or `~/.profile`), and then move the binary there.
- macOS:
  - Using Homebrew (Recommended): Homebrew is a popular package manager for macOS. Install Restic with `brew install restic`.
  - Binary Download:
    - Go to the Restic GitHub releases page.
    - Download the macOS archive, typically `restic_<version>_darwin_amd64.bz2` (for Intel Macs) or `restic_<version>_darwin_arm64.bz2` (for Apple Silicon Macs).
    - Open Terminal and navigate to your downloads directory.
    - Extract the archive with `bunzip2`.
    - Make the binary executable with `chmod +x`.
    - Move it to a directory in your PATH, typically `/usr/local/bin` (Homebrew on Intel Macs uses this path too). macOS might show a security warning when you first try to run a downloaded binary. You may need to go to "System Settings" > "Privacy & Security", scroll down, and click "Allow Anyway" for Restic. Alternatively, running `xattr -d com.apple.quarantine /usr/local/bin/restic` might be necessary.
- Windows:
  - Using Package Managers (Scoop or Chocolatey recommended):
    - Scoop: `scoop install restic`
    - Chocolatey (run PowerShell as Administrator): `choco install restic`
  - Binary Download:
    - Go to the Restic GitHub releases page.
    - Download the Windows archive, usually `restic_<version>_windows_amd64.zip`.
    - Extract the `restic.exe` file from the ZIP archive.
    - Move `restic.exe` to a directory that is included in your system's PATH environment variable.
      - You can create a dedicated folder, e.g., `C:\Program Files\Restic`, and move `restic.exe` there.
      - Then, add this folder to your PATH:
        - Search for "environment variables" in the Start Menu and select "Edit the system environment variables".
        - In the System Properties window, click the "Environment Variables..." button.
        - Under "System variables" (or "User variables" if you prefer), find the variable named `Path` and select it.
        - Click "Edit...".
        - Click "New" and add the path to your Restic folder (e.g., `C:\Program Files\Restic`).
        - Click "OK" on all open dialogs.
      - You'll need to open a new Command Prompt or PowerShell window for the PATH change to take effect.
- Verifying the Installation: After installation, open a new terminal or command prompt and run `restic version`. This command should output the installed Restic version, confirming that the system can find and execute the `restic` binary. If you get a "command not found" error, double-check that the directory containing the Restic executable is correctly added to your system's PATH and that you've opened a new terminal session.
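The Linux/macOS binary-download steps (extract, make executable, move into PATH) can be wrapped in a small helper. This is a sketch under assumptions: `install_restic_binary` is a hypothetical name, and the archive is expected to have been downloaded already from the releases page.

```shell
#!/usr/bin/env bash
# Sketch: install a downloaded restic .bz2 archive into a directory of your choice.
set -euo pipefail

# install_restic_binary ARCHIVE DEST_DIR   (hypothetical helper, not part of Restic)
# ARCHIVE is a downloaded restic_<version>_<os>_<arch>.bz2 file.
install_restic_binary() {
  local archive=$1 dest_dir=$2
  local binary=${archive%.bz2}      # bunzip2 writes the output without the .bz2 suffix
  bunzip2 --keep "$archive"         # keep the archive in case a retry is needed
  chmod +x "$binary"
  mkdir -p "$dest_dir"
  mv "$binary" "$dest_dir/restic"   # the installed file is simply named "restic"
}
```

A user-local installation would then be `install_restic_binary ~/Downloads/restic_<version>_linux_amd64.bz2 ~/.local/bin`, followed by adding `~/.local/bin` to PATH if it is not there yet. Installing into `/usr/local/bin` needs the same steps run with `sudo`.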
Initializing Your First Repository
A Restic repository is the storage location where your encrypted and deduplicated backup data will live. Before you can back up any data, you must initialize a repository. For this initial setup, we will use a local directory.
- Choosing a Repository Location: For now, let's create a directory specifically for our Restic repository. If you followed the "Workshop Introduction", you already created `my_first_repo`; the examples from here on use a separate directory named `my_local_repo`. Create it inside `~/restic_workshop` (Linux/macOS) or `~\restic_workshop` (Windows PowerShell). The path to this repository will be `~/restic_workshop/my_local_repo` (or its Windows equivalent).
- The `restic init` command: This command prepares a new storage location to be used as a Restic repository. The syntax is `restic -r /path/to/repo init`. The `-r` flag (or `--repo`) specifies the repository location. Let's initialize our repository:
  - Linux/macOS: `restic -r ~/restic_workshop/my_local_repo init`
  - Windows (PowerShell): `restic -r ~\restic_workshop\my_local_repo init`
- Understanding the Repository Password: Upon running `init`, Restic will prompt you to enter a password for the new repository and then to confirm it. This password is CRITICAL.
  - It encrypts the master keys that protect your data.
  - There is NO WAY to recover data from a Restic repository if you lose this password. Restic developers cannot help you.
  - Choose a strong, unique password and store it securely (e.g., in a reputable password manager).
  - You will need this password for every interaction with this repository (backing up, restoring, listing snapshots, maintenance, etc.), unless you use a password file or environment variable (covered later).

  After you successfully enter and confirm the password, Restic prints a confirmation that includes the ID and location of the new repository, along with a reminder that losing the password makes the data unrecoverable.
- Repository Structure (Brief Overview): If you now look inside the `~/restic_workshop/my_local_repo` directory, you'll see a structure created by Restic:

      my_local_repo/
      ├── config
      ├── data/
      ├── index/
      ├── keys/
      ├── locks/
      └── snapshots/

  - `config`: Contains repository configuration (e.g., version, chunker parameters).
  - `data/`: Stores pack files, which contain the actual (encrypted) data blobs. This is where the bulk of your backup data will reside.
  - `index/`: Contains index files that map blob IDs to pack files for quick lookups.
  - `keys/`: Stores the encrypted master encryption keys. These are unlocked by your repository password.
  - `locks/`: Used by Restic to manage concurrent access and prevent repository corruption.
  - `snapshots/`: Stores individual snapshot files (metadata about each backup).
You generally don't need to interact with these files and directories directly. Restic manages them for you. Never manually delete or modify files within a Restic repository unless you know exactly what you are doing, as this can lead to data loss.
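To see initialization without touching the workshop repository, the sketch below creates a throwaway repository in a temporary directory. It relies on the `RESTIC_REPOSITORY` and `RESTIC_PASSWORD` environment variables instead of `-r` and the interactive prompt; the password is a placeholder chosen for this illustration and should never be reused.

```shell
#!/usr/bin/env bash
# Sketch: initialize a disposable repository and look at its top-level layout.
set -euo pipefail

demo=$(mktemp -d)                              # throwaway location, not the workshop repo
export RESTIC_REPOSITORY="$demo/repo"
export RESTIC_PASSWORD="demo-only-password"    # placeholder; never hard-code real passwords

if command -v restic >/dev/null 2>&1; then
  restic init
  # Top-level entries of a local repository, as described above:
  # config, data, index, keys, locks, snapshots
  ls -1 "$RESTIC_REPOSITORY"
else
  echo "restic not found in PATH; nothing was initialized" >&2
fi
```

Because the repository lives under a `mktemp` directory, deleting `$demo` afterwards removes every trace of the experiment.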
Making Your First Backup
Once the repository is initialized, you can start backing up your data.
- The `restic backup` command: The basic syntax is `restic -r /path/to/repo backup /path/to/your/data [another/path ...]`. You will be prompted for your repository password.
- Selecting Files and Directories to Back Up: Let's create some sample data to back up.
  - On Linux/macOS (inside `~/restic_workshop`):

        mkdir ~/restic_workshop/data_to_backup
        echo "This is file1.txt in the root of our backup." > ~/restic_workshop/data_to_backup/file1.txt
        mkdir ~/restic_workshop/data_to_backup/project_alpha
        echo "Alpha project details." > ~/restic_workshop/data_to_backup/project_alpha/readme.md
        echo "Some important notes for Alpha." > ~/restic_workshop/data_to_backup/project_alpha/notes.txt

  - On Windows (PowerShell, inside `~\restic_workshop`):

        mkdir ~\restic_workshop\data_to_backup
        Set-Content -Path ~\restic_workshop\data_to_backup\file1.txt -Value "This is file1.txt in the root of our backup."
        mkdir ~\restic_workshop\data_to_backup\project_alpha
        Set-Content -Path ~\restic_workshop\data_to_backup\project_alpha\readme.md -Value "Alpha project details."
        Set-Content -Path ~\restic_workshop\data_to_backup\project_alpha\notes.txt -Value "Some important notes for Alpha."

  Now, let's back up the `data_to_backup` directory:
  - Linux/macOS: `restic -r ~/restic_workshop/my_local_repo backup ~/restic_workshop/data_to_backup`
  - Windows (PowerShell): `restic -r ~\restic_workshop\my_local_repo backup ~\restic_workshop\data_to_backup`

  You will be prompted for the repository password you set earlier.
- Understanding Tags for Organizing Backups: You can add tags to your snapshots to help organize and identify them later. This is useful if you back up different types of data or from different sources to the same repository. Use the `--tag` option (can be specified multiple times). Example:

      # Linux/macOS
      restic -r ~/restic_workshop/my_local_repo backup --tag project --tag important ~/restic_workshop/data_to_backup/project_alpha

  This would create a new snapshot specifically for `project_alpha` and tag it accordingly.
- Observing the Backup Process (Output, Progress): When Restic runs a backup, it provides output:

      enter password for repository:
      repository <some_id> opened successfully, password is correct
      found 1 previous snapshots
      scan [/home/user/restic_workshop/data_to_backup]
      scanned 3 files, 63B in 0:00
      [0:00] 100.00%  3 / 3 files  63 B / 63 B  0s done
      duration: 0:00, 0.02MiB/s
      snapshot <snapshot_id_1> saved

  If this is not the first backup, it will say something like:

      Files:           3 new,     0 changed,     0 unmodified
      Dirs:            1 new,     0 changed,     0 unmodified
      Added to repository: 1.148 KiB (1.118 KiB stored)
      processed 3 files, 63 B in 0:00
      snapshot <snapshot_id_2> saved

  Key information in the output:
  - `scan [...]`: Restic scans the specified paths.
  - `Files: X new, Y changed, Z unmodified`: Shows how many files were new, modified since the last backup (of these paths), or unchanged. This highlights Restic's deduplication; unchanged files aren't re-processed or re-stored.
  - `Added to repository: X size (Y size stored)`: Shows how much new data was added after deduplication (X) and how much space it occupies once stored (Y). Repositories in format version 2 (available since Restic 0.14) also compress data, so Y can be noticeably smaller than X.
  - `snapshot <snapshot_id> saved`: Indicates success and gives you the ID of the newly created snapshot.
Listing and Inspecting Snapshots
After making backups, you'll want to see what snapshots are stored in your repository.
- The `restic snapshots` command: This command lists all snapshots in the repository.

      # Linux/macOS
      restic -r ~/restic_workshop/my_local_repo snapshots
      # Windows (PowerShell)
      restic -r ~\restic_workshop\my_local_repo snapshots

  You'll be prompted for the repository password. The output will look something like this:

      enter password for repository:
      repository <some_id> opened successfully, password is correct
      ID        Time                 Host       Tags       Paths
      ----------------------------------------------------------------------------------------------------
      a1b2c3d4  2023-10-27 10:00:00  my-laptop             /home/user/restic_workshop/data_to_backup
      e5f6g7h8  2023-10-27 10:05:00  my-laptop  project    /home/user/restic_workshop/data_to_backup/project_alpha
                                                important
      ----------------------------------------------------------------------------------------------------
      2 snapshots

- Interpreting Snapshot Information:
  - `ID`: A unique short identifier for the snapshot (e.g., `a1b2c3d4`). You use this ID to refer to the snapshot for restoring, browsing, or deleting.
  - `Time`: The date and time the backup operation started.
  - `Host`: The hostname of the machine where the backup was created. This is useful if you back up multiple machines to the same repository.
  - `Tags`: Any tags you applied to the snapshot during backup.
  - `Paths`: The original path(s) on the source machine that were included in this backup.
- The `restic ls <snapshot_id>` command: To see the file and directory structure within a specific snapshot, use the `ls` command with the snapshot ID. You can use the short ID from the `snapshots` command.

      # Linux/macOS - replace a1b2c3d4 with an actual ID from your output
      restic -r ~/restic_workshop/my_local_repo ls a1b2c3d4

  Output:

      enter password for repository:
      repository <some_id> opened successfully, password is correct
      snapshot a1b2c3d4 of [/home/user/restic_workshop/data_to_backup] filtered by []
      /file1.txt
      /project_alpha
      /project_alpha/notes.txt
      /project_alpha/readme.md

  You can also list the contents of a subdirectory within the snapshot by passing its path after the snapshot ID (here, `/project_alpha`). Output:

      enter password for repository:
      repository <some_id> opened successfully, password is correct
      snapshot a1b2c3d4 of [/home/user/restic_workshop/data_to_backup] filtered by []
      /project_alpha/notes.txt
      /project_alpha/readme.md

  The `ls` command is very useful for verifying that the files you intended to back up are indeed present in the snapshot.
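The backup, tagging, and listing commands from this section can be strung together as follows. This is a sketch that runs against a disposable repository in a temporary directory rather than `my_local_repo`; the password is a placeholder, and the sample files are shortened versions of the ones created earlier.

```shell
#!/usr/bin/env bash
# Sketch: two backups (one tagged), then list and inspect the snapshots.
set -euo pipefail

demo=$(mktemp -d)                              # disposable sandbox
export RESTIC_REPOSITORY="$demo/repo"
export RESTIC_PASSWORD="demo-only-password"    # placeholder; never hard-code real passwords

mkdir -p "$demo/data_to_backup/project_alpha"
echo "This is file1.txt in the root of our backup." > "$demo/data_to_backup/file1.txt"
echo "Alpha project details." > "$demo/data_to_backup/project_alpha/readme.md"

if command -v restic >/dev/null 2>&1; then
  restic init
  restic backup "$demo/data_to_backup"
  restic backup --tag project --tag important "$demo/data_to_backup/project_alpha"

  restic snapshots                  # lists both snapshots
  restic snapshots --tag project    # lists only the tagged snapshot
  restic ls latest                  # files in the most recent snapshot
else
  echo "restic not found in PATH; skipping the backup commands" >&2
fi
```

Filtering with `--tag` becomes more useful as a repository accumulates snapshots from several paths or machines.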
Workshop Your First Backup and Restore
This workshop will put the theory into practice. You'll create sample data, back it up, simulate data loss, and then restore your data.
Goals:
- Initialize a Restic repository.
- Create sample data.
- Back up the sample data to the repository.
- List snapshots to verify the backup.
- Simulate data loss by deleting the original data.
- Restore the data from the Restic snapshot.
- Verify the integrity of the restored data.
Prerequisites:
- Restic installed.
- A terminal or command prompt.
Steps:
- Navigate to Your Workshop Directory: Use `cd` to change into `~/restic_workshop` (Linux/macOS) or `~\restic_workshop` (Windows PowerShell).
- Define Repository Path (for convenience in this workshop): To avoid typing the full repository path repeatedly, you can set an environment variable for your current terminal session.
  - Linux/macOS:

        # If my_local_repo doesn't exist, create it: mkdir my_local_repo
        export RESTIC_REPOSITORY=~/restic_workshop/my_local_repo
        # For this workshop, also set a password file to avoid typing the password repeatedly
        # THIS IS FOR DEMONSTRATION. SECURE YOUR PASSWORD FILE APPROPRIATELY IN REAL SCENARIOS.
        echo "your_super_secret_password" > ~/restic_workshop/mypass.txt
        chmod 600 ~/restic_workshop/mypass.txt # Restrict permissions
        export RESTIC_PASSWORD_FILE=~/restic_workshop/mypass.txt

  - Windows (PowerShell):

        # If my_local_repo doesn't exist, create it: mkdir my_local_repo
        $env:RESTIC_REPOSITORY = "$env:USERPROFILE\restic_workshop\my_local_repo"
        # For this workshop, also set a password file
        # THIS IS FOR DEMONSTRATION. SECURE YOUR PASSWORD FILE APPROPRIATELY IN REAL SCENARIOS.
        Set-Content -Path "$env:USERPROFILE\restic_workshop\mypass.txt" -Value "your_super_secret_password"
        $env:RESTIC_PASSWORD_FILE = "$env:USERPROFILE\restic_workshop\mypass.txt"

  Important: Replace `"your_super_secret_password"` with the actual password you want to use. If you use a password file, ensure it's protected and not committed to version control if your workshop directory is a git repo. For production, consider more secure ways to handle passwords for automation (like `systemd` credentials or other secrets management tools).
- Initialize the Restic Repository (if not already done): Run `restic init`. If `my_local_repo` is empty or doesn't exist, `restic init` will create the repository there. If it was already initialized, the command stops with a message such as `repository master key and config already initialized`; that is harmless as long as `mypass.txt` holds the password of that existing repository. If you put a new password in `mypass.txt` for an existing repo, the commands that follow will fail. Ensure consistency or re-initialize in a fresh `my_local_repo` directory. For this workshop, if `my_local_repo` exists from previous steps, you can remove it and re-create it to start fresh:
  - Linux/macOS: `rm -rf my_local_repo; mkdir my_local_repo`
  - Windows: `Remove-Item -Recurse -Force my_local_repo; mkdir my_local_repo`

  Then run `restic init`.
- Create Sample Data: We'll create a directory named `source_data` with a few files and subdirectories.
  - Linux/macOS:

        mkdir source_data
        echo "Hello Restic World!" > source_data/greeting.txt
        echo "This is an important document." > source_data/important_doc.md
        mkdir source_data/my_photos
        echo "image_data_1" > source_data/my_photos/photo1.jpg
        echo "image_data_2" > source_data/my_photos/photo2.png
        tree source_data # (if you have 'tree' installed, otherwise 'ls -R source_data')

  - Windows (PowerShell):

        mkdir source_data
        Set-Content -Path source_data\greeting.txt -Value "Hello Restic World!"
        Set-Content -Path source_data\important_doc.md -Value "This is an important document."
        mkdir source_data\my_photos
        Set-Content -Path source_data\my_photos\photo1.jpg -Value "image_data_1"
        Set-Content -Path source_data\my_photos\photo2.png -Value "image_data_2"
        Get-ChildItem -Recurse source_data
- Back Up the Sample Data: Now, back up the `source_data` directory. We'll add a tag `workshop1`: run `restic backup ./source_data --tag workshop1`. You should see output indicating the files scanned and the snapshot ID created. Because `RESTIC_REPOSITORY` and `RESTIC_PASSWORD_FILE` are set, you shouldn't be prompted for them.
- List Snapshots: Verify that the backup was created by running `restic snapshots`. You should see your snapshot listed, with the tag `workshop1` and the path of your `source_data` directory. Note its short ID.
- Simulate Data Loss: Let's pretend a disaster happened and your original `source_data` directory is gone. Be careful with `rm -rf` or `Remove-Item -Recurse -Force`! Make sure you are in the correct directory (`~/restic_workshop` or `~\restic_workshop`) and are deleting `source_data`.
  - Linux/macOS: `rm -rf ./source_data`
  - Windows (PowerShell): `Remove-Item -Recurse -Force .\source_data`

  Your original data is now "lost".
- Restore the Data: Now, restore the data from the Restic snapshot. You can use the snapshot ID you noted earlier, or use the special snapshot ID `latest`, which refers to the most recent snapshot in the repository (here, the one that backed up `./source_data`). We'll restore it to a new directory called `restored_data` to avoid any ambiguity: run `restic restore latest --target ./restored_data`. Restic will output the progress of the restore.
- Verify Restored Data: Check if the `restored_data` directory contains your original files and structure.
  - Linux/macOS:

        ls -R ./restored_data
        # For a more thorough check, compare file contents (only if you made a copy before deleting):
        # diff -r ./restored_data/source_data ./source_data_original_reference
        # Or simply check contents:
        cat ./restored_data/source_data/greeting.txt
        cat ./restored_data/source_data/my_photos/photo1.jpg

  - Windows (PowerShell):

        Get-ChildItem -Recurse ./restored_data
        # Or simply check contents:
        Get-Content ./restored_data/source_data/greeting.txt
        Get-Content ./restored_data/source_data/my_photos/photo1.jpg

  You should see that `restored_data` contains a subdirectory named `source_data` (because you backed up the `source_data` directory itself), and within that, all your original files and subdirectories. For example, you'll have `restored_data/source_data/greeting.txt`.

  If you wanted to restore directly into the original location (e.g., if `source_data` was empty or missing), you would point `--target` at the directory that should contain `source_data` (here, the workshop directory itself), because the snapshot stores the files under `source_data/...`. Restoring with `--target` to a separate directory first is often safer.
Congratulations! You've successfully initialized a Restic repository, backed up data, simulated data loss, and restored your data. This covers the fundamental lifecycle of a backup. In the next section, we'll explore restore operations in more detail.
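For reference, the whole workshop cycle on Linux/macOS can be condensed into one script. This is a sketch, not a production backup job: it works in a temporary directory instead of `~/restic_workshop`, uses a placeholder password, creates only two of the sample files, and keeps a `reference_copy` directory (a name introduced here) purely so the restore can be compared against the original.

```shell
#!/usr/bin/env bash
# Sketch of the workshop cycle: init, backup, simulated loss, restore, verify.
set -euo pipefail

demo=$(mktemp -d)                 # disposable stand-in for ~/restic_workshop
cd "$demo"

export RESTIC_REPOSITORY="$demo/my_local_repo"
export RESTIC_PASSWORD_FILE="$demo/mypass.txt"
# Create the password file with owner-only permissions (placeholder password).
( umask 077; echo "demo-only-password" > "$RESTIC_PASSWORD_FILE" )

# Sample data, plus a reference copy kept only to compare against the restore.
mkdir -p source_data/my_photos
echo "Hello Restic World!" > source_data/greeting.txt
echo "image_data_1" > source_data/my_photos/photo1.jpg
cp -R source_data reference_copy

if command -v restic >/dev/null 2>&1; then
  restic init
  restic backup ./source_data --tag workshop1
  rm -rf ./source_data                              # simulated data loss
  restic restore latest --target ./restored_data
  # As described above, the data is expected under restored_data/source_data.
  if diff -r reference_copy restored_data/source_data; then
    echo "restore matches the original data"
  fi
else
  echo "restic not found in PATH; skipping backup and restore" >&2
fi
```

Writing the password file inside a subshell with `umask 077` avoids the short window in which a file created first and restricted afterwards with `chmod 600` is readable by others.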
2. Basic Restore Operations
Having successfully created backups, the next crucial step is understanding how to retrieve your data. Restic offers flexible ways to restore files and directories, not just entire backups. You can also mount your repository as a filesystem for easy browsing and retrieval of individual files.
Restoring Files and Directories
The primary command for restoring data is `restic restore`. We used it in the previous workshop to restore an entire snapshot. Let's delve deeper into its capabilities.
- The `restic restore <snapshot_id> --target <path>` command: As a reminder, this is the fundamental restore command.
  - `<snapshot_id>`: Can be the full ID, a short unique prefix of the ID, or special identifiers like `latest` (the most recent snapshot overall).
  - `--target <path>`: Specifies the directory where Restic will place the restored files and directories. Restic will re-create the directory structure from the snapshot within this target path.
    - For example, if snapshot `abc123de` backed up `/home/user/documents`, and you run `restic restore abc123de --target /tmp/restore_here`, the files will be restored to `/tmp/restore_here/home/user/documents/`. This full path restoration is the default.
    - If you only want the contents of `/home/user/documents` to appear directly in `/tmp/restore_here`, you need to use `--include` and adjust the restore path, or restore and then move files. More on `--include` below.
-
Restoring to a Different Location: This is the standard behavior when using --target. It's highly recommended to restore to a temporary or different location first, especially if the original location still exists, to avoid accidentally overwriting more recent files. You can then inspect the restored data and manually copy over what you need.
-
Restoring Specific Files/Folders from a Snapshot using --include and --exclude: Often, you don't need to restore an entire snapshot; you might just need a single file or directory. Restic allows this using the --include and --exclude flags. These flags filter the files from the snapshot that will be restored.
-
--include <pattern>: Only restore files and directories matching the pattern. You can use this multiple times. Patterns are matched against the full path within the snapshot.
-
Example: To restore only the project_alpha directory from our previous data_to_backup snapshot (let's assume its ID is a1b2c3d4 and it backed up ./source_data, which contained file1.txt and project_alpha/):
    # Assuming RESTIC_REPOSITORY and RESTIC_PASSWORD_FILE are set
    # and snapshot a1b2c3d4 contains /source_data/project_alpha/.
    # We want to restore project_alpha into ./tmp_restore_specific
    mkdir ./tmp_restore_specific
    restic restore a1b2c3d4 --target ./tmp_restore_specific --include "/source_data/project_alpha"
This would create ./tmp_restore_specific/source_data/project_alpha/.... If you want project_alpha directly in tmp_restore_specific without the leading /source_data path from the snapshot, you'd typically restore as above and then move tmp_restore_specific/source_data/project_alpha to tmp_restore_specific/project_alpha. Alternatively, the restic dump command (more advanced, less common for simple restores) writes a single file to stdout, or a directory as an archive, without recreating the leading directories from the snapshot. For most cases, restoring with --include and then moving is sufficient.
-
Example: To restore a single file, notes.txt, from within project_alpha:
    restic restore a1b2c3d4 --target ./tmp_restore_specific --include "/source_data/project_alpha/notes.txt"
This would create ./tmp_restore_specific/source_data/project_alpha/notes.txt.
-
-
--exclude <pattern>: Exclude files and directories matching the pattern from the restore.
- Example: Restore everything from source_data except the my_photos subdirectory by adding --exclude "/source_data/my_photos" to the restore command.
- Path Specificity: The paths used in --include and --exclude for restore refer to the paths as stored in the snapshot. Use restic ls <snapshot_id> to confirm the exact paths.
- Using latest with path context: If you backed up /foo/bar and /foo/baz in separate restic backup commands, they create distinct snapshots. latest on its own selects the most recent snapshot overall, so restic restore latest --target /tmp/restore_output --include /foo/bar restores nothing if that snapshot does not contain /foo/bar. To pick the latest snapshot that was created for /foo/bar, filter with --path (and, if needed, --host or --tag). The latest:/foo/bar form (Restic 0.16 and later) is different: it selects the subfolder /foo/bar inside the latest snapshot and restores its contents directly into the target.
    # Restore project_alpha from the most recent snapshot overall:
    restic restore latest --target ./target_dir --include "/source_data/project_alpha"
    # Restore from the latest snapshot whose backup path was source_data.
    # --path must match the path recorded in the snapshot
    # (see the Paths column of 'restic snapshots'):
    restic restore latest --path /path/to/original/source_data --target ./target_dir --include "/source_data/project_alpha"
  It's often simpler to identify the specific snapshot ID using restic snapshots with appropriate filters (--host, --tag, --path) and then use that ID for the restore.
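The --exclude case described above can be sketched as a small shell helper. This is a minimal sketch, not an official Restic recipe: the function name restore_except is hypothetical, and it assumes RESTIC_REPOSITORY and RESTIC_PASSWORD_FILE are already set and that the snapshot stores its files under /source_data.

```shell
# Hypothetical helper: restore a snapshot into a separate directory while
# leaving out one path. Assumes RESTIC_REPOSITORY and RESTIC_PASSWORD_FILE
# are set in the environment.
restore_except() {
    snapshot="$1"   # snapshot ID or "latest"
    target="$2"     # directory to restore into (created if missing)
    skip="$3"       # path inside the snapshot to leave out

    mkdir -p "$target" || return 1
    # List the stored paths first so the pattern can be checked by eye.
    restic ls "$snapshot" || return 1
    restic restore "$snapshot" --target "$target" --exclude "$skip"
}
```

For the workshop data, a call could look like restore_except latest ./tmp_restore_no_photos /source_data/my_photos. Listing the snapshot first is a deliberate choice: a pattern that does not match the stored paths silently excludes nothing.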
-
Mounting Repositories (Read-Only Access)
Restic provides a fantastic feature to mount your repository (or specific snapshots) as a read-only filesystem. This allows you to browse your backups using your system's file explorer or command-line tools as if they were regular directories.
-
The restic mount <mount_point> command:
- <mount_point>: An empty directory on your system where the Restic filesystem will be mounted.
- Example: mkdir ~/restic_mount, then restic mount ~/restic_mount. The command keeps running in the foreground while the repository is mounted.
- Once mounted, ~/restic_mount will contain several directories:
  - hosts/: Snapshots organized by hostname.
  - ids/: Snapshots accessible directly by their snapshot ID.
  - snapshots/: Snapshots listed by their creation date and time (e.g., 2023-10-27T10:00:00Z), plus a latest link to the most recent one. This is often the most convenient way to browse.
  - tags/: Snapshots organized by tags.
- You can navigate these directories, view file contents, and copy files out. The filesystem is read-only, so you cannot accidentally modify your backups.
- To unmount, press Ctrl+C in the terminal where restic mount is running, or use the standard unmount command for your OS (e.g., umount ~/restic_mount on Linux, or diskutil unmount ~/restic_mount on macOS). If the mount gets stuck on Linux, fusermount -u ~/restic_mount might be needed.
-
Prerequisites (FUSE): This feature relies on FUSE (Filesystem in Userspace).
- Linux: You need to install FUSE. Depending on the distribution, the package is called fuse, fuse3, or fuse-utils. On some older distributions you also need to add your user to the fuse group and log out/in: sudo usermod -aG fuse $USER.
- macOS: You need to install "macFUSE" (formerly osxfuse). You can download it from its official site or install via Homebrew: brew install --cask macfuse. It usually requires approving a system extension and a reboot.
- Windows: At the time of writing, the official Restic builds provide restic mount only on Linux, macOS, and FreeBSD; the command is not available on Windows, even with WinFsp (Windows File System Proxy) installed. On Windows, use restic restore with --include, or restic dump, to retrieve individual files.
-
Security Implications and Use Cases:
- Read-Only: The mount is inherently read-only, which is good for safety.
- Convenience: Excellent for quickly grabbing a few files, verifying backup contents visually, or comparing versions of a file across different snapshots.
- Performance: Accessing files through the FUSE mount can be slower than a direct restic restore, as data needs to be fetched, decrypted, and assembled on the fly. It's not ideal for restoring very large amounts of data or for performance-critical access.
- Resource Usage: The restic mount process will consume resources while active. Remember to unmount when done.
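As a minimal sketch of the mount workflow on Linux or macOS, the helper below creates the mount point and starts restic mount in the foreground. The function name mount_repo and the default path ~/restic_mount are illustrative assumptions; RESTIC_REPOSITORY and RESTIC_PASSWORD_FILE are assumed to be set, and FUSE must be installed.

```shell
# Hypothetical helper: create the mount point if needed and mount the
# repository read-only. restic mount blocks until interrupted with Ctrl+C.
mount_repo() {
    mount_point="${1:-$HOME/restic_mount}"   # default path is an assumption

    mkdir -p "$mount_point" || return 1
    echo "Mounting at $mount_point (press Ctrl+C to unmount)"
    restic mount "$mount_point"
}
```

Run mount_repo in one terminal and browse the mount point from a second one, for example with ls ~/restic_mount/snapshots/latest.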
Understanding Restore Conflicts
What happens if you try to restore files to a target directory where files with the same names already exist?
-
Restic's Default Behavior: By default, restic restore will overwrite existing files at the target location if they have the same name and path as files being restored from the snapshot. It does not merge file contents and, by default, does not decide based on modification times in the target; it simply lays down the snapshot's version of the files. Newer releases (0.17 and later) add an --overwrite option to change this; check restic restore --help for your version.
- If a file exists in the target but not in the snapshot (at that path), it will be left untouched.
- If a directory needs to be created for a restored file, Restic creates it.
-
Best Practices for Restoring to Avoid Data Loss:
- Restore to a New, Empty Directory: This is the safest approach, e.g. restic restore latest --target /tmp/my_restore_area. Then, inspect /tmp/my_restore_area and manually copy or move the needed files to their final destinations, handling any potential conflicts yourself (e.g., by renaming existing files or using tools like rsync with care).
- Verify Restores and the Repository: restic restore accepts a --verify flag that re-reads the restored files and checks their content. Independently of restores, regularly run restic check --read-data (or a sample thereof) to ensure your repository is in good shape. A corrupted repository can lead to failed restores.
- Be Specific with --include: If you only need specific files, use --include to limit what gets restored, reducing the chance of unintended overwrites.
- Understand Your Snapshot Contents: Use restic ls <snapshot_id> to know exactly what files are in the snapshot and their paths before initiating a restore to a populated area.
Older Restic versions have no --dry-run option for restic restore to preview what would be overwritten (it was added in 0.17), so the primary safety mechanism remains restoring to a temporary location.
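The "restore to a new, empty directory" practice can be enforced with a small wrapper. The sketch below is one possible way to do it, not a built-in Restic feature: the function name safe_restore is hypothetical, and RESTIC_REPOSITORY and RESTIC_PASSWORD_FILE are assumed to be set.

```shell
# Hypothetical helper: refuse to restore into a location that already has
# content, then restore and let restic verify the restored files.
safe_restore() {
    snapshot="$1"   # snapshot ID or "latest"
    target="$2"     # must be missing or empty

    if [ -e "$target" ] && [ -n "$(ls -A "$target" 2>/dev/null)" ]; then
        echo "refusing to restore: $target is not empty" >&2
        return 1
    fi
    mkdir -p "$target" || return 1
    # --verify makes restic re-read the restored files and check their content.
    restic restore "$snapshot" --target "$target" --verify
}
```

A call such as safe_restore latest /tmp/my_restore_area then either restores into a clean directory or stops before anything can be overwritten.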
Workshop Selective Restore and Mounting
In this workshop, you'll practice restoring specific parts of a backup and use the mount feature to browse your repository. We'll use the repository and sample data from the previous workshop.
Goals:
- Restore a single specific file from a snapshot.
- Restore a specific sub-directory from a snapshot.
- Install FUSE if necessary (Linux/macOS).
- Mount the Restic repository.
- Browse the mounted repository and copy a file.
- Unmount the repository.
Prerequisites:
- Restic installed.
- The Restic repository (my_local_repo) and environment variables (RESTIC_REPOSITORY, RESTIC_PASSWORD_FILE) set up from the "Your First Backup and Restore" workshop.
- At least one snapshot in the repository (e.g., from backing up source_data containing greeting.txt, important_doc.md, my_photos/photo1.jpg, and my_photos/photo2.png).
Steps:
-
Verify Environment and Snapshots: Ensure you are in your restic_workshop directory (cd ~/restic_workshop on Linux/macOS, cd ~\restic_workshop in PowerShell) and that RESTIC_REPOSITORY and RESTIC_PASSWORD_FILE are set.
List snapshots to pick one. We'll refer to the snapshot of source_data made in the previous workshop. If you used the tag workshop1, you can use that. Or use latest. Note a snapshot ID. Let's assume the path backed up was ./source_data.
    restic snapshots
    # Note an ID, e.g., a1b2c3d4, or use 'latest' if it's the one.
    # Check its contents to be sure of the paths within:
    restic ls latest   # (or your specific snapshot ID)
    # Expected output:
    # /source_data/greeting.txt
    # /source_data/important_doc.md
    # /source_data/my_photos/
    # /source_data/my_photos/photo1.jpg
    # /source_data/my_photos/photo2.png
The paths above assume you ran restic backup ./source_data from the workshop directory. If you backed up an absolute path instead (e.g., restic backup ~/restic_workshop/source_data), the paths in the snapshot include the full absolute path. Adjust the --include paths below to whatever restic ls shows. For consistency, this workshop will assume the paths in the snapshot are like /source_data/....
-
Create Target Directories for Selective Restores:
- Linux/macOS & Windows (PowerShell): Run mkdir ./selective_restore_file and then mkdir ./selective_restore_folder.
-
Restore a Single Specific File: Let's restore only greeting.txt from the snapshot into the selective_restore_file directory. Remember to use the correct path as it exists inside the snapshot.
    # Use 'latest' or your specific snapshot ID
    restic restore latest --target ./selective_restore_file --include "/source_data/greeting.txt"
Verify:
- Linux/macOS: ls -R ./selective_restore_file
- Windows (PowerShell): Get-ChildItem -Recurse ./selective_restore_file
You should see ./selective_restore_file/source_data/greeting.txt.
-
Restore a Specific Sub-directory: Now, let's restore the entire my_photos sub-directory into selective_restore_folder, using the same restore command with --target ./selective_restore_folder and --include "/source_data/my_photos".
Verify:
- Linux/macOS: ls -R ./selective_restore_folder
- Windows (PowerShell): Get-ChildItem -Recurse ./selective_restore_folder
You should see ./selective_restore_folder/source_data/my_photos/ containing photo1.jpg and photo2.png.
-
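Prepare for Mounting (Install FUSE if needed):
- Linux: Install the FUSE package for your distribution, e.g. sudo apt install fuse3 on Debian/Ubuntu.
- macOS: Install macFUSE, e.g. brew install --cask macfuse, and approve the system extension if prompted.
- Windows: restic mount is not available in the official Windows builds at the time of writing, and installing WinFsp does not enable it, so skip the mounting steps. The selective restores above cover the same use case.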
-
Mount the Restic Repository: Create a mount point and mount the repository.
- Linux/macOS: Create the mount point with mkdir ./my_restic_mount, then run restic mount ./my_restic_mount. This command will keep running in your terminal until you stop it (Ctrl+C). Open a NEW terminal window/tab for the subsequent browsing commands.
-
Browse the Mounted Repository (in the NEW terminal): Navigate into the mount point and explore.
- Linux/macOS:
    cd ~/restic_workshop/my_restic_mount
    ls              # You should see: hosts, ids, snapshots, tags
    cd snapshots
    ls              # You'll see directories named after snapshot times
    cd latest       # or any snapshot directory listed above
    ls              # You should see 'source_data' (or similar, based on your backup path)
    cd source_data
    ls
    cat important_doc.md
    # Try copying a file out (adjust the target path if needed):
    cp ./important_doc.md ~/restic_workshop/copied_from_mount.txt
    cd ~/restic_workshop   # Go back to your main workshop directory
    cat ./copied_from_mount.txt
- Windows (PowerShell): Only applicable if your Restic build supports mount (the official Windows builds do not at the time of writing):
    cd .\my_restic_mount
    Get-ChildItem   # You should see: hosts, ids, snapshots, tags
    cd snapshots
    Get-ChildItem
    # Navigate into one of them:
    # cd <snapshot_directory_name_here>
    # cd source_data
    # Get-Content important_doc.md
    # Try copying a file out (adjust the target path if needed):
    Copy-Item -Path .\important_doc.md -Destination ~\restic_workshop\copied_from_mount.txt
    cd ~\restic_workshop   # Go back to the workshop directory
    Get-Content .\copied_from_mount.txt
-
Unmount the Repository: Go back to the terminal where restic mount is running and press Ctrl+C. This will unmount the filesystem. Verify that my_restic_mount is now empty (or no longer accessible as a special mount).
- Linux/macOS: ls ./my_restic_mount (should be empty).
- Windows (PowerShell): Get-ChildItem ./my_restic_mount (should be empty).
Sometimes, especially if the mount process was interrupted uncleanly, you might need a more forceful unmount on Linux: sudo umount ./my_restic_mount or fusermount -u ./my_restic_mount.
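The two selective restores from this workshop can be collected into one function, which is convenient if you want to repeat them. This is a sketch under the workshop's assumptions: the snapshot stores its files under /source_data, the current directory is the workshop directory, and RESTIC_REPOSITORY and RESTIC_PASSWORD_FILE are set. The function name is hypothetical.

```shell
# Hypothetical helper: repeat the workshop's selective restores.
selective_restore_workshop() {
    snapshot="${1:-latest}"   # snapshot ID, defaults to "latest"

    mkdir -p ./selective_restore_file ./selective_restore_folder || return 1

    # Single file: only greeting.txt is restored, with its snapshot path.
    restic restore "$snapshot" --target ./selective_restore_file \
        --include "/source_data/greeting.txt" || return 1

    # Whole sub-directory: everything below my_photos.
    restic restore "$snapshot" --target ./selective_restore_folder \
        --include "/source_data/my_photos"
}
```

Calling selective_restore_workshop a1b2c3d4 restores from a specific snapshot; without an argument it uses latest.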
This workshop demonstrated how to perform targeted restores and how to use the convenient mount feature for browsing and accessing files from your backups. These are essential skills for effectively managing your Restic backups.
3. Managing Your Restic Repository
As you accumulate backups, your Restic repository will grow. Proper management is essential to ensure its integrity, control its size, and maintain efficient operation. This section covers crucial maintenance tasks, provides a deeper understanding of deduplication, and introduces the use of environment variables for easier Restic operation.
Repository Maintenance
Regular maintenance keeps your repository healthy and optimized.
-
Checking Repository Integrity (restic check): This is one of the most important commands for peace of mind. It verifies the integrity and consistency of your repository structure and data.
- What it does (by default):
  - Verifies that all pack files listed in the index are present and their sizes match.
  - Checks that the structure of snapshots, trees, and other internal metadata is intact.
  - It does not read all data blobs by default, as this can be very time-consuming and I/O intensive for large repositories.
- --read-data: To perform a more thorough check that involves reading all data from all pack files, decrypting it, and verifying its hash, run restic check --read-data. This is much slower and uses more bandwidth (for remote repositories) but provides the highest assurance that your data is not corrupted at rest. It's advisable to run this periodically (e.g., weekly or monthly), perhaps during off-peak hours.
- --read-data-subset <percent|size>: If --read-data is too slow for your entire repository, you can check a random subset of pack files.
  - restic check --read-data-subset 10% (checks 10% of the data)
  - restic check --read-data-subset 50G (checks up to 50GB of data)
  This can be a good compromise between a quick check and a full data read.
- Output: A healthy check will report no errors found. If errors are found, it will provide details, which are critical for troubleshooting (covered in a later section).
-
Pruning Old Snapshots (restic forget and restic prune): Over time, you'll accumulate many snapshots. While Restic's deduplication is efficient, the metadata itself (snapshot files, tree objects) can consume space, and you might want to enforce a retention policy (e.g., only keep daily backups for a month, weekly for a year, etc.). This is a two-step process:
- restic forget [options]: This command decides which snapshots to remove based on a retention policy you define. It removes the snapshot entries but does not delete the data they referenced; data blobs that are no longer referenced by any kept snapshot remain in the repository until prune is run.
  - Retention Policies: The --keep-* options are powerful:
    - --keep-last <n>: Keep the n most recent snapshots.
    - --keep-hourly <n>: For the last n hours that have at least one snapshot, keep the most recent snapshot of each hour.
    - --keep-daily <n>: The same, per day.
    - --keep-weekly <n>: The same, per week.
    - --keep-monthly <n>: The same, per month.
    - --keep-yearly <n>: The same, per year.
    - --keep-tag <taglist>: Keep all snapshots that carry the given tag(s), regardless of age.
    - --group-by <host,paths,tags>: Group snapshots before applying policies (by default, by host and paths), so that e.g. --keep-daily 7 applies per host.
  - --dry-run or -n: Crucial for testing! This shows what forget would do without actually removing anything. Always use --dry-run first to verify your policy.
  - Example of actually forgetting: restic forget --keep-daily 7 --keep-weekly 4 --keep-monthly 6
-
restic prune: After forget has removed snapshots, prune actually removes the underlying data blobs that are no longer referenced by any remaining snapshot. It also repacks data to consolidate smaller pack files and remove any internal fragmentation, potentially freeing up significant space.
- Important: prune can be a resource-intensive operation (CPU, I/O, memory), especially on large repositories. It needs to rebuild the index and potentially rewrite large amounts of data. Plan to run it during off-peak hours.
- prune creates new pack files and removes old ones. This means it temporarily needs extra space in the repository to hold both old and new data during the operation.
- If prune is interrupted, the repository remains in a consistent state, but some garbage data might not have been cleaned. You can simply run prune again.
-
Understanding Repacking (implicit in prune): There is no separate repack step in routine maintenance. prune decides on its own which pack files to rewrite, and modern Restic versions expose tuning options on prune itself, such as --max-unused, --max-repack-size, and --repack-small, rather than a standalone repack command. For routine maintenance, the defaults of prune are generally sufficient.
-
Rebuilding the Index (restic rebuild-index): The index files in a Restic repository map data blob IDs to the pack files where they are stored. If these index files become corrupted or are missing, Restic cannot efficiently find data. restic rebuild-index rebuilds the index from the pack files; since Restic 0.16 the command is called restic repair index, and the old name is deprecated. This is usually not needed for routine maintenance unless restic check reports index-related errors or Restic itself suggests it. prune also rebuilds the index.
-
Understanding Lock Files and Handling Stale Locks (restic unlock): Restic uses lock files in the repository (locks/ directory) to prevent conflicting operations from running at the same time, which could lead to corruption.
- When a Restic operation (like backup, prune, forget) starts, it creates a lock. When it finishes cleanly, it removes the lock.
- If Restic crashes, is killed abruptly, or a network connection to a remote repository drops, a lock file might be left behind. This is called a "stale lock."
- If a stale lock exists, subsequent Restic commands (especially those that need exclusive access, such as prune) will fail with an error message indicating the repository is locked.
- restic list locks: Lists the IDs of the lock files currently in the repository.
- restic unlock: Removes locks that Restic considers stale.
- restic unlock --remove-all: Removes all locks. Use with caution! Only do this if you are certain no other Restic process is legitimately using the repository. If you remove a lock that's actively in use by another Restic instance, you risk repository corruption.
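The maintenance commands above are typically run in a fixed order. The sketch below shows one possible sequence; the retention numbers (7 daily, 4 weekly, 6 monthly) and the function names are illustrative assumptions, not recommendations from the Restic project, and RESTIC_REPOSITORY and RESTIC_PASSWORD_FILE are assumed to be set.

```shell
# Hypothetical helper: one place for the retention policy, so the dry run
# and the real run cannot drift apart. "$@" carries extra flags such as
# --dry-run or --prune.
run_forget() {
    restic forget --keep-daily 7 --keep-weekly 4 --keep-monthly 6 "$@"
}

# Hypothetical maintenance routine: stop at the first failing step.
maintain_repo() {
    restic check || return 1           # structural check before changing anything
    run_forget --dry-run || return 1   # print the plan (useful in logs)
    run_forget --prune || return 1     # remove snapshots, then unreferenced data
    restic check --read-data-subset 10%   # sample the actual pack data afterwards
}
```

When first choosing a policy, run run_forget --dry-run on its own and review the output before calling maintain_repo.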
Understanding Deduplication in Depth
We've mentioned deduplication, but let's explore how it works more deeply, as it's central to Restic's efficiency.
-
Chunking Algorithm (Content Defined Chunking - CDC): Restic doesn't just split files into fixed-size blocks (like 4MB chunks). If it did, inserting a single byte at the beginning of a large file would cause every subsequent block to change, leading to poor deduplication for that file. Instead, Restic uses Content Defined Chunking. It scans the file's content using a rolling hash function (based on Rabin fingerprints). When the hash value matches a certain pattern (determined by a "mask"), Restic declares a chunk boundary.
- Variable Chunk Sizes: This results in variable chunk sizes, averaging around 1 MiB but ranging from a minimum of 512 KiB to a maximum of 8 MiB. These size limits are fixed in Restic; what is chosen when the repository is initialized and stored in the config file is the random polynomial used by the rolling hash.
- Robustness to Insertions/Deletions: Because chunk boundaries are determined by the content itself, if you insert or delete data in the middle of a file, only the chunks directly affected by the change (and possibly one or two adjacent chunks) will be different. Chunks far away from the modification, whose content hasn't changed, will remain identical and thus be deduplicated.
-
How Changes in Files Affect New Backups:
- New File: The entire file is chunked. For each chunk, Restic calculates its SHA-256 hash. It checks if a blob with this hash already exists in the repository's index.
  - If yes (chunk is a duplicate): Restic only references the existing blob.
  - If no (new chunk): Restic optionally compresses (repository format version 2), encrypts, and stores the new blob in a pack file, then updates the index.
- Modified File: Restic re-chunks the entire modified file.
  - Unchanged parts of the file will likely produce the same chunks as before (thanks to CDC). These will be identified as duplicates.
  - Changed parts of the file will produce new chunks. These will be processed as above (check hash, store if new).
- Unmodified File (based on metadata): If a file's metadata (modification time, size, inode number on Unix-like systems) hasn't changed since the "parent" snapshot (Restic can quickly check this if a parent snapshot is specified or found), Restic can often skip re-reading and re-chunking the file entirely, assuming its content is also unchanged. This significantly speeds up backups of large, mostly static datasets. The --force flag can make Restic re-read all files.
-
Impact on Storage Space and Backup Speed:
- Storage Space: Deduplication dramatically reduces storage requirements, especially for:
- Multiple backups of the same dataset over time (only changes are stored).
- Backups of multiple virtual machines that share common operating system files.
- Directories with many duplicate files.
- Backup Speed (for subsequent backups):
- Scanning for changes can still take time, especially for many small files.
- However, transferring data is much faster as only new/unique chunks are uploaded.
- The client CPU does work for chunking, hashing, and encryption.
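To see why fixed-size blocks deduplicate poorly, the following experiment uses only standard tools (no Restic involved). It is an illustration of the problem that content-defined chunking solves, not Restic's own chunker. It assumes GNU coreutils (seq, split, sha256sum, comm); the 64 KiB block size and the file names are arbitrary choices for the demonstration.

```shell
workdir=$(mktemp -d)
seq 1 100000 > "$workdir/original.txt"   # deterministic test data

# Hash every fixed-size 64 KiB block of a file: one hash per line, sorted.
block_hashes() {
    blocks=$(mktemp -d)
    split -b 65536 "$1" "$blocks/blk."
    sha256sum "$blocks"/blk.* | cut -d ' ' -f 1 | sort
    rm -rf "$blocks"
}

# Variant 1: one byte inserted at the start shifts every block boundary.
{ printf 'X'; cat "$workdir/original.txt"; } > "$workdir/prepended.txt"
# Variant 2: one byte appended at the end only touches the last block.
{ cat "$workdir/original.txt"; printf 'X'; } > "$workdir/appended.txt"

block_hashes "$workdir/original.txt"  > "$workdir/orig.hashes"
block_hashes "$workdir/prepended.txt" > "$workdir/pre.hashes"
block_hashes "$workdir/appended.txt"  > "$workdir/app.hashes"

# comm -12 prints only the lines (hashes) present in both sorted files.
total=$(wc -l < "$workdir/orig.hashes")
common_prepend=$(comm -12 "$workdir/orig.hashes" "$workdir/pre.hashes" | wc -l)
common_append=$(comm -12 "$workdir/orig.hashes" "$workdir/app.hashes" | wc -l)

echo "blocks in original:             $total"
echo "shared after prepending 1 byte: $common_prepend"
echo "shared after appending 1 byte:  $common_append"
```

With fixed blocks, the prepended variant shares no block with the original, while the appended variant shares all but the last one. A content-defined chunker places boundaries based on the data itself, so an insertion at the start would only change the first chunk or two.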
Environment Variables for Restic
Restic commands often require you to specify the repository location (-r or --repo) and the password. Typing these repeatedly is tedious and error-prone, especially in scripts. Restic supports several environment variables to simplify this:
-
RESTIC_REPOSITORY: Specifies the location of the Restic repository. If set, you don't need to use the -r or --repo flag.
- Example (Linux/macOS Bash): export RESTIC_REPOSITORY=/srv/restic-repo
- Example (Windows PowerShell): $env:RESTIC_REPOSITORY = "D:\backup\restic-repo"
-
RESTIC_PASSWORD / RESTIC_PASSWORD_FILE (Security Implications): These provide the repository password.
- RESTIC_PASSWORD: Sets the password directly.
  - Example (Linux/macOS Bash): export RESTIC_PASSWORD="your_secret_password"
  - Example (Windows PowerShell): $env:RESTIC_PASSWORD = "your_secret_password"
  - Security Risk: Storing passwords directly in environment variables can be a security risk: the value ends up in shell history if typed on the command line, is inherited by child processes, and can be read from the process environment by the same user or root (e.g., via /proc/<pid>/environ on Linux). Generally not recommended for scripts or shared environments.
- RESTIC_PASSWORD_FILE: Specifies a path to a text file containing the password. Restic will read the first line of this file as the password.
  - Example (Linux/macOS Bash): export RESTIC_PASSWORD_FILE=/etc/restic/password.txt
  - Example (Windows PowerShell): $env:RESTIC_PASSWORD_FILE = "C:\secrets\restic_pass.txt"
  - Security Best Practice: This is generally more secure than RESTIC_PASSWORD. Ensure the password file itself has restrictive permissions (e.g., chmod 600 /etc/restic/password.txt on Linux, so only the owner can read/write).
- If neither is set, Restic will prompt for the password interactively. Set only one of the two rather than relying on a precedence order between them.
-
Other Useful Variables:
- RESTIC_CACHE_DIR: Restic maintains a local cache (default: ~/.cache/restic on Linux, ~/Library/Caches/restic on macOS, %LOCALAPPDATA%\restic on Windows) to speed up operations by storing index data and other metadata locally. You can change the cache location with this variable. This is useful if the default location is on a slow disk or has limited space.
  - Example: export RESTIC_CACHE_DIR=/var/tmp/restic-cache
- TMPDIR: Restic uses temporary files for some operations. If your default temporary directory (/tmp) is small or slow, you can point Restic to use a different location by setting TMPDIR.
  - Example: export TMPDIR=/mnt/fast_ssd/tmp
- Restic honors the standard proxy variables HTTP_PROXY, HTTPS_PROXY, and NO_PROXY for remote repository access over HTTP(S).
Using these environment variables, especially RESTIC_REPOSITORY and RESTIC_PASSWORD_FILE, is highly recommended for scripting and automation.
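As a minimal sketch of the recommended setup, the helper below tightens the permissions of an existing password file and exports both variables for the current shell session. The function name use_restic_repo is hypothetical, and the paths in the usage example are placeholders.

```shell
# Hypothetical helper: point the current shell session at a repository,
# using a password file that only the owner can read or write.
use_restic_repo() {
    repo="$1"            # repository location
    password_file="$2"   # existing file whose first line is the password

    if [ ! -f "$password_file" ]; then
        echo "missing password file: $password_file" >&2
        return 1
    fi
    chmod 600 "$password_file" || return 1

    export RESTIC_REPOSITORY="$repo"
    export RESTIC_PASSWORD_FILE="$password_file"
}
```

For example, use_restic_repo /srv/restic-repo /etc/restic/password.txt makes subsequent restic commands in the same shell work without -r or a password prompt.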
Workshop Repository Maintenance and Automation Prep
This workshop will simulate a period of backups, then apply retention policies using forget and prune. We'll also set up environment variables for easier use in future scripts.
Goals:
- Create several new snapshots to simulate backups over time.
- Run restic check to verify repository health.
- Use restic forget --dry-run to plan a retention policy.
- Apply the retention policy with restic forget.
- Reclaim space with restic prune.
- Observe the effects on snapshot count and repository size (if noticeable with small data).
- Set up persistent environment variables for repository and password file for future use (demonstration, actual persistence depends on shell/OS).
Prerequisites:
- The Restic repository (my_local_repo) from previous workshops.
- The source_data directory used previously.
- Environment variables RESTIC_REPOSITORY and RESTIC_PASSWORD_FILE should ideally still be set from the previous workshop for convenience during this one. If not, set them for your current session.
Steps:
-
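Set Up Session Environment Variables (if not already set): Ensure RESTIC_REPOSITORY and RESTIC_PASSWORD_FILE point to your workshop repository and password file.
- Linux/macOS (Bash/Zsh - example paths): export RESTIC_REPOSITORY=~/restic_workshop/my_local_repo and export RESTIC_PASSWORD_FILE=<path to your password file>
- Windows (PowerShell - example paths): $env:RESTIC_REPOSITORY = "$HOME\restic_workshop\my_local_repo" and $env:RESTIC_PASSWORD_FILE = "<path to your password file>"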
-
Simulate Backups Over Time: We'll make a few small changes to source_data and create new backups. We'll use restic backup with different tags to simulate daily/weekly backups.
- Initial state: Check current snapshots with restic snapshots.
- Day 1: (Assume initial backup from workshop 1 was Day 0)
  - Linux/macOS: echo "Update Day 1" >> ./source_data/greeting.txt
  - Windows: Add-Content -Path .\source_data\greeting.txt -Value "Update Day 1"
  Then back up: restic backup ./source_data --tag daily_sim --tag day1
- Day 2:
  - Linux/macOS: echo "Update Day 2" >> ./source_data/important_doc.md
  - Windows: Add-Content -Path .\source_data\important_doc.md -Value "Update Day 2"
  Then back up: restic backup ./source_data --tag daily_sim --tag day2
- Day 3 to Day 7 (repeat similar modifications and backups). For brevity in this written workshop, let's just do a few more:
    # Day 3
    echo "Update Day 3" >> ./source_data/my_photos/photo1.jpg
    restic backup ./source_data --tag daily_sim --tag day3
    # Day 4
    echo "Update Day 4" >> ./source_data/greeting.txt
    restic backup ./source_data --tag daily_sim --tag day4
    # Day 8 (Simulate start of week 2)
    echo "Update Day 8 - Week 2" >> ./source_data/important_doc.md
    restic backup ./source_data --tag daily_sim --tag week2_start
- List snapshots again with restic snapshots to see the history: You should now have several snapshots.
-
Check Repository Integrity: Before any major operation like pruning, it's good practice to check the repo by running restic check. Ensure it reports "no errors found."
-
Plan Retention Policy with restic forget --dry-run: Let's say we want to keep:
- The last 3 snapshots (covers very recent changes).
- Daily snapshots for the last 7 days (not strictly needed if covered by --keep-last 3 but good for demo).
- Every snapshot tagged week2_start.
    restic forget --dry-run \
      --keep-last 3 \
      --keep-daily 7 \
      --keep-tag week2_start
Adding --prune to forget makes Restic run prune automatically after the snapshots have been removed. For planning, it is clearer to leave it out and look only at which snapshots would be removed:
    echo "### Planning snapshots to forget (dry run) ###"
    restic forget --dry-run --keep-last 3 --keep-daily 5 --group-by paths
    # Adjust numbers like --keep-daily 5 based on how many you created.
    # For our few snapshots, let's simplify: keep the last 2, plus the one tagged week2_start
    restic forget --dry-run --keep-last 2 --keep-tag week2_start
Examine the output. It will list the snapshots it intends to remove. Adjust your --keep-* parameters until you are satisfied with the plan. For example, if you only have 5 snapshots total, --keep-last 7 will keep all of them. Note that the simulated snapshots were all created on the same calendar day, so --keep-daily treats them as a single day and keeps only the newest of them. Let's try a policy that will definitely remove some, assuming you have about 5-7 snapshots now: keep the 2 newest, and any tagged workshop1 (our very first one), i.e. restic forget --dry-run --keep-last 2 --keep-tag workshop1. Note which snapshots are listed under "remove".
-
Apply the Retention Policy with restic forget: Once you're happy with the dry run, remove the --dry-run (or -n) flag to actually forget the snapshots, e.g. restic forget --keep-last 2 --keep-tag workshop1.
List snapshots again with restic snapshots: You should see fewer snapshots listed. The data blobs from the forgotten snapshots are still in the repository but are now unreferenced.
-
Reclaim Space with
This might take a moment even for a small repository. It will show progress. Afterrestic prune
: Now, runprune
to remove the unreferenced data and potentially repack existing data.prune
completes:- Run
restic check
again to ensure health. - Optionally, check the size of your repository directory (
du -sh ~/restic_workshop/my_local_repo
on Linux/macOS or(Get-ChildItem ~\restic_workshop\my_local_repo -Recurse | Measure-Object -Property Length -Sum).Sum / 1MB
in PowerShell). With very small text files, the size difference might be negligible as overhead and metadata structure could dominate. With larger, more varied data,prune
's effect is more apparent.
- Run
- Setup Persistent Environment Variables (Conceptual): For future use, especially in scripts, you'd want `RESTIC_REPOSITORY` and `RESTIC_PASSWORD_FILE` to be set automatically. How you do this depends on your OS and shell:
  - Linux/macOS (Bash/Zsh): Add to your shell's startup file (e.g., `~/.bashrc`, `~/.zshrc`, or `~/.profile`):

        # In ~/.bashrc or ~/.zshrc:
        # export RESTIC_REPOSITORY="/path/to/your/real_repo"
        # export RESTIC_PASSWORD_FILE="/path/to/your/real_repo_password_file"

    Then run `source ~/.bashrc` or open a new terminal. For this workshop, we've been setting them per session. For actual automation, this is how you'd make them more permanent for your user session or scripts.
  - Windows (PowerShell): To set them persistently for the current user, store them as user-level environment variables, for example with `[Environment]::SetEnvironmentVariable("RESTIC_REPOSITORY", "C:\path\to\repo", "User")` (the path is a placeholder). You'll need to open a new PowerShell window for these to take effect.
  - System-wide (e.g., in scripts run by `cron` or `systemd`): It's often better to define these variables at the top of your backup script itself, or use features of the scheduler (like systemd's `EnvironmentFile=` or `Environment=`).

  For now, ensure they are set in your current terminal session if you plan to proceed directly to the next sections. We will use these variables extensively.
This workshop provided hands-on experience with vital repository maintenance tasks. Regularly checking, forgetting old snapshots according to a policy, and pruning are key to a healthy, efficient Restic backup system. Setting up environment variables will make subsequent interactions and scripting much smoother.
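The maintenance steps practised in this workshop can be collected into one small shell function. This is a minimal sketch, not a finished tool: the function name `restic_maintain`, the `DRY_RUN` switch, and the hard-coded policy (`--keep-last 2 --keep-tag week2_start`) are illustrative assumptions, and it expects `RESTIC_REPOSITORY` and `RESTIC_PASSWORD_FILE` to be set already.

```shell
#!/bin/sh
# Sketch: check -> forget -> prune -> check, stopping at the first failure.
# restic_maintain and DRY_RUN are hypothetical names used for illustration.

restic_maintain() {
    # Verify repository health before changing anything.
    restic check || return 1

    if [ -n "${DRY_RUN:-}" ]; then
        # Planning mode: only show which snapshots would be removed.
        restic forget --dry-run --keep-last 2 --keep-tag week2_start
        return $?
    fi

    # Mark snapshots outside the policy as forgotten, then reclaim space.
    restic forget --keep-last 2 --keep-tag week2_start || return 1
    restic prune || return 1

    # Confirm the repository is still consistent after pruning.
    restic check
}
```

Run `DRY_RUN=1 restic_maintain` first to preview the plan, then `restic_maintain` to apply it. Keeping the policy in one place means the dry run and the real run cannot drift apart.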
4. Remote Repositories and Security
While local repositories are great for quick backups and restores, a robust backup strategy (like the 3-2-1 rule: 3 copies of data, on 2 different media, with 1 off-site) necessitates storing backups remotely. Restic excels here with its client-side encryption and support for various remote backends. This section explores how to use remote storage and delves deeper into Restic's encryption model.
Supported Remote Backends
Restic can natively interact with several types of remote storage, and its capabilities can be extended further using tools like `rclone`.
- SFTP (SSH File Transfer Protocol):
  - If you have a server accessible via SSH, you can use its SFTP service as a Restic backend.
  - Restic connects via SSH, then interacts with the remote filesystem using SFTP commands.
  - Requires an SSH server running on the remote machine.
  - Authentication is typically via SSH keys (recommended) or password.
  - Repository path format: `sftp:user@host:/path/to/repo`
- REST Server (`rest-server`):
  - Restic provides its own lightweight HTTP server called `rest-server`. You can run this on a machine to serve a Restic repository over HTTP/HTTPS.
  - Offers features like append-only mode, which can protect against attackers deleting backups if they compromise the client but not the `rest-server` itself (with append-only, they can add new bad data but not remove old good data).
  - Repository path format: `rest:http://user:password@host:port/repo_name` or `rest:https://...`
- Amazon S3 and Compatible Services:
  - Amazon S3: Native support for Amazon's Simple Storage Service.
    - Requires AWS Access Key ID, Secret Access Key, and bucket name.
    - Repository path format: `s3:s3.amazonaws.com/bucket_name/path_prefix` or `s3:https://<endpoint>/bucket_name/path_prefix` for specific regions or S3 gateways.
  - MinIO: A popular open-source, S3-compatible object storage server you can self-host.
  - Wasabi, Backblaze B2, DigitalOcean Spaces, etc.: Many cloud storage providers offer S3-compatible APIs. Restic can often use these by specifying the provider's S3 endpoint.
    - Backblaze B2 example: `b2:bucketName:/path/prefix` (requires B2 account ID/application key). Restic has dedicated B2 support.
- Azure Blob Storage:
  - Native support for Microsoft Azure Blob Storage.
  - Requires Azure account name and key.
  - Repository path format: `azure:containerName:/path/prefix`
- Google Cloud Storage (GCS):
  - Native support for Google Cloud Storage.
  - Requires GCS project ID and authentication (usually via a service account JSON file).
  - Repository path format: `gs:bucketName:/path/prefix`
- Rclone:
  - Rclone is a powerful command-line program to manage files on cloud storage. It supports a vast number of services (Dropbox, OneDrive, Google Drive, Box, many more).
  - Restic can use Rclone as a "backend bridge." You configure a remote in Rclone, and then tell Restic to use that Rclone remote.
  - This dramatically expands the number of cloud services Restic can use.
  - Repository path format: `rclone:yourRcloneRemoteName:path/to/repo`
  - Restic executes the `rclone` binary in the background.
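As a quick reference, the repository path formats above can be gathered into a small lookup helper. This is only a sketch: `example_repo_url` and `backend_env_vars` are hypothetical function names, all hosts, buckets, and paths are the placeholders used in the list, and the environment variable names are the ones Restic reads for each backend's credentials.

```shell
#!/bin/sh
# Sketch: print the placeholder repository URL and the credential
# environment variables for a given backend.
# example_repo_url and backend_env_vars are hypothetical helper names.

example_repo_url() {
    case "$1" in
        sftp)   echo "sftp:user@host:/path/to/repo" ;;
        rest)   echo "rest:https://user:password@host:port/repo_name" ;;
        s3)     echo "s3:s3.amazonaws.com/bucket_name/path_prefix" ;;
        b2)     echo "b2:bucketName:/path/prefix" ;;
        azure)  echo "azure:containerName:/path/prefix" ;;
        gs)     echo "gs:bucketName:/path/prefix" ;;
        rclone) echo "rclone:yourRcloneRemoteName:path/to/repo" ;;
        *)      echo "unknown backend: $1" >&2; return 1 ;;
    esac
}

backend_env_vars() {
    case "$1" in
        s3)    echo "AWS_ACCESS_KEY_ID AWS_SECRET_ACCESS_KEY" ;;
        b2)    echo "B2_ACCOUNT_ID B2_ACCOUNT_KEY" ;;
        azure) echo "AZURE_ACCOUNT_NAME AZURE_ACCOUNT_KEY" ;;
        gs)    echo "GOOGLE_PROJECT_ID GOOGLE_APPLICATION_CREDENTIALS" ;;
        # SFTP, REST, and rclone take credentials from SSH, the URL, or rclone's own config.
        sftp|rest|rclone) echo "" ;;
        *)     echo "unknown backend: $1" >&2; return 1 ;;
    esac
}
```

For example, `example_repo_url gs` prints the GCS format and `backend_env_vars gs` lists the variables to export before running `restic -r <url> init`.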
Setting up an SFTP Backend
SFTP is a common and relatively easy-to-set-up remote backend if you have a machine you can SSH into.
- Prerequisites:
  - A remote server with an SSH server installed and running.
  - A user account on that server for Restic backups. It's good practice to create a dedicated, less-privileged user for this.
  - Sufficient storage space on the server for your backups.
- SSH Key-Based Authentication (Recommended):
  Using SSH keys is more secure and convenient than password authentication for automated backups.
  - Generate an SSH Key Pair (on your client machine, if you don't have one), for example with `ssh-keygen -t ed25519`.
  - Copy the Public Key to the Remote Server. Replace `restic_user` and `your_remote_server_ip_or_hostname` with actual values.

        ssh-copy-id restic_user@your_remote_server_ip_or_hostname
        # This appends your public key to ~/.ssh/authorized_keys on the remote server for restic_user.
        # If ssh-copy-id is not available (e.g., on Windows or some minimal Linux), manually:
        # cat ~/.ssh/id_ed25519.pub | ssh restic_user@your_remote_server_ip_or_hostname "mkdir -p ~/.ssh && chmod 700 ~/.ssh && cat >> ~/.ssh/authorized_keys && chmod 600 ~/.ssh/authorized_keys"

  - Test SSH Key Authentication: run `ssh restic_user@your_remote_server_ip_or_hostname`. You should be logged in without being asked for the account password.
  - (Optional but Recommended) Restrict the SSH Key:
    For enhanced security, you can restrict what this specific SSH key can do on the server. Edit `~/.ssh/authorized_keys` on the server for the `restic_user` and prefix the key entry with restrictions. Because Restic only needs the SFTP subsystem, a common approach is to force the key into SFTP and set its starting directory. Example `authorized_keys` entry on the server:

        command="internal-sftp -u 0002 -d /backup_storage/restic_user_repo",no-pty,no-X11-forwarding,no-agent-forwarding,no-port-forwarding ssh-ed25519 AAAA... restic_backup_key

    Here `-u` sets the umask and `-d` the start directory. Do not add `-R`: it puts the SFTP server into read-only mode, which would make backups fail. This is more advanced. A simpler start is ensuring the `restic_user` has limited shell capabilities or is chrooted to their backup directory. Helpers such as `rrsync` restrict rsync only and do not apply to Restic's SFTP backend. For plain Restic SFTP, ensuring the user can only write to their designated backup path is a good start.
- Initializing a Restic Repository over SFTP:
  Let's say `restic_user` on `your_remote_server_ip_or_hostname` will store repositories under `/var/backups/restic/my_machine_backup`. The repository URL for Restic will be `sftp:restic_user@your_remote_server_ip_or_hostname:/var/backups/restic/my_machine_backup`.

        # On your client machine
        # Ensure RESTIC_PASSWORD_FILE is set or be ready to enter the password
        restic -r sftp:restic_user@your_remote_server_ip_or_hostname:/var/backups/restic/my_machine_backup init

  Restic will connect via SSH (using your key), then use SFTP commands to create the repository structure in the specified remote path.
  - If the SSH server runs on a non-standard port (e.g., 2222), the short `sftp:user@host:/path` form cannot carry a port number. Use the URL form `sftp://user@host:2222//path` instead (note the double slash before an absolute path), or put the port into the SSH client configuration (`~/.ssh/config`), which is usually the cleanest way:

        Host my_sftp_server_alias
            HostName your_remote_server_ip_or_hostname
            User restic_user
            Port 2222
            IdentityFile ~/.ssh/id_ed25519_for_restic

    Then you can use a simpler Restic repo URL: `sftp:my_sftp_server_alias:/var/backups/restic/my_machine_backup`
-
Performance Considerations:
- SFTP performance depends heavily on network latency and bandwidth, as well as server-side I/O.
- SSH encryption/decryption adds some CPU overhead on both client and server.
- For very high-latency links or very large numbers of small files, SFTP might be slower than object storage backends like S3, which are designed for concurrent HTTP requests.
- Restic talks to the SFTP backend over a limited number of parallel connections (5 by default), which helps throughput. In recent Restic versions the limit can be adjusted with the backend option `-o sftp.connections=N`.
Using a MinIO (S3-compatible) Server
MinIO is an excellent open-source object storage server that implements the Amazon S3 API. You can self-host MinIO on your own hardware (or a VM/container) to create a private S3-like service.
- Setting up a Local MinIO Server (e.g., via Docker for testing):
  This is a quick way to get a MinIO instance running for experimentation. For production, refer to MinIO's documentation for robust deployment.
  - Install Docker if you haven't already.
  - Run MinIO container:

        # Create directories to persist MinIO data and config
        mkdir -p ~/minio/data ~/minio/config
        docker run -d \
          -p 9000:9000 \
          -p 9001:9001 \
          --name minio_server \
          -e "MINIO_ROOT_USER=YOUR_MINIO_ACCESS_KEY" \
          -e "MINIO_ROOT_PASSWORD=YOUR_MINIO_SECRET_KEY_VERY_STRONG" \
          -v ~/minio/data:/data \
          -v ~/minio/config:/root/.minio \
          minio/minio server /data --console-address ":9001"

    - Replace `YOUR_MINIO_ACCESS_KEY` (e.g., `resticadmin`) and `YOUR_MINIO_SECRET_KEY_VERY_STRONG` (e.g., `aVeryStrongSecretKeyForRestic`) with your desired credentials. Make the secret key strong!
    - `-p 9000:9000`: Exposes the MinIO S3 API port.
    - `-p 9001:9001`: Exposes the MinIO web console port.
    - `-v ~/minio/data:/data`: Mounts a local directory to store MinIO's objects.
  - Access MinIO Console: Open your web browser to `http://localhost:9001`. Log in with the `MINIO_ROOT_USER` and `MINIO_ROOT_PASSWORD` you set.
  - Create a Bucket: In the MinIO console, create a new bucket (e.g., `restic-backups`). This bucket will host your Restic repository.
- Configuring Restic to use MinIO:
  Restic needs the MinIO server's endpoint, your access key, and your secret key.
  - Endpoint: For our local Docker example, it's `http://localhost:9000`. For a production MinIO, it would be its public or internal IP/hostname and port, likely with HTTPS.
  - Access Key: `YOUR_MINIO_ACCESS_KEY`
  - Secret Key: `YOUR_MINIO_SECRET_KEY_VERY_STRONG`

  You can provide these via environment variables (recommended for S3 keys):

        # On your client machine (where Restic runs)
        export AWS_ACCESS_KEY_ID="YOUR_MINIO_ACCESS_KEY"
        export AWS_SECRET_ACCESS_KEY="YOUR_MINIO_SECRET_KEY_VERY_STRONG"
        # If MinIO serves HTTPS with a self-signed certificate, pass its CA certificate
        # to Restic with the --cacert option.
        # An http:// endpoint such as http://localhost:9000 is contacted without TLS.

  Now, initialize the Restic repository in the MinIO bucket. The repository URL format is `s3:<endpoint>/<bucket_name>/<optional_path_prefix_in_bucket>`:

        # Ensure RESTIC_PASSWORD_FILE is set for your Restic repository password
        restic -r s3:http://localhost:9000/restic-backups/my_minio_repo init

  - `s3:http://localhost:9000`: Specifies the S3 protocol and endpoint. We use `http` because our Docker MinIO is not set up with HTTPS by default. For production, always use HTTPS.
  - `/restic-backups`: The bucket name.
  - `/my_minio_repo`: An optional "folder" or prefix within the bucket where this specific Restic repository will live. This allows you to have multiple Restic repositories in the same bucket.

  Once initialized, all Restic commands (`backup`, `snapshots`, `restore`, `check`, `prune`) will work with this S3 repository URL, using the AWS environment variables for authentication.
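Typing `export` lines by hand leaves the secret key in your shell history. One possible alternative, sketched below, keeps the settings in a file that only your user can read and sources it before running Restic. The function name `write_restic_env` and the example file path are assumptions made for this illustration; the variable names are the ones used above.

```shell
#!/bin/sh
# Sketch: store S3/MinIO credentials in a private env file (mode 600).
# write_restic_env and the example path are hypothetical.
# Limitation: the values must not contain single quotes.

write_restic_env() (
    env_file=$1
    access_key=$2
    secret_key=$3
    repo=$4

    # umask 077 creates the file without group/other access.
    umask 077
    {
        printf "export AWS_ACCESS_KEY_ID='%s'\n" "$access_key"
        printf "export AWS_SECRET_ACCESS_KEY='%s'\n" "$secret_key"
        printf "export RESTIC_REPOSITORY='%s'\n" "$repo"
    } > "$env_file"

    # Tighten permissions even if the file already existed.
    chmod 600 "$env_file"
)
```

A possible use: `write_restic_env ~/.config/restic/minio.env YOUR_MINIO_ACCESS_KEY YOUR_MINIO_SECRET_KEY_VERY_STRONG s3:http://localhost:9000/restic-backups/my_minio_repo`, followed by `. ~/.config/restic/minio.env` in any session or script that needs the repository.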
Encryption Deep Dive
Restic's security model is built around strong, client-side, authenticated encryption. Understanding this is crucial for trusting your backups.
- AES-256 Encryption:
  - Restic uses AES-256 (Advanced Encryption Standard with 256-bit keys) in Counter mode (CTR) for encrypting data blobs and tree blobs. AES is a widely adopted, strong symmetric encryption algorithm. CTR mode turns a block cipher (like AES) into a stream cipher, which is efficient for encrypting data of varying lengths.
- Authenticated Encryption (AEAD):
  - Simply encrypting data (confidentiality) is not enough; you also need to ensure data integrity and authenticity (that the data hasn't been tampered with and that it originated from an authorized source).
  - Restic achieves this by using a Message Authentication Code (MAC). Specifically, it uses Poly1305-AES as the MAC algorithm.
  - For each encrypted blob, a MAC is computed and stored alongside it. When reading data, Restic recomputes the MAC and compares it to the stored one. If they don't match, the data is considered corrupt or tampered with, and Restic will report an error. This prevents an attacker from modifying ciphertext in a meaningful way without detection.
- Key Derivation from the Password: You provide a repository password. Restic doesn't store this password. Instead, it uses the password to unlock the keys that protect the repository:
  - Password Hashing: Your repository password is first processed by a key derivation function (KDF). Restic uses `scrypt` for this. `scrypt` is designed to be computationally intensive and memory-hard, making brute-force attacks against the password very difficult.
  - User Key: The output of `scrypt` is a user key (an encryption key plus a MAC key). It is used only to decrypt and authenticate a key file in the repository.
  - Master Keys: The key file contains the repository's master keys:
    - One for encryption (master encryption key).
    - One for MAC computation (master MAC key).

    The master keys are generated randomly when the repository is initialized; they are not derived from the password.
  - Per-Blob Nonces: Restic does not derive a separate key for every blob. Each blob is encrypted with the master encryption key together with a fresh random nonce, which serves as the counter-mode IV and as the MAC nonce. The random nonce ensures the same key stream is never reused.
- The Role of `config` and `keys` Files in the Repository:
  - `keys/`: This directory in the repository stores key files. Each one contains the `scrypt` parameters and salt in readable form plus a copy of the master encryption and MAC keys, encrypted with the user key derived from a password via `scrypt`.
    - When you access the repository, Restic asks for your password, runs it through `scrypt`, and uses the output to try to decrypt one of the files in the `keys/` directory. If successful, it obtains the master keys for the repository.
    - Having multiple key files allows several passwords to unlock the same master keys, for example one per user or machine. `restic key add` creates an additional key, `restic key list` shows them, and `restic key remove` revokes one.
  - `config`: This file stores repository-wide configuration, including the repository ID, version, and parameters for chunking. Like the rest of the repository, it is encrypted and authenticated with the master keys; only the key files carry readable parameters, because those are needed before any key is available.
- What Happens If You Lose Your Password? Your data is irrecoverably lost.
  - Because Restic performs client-side encryption and you are the sole holder of the password (or the means to derive the keys), there is no backdoor, no password reset mechanism, and no way for Restic developers or anyone else to recover your data without that password.
  - This is a feature, not a bug. It guarantees your privacy and control.
  - Action: Store your repository password in a very secure and reliable place (e.g., a reputable password manager, a written copy in a safe). Consider creating a "disaster recovery" sheet with the password and repository location details.
- Changing the Repository Password: Run `restic key passwd` to change the password of the key you are currently using. Only the small key file is rewritten, so the operation is fast and the backed-up data is not touched. The master keys stay the same, however. If you suspect that the master keys themselves were exposed, or you want a completely fresh repository, migrate instead:
  - Initialize a new repository with the new desired password.
  - Use `restic copy` to copy all snapshots from the old repository to the new repository. This command re-encrypts data using the new repository's keys as it copies.
  - Verify the new repository.
  - Once confident, you can eventually delete the old repository.
  - This process can be time-consuming for large repositories as it involves reading, decrypting, re-encrypting, and writing all unique data.
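The migration steps can be sketched as a small function. This assumes a Restic version that supports the `--from-repo` and `--from-password-file` options (older releases used differently named flags), and `migrate_repo` plus all four argument values are hypothetical names chosen for the example.

```shell
#!/bin/sh
# Sketch: migrate all snapshots into a freshly initialized repository.
# migrate_repo is a hypothetical helper; adjust paths and password files.

migrate_repo() {
    old_repo=$1
    new_repo=$2
    old_pw=$3
    new_pw=$4

    # Create the new repository with the old chunker parameters so that
    # deduplication keeps working for the copied data.
    restic -r "$new_repo" --password-file "$new_pw" init \
        --from-repo "$old_repo" --from-password-file "$old_pw" \
        --copy-chunker-params || return 1

    # Copy every snapshot; data is re-encrypted with the new repository's keys.
    restic -r "$new_repo" --password-file "$new_pw" copy \
        --from-repo "$old_repo" --from-password-file "$old_pw" || return 1

    # Verify the new repository before retiring the old one.
    restic -r "$new_repo" --password-file "$new_pw" check
}
```

A call might look like `migrate_repo /srv/old_repo /srv/new_repo ~/old_pw.txt ~/new_pw.txt`. The old repository is deliberately left untouched so you can delete it by hand once the check has passed.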
Workshop Setting up a Remote SFTP Repository
This workshop will guide you through setting up a Restic repository on a remote server using SFTP and SSH key authentication.
Goals:
- Prepare a remote server (or a local VM acting as one) with an SSH user.
- Set up SSH key-based authentication from your client to the server for this user.
- Initialize a Restic repository on the SFTP server from your client.
- Back up a sample directory to the SFTP repository.
- List snapshots and restore a file from the SFTP repository to verify.
Prerequisites:
- Client Machine: Your primary machine where Restic is installed.
- Server Machine: Another machine (can be a Virtual Machine on your local computer, a Raspberry Pi, or a cloud server) running Linux with an SSH server (like OpenSSH Server). You need `sudo` or root access on this server for user creation.
- Basic understanding of Linux user management and SSH.
Server Preparation (on your "remote" server):
- Install SSH Server (if not already present): On most server-oriented Linux distros, OpenSSH server is installed by default. If not, install it with your distribution's package manager (for example, `sudo apt install openssh-server` on Debian/Ubuntu).
- Create a Dedicated Restic User: It's good practice to use a dedicated, non-privileged user for Restic backups, here called `restic_backup_user` (for example, `sudo adduser restic_backup_user` on Debian/Ubuntu, which also prompts for a password).
- Create a Directory for Restic Repositories: This directory will hold the Restic repository data for `restic_backup_user`.

        # On the server
        sudo mkdir -p /srv/restic_storage
        sudo chown restic_backup_user:restic_backup_user /srv/restic_storage
        sudo chmod 700 /srv/restic_storage # Only user can rwx

  The actual repository will be a subdirectory within `/srv/restic_storage`, e.g., `/srv/restic_storage/my_client_backups`.
Client Preparation (on your Restic client machine):
- Generate SSH Key Pair (if you don't have one you want to use):

        # On the client machine
        ssh-keygen -t ed25519 -f ~/.ssh/id_ed25519_restic_sftp -C "restic_sftp_key_for_$(hostname)"
        # When prompted for a passphrase, you can leave it empty for fully automated access (less secure for the key itself)
        # or provide one (more secure, ssh-agent can cache it). For this workshop, empty is fine.
        # This creates ~/.ssh/id_ed25519_restic_sftp (private) and ~/.ssh/id_ed25519_restic_sftp.pub (public).

- Copy Public Key to Server: Replace `your_server_ip_or_hostname` with the actual IP or hostname of your server.

        # On the client machine
        ssh-copy-id -i ~/.ssh/id_ed25519_restic_sftp.pub restic_backup_user@your_server_ip_or_hostname
        # You'll be prompted for restic_backup_user's password on the server one last time here.

  If `ssh-copy-id` is unavailable, append the public key manually: `cat ~/.ssh/id_ed25519_restic_sftp.pub | ssh restic_backup_user@your_server_ip_or_hostname "mkdir -p ~/.ssh && chmod 700 ~/.ssh && cat >> ~/.ssh/authorized_keys && chmod 600 ~/.ssh/authorized_keys"`
- Configure SSH Client (Optional but Recommended for Specific Key): To make SSH use this specific key automatically for this host/user, edit or create `~/.ssh/config` on your client machine:

        Host my_sftp_backup_server
            HostName your_server_ip_or_hostname
            User restic_backup_user
            IdentityFile ~/.ssh/id_ed25519_restic_sftp
            # Add Port XXXXX if your server SSH is on a non-standard port

  Now, you can test by SSHing using the alias:

        ssh my_sftp_backup_server
        # Should log you in as restic_backup_user on the server without a password.
        exit # to come back to client

  If you don't set up `~/.ssh/config`, SSH only tries its default key files, so a key stored under a custom name such as `id_ed25519_restic_sftp` must be passed explicitly with `ssh -i`.
Performing Restic Operations:
- Define SFTP Repository Path and Restic Password:
  - Repository URL: `sftp:my_sftp_backup_server:/srv/restic_storage/my_client_sftp_repo` (if not using the `~/.ssh/config` alias: `sftp:restic_backup_user@your_server_ip_or_hostname:/srv/restic_storage/my_client_sftp_repo`)
  - Restic Password: Use the same `mypass.txt` from previous workshops or create a new one for this repository. Ensure the `RESTIC_PASSWORD_FILE` environment variable is set.
- Initialize the SFTP Restic Repository:

        # On the client
        export RESTIC_SFTP_REPO="sftp:my_sftp_backup_server:/srv/restic_storage/my_client_sftp_repo"
        # (Or the full sftp:user@host:/path format if not using ssh config alias)
        restic -r $RESTIC_SFTP_REPO init

  You should see a success message. On the server, check `/srv/restic_storage/my_client_sftp_repo`. It should now contain the Restic repository structure (`config`, `data`, `keys`, etc.), owned by `restic_backup_user`.
- Back up Sample Data to SFTP Repository: Let's back up the `source_data` directory from your `restic_workshop` folder again, this time to the remote SFTP repo, with `restic -r $RESTIC_SFTP_REPO backup ./source_data --tag sftp_backup`. Observe the output. It might be slightly slower than a local backup due to network transfer.
- List Snapshots on SFTP Repository: Run `restic -r $RESTIC_SFTP_REPO snapshots`. You should see the snapshot you just created with the `sftp_backup` tag.
- Restore a File from SFTP Repository: Create a temporary directory for the restore.

        # On the client
        mkdir ./sftp_restore_test
        # Let's restore just one file from the snapshot, e.g., sftp_test.txt (if you created it)
        # or greeting.txt if using prior data.
        # Restic records absolute paths, so use `restic -r $RESTIC_SFTP_REPO ls latest` to see the stored path.
        # A pattern without a leading slash matches at any depth:
        restic -r $RESTIC_SFTP_REPO restore latest --target ./sftp_restore_test --include "source_data/greeting.txt"

  Verify the restored file: Restic recreates the original directory layout below the target, so locate the file with `find ./sftp_restore_test -type f` and compare it with the original.

Congratulations! You have successfully set up an SFTP backend for Restic, configured SSH key authentication, and performed backup and restore operations. This is a significant step towards a more robust, off-site backup strategy. Remember to manage the `RESTIC_PASSWORD_FILE` and your SSH private key securely.
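The final verification step of this workshop can be automated with a byte-for-byte comparison. The sketch below is one way to do it; `verify_restore` is a hypothetical helper name, and it assumes the file name is unique inside the restore target.

```shell
#!/bin/sh
# Sketch: confirm that a restored file is identical to its original.
# verify_restore is a hypothetical helper name.

verify_restore() {
    original=$1
    restore_root=$2
    name=$(basename "$original")

    # Restic recreates the original directory layout under the target,
    # so search for the file by name instead of assuming its depth.
    restored=$(find "$restore_root" -type f -name "$name" | head -n 1)
    if [ -z "$restored" ]; then
        echo "not restored: $name" >&2
        return 1
    fi

    # cmp -s exits 0 only when both files are byte-for-byte identical.
    if cmp -s "$original" "$restored"; then
        echo "OK: $restored matches $original"
    else
        echo "MISMATCH: $restored differs from $original" >&2
        return 1
    fi
}
```

After the restore above, `verify_restore ./source_data/greeting.txt ./sftp_restore_test` reports whether the restored copy is intact. A non-zero exit status makes the function easy to use in scripts that test restores regularly.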
5. Advanced Restic Usage and Automation
With a solid understanding of Restic's basics and remote repository capabilities, we can now explore advanced usage patterns, focusing on automation, efficient data selection, and strategic backup planning. These techniques are crucial for implementing a reliable, unattended backup system.
Scripting Backups
Manually running `restic backup` is fine for ad-hoc backups, but for regular, reliable protection, you need automation. Shell scripts (Bash on Linux/macOS, PowerShell on Windows) are a common way to achieve this.
- Writing Shell Scripts (Bash Example): A good backup script should be robust and informative. Key elements:
  - Configuration: Define repository, password file, paths to back up, and exclusion lists, preferably at the top or via external config files/environment variables.
  - Locking (simple script-level): Prevent multiple instances of your script from running simultaneously if it performs operations that shouldn't overlap (though Restic's own locking usually handles repository access). A simple pidfile mechanism can be used.
  - Pre-backup Hooks: Commands to run before the backup (e.g., dumping a database, stopping a service).
  - The Restic Backup Command: Carefully constructed with all necessary options.
  - Post-backup Hooks: Commands to run after the backup (e.g., restarting a service, cleaning up database dumps). This includes `forget` and `prune` operations.
  - Logging: Record what happened, any errors, and key statistics.
  - Error Handling: Detect failures and react appropriately (e.g., send a notification).

  Basic Bash Backup Script Example:

        #!/bin/bash

        # --- Configuration ---
        # Best practice: Set these via environment variables or a secure config file read by the script
        export RESTIC_REPOSITORY="sftp:my_sftp_backup_server:/srv/restic_storage/my_client_sftp_repo"
        export RESTIC_PASSWORD_FILE="/home/user/.config/restic/sftp_repo_password.txt" # Adjust path

        # Paths to back up (space-separated; paths containing spaces would need a Bash array)
        BACKUP_PATHS="/home/user/documents /etc /var/www"
        # Exclude file (one pattern per line)
        EXCLUDE_FILE="/home/user/.config/restic/restic_excludes.txt"
        # Log file
        LOG_FILE="/var/log/restic_backup.log"

        # Retention policy
        KEEP_DAILY=7
        KEEP_WEEKLY=4
        KEEP_MONTHLY=6
        KEEP_YEARLY=1

        # --- Script Logic ---
        # Overall job status, set to 1 as soon as any step fails
        JOB_STATUS=0

        # Function for logging with timestamp
        log() {
            echo "$(date '+%Y-%m-%d %H:%M:%S') - $1" | tee -a "${LOG_FILE}"
        }

        # Ensure password file exists and has correct permissions
        if [ ! -f "${RESTIC_PASSWORD_FILE}" ]; then
            log "ERROR: Restic password file not found at ${RESTIC_PASSWORD_FILE}"
            exit 1
        fi
        # Check permissions (optional, but good practice)
        # if [ "$(stat -c %a "${RESTIC_PASSWORD_FILE}")" != "600" ] && [ "$(stat -c %a "${RESTIC_PASSWORD_FILE}")" != "400" ]; then
        #     log "WARNING: Password file ${RESTIC_PASSWORD_FILE} has insecure permissions."
        # fi

        log "Starting Restic backup job for paths: ${BACKUP_PATHS}"

        # Pre-backup commands (example: dump a database)
        # log "Dumping PostgreSQL database..."
        # pg_dumpall -U postgres | gzip > /tmp/postgres_dump.sql.gz
        # BACKUP_PATHS="${BACKUP_PATHS} /tmp/postgres_dump.sql.gz" # Add dump to backup paths

        # Restic backup command
        # Using --verbose for more detailed logging from Restic
        # Using --tag "automated" for easy identification
        # Using --exclude-file if it exists
        EXCLUDE_OPTS=""
        if [ -f "${EXCLUDE_FILE}" ]; then
            EXCLUDE_OPTS="--exclude-file=${EXCLUDE_FILE}"
            log "Using exclude file: ${EXCLUDE_FILE}"
        fi

        log "Running restic backup..."
        # BACKUP_PATHS and EXCLUDE_OPTS are intentionally unquoted so they split into separate arguments
        restic backup ${BACKUP_PATHS} \
            --tag automated \
            ${EXCLUDE_OPTS} \
            --verbose >> "${LOG_FILE}" 2>&1 # Append stdout and stderr to log
        BACKUP_EXIT_CODE=$?

        if [ ${BACKUP_EXIT_CODE} -eq 0 ]; then
            log "Restic backup completed successfully."
        else
            log "ERROR: Restic backup failed with exit code ${BACKUP_EXIT_CODE}."
            JOB_STATUS=1
            # Add notification command here (e.g., mail, webhook)
            # exit 1 # Decide if script should terminate on backup failure
        fi

        # Post-backup commands (example: remove database dump)
        # log "Cleaning up PostgreSQL dump..."
        # rm -f /tmp/postgres_dump.sql.gz

        # Prune old snapshots (forget and prune)
        # Only run prune if backup was successful or if you want to prune regardless
        if [ ${BACKUP_EXIT_CODE} -eq 0 ] || [ "$1" = "--force-prune" ]; then # Allow forcing prune via argument
            log "Running restic forget and prune..."
            restic forget \
                --keep-daily ${KEEP_DAILY} \
                --keep-weekly ${KEEP_WEEKLY} \
                --keep-monthly ${KEEP_MONTHLY} \
                --keep-yearly ${KEEP_YEARLY} \
                --prune \
                --group-by paths,tags >> "${LOG_FILE}" 2>&1 # Group by paths AND tags for retention
            FORGET_EXIT_CODE=$?
            if [ ${FORGET_EXIT_CODE} -eq 0 ]; then
                log "Restic forget and prune completed successfully."
            else
                log "ERROR: Restic forget/prune failed with exit code ${FORGET_EXIT_CODE}."
                JOB_STATUS=1
            fi
        else
            log "Skipping forget/prune due to backup failure or policy."
        fi

        # (Optional) Check repository integrity periodically
        # if (( $(date +%u) == 7 )); then # e.g., run check on Sundays
        #     log "Running weekly repository check (no data read)..."
        #     restic check >> "${LOG_FILE}" 2>&1
        # fi

        log "Restic backup job finished."
        # Report failure to the caller (cron, systemd) if any step failed
        exit ${JOB_STATUS}

  Make this script executable: `chmod +x your_backup_script.sh`
  Remember to create the exclude file (`restic_excludes.txt`) and the password file with appropriate content and permissions.
- Logging and Error Handling in Scripts:
  - Logging: As shown above, redirect `stdout` and `stderr` from Restic commands to a log file (`>> "${LOG_FILE}" 2>&1`). Use `tee -a` if you want to see output on the console and log it. Prefix log entries with timestamps.
  - Error Handling: Check the exit code of Restic commands (`$?` in Bash). A non-zero exit code indicates a problem. For `restic backup`:
    - `0`: Success, a snapshot was created.
    - `1`: Fatal error, no snapshot was created.
    - `3`: Some source files could not be read; a snapshot was created but it is incomplete.

    Recent releases define additional codes (for example for a missing repository, a failed lock, or a wrong password), so check the documentation for your version. Based on the exit code, your script can log the error, send notifications (email, Slack, etc.), or take other actions. `set -e` at the top of a Bash script makes it exit immediately if any command fails, which can be useful but requires careful handling if you want to perform cleanup actions on failure.
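The script-level locking and exit-code handling described above can be sketched together. This is a minimal illustration under stated assumptions: `LOCK_DIR`, `acquire_lock`, `release_lock`, `describe_backup_exit`, and `guarded_backup` are hypothetical names, the lock uses `mkdir` (which is atomic) rather than a pidfile, and a stale lock left behind by a crash has to be removed by hand.

```shell
#!/bin/sh
# Sketch: guard a backup run with a lock directory and classify the exit code.
# All function and variable names here are hypothetical.

LOCK_DIR="${LOCK_DIR:-/tmp/restic_backup.lock}"

acquire_lock() {
    # mkdir fails if the directory already exists, so only one caller wins.
    if mkdir "$LOCK_DIR" 2>/dev/null; then
        echo $$ > "$LOCK_DIR/pid"
        return 0
    fi
    echo "Another backup appears to be running (lock: $LOCK_DIR)" >&2
    return 1
}

release_lock() {
    rm -rf "$LOCK_DIR"
}

describe_backup_exit() {
    case "$1" in
        0) echo "success: snapshot created" ;;
        1) echo "fatal: no snapshot was created" ;;
        3) echo "incomplete: snapshot created, but some files could not be read" ;;
        *) echo "unexpected exit code $1: see the restic documentation for your version" ;;
    esac
}

guarded_backup() {
    acquire_lock || return 1
    # Pass all arguments (paths, tags, excludes) straight to restic.
    restic backup "$@"
    code=$?
    release_lock
    describe_backup_exit "$code"
    return "$code"
}
```

For example, `guarded_backup /home/user/documents --tag automated` refuses to start while another run holds the lock and otherwise prints a one-line verdict that is easy to grep for in a log.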
- Using `--json` for Machine-Readable Output: Many Restic commands support the `--json` flag (e.g., `restic backup --json`, `restic snapshots --json`). This makes Restic output data in JSON format, which is much easier for scripts to parse than human-readable text.
  - Example: Get the ID of the last backup, e.g. by piping `restic snapshots --json --latest 1` into a JSON tool such as `jq`. This is useful for more advanced scripting where you need to programmatically use information from Restic.
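As a small illustration, `restic backup --json` prints one JSON message per line and ends with a message of type `summary` that carries the new snapshot's ID. The sketch below extracts that ID with `grep` and `sed` only, so it has no extra dependencies; `snapshot_id_from_summary` is a hypothetical helper name, and a real JSON parser such as `jq` is the more robust choice where available.

```shell
#!/bin/sh
# Sketch: read `restic backup --json` output on stdin and print the new snapshot ID.
# snapshot_id_from_summary is a hypothetical helper name.

snapshot_id_from_summary() {
    # Keep only the summary message, then pull out the snapshot_id value.
    grep '"message_type":"summary"' |
        sed -n 's/.*"snapshot_id":"\([0-9a-f]*\)".*/\1/p'
}
```

Typical use would be `SNAP_ID=$(restic backup --json ./source_data | snapshot_id_from_summary)`, after which `$SNAP_ID` can be logged or passed to later commands.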
Scheduling Backups
Once you have a reliable backup script, you need to schedule it to run automatically.
- `cron` on Linux/macOS: `cron` is the standard job scheduler on Unix-like systems.
  - Edit your user's crontab: `crontab -e`
  - Add a line to schedule your script. The format is: `minute hour day_of_month month day_of_week /path/to/command`
  - Example: Run `/home/user/bin/my_restic_backup.sh` every day at 2:30 AM:

        # Ensure RESTIC_PASSWORD_FILE and other env vars are set within the script
        # or in the crontab itself (though less secure for passwords in crontab).
        # It's best if the script handles its environment.
        30 2 * * * /home/user/bin/my_restic_backup.sh > /tmp/restic_cron.log 2>&1

    - It's good practice to redirect `stdout` and `stderr` from the cron job to a log file, or to `/dev/null` if the script itself handles logging.
    - Ensure your script uses absolute paths for commands or sets its own `PATH` environment variable, as cron jobs run with a minimal environment.
    - Important: Ensure `RESTIC_PASSWORD_FILE` and `RESTIC_REPOSITORY` are correctly set or accessible in the cron environment. Often, it's best to `export` these at the beginning of your script.
- Task Scheduler on Windows: Windows uses Task Scheduler for automated jobs.
  - Open "Task Scheduler" (search in Start Menu).
  - Click "Create Basic Task..." or "Create Task..." for more options.
  - Name/Description: Give your task a name (e.g., "Restic Daily Backup").
  - Trigger: Define when it should run (e.g., "Daily", set time).
  - Action: "Start a program".
    - Program/script: `powershell.exe` (or `pwsh.exe` for PowerShell Core)
    - Add arguments: `-ExecutionPolicy Bypass -File "C:\path\to\your_restic_backup.ps1"`
    - (Or directly run `restic.exe` if you pass all arguments on the command line here, but a script is more flexible).
  - Conditions/Settings: Configure power options, what to do if task fails, etc.
  - Run with highest privileges: May be needed if backing up system files.
  - User Account: Specify the user account under which the task should run. This account needs access to the Restic binary, password file, and data to be backed up.

  A PowerShell backup script (`.ps1`) would be analogous to the Bash script, using PowerShell cmdlets for file operations, logging, and environment variables (`$env:RESTIC_REPOSITORY`, `$env:RESTIC_PASSWORD_FILE`).
- systemd Timers on Linux: systemd offers a more modern and flexible alternative to cron on Linux systems that use systemd. It involves two unit files: a .service file (defines what to run) and a .timer file (defines when to run it).
  - Create a .service file (e.g., /etc/systemd/system/restic-backup.service, or ~/.config/systemd/user/restic-backup.service for a user service). systemd only treats a # at the start of a line as a comment, so every comment sits on its own line:
      [Unit]
      Description=Restic Backup Service
      # Add After=network-online.target if backing up to a remote repo
      # Add Requires=network-online.target if it MUST have network

      [Service]
      Type=oneshot
      # If running as a system service but want a specific user:
      # User=your_backup_user
      # Group=your_backup_group
      # Absolute path to your script
      ExecStart=/usr/local/bin/my_restic_backup.sh
      # Environment="RESTIC_REPOSITORY=..."
      # Environment="RESTIC_PASSWORD_FILE=..."
      # (Alternatively, set these inside my_restic_backup.sh)
      StandardOutput=append:/var/log/restic-backup-service.log
      StandardError=append:/var/log/restic-backup-service.error.log

      [Install]
      # Or default.target for user services
      WantedBy=multi-user.target
  - Create a .timer file (e.g., /etc/systemd/system/restic-backup.timer or ~/.config/systemd/user/restic-backup.timer):
      [Unit]
      Description=Run Restic Backup Daily
      # Allow manual starting and stopping of the timer
      RefuseManualStart=no
      RefuseManualStop=no

      [Timer]
      # Run daily at 2:30 AM
      OnCalendar=*-*-* 02:30:00
      # Alternatively, run 15 minutes after boot and daily thereafter
      # OnBootSec=15min
      # OnUnitActiveSec=1d
      # Run job if missed due to downtime when machine next boots
      Persistent=true

      [Install]
      WantedBy=timers.target
  - Enable and Start the Timer:
    - If created as system services (in /etc/systemd/system/), the same commands apply, run as root and without --user:
        sudo systemctl daemon-reload
        sudo systemctl enable restic-backup.timer
        sudo systemctl start restic-backup.timer
        # Check status:
        systemctl list-timers
    - If created as user services (in ~/.config/systemd/user/):
        systemctl --user daemon-reload
        systemctl --user enable restic-backup.timer
        systemctl --user start restic-backup.timer
        # Check status:
        systemctl --user list-timers
        # User services need lingering enabled for the user if they should run when the user is not logged in:
        # sudo loginctl enable-linger your_username
  systemd offers better logging integration (via journalctl), dependency management, and resource control compared to cron.
Excluding Files and Directories Effectively
You often don't want to back up everything (e.g., cache files, temporary files, large unimportant downloads). Restic provides several ways to exclude files.
- --exclude <pattern>: Specify a pattern to exclude. Can be used multiple times. Patterns are shell glob patterns (e.g., *.tmp, node_modules).
  - Example:
      restic backup /home/user --exclude='*.log' --exclude='/home/user/Downloads'
  - Patterns are tested against the full path of each item. A pattern that starts with / is anchored at the root; any other pattern can match at any depth in the tree.
  - If a pattern ends with /, it only matches directories.
  - ** can match any sequence of characters including path separators, e.g., **/cache/** would match any file or directory under a directory named cache anywhere in the backup.
- --exclude-file <filepath>: Provide a path to a text file containing one exclusion pattern per line. Comments (lines starting with #) and empty lines are ignored. This is much cleaner for managing many exclusions. Example restic_excludes.txt:
    # Cache directories
    **/.cache
    **/Cache
    **/cache
    # Temporary files
    *.tmp
    *.temp
    *.~
    # Specific large directories I don't need backed up
    /home/user/Downloads/isos
    /home/user/SteamLibrary
    # Node.js dependencies
    node_modules/
  Then run:
    restic backup /home/user --exclude-file=/path/to/restic_excludes.txt
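When an exclude file grows, it can help to print only the lines that count as patterns. The helper below is a minimal sketch that follows the rule stated above (lines starting with # and empty lines are ignored); the function name effective_patterns is made up for this example and is not part of Restic.

```shell
#!/bin/sh
# effective_patterns FILE: print the lines of an exclude file that count as
# patterns, skipping comment lines (starting with #) and blank lines.
effective_patterns() {
    grep -v -E '^(#|[[:space:]]*$)' "$1"
}
```

Running effective_patterns /path/to/restic_excludes.txt before a backup gives a quick review of what will actually be excluded.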
- --exclude-caches: This special option tells Restic to look for directories that are marked with a "Cache Directory Tag." This is a file named CACHEDIR.TAG inside a directory. The content of this file should be as specified by the Cache Directory Tagging Standard. If Restic finds such a tagged directory, it will exclude it. Many applications are starting to adopt this standard. To use:
    restic backup /home/user --exclude-caches
  You can combine this with other --exclude options.
- --iexclude <pattern> and --iexclude-file <filepath>: Case-insensitive versions of --exclude and --exclude-file.
- Order of Precedence for Include/Exclude Rules: Restic evaluates exclude rules in the order they are given. Generally, the last matching rule wins, which matters mainly for negated patterns (a leading !) that re-include something an earlier pattern excluded. --include belongs to restic restore, not to backup selection, which is path-based.
  - For restic restore, use either --include or --exclude patterns in a single invocation; Restic treats the two as mutually exclusive.
  - It's often simpler to manage exclusions for backup primarily through --exclude and --exclude-file. For backup, if you specify multiple source paths and also exclusions, the exclusions apply to all items being considered from those source paths.
Backup Strategies
A "strategy" involves deciding what to back up, how often, and where to store it, guided by principles like the 3-2-1 rule.
-
Full vs. Incremental (Restic is always "incremental forever" effectively):
- Traditional Full Backup: A complete copy of all selected data.
- Traditional Incremental Backup: Copies data changed since the last backup (full or incremental).
- Traditional Differential Backup: Copies data changed since the last full backup.
- Restic's Approach: Restic doesn't strictly follow these traditional models for its core mechanism. Every Restic backup is like a "full" snapshot in terms of what it represents (a complete view of the selected data at that point in time). However, due to deduplication, it only transfers and stores the changed data (new unique chunks) compared to what's already in the repository.
- The first backup of a dataset is effectively a "full" transfer.
- Subsequent backups are "incremental" in terms of data transfer and storage, but result in a new, complete, independent snapshot. This is often called "incremental forever" or "deduplicated fulls." You don't need to manage chains of full + incrementals; every snapshot is self-contained for restore.
- 3-2-1 Backup Rule and how Restic fits in: A widely respected guideline for data protection:
  - 3 Copies of Your Data: Your primary (live) data + two backups.
  - 2 Different Storage Media: Store copies on at least two distinct types of storage (e.g., internal HDD, external HDD, LTO tape, cloud storage). This protects against failure of a specific medium type.
  - 1 Off-site Copy: At least one backup copy should be stored in a different physical location (e.g., cloud, friend's house, office safe). This protects against local disasters like fire, flood, or theft.
  How Restic Helps Implement 3-2-1:
  - Multiple Copies: You can create multiple Restic repositories.
    - Repo 1: Local external HDD (Copy 2, Medium 1, On-site)
    - Repo 2: Remote SFTP server or S3 cloud bucket (Copy 3, Medium 2, Off-site)
  - You would run your Restic backup script targeting both repositories (or use restic copy to synchronize snapshots between an on-site and off-site repository).
  - Restic's client-side encryption ensures your off-site copy is secure even if stored on third-party infrastructure.
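Targeting both repositories from one script can be sketched as a small loop. This is an illustration, not a complete backup script: the function name backup_all and the RESTIC override variable are assumptions made for this example, and it expects the repository password to reach Restic through the environment (for example RESTIC_PASSWORD_FILE). Repositories with different passwords would need per-repository handling that is not shown here.

```shell
#!/bin/sh
# RESTIC may be overridden, e.g. RESTIC=echo prints the commands instead of
# running them, which is handy for a preview.
RESTIC="${RESTIC:-restic}"

# backup_all SOURCE REPO...: back up SOURCE to every REPO in turn.
# Keeps going after a failure and returns non-zero if any backup failed.
backup_all() {
    src="$1"
    shift
    failed=0
    for repo in "$@"; do
        if ! "$RESTIC" -r "$repo" backup "$src"; then
            echo "backup to $repo failed" >&2
            failed=1
        fi
    done
    return "$failed"
}
```

A call such as backup_all /home/user /mnt/external_hdd/restic_repo sftp:backup-host:/srv/restic/repo (both repository locations are hypothetical) covers the on-site and the off-site copy in one run. Continuing after a failed repository is a deliberate choice here, so an unreachable off-site server does not block the local copy.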
- Backing Up Different Types of Data:
  - User Files (Documents, Photos, etc.): Straightforward. Point Restic to ~/Documents, ~/Pictures, etc.
  - Application Configuration Files: Usually in /etc, ~/.config, ~/.local/share. Back these up.
  - Databases (PostgreSQL, MySQL, SQLite, etc.):
    - Critical: You cannot reliably back up live database files directly by just copying them, as they might be in an inconsistent state.
    - Solution: Use the database's native dump tool to create a consistent backup file first, then back up that dump file with Restic.
      - PostgreSQL: pg_dump or pg_dumpall
      - MySQL/MariaDB: mysqldump
      - SQLite: The .sqlite file can often be copied if no process is writing to it, or use the .backup command in the sqlite3 CLI.
    - Your pre-backup hook in the script would run the dump; your post-backup hook might remove the dump file.
  - Virtual Machines:
    - If VMs are shut down, you can back up their disk image files.
    - If live, it's better to use snapshot capabilities of the hypervisor (if available) to get a consistent state, then back up that snapshot or export. Or, run Restic inside the VM to back up its critical data.
  - Docker Volumes:
    - Stop containers using the volume.
    - Back up the volume's data from the Docker host path (e.g., /var/lib/docker/volumes/myvolume/_data).
    - Or, run Restic in a temporary container that mounts the volume and the Restic config/cache, then performs the backup.
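The pre-backup and post-backup hooks described for databases can be sketched as a small wrapper. This is a minimal sketch under stated assumptions: the function name run_with_hooks is invented for this example, and each hook is passed as a shell command string.

```shell
#!/bin/sh
# run_with_hooks PRE BACKUP POST
# Runs PRE (e.g. a database dump), runs BACKUP only if PRE succeeded, and
# always runs POST (e.g. removing the dump file). Returns the exit status of
# PRE if it failed, otherwise the exit status of BACKUP.
run_with_hooks() {
    pre="$1"
    backup="$2"
    post="$3"
    rc=0
    if sh -c "$pre"; then
        sh -c "$backup" || rc=$?
    else
        rc=$?
        echo "pre-backup hook failed (exit $rc); skipping backup" >&2
    fi
    # Cleanup runs in every case so stale dump files do not pile up.
    sh -c "$post" || echo "post-backup hook failed" >&2
    return "$rc"
}
```

For PostgreSQL this might be called as run_with_hooks 'pg_dump mydb > /var/backups/mydb.sql' 'restic backup /var/backups/mydb.sql' 'rm -f /var/backups/mydb.sql', where the database name mydb and the dump path are hypothetical.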
Workshop Automated Backup Script with Exclusions
This workshop will guide you through creating a Bash script for automated backups, incorporating exclusions and basic logging, and then scheduling it with cron.
Goals:
- Create an exclusion file.
- Develop a Bash script that:
  - Uses environment variables for repository and password file (assumed to be set in the script or calling environment).
  - Backs up your user's home directory (or a chosen subdirectory for safety/speed in the workshop).
  - Uses the exclusion file.
  - (Optionally) Excludes files larger than a certain size. Restic 0.10.0 and later provide --exclude-larger-than <size> for this (e.g., --exclude-larger-than 1G); on older versions you would have to pre-filter with find or a similar tool. For this workshop, we'll focus on pattern exclusions.
  - Logs its activity to a file.
- Set up a cron job (or systemd timer if you prefer and are on Linux) to run this script.
Prerequisites:
- A Linux or macOS environment (for Bash and cron).
- Restic installed.
- An initialized Restic repository (local or remote, e.g., the SFTP one from Workshop 4).
- RESTIC_REPOSITORY and RESTIC_PASSWORD_FILE environment variables should be handled by the script or pre-set.
Steps:
- Prepare Environment and Test Data:
  - Repository: Decide which repository to use. For this workshop, using the local ~/restic_workshop/my_local_repo is fine, or use the SFTP one if you have it running (sftp:my_sftp_backup_server:/srv/restic_storage/my_client_sftp_repo).
  - Password File: Ensure the corresponding password file is set up (e.g., ~/restic_workshop/mypass.txt or ~/restic_workshop/sftp_pass.txt).
  - Backup Source: For safety and speed during the workshop, let's not back up your entire home directory. Create a dedicated test directory.
      # In ~/restic_workshop
      mkdir -p ./home_sim/Documents
      mkdir -p ./home_sim/Downloads
      mkdir -p ./home_sim/.cache  # A cache directory to exclude
      echo "My important document" > ./home_sim/Documents/doc1.txt
      echo "A temporary download" > ./home_sim/Downloads/big_file.iso.tmp  # A temp file
      echo "Cache content" > ./home_sim/.cache/app_cache_data
      touch ./home_sim/another_file.log  # A log file
  - Target for Backup: We will back up ./home_sim.
- Create an Exclusion File: Create ~/restic_workshop/my_restic_excludes.txt:
    # Exclude all .cache directories and their contents
    **/.cache
    # Exclude temporary files
    *.tmp
    *.temp
    # Exclude log files
    *.log
    # Exclude specific subdirectories of Downloads if needed
    # home_sim/Downloads/unwanted_stuff/
  Note: The paths in the exclude file are relative to the items being scanned. **/.cache will find any directory named .cache anywhere within the backup source. If you backed up /home/user, and had /home/user/app/.cache, it would be matched. If backing up ./home_sim, then ./home_sim/.cache is matched.
- Develop the Backup Script: Create ~/restic_workshop/do_backup.sh with the following content. Adjust REPO_URL and PASS_FILE carefully.
    #!/bin/bash

    # Exit on unexpected errors. Restic's exit codes are captured explicitly
    # below (with "|| VAR=$?") so that set -e does not end the script first.
    set -e

    # --- Configuration ---
    # !! IMPORTANT !! SET THESE TO YOUR ACTUAL VALUES
    # Using the local repo for this workshop example for simplicity
    REPO_URL="${HOME}/restic_workshop/my_local_repo"
    # Assumes this file contains the password for REPO_URL
    PASS_FILE="${HOME}/restic_workshop/mypass.txt"

    # Source directory to back up
    SOURCE_DIR="${HOME}/restic_workshop/home_sim"

    # Exclude file
    EXCLUDE_FILE="${HOME}/restic_workshop/my_restic_excludes.txt"

    # Log file
    LOG_DIR="${HOME}/restic_workshop/logs"
    LOG_FILE="${LOG_DIR}/restic_backup_$(date +%Y-%m-%d).log"

    # Retention (applied after successful backup)
    KEEP_LAST=5
    KEEP_DAILY=7
    KEEP_WEEKLY=4
    # --- End Configuration ---

    # Ensure log directory exists
    mkdir -p "${LOG_DIR}"

    # Function for logging
    log_msg() {
        echo "$(date '+%Y-%m-%d %H:%M:%S') - $1" | tee -a "${LOG_FILE}"
    }

    # Check if source directory exists
    if [ ! -d "${SOURCE_DIR}" ]; then
        log_msg "ERROR: Source directory ${SOURCE_DIR} not found."
        exit 1
    fi

    # Check if password file exists
    if [ ! -f "${PASS_FILE}" ]; then
        log_msg "ERROR: Restic password file ${PASS_FILE} not found."
        exit 1
    fi
    # It's good practice to ensure password file has restrictive permissions (e.g. 600 or 400)
    # chmod 600 "${PASS_FILE}"  # Uncomment if you want script to enforce this

    # Set Restic environment variables
    export RESTIC_REPOSITORY="${REPO_URL}"
    export RESTIC_PASSWORD_FILE="${PASS_FILE}"

    log_msg "=== Starting Restic Backup for ${SOURCE_DIR} ==="

    # Backup
    log_msg "Running backup..."
    # Using nice and ionice to lower priority if running from cron
    # (ionice is Linux-only; remove it on macOS).
    # Restic's own verbose output is appended to the log file.
    BACKUP_EC=0
    nice -n 10 ionice -c 3 restic backup "${SOURCE_DIR}" \
        --exclude-file="${EXCLUDE_FILE}" \
        --tag "automated_home_sim" \
        --verbose >> "${LOG_FILE}" 2>&1 || BACKUP_EC=$?

    if [ ${BACKUP_EC} -eq 0 ]; then
        log_msg "Backup completed successfully."
    elif [ ${BACKUP_EC} -eq 3 ]; then
        # Exit code 3: a snapshot was created, but some source files could not be read.
        log_msg "Backup completed with warnings (some source files could not be read). Exit code: ${BACKUP_EC}"
        # We still proceed with forget/prune in this case.
    else
        # Exit code 1 (and anything else): fatal error, no snapshot was created.
        log_msg "ERROR: Backup failed with exit code ${BACKUP_EC}."
        log_msg "See Restic output above for details."
        log_msg "=== Backup Job FAILED ==="
        exit ${BACKUP_EC}  # Exit script with Restic's error code
    fi

    # Forget and Prune (only if backup was successful or had warnings)
    log_msg "Running forget and prune..."
    FORGET_EC=0
    nice -n 10 ionice -c 3 restic forget \
        --keep-last ${KEEP_LAST} \
        --keep-daily ${KEEP_DAILY} \
        --keep-weekly ${KEEP_WEEKLY} \
        --prune \
        --group-by "paths,tags" >> "${LOG_FILE}" 2>&1 || FORGET_EC=$?

    if [ ${FORGET_EC} -eq 0 ]; then
        log_msg "Forget and prune completed successfully."
    else
        log_msg "ERROR: Forget/prune failed with exit code ${FORGET_EC}."
    fi

    log_msg "=== Backup Job Finished ==="
    # Report a failed forget/prune to the caller (cron, systemd) instead of hiding it.
    exit ${FORGET_EC}
  Make it executable:
    chmod +x ~/restic_workshop/do_backup.sh
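For reference when reading the exit-code checks in a backup script: restic backup documents exit code 0 for success, 1 for a fatal error with no snapshot created, and 3 for a snapshot that was created although some source data could not be read. The helper below is a small sketch that turns those codes into log text; the function name describe_restic_exit is made up for this example, and newer Restic releases define further codes that it simply reports as "other".

```shell
#!/bin/sh
# describe_restic_exit CODE: print a short description of a "restic backup"
# exit code. Only the long-documented codes 0, 1 and 3 are spelled out.
describe_restic_exit() {
    case "$1" in
        0) echo "success" ;;
        1) echo "fatal error, no snapshot created" ;;
        3) echo "snapshot created, but some source data could not be read" ;;
        *) echo "other error (see the restic documentation for your version)" ;;
    esac
}
```

A log line such as "restic finished: $(describe_restic_exit "$?")" placed directly after the restic call keeps the log readable without looking up the numbers.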
- Test the Script Manually: Run the script once from a terminal:
    ~/restic_workshop/do_backup.sh
  - Check the output on the console and in the log file (~/restic_workshop/logs/restic_backup_YYYY-MM-DD.log).
  - Verify a new snapshot appears:
      restic -r "${HOME}/restic_workshop/my_local_repo" -p "${HOME}/restic_workshop/mypass.txt" snapshots
  - Check if exclusions worked. Use restic ls <snapshot_id> to inspect the contents of the new snapshot. You should not see the .cache directory, *.tmp, or *.log files from home_sim. E.g.:
      # Get latest snapshot ID for the specific path and tag
      LATEST_ID=$(restic -r "${HOME}/restic_workshop/my_local_repo" -p "${HOME}/restic_workshop/mypass.txt" snapshots --json --latest 1 --path "${HOME}/restic_workshop/home_sim" --tag "automated_home_sim" | jq -r '.[0].short_id')
      echo "Inspecting snapshot: $LATEST_ID"
      restic -r "${HOME}/restic_workshop/my_local_repo" -p "${HOME}/restic_workshop/mypass.txt" ls "$LATEST_ID"
    You should see home_sim/Documents/doc1.txt but not the excluded files/folders.
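Checking a long listing by eye is error-prone, so the exclusion check can also be sketched as a filter over the output of restic ls. This is a minimal sketch: the function name find_leaks is invented here, and its regular expression only mirrors the three workshop patterns (.cache directories, *.tmp and *.temp, *.log).

```shell
#!/bin/sh
# find_leaks: read a path listing on stdin (for example from "restic ls")
# and print every path that the workshop's exclude patterns should have
# removed. No output means no excluded item made it into the snapshot.
find_leaks() {
    grep -E '(/\.cache(/|$)|\.(tmp|temp|log)$)' || true
}
```

With the repository and password file set in the environment, restic ls latest | find_leaks should print nothing for a snapshot taken with the workshop's exclusion file.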
- Set up a Cron Job:
  - Open your crontab:
      crontab -e
  - Add a line to run the script. For testing, let's run it every 5 minutes. (Remember to remove or change this after testing!)
      # Example: Run every 5 minutes for testing.
      # IMPORTANT: Change this to a sane schedule like daily after testing.
      */5 * * * * /bin/bash ${HOME}/restic_workshop/do_backup.sh
    - Note: Cron often runs with a very minimal environment. The script explicitly sets RESTIC_REPOSITORY and RESTIC_PASSWORD_FILE using absolute paths, which is good. nice and ionice are used to make the backup less resource-intensive if run in the background.
    - PATH issues: If restic or other commands like nice, ionice, jq are not found, you might need to specify their full paths in the script or set a PATH variable at the top of the script: export PATH=/usr/local/bin:/usr/bin:/bin:$PATH.
  - Wait for it to run. Check the log file in ~/restic_workshop/logs/.
  - Verify new snapshots are created.
- Once testing is complete, change the cron schedule to something reasonable, e.g., daily at 3 AM:
    0 3 * * * /bin/bash ${HOME}/restic_workshop/do_backup.sh
  And save the crontab.
- Alternative for systemd users (Linux): You could adapt the systemd timer example from the theory section to call do_backup.sh. The .service file's ExecStart would point to your script. Ensure the user running the systemd service (if it's a system service) has access to the repository, password file, and source data. User services are often easier for home directory backups.
This workshop provided a practical template for an automated Restic backup script. Remember that robust scripting also involves more comprehensive error notification (e.g., email alerts), which was omitted for brevity but is crucial for production systems. Regularly check your logs and test your restores!
6. Restic Internals and Troubleshooting
Understanding some of Restic's internal workings can be immensely helpful when troubleshooting issues or trying to optimize performance. This section provides a glimpse into Restic's data structures, common problems and their solutions, performance tuning tips, and how to handle data migration or repository upgrades.
Understanding Restic's Data Structures
Restic organizes data within its repository in a specific, well-defined way. The repository itself is just a directory (or an S3 bucket prefix, etc.) containing several subdirectories and a config file.
- config file:
  - Located at the root of the repository.
  - A JSON document containing the repository ID, the repository format version, and the chunker polynomial used for content-defined chunking.
  - Like almost everything else in the repository, this file is encrypted. Restic first unlocks the master key through a key file and then reads the config.
- keys/ directory:
  - Contains key files, each holding an encrypted copy of the master encryption key and master MAC key for the repository. The key files themselves are plain JSON so that Restic can read the key derivation parameters before it has any key.
  - The master keys inside are encrypted using a key derived from your repository password via scrypt.
  - This allows several passwords per repository, and password changes without re-encrypting any data: restic key add creates a key file for an additional password, restic key passwd changes the current password, restic key remove deletes a key, and restic key list shows the existing ones.
- snapshots/ directory:
  - Stores one file per snapshot. Each file is named with the full SHA-256 ID of the snapshot.
  - These files are JSON structures containing metadata about the backup:
    - Time of backup.
    - Hostname where the backup was made.
    - Username.
    - Tags associated with the snapshot.
    - Paths that were backed up.
    - A pointer (SHA-256 hash) to the root tree object that represents the top-level directory structure of this snapshot.
  - Snapshot files are encrypted.
- index/ directory:
  - Contains index files. Each file is named with a SHA-256 ID.
  - These files map blob IDs (SHA-256 hashes of content) to the pack files where these blobs are stored, along with their offset and length within the pack.
  - The index is crucial for quickly locating data and for deduplication (checking if a blob already exists).
  - Index files are encrypted. Restic loads these into memory (or a local cache) for fast lookups.
  - restic rebuild-index (named restic repair index in newer releases) reconstructs these files if they get corrupted. restic prune also rebuilds the index.
- locks/ directory:
  - Contains lock files created by Restic operations to prevent concurrent modifications that could corrupt the repository.
  - Each lock file is named with a unique ID.
  - Stale locks (left behind after a crash) can be removed with restic unlock.
- data/ directory:
  - This is where the actual backup data is stored in pack files.
  - Pack files are sorted into subdirectories named after the first two hexadecimal digits of their ID (e.g., 00/, 0a/, ff/), and each file is named with the full SHA-256 ID of the pack (e.g., 00/00ab12ef...). This sharding helps manage very large numbers of pack files.
  - Each pack file contains multiple blobs concatenated together.
  - Pack files are encrypted.
Blobs (the fundamental units): Restic distinguishes two types of blobs, both stored in pack files:
  - Data Blobs:
    - These store the encrypted content of your file chunks. In repositories that use format version 2, chunks are compressed before they are encrypted.
    - When Restic chunks a file, each chunk's plaintext content is hashed with SHA-256. This hash becomes the ID of the data blob.
  - Tree Blobs:
    - These store (encrypted) directory listings and file metadata.
    - A tree blob represents a single directory. It contains a list of "nodes." Each node can be:
      - A file: storing its name, size, modification time, permissions, and a list of data blob IDs (hashes) that constitute its content.
      - A subdirectory: storing its name and a pointer (hash) to another tree blob representing that subdirectory.
      - Other filesystem objects like symlinks.
    - The structure is hierarchical: a snapshot points to a root tree blob, which points to other tree blobs for subdirectories, and eventually to data blobs for file content.
    - Tree blobs are also deduplicated. If a directory's content (list of files and subdirs with their metadata) hasn't changed, its tree blob hash will be the same, and it won't be stored again.
- How it all fits together (simplified backup process):
  - Restic scans the files/directories to be backed up.
  - For each file, it performs Content Defined Chunking.
  - Each chunk's content is hashed (SHA-256). This hash is the data blob's ID.
  - Restic checks its index: does this data blob ID already exist?
    - If yes: Deduplicated. Restic just notes the ID.
    - If no: The chunk is compressed (only in repositories using format version 2), encrypted, and eventually written to a pack file. The index is updated.
  - For each directory, Restic creates a list of its contents (files, subdirectories, metadata, pointers to their respective data/tree blob IDs). This list forms a tree blob.
  - The tree blob is hashed. This hash is its ID. It's then checked against the index for deduplication, encrypted, and stored if new.
  - This process continues recursively up to the root(s) of the backup.
  - Finally, a snapshot file is created, pointing to the root tree blob(s) of the backup, along with other metadata (time, host, tags). This snapshot file is encrypted and saved.
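The "hash, look up, store only if new" core of this process can be tried out with a toy content-addressed store. This is a deliberately simplified sketch and not how Restic is implemented: it uses fixed 4 KiB chunks instead of content-defined chunking, does no encryption or packing, and assumes the GNU coreutils tools split and sha256sum are available. The function name store_file is made up for this example.

```shell
#!/bin/sh
# store_file FILE STORE_DIR: split FILE into fixed 4 KiB chunks and store each
# chunk under its SHA-256 hash. Chunks whose hash already exists are skipped.
# Prints how many chunks were new and how many were deduplicated.
store_file() {
    file="$1"
    store="$2"
    tmp=$(mktemp -d) || return 1
    mkdir -p "$store"
    split -b 4096 "$file" "$tmp/chunk."
    new=0
    dup=0
    for chunk in "$tmp"/chunk.*; do
        [ -e "$chunk" ] || continue        # empty input produces no chunks
        id=$(sha256sum "$chunk" | cut -d' ' -f1)
        if [ -e "$store/$id" ]; then
            dup=$((dup + 1))               # already stored: only note the ID
        else
            cp "$chunk" "$store/$id"       # new content: store it under its hash
            new=$((new + 1))
        fi
    done
    rm -rf "$tmp"
    echo "new=$new dup=$dup"
}
```

Storing the same file twice reports only duplicates on the second run, which is the effect that makes repeated Restic backups of unchanged data cheap. Fixed-size chunks lose this property as soon as bytes are inserted near the start of a file, which is the problem content-defined chunking addresses.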
Troubleshooting Common Issues
Even with a robust tool like Restic, you might encounter issues. Here's how to approach some common ones:
- "Repository not found" or "password incorrect" or "Fatal: unable to open_repository: repository does not exist":
  - Cause:
    - Incorrect repository path (-r or RESTIC_REPOSITORY). Double-check for typos, correct protocol (e.g., sftp:, s3:), and server details.
    - Incorrect password (either typed interactively or in RESTIC_PASSWORD_FILE / RESTIC_PASSWORD). Restic reports this differently from a missing repository: a wrong password fails with "wrong password or no key found", while a missing or unreachable repository fails when Restic tries to open the config file. Read the exact message to decide which cause to pursue.
    - Network issues preventing access to a remote repository.
    - Permissions issues (Restic process cannot read/write to the local repository path or connect to the remote one).
    - For S3/B2/Azure/GCS: Incorrect credentials (access keys, secrets, account IDs) or bucket/container names. Ensure environment variables like AWS_ACCESS_KEY_ID are correctly set and exported.
  - Troubleshooting:
    - Verify the repository path meticulously.
    - If using RESTIC_PASSWORD_FILE, ensure the file path is correct and the file contains only the password and a single newline at the end (some systems are sensitive to extra newlines or spaces). cat -A your_password_file can show hidden characters.
    - Try providing the password interactively to rule out password file issues.
    - Test network connectivity to remote hosts (e.g., ping, ssh, mc ls ALIAS/bucket if using MinIO client).
    - Check file system permissions for local repositories.
    - For cloud backends, double-check credentials and bucket/container policies in the cloud provider's console. Use verbose flags with Restic (e.g., -vv) or with the cloud provider's CLI tool to get more diagnostic info.
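The password file checks suggested above can be sketched as a small function. It is a sanity check only, written under the assumption that a clean password file is one line without carriage returns or trailing whitespace; it makes no claim about which of these a given Restic version tolerates. The function name check_password_file is invented for this example.

```shell
#!/bin/sh
# check_password_file FILE: report common formatting problems in a password
# file. Prints "ok" and returns 0 when none are found, otherwise prints one
# warning per problem and returns 1.
check_password_file() {
    f="$1"
    if [ ! -r "$f" ]; then
        echo "error: cannot read $f"
        return 1
    fi
    problems=0
    lines=$(($(wc -l < "$f")))
    if [ "$lines" -gt 1 ]; then
        echo "warning: file has more than one line"
        problems=1
    fi
    cr=$(printf '\r')
    if grep -q "$cr" "$f"; then
        echo "warning: carriage return found (Windows line ending?)"
        problems=1
    fi
    if grep -q '[[:space:]]$' "$f"; then
        echo "warning: trailing whitespace at end of a line"
        problems=1
    fi
    if [ "$problems" -eq 0 ]; then
        echo "ok"
    fi
    return "$problems"
}
```

Running check_password_file "$RESTIC_PASSWORD_FILE" before suspecting the network or the repository rules out the most common formatting problems quickly.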
- Stale Locks (restic unlock):
  - Symptom: Commands fail with "repository is locked by another process" or similar.
  - Cause: A previous Restic process (backup, prune, etc.) terminated uncleanly (crashed, killed, network drop) without removing its lock file from the locks/ directory in the repository.
  - Solution:
    - Verify: Make absolutely sure no other Restic process is legitimately accessing the repository. Check ps aux | grep restic on relevant machines.
    - List Locks: restic list locks prints the IDs of existing lock files, and restic cat lock <lock_id> shows a lock's hostname, user, process ID, and creation time. This can help identify if a lock is genuinely old or potentially active.
    - Remove Locks: restic unlock removes locks that Restic considers stale. restic unlock --remove-all removes all locks, including ones held by running processes. Use it with extreme caution, only when certain no other process is active.
  - Prevention: Ensure scripts handle signals gracefully (trap SIGINT, SIGTERM) to attempt clean shutdown. Robust network connections help for remote repositories.
- Slow Backups/Restores (Network, Disk I/O, CPU):
  - Cause & Troubleshooting:
    - Network (for remote repos):
      - Bandwidth: Is your upload/download speed a bottleneck? Use speed test tools. Restic offers --limit-upload and --limit-download (in KiB/s) to throttle itself if it's overwhelming your connection.
      - Latency: High latency (long ping times) to the remote server can significantly slow down operations involving many small file transfers or interactions, like SFTP.
      - Packet Loss: Can cripple performance. ping -c 100 your_server can show packet loss.
    - Disk I/O (Client and/or Server):
      - Slow hard drives (especially SMR drives for writes, or very fragmented drives).
      - Use tools like iostat, iotop (Linux) or Resource Monitor (Windows) to check disk activity and queue lengths.
      - For local repositories, ensure the disk is healthy and has enough free space.
      - For SFTP servers, the server's disk performance is critical.
    - CPU (Client):
      - Restic performs chunking, hashing, and encryption (client-side). This is CPU-intensive. Older or low-power CPUs can be a bottleneck.
      - Check CPU usage during backup (top, htop on Linux).
      - Restic tries to use multiple CPU cores.
    - Small Files: Backing up a huge number of very small files can be slow due to metadata overhead per file and the overhead of individual operations, regardless of raw throughput.
    - Repository State: A repository needing a prune or with a very fragmented index might perform sub-optimally.
    - Antivirus/Security Software: On the client, such software might be scanning files Restic reads/writes, slowing it down. Try temporarily disabling or creating exceptions (with caution).
- Corrupted Repository (restic check --read-data helps):
  - Symptom: restic check reports errors. restic backup or restic restore might fail with errors about missing blobs, pack files, or MAC verification failures.
  - Cause:
    - Hardware issues (failing disk on client or server, bad RAM).
    - Filesystem corruption.
    - Bugs in Restic (rare, but possible).
    - Manual tampering with repository files.
    - Unreliable network causing silent data corruption during transfer to remote storage (though Restic's end-to-end checksums usually catch this).
  - Troubleshooting & Recovery (Very Advanced, Potential Data Loss):
    - STOP all backups to this repository immediately.
    - Run restic check --read-data. Note all errors carefully. This identifies which pack files or blobs are affected.
    - If the errors point to specific pack files:
      - restic find --pack <pack_id> can show which snapshots/files might be affected by a bad pack.
      - No Restic command can recreate data that is lost. Newer releases group salvage tools under restic repair (repair index, repair packs, repair snapshots), but these only remove or work around damaged data. The goal is to salvage as much unaffected data as possible.
      - One drastic measure could be to remove the problematic pack file(s) from the data/ directory (e.g., mv data/xy/xyz... /tmp/bad_packs/). This WILL lead to data loss for any snapshots/files that relied on blobs in that pack.
      - After removing bad packs, run restic rebuild-index (restic repair index in newer releases). This will make Restic "forget" about the blobs in the removed packs.
      - Then run restic check again. It might show missing blobs.
      - Attempt to restore critical data. Some files/snapshots will be incomplete or unrestorable.
    - The safest approach if corruption is found is often to:
      - Identify which snapshots are affected using restic find --blob <blob_id> for each missing/corrupt blob reported by check.
      - Try to restore unaffected snapshots or files to a new location.
      - Create a brand new, healthy Restic repository.
      - Back up your source data again to this new repository.
      - If possible and if some snapshots in the old (damaged) repo are still valuable and seem mostly intact, you might try restic copy from the old to the new repo, but it may fail if it encounters the corrupted parts.
  - Prevention: Use reliable hardware, ECC RAM if possible, filesystems with checksumming (ZFS, Btrfs), and regular restic check --read-data runs.
- Dealing with "parent snapshot not found" errors (rare):
  - Cause: During backup, Restic can use a "parent" snapshot to speed up scanning for unchanged files. If this parent snapshot ID is specified (e.g., via --parent <id>) but doesn't exist, or if Restic's logic for finding a suitable parent fails in some edge case, this error can occur. It might also happen if a snapshot was partially removed or the repository metadata is inconsistent.
  - Solution:
    - Try running the backup without explicitly specifying a --parent flag: restic backup /path/to/data. Restic will then try to find a suitable parent automatically based on host and paths.
    - If it persists, try restic backup --force /path/to/data. The --force flag makes Restic re-read all files instead of relying on metadata comparisons with a parent, which might bypass the issue.
    - Run restic check to ensure repository integrity. If there are underlying issues, they need to be addressed.
Performance Tuning
Optimizing Restic's performance involves considering the entire chain: source disk, CPU, network, and destination storage.
- Network Bandwidth Considerations:
  - As mentioned, use --limit-upload N and --limit-download N (where N is in KiB/s) if Restic is saturating your link and causing issues for other applications.
  - For remote repositories, choose a server/service geographically close to you to reduce latency.
  - Restic uses multiple concurrent connections per backend. For SFTP the default is 5, configurable with -o sftp.connections=N. S3 and other HTTP-based backends also default to 5, configurable with --option s3.connections=N or the equivalent backend-specific option. Tuning this might help on high-bandwidth, high-latency links, but too many connections can also be detrimental or hit server-side limits.
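When picking a value for --limit-upload, a quick estimate shows whether a backup still fits into its time window. The arithmetic is simply size divided by rate, with 1 GiB = 1048576 KiB. The sketch below ignores deduplication, compression, and protocol overhead, so treat the result as a rough upper bound for new data; the function name upload_hours is made up for this example.

```shell
#!/bin/sh
# upload_hours SIZE_GIB LIMIT_KIB_S: hours needed to upload SIZE_GIB gibibytes
# at a sustained rate of LIMIT_KIB_S KiB/s, rounded to one decimal place.
upload_hours() {
    awk -v size="$1" -v limit="$2" \
        'BEGIN { printf "%.1f\n", size * 1048576 / limit / 3600 }'
}
```

For example, upload_hours 100 5120 estimates how long 100 GiB of new data takes at a limit of 5120 KiB/s (5 MiB/s): 104857600 KiB / 5120 KiB/s = 20480 s, or about 5.7 hours.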
-
Disk I/O on Client and Server:
- Source (Client): If backing up from slow HDDs, especially many small files, reading the source data can be the bottleneck. Consider faster SSDs for frequently changing source data.
- Cache (Client): Restic uses a local cache (~/.cache/restic by default, or the directory set in RESTIC_CACHE_DIR). Ensure this is on a reasonably fast disk. For remote repositories the cache holds index files, snapshot files, and metadata (tree) packs, which avoids repeated downloads.
- Destination (Server/Local Repo):
  - If writing to a local HDD, ensure it's not overly fragmented and is performing well.
  - For self-hosted remote storage (SFTP, MinIO), the server's disk I/O is critical. Use SSDs on the server if possible, or a fast RAID array.
-
CPU Usage:
- Encryption and chunking are CPU-bound. Faster CPUs with more cores will generally perform better.
- Restic uses all available cores by default. There is little direct CPU tuning beyond making sure the system isn't already CPU-starved by other processes; setting the GOMAXPROCS environment variable limits how many cores Restic uses. Using nice (Linux/macOS) or process priority settings (Windows) can make Restic "play nicer" with other applications.
-
Using a Local Cache (RESTIC_CACHE_DIR):
- As mentioned, the cache (RESTIC_CACHE_DIR, default ~/.cache/restic) is important for performance with remote repositories. It stores index data and metadata locally.
- If the default location is on a slow or network-mounted drive, move the cache to a fast local SSD: export RESTIC_CACHE_DIR=/mnt/fast_ssd/restic_cache.
- The cache can grow quite large for very large repositories. restic cache --cleanup removes cache directories that have not been used recently; the age threshold is set with --max-age (in days). The cache command does not offer a way to cap the cache at a fixed size.
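The cache relocation and cleanup described above can be sketched as one small function. The function name and the directory in the example call are assumptions for illustration; --cleanup and --max-age are options of restic cache.

```shell
# Hypothetical helper: point restic at a cache directory on fast local
# storage and drop cache directories that have not been used for a while.
#   $1 = cache directory to use
#   $2 = age threshold in days for "old" cache directories (default 30)
use_fast_cache() {
    mkdir -p "$1" || return 1
    export RESTIC_CACHE_DIR="$1"
    restic cache --cleanup --max-age "${2:-30}"
}

# Example call (not executed here):
#   use_fast_cache /mnt/fast_ssd/restic_cache 14
```

Because the function exports RESTIC_CACHE_DIR, it has to be called in the same shell (or script) that later runs the backup.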
-
Parallelism Options:
- --option sftp.connections=N (or the equivalent for other backends, such as b2.connections or s3.connections) tunes the number of parallel upload/download streams for a backend. The default is often 5. Increasing it may help on high-latency, high-bandwidth links but can also overwhelm the server or your local resources. Experiment carefully.
- For backup itself, Restic parallelizes file scanning and data processing internally. There is usually no direct "number of backup threads" knob, but it is designed to use multiple cores.
-
Pack Size (--pack-size):
- Restic groups blobs into pack files of roughly 16 MiB by default. If you have specific needs (e.g., very slow storage where fewer, larger files are better), the global --pack-size option (value in MiB, available since Restic 0.14) sets the target size for newly written packs. This is an advanced option, and the defaults are usually fine.
- Note that restic prune --max-repack-size SIZE does something different: it limits how much data a single prune run repacks, not the size of individual pack files.
Data Migration and Repository Upgrades
-
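Moving a Repository to a New Backend (e.g., Local to S3): The most straightforward and Restic-idiomatic way is using restic copy.
- Initialize the new (empty) destination repository. Initializing it with --from-repo and --copy-chunker-params gives it the same chunking parameters as the old repository, which is what allows data to deduplicate across the two:
      export RESTIC_PASSWORD_FILE_OLD="..." # Password for the old repo
      export RESTIC_PASSWORD_FILE_NEW="..." # Password for the new repo
      restic -r s3:your-s3-endpoint/new-bucket/new-repo --password-file "$RESTIC_PASSWORD_FILE_NEW" \
        init --from-repo /path/to/old_local_repo --from-password-file "$RESTIC_PASSWORD_FILE_OLD" \
        --copy-chunker-params
- Copy snapshots from the old repository to the new one. Since Restic 0.14, -r names the destination and --from-repo the source (older releases used -r for the source and --repo2/--password-file2 for the destination):
      restic -r s3:your-s3-endpoint/new-bucket/new-repo --password-file "$RESTIC_PASSWORD_FILE_NEW" \
        copy --from-repo /path/to/old_local_repo --from-password-file "$RESTIC_PASSWORD_FILE_OLD"
  - This command reads all data from the source, decrypts it with the source repository's key, re-encrypts it with the destination's key, and writes it to the destination. Blobs already present in the destination are not copied again.
  - Can copy all snapshots or select specific ones.
  - This can take a long time for large repositories.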
- Alternative (Non-Restic, for identical backend types): If moving between two identical backend types (e.g., one S3 bucket to another, or one local filesystem to another), you could use tools like rclone copy or rsync to copy the repository directory structure directly.
  - CRITICAL: The repository must NOT be in use during this kind of direct copy.
  - The repository password and internal encryption keys remain unchanged.
  - After copying, run restic check --read-data on the new location to ensure integrity.
  - This is faster because the data does not need re-encryption, but it carries more risk if not done carefully. restic copy is safer.
-
Restic's Repository Format Versions and Upgrades (restic migrate):
- Over time, the Restic developers might introduce improvements or changes to the repository format.
- When you initialize a repository, it uses the latest format version supported by that Restic binary.
- If you update your Restic binary to a newer version that supports a newer repository format, your old repository will still work.
- However, to take advantage of new format features (e.g., for performance or efficiency), you might need to migrate the repository.
- The command restic migrate <migration_name> is used for this. Running restic migrate without a name lists the migrations that apply to the repository.
  - Example: upgrade_repo_v2 upgrades a repository from format version 1 to version 2, which adds compression support (repositories created with Restic 0.14 or later already use version 2). The command is restic migrate upgrade_repo_v2. Existing data stays uncompressed until it is repacked, for example with restic prune --repack-uncompressed.
- Migrations are typically one-way and may involve rewriting data, so they can be time-consuming.
- Always read the Restic release notes carefully when upgrading Restic to see if any migrations are recommended or available.
- Always back up your repository's config file and keys/ directory before attempting a migration, just in case. Better still, have a full backup of the repo if possible, or test the migration on a cloned copy of the repo first.
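The upgrade procedure above could be scripted roughly as follows. This is a sketch, not an official procedure: the function name is invented, it assumes RESTIC_REPOSITORY and a password source are set, and it assumes the repository is still on format version 1 so that upgrade_repo_v2 applies.

```shell
# Hypothetical helper: cautious repository format upgrade.
# Assumes the repository has been backed up or cloned beforehand.
upgrade_repo_format() {
    restic check || return 1                    # refuse to migrate a damaged repo
    restic migrate || return 1                  # list the migrations that apply
    restic migrate upgrade_repo_v2 || return 1  # switch to repository format 2
    # Optional and potentially slow: rewrite old packs so that existing
    # data is compressed as well.
    restic prune --repack-uncompressed
}
```

Each step stops the sequence on failure, so a repository that does not pass restic check is never migrated.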
Workshop Diagnosing and Fixing a "Problem"
This workshop simulates a couple of common, non-destructive "problems" to practice troubleshooting.
Goals:
- Simulate and resolve a stale lock issue.
- Simulate and understand a password mismatch.
- (Discussion) Understand output from restic check when issues are found.
Prerequisites:
- An initialized Restic repository (e.g., ~/restic_workshop/my_local_repo with its mypass.txt).
- Environment variables RESTIC_REPOSITORY and RESTIC_PASSWORD_FILE set to this local repo and passfile.
Steps:
-
Simulate a Stale Lock:
- The "Safe" Stale Lock Simulation: Restic's own lock handling is fairly robust, so the simplest way to get a lock with no running process behind it is to create a lock file by hand. First clear any real leftover locks from earlier interrupted operations with restic unlock --remove-all on your test repo. A lock is a small JSON document with time, exclusive, hostname, username, and pid fields. Note that real lock files are encrypted like everything else in the repository, so the plaintext file created here only imitates one; depending on the Restic version it may be reported as an invalid lock, or be ignored, rather than show up as a lock held by PID 1234.
      # Ensure you are in ~/restic_workshop and env vars are set for my_local_repo
      # (Ensure no restic process is actually running against this repo!)
      # Real lock file names are hashes; we use a fixed name for simplicity.
      mkdir -p "${RESTIC_REPOSITORY}/locks" # Ensure locks dir exists
      FAKE_LOCK_ID="abcdef1234567890abcdef1234567890abcdef1234567890abcdef12345678"
      FAKE_LOCK_FILE="${RESTIC_REPOSITORY}/locks/${FAKE_LOCK_ID}"
      echo '{"time":"2023-01-01T10:00:00Z","exclusive":true,"hostname":"fakehost","username":"fakeuser","pid":1234}' > "${FAKE_LOCK_FILE}"
      echo "Created fake lock file: ${FAKE_LOCK_FILE}"
- Attempt a Restic Operation: Listing snapshots usually doesn't require an exclusive lock, so run restic check (or a backup) instead. Restic should refuse to proceed with an error saying that the repository is already locked, or that a lock is invalid, and point you at the unlock command. The exact wording depends on the Restic version and the operation.
- Diagnose and Resolve:
  - Run restic list locks. You should see your fake lock ID listed.
  - Since we know this is a "stale" lock (we faked it, and PID 1234 on fakehost isn't real), remove it with restic unlock --remove-all (for a test repo, this is fine). Plain restic unlock only removes locks that Restic itself considers stale; it does not take a lock ID as an argument.
  - Try the Restic operation (restic check) again. It should now work.
  - Clean up: rm -f "${FAKE_LOCK_FILE}" if the file is still present (restic unlock --remove-all should already have removed it).
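When diagnosing a lock it helps to see what each lock actually contains. The sketch below combines restic list locks with restic cat lock; the function name is an assumption, and --no-lock is used so that these read-only commands do not add a lock of their own.

```shell
# Hypothetical helper: show every lock currently stored in the repository.
# "restic list locks" prints one lock ID per line and "restic cat lock ID"
# prints the decrypted JSON of a real lock. A hand-made plaintext lock
# cannot be decrypted, so it ends up in the "could not decode" branch.
show_locks() {
    restic --no-lock list locks | while read -r id; do
        echo "lock: $id"
        restic --no-lock cat lock "$id" </dev/null ||
            echo "  (could not decode $id)"
    done
}
```

For a genuine lock, the hostname and pid fields in the output tell you which machine and process to look at before deciding that the lock is stale.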
-
Simulate Password Issue:
- Change the RESTIC_PASSWORD_FILE content or variable: Save the current value in RESTIC_PASSWORD_FILE_ORIGINAL and point RESTIC_PASSWORD_FILE at a file containing an incorrect password (e.g., ~/restic_workshop/wrong_pass.txt, a copy of mypass.txt with "WRONG" added to the end). Alternatively, edit ~/restic_workshop/mypass.txt directly, or unset RESTIC_PASSWORD_FILE and type a wrong password interactively.
- Attempt to Access Repository: Run restic snapshots. Restic will output:
      Fatal: wrong password or no key found
  (Or similar. Restic cannot tell a wrong password apart from a missing key, because all it knows is that none of the key files could be decrypted with what you supplied.)
- Resolve: Restore the correct password.
      # Restore original password file setting
      export RESTIC_PASSWORD_FILE=$RESTIC_PASSWORD_FILE_ORIGINAL
      rm ~/restic_workshop/wrong_pass.txt # Clean up
      # Or, if you edited mypass.txt directly, change it back to the correct password.
  Try restic snapshots again. It should now work. This highlights the importance of accurately managing your password or password file.
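A quick way to test a password file without touching any backup data is to ask Restic for the repository config, which can only be read with a valid key. The helper below is a sketch; its name is invented for illustration and it assumes RESTIC_REPOSITORY is set.

```shell
# Hypothetical helper: succeed only if the given password file opens the
# repository. "restic cat config" needs a valid key, so it is a cheap way
# to test a password without listing or reading any snapshots.
#   $1 = password file to test
password_file_works() {
    restic --password-file "$1" cat config >/dev/null 2>&1
}

# Example (not executed here):
#   password_file_works ~/restic_workshop/mypass.txt && echo "password OK"
```

A check like this at the top of a backup script fails fast with a clear cause instead of surfacing the password problem halfway through a run.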
-
Discussion: restic check Output on Errors:
- It's difficult and risky to intentionally corrupt a real repository for a workshop.
-
Instead, let's discuss what restic check (especially with --read-data) might show if it finds problems:
  - pack <ID>: not referenced in any index: A pack file exists in data/ but no index file lists its contents. restic repair index (named restic rebuild-index before Restic 0.16) might fix this if the pack is valid. If it doesn't help, the pack might be orphaned or an old remnant.
  - pack <ID>: referenced in index <indexID>, but not found: An index file references a pack file that is missing from the data/ directory. This means data loss.
  - blob <ID> in pack <ID> at offset <X> length <Y>: MAC verification failed: When reading a blob from a pack file and decrypting it, the Message Authentication Code (MAC) doesn't match. This means the (encrypted) data is corrupted. This is a serious error indicating data damage.
  - tree <ID>: not found or not a tree: A snapshot or another tree object references a tree ID, but Restic cannot find or validate this tree blob. This can lead to parts of a snapshot being unreadable.
  - unused blob <ID>: check may list blobs that no snapshot references. A moderate number is normal, for example after forget without prune or after an interrupted backup, and prune cleans them up. Persistently large numbers of unused blobs after successful prunes might indicate an issue.
-
What to do?
  - As discussed in the theory, there's no magic "repair" button.
  - Focus on restic repair index (formerly rebuild-index) for index issues.
  - For missing packs or MAC failures, it means data is lost. The goal is to salvage what you can.
  - Consult Restic forums or GitHub issues with detailed logs from restic check --read-data if you encounter real corruption.
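Since detailed logs matter when something does go wrong, a scheduled check can save its full output as it runs. The wrapper below is a minimal sketch: the function name, the log path, and the 5% default are assumptions, while --read-data-subset is a restic check option that reads only part of the pack data.

```shell
# Hypothetical wrapper: run a structural check plus a partial data read
# and keep the complete output for later analysis.
#   $1 = log file to write
#   $2 = share of pack data to read (default 5%, an arbitrary example)
check_and_log() {
    if restic check --read-data-subset "${2:-5%}" >"$1" 2>&1; then
        echo "check OK"
    else
        echo "check FAILED, see $1"
        return 1
    fi
}

# Example call (not executed here):
#   check_and_log /var/log/restic-check.log 10%
```

Reading a subset on every run spreads the cost of a full --read-data pass over time, at the price of detecting damaged packs later.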
This workshop demonstrated how to identify and resolve some common operational issues. Regular checks and careful handling of repository access are key to maintaining a healthy backup system.
7. Integrating Restic with Other Tools
Restic's command-line nature and robust design make it highly suitable for integration with other tools and workflows. This section explores common integrations, such as backing up Docker volumes, using Rclone as a versatile backend, monitoring Restic backups, and setting up Restic's own rest-server.
Restic and Docker
Docker is widely used for containerizing applications. Persistent data for these applications is often stored in Docker volumes. Backing up these volumes is crucial.
-
Backing up Docker Volumes: Docker volumes are managed by Docker and reside in a specific path on the Docker host (e.g., /var/lib/docker/volumes/ on Linux). There are a few strategies:
- Backup from the Host:
  - Identify the mount point of the named volume on the host. You can find this using docker volume inspect <volume_name>. It will show a Mountpoint path.
  - Important: For data consistency, especially for databases or applications actively writing to the volume, you should stop the container(s) using the volume before backing it up.
  - This can be incorporated into pre/post-backup hooks in your script.
- Run Restic Inside a "Sidecar" or "Job" Container:
  - Create a Docker service or run a temporary container that mounts:
    - The Docker volume you want to back up (e.g., mount my_app_data_volume to /data_to_backup inside the Restic container).
    - A directory for Restic's configuration (repository URL, password file path), or pass them as environment variables.
    - Optionally, a directory for Restic's cache (/root/.cache/restic or as configured).
  - The Restic container then runs the restic backup /data_to_backup command.
  - Example docker run command (conceptual):
        # Mounts, in order: the volume to back up (read-only), a host directory
        # holding the password file, and a host directory for Restic's cache.
        # restic/restic is the official Restic Docker image.
        docker run --rm \
          -v my_app_data_volume:/data_to_backup:ro \
          -v /path/to/host/restic_config:/etc/restic \
          -v /path/to/host/restic_cache:/root/.cache/restic \
          -e RESTIC_REPOSITORY="your_repo_url" \
          -e RESTIC_PASSWORD_FILE="/etc/restic/password" \
          restic/restic \
          backup /data_to_backup --tag docker_sidecar_backup
  - The restic/restic image on Docker Hub is convenient.
  - This approach keeps Restic and its dependencies containerized.
-
Strategies for Backing Up Containerized Applications:
- Databases: Always use the database's native dump tool (e.g., mysqldump, pg_dump), executed inside the database container or via docker exec. Pipe the output to a file that Restic can then back up (either from the host or via a sidecar Restic container that also mounts the dump location).
- Application Data vs. Configuration:
  - Separate application data (often in volumes) from application configuration (Docker Compose files, environment files, custom entrypoint scripts).
  - Back up both. Config files are small and critical for recreating the service.
- Backup Frequency: This may vary. Database transaction logs might need frequent backups, while application code (if managed by version control) might not need Restic backups as often as the data volumes.
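The database strategy above can be sketched for PostgreSQL as follows. Container, user, database, and directory names are placeholders, the function name is invented, and RESTIC_REPOSITORY plus a password source are assumed to be set. The dump goes to a temporary file first so that a failed dump never replaces the previous good one.

```shell
# Hypothetical helper: dump a PostgreSQL database from a running
# container to a file on the host, then back that file up.
#   $1 = container name   $2 = database user
#   $3 = database name    $4 = host directory for dump files
backup_pg_container() {
    dump_file="$4/$3.sql"
    # Only back up if the dump itself succeeded; otherwise a truncated
    # file could silently end up in the next snapshot.
    if docker exec "$1" pg_dump -U "$2" "$3" >"$dump_file.tmp"; then
        mv "$dump_file.tmp" "$dump_file"
        restic backup "$dump_file" --tag db_dump
    else
        rm -f "$dump_file.tmp"
        echo "pg_dump failed, nothing backed up" >&2
        return 1
    fi
}

# Example call (not executed here):
#   backup_pg_container my_postgres app_user app_db /srv/db_dumps
```

The same pattern works with mysqldump; only the command run through docker exec changes.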
Restic with Rclone
Rclone is a versatile command-line program for managing files on over 70 cloud storage services. Restic can use Rclone as a "backend," effectively allowing Restic to store its repositories on any service Rclone supports.
-
Using Rclone as a Versatile Backend:
- Why?
  - Access to cloud providers not natively supported by Restic (e.g., Dropbox, Google Drive, OneDrive, Jottacloud, etc.).
  - Utilize Rclone's advanced features like its own encryption layering (though Restic already encrypts, this could be for obfuscation or policy compliance), caching, or union/crypt remotes.
  - Consolidate cloud access: If you already use Rclone for other purposes, Restic can leverage its existing remotes.
- How it Works: Restic executes the rclone binary in the background to perform operations on the remote. It starts rclone serve restic, Rclone's dedicated mode for serving Restic, and tells that process what to do (list files, read data, write data, delete data).
-
Setting up an Rclone Remote:
- Install Rclone: Download from rclone.org and set it up.
- Configure a Remote: Use rclone config to create a new remote.
      rclone config
      # Follow the interactive prompts:
      # n) New remote
      # name> myCloudStorage (e.g., for Google Drive, Dropbox, etc.)
      # Storage> (select the number corresponding to your cloud provider)
      # ... follow provider-specific authentication steps ...
  This creates an entry in Rclone's configuration file (usually ~/.config/rclone/rclone.conf).
-
Configuring Restic to Use an Rclone Remote: The Restic repository URL format is rclone:<rcloneRemoteName>:<path_in_remote>.
- <rcloneRemoteName>: The name you gave the remote in rclone config (e.g., myCloudStorage).
- <path_in_remote>: The path/folder within that Rclone remote where the Restic repository will live (e.g., restic_backups/my_server).

Example: Initialize a Restic repo on an Rclone remote named gdrive_backup in a folder my_app_restic_repo:
    # Ensure RESTIC_PASSWORD_FILE is set
    # Ensure rclone is in your PATH and configured
    restic -r rclone:gdrive_backup:my_app_restic_repo init
All subsequent Restic commands will use this rclone: prefixed repository URL. Restic will invoke rclone as needed.
- Performance: Can be slower than native Restic backends due to the overhead of Restic calling Rclone, which then calls the cloud API. However, for many use cases, it's perfectly acceptable.
- Rclone Version: Ensure your Rclone version is compatible with Restic's expectations. Generally, use recent versions of both.
Monitoring Restic Backups
For automated backups, monitoring is essential to ensure they are running successfully and to be alerted of failures.
-
Using Prometheus with a Restic Exporter:
- Prometheus is a popular open-source monitoring and alerting toolkit.
- Several third-party Restic exporters are available (e.g., search GitHub for "restic prometheus exporter").
- These exporters typically run restic stats --json, restic snapshots --json, and restic check periodically, parsing the output and exposing metrics in a format Prometheus can scrape (e.g., total backup size, last backup time, number of snapshots, success/failure status of checks).
- You can then build dashboards (e.g., in Grafana) and set up alerts in Prometheus's Alertmanager.
- This provides a comprehensive overview of your Restic repositories' health and status.
-
Parsing Restic's JSON Output for Status:
- As shown in the scripting section, Restic commands with --json provide machine-readable output.
- Your backup script can parse this JSON (e.g., using jq) to extract key metrics:
  - Snapshot ID, time, size added, and total files processed from restic backup --json.
  - List of snapshots, their sizes, and tags from restic snapshots --json.
  - Statistics about repository size and deduplication ratio from restic stats --json --mode raw-data.
- This data can be logged, sent to a central monitoring system, or used to make decisions within the script.
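As one possible way to do the parsing described above, the sketch below reduces the stream printed by restic backup --json to a single line. It assumes jq is installed; the function name is invented. The filter keeps only the final message with message_type "summary" and reads its snapshot_id, data_added, and total_files_processed fields.

```shell
# Hypothetical helper: read the JSON lines written by
# "restic backup --json" on stdin and print one line with the snapshot
# ID, the bytes added, and the number of files processed. Requires jq.
summarize_backup_json() {
    jq -r 'select(.message_type == "summary")
           | "\(.snapshot_id) \(.data_added) \(.total_files_processed)"'
}

# Example (not executed here):
#   restic backup /srv/data --json | summarize_backup_json
```

The resulting line is easy to append to a log file or to split into fields for a monitoring system.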
-
Integrating with Healthchecks.io or Similar "Dead Man's Switch" Services:
- Healthchecks.io (and similar services like Uptime Kuma's push monitors, Cronitor) provide "cron monitoring" or "heartbeat monitoring."
- How it works:
  - You create a "check" in Healthchecks.io, and it gives you a unique URL.
  - Your backup script, at the very end of a successful run, makes an HTTP GET request to this URL (e.g., using curl or wget).
  - If Healthchecks.io doesn't receive this "ping" within an expected timeframe (e.g., 25 hours for a daily job), it assumes the job failed (or didn't run at all) and sends you an alert.
- Example (in your Bash backup script, after successful completion), with a placeholder standing in for your check's unique URL:
      curl -fsS -m 10 --retry 5 -o /dev/null https://hc-ping.com/<your-check-uuid>
- This is a simple yet effective way to get notified if your backups stop running. It doesn't tell you why they failed (your script's logging should do that), but it tells you that they failed to report in.
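A slightly fuller sketch also reports failures explicitly, so an alert arrives right away instead of only after the grace period has passed. The URL is a placeholder and the function names are invented; appending /fail to the ping URL is how Healthchecks.io accepts an explicit failure signal.

```shell
# Placeholder URL: replace with the ping URL of your own check.
HC_URL="https://hc-ping.com/your-check-uuid"

# Send a ping; an optional suffix such as "/fail" reports a failure.
hc_ping() {
    curl -fsS -m 10 --retry 3 -o /dev/null "${HC_URL}$1"
}

# Run the backup and report the outcome. Assumes RESTIC_REPOSITORY and a
# password source are set in the environment.
#   $1 = path to back up
backup_with_heartbeat() {
    if restic backup "$1"; then
        hc_ping ""
    else
        hc_ping "/fail"
        return 1
    fi
}
```

The heartbeat only says whether the run reported in; the script's own log still has to explain why a run failed.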
Rest-server for a Dedicated Restic Repository Server
rest-server is Restic's own lightweight HTTP server designed specifically to serve Restic repositories. It offers some advantages over general-purpose servers like SFTP or a raw S3 bucket for Restic.
-
Setting up rest-server:
- Download: Get the rest-server binary from the releases page of the restic/rest-server repository on GitHub (it's a separate project from Restic itself but maintained by Restic developers).
- Choose a Data Directory: Decide where rest-server will store the repository data (e.g., /srv/restic_server_data). rest-server can serve multiple repositories from subdirectories under its main data path.
- Authentication: rest-server uses htpasswd-style authentication.
  - Create an htpasswd file (e.g., using the htpasswd utility from Apache tools, or a Go-based htpasswd generator):
        htpasswd -B -c /etc/restic/rest_server_htpasswd your_backup_user
- Run rest-server:
      # Example. --path is the directory that stores the repositories,
      # --listen :8000 makes the server listen on port 8000, and
      # --private-repos restricts users to their own sub-directory.
      /path/to/rest-server \
        --path /srv/restic_server_data \
        --listen :8000 \
        --private-repos \
        --htpasswd-file /etc/restic/rest_server_htpasswd \
        --log /var/log/rest_server.log
      # Add --tls to enable HTTPS if you provide --tls-cert and --tls-key
      # Or run behind a reverse proxy like Nginx or Caddy for HTTPS and other features.
  - --private-repos: If enabled, your_backup_user can only access repositories whose URL path starts with their own username, i.e., those stored under /srv/restic_server_data/your_backup_user/. The username must therefore be the first path component of the repository URL.
  - If not using --private-repos, the Restic URL path is relative to --path, and any authenticated user can access any repository.
-
Security Considerations:
- HTTPS is CRITICAL: Always run rest-server with TLS (HTTPS) enabled, either natively (--tls --tls-cert ... --tls-key ...) or by placing it behind a reverse proxy (Nginx, Caddy, Traefik) that handles TLS termination. Transmitting Restic repository credentials over plain HTTP is insecure.
- Strong Passwords: Use strong, unique passwords in your htpasswd file.
- Firewall: Restrict access to the rest-server port on your firewall to only trusted client IPs if possible.
- Regular Updates: Keep rest-server and your OS updated.
-
Advantages over Plain SFTP or S3:
- Append-Only Mode (--append-only): This is a key feature. If rest-server is run with --append-only, Restic clients can create new snapshots and add data, but they cannot delete existing snapshots or data via the Restic forget, prune, or tag --remove commands.
  - This protects against a compromised client (e.g., ransomware on a client machine) trying to delete backups. The attacker could create new (garbage) snapshots but couldn't wipe out the history.
  - Pruning in append-only mode must be done directly on the rest-server machine, not from the client. You'd temporarily stop the append-only server, run restic prune locally on the server's data directory, then restart rest-server in append-only mode.
- Prometheus Metrics (--prometheus): rest-server can expose Prometheus metrics about its operation.
- Designed for Restic: It understands Restic's access patterns and can be more efficient for certain operations than a generic file server.
-
Using a rest-server Repository with Restic Client: The repository URL format is rest:http://user:password@host:port/repository_name or rest:https://....
- If --private-repos is used on the server, the URL path must begin with the username: rest:https://your_backup_user:the_password@your_rest_server_host:8000/your_backup_user/ for a repository directly in the user's directory, or with a sub-path appended (e.g., .../your_backup_user/my_repo) for a named repository below it. A URL whose path does not start with the username is denied.
- Or, if not using --private-repos and the repo is /srv/restic_server_data/my_main_repo: rest:https://your_backup_user:the_password@your_rest_server_host:8000/my_main_repo
- Embedding the REST server username and password in the URL exposes them in shell history and process listings. Restic can also read them from the RESTIC_REST_USERNAME and RESTIC_REST_PASSWORD environment variables, in which case the URL contains only host, port, and path. The Restic repository encryption password is separate and is still handled by RESTIC_PASSWORD or RESTIC_PASSWORD_FILE. These are two different passwords.
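The URL rules for --private-repos can be captured in a tiny helper that always puts the username first in the path. This is a sketch: the function name and all values in the example are placeholders, and the credentials are passed through RESTIC_REST_USERNAME and RESTIC_REST_PASSWORD rather than in the URL.

```shell
# Hypothetical helper: build a rest-server repository URL without
# embedded credentials. With --private-repos on the server, the first
# path component must be the username.
#   $1 = host:port   $2 = username   $3 = repository name (may be empty)
private_repo_url() {
    printf 'rest:https://%s/%s/%s\n' "$1" "$2" "$3"
}

# Example (not executed here); all values are placeholders:
#   export RESTIC_REST_USERNAME=your_backup_user
#   export RESTIC_REST_PASSWORD=the_password
#   export RESTIC_REPOSITORY=$(private_repo_url your_rest_server_host:8000 your_backup_user my_repo)
#   restic snapshots
```

Keeping the server credentials in environment variables (or a file sourced by the backup script) keeps them out of the process list and out of logged command lines.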
Workshop Backing Up Docker Volumes and Setting up Rest-Server
This workshop is in two parts. Part 1 focuses on backing up a Docker volume. Part 2 involves setting up a basic rest-server instance.
Part 1: Docker Volume Backup
Goals:
- Create a Docker container with a named volume storing some data.
- Stop the container and back up its volume data from the host using Restic.
- (Alternative discussion) Back up the volume using Restic running inside another Docker container.
- Restore volume data and verify.
Prerequisites:
- Docker installed and running.
- Restic installed on the Docker host.
- An initialized Restic repository (e.g., local ~/restic_workshop/my_local_repo).
- RESTIC_REPOSITORY and RESTIC_PASSWORD_FILE environment variables set for this repo.
Steps (Part 1):
-
Create a Docker Container with a Named Volume:
    # Create a named volume
    docker volume create my_test_data_volume
    # Run a simple container that writes to this volume and then exits
    docker run --rm -v my_test_data_volume:/app_data --name data_writer alpine /bin/sh -c \
      "echo 'Hello from Docker Volume!' > /app_data/message.txt && \
       echo 'Some more data...' >> /app_data/another_file.log && \
       ls -l /app_data"
    # Verify content was written (optional, by running another container to read)
    # docker run --rm -v my_test_data_volume:/app_data alpine cat /app_data/message.txt
-
Locate the Volume's Mountpoint on the Host: Run docker volume inspect my_test_data_volume and look for the "Mountpoint" line. It will be something like /var/lib/docker/volumes/my_test_data_volume/_data. Copy this path. Let's call it DOCKER_VOLUME_PATH.
-
Back Up the Volume Data from the Host: Important: If the container that uses this volume were long-running (like a database), you would docker stop <container_name> first. Since our data_writer container already exited, the data is quiescent.
    # On Linux you usually need sudo to read /var/lib/docker/volumes/.
    # You could instead change permissions or use ACLs, but for a quick
    # test sudo is often easiest when restic has to read system-owned files.
    DOCKER_VOLUME_PATH=$(docker volume inspect my_test_data_volume | jq -r '.[0].Mountpoint') # If jq is installed
    echo "Backing up from: ${DOCKER_VOLUME_PATH}"
    # Ensure RESTIC_REPOSITORY and RESTIC_PASSWORD_FILE are set to your local test repo
    sudo restic backup "${DOCKER_VOLUME_PATH}" --tag workshop_docker_vol
If sudo restic is used, it won't see your user's RESTIC_REPOSITORY and RESTIC_PASSWORD_FILE env vars. Either run sudo -E restic ... to preserve the environment, or specify --repo and --password-file directly on the sudo restic command line:
    sudo restic --repo "${RESTIC_REPOSITORY}" --password-file "${RESTIC_PASSWORD_FILE}" backup "${DOCKER_VOLUME_PATH}" --tag workshop_docker_vol
-
Restore Volume Data to a New Location and Verify:
    mkdir ~/restic_workshop/docker_vol_restore
    # Find snapshot ID or use latest with path
    LATEST_VOL_SNAPSHOT_ID=$(restic snapshots --json --latest 1 --path "${DOCKER_VOLUME_PATH}" --tag workshop_docker_vol | jq -r '.[0].short_id')
    restic restore "${LATEST_VOL_SNAPSHOT_ID}" --target ~/restic_workshop/docker_vol_restore
    # Verify (note the full original path below the restore target)
    ls -l ~/restic_workshop/docker_vol_restore/${DOCKER_VOLUME_PATH}/
    cat ~/restic_workshop/docker_vol_restore/${DOCKER_VOLUME_PATH}/message.txt
You should see Hello from Docker Volume! as the content of message.txt, and another_file.log in the directory listing. If the backup ran under sudo, you may need sudo for these commands as well, because the new repository files and the restored files can end up owned by root.
-
Alternative Discussion: Restic in Docker for Backup: Instead of backing up from the host, you could run Restic in a container:
    # Mounts, in order: the volume to back up (read-only), the password file,
    # and (optionally) a cache directory.
    # docker run --rm \
    #   -v my_test_data_volume:/source_data:ro \
    #   -v your_restic_password_file_on_host:/tmp/restic_pass \
    #   -v your_restic_cache_dir_on_host:/root/.cache/restic \
    #   -e RESTIC_REPOSITORY="your_repo_url" \
    #   -e RESTIC_PASSWORD_FILE="/tmp/restic_pass" \
    #   restic/restic backup /source_data --tag docker_container_backup
This encapsulates Restic itself. It's very useful for orchestrated environments like Kubernetes.
Part 2: Setting up a Basic Rest-Server
Goals:
- Install rest-server.
- Configure basic htpasswd authentication.
- Run rest-server to serve a local directory.
- Initialize a new Restic repository using this rest-server and perform a test backup.
Prerequisites:
- A Linux machine (can be your main machine or a VM).
- htpasswd utility (from apache2-utils or similar).
- Ability to download and run the rest-server binary.
Steps (Part 2):
-
Download rest-server: Go to https://github.com/restic/rest-server/releases/latest. Download the archive for your OS/architecture (e.g., rest-server_<version>_linux_amd64.tar.gz). Make sure to get the latest version and adjust the commands accordingly.
    # Example download and setup (replace 0.12.1 with the latest version)
    cd ~/restic_workshop
    wget https://github.com/restic/rest-server/releases/download/v0.12.1/rest-server_0.12.1_linux_amd64.tar.gz
    tar -xzf rest-server_0.12.1_linux_amd64.tar.gz
    # The archive unpacks to ./rest-server_0.12.1_linux_amd64/rest-server
    cp ./rest-server_0.12.1_linux_amd64/rest-server ~/restic_workshop/rest-server
    chmod +x ~/restic_workshop/rest-server
    # Verify
    ~/restic_workshop/rest-server --version
    # Now ~/restic_workshop/rest-server is the executable
    REST_SERVER_BINARY=~/restic_workshop/rest-server # Adjust if you put it elsewhere like /usr/local/bin
-
Create Data Directory and Htpasswd File:
    mkdir -p ~/restic_workshop/rest_server_data   # Data path for rest-server
    mkdir -p ~/restic_workshop/rest_server_config # For htpasswd file
    # Install htpasswd if not present (Debian/Ubuntu: sudo apt install apache2-utils)
    # Create htpasswd file with a user 'workshop_user'
    htpasswd -B -c ~/restic_workshop/rest_server_config/htpasswd workshop_user
    # Enter a password for workshop_user when prompted (e.g., "restworkshoppass")
-
Run rest-server (No TLS for this basic local workshop - NOT FOR PRODUCTION): Warning: Running without TLS is insecure for real use. This is only for a quick local test. Open a new terminal window for rest-server, as it will run in the foreground.
    # In NEW Terminal 1 (for rest-server):
    cd ~/restic_workshop # The paths below are relative to this directory
    # Variables from the other terminal are not set here, so define the binary path again
    REST_SERVER_BINARY=~/restic_workshop/rest-server
    ${REST_SERVER_BINARY} \
      --path ./rest_server_data \
      --listen localhost:8000 \
      --htpasswd-file ./rest_server_config/htpasswd \
      --private-repos \
      --debug
    # --debug gives more verbose output during the workshop.
    # It should say something like "Starting server..."
    # Keep this terminal open.
-
Initialize and Use Restic with
rest-server
: Go back to your original terminal window (Terminal 2).- Repository URL:
rest:http://workshop_user:PASSWORD@localhost:8000/my_first_rest_repo
ReplacePASSWORD
with the password you set forworkshop_user
inhtpasswd
. If--private-repos
is used, the repository path is relative to the user's directory automatically created byrest-server
(./rest_server_data/workshop_user/
). The/my_first_rest_repo
then becomes a subdirectory within that. So, effectively, the data will be in./rest_server_data/workshop_user/my_first_rest_repo
. Let's simplify, if--private-repos
is used, the userworkshop_user
automatically getsrest_server_data/workshop_user/
. You can then init a repo directly in there. The URL becomesrest:http://workshop_user:PASSWORD@localhost:8000/
if you want the repo directly in the user's root, orrest:http://workshop_user:PASSWORD@localhost:8000/specific_repo_name
- Define variables for Restic repository (new one for rest-server):
# In Terminal 2: # REST_USER_PASSWORD should be the password you set for 'workshop_user' in htpasswd export REST_USER_PASSWORD="restworkshoppass" # The password for workshop_user export REST_SERVER_REPO_URL="rest:http://workshop_user:${REST_USER_PASSWORD}@localhost:8000/workshop_repo1" # New Restic encryption password for this new repository echo "restic_encryption_key_for_rest_server" > ~/restic_workshop/rest_server_enc_pass.txt export RESTIC_PASSWORD_FILE=~/restic_workshop/rest_server_enc_pass.txt
  - Initialize Restic Repository: Run restic init against the repository URL, then check Terminal 1 (rest-server output). You should see activity (GET, PUT requests). On the filesystem, check ~/restic_workshop/rest_server_data/workshop_user/workshop_repo1. It should now contain Restic's repo structure.
  - Perform a Test Backup: Back up a small directory to the new repository and again observe the logs in Terminal 1.
  - List Snapshots: Run restic snapshots against the repository. You should see your new snapshot.
- Cleanup:
  - In Terminal 1 (rest-server), press Ctrl+C to stop rest-server.
  - You can remove ~/restic_workshop/rest_server_data, ~/restic_workshop/rest_server_config, and ~/restic_workshop/rest_server_enc_pass.txt if you want to clean up this workshop's specific files.
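The initialize, backup, and list steps of this workshop could be run roughly as follows. This is a minimal sketch, not a definitive script: the helper rest_repo_url, the RUN_RESTIC switch, and the sample_data directory are assumptions made for illustration and are not part of Restic. It assumes rest-server is running with --private-repos, which is why the user name appears as the first path component of the URL. By default the script only prints the commands; set RUN_RESTIC=1 to execute them while rest-server is still running.

```shell
#!/usr/bin/env bash
# Sketch: initialize, back up, and list snapshots through rest-server.

# Values from this workshop; change them if you picked different ones.
REST_USER="workshop_user"
REST_USER_PASSWORD="${REST_USER_PASSWORD:-restworkshoppass}"
export RESTIC_PASSWORD_FILE="${RESTIC_PASSWORD_FILE:-$HOME/restic_workshop/rest_server_enc_pass.txt}"

# Hypothetical helper (not part of restic): builds the rest: URL.
# With --private-repos the first path component must be the user name.
rest_repo_url() {
  # $1 = user, $2 = password, $3 = host:port, $4 = repository name
  printf 'rest:http://%s:%s@%s/%s/%s' "$1" "$2" "$3" "$1" "$4"
}

REPO="$(rest_repo_url "$REST_USER" "$REST_USER_PASSWORD" "localhost:8000" "workshop_repo1")"

# Assumed sample directory; any small directory will do.
SAMPLE_DIR="${SAMPLE_DIR:-$HOME/restic_workshop/sample_data}"

if [ "${RUN_RESTIC:-0}" = "1" ] && command -v restic >/dev/null 2>&1; then
  restic -r "$REPO" init                  # creates the repository on the server
  restic -r "$REPO" backup "$SAMPLE_DIR"  # uploads the first snapshot
  restic -r "$REPO" snapshots             # lists what the repository holds
else
  # Dry run: print the commands instead of executing them.
  echo "restic -r \"$REPO\" init"
  echo "restic -r \"$REPO\" backup \"$SAMPLE_DIR\""
  echo "restic -r \"$REPO\" snapshots"
fi
```

Note that the dry run prints the URL including the HTTP password, which is acceptable for a throwaway workshop but not something to leave in logs on a real system.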
This workshop demonstrated basic integration with Docker volumes and setting up a rest-server. For production rest-server use, always implement TLS and consider running it as a system service with proper logging and resource management. The append-only mode (the --append-only flag) is a particularly valuable feature for enhancing backup security, because a compromised client can then add new backups but cannot delete or overwrite existing ones.
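As a rough illustration of that production advice, the sketch below prints a rest-server command line that enables TLS and append-only mode. The flags shown (--tls, --tls-cert, --tls-key, --append-only) are rest-server options; the helper name rest_server_cmdline, the listen port, and all file paths are assumptions chosen for the example. The helper only prints the command for review and does not start anything.

```shell
#!/usr/bin/env bash
# Sketch: print a hardened rest-server command line (TLS + append-only) for review.

# Hypothetical helper: prints the command line; it does not start the server.
rest_server_cmdline() {
  # $1 = data dir, $2 = htpasswd file, $3 = TLS certificate, $4 = TLS private key
  printf '%s --path %s --listen :8000 --htpasswd-file %s --private-repos --append-only --tls --tls-cert %s --tls-key %s\n' \
    "${REST_SERVER_BINARY:-rest-server}" "$1" "$2" "$3" "$4"
}

# Assumed locations; substitute your own.
rest_server_cmdline \
  /srv/rest-server/data \
  /srv/rest-server/htpasswd \
  /etc/ssl/rest-server/fullchain.pem \
  /etc/ssl/rest-server/privkey.pem

# Clients then use https in the repository URL, for example:
#   rest:https://workshop_user:PASSWORD@backup.example.com:8000/workshop_user/workshop_repo1
```

With append-only enabled, pruning has to be done by a separate, trusted process that has direct access to the repository, since clients can no longer remove data through the server.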
Conclusion
Throughout this comprehensive guide, we've journeyed from the fundamentals of Restic to its advanced applications in a self-hosted environment. You've learned how to install Restic, initialize repositories, perform backups and restores, manage repository health and size, and leverage remote backends for robust data protection. We've also delved into scripting, automation, security considerations, and integration with other essential tools like Docker and Rclone.
Recap of Restic's Strengths:
- Security First: Client-side encryption with strong algorithms ensures your data remains private, regardless of where it's stored. You hold the keys.
- Efficiency: Deduplication via content-defined chunking saves significant storage space and bandwidth, making frequent backups feasible.
- Simplicity and Flexibility: A clear command-line interface combined with support for numerous local and remote backends provides versatility for diverse self-hosting setups.
- Reliability: Features like restic check allow you to verify the integrity of your backups, giving you confidence in your ability to restore.
- Open Source: Transparency and an active community contribute to its trustworthiness and continuous improvement.
Importance of Regular Testing of Backups:
A backup strategy is only as good as its last successful restore. It's not enough to assume your backups are working; you must regularly test them. This involves:
- Periodically performing actual restores of randomly selected files or entire directories to a test location.
- Verifying the integrity of the restored data.
- Reviewing backup logs for errors or warnings.
- Running restic check (and occasionally restic check --read-data) to ensure repository health.
Make testing a routine part of your data management discipline. This practice will not only confirm that your backups are functional but also familiarize you with the restore process, which is invaluable during a real data loss emergency.
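The routine above can be sketched as a small script. This is one possible shape under stated assumptions, not a prescribed procedure: trees_match and restore_test are hypothetical helper names, and the comparison assumes the live directory has not changed since the snapshot was taken (otherwise differences are expected and need to be judged by hand).

```shell
#!/usr/bin/env bash
# Sketch: restore the latest snapshot to a scratch directory and compare it
# with the live data, then check the repository.

# Hypothetical helper: succeeds only if both directory trees have identical contents.
trees_match() {
  diff -r "$1" "$2" >/dev/null 2>&1
}

# Hypothetical helper: runs one restore test.
restore_test() {
  # $1 = repository, $2 = absolute path of the directory that was backed up
  rt_target="$(mktemp -d)" || return 1
  restic -r "$1" restore latest --target "$rt_target" --include "$2" || return 1
  # restic recreates the original absolute path below the target directory.
  if trees_match "$2" "$rt_target$2"; then
    echo "restore test passed"
    rm -rf "$rt_target"
  else
    echo "restore test FAILED; restored copy kept in $rt_target" >&2
    return 1
  fi
  restic -r "$1" check   # add --read-data occasionally for a full read
}

# Example call (assumed repository variable and directory):
#   restore_test "$REST_SERVER_REPO_URL" "$HOME/restic_workshop/sample_data"
```

Keeping the restored copy on failure makes it easier to inspect what differs before deciding whether the backup or the live data is at fault.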
Further Learning Resources:
- Official Restic Documentation: The primary source of truth. It's comprehensive and well-maintained: https://restic.readthedocs.io/
- Restic Forum: A great place to ask questions, share experiences, and learn from other users: https://forum.restic.net/
- Restic GitHub Repository: For source code, issue tracking, and release notes: https://github.com/restic/restic
- Restic Blog: For announcements and articles: https://restic.net/blog/
Encouragement for Responsible Data Management:
Data is valuable. Whether it's personal memories, critical project files, or configurations for your self-hosted services, losing it can be devastating. By learning and implementing tools like Restic, you are taking a significant step towards responsible data stewardship.
Embrace the principles of the 3-2-1 backup rule, automate your backups, monitor them diligently, and test your restores. The peace of mind that comes from knowing your data is safe and recoverable is well worth the effort. Continue to explore, experiment, and refine your backup strategy as your needs and self-hosted infrastructure evolve. Happy self-hosting, and may your data always be secure!