Author Nejat Hakan
eMail nejat.hakan@outlook.de
PayPal Me https://paypal.me/nejathakan


Backup Server - Restic

Introduction to Restic

Welcome to this comprehensive guide on Restic, a modern, secure, efficient, and easy-to-use backup program. In the world of self-hosting, reliable backups are not just a good idea; they are an absolute necessity. Data loss can be catastrophic, ranging from losing precious personal memories to critical operational data for your self-hosted services. Restic provides a robust solution to safeguard your digital assets.

This guide will take you from the fundamental concepts of Restic, through practical basic usage, into intermediate repository management and remote backends, and finally to advanced topics like automation, internals, and integration with other tools. Each section is designed to build upon the last, providing you with the knowledge and skills to confidently implement and manage your own Restic-based backup strategy.

What is Restic?

Restic is an open-source backup program written in Go. It's designed to be fast, efficient, and secure. It operates on the principle of client-side encryption, meaning your data is encrypted before it leaves your machine, ensuring that even if your backup storage is compromised, your data remains unreadable without the decryption key (your password).

Key Features:

  • Secure: Restic uses strong, authenticated encryption (AES-256 in counter mode, Poly1305-AES for MAC) for all data and metadata. The encryption keys are derived from your repository password. This means your data is protected both in transit and at rest.
  • Efficient: Restic employs content-defined chunking and deduplication. This means it breaks files into smaller pieces (chunks) and only stores unique chunks. If multiple files share the same content, or if a file is only slightly modified between backups, Restic only uploads the new or changed chunks, saving significant storage space and bandwidth.
  • Verifiable: Restic allows you to verify your backups to ensure data integrity and that your backed-up data can actually be restored.
  • Easy to Use: Restic has a straightforward command-line interface (CLI). While powerful, its basic operations are intuitive.
  • Cross-Platform: It runs on Linux, macOS, Windows, BSD, and other operating systems.
  • Multiple Backends: Restic can store backups in various locations:
    • Local directories.
    • SFTP servers (via SSH).
    • HTTP REST servers (like Restic's own rest-server or rclone's rclone serve restic).
    • Cloud storage services like Amazon S3, Backblaze B2, Wasabi, Google Cloud Storage, Azure Blob Storage, and many others (often via rclone).
  • Snapshots: Restic creates point-in-time snapshots of your data. You can easily browse and restore files from any snapshot.
  • Free and Open Source: Restic is licensed under the BSD 2-Clause License, giving you the freedom to use, modify, and distribute it.

Philosophy behind Restic:
The core philosophy of Restic revolves around simplicity, security, and efficiency. The developers aimed to create a backup tool that "just works" and gives users peace of mind. There's a strong emphasis on doing one thing (backing up data) and doing it well, without unnecessary complexity. The design ensures that the user is always in control of their encryption keys.

Brief Comparison with Other Backup Tools:
While tools like tar, rsync, or BorgBackup have their places, Restic offers a compelling combination of features:

  • Compared to tar and rsync: Restic provides versioning (snapshots) and strong encryption natively. rsync is great for mirroring, but not for historical backups.
  • Compared to BorgBackup: Both Restic and Borg are excellent modern backup tools with encryption and deduplication. Restic talks to object storage such as S3 natively, whereas Borg's remote repositories are accessed over SSH, ideally with Borg installed on the remote host (borg serve). Restic is written in Go and ships as a single executable per platform; Borg is written mainly in Python. The choice often comes down to specific backend needs, performance characteristics in particular environments, or personal preference for the CLI or internal architecture.

Why choose Restic for self-hosting?

For self-hosters, Restic offers several compelling advantages:

  • Client-Side Encryption: This is paramount. When self-hosting, you might use various storage solutions, some potentially less secure than enterprise-grade offerings or even public cloud storage. Restic encrypts data on your machine before it's sent anywhere. You hold the keys, ensuring privacy even if the storage backend is breached or untrusted.
  • Storage Efficiency: Deduplication means you can store many snapshots over long periods without consuming excessive disk space. This is crucial when you're managing your own storage resources.
  • Flexibility in Storage Backends: Whether you have a dedicated NAS, an old computer acting as an SFTP server, or you decide to use a cheap cloud storage provider for off-site backups, Restic can likely support it. This allows you to tailor your backup strategy to your budget and infrastructure.
  • Open-Source Transparency: You can inspect the code, understand how it works, and be part of a community. This builds trust, which is essential for a critical tool like a backup program.
  • No Server-Side Daemon Required (for many backends): For backends like SFTP or S3, Restic doesn't need special software running on the server (beyond the standard SFTP or S3 service). This simplifies setup and maintenance.

Core Concepts

Understanding these core concepts is key to using Restic effectively:

  • Repository: This is the storage location where Restic keeps your backup data. It's a specially structured directory (or S3 bucket, etc.) that Restic initializes and manages. All your backed-up data, for all snapshots, from all sources (if you back up multiple directories or machines to the same repository), resides here, encrypted and deduplicated.
  • Snapshots: A snapshot is an immutable, point-in-time record of the files and directories you backed up. Each time you run restic backup, a new snapshot is created. You can think of it as a "photo" of your data at that specific moment. You can list, browse, and restore data from any existing snapshot.
  • Deduplication: This is Restic's magic for saving space.
    • Content-Defined Chunking: Restic first breaks down your files into smaller pieces called "chunks." Instead of fixed-size chunks, it uses an algorithm (related to the Rabin fingerprint) to find natural boundaries within the file content. This means if you insert data in the middle of a file, only the chunks around the insertion point change, while the rest of the file's chunks remain identical.
    • Hashing and Storage: Each chunk is then hashed with SHA-256, and the hash serves as the chunk's ID. Restic stores each unique chunk (identified by its hash) only once in the repository.
    • How it Works: When you back up a file, Restic chunks it. For each chunk, it checks if a chunk with the same hash already exists in the repository. If yes, it just references the existing chunk. If no, it encrypts and stores the new chunk. This applies across all files and all snapshots in the repository.
  • Encryption: Restic encrypts everything it stores in the repository.
    • Client-Side: Encryption happens on the machine running Restic before data is sent to the repository.
    • Password-Based Key Derivation: You provide a password when initializing a repository. Restic uses this password to derive strong encryption keys. This password is the only way to access the data. If you lose this password, your backup data is irrecoverably lost.
    • Authenticated Encryption: Restic uses cryptographic techniques that ensure both confidentiality (data is unreadable) and integrity/authenticity (data cannot be tampered with undetected).
  • Internal Data Structures (Brief Overview):
    • Blobs: The fundamental unit of data. There are two types:
      • Data Blobs: These store the (encrypted) content of your file chunks.
      • Tree Blobs: These store (encrypted) metadata, like directory listings, file names, permissions, and pointers (hashes) to the data blobs or other tree blobs that make up a file or directory.
    • Trees: These are hierarchical structures made of tree blobs, representing the directory structure of your backup. A snapshot points to a root tree.
    • Packs: For efficiency, Restic groups multiple blobs together into larger files called "pack files" within the repository's data subdirectory.
    • Index: Restic maintains an index that maps blob hashes to the pack files where they are stored, allowing for quick lookups.
    • Snapshots (files): These are small files in the repository that store metadata about a specific backup operation, including a pointer to the root tree blob for that backup, the time, host, tags, and paths backed up.
    • Keys: Files in the repository storing the encrypted master keys. These are themselves encrypted using keys derived from your repository password.
    • Config: A file in the repository storing its configuration, such as the repository version and chunking parameters.
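
The deduplication flow described above (chunk, hash, store only chunks not seen before) can be sketched with standard tools. This is a toy illustration under loud assumptions, not Restic's implementation: it uses fixed 4-byte chunks via split instead of content-defined chunking, it skips encryption and pack files, and the store directory and the store_file function are hypothetical names.

```shell
#!/bin/sh
set -eu

work=$(mktemp -d)
store="$work/store"        # hypothetical content-addressed chunk store
mkdir -p "$store" "$work/chunks"
referenced=0

# Two files that share their first eight bytes.
printf 'AAAABBBBCCCC' > "$work/file1"
printf 'AAAABBBBDDDD' > "$work/file2"

store_file() {
  rm -f "$work/chunks"/c_*
  # Simplification: fixed 4-byte chunks instead of content-defined boundaries.
  split -b 4 "$1" "$work/chunks/c_"
  for chunk in "$work/chunks"/c_*; do
    hash=$(sha256sum "$chunk" | cut -d' ' -f1)
    referenced=$((referenced + 1))
    # Store the chunk only if no chunk with this hash is present yet.
    [ -e "$store/$hash" ] || cp "$chunk" "$store/$hash"
  done
}

store_file "$work/file1"
store_file "$work/file2"

stored=$(ls "$store" | wc -l | tr -d ' ')
echo "chunks referenced: $referenced, chunks stored: $stored"
# prints: chunks referenced: 6, chunks stored: 4
```

The two files reference six chunks in total, but the shared AAAA and BBBB chunks are stored only once. With fixed-size chunks, inserting a single byte at the start of a file would shift every boundary and defeat this sharing, which is the problem content-defined chunking addresses.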

Workshop Introduction Preparing Your Environment and Installing Restic

This first workshop will guide you through installing Restic and setting up a very basic local environment. We'll use a simple folder on your local machine as the initial Restic repository.

Goals:

  1. Install Restic on your system.
  2. Verify the installation.
  3. Understand how Restic refers to repositories.

Prerequisites:

  • A Linux, macOS, or Windows computer.
  • Internet access to download Restic.
  • Basic familiarity with the command line/terminal.

Steps:

  1. Create a Working Directory (Optional but Recommended): It's good practice to have a dedicated directory for your Restic experiments.

    • On Linux/macOS:
      mkdir ~/restic_workshop
      cd ~/restic_workshop
      
    • On Windows (PowerShell):
      mkdir ~\restic_workshop
      cd ~\restic_workshop
      
      For the rest of this guide, commands will often assume you are in a suitable working directory.
  2. Download and Install Restic: Restic offers pre-compiled binaries, which are the easiest way to get started. Visit the Restic releases page on GitHub to find the latest version.

    • Linux:

      1. Download the correct archive (e.g., restic_<version>_linux_amd64.bz2).
      2. Extract it:
        # Example for version 0.16.4 - replace with the actual version you downloaded
        wget https://github.com/restic/restic/releases/download/v0.16.4/restic_0.16.4_linux_amd64.bz2
        bzip2 -d restic_0.16.4_linux_amd64.bz2
        
      3. Make it executable and move it to a directory in your PATH:
        chmod +x restic_0.16.4_linux_amd64
        sudo mv restic_0.16.4_linux_amd64 /usr/local/bin/restic
        
        (If you don't have sudo rights or prefer a user-local installation, you can create ~/bin or ~/.local/bin, add it to your PATH, and move restic there.)
    • macOS (using Homebrew is easiest):

      brew install restic
      
      If not using Homebrew, download the restic_<version>_darwin_amd64.bz2 (for Intel Macs) or restic_<version>_darwin_arm64.bz2 (for Apple Silicon Macs), extract, and move it to /usr/local/bin/restic as described for Linux.

    • Windows (using Scoop or Chocolatey is easiest, or manual download):

      • Using Scoop:
        scoop install restic
        
      • Using Chocolatey:
        choco install restic
        
      • Manual:
        1. Download restic_<version>_windows_amd64.zip.
        2. Extract restic.exe.
        3. Move restic.exe to a folder that is in your system's PATH (e.g., C:\Windows\System32, or create a dedicated folder like C:\Program Files\Restic and add it to your PATH environment variable).
  3. Verify the Installation: Open a new terminal window (to ensure PATH changes are recognized) and type:

    restic version
    
    You should see output similar to:
    restic <version> compiled with go<version> on <os>/<arch>
    
    For example:
    restic 0.16.4 compiled with go1.21.5 on linux/amd64
    
    If you see this, Restic is installed correctly and accessible from your command line.

  4. Understanding Repository Paths: Restic needs to know where its repository is. This is specified with the -r flag or the RESTIC_REPOSITORY environment variable. For this introduction, we'll create a local directory to serve as our first repository. Let's create a directory that will become our Restic repository:

    • On Linux/macOS (inside ~/restic_workshop):
      mkdir my_first_repo
      
    • On Windows (PowerShell, inside ~\restic_workshop):
      mkdir my_first_repo
      
      The path to this repository would then be, for example, ~/restic_workshop/my_first_repo (Linux/macOS) or ~\restic_workshop\my_first_repo (Windows). Restic will populate this directory with its own structure once initialized.

This workshop has prepared your system by installing Restic. You are now ready to initialize your first repository and start backing up data, which we will cover in the next section.

1. Getting Started with Restic

With Restic installed, you're ready to dive into its core functionalities: creating a secure place for your backups (a repository), making your first backup, and learning how to see what you've backed up. This section will cover these fundamental steps, focusing on using a local directory as your repository for simplicity.

Installation

This sub-section details the installation process across common operating systems. If you've already completed the "Workshop Introduction" and successfully installed Restic, you can skim or skip this part. However, it provides more detailed explanations and alternatives.

Restic is distributed as a single executable file, making installation straightforward.

  • Linux:

    • Using Package Managers (Recommended for ease of updates): Many distributions include Restic in their official repositories.
      • Debian/Ubuntu:
        sudo apt update
        sudo apt install restic
        
      • Fedora:
        sudo dnf install restic
        
      • Arch Linux:
        sudo pacman -S restic
        
      • Note: Package manager versions might sometimes lag behind the latest Restic release. If you need the absolute latest version, the binary download is preferable.
    • Binary Download (for latest version or if not in package manager):
      1. Go to the Restic GitHub releases page.
      2. Download the appropriate archive for your system architecture (usually amd64 for 64-bit Intel/AMD CPUs, or arm64 for 64-bit ARM CPUs). For example, restic_<version>_linux_amd64.bz2.
      3. Open a terminal and navigate to your downloads directory.
      4. Extract the archive. If it's a .bz2 file:
        # Replace <version> and <arch> with actual values
        bzip2 -d restic_<version>_linux_<arch>.bz2
        
        This will leave you with an executable file, e.g., restic_<version>_linux_amd64.
      5. Make the binary executable:
        chmod +x restic_<version>_linux_<arch>
        
      6. Move the binary to a directory in your system's PATH. A common choice is /usr/local/bin:
        sudo mv restic_<version>_linux_<arch> /usr/local/bin/restic
        
        If you don't have sudo access or prefer a user-local installation, you can create a directory like ~/.local/bin (if it doesn't exist), add it to your PATH environment variable (by editing ~/.bashrc, ~/.zshrc, or ~/.profile), and then move the binary there:
        mkdir -p ~/.local/bin
        mv restic_<version>_linux_<arch> ~/.local/bin/restic
        # You might need to open a new terminal or source your shell config file
        # e.g., source ~/.bashrc
        
  • macOS:

    • Using Homebrew (Recommended): Homebrew is a popular package manager for macOS.
      brew install restic
      
    • Binary Download:
      1. Go to the Restic GitHub releases page.
      2. Download the macOS archive, typically restic_<version>_darwin_amd64.bz2 (for Intel Macs) or restic_<version>_darwin_arm64.bz2 (for Apple Silicon Macs).
      3. Open Terminal and navigate to your downloads directory.
      4. Extract the archive:
        # Replace <version> and <arch>
        bzip2 -d restic_<version>_darwin_<arch>.bz2
        
      5. Make the binary executable:
        chmod +x restic_<version>_darwin_<arch>
        
      6. Move it to a directory in your PATH, typically /usr/local/bin (Homebrew also uses this prefix on Intel Macs; on Apple Silicon it uses /opt/homebrew):
        sudo mv restic_<version>_darwin_<arch> /usr/local/bin/restic
        
        macOS might show a security warning when you first try to run a downloaded binary. You may need to go to "System Settings" > "Privacy & Security", scroll down, and click "Allow Anyway" for Restic. Alternatively, running xattr -d com.apple.quarantine /usr/local/bin/restic might be necessary.
  • Windows:

    • Using Package Managers (Scoop or Chocolatey recommended):
      • Scoop:
        scoop install restic
        
      • Chocolatey (run PowerShell as Administrator):
        choco install restic
        
    • Binary Download:
      1. Go to the Restic GitHub releases page.
      2. Download the Windows archive, usually restic_<version>_windows_amd64.zip.
      3. Extract the restic.exe file from the ZIP archive.
      4. Move restic.exe to a directory that is included in your system's PATH environment variable.
        • You can create a dedicated folder, e.g., C:\Program Files\Restic, and move restic.exe there.
        • Then, add this folder to your PATH:
          1. Search for "environment variables" in the Start Menu and select "Edit the system environment variables".
          2. In the System Properties window, click the "Environment Variables..." button.
          3. Under "System variables" (or "User variables" if you prefer), find the variable named Path and select it.
          4. Click "Edit...".
          5. Click "New" and add the path to your Restic folder (e.g., C:\Program Files\Restic).
          6. Click "OK" on all open dialogs.
          7. You'll need to open a new Command Prompt or PowerShell window for the PATH change to take effect.
  • Verifying the Installation: After installation, open a new terminal or command prompt and run:

    restic version
    
    This command should output the installed Restic version, confirming that the system can find and execute the restic binary. If you get a "command not found" error, double-check that the directory containing the Restic executable is correctly added to your system's PATH and that you've opened a new terminal session.
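
If restic version fails, a small check like the following can show whether the shell finds the binary and which directories it searches. This is a minimal sketch for Linux/macOS shells; check_restic is a hypothetical helper name, not part of Restic.

```shell
#!/bin/sh
# Report whether a restic executable is reachable through PATH.
check_restic() {
  local bin
  if bin=$(command -v restic 2>/dev/null); then
    echo "restic found at: $bin"
    return 0
  fi
  echo "restic not found. Directories currently searched:"
  # Print PATH one directory per line to make a missing entry easy to spot.
  printf '%s\n' "$PATH" | tr ':' '\n'
  return 1
}

check_restic || echo "Add the directory containing restic to PATH, then open a new terminal."
```

If the directory you installed into (for example /usr/local/bin or ~/.local/bin) is missing from the printed list, that is the entry to add to your shell configuration.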

Initializing Your First Repository

A Restic repository is the storage location where your encrypted and deduplicated backup data will live. Before you can back up any data, you must initialize a repository. For this initial setup, we will use a local directory.

  • Choosing a Repository Location: For now, let's create a directory specifically for our Restic repository. If you followed the "Workshop Introduction", you already have my_first_repo; this section uses a separate directory named my_local_repo so that the commands below work either way:

    • On Linux/macOS:
      mkdir -p ~/restic_workshop/my_local_repo
      
    • On Windows (PowerShell):
      mkdir ~\restic_workshop\my_local_repo
      
      The path to this repository will be ~/restic_workshop/my_local_repo (or its Windows equivalent).
  • The restic init command: This command prepares a new storage location to be used as a Restic repository. The syntax is restic -r /path/to/repo init. The -r flag (or --repo) specifies the repository location. Let's initialize our repository:

    • Linux/macOS:
      restic -r ~/restic_workshop/my_local_repo init
      
    • Windows (PowerShell):
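      # PowerShell does not reliably expand ~ for external programs, so use $env:USERPROFILE
      restic -r "$env:USERPROFILE\restic_workshop\my_local_repo" init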
      
  • Understanding the Repository Password: Upon running init, Restic will prompt you to enter a password for the new repository:

    enter password for new repository:
    enter password again:
    
    This password is CRITICAL.

    • It encrypts the master keys that protect your data.
    • There is NO WAY to recover data from a Restic repository if you lose this password. Restic developers cannot help you.
    • Choose a strong, unique password and store it securely (e.g., in a reputable password manager).
    • You will need this password for every interaction with this repository (backing up, restoring, listing snapshots, maintenance, etc.), unless you use a password file or environment variable (covered later).

    After successfully entering and confirming the password, Restic will output something like:

    created restic repository <long_hex_id> at /home/user/restic_workshop/my_local_repo
    
    Please note that knowledge of the password is required to access the repository.
    Losing the password means losing access to all data stored in the repository.
    

  • Repository Structure (Brief Overview): If you now look inside the ~/restic_workshop/my_local_repo directory, you'll see a structure created by Restic:

    my_local_repo/
    ├── config
    ├── data/
    ├── index/
    ├── keys/
    ├── locks/
    └── snapshots/
    

    • config: Contains repository configuration (e.g., version, chunker parameters).
    • data/: Stores pack files, which contain the actual (encrypted) data blobs. This is where the bulk of your backup data will reside.
    • index/: Contains index files that map blob IDs to pack files for quick lookups.
    • keys/: Stores the encrypted master encryption keys. These are unlocked by your repository password.
    • locks/: Used by Restic to manage concurrent access and prevent repository corruption.
    • snapshots/: Stores individual snapshot files (metadata about each backup).

    You generally don't need to interact with these files and directories directly. Restic manages them for you. Never manually delete or modify files within a Restic repository unless you know exactly what you are doing, as this can lead to data loss.
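
As a read-only sanity check, you can test whether a directory has the layout listed above without touching its contents. The sketch below is a heuristic for local repositories only; looks_like_restic_repo is a hypothetical helper name, and the check says nothing about whether the data inside is intact.

```shell
#!/bin/sh
# Heuristic: does a directory have the layout of a local Restic repository?
# It only tests for the entries listed above and never reads or modifies any data.
looks_like_restic_repo() {
  local repo sub
  repo=$1
  [ -f "$repo/config" ] || return 1
  for sub in data index keys locks snapshots; do
    [ -d "$repo/$sub" ] || return 1
  done
  return 0
}

repo_dir=${1:-"$HOME/restic_workshop/my_local_repo"}
if looks_like_restic_repo "$repo_dir"; then
  echo "$repo_dir has the expected repository layout"
else
  echo "$repo_dir is not an initialized local Restic repository"
fi
```

A check like this can be useful in scripts to decide whether restic init still needs to run.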

Making Your First Backup

Once the repository is initialized, you can start backing up your data.

  • The restic backup command:
    The basic syntax is restic -r /path/to/repo backup /path/to/your/data [another/path ...]. You will be prompted for your repository password.

  • Selecting Files and Directories to Back Up:
    Let's create some sample data to back up.

    • On Linux/macOS (inside ~/restic_workshop):
      mkdir ~/restic_workshop/data_to_backup
      echo "This is file1.txt in the root of our backup." > ~/restic_workshop/data_to_backup/file1.txt
      mkdir ~/restic_workshop/data_to_backup/project_alpha
      echo "Alpha project details." > ~/restic_workshop/data_to_backup/project_alpha/readme.md
      echo "Some important notes for Alpha." > ~/restic_workshop/data_to_backup/project_alpha/notes.txt
      
    • On Windows (PowerShell, inside ~\restic_workshop):
      mkdir ~\restic_workshop\data_to_backup
      Set-Content -Path ~\restic_workshop\data_to_backup\file1.txt -Value "This is file1.txt in the root of our backup."
      mkdir ~\restic_workshop\data_to_backup\project_alpha
      Set-Content -Path ~\restic_workshop\data_to_backup\project_alpha\readme.md -Value "Alpha project details."
      Set-Content -Path ~\restic_workshop\data_to_backup\project_alpha\notes.txt -Value "Some important notes for Alpha."
      

    Now, let's back up the data_to_backup directory:

    • Linux/macOS:
      restic -r ~/restic_workshop/my_local_repo backup ~/restic_workshop/data_to_backup
      
    • Windows (PowerShell):
      restic -r "$env:USERPROFILE\restic_workshop\my_local_repo" backup "$env:USERPROFILE\restic_workshop\data_to_backup"
      
      You will be prompted for the repository password you set earlier.
  • Understanding Tags for Organizing Backups:
    You can add tags to your snapshots to help organize and identify them later. This is useful if you back up different types of data or from different sources to the same repository. Use the --tag option (can be specified multiple times). Example:

    # Linux/macOS
    restic -r ~/restic_workshop/my_local_repo backup --tag project --tag important ~/restic_workshop/data_to_backup/project_alpha
    
    This would create a new snapshot specifically for project_alpha and tag it accordingly.

  • Observing the Backup Process (Output, Progress):
    When Restic runs a backup, it provides output:

    enter password for repository:
    repository <some_id> opened successfully, password is correct
    found 1 previous snapshots
    scan [/home/user/restic_workshop/data_to_backup]
    scanned 3 files, 63B in 0:00
    [0:00] 100.00%  3 / 3 files  63 B / 63 B  0s  done
    duration: 0:00, 0.02MiB/s
    snapshot <snapshot_id_1> saved
    
    Recent Restic versions print a summary in this form instead (on a repeat backup of unchanged data, the counts move from "new" to "unmodified"):
    Files:           3 new,     0 changed,     0 unmodified
    Dirs:            1 new,     0 changed,     0 unmodified
    Added to the repository: 1.148 KiB (1.118 KiB stored)
    processed 3 files, 63 B in 0:00
    snapshot <snapshot_id_2> saved
    
    Key information in the output:

    • scan [...]: Restic scans the specified paths.
    • Files: X new, Y changed, Z unmodified: Shows how many files were new, modified since the last backup (of these paths), or unchanged. This highlights Restic's deduplication; unchanged files aren't re-processed or re-stored.
    • Added to the repository: X size (Y size stored): The first figure is the amount of new data after deduplication; the "stored" figure is what was actually written to the repository after compression. Repositories in format version 2 (the default for new repositories since Restic 0.14) compress data with zstd; older version 1 repositories store it uncompressed.
    • snapshot <snapshot_id> saved: Indicates success and gives you the ID of the newly created snapshot.
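
Because --tag can be repeated, a small wrapper can turn a list of tag names into the matching options. The following is a sketch for Linux/macOS shells; backup_with_tags and the RESTIC_BIN variable are hypothetical names introduced for illustration, and only the -r, backup, and --tag arguments shown earlier are used.

```shell
#!/bin/sh
# Usage: backup_with_tags REPO SOURCE [TAG...]
backup_with_tags() {
  local repo src tag
  repo=$1
  src=$2
  shift 2
  # Rewrite the remaining arguments from "a b" into "--tag a --tag b".
  for tag in "$@"; do
    set -- "$@" --tag "$tag"
    shift
  done
  # RESTIC_BIN is a hypothetical override used here for dry runs; it is not a Restic setting.
  "${RESTIC_BIN:-restic}" -r "$repo" backup "$@" "$src"
}

# Dry run: substitute echo for restic to see the arguments that would be passed.
RESTIC_BIN=echo backup_with_tags /path/to/repo /path/to/data project important
# prints: -r /path/to/repo backup --tag project --tag important /path/to/data
```

Without RESTIC_BIN set, the same call runs the real restic binary and prompts for the repository password as usual.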

Listing and Inspecting Snapshots

After making backups, you'll want to see what snapshots are stored in your repository.

  • The restic snapshots command:
    This command lists all snapshots in the repository.

    # Linux/macOS
    restic -r ~/restic_workshop/my_local_repo snapshots
    # Windows (PowerShell)
    restic -r "$env:USERPROFILE\restic_workshop\my_local_repo" snapshots
    
    You'll be prompted for the repository password. The output will look something like this:
    enter password for repository:
    repository <some_id> opened successfully, password is correct
    ID        Time                 Host        Tags        Paths
    ----------------------------------------------------------------------------------------------------
    a1b2c3d4  2023-10-27 10:00:00  my-laptop              /home/user/restic_workshop/data_to_backup
    e5f6a7b8  2023-10-27 10:05:00  my-laptop   project    /home/user/restic_workshop/data_to_backup/project_alpha
                                              important
    ----------------------------------------------------------------------------------------------------
    2 snapshots
    

  • Interpreting Snapshot Information:

    • ID: A unique short identifier for the snapshot (e.g., a1b2c3d4). You use this ID to refer to the snapshot for restoring, browsing, or deleting.
    • Time: The date and time the backup operation started.
    • Host: The hostname of the machine where the backup was created. This is useful if you back up multiple machines to the same repository.
    • Tags: Any tags you applied to the snapshot during backup.
    • Paths: The original path(s) on the source machine that were included in this backup.
  • The restic ls <snapshot_id> command:
    To see the file and directory structure within a specific snapshot, use the ls command with the snapshot ID. You can use the short ID from the snapshots command.

    # Linux/macOS - replace a1b2c3d4 with an actual ID from your output
    restic -r ~/restic_workshop/my_local_repo ls a1b2c3d4
    
    Output:
    enter password for repository:
    repository <some_id> opened successfully, password is correct
    snapshot a1b2c3d4 of [/home/user/restic_workshop/data_to_backup] filtered by []
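    /home
    /home/user
    /home/user/restic_workshop
    /home/user/restic_workshop/data_to_backup
    /home/user/restic_workshop/data_to_backup/file1.txt
    /home/user/restic_workshop/data_to_backup/project_alpha
    /home/user/restic_workshop/data_to_backup/project_alpha/notes.txt
    /home/user/restic_workshop/data_to_backup/project_alpha/readme.md
    
    Restic records the full original path of everything it backs up, so entries appear under /home/user/... rather than relative to data_to_backup. You can also list the contents of a single directory within the snapshot by passing its full path:
    # Linux/macOS
    restic -r ~/restic_workshop/my_local_repo ls a1b2c3d4 /home/user/restic_workshop/data_to_backup/project_alpha
    
    Output:
    enter password for repository:
    repository <some_id> opened successfully, password is correct
    snapshot a1b2c3d4 of [/home/user/restic_workshop/data_to_backup] filtered by [/home/user/restic_workshop/data_to_backup/project_alpha]
    /home/user/restic_workshop/data_to_backup/project_alpha
    /home/user/restic_workshop/data_to_backup/project_alpha/notes.txt
    /home/user/restic_workshop/data_to_backup/project_alpha/readme.md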
    
    The ls command is very useful for verifying that the files you intended to back up are indeed present in the snapshot.
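
Checking a listing by eye gets tedious once a snapshot holds more than a handful of files. The sketch below reads a listing on standard input and reports any expected file that is missing; verify_listing is a hypothetical helper name, and it matches on path suffixes so it does not depend on the directory prefix stored in the snapshot. The sample listing is made up for the dry run.

```shell
#!/bin/sh
# Read a snapshot listing on stdin and confirm that every expected path suffix appears.
# Usage: restic -r REPO ls SNAPSHOT_ID | verify_listing file1.txt project_alpha/notes.txt
verify_listing() {
  local listing want line found missing
  listing=$(cat)
  missing=0
  for want in "$@"; do
    found=no
    while IFS= read -r line; do
      case "$line" in
        */"$want") found=yes; break ;;
      esac
    done <<EOF
$listing
EOF
    if [ "$found" = no ]; then
      echo "missing from snapshot: $want" >&2
      missing=1
    fi
  done
  return "$missing"
}

# Dry run against a sample listing instead of a real repository.
sample='/home/user/restic_workshop/data_to_backup/file1.txt
/home/user/restic_workshop/data_to_backup/project_alpha
/home/user/restic_workshop/data_to_backup/project_alpha/notes.txt
/home/user/restic_workshop/data_to_backup/project_alpha/readme.md'
printf '%s\n' "$sample" | verify_listing file1.txt project_alpha/notes.txt && echo "all expected files present"
# prints: all expected files present
```

The function returns a non-zero status when anything is missing, so it can gate later steps in a backup script.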

Workshop Your First Backup and Restore

This workshop will put the theory into practice. You'll create sample data, back it up, simulate data loss, and then restore your data.

Goals:

  1. Initialize a Restic repository.
  2. Create sample data.
  3. Back up the sample data to the repository.
  4. List snapshots to verify the backup.
  5. Simulate data loss by deleting the original data.
  6. Restore the data from the Restic snapshot.
  7. Verify the integrity of the restored data.

Prerequisites:

  • Restic installed.
  • A terminal or command prompt.

Steps:

  1. Navigate to Your Workshop Directory:

    • Linux/macOS:
      cd ~/restic_workshop
      
    • Windows (PowerShell):
      cd ~\restic_workshop
      
  2. Define Repository Path (for convenience in this workshop): To avoid typing the full repository path repeatedly, you can set an environment variable for your current terminal session.

    • Linux/macOS:
      # If my_local_repo doesn't exist, create it: mkdir my_local_repo
      export RESTIC_REPOSITORY=~/restic_workshop/my_local_repo
      # For this workshop, also set a password file to avoid typing the password repeatedly
      # THIS IS FOR DEMONSTRATION. SECURE YOUR PASSWORD FILE APPROPRIATELY IN REAL SCENARIOS.
      echo "your_super_secret_password" > ~/restic_workshop/mypass.txt
      chmod 600 ~/restic_workshop/mypass.txt # Restrict permissions
      export RESTIC_PASSWORD_FILE=~/restic_workshop/mypass.txt
      
    • Windows (PowerShell):
      # If my_local_repo doesn't exist, create it: mkdir my_local_repo
      $env:RESTIC_REPOSITORY = "$env:USERPROFILE\restic_workshop\my_local_repo"
      # For this workshop, also set a password file
      # THIS IS FOR DEMONSTRATION. SECURE YOUR PASSWORD FILE APPROPRIATELY IN REAL SCENARIOS.
      Set-Content -Path "$env:USERPROFILE\restic_workshop\mypass.txt" -Value "your_super_secret_password"
      $env:RESTIC_PASSWORD_FILE = "$env:USERPROFILE\restic_workshop\mypass.txt"
      
      Important: Replace "your_super_secret_password" with the actual password you want to use. If you use a password file, ensure it's protected and not committed to version control if your workshop directory is a git repo. For production, consider more secure ways to handle passwords for automation (like systemd credentials or other secrets management tools).
  3. Initialize the Restic Repository (if not already done): If my_local_repo is empty or doesn't exist, restic init will create the repository there, using the password from RESTIC_PASSWORD_FILE.

    restic init
    
    If the directory already holds a repository, init refuses to overwrite it and fails with an error saying the config file already exists. In that case, either make sure mypass.txt contains the password that repository was created with, or start fresh. For this workshop, if my_local_repo exists from previous steps, you can remove it and re-create it:

    • Linux/macOS: rm -rf my_local_repo; mkdir my_local_repo
    • Windows: Remove-Item -Recurse -Force my_local_repo; mkdir my_local_repo

    Then run restic init again.
  4. Create Sample Data: We'll create a directory named source_data with a few files and subdirectories.

    • Linux/macOS:
      mkdir source_data
      echo "Hello Restic World!" > source_data/greeting.txt
      echo "This is an important document." > source_data/important_doc.md
      mkdir source_data/my_photos
      echo "image_data_1" > source_data/my_photos/photo1.jpg
      echo "image_data_2" > source_data/my_photos/photo2.png
      tree source_data # (if you have 'tree' installed, otherwise 'ls -R source_data')
      
    • Windows (PowerShell):
      mkdir source_data
      Set-Content -Path source_data\greeting.txt -Value "Hello Restic World!"
      Set-Content -Path source_data\important_doc.md -Value "This is an important document."
      mkdir source_data\my_photos
      Set-Content -Path source_data\my_photos\photo1.jpg -Value "image_data_1"
      Set-Content -Path source_data\my_photos\photo2.png -Value "image_data_2"
      Get-ChildItem -Recurse source_data
      
  5. Back Up the Sample Data: Now, back up the source_data directory. We'll add a tag workshop1.

    restic backup ./source_data --tag workshop1
    
    You should see output indicating the files scanned and the snapshot ID created. Because RESTIC_REPOSITORY and RESTIC_PASSWORD_FILE are set, you shouldn't be prompted for them.

  6. List Snapshots: Verify that the backup was created:

    restic snapshots
    
    You should see your snapshot listed, with the tag workshop1 and the path of source_data (Restic typically records the backup path in absolute form). Note its short ID.

  7. Simulate Data Loss: Let's pretend a disaster happened and your original source_data directory is gone. Be careful with rm -rf or Remove-Item -Recurse -Force! Make sure you are in the correct directory (~/restic_workshop or ~\restic_workshop) and are deleting source_data.

    • Linux/macOS:
      rm -rf ./source_data
      ls ./source_data # This should give an error "No such file or directory"
      
    • Windows (PowerShell):
      Remove-Item -Recurse -Force ./source_data
      Get-ChildItem ./source_data # This should give an error
      
      Your original data is now "lost".
  8. Restore the Data: Now, restore the data from the Restic snapshot. You can use the snapshot ID you noted earlier, or use the special identifier latest to refer to the most recent snapshot in the repository (here, the one that backed up ./source_data). We'll restore it to a new directory called restored_data to avoid any ambiguity.

    # Replace 'latest' with your specific snapshot ID if you prefer
    # The --target option specifies where to restore the files.
    restic restore latest --target ./restored_data
    
    Restic will output the progress of the restore.

  9. Verify Restored Data: Check if the restored_data directory contains your original files and structure.

    • Linux/macOS:
      ls -R ./restored_data
      # For a more thorough check, compare file contents if you wish:
      diff -r ./restored_data/source_data ./source_data_original_reference # (if you made a copy before deleting)
      # Or simply check contents:
      cat ./restored_data/source_data/greeting.txt
      cat ./restored_data/source_data/my_photos/photo1.jpg
      
    • Windows (PowerShell):
      Get-ChildItem -Recurse ./restored_data
      # Or simply check contents:
      Get-Content ./restored_data/source_data/greeting.txt
      Get-Content ./restored_data/source_data/my_photos/photo1.jpg
      
      You should see that restored_data contains a subdirectory named source_data (because you backed up the source_data directory itself), and within that, all your original files and subdirectories. For example, you'll have restored_data/source_data/greeting.txt.

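    If you wanted to restore directly to the original location (e.g., if source_data was gone or empty), you would point --target at the directory that should contain source_data. Because this snapshot was made from a relative path, its files are stored under /source_data/..., so running the restore from ~/restic_workshop with --target . recreates ./source_data. Restoring with --target / is meant for snapshots of absolute paths, where the snapshot holds the full original path. Restoring to a separate directory first is often safer.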
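The verification in step 9 can be made more systematic by recording checksums before the simulated data loss and checking them after the restore. The following is a minimal sketch, assuming GNU coreutils (sha256sum) is available; make_manifest and verify_manifest are hypothetical helper names for this workshop, not Restic commands.

```shell
# make_manifest DIR: print one "hash  ./relative/path" line per file in DIR.
make_manifest() {
    ( cd "$1" && find . -type f -exec sha256sum {} + | sort -k 2 )
}

# verify_manifest DIR MANIFEST: succeed only if every file listed in
# MANIFEST exists under DIR with the same checksum.
verify_manifest() {
    # Resolve the manifest to an absolute path before changing directory.
    manifest_abs="$(cd "$(dirname "$2")" && pwd)/$(basename "$2")"
    ( cd "$1" && sha256sum -c "$manifest_abs" > /dev/null 2>&1 )
}

# Usage in the workshop (run from ~/restic_workshop):
#   make_manifest ./source_data > source_data.sha256      # before step 7
#   verify_manifest ./restored_data/source_data source_data.sha256 \
#       && echo "restore matches"                          # after step 8
```

Unlike diff -r, this does not require keeping a second copy of the data around, only the small manifest file.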

Congratulations! You've successfully initialized a Restic repository, backed up data, simulated data loss, and restored your data. This covers the fundamental lifecycle of a backup. In the next section, we'll explore restore operations in more detail.

2. Basic Restore Operations

Having successfully created backups, the next crucial step is understanding how to retrieve your data. Restic offers flexible ways to restore files and directories, not just entire backups. You can also mount your repository as a filesystem for easy browsing and retrieval of individual files.

Restoring Files and Directories

The primary command for restoring data is restic restore. We used it in the previous workshop to restore an entire snapshot. Let's delve deeper into its capabilities.

  • The restic restore <snapshot_id> --target <path> command: As a reminder, this is the fundamental restore command.

    • <snapshot_id>: Can be the full ID, a short unique prefix of the ID, or special identifiers like latest (the most recent snapshot overall).
    • --target <path>: Specifies the directory where Restic will place the restored files and directories. Restic will re-create the directory structure from the snapshot within this target path.
      • For example, if snapshot abc123de backed up /home/user/documents, and you run restic restore abc123de --target /tmp/restore_here, the files will be restored to /tmp/restore_here/home/user/documents/. This full path restoration is the default.
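      • If you only want the contents of /home/user/documents to appear directly in /tmp/restore_here, you can restore and then move the files, or, in Restic 0.16 and later, select a subfolder of the snapshot with the snapshot:subfolder syntax (e.g., restic restore abc123de:/home/user/documents --target /tmp/restore_here). More on --include below.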
  • Restoring to a Different Location: This is the standard behavior when using --target. It's highly recommended to restore to a temporary or different location first, especially if the original location still exists, to avoid accidentally overwriting more recent files. You can then inspect the restored data and manually copy over what you need.

  • Restoring Specific Files/Folders from a Snapshot using --include and --exclude: Often, you don't need to restore an entire snapshot; you might just need a single file or directory. Restic allows this using the --include and --exclude flags. These flags filter the files from the snapshot that will be restored.

    • --include <pattern>: Only restore files and directories matching the pattern. You can use this multiple times. Patterns are matched against the full path within the snapshot.

      • Example: To restore only the project_alpha directory from a snapshot of source_data (let's assume its ID is a1b2c3d4 and that the backed-up ./source_data contained file1.txt and project_alpha/):

        # Assuming RESTIC_REPOSITORY and RESTIC_PASSWORD_FILE are set
        # And snapshot a1b2c3d4 contains /source_data/project_alpha/
        # We want to restore project_alpha into ./tmp_restore_specific
        mkdir ./tmp_restore_specific
        restic restore a1b2c3d4 --target ./tmp_restore_specific --include "/source_data/project_alpha"
        
        This would create ./tmp_restore_specific/source_data/project_alpha/.... If you want project_alpha directly in tmp_restore_specific without the leading /source_data path from the snapshot, you'd typically restore with --include and then move tmp_restore_specific/source_data/project_alpha to tmp_restore_specific/project_alpha. Alternatively, the restic dump command (more advanced, less common for simple restores) writes a single file from a snapshot to stdout, or a whole directory as an archive, without recreating the leading directories. For most cases, restoring with --include and then moving is sufficient. A common pattern is:
        restic restore <snapshot_id> --target /tmp/restore_job --include "/path/in/snapshot/to/desired_folder"
        # Now, /tmp/restore_job will contain /path/in/snapshot/to/desired_folder/...
        # You might then move /tmp/restore_job/path/in/snapshot/to/desired_folder to your final destination.
        

      • Example: To restore a single file, notes.txt, from within project_alpha:

        restic restore a1b2c3d4 --target ./tmp_restore_specific --include "/source_data/project_alpha/notes.txt"
        
        This would create ./tmp_restore_specific/source_data/project_alpha/notes.txt.

    • --exclude <pattern>: Exclude files and directories matching the pattern from the restore.

      • Example: Restore everything from source_data except the my_photos subdirectory:
        restic restore latest --target ./tmp_restore_no_photos --include "/source_data" --exclude "/source_data/my_photos"
        
    • Path Specificity: The paths used in --include and --exclude for restore refer to the paths as stored in the snapshot. Use restic ls <snapshot_id> to confirm the exact paths.
    • Using latest with filters: latest always means the most recent snapshot, optionally narrowed by the --host, --path, and --tag filters. The --include flag only filters which files are restored from the snapshot that was selected; it does not influence which snapshot is chosen. If you backed up /foo/bar and /foo/baz in separate restic backup commands, they create distinct snapshots, and restic restore latest --target /tmp/restore_output --include /foo/bar restores nothing when the newest snapshot is the one of /foo/baz. To select the latest snapshot that backed up /foo/bar, add --path /foo/bar. In Restic 0.16 and later, a suffix such as latest:/foo/bar does something different: it restores only that subfolder of the selected snapshot, placing its contents directly in the target.
      # Restore /source_data/project_alpha from the most recent snapshot in the repository
      # (restores nothing if that snapshot does not contain the path):
      restic restore latest --target ./target_dir --include "/source_data/project_alpha"
      
      # To restore from the latest snapshot *of* source_data, filter by the backup path.
      # --path matches the path recorded in the snapshot (see the Paths column of
      # 'restic snapshots'):
      restic restore latest --path /path/to/original/source_data --target ./target_dir --include "/source_data/project_alpha"
      
      It's often simpler to identify the specific snapshot ID using restic snapshots with appropriate filters (--host, --tag, --path) and then use that ID for the restore.
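The restore-then-move pattern described above can be wrapped in a small helper. This is a sketch under stated assumptions: flatten_restore is a hypothetical function name, and it expects that restic restore has already been run with --target and --include for the given snapshot path.

```shell
# flatten_restore TARGET SNAPSHOT_PATH
# Move the restored item up so it sits directly inside TARGET, then remove
# the now-empty leading directories.
flatten_restore() {
    target=$1
    inner=${2#/}                       # e.g. source_data/project_alpha
    name=$(basename "$inner")
    parent=$(dirname "$inner")
    [ "$parent" = "." ] && return 0    # already at the top level

    # Park the item under a temporary name first, in case its name collides
    # with one of the leading directories.
    mv "$target/$inner" "$target/.flatten.$$" || return 1

    # rmdir -p only removes empty directories, so any other restored data
    # under the same parents is left alone.
    ( cd "$target" && rmdir -p "$parent" 2> /dev/null )

    if [ -e "$target/$name" ]; then
        echo "cannot flatten: $target/$name exists; data left in $target/.flatten.$$" >&2
        return 1
    fi
    mv "$target/.flatten.$$" "$target/$name"
}

# Usage:
#   restic restore a1b2c3d4 --target ./tmp_restore_specific --include "/source_data/project_alpha"
#   flatten_restore ./tmp_restore_specific /source_data/project_alpha
```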

Mounting Repositories (Read-Only Access)

Restic provides a fantastic feature to mount your repository (or specific snapshots) as a read-only filesystem. This allows you to browse your backups using your system's file explorer or command-line tools as if they were regular directories.

  • The restic mount <mount_point> command:

    • <mount_point>: An empty directory on your system where the Restic filesystem will be mounted.
    • Example:
      # Create a mount point directory (if it doesn't exist)
      mkdir ~/restic_mount
      # Mount the repository
      restic mount ~/restic_mount
      # (You'll be prompted for the password if RESTIC_PASSWORD_FILE/env var isn't set)
      
    • Once mounted, ~/restic_mount will contain several directories:
      • hosts/: Snapshots organized by hostname.
      • ids/: Snapshots accessible directly by their full ID.
      • snapshots/: Snapshots listed by their creation date and time (e.g., 2023-10-27T10:00:00Z_<shortid>). This is often the most convenient way to browse.
      • tags/: Snapshots organized by tags.
    • You can navigate these directories, view file contents, and copy files out. The filesystem is read-only, so you cannot accidentally modify your backups.
    • To unmount, press Ctrl+C in the terminal where restic mount is running, or use the standard unmount command for your OS (e.g., umount ~/restic_mount on Linux, or diskutil unmount ~/restic_mount on macOS). If the mount gets stuck on Linux, fusermount -u ~/restic_mount might be needed.
  • Prerequisites (FUSE): This feature relies on FUSE (Filesystem in Userspace).

    • Linux: You need to install fuse (or fuse3). The package name might be fuse, fuse-utils, fuse3.
      # Debian/Ubuntu
      sudo apt install fuse3
      # Fedora
      sudo dnf install fuse3
      # Arch
      sudo pacman -S fuse3
      
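      On some (mostly older) distributions you might also need to add your user to the fuse group and log out/in: sudo usermod -aG fuse $USER. If that group does not exist on your system, this step is not needed.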
    • macOS: You need to install "macFUSE" (formerly osxfuse). You can download it from its official site or install via Homebrew:
      brew install --cask macfuse
      
      It usually requires a system extension and a reboot.
    • Windows: According to the official Restic documentation, mounting a repository via FUSE is only possible on Linux, macOS, and FreeBSD, so restic mount is not available in the official Windows builds.
      • On Windows, browse a snapshot with restic ls <snapshot_id> and retrieve files with restic restore --include or restic dump instead.
  • Security Implications and Use Cases:

    • Read-Only: The mount is inherently read-only, which is good for safety.
    • Convenience: Excellent for quickly grabbing a few files, verifying backup contents visually, or comparing versions of a file across different snapshots.
    • Performance: Accessing files through the FUSE mount can be slower than a direct restic restore, as data needs to be fetched, decrypted, and assembled on the fly. It's not ideal for restoring very large amounts of data or for performance-critical access.
    • Resource Usage: The restic mount process will consume resources while active. Remember to unmount when done.
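Comparing versions of a file across snapshots, as mentioned above, can be scripted against the mounted filesystem. The following sketch assumes the snapshots/ layout described earlier and GNU sha256sum; list_versions is a hypothetical helper name.

```shell
# list_versions MOUNT_POINT PATH_IN_SNAPSHOT
# Print "<sha256>  <snapshot directory>" for every snapshot under
# MOUNT_POINT/snapshots that contains the file.
list_versions() {
    for snap in "$1"/snapshots/*/; do
        snap=${snap%/}
        [ -L "$snap" ] && continue     # skip symlinks such as 'latest'
        file="$snap/${2#/}"
        if [ -f "$file" ]; then
            printf '%s  %s\n' \
                "$(sha256sum < "$file" | cut -d ' ' -f 1)" \
                "$(basename "$snap")"
        fi
    done
}

# Usage (with the repository mounted at ~/restic_mount):
#   list_versions ~/restic_mount /source_data/greeting.txt
```

Snapshots that show the same checksum hold an identical copy of the file; a change in the checksum marks the snapshot where the file was modified.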

Understanding Restore Conflicts

What happens if you try to restore files to a target directory where files with the same names already exist?

  • Restic's Default Behavior: By default, restic restore will overwrite existing files at the target location if they have the same name and path as files being restored from the snapshot. By default it does not compare modification times or ask for confirmation; it simply lays down the snapshot's version of each file.

    • If a file exists in the target but not in the snapshot (at that path), it will be left untouched.
    • If a directory needs to be created for a restored file, Restic creates it.
  • Best Practices for Restoring to Avoid Data Loss:

    1. Restore to a New, Empty Directory: This is the safest approach.
      mkdir /tmp/my_restore_area
      restic restore <snapshot_id> --target /tmp/my_restore_area
      
      Then, inspect /tmp/my_restore_area and manually copy or move the needed files to their final destinations, handling any potential conflicts yourself (e.g., by renaming existing files or using tools like rsync with care).
    2. Verify After Restoring: restic restore accepts a --verify flag that re-reads the restored files and checks their content against the snapshot. In addition, regularly run restic check --read-data (or a sample thereof) to ensure your repository is in good shape. A corrupted repository can lead to failed restores.
    3. Be Specific with --include: If you only need specific files, use --include to limit what gets restored, reducing the chance of unintended overwrites.
    4. Understand Your Snapshot Contents: Use restic ls <snapshot_id> to know exactly what files are in the snapshot and their paths before initiating a restore to a populated area.

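    Older Restic releases have no --dry-run option for restic restore, so the primary safety mechanism there is restoring to a temporary location. Restic 0.17 and later add restore --dry-run (combine it with --verbose to list what would be written) as well as an --overwrite option that controls when existing files are replaced; check restic restore --help for what your version supports.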
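When restoring to a temporary location first, the conflicts you would have to handle can be listed before anything is copied. This is a minimal sketch that works on plain directories and does not call Restic; list_conflicts is a hypothetical helper name.

```shell
# list_conflicts STAGING DEST
# Print the relative path of every file in STAGING that already exists in DEST.
list_conflicts() {
    ( cd "$1" && find . -type f | sort ) | while IFS= read -r rel; do
        rel=${rel#./}
        if [ -e "$2/$rel" ]; then
            printf '%s\n' "$rel"
        fi
    done
}

# Usage:
#   restic restore <snapshot_id> --target /tmp/my_restore_area
#   list_conflicts /tmp/my_restore_area/home/user/documents /home/user/documents
```

Any path printed here is a file that a direct restore or a plain copy would replace, so it is a candidate for renaming or manual comparison first.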

Workshop Selective Restore and Mounting

In this workshop, you'll practice restoring specific parts of a backup and use the mount feature to browse your repository. We'll use the repository and sample data from the previous workshop.

Goals:

  1. Restore a single specific file from a snapshot.
  2. Restore a specific sub-directory from a snapshot.
  3. Install FUSE/WinFsp if necessary.
  4. Mount the Restic repository.
  5. Browse the mounted repository and copy a file.
  6. Unmount the repository.

Prerequisites:

  • Restic installed.
  • The Restic repository (my_local_repo) and environment variables (RESTIC_REPOSITORY, RESTIC_PASSWORD_FILE) set up from the "Your First Backup and Restore" workshop.
  • At least one snapshot in the repository (e.g., from backing up source_data containing greeting.txt, important_doc.md, and my_photos/photo1.jpg, my_photos/photo2.png).

Steps:

  1. Verify Environment and Snapshots: Ensure you are in your restic_workshop directory.

    • Linux/macOS:
      echo $RESTIC_REPOSITORY
      echo $RESTIC_PASSWORD_FILE
      
    • Windows (PowerShell):
      Write-Host $env:RESTIC_REPOSITORY
      Write-Host $env:RESTIC_PASSWORD_FILE
      
      List snapshots to pick one. We'll refer to the snapshot of source_data made in the previous workshop. If you used the tag workshop1, you can use that. Or use latest. Note a snapshot ID. Let's assume the path backed up was ./source_data.
      restic snapshots
      # Note an ID, e.g., a1b2c3d4, or we can use 'latest' if it's the one.
      # Let's check its contents to be sure of the paths within:
      restic ls latest # (Or your specific snapshot ID)
      # Expected output (among other lines):
      # /source_data/greeting.txt
      # /source_data/important_doc.md
      # /source_data/my_photos/
      # /source_data/my_photos/photo1.jpg
      # /source_data/my_photos/photo2.png
      
      The paths above are what both restic backup ./source_data and restic backup source_data should produce, because Restic keeps the relative path components you passed to backup. If you instead ran cd source_data followed by restic backup ., the files sit at the snapshot root (/greeting.txt, /important_doc.md, etc.). Adjust the --include paths below accordingly. For consistency, this workshop will assume the paths in the snapshot are like /source_data/....
  2. Create Target Directories for Selective Restores:

    • Linux/macOS & Windows (PowerShell):
      mkdir ./selective_restore_file
      mkdir ./selective_restore_folder
      
  3. Restore a Single Specific File: Let's restore only greeting.txt from the snapshot into the selective_restore_file directory. Remember to use the correct path as it exists inside the snapshot.

    # Use 'latest' or your specific snapshot ID
    restic restore latest --target ./selective_restore_file --include "/source_data/greeting.txt"
    
    Verify:

    • Linux/macOS: ls -R ./selective_restore_file
    • Windows (PowerShell): Get-ChildItem -Recurse ./selective_restore_file

    You should see ./selective_restore_file/source_data/greeting.txt.
  4. Restore a Specific Sub-directory: Now, let's restore the entire my_photos sub-directory into selective_restore_folder.

    restic restore latest --target ./selective_restore_folder --include "/source_data/my_photos"
    
    Verify:

    • Linux/macOS: ls -R ./selective_restore_folder
    • Windows (PowerShell): Get-ChildItem -Recurse ./selective_restore_folder

    You should see ./selective_restore_folder/source_data/my_photos/ containing photo1.jpg and photo2.png.
  5. Prepare for Mounting (Install FUSE/WinFsp if needed):

    • Linux:
      # Check if fuse3 is installed, e.g., by trying 'which fusermount3' or 'fusermount3 --version'
      # If not, install:
      # sudo apt install fuse3 (Debian/Ubuntu)
      # sudo dnf install fuse3 (Fedora)
      # Potentially add user to fuse group: sudo usermod -aG fuse $USER (then log out/in)
      
    • macOS:
      # Check if macFUSE is installed.
      # If not: brew install --cask macfuse (and follow post-install instructions, possibly reboot)
      
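    • Windows: The official Restic documentation lists restic mount as available only on Linux, macOS, and FreeBSD, so there is nothing to install here. Windows users can use restic ls and restic dump to browse and extract files instead of the mount steps below.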
  6. Mount the Restic Repository: Create a mount point and mount the repository.

    • Linux/macOS:
      mkdir ./my_restic_mount
      
      Now, run the mount command. This command will keep running in your terminal until you stop it (Ctrl+C). Open a NEW terminal window/tab for the subsequent browsing commands.
      # In one terminal:
      restic mount ./my_restic_mount
      # It will say "Now serving the repository at ./my_restic_mount"
      # Keep this terminal open.
      
  7. Browse the Mounted Repository (in the NEW terminal): Navigate into the mount point and explore.

    • Linux/macOS:
      cd ./my_restic_mount
      ls
      # You should see: hosts, ids, snapshots, tags
      cd snapshots
      ls
      # You'll see one directory per snapshot (named after its timestamp) and a 'latest' symlink
      cd latest
      ls
      # You should see the 'source_data' directory (or similar, based on your backup path)
      cd source_data
      ls
      cat important_doc.md
      # Try copying a file out:
      cp ./important_doc.md ~/restic_workshop/copied_from_mount.txt # (Adjust target path if needed)
      cd ~/restic_workshop # Go back to your main workshop directory
      cat ./copied_from_mount.txt
      
    • Windows (PowerShell): According to the official documentation, restic mount is only available on Linux, macOS, and FreeBSD, so browse and extract with restic ls and restic dump instead:
      restic ls latest
      # Print a file from the snapshot:
      restic dump latest /source_data/important_doc.md
      # Extract a copy into the workshop directory (suitable for text files):
      restic dump latest /source_data/important_doc.md | Set-Content .\copied_from_mount.txt
      Get-Content .\copied_from_mount.txt
      
  8. Unmount the Repository: Go back to the terminal where restic mount is running and press Ctrl+C. This will unmount the filesystem. Verify that my_restic_mount is now empty (or no longer accessible as a special mount).

    • Linux/macOS: ls ./my_restic_mount (should be empty).

    Sometimes, especially if the mount process was interrupted uncleanly, you might need a more forceful unmount on Linux: sudo umount ./my_restic_mount or fusermount -u ./my_restic_mount.

This workshop demonstrated how to perform targeted restores and how to use the convenient mount feature for browsing and accessing files from your backups. These are essential skills for effectively managing your Restic backups.

3. Managing Your Restic Repository

As you accumulate backups, your Restic repository will grow. Proper management is essential to ensure its integrity, control its size, and maintain efficient operation. This section covers crucial maintenance tasks, provides a deeper understanding of deduplication, and introduces the use of environment variables for easier Restic operation.

Repository Maintenance

Regular maintenance keeps your repository healthy and optimized.

  • Checking Repository Integrity restic check: This is one of the most important commands for peace of mind. It verifies the integrity and consistency of your repository structure and data.

    # Assuming RESTIC_REPOSITORY and RESTIC_PASSWORD_FILE are set
    restic check
    

    • What it does (by default):
      • Verifies that all pack files listed in the index are present and their sizes match.
      • Checks that the structure of snapshots, trees, and other internal metadata is intact.
      • It does not read all data blobs by default, as this can be very time-consuming and I/O intensive for large repositories.
    • --read-data: To perform a more thorough check that involves reading all data from all pack files, decrypting it, and verifying its hash, use the --read-data flag:
      restic check --read-data
      
      This is much slower and uses more bandwidth (for remote repositories) but provides the highest assurance that your data is not corrupted at rest. It's advisable to run this periodically (e.g., weekly or monthly), perhaps during off-peak hours.
    • --read-data-subset <percent|size>: If --read-data is too slow for your entire repository, you can check a random subset of pack files.
      • restic check --read-data-subset 10% (checks 10% of the data)
      • restic check --read-data-subset 50G (checks up to 50GB of data)

      This can be a good compromise between a quick check and a full data read.
    • Output: A healthy check will report no errors found. If errors are found, it will provide details, which are critical for troubleshooting (covered in a later section).
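Besides a percentage or a size, --read-data-subset also accepts the form n/t, which splits the pack files into t groups and reads group n. That makes it possible to cover the whole repository over several runs. The sketch below picks the group from the ISO week number; subset_spec and rotating_check are hypothetical helper names, and the default of 12 groups is an arbitrary assumption.

```shell
# subset_spec WEEK TOTAL: map a week number (01-53) to "n/TOTAL",
# cycling through 1..TOTAL.
subset_spec() {
    week=${1#0}                        # "08" -> "8" (avoid octal parsing)
    printf '%s/%s\n' "$(( (week - 1) % $2 + 1 ))" "$2"
}

# rotating_check [TOTAL]: read one group of pack files, chosen by ISO week.
rotating_check() {
    restic check --read-data-subset "$(subset_spec "$(date +%V)" "${1:-12}")"
}

# Usage (e.g., from a weekly scheduled job):
#   rotating_check 12
```

With 12 groups and a weekly schedule, every pack file is read roughly once per quarter.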
  • Pruning Old Snapshots restic forget and restic prune: Over time, you'll accumulate many snapshots. While Restic's deduplication is efficient, the metadata itself (snapshot files, tree objects) can consume space, and you might want to enforce a retention policy (e.g., only keep daily backups for a month, weekly for a year, etc.). This is a two-step process:

    1. restic forget [options]: This command decides which snapshots to remove based on a retention policy you define. It doesn't delete file data immediately; it only removes the snapshot records themselves. The actual data blobs that are no longer referenced by any kept snapshot remain in the repository until prune is run.
      • Retention Policies: The --keep-* options are powerful:
        • --keep-last <n>: Keep the last n snapshots.
        • --keep-hourly <n>: Keep the last n hourly snapshots (one per hour).
        • --keep-daily <n>: Keep the last n daily snapshots (one per day).
        • --keep-weekly <n>: Keep the last n weekly snapshots.
        • --keep-monthly <n>: Keep the last n monthly snapshots.
        • --keep-yearly <n>: Keep the last n yearly snapshots.
        • --keep-tag <tag>: Keep all snapshots that carry the given tag, regardless of the other rules. Can be specified multiple times.
        • --group-by <host,paths,tags>: Group snapshots before applying policies (e.g., apply --keep-daily 7 per host).
      • --dry-run or -n: Crucial for testing! This shows what forget would do without actually removing anything. Always use --dry-run first to verify your policy.
        restic forget --dry-run --keep-daily 7 --keep-weekly 4 --keep-monthly 6 --keep-yearly 1
        
      • Example of actually forgetting:
        restic forget --keep-daily 7 --keep-weekly 4 --keep-monthly 6
        
    2. restic prune: After forget has marked snapshots for deletion, prune actually removes the underlying data blobs that are no longer referenced by any remaining snapshot. It also repacks data to consolidate smaller pack files and remove any internal fragmentation, potentially freeing up significant space.

      restic prune
      

      • Important: prune can be a resource-intensive operation (CPU, I/O, memory), especially on large repositories. It needs to rebuild the index and potentially rewrite large amounts of data. Plan to run it during off-peak hours.
      • prune creates new pack files and removes old ones. This means it temporarily needs extra space in the repository to hold both old and new data during the operation.
      • If prune is interrupted, the repository remains in a consistent state, but some garbage data might not have been cleaned. You can simply run prune again.
    3. Understanding Repacking (now implicit): In modern Restic, prune handles repacking implicitly and efficiently, so routine maintenance needs no separate repacking step. There does not appear to be a standalone restic repack command in current releases; instead, how aggressively prune repacks can be tuned with options such as --max-unused and --max-repack-size (see restic prune --help). For routine maintenance, a plain prune is generally sufficient.
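The retention rules and the dry-run-first workflow described above can be sketched as two small shell functions. keep_daily is only a simplified model of what a --keep-daily rule selects (it ignores time zones and the other --keep-* rules), and apply_retention, along with its CONFIRM variable, is a hypothetical wrapper convention rather than anything built into Restic.

```shell
# keep_daily N: read snapshot timestamps (ISO 8601, one per line) on stdin and
# print the ones a '--keep-daily N' style rule would keep: the newest snapshot
# of each day, for the N most recent days that have snapshots.
keep_daily() {
    sort -r | awk '!seen[substr($0, 1, 10)]++' | head -n "$1"
}

# apply_retention: preview the policy, and only forget and prune when
# CONFIRM=yes is set in the environment.
apply_retention() {
    # The policy flags are defined once so the preview and the real run match.
    set -- --keep-daily 7 --keep-weekly 4 --keep-monthly 6
    restic forget --dry-run "$@" || return 1
    if [ "${CONFIRM:-no}" != "yes" ]; then
        echo "Dry run only. Re-run with CONFIRM=yes to forget and prune."
        return 0
    fi
    # --prune runs prune right after forgetting, in one command.
    restic forget --prune "$@"
}

# Usage:
#   apply_retention                 # preview only
#   CONFIRM=yes apply_retention     # forget, then prune
```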

  • Rebuilding the Index restic rebuild-index: The index files in a Restic repository map data blob IDs to the pack files where they are stored. If these index files become corrupted or are missing, Restic cannot efficiently find data. restic rebuild-index scans all pack files and rebuilds the index from scratch.

    restic rebuild-index
    
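    This is usually not needed for routine maintenance unless restic check reports index-related errors or Restic itself suggests it. prune also rebuilds the index. Note that newer Restic releases (0.16 and later) name this command restic repair index; check restic help on your version.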

  • Understanding Lock Files and Handling Stale Locks restic unlock: Restic uses lock files in the repository (locks/ directory) to prevent multiple Restic processes from modifying the repository simultaneously, which could lead to corruption.

    • When a Restic operation (like backup, prune, forget) starts, it creates a lock. When it finishes cleanly, it removes the lock.
    • If Restic crashes, is killed abruptly, or a network connection to a remote repository drops, a lock file might be left behind. This is called a "stale lock."
    • If a stale lock exists, subsequent Restic commands (especially write operations) will fail with an error message indicating the repository is locked.
    • restic list locks: Lists the IDs of all lock files currently in the repository.
      restic list locks
      
    • restic unlock: Removes locks that Restic considers stale (for example, locks left behind by processes that no longer exist).
    • restic unlock --remove-all: Removes all locks. Use with caution! Only do this if you are certain no other Restic process is legitimately using the repository. If you remove a lock that's actively in use by another Restic instance, you risk repository corruption.
      # Best practice:
      # 1. Check for running restic processes (on every machine that uses this repository).
      # 2. If none, list locks:
      restic list locks
      # 3. If stale locks are present, remove them
      #    (add --remove-all only if plain 'unlock' leaves locks you are sure are dead):
      restic unlock
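The "check for running processes first" step can be automated for the local machine. This is a sketch under stated assumptions: unlock_if_idle is a hypothetical helper name, pgrep must be available, and the check only sees processes on this host, so it cannot detect another machine that is using the same repository.

```shell
# unlock_if_idle: remove stale locks only when no restic process is running
# on this machine.
unlock_if_idle() {
    if pgrep -x restic > /dev/null 2>&1; then
        echo "A restic process is running on this machine; leaving locks alone." >&2
        return 1
    fi
    # Plain 'unlock' removes stale locks only; it is the conservative choice.
    restic unlock
}

# Usage:
#   unlock_if_idle && restic backup ./source_data
```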
      

Understanding Deduplication in Depth

We've mentioned deduplication, but let's explore how it works more deeply, as it's central to Restic's efficiency.

  • Chunking Algorithm (Content Defined Chunking - CDC): Restic doesn't just split files into fixed-size blocks (like 4MB chunks). If it did, inserting a single byte at the beginning of a large file would cause every subsequent block to change, leading to poor deduplication for that file. Instead, Restic uses Content Defined Chunking. It scans the file's content using a rolling hash function (often based on Rabin fingerprints). When the hash value matches a certain pattern or falls within a certain range (determined by a "mask"), Restic declares a chunk boundary.

    • Variable Chunk Sizes: This results in variable chunk sizes, averaging around 1 MiB but ranging from a minimum of 512 KiB to a maximum of 8 MiB. These size limits are fixed in Restic itself; what is chosen when the repository is initialized and stored in the config file is the random polynomial used by the rolling hash.
    • Robustness to Insertions/Deletions: Because chunk boundaries are determined by the content itself, if you insert or delete data in the middle of a file, only the chunks directly affected by the change (and possibly one or two adjacent chunks) will be different. Chunks far away from the modification, whose content hasn't changed, will remain identical and thus be deduplicated.
  • How Changes in Files Affect New Backups:

    1. New File: The entire file is chunked. For each chunk, Restic calculates its SHA-256 hash. It checks if a blob with this hash already exists in the repository's index.
      • If yes (chunk is a duplicate): Restic creates a pointer to the existing blob.
      • If no (new chunk): Restic optionally compresses (repository format version 2, available since Restic 0.14), encrypts, and stores the new blob in a pack file, then updates the index.
    2. Modified File: Restic re-chunks the entire modified file.
      • Unchanged parts of the file will likely produce the same chunks as before (thanks to CDC). These will be identified as duplicates.
      • Changed parts of the file will produce new chunks. These will be processed as above (check hash, store if new).
    3. Unmodified File (based on metadata): If a file's metadata (modification time, size, inode number on Unix-like systems) hasn't changed since the "parent" snapshot (Restic can quickly check this if a parent snapshot is specified or found), Restic can often skip re-reading and re-chunking the file entirely, assuming its content is also unchanged. This significantly speeds up backups of large, mostly static datasets. The --force flag can make Restic re-read all files.
  • Impact on Storage Space and Backup Speed:

    • Storage Space: Deduplication dramatically reduces storage requirements, especially for:
      • Multiple backups of the same dataset over time (only changes are stored).
      • Backups of multiple virtual machines that share common operating system files.
      • Directories with many duplicate files.
    • Backup Speed (for subsequent backups):
      • Scanning for changes can still take time, especially for many small files.
      • However, transferring data is much faster as only new/unique chunks are uploaded.
      • The client CPU does work for chunking, hashing, and encryption.
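To make the storage effect concrete, here is a minimal sketch of hash-based deduplication using standard tools. It is only an illustration under loud assumptions: it cuts files into fixed 16 KiB pieces with split, whereas Restic uses content-defined chunking with much larger chunks, and the file names and sizes are invented for the demo. It also assumes GNU coreutils (sha256sum) is available.

```shell
set -eu
work=$(mktemp -d)

# Two demo files: b.bin is a copy of a.bin with a few bytes appended.
head -c 65536 /dev/urandom > "$work/a.bin"
cp "$work/a.bin" "$work/b.bin"
printf 'tail change' >> "$work/b.bin"

# Fixed-size chunking (an assumption of this sketch, NOT what Restic does).
split -b 16384 "$work/a.bin" "$work/a.chunk."
split -b 16384 "$work/b.bin" "$work/b.chunk."

# Each chunk is identified by its SHA-256 hash; identical hashes are stored once.
total=$(find "$work" -name '*.chunk.*' | wc -l)
unique=$(find "$work" -name '*.chunk.*' -exec sha256sum {} + | cut -d' ' -f1 | sort -u | wc -l)

echo "chunks produced: $total, chunks that need storing: $unique"
rm -rf "$work"
```

Because the change was appended, the fixed-size pieces of both files still line up and only the final piece is new. Inserting bytes at the start of b.bin would shift every boundary and defeat deduplication, which is exactly the problem content-defined chunking addresses.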

Environment Variables for Restic

Restic commands often require you to specify the repository location (-r or --repo) and the password. Typing these repeatedly is tedious and error-prone, especially in scripts. Restic supports several environment variables to simplify this:

  • RESTIC_REPOSITORY: Specifies the location of the Restic repository. If set, you don't need to use the -r or --repo flag.

    • Example (Linux/macOS Bash): export RESTIC_REPOSITORY=/srv/restic-repo
    • Example (Windows PowerShell): $env:RESTIC_REPOSITORY = "D:\backup\restic-repo"
  • RESTIC_PASSWORD / RESTIC_PASSWORD_FILE (Security Implications): These provide the repository password.

    • RESTIC_PASSWORD: Sets the password directly.
      • Example (Linux/macOS Bash): export RESTIC_PASSWORD="your_secret_password"
      • Example (Windows PowerShell): $env:RESTIC_PASSWORD = "your_secret_password"
      • Security Risk: A password stored in an environment variable is inherited by every child process and can be read by the same user or by root (on Linux, for example, through /proc/<pid>/environ). Typing the export command also leaves the password in your shell history. Generally not recommended for scripts or shared environments.
    • RESTIC_PASSWORD_FILE: Specifies a path to a text file containing the password. Restic will read the first line of this file as the password.
      • Example (Linux/macOS Bash): export RESTIC_PASSWORD_FILE=/etc/restic/password.txt
      • Example (Windows PowerShell): $env:RESTIC_PASSWORD_FILE = "C:\secrets\restic_pass.txt"
      • Security Best Practice: This is generally more secure than RESTIC_PASSWORD. Ensure the password file itself has restrictive permissions (e.g., chmod 600 /etc/restic/password.txt on Linux, so only the owner can read/write).
    • Set only one of the two rather than relying on which one wins. If neither is set (and no password is supplied through --password-file or --password-command), Restic will prompt for the password interactively.
  • Other Useful Variables:

    • RESTIC_CACHE_DIR: Restic maintains a local cache to speed up operations by storing index data and other metadata locally. The default location is ~/.cache/restic on Linux (or $XDG_CACHE_HOME/restic if that variable is set), ~/Library/Caches/restic on macOS, and %LOCALAPPDATA%\restic on Windows. You can change the cache location with this variable. This is useful if the default location is on a slow disk or has limited space.
      • Example: export RESTIC_CACHE_DIR=/var/tmp/restic-cache
    • TMPDIR: Restic uses temporary files for some operations. If your default temporary directory (/tmp) is small or slow, you can point Restic to use a different location by setting TMPDIR.
      • Example: export TMPDIR=/mnt/fast_ssd/tmp
    • Restic honors the standard proxy variables HTTP_PROXY, HTTPS_PROXY, and NO_PROXY for HTTP-based remote repositories (REST server, S3-compatible storage, and similar).

Using these environment variables, especially RESTIC_REPOSITORY and RESTIC_PASSWORD_FILE, is highly recommended for scripting and automation.
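As a minimal sketch of that recommendation, the snippet below creates a password file that is owner-readable from the moment it exists and exports the two variables for the current session. The temporary directory and the example password are placeholders invented for this demo; in practice you would use a persistent location and a real secret.

```shell
set -eu

# Hypothetical demo location; use a real, persistent directory in practice.
demo_dir=$(mktemp -d)

# Create the password file with owner-only permissions from the start.
( umask 077 && printf '%s\n' 'example-only-password' > "$demo_dir/password.txt" )

export RESTIC_REPOSITORY="$demo_dir/repo"
export RESTIC_PASSWORD_FILE="$demo_dir/password.txt"
unset RESTIC_PASSWORD   # avoid keeping the password in the environment as well

# Report the file mode (GNU stat first, BSD/macOS stat as a fallback).
mode=$(stat -c '%a' "$RESTIC_PASSWORD_FILE" 2>/dev/null || stat -f '%Lp' "$RESTIC_PASSWORD_FILE")
echo "password file mode: $mode"
```

Setting the umask inside a subshell means the file is never world-readable, not even briefly between creation and a later chmod.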

Workshop Repository Maintenance and Automation Prep

This workshop will simulate a period of backups, then apply retention policies using forget and prune. We'll also set up environment variables for easier use in future scripts.

Goals:

  1. Create several new snapshots to simulate backups over time.
  2. Run restic check to verify repository health.
  3. Use restic forget --dry-run to plan a retention policy.
  4. Apply the retention policy with restic forget.
  5. Reclaim space with restic prune.
  6. Observe the effects on snapshot count and repository size (if noticeable with small data).
  7. Set up persistent environment variables for repository and password file for future use (demonstration, actual persistence depends on shell/OS).

Prerequisites:

  • The Restic repository (my_local_repo) from previous workshops.
  • The source_data directory used previously.
  • Environment variables RESTIC_REPOSITORY and RESTIC_PASSWORD_FILE should ideally still be set from the previous workshop for convenience during this one. If not, set them for your current session.

Steps:

  1. Set Up Session Environment Variables (if not already set): Ensure RESTIC_REPOSITORY and RESTIC_PASSWORD_FILE point to your workshop repository and password file.

    • Linux/macOS (Bash/Zsh - example paths):
      export RESTIC_REPOSITORY=~/restic_workshop/my_local_repo
      # Ensure mypass.txt exists and has your password
      # echo "your_super_secret_password" > ~/restic_workshop/mypass.txt
      # chmod 600 ~/restic_workshop/mypass.txt
      export RESTIC_PASSWORD_FILE=~/restic_workshop/mypass.txt
      
    • Windows (PowerShell - example paths):
      $env:RESTIC_REPOSITORY = "$env:USERPROFILE\restic_workshop\my_local_repo"
      # Ensure mypass.txt exists and has your password
      # Set-Content -Path "$env:USERPROFILE\restic_workshop\mypass.txt" -Value "your_super_secret_password"
      $env:RESTIC_PASSWORD_FILE = "$env:USERPROFILE\restic_workshop\mypass.txt"
      
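  2. Simulate Backups Over Time: We'll make a few small changes to source_data and create new backups, tagging them to mimic daily and weekly runs. Tags are only labels: each snapshot is stamped with the actual time of the backup, so time-based rules such as --keep-daily will see all of these snapshots as belonging to the same day. (restic backup accepts a --time option if you want to backdate snapshots for a more realistic test.)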

    • Initial state: Check current snapshots.
      restic snapshots
      
    • Day 1: (Assume initial backup from workshop 1 was Day 0)
      • Linux/macOS: echo "Update Day 1" >> ./source_data/greeting.txt
      • Windows: Add-Content -Path .\source_data\greeting.txt -Value "Update Day 1"
      • Then, on either platform:
        restic backup ./source_data --tag daily_sim --tag day1
        
    • Day 2:
      • Linux/macOS: echo "Update Day 2" >> ./source_data/important_doc.md
      • Windows: Add-Content -Path .\source_data\important_doc.md -Value "Update Day 2"
      • Then, on either platform:
        restic backup ./source_data --tag daily_sim --tag day2
        
    • Day 3, Day 4, and Day 8 (the same pattern, shortened for brevity; the commands use Bash syntax, so on Windows substitute Add-Content as shown above):
      # Day 3
      echo "Update Day 3" >> ./source_data/my_photos/photo1.jpg
      restic backup ./source_data --tag daily_sim --tag day3
      # Day 4
      echo "Update Day 4" >> ./source_data/greeting.txt
      restic backup ./source_data --tag daily_sim --tag day4
      # Day 8 (Simulate start of week 2)
      echo "Update Day 8 - Week 2" >> ./source_data/important_doc.md
      restic backup ./source_data --tag daily_sim --tag week2_start
      
    • List snapshots again to see the history:
      restic snapshots
      
      You should now have several snapshots.
  3. Check Repository Integrity: Before any major operation like pruning, it's good practice to check the repo.

    restic check
    # For a small repo like this, --read-data is quick enough too
    restic check --read-data
    
    Ensure it reports "no errors were found".

  4. Plan Retention Policy with restic forget --dry-run: Let's say we want to keep:

    • The last 3 snapshots (covers very recent changes).
    • One snapshot per day for the 7 most recent days that contain snapshots (--keep-daily counts calendar days and keeps only the newest snapshot of each; it is included here mainly to show the syntax).
    • One snapshot tagged week2_start.
      restic forget --dry-run \
          --keep-last 3 \
          --keep-daily 7 \
          --keep-tag week2_start
      
      With --dry-run, restic forget only lists which snapshots it would keep and which it would remove; nothing is changed. The --prune flag is deliberately left out: it makes forget run prune right after removing snapshots, and how it combines with --dry-run has varied between Restic versions. Planning the forget step on its own keeps the output easy to read. You can also group snapshots by path and experiment with the numbers:
      echo "### Planning snapshots to forget (dry run) ###"
      restic forget --dry-run --keep-last 3 --keep-daily 5 --group-by paths
      # Adjust numbers like --keep-daily 5 based on how many you created.
      # For our few snapshots, let's simplify: keep the last 2, plus the one tagged week2_start
      restic forget --dry-run --keep-last 2 --keep-tag week2_start
      
      Examine the output. It will list snapshots it intends to "remove" (i.e., mark as forgotten). Adjust your --keep-* parameters until you are satisfied with the plan. For example, if you only have 5 snapshots total, --keep-last 7 will keep all of them. Let's try a policy that will definitely remove some, assuming you have about 5-7 snapshots now: Keep the 2 newest, and any tagged workshop1 (our very first one).
      restic forget --dry-run --keep-last 2 --keep-tag workshop1
      
      Note which snapshots are listed under "remove".
  5. Apply the Retention Policy with restic forget: Once you're happy with the dry run, remove the --dry-run (or -n) flag to actually forget the snapshots.

    restic forget --keep-last 2 --keep-tag workshop1
    # restic forget does not ask for confirmation, so review the dry-run output first.
    
    List snapshots again:
    restic snapshots
    
    You should see fewer snapshots listed. Data blobs that were referenced only by the forgotten snapshots are still in the repository but are now unreferenced.

  6. Reclaim Space with restic prune: Now, run prune to remove the unreferenced data and potentially repack existing data.

    restic prune
    
    This might take a moment even for a small repository. It will show progress. After prune completes:

    • Run restic check again to ensure health.
    • Optionally, check the size of your repository directory (du -sh ~/restic_workshop/my_local_repo on Linux/macOS or (Get-ChildItem ~\restic_workshop\my_local_repo -Recurse | Measure-Object -Property Length -Sum).Sum / 1MB in PowerShell). With very small text files, the size difference might be negligible as overhead and metadata structure could dominate. With larger, more varied data, prune's effect is more apparent.
  7. Setup Persistent Environment Variables (Conceptual): For future use, especially in scripts, you'd want RESTIC_REPOSITORY and RESTIC_PASSWORD_FILE to be set automatically. How you do this depends on your OS and shell:

    • Linux/macOS (Bash/Zsh): Add to your shell's startup file (e.g., ~/.bashrc, ~/.zshrc, or ~/.profile):
      # In ~/.bashrc or ~/.zshrc:
      # export RESTIC_REPOSITORY="/path/to/your/real_repo"
      # export RESTIC_PASSWORD_FILE="/path/to/your/real_repo_password_file"
      
      Then source ~/.bashrc or open a new terminal. For this workshop, we've been setting them per session. For actual automation, this is how you'd make them more permanent for your user session or scripts.
    • Windows (PowerShell): To set them persistently for the current user:
      # [System.Environment]::SetEnvironmentVariable("RESTIC_REPOSITORY", "C:\path\to\your\real_repo", "User")
      # [System.Environment]::SetEnvironmentVariable("RESTIC_PASSWORD_FILE", "C:\path\to\your\real_repo_password_file", "User")
      
      You'll need to open a new PowerShell window for these to take effect.
    • System-wide (e.g., in scripts run by cron or systemd): It's often better to define these variables at the top of your backup script itself, or use features of the scheduler (like systemd's EnvironmentFile= or Environment=).

    For now, ensure they are set in your current terminal session if you plan to proceed directly to the next sections. We will use these variables extensively.
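With the two variables in place, the check, forget, and prune steps from this workshop can be wrapped in one small function. This is a sketch rather than a finished backup script: the retention flags are simply the ones used in step 5, and RESTIC_BIN is a hypothetical override invented here so the function can be exercised without touching a real repository. Restic itself does not read that variable.

```shell
# RESTIC_BIN is a hypothetical override used only by this sketch.
run_maintenance() {
  restic_bin="${RESTIC_BIN:-restic}"

  if [ -z "${RESTIC_REPOSITORY:-}" ] || [ -z "${RESTIC_PASSWORD_FILE:-}" ]; then
    echo "RESTIC_REPOSITORY and RESTIC_PASSWORD_FILE must be set" >&2
    return 2
  fi

  # Verify the repository before changing anything.
  "$restic_bin" check || return 1
  # Drop snapshots according to the workshop's retention policy.
  "$restic_bin" forget --keep-last 2 --keep-tag workshop1 || return 1
  # Reclaim the space held by data that is no longer referenced.
  "$restic_bin" prune || return 1
  # Verify again after the repository has been rewritten.
  "$restic_bin" check
}
```

Calling RESTIC_BIN=echo run_maintenance prints the four Restic invocations instead of running them, which is a cheap way to review the sequence before using it for real.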

This workshop provided hands-on experience with vital repository maintenance tasks. Regularly checking, forgetting old snapshots according to a policy, and pruning are key to a healthy, efficient Restic backup system. Setting up environment variables will make subsequent interactions and scripting much smoother.
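A retention policy is easier to plan once it is clear how a rule such as --keep-daily selects snapshots. The sketch below imitates the documented behaviour (for the most recent N days that contain snapshots, keep only the newest snapshot of each day) on a hand-written list. The function name keep_daily, the timestamps, and the IDs are invented for illustration. This is not Restic's implementation, and it ignores time zones, grouping by host and path, and all other --keep-* rules.

```shell
keep_daily() {
  # $1 = number of days to keep; stdin = lines of "YYYY-MM-DD HH:MM:SS id"
  LC_ALL=C sort -r | awk -v n="$1" '
    {
      day = $1
      if (!(day in seen)) {          # newest snapshot seen so far for this day
        seen[day] = 1
        days++
        if (days <= n) { print "keep   " $0; next }
      }
      print "remove " $0
    }'
}

keep_daily 2 <<'EOF'
2024-05-01 09:00:00 aaa111
2024-05-01 18:00:00 bbb222
2024-05-02 09:00:00 ccc333
2024-05-03 09:00:00 ddd444
2024-05-03 21:00:00 eee555
EOF
```

With N set to 2, only the newest snapshot of each of the two most recent days (eee555 and ccc333) is marked keep. Snapshots created in a single sitting all share one day, so this rule alone would keep just the newest of them.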

4. Remote Repositories and Security

While local repositories are great for quick backups and restores, a robust backup strategy (like the 3-2-1 rule: 3 copies of data, on 2 different media, with 1 off-site) necessitates storing backups remotely. Restic excels here with its client-side encryption and support for various remote backends. This section explores how to use remote storage and delves deeper into Restic's encryption model.

Supported Remote Backends

Restic can natively interact with several types of remote storage, and its capabilities can be extended further using tools like rclone.

  • SFTP (SSH File Transfer Protocol):

    • If you have a server accessible via SSH, you can use its SFTP service as a Restic backend.
    • Restic connects via SSH, then interacts with the remote filesystem using SFTP commands.
    • Requires an SSH server running on the remote machine.
    • Authentication is typically via SSH keys (recommended) or password.
    • Repository path format: sftp:user@host:/path/to/repo
  • REST Server (rest-server):

    • Restic provides its own lightweight HTTP server called rest-server. You can run this on a machine to serve a Restic repository over HTTP/HTTPS.
    • Offers features like append-only mode, which can protect against attackers deleting backups if they compromise the client but not the rest-server itself (with append-only, they can add new bad data but not remove old good data).
    • Repository path format: rest:http://user:password@host:port/repo_name or rest:https://...
  • Amazon S3 and Compatible Services:

    • Amazon S3: Native support for Amazon's Simple Storage Service.
      • Requires AWS Access Key ID, Secret Access Key, and bucket name.
      • Repository path format: s3:s3.amazonaws.com/bucket_name/path_prefix or s3:https://<endpoint>/bucket_name/path_prefix for specific regions or S3 gateways.
    • MinIO: A popular open-source, S3-compatible object storage server you can self-host.
    • Wasabi, Backblaze B2, DigitalOcean Spaces, etc.: Many cloud storage providers offer S3-compatible APIs. Restic can often use these by specifying the provider's S3 endpoint.
      • Backblaze B2 can also be used through Restic's dedicated B2 backend instead of its S3-compatible API. Example: b2:bucketName:/path/prefix (requires the B2 account ID or application key ID, plus the application key).
  • Azure Blob Storage:

    • Native support for Microsoft Azure Blob Storage.
    • Requires Azure account name and key.
    • Repository path format: azure:containerName:/path/prefix
  • Google Cloud Storage (GCS):

    • Native support for Google Cloud Storage.
    • Requires GCS project ID and authentication (usually via a service account JSON file).
    • Repository path format: gs:bucketName:/path/prefix
  • Rclone:

    • Rclone is a powerful command-line program to manage files on cloud storage. It supports a vast number of services (Dropbox, OneDrive, Google Drive, Box, many more).
    • Restic can use Rclone as a "backend bridge." You configure a remote in Rclone, and then tell Restic to use that Rclone remote.
    • This dramatically expands the number of cloud services Restic can use.
    • Repository path format: rclone:yourRcloneRemoteName:path/to/repo
    • Restic executes the rclone binary in the background.
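Each cloud backend above expects its credentials in environment variables. The sketch below maps a repository string to the variable names that Restic's documentation lists for key-based authentication and reports which of them are missing. The function names required_vars and missing_vars are invented for this example, and other authentication methods (for instance SAS tokens on Azure) are not covered.

```shell
# Print the credential variables a repository string is expected to need.
required_vars() {
  case "$1" in
    s3:*)    echo "AWS_ACCESS_KEY_ID AWS_SECRET_ACCESS_KEY" ;;
    b2:*)    echo "B2_ACCOUNT_ID B2_ACCOUNT_KEY" ;;
    azure:*) echo "AZURE_ACCOUNT_NAME AZURE_ACCOUNT_KEY" ;;
    gs:*)    echo "GOOGLE_PROJECT_ID GOOGLE_APPLICATION_CREDENTIALS" ;;
    *)       echo "" ;;   # local paths, sftp:, rest:, rclone: authenticate by other means
  esac
}

# Print the names of required variables that are unset or empty.
missing_vars() {
  missing=""
  for name in $(required_vars "$1"); do
    eval "value=\${$name:-}"
    [ -n "$value" ] || missing="$missing $name"
  done
  echo "${missing# }"
}

missing_vars "s3:http://localhost:9000/restic-backups/my_minio_repo"
```

Running a check like this at the top of a backup script turns a confusing authentication failure in the middle of a run into an immediate, readable message.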

Setting up an SFTP Backend

SFTP is a common and relatively easy-to-set-up remote backend if you have a machine you can SSH into.

  • Prerequisites:

    1. A remote server with an SSH server installed and running.
    2. A user account on that server for Restic backups. It's good practice to create a dedicated, less-privileged user for this.
    3. Sufficient storage space on the server for your backups.
  • SSH Key-Based Authentication (Recommended):
    Using SSH keys is more secure and convenient than password authentication for automated backups.

    1. Generate an SSH Key Pair (on your client machine, if you don't have one):
      ssh-keygen -t ed25519 -C "restic_backup_key"
      # Or use rsa: ssh-keygen -t rsa -b 4096 -C "restic_backup_key"
      # Follow prompts. You can choose to set a passphrase for the key itself for added security.
      # This creates ~/.ssh/id_ed25519 (private key) and ~/.ssh/id_ed25519.pub (public key).
      
    2. Copy the Public Key to the Remote Server:
      Replace restic_user and your_remote_server_ip_or_hostname with actual values.
      ssh-copy-id restic_user@your_remote_server_ip_or_hostname
      # This appends your public key to ~/.ssh/authorized_keys on the remote server for restic_user.
      # If ssh-copy-id is not available (e.g., on Windows or some minimal Linux), manually:
      # cat ~/.ssh/id_ed25519.pub | ssh restic_user@your_remote_server_ip_or_hostname "mkdir -p ~/.ssh && chmod 700 ~/.ssh && cat >> ~/.ssh/authorized_keys && chmod 600 ~/.ssh/authorized_keys"
      
    3. Test SSH Key Authentication:
      ssh restic_user@your_remote_server_ip_or_hostname
      # You should log in without being prompted for a password (though you might be prompted for the SSH key's passphrase if you set one).
      
    4. (Optional but Recommended) Restrict the SSH Key:
      For enhanced security, you can restrict what this specific SSH key can do on the server. Edit ~/.ssh/authorized_keys on the server for restic_user and prefix the key entry with options. Because Restic's SFTP backend only needs the SFTP subsystem, a common choice is to force SFTP and disable everything else. Example authorized_keys entry on the server (a single line):
      command="internal-sftp -d /backup_storage/restic_user_repo",restrict ssh-ed25519 AAAA... restic_backup_key
      Here restrict disables PTY allocation and all forwarding, and -d only sets the directory in which the SFTP session starts; it is not a jail. Do not add the sftp-server flag -R, which makes the session read-only and would prevent backups. To truly confine the user, combine ChrootDirectory with ForceCommand internal-sftp in the server's sshd_config. Tools such as rrsync apply to rsync and do not help with Restic. As a simpler start, make sure restic_user can write only to its designated backup path.
  • Initializing a Restic Repository over SFTP:
    Let's say restic_user on your_remote_server_ip_or_hostname will store repositories under /var/backups/restic/my_machine_backup. The repository URL for Restic will be: sftp:restic_user@your_remote_server_ip_or_hostname:/var/backups/restic/my_machine_backup

    # On your client machine
    # Ensure RESTIC_PASSWORD_FILE is set or be ready to enter the password
    restic -r sftp:restic_user@your_remote_server_ip_or_hostname:/var/backups/restic/my_machine_backup init
    
    Restic will connect via SSH (using your key), then use SFTP commands to create the repository structure in the specified remote path.

    • If the SSH server runs on a non-standard port (e.g., 2222), use Restic's URL-style syntax: sftp://user@host:2222//absolute/path/to/repo. Note the double slash: the first slash ends the host and port part, and the second is the start of the absolute path. The scp-style form sftp:user@host:/path has no place for a port. Often the cleanest option is to put the port in the SSH client configuration (~/.ssh/config) instead:
      Host my_sftp_server_alias
          HostName your_remote_server_ip_or_hostname
          User restic_user
          Port 2222
          IdentityFile ~/.ssh/id_ed25519_for_restic
      
      Then you can use a simpler Restic repo URL: sftp:my_sftp_server_alias:/var/backups/restic/my_machine_backup
  • Performance Considerations:

    • SFTP performance depends heavily on network latency and bandwidth, as well as server-side I/O.
    • SSH encryption/decryption adds some CPU overhead on both client and server.
    • For very high-latency links or very large numbers of small files, SFTP might be slower than object storage backends like S3, which are designed for concurrent HTTP requests.
    • Restic opens several parallel connections to the SFTP server (5 by default), which helps. The number can be changed with the extended option -o sftp.connections=N.
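The two SFTP repository forms are easy to mix up, so here is a small helper that builds the right one for a given port. It follows the URL-style syntax from Restic's documentation for non-default ports; the function name sftp_repo_url and the host backup.example.org are made up for this sketch.

```shell
sftp_repo_url() {
  # $1 = user, $2 = host, $3 = port, $4 = absolute path on the server
  case "$4" in
    /*) ;;
    *)  echo "path must be absolute: $4" >&2; return 1 ;;
  esac
  if [ "$3" = "22" ]; then
    # scp-style form, default port
    printf 'sftp:%s@%s:%s\n' "$1" "$2" "$4"
  else
    # URL-style form; the extra "/" yields the double slash before the path
    printf 'sftp://%s@%s:%s/%s\n' "$1" "$2" "$3" "$4"
  fi
}

sftp_repo_url restic_user backup.example.org 2222 /var/backups/restic/my_machine_backup
```

The result can be assigned to RESTIC_REPOSITORY or passed to -r. If you already keep the port in ~/.ssh/config, the plain alias form is simpler and this helper is unnecessary.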

Using a MinIO (S3-compatible) Server

MinIO is an excellent open-source object storage server that implements the Amazon S3 API. You can self-host MinIO on your own hardware (or a VM/container) to create a private S3-like service.

  • Setting up a Local MinIO Server (e.g., via Docker for testing):
    This is a quick way to get a MinIO instance running for experimentation. For production, refer to MinIO's documentation for robust deployment.

    1. Install Docker if you haven't already.
    2. Run MinIO container:
      # Create directories to persist MinIO data and config
      mkdir -p ~/minio/data ~/minio/config
      
      docker run -d \
         -p 9000:9000 \
         -p 9001:9001 \
         --name minio_server \
         -e "MINIO_ROOT_USER=YOUR_MINIO_ACCESS_KEY" \
         -e "MINIO_ROOT_PASSWORD=YOUR_MINIO_SECRET_KEY_VERY_STRONG" \
         -v ~/minio/data:/data \
         -v ~/minio/config:/root/.minio \
         minio/minio server /data --console-address ":9001"
      
      • Replace YOUR_MINIO_ACCESS_KEY (e.g., resticadmin) and YOUR_MINIO_SECRET_KEY_VERY_STRONG (e.g., aVeryStrongSecretKeyForRestic) with your desired credentials. Make the secret key strong!
      • -p 9000:9000: Exposes the MinIO S3 API port.
      • -p 9001:9001: Exposes the MinIO web console port.
      • -v ~/minio/data:/data: Mounts a local directory to store MinIO's objects.
    3. Access MinIO Console: Open your web browser to http://localhost:9001. Log in with the MINIO_ROOT_USER and MINIO_ROOT_PASSWORD you set.
    4. Create a Bucket: In the MinIO console, create a new bucket (e.g., restic-backups). This bucket will host your Restic repository.
  • Configuring Restic to use MinIO:
    Restic needs the MinIO server's endpoint, your access key, and your secret key.

    • Endpoint: For our local Docker example, it's http://localhost:9000. For a production MinIO, it would be its public or internal IP/hostname and port, likely with HTTPS.
    • Access Key: YOUR_MINIO_ACCESS_KEY
    • Secret Key: YOUR_MINIO_SECRET_KEY_VERY_STRONG

    You can provide these via environment variables (recommended for S3 keys):

    # On your client machine (where Restic runs)
    export AWS_ACCESS_KEY_ID="YOUR_MINIO_ACCESS_KEY"
    export AWS_SECRET_ACCESS_KEY="YOUR_MINIO_SECRET_KEY_VERY_STRONG"
    # If MinIO is served over HTTPS with a self-signed certificate, pass the CA certificate to Restic with --cacert /path/to/ca.pem
    # Restic uses plain HTTP only when the endpoint in the repository URL starts with http://, as in http://localhost:9000
    
    Now, initialize the Restic repository in the MinIO bucket: The repository URL format is s3:<endpoint>/<bucket_name>/<optional_path_prefix_in_bucket>
    # Ensure RESTIC_PASSWORD_FILE is set for your Restic repository password
    restic -r s3:http://localhost:9000/restic-backups/my_minio_repo init
    

    • s3:http://localhost:9000: Specifies the S3 protocol and endpoint. Using http because our Docker MinIO is not set up with HTTPS by default. For production, always use HTTPS.
    • /restic-backups: The bucket name.
    • /my_minio_repo: An optional "folder" or prefix within the bucket where this specific Restic repository will live. This allows you to have multiple Restic repositories in the same bucket.

    Once initialized, all Restic commands (backup, snapshots, restore, check, prune) will work with this S3 repository URL, using the AWS environment variables for authentication.
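The breakdown of the repository URL into endpoint, bucket, and prefix can be mirrored with plain shell parameter expansion. The function parse_s3_repo below is a hypothetical helper for checking your own URLs. It is not how Restic parses them, and it assumes the URL contains at least an endpoint and a bucket, and that an endpoint without a scheme means HTTPS.

```shell
parse_s3_repo() {
  rest=${1#s3:}                       # strip the backend prefix
  case "$rest" in
    http://*|https://*)
      scheme=${rest%%://*}
      rest=${rest#*://}
      ;;
    *)
      scheme=https                    # assumption: no scheme means HTTPS
      ;;
  esac
  host=${rest%%/*}                    # everything up to the first "/"
  rest=${rest#*/}
  bucket=${rest%%/*}                  # first path element
  case "$rest" in
    */*) prefix=${rest#*/} ;;         # whatever follows the bucket
    *)   prefix="" ;;
  esac
  printf 'endpoint=%s://%s\nbucket=%s\nprefix=%s\n' "$scheme" "$host" "$bucket" "$prefix"
}

parse_s3_repo "s3:http://localhost:9000/restic-backups/my_minio_repo"
```

Seeing the three parts separately makes it easier to spot a common mistake: putting the bucket name into the host part or forgetting it entirely.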

Encryption Deep Dive

Restic's security model is built around strong, client-side, authenticated encryption. Understanding this is crucial for trusting your backups.

  • AES-256 Encryption:

    • Restic uses AES-256 (Advanced Encryption Standard with 256-bit keys) in Counter mode (CTR) for encrypting data blobs and tree blobs. AES is a widely adopted, strong symmetric encryption algorithm. CTR mode turns a block cipher (like AES) into a stream cipher, which is efficient for encrypting data of varying lengths.
  • Authenticated Encryption (AEAD):

    • Simply encrypting data (confidentiality) is not enough; you also need to ensure data integrity and authenticity (that the data hasn't been tampered with and that it originated from an authorized source).
    • Restic achieves this by using a Message Authentication Code (MAC). Specifically, it uses Poly1305-AES as the MAC algorithm.
    • For each encrypted blob, a MAC is computed and stored alongside it. When reading data, Restic recomputes the MAC and compares it to the stored one. If they don't match, the data is considered corrupt or tampered with, and Restic will report an error. This prevents an attacker from modifying ciphertext in a meaningful way without detection.
  • Key Derivation from the Password: You provide a repository password. Restic doesn't store this password. Instead, the password unlocks the keys that actually protect the data:

    1. Password Hashing: Your repository password is processed by a key derivation function (KDF). Restic uses scrypt for this. scrypt is designed to be computationally intensive and memory-hard, making brute-force attacks against the password very difficult. Its output is a user key that is used only to decrypt and authenticate a key file in the repository.
    2. Master Keys: When the repository is initialized, Restic generates two random master keys (they are not derived from the password) and stores them, encrypted with the user key, in a key file under keys/:
      • One for encryption (master encryption key).
      • One for MAC computation (master MAC key).
    3. Per-Blob Nonces: Restic encrypts every blob with the master encryption key and authenticates it with the master MAC key. What differs per blob is a random 128-bit nonce, which serves both as the initialization vector for AES in counter mode and as the nonce for Poly1305-AES. This makes reuse of a key and nonce pair practically impossible.
  • The Role of config and keys Files in the Repository:

    • keys/: This directory in the repository stores files, each containing a copy of the (encrypted) master encryption and MAC keys. These master keys are themselves encrypted using a key derived directly from your repository password via scrypt.
      • When you access the repository, Restic asks for your password, runs it through scrypt, and uses the output to try and decrypt one of the files in the keys/ directory. If successful, it obtains the master keys for the repository.
      • Having multiple key files allows several passwords to open the same repository, because each key file holds its own encrypted copy of the same master keys. Restic supports this through the restic key command (for example, restic key add), which makes per-user or per-machine passwords possible.
    • config: This file stores repository-wide configuration, including the repository ID, version, and parameters for chunking. Like the rest of the repository, it is encrypted and authenticated with the master keys. The files in keys/ are the exception: they are readable JSON documents that carry the scrypt parameters and salt in the clear, because those are needed before any key is available, while the master keys inside them remain encrypted.
  • What Happens If You Lose Your Password? Your data is irrecoverably lost.

    • Because Restic performs client-side encryption and you are the sole holder of the password (or the means to derive the keys), there is no backdoor, no password reset mechanism, and no way for Restic developers or anyone else to recover your data without that password.
    • This is a feature, not a bug. It guarantees your privacy and control.
    • Action: Store your repository password in a very secure and reliable place (e.g., a reputable password manager, a written copy in a safe). Consider creating a "disaster recovery" sheet with the password and repository location details.
  • Changing the Repository Password: Restic supports this directly. Because the password only protects a key file in keys/ and not the data itself, changing it re-encrypts that small key file and leaves all stored data untouched.

    • restic key list shows the keys (passwords) that can open the repository.
    • restic key passwd changes the password of the key you are currently using.
    • restic key add and restic key remove add or revoke additional passwords.
    • Limitation and Migration: A password change leaves the master keys as they are. If someone obtained both the old password and a copy of the key file, the new password does not lock them out of data they can still reach. In that case, migrate to a new repository:
      1. Initialize a new repository with a new password.
      2. Use restic copy to copy all snapshots from the old repository to the new one. The data is decrypted with the old repository's keys and re-encrypted with the new repository's keys as it is copied.
        # Example (Restic 0.14 and later; older releases used --repo2 and --password-file2, with the roles of the two repositories reversed):
        # RESTIC_PASSWORD_FILE_OLD points to the old password file
        # RESTIC_PASSWORD_FILE_NEW points to the new password file
        restic -r /path/to/new_repo --password-file "$RESTIC_PASSWORD_FILE_NEW" \
               copy --from-repo /path/to/old_repo --from-password-file "$RESTIC_PASSWORD_FILE_OLD"
        
      3. Verify the new repository.
      4. Once confident, delete the old repository.
    • The migration can be time-consuming for large repositories, as it involves reading, decrypting, re-encrypting, and writing all unique data.
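The encrypt-then-authenticate idea described in this section can be tried out with the openssl command-line tool. The sketch below encrypts a small blob with AES-256 in counter mode under a fresh random IV and then computes a MAC over the IV and ciphertext. It makes two loud assumptions: HMAC-SHA256 stands in for Poly1305-AES (which the openssl CLI does not offer in this form), and the keys are random demo values passed on the command line, which you would never do with real keys. It is not Restic's file format.

```shell
set -eu
work=$(mktemp -d)

# Random demo keys and a fresh random IV for this one blob.
enc_key=$(openssl rand -hex 32)   # 256-bit AES key
mac_key=$(openssl rand -hex 32)   # separate key for authentication
iv=$(openssl rand -hex 16)        # 128-bit IV, new for every blob

printf 'example blob' > "$work/blob.plain"

# 1. Encrypt with AES-256 in counter mode.
openssl enc -aes-256-ctr -K "$enc_key" -iv "$iv" -in "$work/blob.plain" -out "$work/blob.enc"

# 2. Authenticate IV + ciphertext (HMAC-SHA256 here; Restic uses Poly1305-AES).
mac_of() {
  { printf '%s' "$iv"; cat "$1"; } | openssl dgst -sha256 -hmac "$mac_key" | awk '{print $NF}'
}
tag=$(mac_of "$work/blob.enc")

# 3. Simulate tampering: any change to the ciphertext changes the MAC.
{ cat "$work/blob.enc"; printf 'X'; } > "$work/blob.tampered"
tampered_tag=$(mac_of "$work/blob.tampered")

# 4. A reader verifies the MAC first and decrypts only if it matches.
if [ "$(mac_of "$work/blob.enc")" = "$tag" ]; then
  plain=$(openssl enc -d -aes-256-ctr -K "$enc_key" -iv "$iv" -in "$work/blob.enc")
  echo "MAC ok, decrypted: $plain"
fi
if [ "$tampered_tag" != "$tag" ]; then
  echo "tampered copy rejected: MAC mismatch"
fi
rm -rf "$work"
```

Verifying before decrypting is the important design point: corrupted or manipulated ciphertext is rejected without ever being processed as if it were valid data.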

Workshop Setting up a Remote SFTP Repository

This workshop will guide you through setting up a Restic repository on a remote server using SFTP and SSH key authentication.

Goals:

  1. Prepare a remote server (or a local VM acting as one) with an SSH user.
  2. Set up SSH key-based authentication from your client to the server for this user.
  3. Initialize a Restic repository on the SFTP server from your client.
  4. Back up a sample directory to the SFTP repository.
  5. List snapshots and restore a file from the SFTP repository to verify.

Prerequisites:

  • Client Machine: Your primary machine where Restic is installed.
  • Server Machine: Another machine (can be a Virtual Machine on your local computer, a Raspberry Pi, or a cloud server) running Linux with an SSH server (like OpenSSH Server). You need sudo or root access on this server for user creation.
  • Basic understanding of Linux user management and SSH.

Server Preparation (on your "remote" server):

  1. Install SSH Server (if not already present): On most server-oriented Linux distros, OpenSSH server is installed by default. If not:

    # On the server (e.g., Debian/Ubuntu)
    sudo apt update
    sudo apt install openssh-server
    sudo systemctl enable ssh
    sudo systemctl start ssh
    

  2. Create a Dedicated Restic User: It's good practice to use a dedicated, non-privileged user for Restic backups.

    # On the server
    sudo adduser restic_backup_user
    # Follow prompts to set a password (you won't use it for SSH if keys are set up, but good for local login if needed)
    # and other user info.
    

  3. Create a Directory for Restic Repositories: This directory will hold the Restic repository data for restic_backup_user.

    # On the server
    sudo mkdir -p /srv/restic_storage
    sudo chown restic_backup_user:restic_backup_user /srv/restic_storage
    sudo chmod 700 /srv/restic_storage # Only user can rwx
    
    The actual repository will be a subdirectory within /srv/restic_storage, e.g., /srv/restic_storage/my_client_backups.

Client Preparation (on your Restic client machine):

  1. Generate SSH Key Pair (if you don't have one you want to use):

    # On the client machine
    ssh-keygen -t ed25519 -f ~/.ssh/id_ed25519_restic_sftp -C "restic_sftp_key_for_$(hostname)"
    # When prompted for a passphrase, you can leave it empty for fully automated access (less secure for the key itself)
    # or provide one (more secure, ssh-agent can cache it). For this workshop, empty is fine.
    # This creates ~/.ssh/id_ed25519_restic_sftp (private) and ~/.ssh/id_ed25519_restic_sftp.pub (public).
    

  2. Copy Public Key to Server: Replace your_server_ip_or_hostname with the actual IP or hostname of your server.

    # On the client machine
    ssh-copy-id -i ~/.ssh/id_ed25519_restic_sftp.pub restic_backup_user@your_server_ip_or_hostname
    # You'll be prompted for restic_backup_user's password on the server one last time here.
    
    If ssh-copy-id is unavailable:
    cat ~/.ssh/id_ed25519_restic_sftp.pub | ssh restic_backup_user@your_server_ip_or_hostname "mkdir -p ~/.ssh && chmod 700 ~/.ssh && cat >> ~/.ssh/authorized_keys && chmod 600 ~/.ssh/authorized_keys"
    

  3. Configure SSH Client (Optional but Recommended for Specific Key): To make SSH use this specific key automatically for this host/user, edit or create ~/.ssh/config on your client machine:

    Host my_sftp_backup_server
        HostName your_server_ip_or_hostname
        User restic_backup_user
        IdentityFile ~/.ssh/id_ed25519_restic_sftp
        # Add Port XXXXX if your server SSH is on a non-standard port
    
    Now, you can test by SSHing using the alias:
    ssh my_sftp_backup_server
    # Should log you in as restic_backup_user on the server without a password.
    exit # to come back to client
    
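    If you don't set up ~/.ssh/config, SSH only offers keys with default file names (such as ~/.ssh/id_ed25519) and keys loaded into ssh-agent. The custom-named key created above is therefore not used automatically unless you add it to ssh-agent.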

Performing Restic Operations:

  1. Define SFTP Repository Path and Restic Password:

    • Repository URL: sftp:my_sftp_backup_server:/srv/restic_storage/my_client_sftp_repo (If not using ~/.ssh/config alias: sftp:restic_backup_user@your_server_ip_or_hostname:/srv/restic_storage/my_client_sftp_repo)
    • Restic Password: Use the same mypass.txt from previous workshops or create a new one for this repository. Ensure RESTIC_PASSWORD_FILE environment variable is set.
      # On the client
      # Example:
      # echo "another_secure_password_for_sftp_repo" > ~/restic_workshop/sftp_pass.txt
      # chmod 600 ~/restic_workshop/sftp_pass.txt
      export RESTIC_PASSWORD_FILE=~/restic_workshop/sftp_pass.txt # Adjust path if needed
      
  2. Initialize the SFTP Restic Repository:

    # On the client
    export RESTIC_SFTP_REPO="sftp:my_sftp_backup_server:/srv/restic_storage/my_client_sftp_repo"
    # (Or the full sftp:user@host:/path format if not using ssh config alias)
    
    restic -r $RESTIC_SFTP_REPO init
    
    You should see a success message. On the server, check /srv/restic_storage/my_client_sftp_repo. It should now contain the Restic repository structure (config, data, keys, etc.), owned by restic_backup_user.

  3. Back up Sample Data to SFTP Repository: Let's back up the source_data directory from your restic_workshop folder again, this time to the remote SFTP repo.

    # On the client, ensure you are in ~/restic_workshop
    # source_data directory should exist from previous workshops
    # If not: mkdir source_data; echo "SFTP test content" > source_data/sftp_test.txt
    restic -r $RESTIC_SFTP_REPO backup ./source_data --tag sftp_backup
    
    Observe the output. It might be slightly slower than a local backup due to network transfer.

  4. List Snapshots on SFTP Repository:

    # On the client
    restic -r $RESTIC_SFTP_REPO snapshots
    
    You should see the snapshot you just created with the sftp_backup tag.

  5. Restore a File from SFTP Repository: Create a temporary directory for the restore.

    # On the client
    mkdir ./sftp_restore_test
    # Let's restore just one file from the snapshot, e.g., /source_data/sftp_test.txt (if you created it)
    # or /source_data/greeting.txt if using prior data. Use `restic -r $RESTIC_SFTP_REPO ls latest` to find a file path.
    # Assuming /source_data/greeting.txt exists in the snapshot:
    restic -r $RESTIC_SFTP_REPO restore latest --target ./sftp_restore_test --include "/source_data/greeting.txt"
    
    Verify the restored file:
    ls -R ./sftp_restore_test
    cat ./sftp_restore_test/source_data/greeting.txt
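
Listing and printing the restored file shows that it exists, but not that it matches the original. A minimal sketch of a byte-for-byte comparison follows. The function name verify_restore is hypothetical, and the paths in the usage comment assume the workshop layout used above.

```shell
# Hypothetical helper: compare a restored file against its original.
# Returns 0 if both files are byte-identical, 1 otherwise.
verify_restore() {
    local original=$1 restored=$2
    if cmp -s -- "$original" "$restored"; then
        echo "OK: $restored matches $original"
    else
        echo "MISMATCH: $restored differs from $original (or a file is missing)" >&2
        return 1
    fi
}

# Usage with the workshop paths (uncomment to run):
# verify_restore ./source_data/greeting.txt ./sftp_restore_test/source_data/greeting.txt
```

For whole directories, diff -r between the source and the restored tree serves the same purpose.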
    

Congratulations! You have successfully set up an SFTP backend for Restic, configured SSH key authentication, and performed backup and restore operations. This is a significant step towards a more robust, off-site backup strategy. Remember to manage the RESTIC_PASSWORD_FILE and your SSH private key securely.

5. Advanced Restic Usage and Automation

With a solid understanding of Restic's basics and remote repository capabilities, we can now explore advanced usage patterns, focusing on automation, efficient data selection, and strategic backup planning. These techniques are crucial for implementing a reliable, unattended backup system.

Scripting Backups

Manually running restic backup is fine for ad-hoc backups, but for regular, reliable protection, you need automation. Shell scripts (Bash on Linux/macOS, PowerShell on Windows) are a common way to achieve this.

  • Writing Shell Scripts (Bash Example): A good backup script should be robust and informative. Key elements:

    1. Configuration: Define repository, password file, paths to back up, and exclusion lists, preferably at the top or via external config files/environment variables.
    2. Locking (simple script-level): Prevent multiple instances of your script from running simultaneously if it performs operations that shouldn't overlap (though Restic's own locking usually handles repository access). A simple pidfile mechanism can be used.
    3. Pre-backup Hooks: Commands to run before the backup (e.g., dumping a database, stopping a service).
    4. The Restic Backup Command: Carefully constructed with all necessary options.
    5. Post-backup Hooks: Commands to run after the backup (e.g., restarting a service, cleaning up database dumps). This includes forget and prune operations.
    6. Logging: Record what happened, any errors, and key statistics.
    7. Error Handling: Detect failures and react appropriately (e.g., send a notification).

    Basic Bash Backup Script Example:

    #!/bin/bash
    
    # --- Configuration ---
    # Best practice: Set these via environment variables or a secure config file read by the script
    export RESTIC_REPOSITORY="sftp:my_sftp_backup_server:/srv/restic_storage/my_client_sftp_repo"
    export RESTIC_PASSWORD_FILE="/home/user/.config/restic/sftp_repo_password.txt" # Adjust path
    
    # Paths to back up (space-separated)
    BACKUP_PATHS="/home/user/documents /etc /var/www"
    
    # Exclude file (one pattern per line)
    EXCLUDE_FILE="/home/user/.config/restic/restic_excludes.txt"
    
    # Log file
    LOG_FILE="/var/log/restic_backup.log"
    
    # Retention policy
    KEEP_DAILY=7
    KEEP_WEEKLY=4
    KEEP_MONTHLY=6
    KEEP_YEARLY=1
    
    # --- Script Logic ---
    # Function for logging with timestamp
    log() {
        echo "$(date '+%Y-%m-%d %H:%M:%S') - $1" | tee -a "${LOG_FILE}"
    }
    
    # Ensure password file exists and has correct permissions
    if [ ! -f "${RESTIC_PASSWORD_FILE}" ]; then
        log "ERROR: Restic password file not found at ${RESTIC_PASSWORD_FILE}"
        exit 1
    fi
    # Check permissions (optional, but good practice)
    # if [ "$(stat -c %a "${RESTIC_PASSWORD_FILE}")" != "600" ] && [ "$(stat -c %a "${RESTIC_PASSWORD_FILE}")" != "400" ]; then
    #     log "WARNING: Password file ${RESTIC_PASSWORD_FILE} has insecure permissions."
    # fi
    
    
    log "Starting Restic backup job for paths: ${BACKUP_PATHS}"
    
    # Pre-backup commands (example: dump a database)
    # log "Dumping PostgreSQL database..."
    # pg_dumpall -U postgres | gzip > /tmp/postgres_dump.sql.gz
    # BACKUP_PATHS="${BACKUP_PATHS} /tmp/postgres_dump.sql.gz" # Add dump to backup paths
    
    # Restic backup command
    # Using --verbose for more detailed logging from Restic
    # Using --tag "automated" for easy identification
    # Using --exclude-file if it exists
    EXCLUDE_OPTS=""
    if [ -f "${EXCLUDE_FILE}" ]; then
        EXCLUDE_OPTS="--exclude-file=${EXCLUDE_FILE}"
        log "Using exclude file: ${EXCLUDE_FILE}"
    fi
    
    log "Running restic backup..."
    restic backup ${BACKUP_PATHS} \
        --tag automated \
        ${EXCLUDE_OPTS} \
        --verbose >> "${LOG_FILE}" 2>&1 # Append stdout and stderr to log
    
    BACKUP_EXIT_CODE=$?
    
    if [ ${BACKUP_EXIT_CODE} -eq 0 ]; then
        log "Restic backup completed successfully."
    else
        log "ERROR: Restic backup failed with exit code ${BACKUP_EXIT_CODE}."
        # Add notification command here (e.g., mail, webhook)
        # exit 1 # Decide if script should terminate on backup failure
    fi
    
    # Post-backup commands (example: remove database dump)
    # log "Cleaning up PostgreSQL dump..."
    # rm -f /tmp/postgres_dump.sql.gz
    
    # Prune old snapshots (forget and prune)
    # Only run prune if backup was successful or if you want to prune regardless
    if [ ${BACKUP_EXIT_CODE} -eq 0 ] || [ "$1" == "--force-prune" ]; then # Allow forcing prune via argument
        log "Running restic forget and prune..."
        restic forget \
            --keep-daily ${KEEP_DAILY} \
            --keep-weekly ${KEEP_WEEKLY} \
            --keep-monthly ${KEEP_MONTHLY} \
            --keep-yearly ${KEEP_YEARLY} \
            --prune \
            --group-by paths,tags >> "${LOG_FILE}" 2>&1 # Group by paths AND tags for retention
    
        FORGET_EXIT_CODE=$?
        if [ ${FORGET_EXIT_CODE} -eq 0 ]; then
            log "Restic forget and prune completed successfully."
        else
            log "ERROR: Restic forget/prune failed with exit code ${FORGET_EXIT_CODE}."
        fi
    else
        log "Skipping forget/prune due to backup failure or policy."
    fi
    
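    # (Optional) Check repository integrity periodically
    # if (( $(date +%u) == 7 )); then # e.g., run check on Sundays
    #    log "Running weekly repository check (no data read)..."
    #    restic check >> "${LOG_FILE}" 2>&1
    # fi
    
    log "Restic backup job finished."
    # Propagate failures so that cron or systemd can see the job did not succeed
    if [ ${BACKUP_EXIT_CODE} -ne 0 ]; then exit ${BACKUP_EXIT_CODE}; fi
    exit ${FORGET_EXIT_CODE:-0}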
    
    Make this script executable: chmod +x your_backup_script.sh. Remember to create the exclude file (restic_excludes.txt) and the password file with appropriate content and permissions.
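
The script-level locking listed among the key elements is not part of the example script. One way to add it is sketched below, assuming a lock directory created with mkdir (which is atomic, so only one process can succeed). The path /tmp/restic_backup.lock and the function names are hypothetical.

```shell
# Script-level lock built on mkdir: creating a directory is atomic, so exactly
# one process succeeds and every other one fails.
LOCK_DIR="${LOCK_DIR:-/tmp/restic_backup.lock}"   # hypothetical path

acquire_lock() {
    if mkdir "$LOCK_DIR" 2>/dev/null; then
        echo "$$" > "$LOCK_DIR/pid"   # record the owner for debugging
        return 0
    fi
    echo "Another run holds $LOCK_DIR (PID $(cat "$LOCK_DIR/pid" 2>/dev/null)); exiting." >&2
    return 1
}

release_lock() {
    rm -rf "$LOCK_DIR"
}

# Typical use near the top of the backup script:
# acquire_lock || exit 1
# trap release_lock EXIT
```

If the script is killed with SIGKILL, the EXIT trap does not run and the lock directory stays behind; the recorded PID helps you decide whether it is safe to remove it by hand.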

  • Logging and Error Handling in Scripts:

    • Logging: As shown above, redirect stdout and stderr from Restic commands to a log file (>> "${LOG_FILE}" 2>&1). Use tee -a if you want to see output on the console and log it. Prefix log entries with timestamps.
    • Error Handling: Check the exit code of Restic commands ($? in Bash). A non-zero exit code indicates a problem. For restic backup:
      • 0: Success; the snapshot was created.
      • 1: Fatal error; no snapshot was created (e.g., the repository cannot be opened).
      • 3: Some source data could not be read; a snapshot was created, but it is incomplete.

      Newer Restic releases define additional exit codes, so check the documentation for the version you run. Based on the exit code, your script can log the error, send notifications (email, Slack, etc.), or take other actions. set -e at the top of a Bash script makes it exit immediately if any command fails, which can be useful but requires careful handling if you want to inspect exit codes or perform cleanup actions on failure.
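
As a sketch of how a script might act on these codes, the hypothetical function below maps the exit code of restic backup to a coarse status. It relies only on the documented meanings of 0, 1, and 3, and treats every other value as a failure.

```shell
# Map a `restic backup` exit code to a coarse status string.
classify_backup_exit() {
    case "$1" in
        0) echo "ok" ;;        # snapshot created, all files read
        3) echo "partial" ;;   # snapshot created, some source files unreadable
        *) echo "failed" ;;    # 1 = fatal, no snapshot; unknown codes count as failures
    esac
}

# Usage sketch (requires restic and a configured repository):
# rc=0
# restic backup /home/user/documents || rc=$?
# status=$(classify_backup_exit "$rc")
# [ "$status" = "ok" ] || echo "backup status: $status (exit code $rc)" >&2
```

Capturing the code with "|| rc=$?" keeps the pattern safe even when the script runs under set -e.
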
  • Using --json for Machine-Readable Output: Many Restic commands support the --json flag (e.g., restic backup --json, restic snapshots --json). This makes Restic output data in JSON format, which is much easier for scripts to parse than human-readable text.

    • Example: Get the ID of the last backup:
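      # Get the newest snapshot's short ID
      # Note: --latest 1 keeps the newest snapshot per host and path set, so the
      # array can hold several entries; add --host/--path/--tag filters if needed.
      LAST_SNAPSHOT_ID=$(restic snapshots --json --latest 1 | jq -r '.[0].short_id')
      # Requires 'jq' (a command-line JSON processor)
      # If successful, LAST_SNAPSHOT_ID will contain the ID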
      
      This is useful for more advanced scripting where you need to programmatically use information from Restic.
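
The same approach works for restic backup --json, which prints one JSON object per line and ends with a message of type "summary". The sketch below pulls a few fields out of that summary. The sample line is invented for illustration (real output carries more fields), and the function name is hypothetical.

```shell
# Illustrative summary line in the shape restic prints at the end of
# `restic backup --json`; the values are made up for this example.
sample='{"message_type":"summary","files_new":3,"files_changed":1,"data_added":20480,"snapshot_id":"a1b2c3d4"}'

# Keep only the summary message; restic also writes "status" messages to the
# same stream, and select() drops those.
summarize_backup_json() {
    jq -r 'select(.message_type == "summary")
           | "snapshot=\(.snapshot_id) new=\(.files_new) changed=\(.files_changed) added_bytes=\(.data_added)"'
}

# Demonstrate on the sample line if jq is installed
if command -v jq >/dev/null 2>&1; then
    printf '%s\n' "$sample" | summarize_backup_json
fi

# Real use: restic backup --json /path/to/data | summarize_backup_json
```

A one-line summary like this is convenient to append to the script's log file or to include in a notification.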

Scheduling Backups

Once you have a reliable backup script, you need to schedule it to run automatically.

  • cron on Linux/macOS: cron is the standard job scheduler on Unix-like systems.

    1. Edit your user's crontab: crontab -e
    2. Add a line to schedule your script. The format is: minute hour day_of_month month day_of_week /path/to/command
      • Example: Run /home/user/bin/my_restic_backup.sh every day at 2:30 AM:
        # Ensure RESTIC_PASSWORD_FILE and other env vars are set within the script
        # or in the crontab itself (though less secure for passwords in crontab).
        # It's best if the script handles its environment.
        30 2 * * * /home/user/bin/my_restic_backup.sh > /tmp/restic_cron.log 2>&1
        
        • It's good practice to redirect stdout and stderr from the cron job to a log file or /dev/null if the script itself handles logging.
        • Ensure your script uses absolute paths for commands or sets its own PATH environment variable, as cron jobs run with a minimal environment.
        • Important: Ensure RESTIC_PASSWORD_FILE and RESTIC_REPOSITORY are correctly set or accessible in the cron environment. Often, it's best to export these at the beginning of your script.
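
Before trusting the schedule, it can help to run the script once in an environment as sparse as cron's. The sketch below approximates that with env -i. The exact variables cron provides differ between implementations, so this is an approximation, and run_like_cron is a hypothetical name.

```shell
# Run a command with a nearly empty environment, similar to what cron gives
# a job: only HOME, a short PATH, and SHELL are set.
run_like_cron() {
    env -i HOME="$HOME" PATH=/usr/bin:/bin SHELL=/bin/sh /bin/sh -c "$1"
}

# Show the PATH a job would see:
run_like_cron 'echo "PATH seen by job: $PATH"'

# Test the backup script the same way (uncomment and adjust the path):
# run_like_cron '/home/user/bin/my_restic_backup.sh'
```

If the script fails here with "command not found" but works in your normal shell, it depends on PATH entries or variables that cron will not provide.
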
  • Task Scheduler on Windows: Windows uses Task Scheduler for automated jobs.

    1. Open "Task Scheduler" (search in Start Menu).
    2. Click "Create Basic Task..." or "Create Task..." for more options.
    3. Name/Description: Give your task a name (e.g., "Restic Daily Backup").
    4. Trigger: Define when it should run (e.g., "Daily", set time).
    5. Action: "Start a program".
      • Program/script: powershell.exe (or pwsh.exe for PowerShell Core)
      • Add arguments: -ExecutionPolicy Bypass -File "C:\path\to\your_restic_backup.ps1"
      • (Or directly run restic.exe if you pass all arguments on the command line here, but a script is more flexible).
    6. Conditions/Settings: Configure power options, what to do if task fails, etc.
    7. Run with highest privileges: May be needed if backing up system files.
    8. User Account: Specify the user account under which the task should run. This account needs access to the Restic binary, password file, and data to be backed up.

    A PowerShell backup script (.ps1) would be analogous to the Bash script, using PowerShell cmdlets for file operations, logging, and environment variables ($env:RESTIC_REPOSITORY, $env:RESTIC_PASSWORD_FILE).
  • systemd Timers on Linux: systemd offers a more modern and flexible alternative to cron on Linux systems that use systemd. It involves two unit files: a .service file (defines what to run) and a .timer file (defines when to run it).

    1. Create a .service file (e.g., /etc/systemd/system/restic-backup.service or ~/.config/systemd/user/restic-backup.service for a user service):
      [Unit]
      Description=Restic Backup Service
      # For a remote repository, wait until the network is up:
      # Wants=network-online.target
      # After=network-online.target
      
      [Service]
      Type=oneshot
      # To run a system service as a specific user:
      # User=your_backup_user
      # Group=your_backup_group
      # Absolute path to your script. systemd does not support trailing comments
      # on a setting line, so every comment gets a line of its own.
      ExecStart=/usr/local/bin/my_restic_backup.sh
      # Environment="RESTIC_REPOSITORY=..."
      # Environment="RESTIC_PASSWORD_FILE=..."
      # (Alternatively, set these inside my_restic_backup.sh)
      # Script output goes to these files instead of the journal
      # (the append: prefix requires systemd 240 or newer).
      StandardOutput=append:/var/log/restic-backup-service.log
      StandardError=append:/var/log/restic-backup-service.error.log
      
      # No [Install] section is needed here: restic-backup.timer starts this service.
      
    2. Create a .timer file (e.g., /etc/systemd/system/restic-backup.timer or ~/.config/systemd/user/restic-backup.timer):
      [Unit]
      Description=Run Restic Backup Daily
      # Allow starting and stopping the timer manually (these are the defaults)
      RefuseManualStart=no
      RefuseManualStop=no
      
      [Timer]
      # Run daily at 2:30 AM
      OnCalendar=*-*-* 02:30:00
      # Alternatively, run 15 minutes after boot and daily thereafter
      # OnBootSec=15min
      # OnUnitActiveSec=1d
      # Run the job at next boot if a scheduled run was missed during downtime
      Persistent=true
      
      [Install]
      WantedBy=timers.target
      
    3. Enable and Start the Timer:
      • If created as system services (in /etc/systemd/system/):
        sudo systemctl daemon-reload
        sudo systemctl enable restic-backup.timer
        sudo systemctl start restic-backup.timer
        # Check status:
        sudo systemctl list-timers
        sudo systemctl status restic-backup.service
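        # View unit start/stop messages: journalctl -u restic-backup.service
        # (script output goes to the files set by StandardOutput=/StandardError= in the service file)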
        
      • If created as user services (in ~/.config/systemd/user/):
        systemctl --user daemon-reload
        systemctl --user enable restic-backup.timer
        systemctl --user start restic-backup.timer
        # Check status:
        systemctl --user list-timers
        # User services need lingering enabled for the user if they should run when user is not logged in:
        # sudo loginctl enable-linger your_username
        
        systemd offers better logging integration (via journalctl), dependency management, and resource control compared to cron.

Excluding Files and Directories Effectively

You often don't want to back up everything (e.g., cache files, temporary files, large unimportant downloads). Restic provides several ways to exclude files.

  • --exclude <pattern>: Specify a pattern to exclude. Can be used multiple times. Patterns are shell glob patterns (e.g., *.tmp, node_modules).

    • Example: restic backup /home/user --exclude='*.log' --exclude='/home/user/Downloads'
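    • Patterns are matched against the full path of each file or directory. A pattern that starts with / is anchored at the filesystem root; any other pattern can match at any depth.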
    • If a pattern ends with /, it only matches directories.
    • ** can match any sequence of characters including path separators, e.g., **/cache/** would match any file or directory under a directory named cache anywhere in the backup.
  • --exclude-file <filepath>: Provide a path to a text file containing one exclusion pattern per line. Comments (lines starting with #) and empty lines are ignored. This is much cleaner for managing many exclusions. Example restic_excludes.txt:

    # Cache directories
    **/.cache
    **/Cache
    **/cache
    
    # Temporary files
    *.tmp
    *.temp
    *.~
    
    # Specific large directories I don't need backed up
    /home/user/Downloads/isos
    /home/user/SteamLibrary
    
    # Node.js dependencies
    node_modules/
    
    Then run: restic backup /home/user --exclude-file=/path/to/restic_excludes.txt

  • --exclude-caches Tag: This special option tells Restic to look for directories that are marked with a "Cache Directory Tag." This is a file named CACHEDIR.TAG inside a directory. The content of this file should be as specified by the Cache Directory Tagging Standard. If Restic finds such a tagged directory, it will exclude it. Many applications are starting to adopt this standard. To use: restic backup /home/user --exclude-caches You can combine this with other --exclude options.
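
To tag one of your own directories so that --exclude-caches skips it, the directory needs a CACHEDIR.TAG file whose first line is the fixed signature defined by the Cache Directory Tagging Standard. A minimal sketch follows; the function name and the example directory are hypothetical.

```shell
# Mark a directory as a cache so that `restic backup --exclude-caches` skips it.
# The file must begin with exactly this signature line.
mark_cache_dir() {
    local dir=$1
    mkdir -p -- "$dir" || return 1
    printf '%s\n' \
        'Signature: 8a477f597d28d172789f06886806bc55' \
        '# This file marks the directory as a cache; backup tools may skip it.' \
        > "$dir/CACHEDIR.TAG"
}

# Example (hypothetical directory):
# mark_cache_dir "$HOME/projects/myapp/build-cache"
```
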

  • --iexclude <pattern> and --iexclude-file <filepath>: Case-insensitive versions of --exclude and --exclude-file.

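  • Order of Precedence for Include/Exclude Rules: For restic backup, selection starts from the source paths you list (on the command line or via the --files-from options). Every --exclude, --iexclude, and --exclude-file pattern is then applied to all items under those paths, and an item is skipped as soon as any pattern matches it, so the order of plain exclude patterns does not matter. Order matters only for negated patterns (a leading !), which re-include something an earlier pattern excluded: patterns are evaluated in the order given, and a later pattern overrides an earlier one. --include is a restore option, not a backup option:

    • restic restore ... --include foo restores only paths matching foo, while --exclude bar restores everything except paths matching bar. Restic has traditionally rejected a restore command that combines the two, so choose one per restore.
    • For backup, it is usually simplest to manage exclusions through --exclude and --exclude-file. If you specify multiple source paths, the exclusions apply to all items being considered from those source paths.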

Backup Strategies

A "strategy" involves deciding what to back up, how often, and where to store it, guided by principles like the 3-2-1 rule.

  • Full vs. Incremental (Restic is always "incremental forever" effectively):

    • Traditional Full Backup: A complete copy of all selected data.
    • Traditional Incremental Backup: Copies data changed since the last backup (full or incremental).
    • Traditional Differential Backup: Copies data changed since the last full backup.
    • Restic's Approach: Restic doesn't strictly follow these traditional models for its core mechanism. Every Restic backup is like a "full" snapshot in terms of what it represents (a complete view of the selected data at that point in time). However, due to deduplication, it only transfers and stores the changed data (new unique chunks) compared to what's already in the repository.
      • The first backup of a dataset is effectively a "full" transfer.
      • Subsequent backups are "incremental" in terms of data transfer and storage, but result in a new, complete, independent snapshot. This is often called "incremental forever" or "deduplicated fulls." You don't need to manage chains of full + incrementals; every snapshot is self-contained for restore.
  • 3-2-1 Backup Rule and how Restic fits in:
    A widely respected guideline for data protection:

    • 3 Copies of Your Data: Your primary (live) data + two backups.
    • 2 Different Storage Media: Store copies on at least two distinct types of storage (e.g., internal HDD, external HDD, LTO tape, cloud storage). This protects against failure of a specific medium type.
    • 1 Off-site Copy: At least one backup copy should be stored in a different physical location (e.g., cloud, friend's house, office safe). This protects against local disasters like fire, flood, or theft.

    How Restic Helps Implement 3-2-1:

    • Multiple Copies: You can create multiple Restic repositories.
      • Repo 1: Local external HDD (Copy 2, Medium 1, On-site)
      • Repo 2: Remote SFTP server or S3 cloud bucket (Copy 3, Medium 2, Off-site)
    • You would run your Restic backup script targeting both repositories (or use restic copy to synchronize snapshots between an on-site and off-site repository).
    • Restic's client-side encryption ensures your off-site copy is secure even if stored on third-party infrastructure.
  • Backing Up Different Types of Data:

    • User Files (Documents, Photos, etc.): Straightforward. Point Restic to ~/Documents, ~/Pictures, etc.
    • Application Configuration Files: Usually in /etc, ~/.config, ~/.local/share. Back these up.
    • Databases (PostgreSQL, MySQL, SQLite, etc.):
      • Critical: You cannot reliably back up live database files directly by just copying them, as they might be in an inconsistent state.
      • Solution: Use the database's native dump tool to create a consistent backup file first, then back up that dump file with Restic.
        • PostgreSQL: pg_dump or pg_dumpall
        • MySQL/MariaDB: mysqldump
        • SQLite: The .sqlite file can often be copied if no process is writing to it, or use the .backup command in the sqlite3 CLI.
      • Your pre-backup hook in the script would run the dump; your post-backup hook might remove the dump file.
    • Virtual Machines:
      • If VMs are shut down, you can back up their disk image files.
      • If live, it's better to use snapshot capabilities of the hypervisor (if available) to get a consistent state, then back up that snapshot or export. Or, run Restic inside the VM to back up its critical data.
    • Docker Volumes:
      • Stop containers using the volume.
      • Back up the volume's data from the Docker host path (e.g., /var/lib/docker/volumes/myvolume/_data).
      • Or, run Restic in a temporary container that mounts the volume and the Restic config/cache, then performs the backup.
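
The dump-then-backup pattern for databases can be sketched as a single function that runs the dump, backs up the result, and removes the dump in every case. DUMP_CMD, BACKUP_CMD, DUMP_FILE, and the function name are placeholders. The defaults assume PostgreSQL and a repository already configured through RESTIC_REPOSITORY and RESTIC_PASSWORD_FILE.

```shell
# Placeholders: adjust for your database and repository.
DUMP_CMD="${DUMP_CMD:-pg_dumpall -U postgres}"
BACKUP_CMD="${BACKUP_CMD:-restic backup --tag db_dump}"
# A fixed path keeps the snapshot path identical between runs, which matters
# when `restic forget` groups snapshots by path.
DUMP_FILE="${DUMP_FILE:-$HOME/restic_db_dump.sql}"

backup_with_dump() {
    local rc=0
    # Pre-backup hook: write the dump with private permissions.
    if ( umask 077; $DUMP_CMD > "$DUMP_FILE" ); then
        $BACKUP_CMD "$DUMP_FILE" || rc=$?
    else
        rc=1
        echo "Database dump failed; nothing was backed up." >&2
    fi
    # Post-backup hook: remove the dump whether or not the backup succeeded.
    rm -f "$DUMP_FILE"
    return "$rc"
}

# Usage: backup_with_dump || echo "database backup failed" >&2
```

Skipping the backup when the dump fails avoids storing a truncated dump that looks like a valid snapshot.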

Workshop Automated Backup Script with Exclusions

This workshop will guide you through creating a Bash script for automated backups, incorporating exclusions, basic logging, and then setting it up with cron.

Goals:

  1. Create an exclusion file.
  2. Develop a Bash script that:
    • Uses environment variables for repository and password file (assumed to be set in the script or calling environment).
    • Backs up your user's home directory (or a chosen subdirectory for safety/speed in the workshop).
    • Uses the exclusion file.
    • (Optionally) Excludes files larger than a certain size. Current Restic releases support this directly with --exclude-larger-than <size> (e.g., --exclude-larger-than 500M), so pre-filtering with find is not needed. For this workshop, we'll focus on pattern exclusions.
    • Logs its activity to a file.
  3. Set up a cron job (or systemd timer if you prefer and are on Linux) to run this script.

Prerequisites:

  • A Linux or macOS environment (for Bash and cron).
  • Restic installed.
  • An initialized Restic repository (local or remote, e.g., the SFTP one from Workshop 4).
  • RESTIC_REPOSITORY and RESTIC_PASSWORD_FILE environment variables should be handled by the script or pre-set.

Steps:

  1. Prepare Environment and Test Data:

    • Repository: Decide which repository to use. For this workshop, using the local ~/restic_workshop/my_local_repo is fine, or use the SFTP one if you have it running (sftp:my_sftp_backup_server:/srv/restic_storage/my_client_sftp_repo).
    • Password File: Ensure the corresponding password file is set up (e.g., ~/restic_workshop/mypass.txt or ~/restic_workshop/sftp_pass.txt).
    • Backup Source: For safety and speed during the workshop, let's not back up your entire home directory. Create a dedicated test directory.
      # In ~/restic_workshop
      mkdir -p ./home_sim/Documents
      mkdir -p ./home_sim/Downloads
      mkdir -p ./home_sim/.cache # A cache directory to exclude
      echo "My important document" > ./home_sim/Documents/doc1.txt
      echo "A temporary download" > ./home_sim/Downloads/big_file.iso.tmp # A temp file
      echo "Cache content" > ./home_sim/.cache/app_cache_data
      touch ./home_sim/another_file.log # A log file
      
    • Target for Backup: We will back up ./home_sim.
  2. Create an Exclusion File: Create ~/restic_workshop/my_restic_excludes.txt:

    # Exclude all .cache directories and their contents
    **/.cache
    
    # Exclude temporary files
    *.tmp
    *.temp
    
    # Exclude log files
    *.log
    
    # Exclude specific subdirectories of Downloads if needed
    # home_sim/Downloads/unwanted_stuff/
    
    Note: Patterns without a leading / are not tied to a particular directory; they can match at any depth of the paths being backed up. **/.cache matches any directory named .cache anywhere within the backup source. If you backed up /home/user and had /home/user/app/.cache, it would be matched. When backing up ./home_sim, ./home_sim/.cache is matched.

  3. Develop the Backup Script: Create ~/restic_workshop/do_backup.sh with the following content. Adjust REPO_URL and PASS_FILE carefully.

    #!/bin/bash
    
    # Exit on error
    set -e
    
    # --- Configuration ---
    # !! IMPORTANT !! SET THESE TO YOUR ACTUAL VALUES
    # Using the local repo for this workshop example for simplicity
    REPO_URL="${HOME}/restic_workshop/my_local_repo"
    PASS_FILE="${HOME}/restic_workshop/mypass.txt" # Assumes this file contains the password for REPO_URL
    
    # Source directory to back up
    SOURCE_DIR="${HOME}/restic_workshop/home_sim"
    
    # Exclude file
    EXCLUDE_FILE="${HOME}/restic_workshop/my_restic_excludes.txt"
    
    # Log file
    LOG_DIR="${HOME}/restic_workshop/logs"
    LOG_FILE="${LOG_DIR}/restic_backup_$(date +%Y-%m-%d).log"
    
    # Retention (applied after successful backup)
    KEEP_LAST=5
    KEEP_DAILY=7
    KEEP_WEEKLY=4
    # --- End Configuration ---
    
    # Ensure log directory exists
    mkdir -p "${LOG_DIR}"
    
    # Function for logging
    log_msg() {
        echo "$(date '+%Y-%m-%d %H:%M:%S') - $1" | tee -a "${LOG_FILE}"
    }
    
    # Check if source directory exists
    if [ ! -d "${SOURCE_DIR}" ]; then
        log_msg "ERROR: Source directory ${SOURCE_DIR} not found."
        exit 1
    fi
    
    # Check if password file exists
    if [ ! -f "${PASS_FILE}" ]; then
        log_msg "ERROR: Restic password file ${PASS_FILE} not found."
        exit 1
    fi
    # It's good practice to ensure password file has restrictive permissions (e.g. 600 or 400)
    # chmod 600 "${PASS_FILE}" # Uncomment if you want script to enforce this
    
    # Set Restic environment variables
    export RESTIC_REPOSITORY="${REPO_URL}"
    export RESTIC_PASSWORD_FILE="${PASS_FILE}"
    
    log_msg "=== Starting Restic Backup for ${SOURCE_DIR} ==="
    
    # Backup
    log_msg "Running backup..."
    # nice/ionice lower CPU and I/O priority (ionice is Linux-only; drop it on macOS).
    # The exit code is captured with "|| ..." so that "set -e" does not abort
    # the script before the code can be inspected.
    BACKUP_EC=0
    nice -n 10 ionice -c 3 restic backup "${SOURCE_DIR}" \
        --exclude-file="${EXCLUDE_FILE}" \
        --tag "automated_home_sim" \
        --verbose >> "${LOG_FILE}" 2>&1 || BACKUP_EC=$?
    
    if [ ${BACKUP_EC} -eq 0 ]; then
        log_msg "Backup completed successfully."
    elif [ ${BACKUP_EC} -eq 3 ]; then
        log_msg "Backup completed, but some source files could not be read (incomplete snapshot). Exit code: ${BACKUP_EC}"
        # A snapshot was created, so we still proceed with forget/prune
    else
        log_msg "ERROR: Backup failed with exit code ${BACKUP_EC}."
        log_msg "See Restic output above for details."
        log_msg "=== Backup Job FAILED ==="
        exit ${BACKUP_EC} # Exit script with Restic's error code
    fi
    
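    # Forget and Prune (only reached if a snapshot was created)
    log_msg "Running forget and prune..."
    # Capture the exit code explicitly; with "set -e" a plain failure would
    # end the script before the error could be logged.
    FORGET_EC=0
    nice -n 10 ionice -c 3 restic forget \
        --keep-last ${KEEP_LAST} \
        --keep-daily ${KEEP_DAILY} \
        --keep-weekly ${KEEP_WEEKLY} \
        --prune \
        --group-by "paths,tags" >> "${LOG_FILE}" 2>&1 || FORGET_EC=$?
    
    if [ ${FORGET_EC} -eq 0 ]; then
        log_msg "Forget and prune completed successfully."
    else
        log_msg "ERROR: Forget/prune failed with exit code ${FORGET_EC}."
    fi
    
    log_msg "=== Backup Job Finished ==="
    # Exit non-zero if the backup was incomplete or the prune failed
    if [ ${BACKUP_EC} -ne 0 ]; then exit ${BACKUP_EC}; fi
    exit ${FORGET_EC}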
    
    Make it executable: chmod +x ~/restic_workshop/do_backup.sh

  4. Test the Script Manually:

    ~/restic_workshop/do_backup.sh
    

    • Check the output on the console and in the log file (~/restic_workshop/logs/restic_backup_YYYY-MM-DD.log).
    • Verify a new snapshot appears: restic -r "${HOME}/restic_workshop/my_local_repo" -p "${HOME}/restic_workshop/mypass.txt" snapshots
    • Check if exclusions worked. Use restic ls <snapshot_id> to inspect the contents of the new snapshot. You should not see .cache directory, *.tmp, or *.log files from home_sim. E.g.:
      # Get latest snapshot ID for the specific path and tag
      LATEST_ID=$(restic -r "${HOME}/restic_workshop/my_local_repo" -p "${HOME}/restic_workshop/mypass.txt" snapshots --json --latest 1 --path "${HOME}/restic_workshop/home_sim" --tag "automated_home_sim" | jq -r '.[0].short_id')
      echo "Inspecting snapshot: $LATEST_ID"
      restic -r "${HOME}/restic_workshop/my_local_repo" -p "${HOME}/restic_workshop/mypass.txt" ls $LATEST_ID
      
      You should see home_sim/Documents/doc1.txt but not the excluded files/folders.
  5. Set up a Cron Job:

    • Open your crontab: crontab -e
    • Add a line to run the script. For testing, let's run it every 5 minutes. (Remember to remove or change this after testing!)
      # Example: Run every 5 minutes for testing.
      # IMPORTANT: Change this to a sane schedule like daily after testing.
      */5 * * * * /bin/bash ${HOME}/restic_workshop/do_backup.sh
      
      • Note: Cron often runs with a very minimal environment. The script explicitly sets RESTIC_REPOSITORY and RESTIC_PASSWORD_FILE using absolute paths, which is good. nice and ionice are used to make the backup less resource-intensive if run in the background.
      • PATH issues: If restic or other commands like nice, ionice, jq are not found, you might need to specify their full paths in the script or set a PATH variable at the top of the script: export PATH=/usr/local/bin:/usr/bin:/bin:$PATH.
    • Wait for it to run. Check the log file in ~/restic_workshop/logs/.
    • Verify new snapshots are created.
    • Once testing is complete, change the cron schedule to something reasonable, e.g., daily at 3 AM:

      0 3 * * * /bin/bash ${HOME}/restic_workshop/do_backup.sh
      
      And save the crontab.

    • Alternative for systemd users (Linux): You could adapt the systemd timer example from the theory section to call do_backup.sh. The .service file's ExecStart would point to your script. Ensure the user running the systemd service (if it's a system service) has access to the repository, password file, and source data. User services are often easier for home directory backups.

This workshop provided a practical template for an automated Restic backup script. Remember that robust scripting also involves more comprehensive error notification (e.g., email alerts), which was omitted for brevity but is crucial for production systems. Regularly check your logs and test your restores!
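
As a starting point for the notification piece, the sketch below wraps any command and passes a short failure message to a notifier. NOTIFY_CMD and run_and_notify are hypothetical names. The notifier receives the message on standard input, and the default simply prints it to stderr.

```shell
# Placeholder notifier: replace with mail, a webhook script, or similar.
NOTIFY_CMD="${NOTIFY_CMD:-cat >&2}"

# Run the given command; if it fails, send a one-line message to NOTIFY_CMD
# and return the command's exit code unchanged.
run_and_notify() {
    local rc=0
    "$@" || rc=$?
    if [ "$rc" -ne 0 ]; then
        printf 'Backup job "%s" failed on %s with exit code %s\n' \
            "$*" "$(uname -n)" "$rc" | sh -c "$NOTIFY_CMD"
    fi
    return "$rc"
}

# Usage sketch (the address is an example; `mail` must be installed and configured):
# NOTIFY_CMD='mail -s "restic backup failed" admin@example.com'
# run_and_notify "$HOME/restic_workshop/do_backup.sh"
```

This only works if the wrapped script exits non-zero on failure, so check its final exit status before relying on alerts.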

6. Restic Internals and Troubleshooting

Understanding some of Restic's internal workings can be immensely helpful when troubleshooting issues or trying to optimize performance. This section provides a glimpse into Restic's data structures, common problems and their solutions, performance tuning tips, and how to handle data migration or repository upgrades.

Understanding Restic's Data Structures

Restic organizes data within its repository in a specific, well-defined way. The repository itself is just a directory (or an S3 bucket prefix, etc.) containing several subdirectories and a config file.

  • config file:

    • Located at the root of the repository.
    • A JSON file containing the repository ID, the repository format version, and the chunker polynomial used for content-defined chunking.
    • Like every file outside keys/, it is encrypted. Restic can read it only after it has unlocked the master key with your password.
  • keys/ directory:

    • Contains files, each holding an encrypted copy of the master encryption key and master MAC key for the repository.
    • These files are encrypted using a key derived from your repository password via scrypt.
    • This design lets several passwords coexist and makes password changes cheap: restic key add creates a new key file, restic key remove deletes an old one, and restic key passwd changes the password of the current key. None of these commands re-encrypt the stored data.
  • snapshots/ directory:

    • Stores one file per snapshot. Each file is named with the full SHA-256 ID of the snapshot.
    • These files are JSON structures containing metadata about the backup:
      • Time of backup.
      • Hostname where the backup was made.
      • Username.
      • Tags associated with the snapshot.
      • Paths that were backed up.
      • A pointer (SHA-256 hash) to the root tree object that represents the top-level directory structure of this snapshot.
    • Snapshot files are encrypted.
  • index/ directory:

    • Contains index files. Each file is named with a SHA-256 ID.
    • These files map blob IDs (SHA-256 hashes of content) to the pack files where these blobs are stored, along with their offset and length within the pack.
    • The index is crucial for quickly locating data and for deduplication (checking if a blob already exists).
    • Index files are encrypted. Restic loads these into memory (or a local cache) for fast lookups.
    • restic rebuild-index (named restic repair index in Restic 0.16 and later) reconstructs these files from the pack files if they are damaged or missing. restic prune also writes a fresh index.
  • locks/ directory:

    • Contains lock files created by Restic operations to prevent concurrent modifications that could corrupt the repository.
    • Each lock file is named with a unique ID.
    • Stale locks (left behind after a crash) can be removed with restic unlock.
  • data/ directory:

    • This is where the actual backup data is stored in pack files.
    • Each pack file is named after its own SHA-256 ID and placed in a subdirectory named after the first two hexadecimal digits of that ID (e.g., 00/, 0a/, ff/), giving paths such as 00/00ab12ef.... This sharding keeps any single directory from holding a very large number of pack files.
    • Each pack file contains multiple blobs concatenated together.
    • Pack files are encrypted.
  • Blobs (the fundamental units): Restic distinguishes two types of blobs, each identified by the SHA-256 hash of its content. Both are stored in pack files; current versions keep data blobs and tree blobs in separate packs:

    1. Data Blobs:
      • These store the (compressed and then encrypted) content of your file chunks.
      • When Restic chunks a file, each chunk's content is hashed. This hash becomes the ID of the data blob.
    2. Tree Blobs:
      • These store (encrypted) directory listings and file metadata.
      • A tree blob represents a single directory. It contains a list of "nodes." Each node can be:
        • A file: storing its name, size, modification time, permissions, and a list of data blob IDs (hashes) that constitute its content.
        • A subdirectory: storing its name and a pointer (hash) to another tree blob representing that subdirectory.
        • Other filesystem objects like symlinks.
      • The structure is hierarchical: a snapshot points to a root tree blob, which points to other tree blobs for subdirectories, and eventually to data blobs for file content.
      • Tree blobs are also deduplicated. If a directory's content (list of files and subdirs with their metadata) hasn't changed, its tree blob hash will be the same, and it won't be stored again.
  • How it all fits together (simplified backup process):

    1. Restic scans the files/directories to be backed up.
    2. For each file, it performs Content Defined Chunking.
    3. Each chunk's content is hashed (SHA-256). This hash is the data blob's ID.
    4. Restic checks its index: does this data blob ID already exist?
      • If yes: Deduplicated. Restic just notes the ID.
      • If no: The chunk is compressed (only in repository format version 2, which uses zstd), encrypted, and eventually written to a pack file. The index is updated.
    5. For each directory, Restic creates a list of its contents (files, subdirectories, metadata, pointers to their respective data/tree blob IDs). This list forms a tree blob.
    6. The tree blob is hashed. This hash is its ID. It's then checked against the index for deduplication, encrypted, and stored if new.
    7. This process continues recursively up to the root(s) of the backup.
    8. Finally, a snapshot file is created, pointing to the root tree blob(s) of the backup, along with other metadata (time, host, tags). This snapshot file is encrypted and saved.
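
The content-addressing and sharding ideas above can be made concrete with a toy store. This is only a sketch under loud assumptions: shard_path and store_chunk are hypothetical helpers, each input file is treated as one chunk (no content-defined chunking), nothing is encrypted or packed, and sha256sum from GNU coreutils is assumed to be installed. It illustrates just two points: an ID is the SHA-256 of the content, and content whose ID is already present is not stored again.

```shell
#!/bin/sh
# Toy model only: NOT Restic's on-disk format (no encryption, no pack files,
# no content-defined chunking). Helper names are hypothetical.

# Map an ID to a sharded path, mirroring the data/<first two hex digits>/<id> layout.
shard_path() {
  sp_id="$1"
  sp_prefix="$(printf '%s' "$sp_id" | cut -c1-2)"
  printf 'data/%s/%s\n' "$sp_prefix" "$sp_id"
}

# Store one chunk file under the SHA-256 of its content.
# Prints "new <id>" the first time and "dedup <id>" if it is already stored.
store_chunk() {
  sc_store="$1"
  sc_chunk="$2"
  sc_id="$(sha256sum "$sc_chunk" | cut -d' ' -f1)"
  sc_target="${sc_store}/$(shard_path "$sc_id")"
  if [ -e "$sc_target" ]; then
    echo "dedup ${sc_id}"
  else
    mkdir -p "$(dirname "$sc_target")"
    cp "$sc_chunk" "$sc_target"
    echo "new ${sc_id}"
  fi
}
```

Calling store_chunk ./store file_a and then store_chunk ./store file_b on two files with identical contents reports new for the first call and dedup for the second, which is the behaviour step 4 describes.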

Troubleshooting Common Issues

Even with a robust tool like Restic, you might encounter issues. Here's how to approach some common ones:

  • "Repository not found" or "password incorrect" or "Fatal: unable to open_repository: repository does not exist":

    • Cause:
      1. Incorrect repository path (-r or RESTIC_REPOSITORY). Double-check for typos, correct protocol (e.g., sftp:, s3:), and server details.
      2. Incorrect password (either typed interactively or in RESTIC_PASSWORD_FILE / RESTIC_PASSWORD). Restic reports this as "wrong password or no key found", so the same message also appears when the keys/ directory is empty or unreadable.
      3. Network issues preventing access to a remote repository.
      4. Permissions issues (Restic process cannot read/write to the local repository path or connect to the remote one).
      5. For S3/B2/Azure/GCS: Incorrect credentials (access keys, secrets, account IDs) or bucket/container names. Ensure environment variables like AWS_ACCESS_KEY_ID are correctly set and exported.
    • Troubleshooting:
      • Verify the repository path meticulously.
      • If using RESTIC_PASSWORD_FILE, ensure the file path is correct and the file contains only the password and a single newline at the end (some systems are sensitive to extra newlines or spaces). cat -A your_password_file can show hidden characters.
      • Try providing the password interactively to rule out password file issues.
      • Test network connectivity to remote hosts (e.g., ping, ssh, mc ls ALIAS/bucket if using MinIO client).
      • Check file system permissions for local repositories.
      • For cloud backends, double-check credentials and bucket/container policies in the cloud provider's console. Use verbose flags with Restic (e.g., -vv) or with the cloud provider's CLI tool to get more diagnostic info.
  • Stale Locks (restic unlock):

    • Symptom: Commands fail with "repository is locked by another process" or similar.
    • Cause: A previous Restic process (backup, prune, etc.) terminated uncleanly (crashed, killed, network drop) without removing its lock file from the locks/ directory in the repository.
    • Solution:
      1. Verify: Make absolutely sure no other Restic process is legitimately accessing the repository. Check ps aux | grep restic on relevant machines.
      2. List Locks: restic list locks prints the IDs of the existing lock files. restic cat lock <lock_id> decrypts one of them and shows its hostname, username, PID, and creation time, which helps you judge whether a lock is genuinely old or potentially active.
      3. Remove Locks:
        • restic unlock removes only the locks that Restic itself considers stale.
        • restic unlock --remove-all removes all locks. Use with extreme caution, only when certain no other process is active.
    • Prevention: Ensure scripts handle signals gracefully (trap SIGINT, SIGTERM) to attempt clean shutdown. Robust network connections help for remote repositories.
  • Slow Backups/Restores (Network, Disk I/O, CPU):

    • Cause & Troubleshooting:
      • Network (for remote repos):
        • Bandwidth: Is your upload/download speed a bottleneck? Use speed test tools. Restic offers --limit-upload and --limit-download (in KiB/s) to throttle itself if it's overwhelming your connection.
        • Latency: High latency (long ping times) to the remote server can significantly slow down operations involving many small file transfers or interactions, like SFTP.
        • Packet Loss: Can cripple performance. ping -c 100 your_server can show packet loss.
      • Disk I/O (Client and/or Server):
        • Slow hard drives (especially SMR drives for writes, or very fragmented drives).
        • Use tools like iostat, iotop (Linux) or Resource Monitor (Windows) to check disk activity and queue lengths.
        • For local repositories, ensure the disk is healthy and has enough free space.
        • For SFTP servers, the server's disk performance is critical.
      • CPU (Client):
        • Restic performs chunking, hashing, and encryption (client-side). This is CPU-intensive. Older or low-power CPUs can be a bottleneck.
        • Check CPU usage during backup (top, htop on Linux).
        • Restic tries to use multiple CPU cores.
      • Small Files: Backing up a huge number of very small files can be slow due to metadata overhead per file and the overhead of individual operations, regardless of raw throughput.
      • Repository State: A repository needing a prune or with a very fragmented index might perform sub-optimally.
      • Antivirus/Security Software: On the client, such software might be scanning files Restic reads/writes, slowing it down. Try temporarily disabling or creating exceptions (with caution).
  • Corrupted Repository (restic check --read-data helps):

    • Symptom: restic check reports errors. restic backup or restic restore might fail with errors about missing blobs, pack files, or MAC verification failures.
    • Cause:
      • Hardware issues (failing disk on client or server, bad RAM).
      • Filesystem corruption.
      • Bugs in Restic (rare, but possible).
      • Manual tampering with repository files.
      • Unreliable network causing silent data corruption during transfer to remote storage (though Restic's end-to-end checksums usually catch this).
    • Troubleshooting & Recovery (Very Advanced, Potential Data Loss):
      1. STOP all backups to this repository immediately.
      2. Run restic check --read-data. Note all errors carefully. This identifies which pack files or blobs are affected.
      3. If the errors point to specific pack files:
        • restic find --pack <pack_id> can show which snapshots/files might be affected by a bad pack.
        • No Restic command can recreate data that is truly lost. Newer releases provide restic repair subcommands (for example repair index and repair snapshots), but they work by dropping or rewriting whatever is unreadable. The goal is to salvage as much unaffected data as possible.
        • One drastic measure could be to remove the problematic pack file(s) from the data/ directory (e.g., mv data/xy/xyz... /tmp/bad_packs/). This WILL lead to data loss for any snapshots/files that relied on blobs in that pack.
        • After removing bad packs, run restic rebuild-index (restic repair index in Restic 0.16 and later). This will make Restic "forget" about the blobs in the removed packs.
        • Then run restic check again. It might show missing blobs.
        • Attempt to restore critical data. Some files/snapshots will be incomplete or unrestorable.
      4. The safest approach if corruption is found is often to:
        • Identify which snapshots are affected using restic find --blob <blob_id> for each missing/corrupt blob reported by check.
        • Try to restore unaffected snapshots or files to a new location.
        • Create a brand new, healthy Restic repository.
        • Back up your source data again to this new repository.
        • If possible and if some snapshots in the old (damaged) repo are still valuable and seem mostly intact, you might try restic copy from the old to the new repo, but it may fail if it encounters the corrupted parts.
      5. Prevention: Use reliable hardware, ECC RAM if possible, filesystems with checksumming (ZFS, Btrfs), and regular restic check --read-data runs.
  • Dealing with "parent snapshot not found" errors (rare):

    • Cause: During backup, Restic can use a "parent" snapshot to speed up scanning for unchanged files. If this parent snapshot ID is specified (e.g., via --parent <id>) but doesn't exist, or if Restic's logic for finding a suitable parent fails in some edge case, this error can occur. It might also happen if a snapshot was partially removed or the repository metadata is inconsistent.
    • Solution:
      • Try running the backup without explicitly specifying a --parent flag: restic backup /path/to/data. Restic will then try to find a suitable parent automatically based on host and paths.
      • If it persists, try restic backup --force /path/to/data. The --force flag makes Restic re-read all files instead of relying on metadata comparisons with a parent, which might bypass the issue.
      • Run restic check to ensure repository integrity. If there are underlying issues, they need to be addressed.
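
The password-file check suggested under the first issue above can be scripted. The following is a minimal sketch, not part of Restic: check_password_file is a hypothetical helper that flags the formatting problems most likely to sneak into a password file (Windows line endings, extra lines, trailing blanks). Whether a given Restic version actually trips over any of them is an assumption you should verify; the helper only reports what it finds.

```shell
#!/bin/sh
# Hypothetical helper: report common formatting problems in a password file.
# Prints one keyword and returns non-zero unless the file looks clean.
check_password_file() {
  cpf_file="$1"
  if [ ! -r "$cpf_file" ]; then
    echo "unreadable"; return 1
  fi
  if [ ! -s "$cpf_file" ]; then
    echo "empty"; return 1
  fi
  # A carriage return usually means the file was saved with CRLF line endings.
  cpf_cr="$(printf '\r')"
  if grep -q "$cpf_cr" "$cpf_file"; then
    echo "carriage-return"; return 1
  fi
  # More than one newline means there is more than just the password line.
  cpf_lines="$(wc -l < "$cpf_file" | tr -d ' ')"
  if [ "$cpf_lines" -gt 1 ]; then
    echo "multiple-lines"; return 1
  fi
  # Spaces or tabs at the end of the line are easy to miss in an editor.
  if grep -q '[[:space:]]$' "$cpf_file"; then
    echo "trailing-whitespace"; return 1
  fi
  echo "ok"
}
```

A typical call is check_password_file "$RESTIC_PASSWORD_FILE". If it prints anything other than ok, fix the file before digging into network or permission causes.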

Performance Tuning

Optimizing Restic's performance involves considering the entire chain: source disk, CPU, network, and destination storage.

  • Network Bandwidth Considerations:

    • As mentioned, use --limit-upload N and --limit-download N (where N is in KiB/s) if Restic is saturating your link and causing issues for other applications.
    • For remote repositories, choose a server/service geographically close to you to reduce latency.
    • Restic talks to the backend over several concurrent connections (default 5 for SFTP and for S3 and other HTTP-based backends). The number is configurable per backend with -o <backend>.connections=N, for example -o sftp.connections=N or -o s3.connections=N. Tuning this might help on high-bandwidth, high-latency links, but too many connections can also be detrimental or hit server-side limits.
  • Disk I/O on Client and Server:

    • Source (Client): If backing up from slow HDDs, especially many small files, reading the source data can be the bottleneck. Consider faster SSDs for frequently changing source data.
    • Cache (Client): Restic uses a local cache (~/.cache/restic or RESTIC_CACHE_DIR). Ensure this is on a reasonably fast disk. For remote repositories the cache stores index files, snapshot files, and the pack files that hold directory metadata (tree blobs), reducing repeated downloads.
    • Destination (Server/Local Repo):
      • If writing to a local HDD, ensure it's not overly fragmented and is performing well.
      • For self-hosted remote storage (SFTP, MinIO), the server's disk I/O is critical. Use SSDs on the server if possible, or a fast RAID array.
  • CPU Usage:

    • Encryption and chunking are CPU-bound. Faster CPUs with more cores will generally perform better.
    • Restic will try to use available cores. There isn't much direct tuning for CPU usage other than ensuring your system isn't already CPU-starved by other processes. Using nice (Linux/macOS) or process priority settings (Windows) can make Restic "play nicer" with other applications.
  • Using a Local Cache (RESTIC_CACHE_DIR):

    • As mentioned, RESTIC_CACHE_DIR (default ~/.cache/restic) is important for performance with remote repositories. It stores parts of the index and metadata locally.
    • If the default location is on a slow or network-mounted drive, move the cache to a fast local SSD: export RESTIC_CACHE_DIR=/mnt/fast_ssd/restic_cache.
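    • The cache can grow quite large for very large repositories. restic cache lists the cache directories, and restic cache --cleanup removes directories that have not been used recently (the threshold is set with --max-age, in days). The cache command has no size-limit option; if a cache directory grows too large you can delete it, and Restic rebuilds it on the next run.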
  • Parallelism Options:

    • --option sftp.connections=N (or similar for other backends like b2.connections, s3.connections) allows tuning the number of parallel upload/download streams for some backends. The default is often 5. Increasing this might help on high-latency, high-bandwidth links but could also overwhelm the server or your local resources. Experiment carefully.
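    • For backup itself, Restic parallelizes file scanning and data processing internally and is designed to use multiple cores. Newer releases accept --read-concurrency N on restic backup to set how many files are read in parallel, and the Go runtime variable GOMAXPROCS can cap the number of cores Restic uses.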
  • Pack Size (--pack-size):

    • Restic targets a pack size of about 16 MiB by default. If you have specific needs (e.g., very slow storage where fewer, larger files are better), Restic 0.14 and later accept the global option --pack-size N (in MiB) or the RESTIC_PACK_SIZE environment variable. This is an advanced option, and the default is usually fine. Note that restic prune --max-repack-size SIZE is a different setting: it limits how much data a single prune run repacks, not the size of the resulting packs.
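
Several of the knobs above can be combined in one wrapper. This is a sketch under assumptions, not a recommended configuration: tuned_backup, LIMIT_UPLOAD_KIB, and SFTP_CONNECTIONS are hypothetical names, the default values are placeholders, and the connection option only matters for an SFTP repository. The flags themselves (--limit-upload, -o sftp.connections) and RESTIC_CACHE_DIR are the ones discussed in this section.

```shell
#!/bin/sh
# Hypothetical wrapper combining the tuning options discussed above.
# Expects RESTIC_REPOSITORY and RESTIC_PASSWORD_FILE to be set by the caller.
tuned_backup() {
  tb_src="$1"
  # Keep the cache on a fast local disk (falls back to the default location).
  RESTIC_CACHE_DIR="${RESTIC_CACHE_DIR:-$HOME/.cache/restic}" \
  nice -n 10 restic backup \
    --limit-upload "${LIMIT_UPLOAD_KIB:-2048}" \
    -o "sftp.connections=${SFTP_CONNECTIONS:-5}" \
    "$tb_src"
}

# Example (placeholder values):
#   LIMIT_UPLOAD_KIB=1024 SFTP_CONNECTIONS=8 tuned_backup /srv/data
```

Change one setting at a time and compare run times from your logs; raising the connection count and lowering the upload limit at once makes it hard to tell which change helped.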

Data Migration and Repository Upgrades

  • Moving a Repository to a New Backend (e.g., Local to S3): The most straightforward and Restic-idiomatic way is using restic copy.

    1. Initialize the new (empty) destination repository:
      export RESTIC_PASSWORD_FILE_NEW="..." # Password file for the new repo
      restic -r s3:your-s3-endpoint/new-bucket/new-repo --password-file "$RESTIC_PASSWORD_FILE_NEW" init
      
    2. Copy snapshots from the old repository to the new one:
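      export RESTIC_PASSWORD_FILE_OLD="..." # Password file for the old repo
      # Syntax for Restic releases before 0.14. Newer releases deprecate --repo2
      # in favor of --from-repo, which names the SOURCE repository instead.
      restic -r /path/to/old_local_repo --password-file "$RESTIC_PASSWORD_FILE_OLD" \
             copy \
             --repo2 s3:your-s3-endpoint/new-bucket/new-repo --password-file2 "$RESTIC_PASSWORD_FILE_NEW"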
      
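      • This command reads all data from the source, decrypts it with the source repository's keys, re-encrypts it with the destination repository's keys (which differ even if both repositories share a password), and writes it to the destination. Only blobs missing from the destination are copied. Deduplication between copied data and data backed up directly to the destination works well only if both repositories use the same chunker parameters, which restic init --copy-chunker-params can arrange when creating the destination.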
      • Can copy all snapshots or select specific ones.
      • This can take a long time for large repositories.
    3. Alternative (Non-Restic, for identical backend types): If moving between two identical backend types (e.g., one S3 bucket to another, or one local filesystem to another), you could use tools like rclone copy or rsync to copy the repository directory structure directly.
      • CRITICAL: The repository must NOT be in use during this kind of direct copy.
      • The repository password and internal encryption keys remain unchanged.
      • After copying, run restic check --read-data on the new location extensively to ensure integrity.
      • This is faster if underlying data doesn't need re-encryption but carries more risk if not done carefully. restic copy is safer.
  • Restic's Repository Format Versions and Upgrades (restic migrate):

    • Over time, the Restic developers might introduce improvements or changes to the repository format.
    • When you initialize a repository, it uses the latest format version supported by that Restic binary.
    • If you update your Restic binary to a newer version that supports a newer repository format, your old repository will still work.
    • However, to take advantage of new format features (e.g., for performance or efficiency), you might need to migrate the repository.
    • The command restic migrate <migration_name> is used for this.
      • Example: upgrade_repo_v2 upgrades a repository from format version 1 to version 2, which adds support for compression: restic migrate upgrade_repo_v2. Running restic migrate without arguments lists the migrations that apply to your repository.
    • Migrations are typically one-way and may involve rewriting data, so they can be time-consuming.
    • Always read the Restic release notes carefully when upgrading Restic to see if any migrations are recommended or available.
    • Always back up your repository's config and keys/ directory before attempting a migration, just in case. Or even better, have a full backup of the repo if possible, or test migration on a cloned copy of the repo first.
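
For Restic 0.14 and later, the copy-based move described above is written with --from-repo, and the destination can be initialized with the source's chunker parameters so deduplication carries over. The function below is a sketch under those version assumptions; copy_repo and its argument order are hypothetical, and you should confirm the flags with restic init --help and restic copy --help on your installation.

```shell
#!/bin/sh
# Hypothetical helper for Restic >= 0.14: create a new repository that shares
# the old one's chunker parameters, copy all snapshots, then verify.
# Arguments: old repo, old password file, new repo, new password file.
copy_repo() {
  cr_old_repo="$1"
  cr_old_pw="$2"
  cr_new_repo="$3"
  cr_new_pw="$4"

  # Initialize the destination with the source's chunker parameters.
  restic -r "$cr_new_repo" --password-file "$cr_new_pw" init \
    --from-repo "$cr_old_repo" --from-password-file "$cr_old_pw" \
    --copy-chunker-params || return 1

  # Copy every snapshot; blobs already present in the destination are skipped.
  restic -r "$cr_new_repo" --password-file "$cr_new_pw" copy \
    --from-repo "$cr_old_repo" --from-password-file "$cr_old_pw" || return 1

  # Verify the structure of the new repository before retiring the old one.
  restic -r "$cr_new_repo" --password-file "$cr_new_pw" check
}
```

The function stops at the first failing step, so a failed init never leads to a copy into a half-created repository. Keep the old repository until at least one test restore from the new one has succeeded.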

Workshop Diagnosing and Fixing a "Problem"

This workshop simulates a couple of common, non-destructive "problems" to practice troubleshooting.

Goals:

  1. Simulate and resolve a stale lock issue.
  2. Simulate and understand a password mismatch.
  3. (Discussion) Understand output from restic check when issues are found.

Prerequisites:

  • An initialized Restic repository (e.g., ~/restic_workshop/my_local_repo with its mypass.txt).
  • Environment variables RESTIC_REPOSITORY and RESTIC_PASSWORD_FILE set to this local repo and passfile.

Steps:

  1. Simulate a Stale Lock:

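    • The "Safe" Stale Lock Simulation: Restic normally removes its lock when a command exits, so we need a lock with no running process behind it. Here we write a lock file by hand. First clear any locks left over from previously interrupted operations with restic unlock --remove-all on your test repo. A real lock file holds JSON with time, exclusive, hostname, username, and pid fields, stored encrypted like the other repository files. Because the hand-written file below is plain text, some Restic versions may reject it as unreadable rather than treat it as a lock. If that happens, start a long-running command such as restic check --read-data and stop it with kill -9 to leave a genuine stale lock behind.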
      # Ensure you are in ~/restic_workshop and env vars are set for my_local_repo
      # If my_local_repo/locks exists and has files, clear it for this test:
      # restic unlock --remove-all
      # (Ensure no restic process is actually running against this repo!)
      
      # Now, manually create a fake lock file.
      # Real lock files are named after the SHA-256 of their content (64 hex digits).
      # We'll just use a fixed, made-up 64-digit name for the workshop.
      mkdir -p "${RESTIC_REPOSITORY}/locks" # Ensure locks dir exists
      FAKE_LOCK_ID="abcdef1234567890abcdef1234567890abcdef1234567890abcdef1234567890"
      FAKE_LOCK_FILE="${RESTIC_REPOSITORY}/locks/${FAKE_LOCK_ID}"
      echo '{"time":"2023-01-01T10:00:00Z","exclusive":true,"hostname":"fakehost","username":"fakeuser","pid":1234}' > "${FAKE_LOCK_FILE}"
      echo "Created fake lock file: ${FAKE_LOCK_FILE}"
      
    • Attempt a Restic Operation: Run a command that has to lock the repository, such as check or backup:
      restic check
      # Or try: restic backup ./source_data --tag lock_test (if source_data exists)
      
      You should see an error stating that the repository is already locked exclusively by PID 1234 on fakehost by user fakeuser, together with the lock's creation time and a hint to use the unlock command. (The exact wording varies with the Restic version and the operation.)
    • Diagnose and Resolve:
      1. restic list locks You should see the ID of your fake lock listed.
      2. Since we know this is a "stale" lock (we faked it, and PID 1234 on fakehost isn't real), we can remove it. restic unlock removes the locks that Restic considers stale; restic unlock --remove-all removes every lock, which is fine for a test repo. The unlock command does not accept a lock ID.
      3. Try the Restic operation (restic check) again. It should now work.
      4. Clean up: run rm -f "${FAKE_LOCK_FILE}" if the file is still present after unlock.
  2. Simulate Password Issue:

    • Change RESTIC_PASSWORD_FILE content or variable: Edit your ~/restic_workshop/mypass.txt and change the password to something incorrect (e.g., add "WRONG" to the end). Or, unset RESTIC_PASSWORD_FILE and try to type interactively.
      # Example: Temporarily use a wrong password file
      echo "wrongpassword123" > ~/restic_workshop/wrong_pass.txt
      export RESTIC_PASSWORD_FILE_ORIGINAL=$RESTIC_PASSWORD_FILE # Save original
      export RESTIC_PASSWORD_FILE=~/restic_workshop/wrong_pass.txt
      
    • Attempt to Access Repository:
      restic snapshots
      
      Restic will report a fatal error ending in: wrong password or no key found (the exact prefix depends on the version). This differs from the error for a repository that cannot be found, so a wrong password and a wrong path can be told apart.
    • Resolve: Restore the correct password.
      # Restore original password file setting
      export RESTIC_PASSWORD_FILE=$RESTIC_PASSWORD_FILE_ORIGINAL
      rm ~/restic_workshop/wrong_pass.txt # Clean up
      # Or, if you edited mypass.txt directly, change it back to the correct password.
      
      Try restic snapshots again. It should now work. This highlights the importance of accurately managing your password or password file.
  3. Discussion restic check Output on Errors:

    • It's difficult and risky to intentionally corrupt a real repository for a workshop.
    • Instead, let's discuss what restic check (especially with --read-data) might show if it finds problems:

      • pack <ID>: not referenced in any index: A pack file exists in data/ but no index file lists its contents. restic rebuild-index might fix this if the pack is valid. If rebuild-index doesn't help, the pack might be orphaned or an old remnant.
      • pack <ID>: referenced in index <indexID>, but not found: An index file references a pack file that is missing from the data/ directory. This means data loss.
      • blob <ID> in pack <ID> at offset <X> length <Y>: MAC verification failed: When reading a blob from a pack file and decrypting it, the Message Authentication Code (MAC) doesn't match. This means the (encrypted) data is corrupted. This is a serious error indicating data damage.
      • tree <ID>: not found or not a tree: A snapshot or another tree object references a tree ID, but Restic cannot find or validate this tree blob. This can lead to parts of a snapshot being unreadable.
      • unused blob <ID>: After a prune, check might list some blobs as unused. This is often normal for a short period as prune works, but persistent large numbers of unused blobs after successful prunes might indicate an issue. prune should clean these up.
    • What to do?

      • As discussed in the theory, there's no magic "repair" button.
      • Focus on restic rebuild-index for index issues.
      • For missing packs or MAC failures, it means data is lost. The goal is to salvage what you can.
      • Consult Restic forums or GitHub issues with detailed logs from restic check --read-data if you encounter real corruption.
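
When the two simulated problems from this workshop occur inside an unattended script, the exit code is easier to act on than the message text. The sketch below assumes Restic 0.17 or newer, where the documented exit codes include 10 (repository does not exist), 11 (failed to lock repository), and 12 (wrong password); older releases return 1 for all of these. explain_restic_exit is a hypothetical helper, and the hint texts are illustrative.

```shell
#!/bin/sh
# Hypothetical helper: translate a restic exit code into a troubleshooting hint.
# Codes 10-12 are an assumption that holds only for Restic 0.17 and newer.
explain_restic_exit() {
  case "$1" in
    0)  echo "success" ;;
    1)  echo "generic error: read the message restic printed" ;;
    3)  echo "backup finished but some source files could not be read" ;;
    10) echo "repository does not exist: check -r or RESTIC_REPOSITORY" ;;
    11) echo "could not lock repository: look for stale locks" ;;
    12) echo "wrong password: check RESTIC_PASSWORD_FILE" ;;
    *)  echo "unexpected exit code $1" ;;
  esac
}

# Example:
#   restic snapshots; explain_restic_exit "$?"
```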

This workshop demonstrated how to identify and resolve some common operational issues. Regular checks and careful handling of repository access are key to maintaining a healthy backup system.

7. Integrating Restic with Other Tools

Restic's command-line nature and robust design make it highly suitable for integration with other tools and workflows. This section explores common integrations, such as backing up Docker volumes, using Rclone as a versatile backend, monitoring Restic backups, and setting up Restic's own rest-server.

Restic and Docker

Docker is widely used for containerizing applications. Persistent data for these applications is often stored in Docker volumes. Backing up these volumes is crucial.

  • Backing up Docker Volumes: Docker volumes are managed by Docker and reside in a specific path on the Docker host (e.g., /var/lib/docker/volumes/ on Linux). There are a few strategies:

    1. Backup from the Host:
      • Identify the mount point of the named volume on the host. You can find this using docker volume inspect <volume_name>. It will show a Mountpoint path.
      • Important: For data consistency, especially for databases or applications actively writing to the volume, you should stop the container(s) using the volume before backing it up.
        docker stop my_app_container
        # Let's say 'docker volume inspect my_volume' shows Mountpoint: /var/lib/docker/volumes/my_volume/_data
        restic backup /var/lib/docker/volumes/my_volume/_data --tag docker_my_volume
        docker start my_app_container
        
      • This can be incorporated into pre/post-backup hooks in your script.
    2. Run Restic Inside a "Sidecar" or "Job" Container:
      • Create a Docker service or run a temporary container that mounts:
        • The Docker volume you want to back up (e.g., mount my_app_data_volume to /data_to_backup inside the Restic container).
        • A directory for Restic's configuration (repository URL, password file path) or pass them as environment variables.
        • Optionally, a directory for Restic's cache (/root/.cache/restic or as configured).
      • The Restic container then runs the restic backup /data_to_backup command.
      • Example docker run command (conceptual):
        # Assume RESTIC_REPOSITORY and RESTIC_PASSWORD_FILE are set in the environment
        # or passed directly to the Restic container using -e.
        # Mounts, in order: the volume (read-only, since we only back it up),
        # a host dir holding the password file, and a host dir for Restic's cache.
        # restic/restic is the official Restic Docker image.
        docker run --rm \
            -v my_app_data_volume:/data_to_backup:ro \
            -v /path/to/host/restic_config:/etc/restic \
            -v /path/to/host/restic_cache:/root/.cache/restic \
            -e RESTIC_REPOSITORY="your_repo_url" \
            -e RESTIC_PASSWORD_FILE="/etc/restic/password" \
            restic/restic \
            backup /data_to_backup --tag docker_sidecar_backup
        
        • The restic/restic image on Docker Hub is convenient.
        • This approach keeps Restic and its dependencies containerized.
  • Strategies for Backing Up Containerized Applications:

    • Databases: Always use the database's native dump tool (e.g., mysqldump, pg_dump) executed inside the database container or via docker exec. Pipe the output to a file that Restic can then back up (either from the host or via a sidecar Restic container that also mounts the dump location).
      # Example for PostgreSQL:
      docker exec my_postgres_container pg_dumpall -U postgres > /tmp/db_dump.sql
      # Then back up /tmp/db_dump.sql using Restic
      rm /tmp/db_dump.sql
      
    • Application Data vs. Configuration:
      • Separate application data (often in volumes) from application configuration (Docker Compose files, environment files, custom entrypoint scripts).
      • Back up both. Config files are small and critical for recreating the service.
    • Backup Frequency: Might vary. Database transaction logs might need frequent backups, while application code (if managed by version control) might not need Restic backup as often as the data volumes.
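
As a variation on the database dump approach above, the dump can be streamed straight into Restic with backup --stdin, so no temporary file is written to the host. This is a sketch under assumptions: backup_pg_dump is a hypothetical helper, the postgres superuser name and the default tag docker_db are placeholders, and RESTIC_REPOSITORY and RESTIC_PASSWORD_FILE are expected to be set already.

```shell
#!/bin/sh
# Hypothetical helper: stream a PostgreSQL dump from a container into Restic.
# Arguments: container name, optional snapshot tag (default: docker_db).
backup_pg_dump() {
  bp_container="$1"
  bp_tag="${2:-docker_db}"
  # The dump is stored in the snapshot under the name <container>.sql.
  # Caveat: the pipeline's exit status is restic's, so a failing dump is not
  # detected here; in bash, "set -o pipefail" closes that gap.
  docker exec "$bp_container" pg_dumpall -U postgres \
    | restic backup --stdin --stdin-filename "${bp_container}.sql" --tag "$bp_tag"
}

# Example:
#   backup_pg_dump my_postgres_container
```

The trade-off is that nothing remains on disk to inspect if the dump looks wrong, so test a restore of the streamed file before relying on it.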

Restic with Rclone

Rclone is a versatile command-line program for managing files on over 70 cloud storage services. Restic can use Rclone as a "backend," effectively allowing Restic to store its repositories on any service Rclone supports.

  • Using Rclone as a Versatile Backend:

    • Why?
      • Access to cloud providers not natively supported by Restic (e.g., Dropbox, Google Drive, OneDrive, Jottacloud, etc.).
      • Utilize Rclone's advanced features like its own encryption layering (though Restic already encrypts, this could be for obfuscation or policy compliance), caching, or union/crypt remotes.
      • Consolidate cloud access: If you already use Rclone for other purposes, Restic can leverage its existing remotes.
    • How it Works: Restic executes the rclone binary in the background to perform operations on the remote. Restic tells Rclone what to do (list files, read data, write data, delete data) using Rclone's specific commands for serving Restic (rclone serve restic).
  • Setting up an Rclone Remote:

    1. Install Rclone: Download from rclone.org and set it up.
    2. Configure a Remote: Use rclone config to create a new remote.
      rclone config
      # Follow the interactive prompts:
      # n) New remote
      # name> myCloudStorage  (e.g., for Google Drive, Dropbox, etc.)
      # Storage> (select the number corresponding to your cloud provider)
      # ... follow provider-specific authentication steps ...
      
      This creates an entry in Rclone's configuration file (usually ~/.config/rclone/rclone.conf).
  • Configuring Restic to Use an Rclone Remote: The Restic repository URL format is rclone:<rcloneRemoteName>:<path_in_remote>.

    • <rcloneRemoteName>: The name you gave the remote in rclone config (e.g., myCloudStorage).
    • <path_in_remote>: The path/folder within that Rclone remote where the Restic repository will live (e.g., restic_backups/my_server).

    Example: Initialize a Restic repo on an Rclone remote named gdrive_backup in a folder my_app_restic_repo:

    # Ensure RESTIC_PASSWORD_FILE is set
    # Ensure rclone is in your PATH and configured
    restic -r rclone:gdrive_backup:my_app_restic_repo init
    
    All subsequent Restic commands will use this rclone: prefixed repository URL. Restic will invoke rclone as needed.

    • Performance: Can be slower than native Restic backends due to the overhead of Restic calling Rclone, which then calls the cloud API. However, for many use cases, it's perfectly acceptable.
    • Rclone Version: Ensure your Rclone version is compatible with Restic's expectations. Generally, use recent versions of both.
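The URL format above can be wrapped in a small helper so a script fails early when the remote has not been configured. This is a minimal sketch, not an official workflow: the function names rclone_repo_url and init_rclone_repo are hypothetical, and it assumes rclone and restic are on your PATH and RESTIC_PASSWORD_FILE is set.

```shell
# Hypothetical helpers; the names are illustrative and not part of Restic or Rclone.

# Build a Restic repository URL for an Rclone remote: rclone:<remote>:<path>
rclone_repo_url() {
    local remote="$1" path="$2"
    printf 'rclone:%s:%s\n' "$remote" "$path"
}

# Initialize a repository only if the remote is actually configured in rclone.
init_rclone_repo() {
    local remote="$1" path="$2"
    # 'rclone listremotes' prints one configured remote per line, with a trailing colon.
    if ! rclone listremotes | grep -qxF "${remote}:"; then
        echo "Rclone remote '${remote}' is not configured; run 'rclone config' first." >&2
        return 1
    fi
    restic -r "$(rclone_repo_url "$remote" "$path")" init
}
```

For the example above, the call would be init_rclone_repo gdrive_backup my_app_restic_repo.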

Monitoring Restic Backups

For automated backups, monitoring is essential to ensure they are running successfully and to be alerted of failures.

  • Using Prometheus with a Restic Exporter:

    • Prometheus is a popular open-source monitoring and alerting toolkit.
    • Several third-party Restic exporters are available (e.g., search GitHub for "restic prometheus exporter").
    • These exporters typically run restic stats --json, restic snapshots --json, and restic check periodically, parsing the output and exposing metrics in a format Prometheus can scrape (e.g., total backup size, last backup time, number of snapshots, success/failure status of checks).
    • You can then build dashboards (e.g., in Grafana) and set up alerts in Prometheus's Alertmanager.
    • This provides a comprehensive overview of your Restic repositories' health and status.
  • Parsing Restic's JSON Output for Status:

    • As shown in the scripting section, Restic commands with --json provide machine-readable output.
    • Your backup script can parse this JSON (e.g., using jq) to extract key metrics:
      • Snapshot ID, time, size added, total files processed from restic backup --json.
      • List of snapshots, their sizes, tags from restic snapshots --json.
      • Repository size from restic stats --json --mode raw-data; comparing it with the result of --mode restore-size gives an approximate deduplication ratio.
    • This data can be logged, sent to a central monitoring system, or used to make decisions within the script.
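As a rough illustration of using these numbers, the sketch below reads the total_size field from restic stats --json in two modes and divides one by the other. The function names are hypothetical, jq is assumed to be installed, and RESTIC_REPOSITORY and RESTIC_PASSWORD_FILE are assumed to be set. On repositories that use compression, the ratio reflects compression as well as deduplication.

```shell
# Hypothetical helpers for turning 'restic stats --json' output into a ratio.

# Read the total_size field from 'restic stats --json' for a given mode
# (for example "restore-size" or "raw-data"). Requires jq.
stats_total_size() {
    local mode="$1"
    restic stats --json --mode "$mode" | jq -r '.total_size'
}

# Ratio of logical (restore) size to stored (raw) size, two decimal places.
# Fails if the stored size is zero or negative.
size_ratio() {
    local restore_size="$1" raw_size="$2"
    awk -v a="$restore_size" -v b="$raw_size" \
        'BEGIN { if (b <= 0) exit 1; printf "%.2f\n", a / b }'
}
```

A script could then log size_ratio "$(stats_total_size restore-size)" "$(stats_total_size raw-data)" after each backup run.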
  • Integrating with Healthchecks.io or Similar "Dead Man's Switch" Services:

    • Healthchecks.io (and similar services like Uptime Kuma's push monitors, Cronitor) provide "cron monitoring" or "heartbeat monitoring."
    • How it works:
      1. You create a "check" in Healthchecks.io, and it gives you a unique URL.
      2. Your backup script, at the very end of a successful run, makes an HTTP GET request to this URL (e.g., using curl or wget).
      3. If Healthchecks.io doesn't receive this "ping" within an expected timeframe (e.g., 25 hours for a daily job), it assumes the job failed (or didn't run at all) and sends you an alert.
    • Example (in your Bash backup script, after successful completion):
      # At the end of a successful script run
      HEALTHCHECK_URL="https://hc-ping.com/YOUR_UNIQUE_UUID_HERE"
      curl -fsS --retry 3 "$HEALTHCHECK_URL" > /dev/null || log_msg "WARNING: Healthcheck ping failed."
      
    • This is a simple yet effective way to get notified if your backups stop running. It doesn't tell you why they failed (your script's logging should do that), but it tells you that they failed to report in.
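The single success ping can be extended so the service also learns when a job started and when it failed outright. Healthchecks.io documents /start and /fail suffixes on the ping URL for this purpose. The wrapper below is a sketch under that assumption; the function names are hypothetical, and HEALTHCHECK_URL is expected to hold your check's ping URL.

```shell
# Hypothetical wrapper; HEALTHCHECK_URL is the ping URL of your check.

# Build the ping URL for an event: "start", "fail", or "success" (the bare URL).
hc_url() {
    local base="${1%/}" event="$2"
    case "$event" in
        success) printf '%s\n' "$base" ;;
        start|fail) printf '%s/%s\n' "$base" "$event" ;;
        *) return 1 ;;
    esac
}

# Send a ping; a monitoring hiccup must never abort the backup itself.
hc_ping() {
    curl -fsS --max-time 10 --retry 3 "$(hc_url "$HEALTHCHECK_URL" "$1")" > /dev/null || true
}

# Run a command, signalling start first, then success or failure.
# Returns the exit status of the wrapped command.
run_monitored() {
    hc_ping start
    if "$@"; then
        hc_ping success
    else
        local rc=$?
        hc_ping fail
        return "$rc"
    fi
}
```

A cron job could then call run_monitored /path/to/your_backup_script.sh (a placeholder path) instead of invoking the script directly.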

Rest-server for a Dedicated Restic Repository Server

rest-server is Restic's own lightweight HTTP server designed specifically to serve Restic repositories. It offers some advantages over general-purpose servers like SFTP or a raw S3 bucket for Restic.

  • Setting up rest-server:

    1. Download: Get the rest-server binary from the rest-server GitHub releases page (github.com/restic/rest-server; it's a separate project from Restic itself but maintained by Restic developers).
    2. Choose a Data Directory: Decide where rest-server will store the repository data (e.g., /srv/restic_server_data). rest-server can serve multiple repositories from subdirectories under its main data path.
    3. Authentication: rest-server uses htpasswd-style authentication.
      • Create an htpasswd file (e.g., using htpasswd utility from Apache tools, or a Go-based htpasswd generator):
        # Install apache2-utils if htpasswd is not available
        # sudo apt install apache2-utils (Debian/Ubuntu)
        htpasswd -B -c /etc/restic/rest_server_htpasswd your_backup_user
        # Enter password for your_backup_user
        
    4. Run rest-server:
      # Example:
      # --path:          directory in which repositories are stored
      # --listen:        address and port to listen on (here: port 8000 on all interfaces)
      # --private-repos: users can only access repositories under their own username
      /path/to/rest-server \
          --path /srv/restic_server_data \
          --listen :8000 \
          --private-repos \
          --htpasswd-file /etc/restic/rest_server_htpasswd \
          --log /var/log/rest_server.log
      # Add --tls to enable HTTPS if you provide --tls-cert and --tls-key
      # Or run behind a reverse proxy like Nginx or Caddy for HTTPS and other features.
      
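      • --private-repos: If enabled, when your_backup_user authenticates, they can only access /srv/restic_server_data/your_backup_user/ and repositories beneath it. The path in the Restic URL must start with the username (e.g., /your_backup_user/ or /your_backup_user/my_repo); any other path is rejected for that user.
      • If not using --private-repos, any authenticated user can access any repository, and the Restic URL path is simply relative to --path.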
  • Security Considerations:

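    • HTTPS is CRITICAL: Always run rest-server with TLS (HTTPS) enabled, either natively (--tls --tls-cert ... --tls-key ...) or by placing it behind a reverse proxy (Nginx, Caddy, Traefik) that handles TLS termination. Over plain HTTP, the rest-server username and password travel unprotected. The repository contents remain encrypted by Restic, but anyone who captures the login can read, add, or (outside append-only mode) delete repository files.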
    • Strong Passwords: Use strong, unique passwords in your htpasswd file.
    • Firewall: Restrict access to the rest-server port on your firewall to only trusted client IPs if possible.
    • Regular Updates: Keep rest-server and your OS updated.
  • Advantages over Plain SFTP or S3:

    • Append-Only Mode (--append-only): This is a key feature. If rest-server is run with --append-only, Restic clients can create new snapshots and add data, but they cannot delete existing snapshots or data via the Restic forget, prune, or tag --remove commands.
      • This protects against a compromised client (e.g., ransomware on a client machine) trying to delete backups. The attacker could create new (garbage) snapshots but couldn't wipe out the history.
      • Pruning in append-only mode must be done directly on the rest-server machine, not from the client. You'd temporarily stop the append-only server, run restic prune locally on the server's data directory, then restart rest-server in append-only mode.
    • Prometheus Metrics (--prometheus): rest-server can expose Prometheus metrics about its operation.
    • Designed for Restic: It understands Restic's access patterns and can be more efficient for certain operations than a generic file server.
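The server-side pruning described above might look like the following sketch. It is an illustration rather than a recommended policy: the function name, the RESTIC_BIN override, and the retention values are all assumptions, and RESTIC_PASSWORD_FILE must point at the encryption password of the repository being pruned. Running it as the user that owns the rest-server data directory keeps file ownership consistent.

```shell
# Hypothetical maintenance helper, meant to run on the rest-server host itself.
# RESTIC_BIN lets you point at a specific restic binary; it defaults to "restic".
prune_on_server() {
    local repo_dir="$1"
    # Open the repository as a local directory, which bypasses the
    # append-only HTTP API. The --keep-* values are example retention settings.
    "${RESTIC_BIN:-restic}" -r "$repo_dir" forget \
        --keep-daily 7 --keep-weekly 4 --keep-monthly 6 --prune
}
```

With --private-repos and the paths used earlier, a call could be prune_on_server /srv/restic_server_data/your_backup_user/my_repo, where my_repo is a placeholder repository name.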
  • Using a rest-server Repository with Restic Client: The repository URL format is rest:http://user:password@host:port/repository_name or rest:https://....

    • If --private-repos is used on the server, the path must begin with the username: rest:https://your_backup_user:the_password@your_rest_server_host:8000/your_backup_user/ for a repository directly in the user's directory, or .../your_backup_user/my_main_repo for a named repository beneath it.
    • If --private-repos is not used and the repo is /srv/restic_server_data/my_main_repo: rest:https://your_backup_user:the_password@your_rest_server_host:8000/my_main_repo
    • Two different passwords are involved. The user/password pair authenticates you to rest-server over HTTP; the Restic repository encryption password is still handled by RESTIC_PASSWORD or RESTIC_PASSWORD_FILE. To keep the HTTP credentials out of the URL (and therefore out of shell history and process listings), omit them from the URL and set the RESTIC_REST_USERNAME and RESTIC_REST_PASSWORD environment variables instead.
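A small sketch of the credential-free form: the helper below builds a REST repository URL whose path starts with the username, as --private-repos requires, while the HTTP login is supplied through Restic's RESTIC_REST_USERNAME and RESTIC_REST_PASSWORD environment variables. The function name, host, and file path are placeholders, not values from this guide's setup.

```shell
# Hypothetical helper; host, port, and names are placeholders.

# Build a REST repository URL without embedded credentials.
# With --private-repos, the path must start with the authenticated user's name.
rest_repo_url() {
    local scheme="$1" host_port="$2" user="$3" repo="$4"
    printf 'rest:%s://%s/%s/%s\n' "$scheme" "$host_port" "$user" "$repo"
}

# Example usage (commented out so that sourcing this file has no side effects):
# export RESTIC_REST_USERNAME="your_backup_user"
# export RESTIC_REST_PASSWORD="$(cat /path/to/rest_server_password)"   # placeholder path
# export RESTIC_REPOSITORY="$(rest_repo_url https your_rest_server_host:8000 your_backup_user my_main_repo)"
# restic snapshots
```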

Workshop Backing Up Docker Volumes and Setting up Rest-Server

This workshop is in two parts. Part 1 focuses on backing up a Docker volume. Part 2 involves setting up a basic rest-server instance.

Part 1: Docker Volume Backup

Goals:

  1. Create a Docker container with a named volume storing some data.
  2. Stop the container and back up its volume data from the host using Restic.
  3. (Alternative discussion) Back up the volume using Restic running inside another Docker container.
  4. Restore volume data and verify.

Prerequisites:

  • Docker installed and running.
  • Restic installed on the Docker host.
  • An initialized Restic repository (e.g., local ~/restic_workshop/my_local_repo).
  • RESTIC_REPOSITORY and RESTIC_PASSWORD_FILE environment variables set for this repo.

Steps (Part 1):

  1. Create a Docker Container with a Named Volume:

    # Create a named volume
    docker volume create my_test_data_volume
    
    # Run a simple container that writes to this volume and then exits
    docker run --rm -v my_test_data_volume:/app_data --name data_writer alpine /bin/sh -c \
        "echo 'Hello from Docker Volume!' > /app_data/message.txt && \
         echo 'Some more data...' >> /app_data/another_file.log && \
         ls -l /app_data"
    
    # Verify content was written (optional, by running another container to read)
    # docker run --rm -v my_test_data_volume:/app_data alpine cat /app_data/message.txt
    

  2. Locate the Volume's Mountpoint on the Host:

    docker volume inspect my_test_data_volume
    
    Look for the "Mountpoint" line. It will be something like /var/lib/docker/volumes/my_test_data_volume/_data. Copy this path. Let's call it DOCKER_VOLUME_PATH.

  3. Back Up the Volume Data from the Host: Important: If the container that uses this volume were long-running (like a database), you would docker stop <container_name> first. Since our data_writer container already exited, the data is quiescent.

    # On Linux, /var/lib/docker/volumes/ is normally readable by root only, so restic runs via sudo here.
    # Read the mountpoint reported by 'docker volume inspect' (requires jq)
    DOCKER_VOLUME_PATH=$(docker volume inspect my_test_data_volume | jq -r '.[0].Mountpoint')
    echo "Backing up from: ${DOCKER_VOLUME_PATH}"
    # Ensure RESTIC_REPOSITORY and RESTIC_PASSWORD_FILE are set to your local test repo.
    # -E preserves those environment variables, which sudo would otherwise drop.
    sudo -E restic backup "${DOCKER_VOLUME_PATH}" --tag workshop_docker_vol
    
    Plain sudo restic won't see your user's RESTIC_REPOSITORY and RESTIC_PASSWORD_FILE variables, because sudo resets the environment by default. If sudo -E is not permitted on your system, pass the values explicitly instead:
    sudo restic --repo "${RESTIC_REPOSITORY}" --password-file "${RESTIC_PASSWORD_FILE}" \
        backup "${DOCKER_VOLUME_PATH}" --tag workshop_docker_vol
    
    Two caveats. Loosening permissions on the volume directory alone (chmod -R a+rX) is usually not enough for an unprivileged restic, because the parent directories under /var/lib/docker are typically restricted to root as well. And because this backup ran as root, files it added to a local repository may be owned by root; either run the following restic commands with sudo -E too, or hand the repository back to your user with sudo chown -R.
    

  4. Restore Volume Data to a New Location and Verify:

    mkdir -p ~/restic_workshop/docker_vol_restore
    # Find snapshot ID or use latest with path
    LATEST_VOL_SNAPSHOT_ID=$(restic snapshots --json --latest 1 --path "${DOCKER_VOLUME_PATH}" --tag workshop_docker_vol | jq -r '.[0].short_id')
    
    restic restore "${LATEST_VOL_SNAPSHOT_ID}" --target ~/restic_workshop/docker_vol_restore
    
    # Verify (restic recreates the full original path underneath the target directory)
    ls -l ~/restic_workshop/docker_vol_restore"${DOCKER_VOLUME_PATH}"/
    cat ~/restic_workshop/docker_vol_restore"${DOCKER_VOLUME_PATH}"/message.txt
    
    You should see Hello from Docker Volume! and another_file.log.

  5. Alternative Discussion: Restic in Docker for Backup. Instead of backing up from the host, you could run Restic in a container:

    # Mounts: the volume to back up (read-only), the password file, and optionally a cache directory.
    # Bind-mount sources must be absolute host paths; the /path/to/... values are placeholders.
    docker run --rm \
        -v my_test_data_volume:/source_data:ro \
        -v /path/to/restic_password_file:/tmp/restic_pass:ro \
        -v /path/to/restic_cache_dir:/root/.cache/restic \
        -e RESTIC_REPOSITORY="your_repo_url" \
        -e RESTIC_PASSWORD_FILE="/tmp/restic_pass" \
        restic/restic backup /source_data --tag docker_container_backup
    
    This encapsulates Restic itself. It's very useful for orchestrated environments like Kubernetes.
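For a long-running container, the stop, back up, restart sequence from step 3 could be scripted roughly as follows. This is a sketch under stated assumptions: the function name is hypothetical, it must run as a user that can read Docker's volume directory (often root), and RESTIC_REPOSITORY and RESTIC_PASSWORD_FILE must be set in that user's environment.

```shell
# Hypothetical helper: stop a container, back up its named volume from the
# host, then start the container again even if the backup failed.
backup_volume_cold() {
    local container="$1" volume="$2" mountpoint rc=0

    # Ask Docker where the volume lives on the host.
    mountpoint="$(docker volume inspect --format '{{ .Mountpoint }}' "$volume")" || return 1

    # Stop the container so the data is quiescent during the backup.
    docker stop "$container" > /dev/null || return 1

    # Remember the backup's exit status instead of returning immediately.
    restic backup "$mountpoint" --tag "docker_volume_${volume}" || rc=$?

    # Restart regardless of the backup result so the service does not stay down.
    docker start "$container" > /dev/null || return 1
    return "$rc"
}
```

Usage would be along the lines of backup_volume_cold my_app_container my_test_data_volume, where my_app_container is a placeholder name. The function returns restic's exit status, so a calling script can still detect a failed backup after the container is running again.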

Part 2: Setting up a Basic Rest-Server

Goals:

  1. Install rest-server.
  2. Configure basic htpasswd authentication.
  3. Run rest-server to serve a local directory.
  4. Initialize a new Restic repository using this rest-server and perform a test backup.

Prerequisites:

  • A Linux machine (can be your main machine or a VM).
  • htpasswd utility (from apache2-utils or similar).
  • Ability to download and run the rest-server binary.

Steps (Part 2):

  1. Download rest-server: Go to https://github.com/restic/rest-server/releases/latest. Download the archive for your OS/architecture (e.g., rest-server_<version>_linux_amd64.tar.gz).

    # Example download and setup (0.12.1 is only an example; replace it with the latest version)
    cd ~/restic_workshop
    wget https://github.com/restic/rest-server/releases/download/v0.12.1/rest-server_0.12.1_linux_amd64.tar.gz
    tar -xzf rest-server_0.12.1_linux_amd64.tar.gz
    # The archive unpacks into a directory that contains the rest-server binary
    cp ./rest-server_0.12.1_linux_amd64/rest-server ~/restic_workshop/rest-server
    chmod +x ~/restic_workshop/rest-server
    # Optional: install it system-wide instead
    # sudo mv ~/restic_workshop/rest-server /usr/local/bin/rest-server
    REST_SERVER_BINARY=~/restic_workshop/rest-server # Adjust if you put it elsewhere like /usr/local/bin
    # Verify:
    "${REST_SERVER_BINARY}" --version
    
    Make sure to get the latest version and adjust commands accordingly.

  2. Create Data Directory and Htpasswd File:

    mkdir -p ~/restic_workshop/rest_server_data  # Data path for rest-server
    mkdir -p ~/restic_workshop/rest_server_config # For htpasswd file
    
    # Install htpasswd if not present (Debian/Ubuntu: sudo apt install apache2-utils)
    # Create htpasswd file with a user 'workshop_user'
    htpasswd -B -c ~/restic_workshop/rest_server_config/htpasswd workshop_user
    # Enter a password for workshop_user when prompted (e.g., "restworkshoppass")
    

  3. Run rest-server (No TLS for this basic local workshop - NOT FOR PRODUCTION): Warning: Running without TLS is insecure for real use. This is only for a quick local test. Open a new terminal window for rest-server as it will run in the foreground.

    # In NEW Terminal 1 (for rest-server):
    cd ~/restic_workshop # The relative paths below are resolved from here
    # Shell variables do not carry over to a new terminal, so set the binary path again
    REST_SERVER_BINARY=~/restic_workshop/rest-server
    # --debug gives more verbose output during the workshop
    "${REST_SERVER_BINARY}" \
        --path ./rest_server_data \
        --listen localhost:8000 \
        --htpasswd-file ./rest_server_config/htpasswd \
        --private-repos \
        --debug
    # It should say something like "Starting server..."
    # Keep this terminal open.
    

  4. Initialize and Use Restic with rest-server: Go back to your original terminal window (Terminal 2).

    • Repository URL: rest:http://workshop_user:PASSWORD@localhost:8000/workshop_user/workshop_repo1. Replace PASSWORD with the password you set for workshop_user in htpasswd. Because the server runs with --private-repos, the URL path must start with the username; rest-server rejects any other path for this user. The repository data therefore ends up in ./rest_server_data/workshop_user/workshop_repo1. Using rest:http://workshop_user:PASSWORD@localhost:8000/workshop_user/ instead would place the repository directly in the user's directory.
    • Define variables for Restic repository (new one for rest-server):
      # In Terminal 2:
      # REST_USER_PASSWORD should be the password you set for 'workshop_user' in htpasswd
      export REST_USER_PASSWORD="restworkshoppass" # The password for workshop_user
      # With --private-repos, the URL path must start with the username
      export REST_SERVER_REPO_URL="rest:http://workshop_user:${REST_USER_PASSWORD}@localhost:8000/workshop_user/workshop_repo1"
      
      # New Restic encryption password for this new repository
      echo "restic_encryption_key_for_rest_server" > ~/restic_workshop/rest_server_enc_pass.txt
      chmod 600 ~/restic_workshop/rest_server_enc_pass.txt # Readable by your user only
      export RESTIC_PASSWORD_FILE=~/restic_workshop/rest_server_enc_pass.txt
      
    • Initialize Restic Repository:

      # In Terminal 2:
      restic -r "${REST_SERVER_REPO_URL}" init
      
      Check Terminal 1 (rest-server output). You should see activity (GET, PUT requests). On the filesystem, check ~/restic_workshop/rest_server_data/workshop_user/workshop_repo1. It should now contain Restic's repo structure.

    • Perform a Test Backup:

      # In Terminal 2:
      # Let's back up the 'source_data' directory again for a quick test
      # (run this from ~/restic_workshop, where source_data was created in the earlier workshops)
      restic -r "${REST_SERVER_REPO_URL}" backup ./source_data --tag rest_server_test
      
      Again, observe logs in Terminal 1.

    • List Snapshots:
      # In Terminal 2:
      restic -r "${REST_SERVER_REPO_URL}" snapshots
      
      You should see your new snapshot.
  5. Cleanup:

    • In Terminal 1 (rest-server), press Ctrl+C to stop rest-server.
    • You can remove ~/restic_workshop/rest_server_data, ~/restic_workshop/rest_server_config, and ~/restic_workshop/rest_server_enc_pass.txt if you want to clean up this workshop's specific files.

This workshop demonstrated basic integration with Docker volumes and setting up a rest-server. For production rest-server use, always implement TLS and consider running it as a system service with proper logging and resource management. The append-only mode is a particularly valuable feature for enhancing backup security.

Conclusion

Throughout this comprehensive guide, we've journeyed from the fundamentals of Restic to its advanced applications in a self-hosted environment. You've learned how to install Restic, initialize repositories, perform backups and restores, manage repository health and size, and leverage remote backends for robust data protection. We've also delved into scripting, automation, security considerations, and integration with other essential tools like Docker and Rclone.

Recap of Restic's Strengths:

  • Security First: Client-side encryption with strong algorithms ensures your data remains private, regardless of where it's stored. You hold the keys.
  • Efficiency: Deduplication via content-defined chunking saves significant storage space and bandwidth, making frequent backups feasible.
  • Simplicity and Flexibility: A clear command-line interface combined with support for numerous local and remote backends provides versatility for diverse self-hosting setups.
  • Reliability: Features like restic check allow you to verify the integrity of your backups, giving you confidence in your ability to restore.
  • Open Source: Transparency and an active community contribute to its trustworthiness and continuous improvement.

Importance of Regular Testing of Backups:

A backup strategy is only as good as its last successful restore. It's not enough to assume your backups are working; you must regularly test them. This involves:

  • Periodically performing actual restores of randomly selected files or entire directories to a test location.
  • Verifying the integrity of the restored data.
  • Reviewing backup logs for errors or warnings.
  • Running restic check (and occasionally restic check --read-data) to ensure repository health.
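One way to make such a restore test repeatable is to script it. The sketch below restores the latest snapshot of a directory into a scratch location and compares it with the live data. The function names are hypothetical. It assumes RESTIC_REPOSITORY and RESTIC_PASSWORD_FILE are set, that src_dir is the absolute path that was backed up, and that the data has not changed since the snapshot was taken (otherwise differences are expected).

```shell
# Hypothetical spot check for restore testing.

# Compare two directory trees recursively; succeeds only if they match.
trees_match() {
    diff -r -q "$1" "$2" > /dev/null
}

# Restore the latest snapshot containing src_dir into a scratch directory
# and compare the result with src_dir.
verify_latest_restore() {
    local src_dir="$1" scratch rc=0
    scratch="$(mktemp -d)" || return 1
    restic restore latest --path "$src_dir" --target "$scratch" || rc=$?
    # restic recreates the absolute source path underneath the target directory.
    if [ "$rc" -eq 0 ]; then
        trees_match "$src_dir" "${scratch}${src_dir}" || rc=$?
    fi
    rm -rf "$scratch"
    return "$rc"
}
```

A non-zero return value means either the restore itself failed or the restored tree differs from the source, and both cases deserve a look at the logs.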

Make testing a routine part of your data management discipline. This practice will not only confirm that your backups are functional but also familiarize you with the restore process, which is invaluable during a real data loss emergency.

Encouragement for Responsible Data Management:

Data is valuable. Whether it's personal memories, critical project files, or configurations for your self-hosted services, losing it can be devastating. By learning and implementing tools like Restic, you are taking a significant step towards responsible data stewardship.

Embrace the principles of the 3-2-1 backup rule, automate your backups, monitor them diligently, and test your restores. The peace of mind that comes from knowing your data is safe and recoverable is well worth the effort. Continue to explore, experiment, and refine your backup strategy as your needs and self-hosted infrastructure evolve. Happy self-hosting, and may your data always be secure!