LVM snapshots for backup

I recently decided to rent a Hetzner server for my private projects and websites. This includes:

Other projects may follow (such as a gallery of images and videos).

By default, the Linux installation of this server came with a software RAID-1.

Some thoughts about the backup strategy of this server:

  • I don't want to shut down the mailserver to get a consistent backup.
  • I don't want to implement backup systems for all databases and other programs that write data to the disk.

Just making a tar (or using a tar-based backup) of the raw files on disk certainly isn't a good idea. A database will always ensure that the data on disk is consistent, or at least recoverable, at any given time. But a simple "tar" isn't an atomic operation. If you have a large file (or multiple files) to back up, the database may write one part (A) of the file after the backup process has already read it, and write another part (B) before the backup process gets to it. Each write by itself may leave the disk in a consistent state, but in your backup, the old part A and the new part B don't belong together. The backup is now inconsistent and corrupt, and with bad luck you may only find out months after you restore the backup. (see also Fuzzy Backup)

I wrote that this "may" happen, but on a busy system, this will happen.
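The interleaving described above can be reproduced in a toy model. This is only a sketch: the two-element list standing in for a file and the version strings "v1" and "v2" are made up for illustration, and a real database file is of course far more complex.

```python
# Toy model of a "fuzzy" backup. The "file" has two parts (A and B) that the
# application always updates together, so a consistent state has equal versions.

def consistent(parts):
    """A state is consistent when part A and part B carry the same version."""
    return parts[0] == parts[1]


def fuzzy_backup():
    data = ["v1", "v1"]      # on-disk state: [part A, part B]
    backup = []

    backup.append(data[0])   # the backup reads part A (still v1)

    # While the backup is running, the application writes a new state.
    # Before and after these two writes, the data on disk is consistent.
    data[0] = "v2"           # part A is written after the backup has read it
    data[1] = "v2"           # part B is written before the backup reads it

    backup.append(data[1])   # the backup reads part B (already v2)
    return data, backup


live, backup = fuzzy_backup()
print("live:  ", live, "consistent:", consistent(live))
print("backup:", backup, "consistent:", consistent(backup))
```

The live data ends up consistent, while the backup contains the old part A next to the new part B, a state that never existed on disk.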

Databases usually have a backup function that will dump a consistent database state to a file. They have the ability to control what part of the files get written to disk, so they can implement strategies that make sure that the backed up data does not change during the dump.

Snapshots

Some file systems also have the ability to make atomic snapshots of all files. LVM also has this ability (on a block device level).

Snapshots are usually implemented with a copy-on-write mechanism (btrfs, zfs, lvm). This means that, once a snapshot exists, a block is never simply overwritten: either its old contents are first copied to another location (lvm), or the new data is written to a new location (btrfs, zfs). Either way, the original unchanged data is still available, and only the space for the changed data is used.
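A minimal sketch of the lvm-style variant of copy-on-write, where the old contents of a block are saved the first time it is overwritten. The `Volume` class and its method names are invented for this illustration and have nothing to do with the real LVM implementation.

```python
class Volume:
    """Toy block device with a single lvm-style copy-on-write snapshot."""

    def __init__(self, blocks):
        self.blocks = list(blocks)
        self.snapshot = None   # maps block index -> contents at snapshot time

    def create_snapshot(self):
        # Creating the snapshot copies nothing; it only starts tracking changes.
        self.snapshot = {}

    def write(self, index, value):
        # Save the old contents the first time a block is overwritten.
        if self.snapshot is not None and index not in self.snapshot:
            self.snapshot[index] = self.blocks[index]
        self.blocks[index] = value

    def read_snapshot(self, index):
        # Changed blocks come from the saved copies, unchanged ones from the origin.
        return self.snapshot.get(index, self.blocks[index])


vol = Volume(["a", "b", "c", "d"])
vol.create_snapshot()
vol.write(1, "B")
vol.write(1, "BB")   # a second write to the same block needs no further copy

snapshot_view = [vol.read_snapshot(i) for i in range(4)]
print("origin:  ", vol.blocks)
print("snapshot:", snapshot_view)
print("blocks stored for the snapshot:", len(vol.snapshot))
```

The snapshot still shows the original four blocks, although only one block had to be stored for it.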

After reading a little about the different file systems, the main drawbacks of each system were:

  • zfs uses a lot of memory and has some other drawbacks.
  • btrfs had a major corruption bug in 2016 (same post) and is not considered to be stable by many bloggers. I haven't found the exact description of the bug, but I also haven't found anyone who says "use it now".
  • With lvm, the snapshot data is stored in an additional logical volume. Since lvm works at block level, it doesn't know anything about free space in the file system. If you want to create snapshots, you have to leave free space in the volume group. The amount of data changed after the snapshot was taken must not exceed this free space, otherwise the snapshot becomes invalid. Also, write performance decreases while a snapshot exists, because all overwritten data has to be copied to the snapshot logical volume first. This essentially means that the snapshot cannot be permanent.
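To get a feeling for how much free space the lvm approach needs, here is a rough back-of-the-envelope sketch. The write rate, the backup duration and the snapshot sizes are made-up example numbers, and the estimate is a worst case: it assumes every write hits a block that has not been copied yet.

```python
def worst_case_cow_mib(write_rate_mib_s, backup_seconds):
    """Upper bound for the data copied to the snapshot volume during a backup."""
    return write_rate_mib_s * backup_seconds


def snapshot_fits(snapshot_size_mib, write_rate_mib_s, backup_seconds):
    """True if the snapshot volume cannot overflow, even in the worst case."""
    return worst_case_cow_mib(write_rate_mib_s, backup_seconds) <= snapshot_size_mib


# Example: 2 MiB/s of writes during a backup that takes one hour.
needed = worst_case_cow_mib(2, 3600)
print("worst case:", needed, "MiB")
print("5 GiB snapshot is enough: ", snapshot_fits(5 * 1024, 2, 3600))
print("10 GiB snapshot is enough:", snapshot_fits(10 * 1024, 2, 3600))
```

In practice the real number is usually lower, because repeated writes to the same block only need one copy.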

Despite the drawbacks of lvm, I decided to use it, because the drawbacks of zfs and btrfs seemed more serious. What I am doing now is:

  • Create a snapshot of the main logical-volume
  • Mount the snapshot to /media/lvmain-snapshot
  • Run backup2l chrooted to /media/lvmain-snapshot.
    • Backup2l is configured to mount the remote backup-folder via CIFS before running the backup
  • Unmount the snapshot and any device mounted into it. This is done in a trap on EXIT to ensure that it is also performed on irregular exits.
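The steps above can be sketched roughly as follows. This is not the actual script from the repository: it uses Python's try/finally in place of the shell's trap EXIT, the volume group name `vg0`, the origin name `lvmain` and the snapshot size `5G` are assumptions, and removing the snapshot at the end is my addition based on the remark that a snapshot cannot be permanent. The CIFS mount is left to the backup2l configuration and is not shown.

```python
import subprocess

VG = "vg0"                          # assumption: name of the volume group
ORIGIN = "lvmain"                   # assumption: derived from the mount point name
SNAPSHOT = "lvmain-snapshot"
MOUNTPOINT = "/media/lvmain-snapshot"
SNAPSHOT_SIZE = "5G"                # assumption: must fit into the free space of the VG

SETUP = [
    ["lvcreate", "--snapshot", "--size", SNAPSHOT_SIZE,
     "--name", SNAPSHOT, f"/dev/{VG}/{ORIGIN}"],
    ["mount", f"/dev/{VG}/{SNAPSHOT}", MOUNTPOINT],
]
BACKUP = ["chroot", MOUNTPOINT, "backup2l", "-b"]
CLEANUP = [
    ["umount", "--recursive", MOUNTPOINT],   # also unmounts anything mounted inside
    ["lvremove", "--force", f"/dev/{VG}/{SNAPSHOT}"],
]


def run_backup(run):
    """Run all steps through `run`; the cleanup runs even if a step fails."""
    try:
        for cmd in SETUP:
            run(cmd)
        run(BACKUP)
    finally:
        for cmd in CLEANUP:
            try:
                run(cmd)
            except subprocess.CalledProcessError:
                pass   # keep cleaning up even if one cleanup step fails


# Dry run: collect the commands instead of executing them.
planned = []
run_backup(planned.append)
for cmd in planned:
    print(" ".join(cmd))
```

For a real run (as root), `run_backup(subprocess.check_call)` would execute the commands instead of collecting them.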

A more detailed description of the setup can be found here: https://gitlab.com/nknapp/hetzner-backup

Feedback

Please don't hesitate to provide feedback and corrections to this post, or to create an issue for improvements or corrections of the setup.