
Long Term Proxmox VM Backups

A long time ago I made a blog post on using ZPAQ archives for storing long term copies of virtual machines from Proxmox VZDump backups. It seemed incredibly space efficient, at the cost of zpaq archives being quite computationally heavy.

To strike a balance, I decided that my day to day backups would continue to be managed by Proxmox itself using its zstandard archive format, for ease of use and restoration, fairly efficient backup size and low CPU overhead.

For longer term storage however, ZPAQ based archives fit the bill better. I don’t mind the extra computational overhead for a weekly or monthly backup, and the space saving aspect becomes the most important consideration, especially if you’re wanting to store backups in cloud storage, which can easily creep up in costs.

I have been using ZPAQ for a considerable length of time and have some really good data to show that it makes a real difference when keeping a long term backup archive: the archive is typically only about 2x the size of the original disk image, yet contains 30+ backups that can easily be restored.

An Example Case

For example, for a Windows 10 virtual machine backup, my ZPAQ file (vzdump-qemu-201.zpaq) is just under 75GB and it contains 34 full vzdump files.

jcx@examples:~$ zpaq l vzdump-qemu-201.zpaq
zpaq v7.15 journaling archiver, compiled Jan  5 2021
vzdump-qemu-201.zpaq: 34 versions, 34 files, 2171876 fragments, 74726.244568 MB

- 2021-10-11 00:09:13  23472866304  0754 vzdump-qemu-201-2021_10_10.vma
- 2021-10-16 01:08:59  25603441664  0774 vzdump-qemu-201-2021_10_16.vma
- 2021-10-29 03:13:20  25529381888  0774 vzdump-qemu-201-2021_10_29.vma
- 2021-11-01 03:07:20  25505584128  0774 vzdump-qemu-201-2021_11_01.vma
- 2021-11-08 03:12:55  25791906816  0774 vzdump-qemu-201-2021_11_08.vma
- 2021-11-15 23:47:09  24654038016  0774 vzdump-qemu-201-2021_11_15.vma
- 2021-11-22 03:07:54  22784812032  0774 vzdump-qemu-201-2021_11_22.vma
- 2021-11-29 03:08:16  22529745920  0774 vzdump-qemu-201-2021_11_29.vma
- 2021-12-06 03:07:51  22917411840  0774 vzdump-qemu-201-2021_12_06.vma
- 2021-12-13 03:07:55  23717610496  0774 vzdump-qemu-201-2021_12_13.vma
- 2021-12-20 03:08:03  25543779328  0774 vzdump-qemu-201-2021_12_20.vma
- 2021-12-27 03:08:58  23354811392  0774 vzdump-qemu-201-2021_12_27.vma
- 2022-01-03 03:09:10  23728616448  0774 vzdump-qemu-201-2022_01_03.vma
- 2022-01-10 03:09:17  24141607936  0774 vzdump-qemu-201-2022_01_10.vma
- 2022-01-17 03:08:48  25157493760  0774 vzdump-qemu-201-2022_01_17.vma
- 2022-01-24 03:09:36  25984607232  0774 vzdump-qemu-201-2022_01_24.vma
- 2022-01-31 03:09:12  26629510144  0774 vzdump-qemu-201-2022_01_31.vma
- 2022-02-07 03:09:44  26457023488  0774 vzdump-qemu-201-2022_02_07.vma
- 2022-02-14 03:10:31  24978011136  0774 vzdump-qemu-201-2022_02_14.vma
- 2022-02-21 03:10:29  25055302656  0774 vzdump-qemu-201-2022_02_21.vma
- 2022-02-28 03:09:16  24903341056  0774 vzdump-qemu-201-2022_02_28.vma
- 2022-03-07 03:14:13  24846509056  0774 vzdump-qemu-201-2022_03_07.vma
- 2022-03-14 03:10:33  24345306112  0774 vzdump-qemu-201-2022_03_14.vma
- 2022-03-21 03:11:09  26017444864  0774 vzdump-qemu-201-2022_03_21.vma
- 2022-03-28 02:07:34  23117341696  0774 vzdump-qemu-201-2022_03_28.vma
- 2022-04-04 02:06:08  23270106112  0774 vzdump-qemu-201-2022_04_04.vma
- 2022-04-11 02:06:26  23315727360  0774 vzdump-qemu-201-2022_04_11.vma
- 2022-04-18 02:06:43  23267525632  0774 vzdump-qemu-201-2022_04_18.vma
- 2022-04-25 02:09:59  22605267968  0774 vzdump-qemu-201-2022_04_25.vma
- 2022-05-09 02:08:33  23006458880  0774 vzdump-qemu-201-2022_05_09.vma
- 2022-05-16 02:09:16  25538429952  0774 vzdump-qemu-201-2022_05_16.vma
- 2022-05-23 02:11:15  29059912704  0774 vzdump-qemu-201-2022_05_23.vma
- 2022-05-30 02:09:25  25759990784  0774 vzdump-qemu-201-2022_05_30.vma
- 2022-06-10 23:00:53  26047902720  0774 vzdump-qemu-201-2022_06_10.vma

838638.827520 MB of 838638.827520 MB (34 files) shown
  -> 201290.310320 MB (11339348 refs to 2171876 of 2171876 frags) after dedupe
  -> 74726.244568 MB compressed.
7.373 seconds (all OK)

I’ve shortened the filenames for the sake of brevity, but as you can see, each vma file is roughly 25GB, and in total there is 839GB worth of backups. After deduplication (only storing a single copy of data that is identical across files) this comes to around 201GB. Add in ZPAQ’s compression and it gives you a total archive size of only 75GB.

Compression alone shrinks the deduplicated data by about 63%, and together with the deduplication the archive is about 9% of the original data’s size.

Impressive!
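
For anyone wanting to check those percentages, here is a small sketch that recomputes them from the three sizes zpaq printed in the listing above. The pct helper is just a name made up for this example; it only wraps awk for the floating point division.

```shell
#!/bin/bash
# Sizes in MB, copied from the summary lines of the zpaq listing.
original=838638.827520
deduped=201290.310320
compressed=74726.244568

# pct A B prints A as a percentage of B, to one decimal place.
pct() {
    awk -v a="$1" -v b="$2" 'BEGIN { printf "%.1f", 100 * a / b }'
}

echo "after dedupe:     $(pct "$deduped" "$original")% of original"
echo "compression step: $(pct "$compressed" "$deduped")% of deduplicated size"
echo "final archive:    $(pct "$compressed" "$original")% of original"
```

With the numbers from this archive it reports 24.0%, 37.1% and 8.9% respectively.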

The Backup

I configure a backup job within Proxmox to create a backup every week in uncompressed form in a mostly temporary location, with retention options set so that it only keeps one. I then have cron run a bash script a few hours afterwards in order to add the file to the zpaq archive.

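#!/bin/bash
# Add the latest uncompressed vzdump file to the long term archive.
# -m5 = maximum compression, -t8 = use 8 threads.
zpaq a /path/to/zpaq/location/vzdump-qemu-VMID.zpaq /source/of/vzdump/files/vzdump-qemu-VMID-*.vma -m5 -t8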

In the command, zpaq a runs zpaq in ‘add’ mode. The first path is where you’d like to keep the zpaq archive, and the second is the file you’d like to add to it. -m5 selects the maximum compression level (levels run from 0 to 5): the higher the number, the more computationally heavy it is, but the better the compression. At the highest level it uses several clever tricks to improve compression, but it will use 100% of however many threads you assign to it, and the first run takes considerable time because it has to index the entire first file.

If your system has other tasks to do, it is worth limiting the number of threads. I have a high core count but also want the backup to complete reasonably quickly, so I usually use -t8 for 8 threads. The more threads, the faster it will run. If you don’t specify a thread count, it will use all the cores you have available, which might not be what you want.
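
If you’d rather not hard-code the thread count, one option is to derive it from the number of cores. This is only a sketch: it assumes GNU nproc is available, pick_threads is a made-up helper name, and using half the cores is an arbitrary choice for illustration rather than a recommendation.

```shell
#!/bin/bash
# pick_threads TOTAL prints half of TOTAL (rounded down), but never less than 1.
pick_threads() {
    local total=$1
    local half=$(( total / 2 ))
    if [ "$half" -lt 1 ]; then
        half=1
    fi
    echo "$half"
}

# nproc reports the number of processing units available to this process.
threads=$(pick_threads "$(nproc)")
echo "would run zpaq with -t${threads}"
```

The resulting value can then be passed to zpaq in place of the fixed -t8.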

The Conclusion

I’m really happy with this longer term backup solution. It’s incredibly space efficient for keeping a large amount of history, and it takes little administrative effort. However, creating the archive when you add the first version of a file does take considerable time (in the range of hours). Later backups of the same machine added to the same archive take far less (in the range of minutes), because zpaq has already analysed most of the disk’s content and only needs to work out what has changed.

You can run ZPAQ at a lower compression level, but I’ve found that its results aren’t anywhere near as good. Extraction is pretty fast, which is usually what matters in a ‘set and forget’ situation.
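
For reference, pulling a single backup back out might look something like the sketch below. It relies on zpaq’s x (extract) command and its -to option, which renames the stored path to the output path. The restore_backup function name and all the paths are placeholders, and the stored name has to match what zpaq l shows for your archive (typically the full path the file was added from).

```shell
#!/bin/bash
# restore_backup ARCHIVE STORED_NAME DEST_DIR
# Extracts one stored file from the archive into DEST_DIR.
restore_backup() {
    local archive=$1 stored=$2 dest=$3
    # "x" extracts; "-to" maps the stored path onto the output path.
    zpaq x "$archive" "$stored" -to "$dest/$(basename "$stored")"
}

# Example call (all paths are placeholders):
# restore_backup /path/to/zpaq/location/vzdump-qemu-201.zpaq \
#     /source/of/vzdump/files/vzdump-qemu-201-2022_06_10.vma /restore/location
```

The extracted .vma file can then be restored through Proxmox as usual.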

One improvement I might make in future is to add a sha256sum of each individual file in the backup, just so that I can verify that no bits have been flipped, just for that extra certainty after extracting a file, although I’ve yet to encounter any problems so far with restorations 🙂
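
A rough sketch of how that could work, assuming GNU coreutils (sha256sum with --ignore-missing, and realpath). The function names and the manifest file are made up for this example and are not part of the current script: the idea is to record a checksum before each file goes into the archive, then check it again after extraction.

```shell
#!/bin/bash
# record_checksum FILE MANIFEST
# Appends FILE's SHA-256 line to MANIFEST, stored under its bare filename.
record_checksum() {
    local file=$1 manifest=$2
    ( cd "$(dirname "$file")" && sha256sum "$(basename "$file")" ) >> "$manifest"
}

# verify_checksums DIR MANIFEST
# Checks every file listed in MANIFEST that is present in DIR.
# Returns non-zero if any checksum does not match.
verify_checksums() {
    local dir=$1 manifest
    # Resolve to an absolute path before changing directory.
    manifest=$(realpath "$2")
    ( cd "$dir" && sha256sum --check --ignore-missing "$manifest" )
}
```

record_checksum would run just before zpaq a in the cron script, and verify_checksums would be pointed at the directory a backup was extracted into.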


jcx.life