sanoid on TrueNAS

I have been thinking about using sanoid to manage the snapshots on my TrueNAS instance for some time now. I bounced off the idea a few times due to laziness and the difficulty of working around the appliance nature of TrueNAS. Another project, sanoid-portable, has made the process much simpler, overcoming both issues; below I've outlined the setup process.

Why not use TrueNAS snapshots and replication?

I won't waste a lot of your time on my reasons for moving away from the built-in snapshot and replication utilities, but here are a few bullets that pushed me to make the change:

  • I'd like my backup and replication setup to be more portable: I want to keep at least one of my machines on FreeBSD, and with TrueNAS moving away from its Core product, I will eventually need to migrate to another FreeBSD-based NAS OS or vanilla FreeBSD.
  • I've had issues with TrueNAS consistently pruning snapshots. This was especially evident on the replication target. I never successfully remedied this issue.
  • I want separate retention policies. I never figured out how to have different pruning schedules on the production and backup servers.
  • The configuration of multiple snapshot timings in the UI is clunky and can be confusing.
  • I'd like better monitoring of the process to make sure snapshots are being taken and replicated.

sanoid and syncoid appear to address each of these issues, so I decided to give them a try.

Remove existing snapshots (optional)

I had an unknown, but massive, number of snapshots on my server. Some of this was my fault for taking snapshots more frequently than needed, much was the fault of TrueNAS for failing to prune snapshots. I wanted to start fresh with sanoid and make sure that all of the snapshots had the proper naming format to be managed by sanoid. I have backups, so I wasn't worried about accidentally deleting something; if you're going to undertake this procedure, you should also make sure you have your data backed up.

Attempting to manage snapshots in the UI is slow and frustrating - you can only view and delete them a handful at a time. Instead, we'll use the terminal. You can use either the Shell in the UI or ssh into your server; I've done the latter.

List and delete (dry run) your snapshots

To view your current snapshots run:

zfs list -t snap

This will show "all" of the snapshots on the system. When I say "all", you should be aware that the list may be truncated, depending on the actual number of snapshots, but it'll give you a place to start and some names to work with.
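
For example, with hypothetical snapshot names and sizes based on the datasets used later in this post (yours will differ), the output looks something like:

NAME                                       USED  AVAIL  REFER  MOUNTPOINT
fast/app/immich@auto-2024-12-31_00-00_1h  1.21M      -  10.5G  -
fast/app/zabbix@auto-2024-12-31_00-00_1h   324K      -  2.13G  -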

Next, we'll preview the commands to delete the snapshots. According to this post on the TrueNAS forums, we can use the following command:

zfs list -t snap | awk '/<pattern>/ { printf "zfs destroy %s\n", $1 }'

Running this command will not delete anything; it prints the statements that would delete the snapshots if they were run.

This command pipes the output of zfs list -t snap (which we ran previously) to awk.

awk will then match a pattern on the output of the first command and output the zfs destroy command, replacing %s with the name of the snapshot to be deleted. Review these commands before actually running them to make sure there isn't an unexpected snapshot hiding in there.
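
As a hypothetical example, a pattern matching the zabbix dataset used later in this post would print something like:

zfs destroy fast/app/zabbix@auto-2024-12-30_23-00_1h
zfs destroy fast/app/zabbix@auto-2024-12-31_00-00_1h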

awk patterns

The awk patterns should start and end with /. awk will then print a destroy command for every snapshot name containing your pattern - note that the match can occur anywhere in the name, not just at the beginning, so anchor the pattern with ^ if you only want names that start with it. Keep in mind, you'll need to escape any / in your paths with \/.
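
For example, anchoring on the fast pool used later in this post prints destroy commands only for snapshots whose names begin with fast/:

zfs list -t snap | awk '/^fast\// { printf "zfs destroy %s\n", $1 }'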

You can match any level of parent-child dataset hierarchies:

  1. Remove snapshots for each (child) dataset individually. For example, I have the datasets fast/app/immich and fast/app/zabbix. If I only want to delete the snapshots for Zabbix, I would run:
zfs list -t snap | awk '/fast\/app\/zabbix/ { printf "zfs destroy %s\n", $1 }'

This will return all the snapshots whose names contain fast/app/zabbix, e.g., fast/app/zabbix@auto-2024-12-31_00-00_1h.

  2. Delete all the children of a particular dataset, saving the effort of deleting the snapshots of each dataset individually. For example, if I want to delete the snapshots for both the immich and zabbix datasets, I would run:
zfs list -t snap | awk '/fast\/app/ { printf "zfs destroy %s\n", $1 }'
  3. Moving up the hierarchy, we could delete everything in the fast pool (note that the unanchored /fast/ would also match any other dataset name containing "fast") with:
zfs list -t snap | awk '/fast/ { printf "zfs destroy %s\n", $1 }'

Running the destroy commands

The commands run so far will only print the commands to delete the snapshots, not actually run them. To delete the snapshots, pipe them to sh by adding | sh to the end of the commands above:

zfs list -t snap | awk '/<pattern>/ { printf "zfs destroy %s\n", $1 }' | sh

For example, deleting all the snapshots in the fast/app dataset:

zfs list -t snap | awk '/fast\/app/ { printf "zfs destroy %s\n", $1 }' | sh
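
As a quick sanity check before adding | sh, you can count how many snapshots a pattern matches and compare that against the number you expect:

zfs list -t snap | awk '/fast\/app/' | wc -l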

Set up sanoid-portable

The main roadblock I faced in getting sanoid set up on TrueNAS was the inability to install the necessary dependencies due to the appliance nature of TrueNAS itself. This is where sanoid-portable comes in: it is a single binary containing sanoid, syncoid, findoid, and all their dependencies, allowing any and all of these utilities to be easily run on TrueNAS.

Create a dataset for persistent storage

Any changes you make to the root file system risk being removed by an update. To avoid this, I have created a dataset, fast/persistent, with the purpose of saving any custom binaries and/or configuration.

See my post on zfs replication over Nebula for the steps to create this persistent dataset, or just make a new dataset as you normally would in the web UI.

Once you have the dataset created, you can add additional files and directories to the dataset via the terminal:

mkdir -p /mnt/fast/persistent/sanoid/config
chmod -R 750 /mnt/fast/persistent/sanoid

In the commands above, we're creating both the sanoid directory, where we'll install the sanoid-portable binary, and the sanoid/config directory, where we'll save sanoid.conf and sanoid.defaults.conf. We're also limiting access to root.
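
You can verify the result with ls (the link count, size, and date will differ on your system):

ls -ld /mnt/fast/persistent/sanoid
drwxr-x--- 3 root root 3 Jan  1 12:00 /mnt/fast/persistent/sanoid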

Install sanoid-portable

All commands below need to be run as root. If you're on TrueNAS SCALE and not logged in as root, switch using sudo su.

First, we'll download the binary to the directory we created with:

cd /mnt/fast/persistent/sanoid
wget https://github.com/decoyjoe/sanoid-portable/releases/latest/download/sanoid-portable

Then, make it executable:

chmod +x sanoid-portable

The following command transforms the portable executable into a native binary for the system on which it's installed:

sh ./sanoid-portable --assimilate
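
To confirm the assimilation worked, you can check the binary type with file; on TrueNAS SCALE (Linux) it should now report a native ELF executable rather than the portable format (output abbreviated here and will vary by platform):

file ./sanoid-portable
./sanoid-portable: ELF 64-bit LSB executable, x86-64 ...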

Finally, we'll create symlinks for the */oid tools, so that we can run them by name:

ln -s sanoid-portable sanoid
ln -s sanoid-portable syncoid
ln -s sanoid-portable findoid

If you don't plan to use all of the tools, you can create the symlinks only for those you want to use.
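
A quick ls -l should show the links pointing at the binary (timestamps will differ; the regular files are omitted here):

ls -l /mnt/fast/persistent/sanoid
lrwxrwxrwx 1 root root 15 Jan  1 12:00 findoid -> sanoid-portable
lrwxrwxrwx 1 root root 15 Jan  1 12:00 sanoid -> sanoid-portable
lrwxrwxrwx 1 root root 15 Jan  1 12:00 syncoid -> sanoid-portable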

Configure sanoid

Download the two example configuration files from the sanoid GitHub repo:

wget -O /mnt/fast/persistent/sanoid/config/sanoid.conf https://raw.githubusercontent.com/jimsalterjrs/sanoid/master/sanoid.conf
wget -O /mnt/fast/persistent/sanoid/config/sanoid.defaults.conf https://raw.githubusercontent.com/jimsalterjrs/sanoid/master/sanoid.defaults.conf

I won't go into detail on configuring sanoid - the docs will do a better job than I can - but here's my sanoid.conf as an example:

[big/applications]
        use_template = production
        recursive = yes
        process_children_only = yes

[big/photos]
        use_template = daily
        recursive = yes
        process_children_only = yes

[big/pve-big]
        use_template = production
        recursive = yes
        process_children_only = yes

[big/S3]
        use_template = production
        recursive = yes

[fast/app]
        use_template = production
        recursive = yes
        process_children_only = yes

[fast/pve-fast]
        use_template = production
        recursive = yes
        process_children_only = yes


#############################
# templates below this line #
#############################

[template_daily]
        ### Remember: these are local snapshots, more history is/can be available
        ### on the backup machine
        frequently = 0
        hourly = 0
        daily = 30
        monthly = 3
        yearly = 0
        autosnap = yes
        autoprune = yes

[template_production]
        frequently = 0
        hourly = 36
        daily = 30
        monthly = 3
        yearly = 0
        autosnap = yes
        autoprune = yes

Run sanoid

We're going to run sanoid as a cron job. Before setting up the job, let's take a look at the commands.

The sanoid docs suggest running the snapshot and pruning functions as separate cron jobs to avoid long-running pruning operations blocking new snapshot creation:

*/15 * * * * root flock -n /var/run/sanoid/cron-take.lock -c "TZ=UTC sanoid --take-snapshots"
*/15 * * * * root flock -n /var/run/sanoid/cron-prune.lock -c "sanoid --prune-snapshots"

I didn't do this for a few reasons:

  1. /var/run/sanoid/... doesn't exist in this setup and I didn't feel like figuring out the equivalent.
  2. I will not be taking snapshots any more frequently than hourly and deleting all the snapshots of even my largest datasets took no more than a few minutes, so there should be plenty of time to perform the pruning operation.
  3. Even if I'm wrong about #2, my situation is such that a missed hourly isn't a big deal.

Instead, I am using the following command for the cron job:

TZ=UTC /mnt/fast/persistent/sanoid/sanoid --cron --configdir /mnt/fast/persistent/sanoid/config/

If you want to test this, you can run it directly in the terminal, optionally adding --verbose to see more detailed output.
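
For example (the sanoid docs also describe a --readonly flag if you'd rather simulate the run without creating or pruning anything):

TZ=UTC /mnt/fast/persistent/sanoid/sanoid --cron --verbose --configdir /mnt/fast/persistent/sanoid/config/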

A couple notes on the command:

  • The docs recommend including TZ=UTC to avoid any potential issues with daylight saving time.
  • The FreeBSD instructions state that you must use chsh -s /bin/sh to successfully run sanoid on FreeBSD. This does not appear to be the case when using sanoid-portable.

Schedule the cron job

As with everything TrueNAS, if you don't do it through the UI, you risk breaking something or having it eventually disappear:

  1. Navigate to the web UI for your TrueNAS server and log in.
  2. On the left navigation menu, choose Tasks, then Cron Jobs.
  3. Click Add to configure a new job.
  4. Configure as follows:
    • Description: Run sanoid hourly
    • Command: TZ=UTC /mnt/fast/persistent/sanoid/sanoid --cron --configdir /mnt/fast/persistent/sanoid/config/
    • Run As User: root
    • Schedule: Hourly (0 * * * *) at the start of each hour
  5. Click Save.

Verify

As with everything, make sure your cron job is actually creating snapshots. You can go back to the command from the beginning of this post:

zfs list -t snap

If everything is working, you should see a list of newly created snapshots. Keep in mind: the cron job runs at the top of the hour.
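
If the full list is too long to eyeball, you can sort by creation time and look at just the newest entries:

zfs list -t snap -o name,creation -s creation | tail -n 10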

Next up will be using syncoid to replicate your sanoid snapshots.