
✨ MANS ✨ — Muffin's Awesome NAS Stack

◢◤◢◤◢◤ ███╗░░░███╗░█████╗░███╗░░██╗░██████╗ ◢◤◢◤◢◤
◢◤◢◤◢◤ ████╗░████║██╔══██╗████╗░██║██╔════╝ ◢◤◢◤◢◤
◢◤◢◤◢◤ ██╔████╔██║███████║██╔██╗██║╚█████╗░ ◢◤◢◤◢◤
◢◤◢◤◢◤ ██║╚██╔╝██║██╔══██║██║╚████║░╚═══██╗ ◢◤◢◤◢◤
◢◤◢◤◢◤ ██║░╚═╝░██║██║░░██║██║░╚███║██████╔╝ ◢◤◢◤◢◤
◢◤◢◤◢◤ ╚═╝░░░░░╚═╝╚═╝░░╚═╝╚═╝░░╚══╝╚═════╝░ ◢◤◢◤◢◤

Intro

An Ansible role for setting up a Debian-based NAS using mergerfs, SnapRAID & snapraid-btrfs, utilising caching with automatic cache 'moving' to a backing pool.


Tasks

This role will do the following:

  • Check for updates and autoupdate if accepted
  • Manage the base OS
    • Apply timezone
    • Enable passwordless sudo (optional)
    • Create/append users/groups
    • Install required apps
      • Install optional apps (user defined)
    • Install fastfetch (optional)
    • Apply SSH keys (optional) (user defined)
    • Upgrade all packages and clean unused
  • Install ZSH (optional)
    • Install powerlevel10k (optional)
    • Run fastfetch at login (optional)
  • Install rclone (stefangweichinger.ansible_rclone) (optional)
  • Install Docker (geerlingguy.docker) (optional)
  • Install mergerfs
  • Install SnapRaid
  • Install btrfs
  • Wipe and setup disks
    • Wipe all disks that do not have the correct fs
    • Setup data_disks with btrfs data subvolumes
    • Setup parity_disks & cache_disks as ext4
    • Configure mounts and fstab
  • Configure mergerfs with systemd service files dependent on the config in vars.yml
  • Configure SnapRAID/snapper
    • Verify snapper configs, re-creating a failsafe config if issues are detected
  • Configure Samba (vladgh.samba.server) w/ custom performance tuning
  • Configure mergerfs-cache-mover
  • Deploy & configure Scrutiny (MANS deploys an active fork of Scrutiny, not the original project)

More Info

I have written in-depth blog posts about how I got here and detailing how this setup works.

I would highly recommend you read, at least, those two blog posts so you are familiar with how this works. I understand the yolo mentality, but please do not run this blindly.

Prerequisites

  • You must be using Debian. There is a check in the vars that stops you from using anything else, but I will not support any fixes that are required to make this work on another distro.
  • Understand how SnapRAID works and its drawbacks.
  • Understand what this role does and how it configures all the bits of software it uses.
  • A device (ideally running macOS/Linux) with Ansible.
  • A machine that you will be using as a NAS with multiple drives (minimum 3).
    • Disk(s) for parity that are larger than or as large as your largest data disk, for most setups; see note below.

Note

As of v0.92 multiple smaller parity disks can be used. This works by having multiple parity files on the smaller disks. So you may use 2x 8TB parity disks when using 16TB data disks, for example. Be aware that this means your parity/'backup' is now dependent on multiple disks being available. This is not the same as having 2 parity disks configured.

Clone

Clone the repo somewhere on your machine.

git clone https://github.com/monstermuffin/muffins-awesome-nas-stack.git

Config

This role will take your settings from a global vars file and apply those to the setup. If/when you need anything changed, including adding/removing disks, you should modify the vars file and rerun the playbook.

Firstly, make a copy of the example inventory file and example vars file whilst in the root of the project.

cd muffins-awesome-nas-stack
cp inventories/inventory.example.yml inventories/inventory.yml
cp vars_example.yml vars.yml

Edit your new inventory.yml file to include the IP/hostname of your Debian server/machine, along with the user you want Ansible to run as (this user needs root access).

It should look something like this:

---
mans_host:
  hosts:
    hht-fs01:
      ansible_host: hht-fs01.internal.muffn.io
      ansible_user: muffin

Vars

Below is an explanation of the core variables needed to make the role function, as well as some things you may wish to change.

This does not cover all the variables, just what you should change to your preference at a minimum, plus the core variables needed to configure the services.


passwordless_sudo: true — Enables passwordless sudo. If set to false you must specify your become password on every run.

configure_zsh: true — Installs Oh My Zsh and sets your default shell to zsh.

configure_powerlevel10k: true — Configures powerlevel10k.

fastfetch_motd: true — Sets the MOTD to fastfetch. Useful IMO. Must have zsh enabled to work. Fastfetch will be installed if missing.

install_rclone: true — Installs rclone.

install_docker: true — Installs and configures Docker. Setting either this or configure_scrutiny to true will install Docker.

configure_scrutiny: true — Installs and configures Scrutiny in Docker for your disks.

Note

MANS deploys an active fork of Scrutiny, not the original project which is largely unmaintained.

configure_hdidle: true — Installs and configures hdidle. No longer recommended, see note below.

skip_os_check: false — Skips the OS check for Debian. Unsupported.

wipe_and_setup: true — If disks need to be set up, this needs to be set to true, else the playbook will immediately fail. You will still be required to accept a prompt before changes are made.

extra_apps: — List any extra applications from apt that you would like to install. Follow the example layout and uncomment.

Important

Whilst hdidle worked well for me initially, I have had issues with some disks going into read-only mode, as BTRFS does not like waiting for the disk to be ready. This behaviour has been reported by 2 others, and disabling hdidle fixed the issue, so I no longer recommend enabling it. If you had previously enabled it and wish to disable it, set configure_hdidle: false in your vars.yml.
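Pulling the toggles above together, a minimal vars.yml sketch might look like this. Values are illustrative and the extra_apps entries are placeholders; vars_example.yml is the authoritative reference:

```yaml
passwordless_sudo: true
configure_zsh: true
configure_powerlevel10k: true
fastfetch_motd: true       # requires zsh
install_rclone: true
install_docker: true
configure_scrutiny: true   # pulls in Docker if enabled
configure_hdidle: false    # no longer recommended, see the note above
skip_os_check: false
wipe_and_setup: true       # required for initial disk setup

# extra_apps:
#   - htop
#   - tmux
```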


samba users // password — Use vault to set a password, or just slap some plain text into here, I won't know.


content_files — If you are not planning on using a cache disk you must remove the third path and possibly replace it. Depending on how many parity disks you have, you may need more content files, and these must be on separate disks. You cannot place them on the data disks, as having content files in a subvolume is not supported.
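As an illustration only (the paths here are placeholders following the default mount scheme), a three-content-file layout could look like:

```yaml
content_files:
  - /var/snapraid.content                        # OS disk
  - /mnt/parity-disks/parity01/snapraid.content  # parity disk
  - /mnt/cache-disks/cache01/snapraid.content    # remove if not using cache
```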


data_directories — Top-level directories that will be created on every data disk and parity disk. This can be a list of strings, a list of dictionaries, or a mix of both, as demonstrated below:

# Option 1 - With default ownership and permissions.
# This will create the directories with:
#   Owner: `user` (set in vars)
#   Group: `media_group` (set in vars)
#   Mode: 0770
data_directories:
  - movies
  - tv
  - music
  - youtube

# Option 2 - With custom ownership and permissions:
data_directories:
  - name: movies
    owner: joe
    group: users
    mode: '0775'
  - name: tv
    owner: lisa
    group: "{{ media_group }}"
    mode: '0770'

Disk Config

You must enter your disks in the format that is pre-filled. You can of course add or remove any entries as necessary, but /dev/disk/by-id/your-disk must be how the vars are entered.

Any of the disks can be added/removed at any time; simply make your changes and rerun. That's the point of this.

To get your disks in the correct format, copypasta the following into your terminal:

lsblk -do NAME,SIZE,MODEL | while read -r name size model; do
    echo -e "Disk: /dev/$name\nSize: $size\nModel: $model\nID Links:";
    ls -l /dev/disk/by-id/ | grep "/$name$" | grep -v "wwn-" | awk '{print "  /dev/disk/by-id/"$9}';
    echo "";
done

data_disks — Your data disks.

parity_disks — Your parity disk(s). Must be at least 1 disk here.

Important

As of MANS v0.92 parity disk configuration has changed significantly due to #24 and #25.

The role will fail if you have an older version of the variables but a newer version of the role. Simply change the format as below.

There are two 'modes' to put a parity disk into, dedicated and split.

  • Dedicated parity disks are single parity disks, each larger than or as large as your largest data disk.
  • Split parity disks are smaller than your largest data disk but larger than or as large as it when combined. This allows you to use multiple smaller parity disks with larger data disks.

Example parity config:

parity_disks:
  # Level 1 split across two disks
  - device: /dev/disk/by-id/disk1
    mode: split
    level: 1
  - device: /dev/disk/by-id/disk2
    mode: split
    level: 1
  # Level 2 is a single dedicated disk
  - device: /dev/disk/by-id/disk3
    mode: dedicated
    level: 2

Each entry's level defines an entire parity level, so a dedicated disk can only ever hold one level, as it is a dedicated parity level. The only time a level should be spread across disks is when using split mode. This is done because of the complexities of deploying such a config accurately.

A split config can have as many disks as required to reach at least the size of your largest data disk, as long as they are all in the same level.

Example A: You want one parity level and your parity disk is larger than any of your data disks:

parity_disks:
  - device: /dev/disk/by-id/disk1
    mode: dedicated
    level: 1

Tip

In most cases this is the setup you will be using.

Example B: You want two parity levels and your parity disks are larger than any of your data disks:

parity_disks:
  - device: /dev/disk/by-id/disk1
    mode: dedicated
    level: 1
  - device: /dev/disk/by-id/disk2
    mode: dedicated
    level: 2

Tip

In most cases this is the setup you will be using if you want multiple parity.

Example C: You want to split a parity across two smaller disks. Your largest data disk is 16TB. Both your parity disks are 8TB:

parity_disks:
  - device: /dev/disk/by-id/disk1
    mode: split
    level: 1
  - device: /dev/disk/by-id/disk2
    mode: split
    level: 1

Example D: You want to mix a dedicated parity disk as well as add a split parity across two other disks:

parity_disks:
  - device: /dev/disk/by-id/disk1
    mode: split
    level: 1
  - device: /dev/disk/by-id/disk2
    mode: split
    level: 1
  - device: /dev/disk/by-id/disk3
    mode: dedicated
    level: 2

Example E: You want two levels of parity, both using split disks:

parity_disks:
  - device: /dev/disk/by-id/disk1
    mode: split
    level: 1
  - device: /dev/disk/by-id/disk2
    mode: split
    level: 1
  - device: /dev/disk/by-id/disk3
    mode: split
    level: 2
  - device: /dev/disk/by-id/disk4
    mode: split
    level: 2

Important

MANS will attempt to warn about incorrect parity var config at the start of the run, but this cannot be guaranteed.

cache_disks — Any fast disk you want to send writes to; ideally this should be an NVMe. This variable can be:

  • 1 single disk in the form of /dev/disk/by-id/your-disk.
  • Multiple disks in the form of /dev/disk/by-id/your-disk. When multiple disks are used, this is not adding redundancy, just more cache.
  • An existing path on your operating system. If you already have space on your OS drive, for example, you could use something like /opt/mergerfs-cache. To ensure this doesn't fill up, adjust the mergerfs-cache-mover vars as needed (below).
  • A path and a disk in the format of /dev/disk/by-id/your-disk. I don't know why but it's supported.
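For example (disk IDs and the path are placeholders), the forms above could be expressed as:

```yaml
# A single fast disk:
cache_disks:
  - /dev/disk/by-id/nvme-YOURDISK

# Multiple disks pooled together (more cache, no redundancy):
#cache_disks:
#  - /dev/disk/by-id/nvme-DISK1
#  - /dev/disk/by-id/nvme-DISK2

# An existing path on the OS drive, or a path mixed with a disk:
#cache_disks:
#  - /opt/mergerfs-cache
#  - /dev/disk/by-id/nvme-DISK1
```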

configure_mergerfs_cache_minfreespace — Minimum free space for cache disks (default: 50G). See issue #92.

configure_mergerfs_minfreespace — Minimum free space for data disks (default: 100G, previously 10G).


Cache mover things: https://github.com/MonsterMuffin/mergerfs-cache-mover


If you left configure_scrutiny set to true, you can set up omnibus or collector mode here; if you don't know, leave the default, omnibus.

To get notifications about your disk health, enable one or more of the notification options and enter the relevant variables for the service.

Install Requirements

To install the requirements, run the following in the project dir:

pip install -r requirements.txt
ansible-galaxy install -r requirements.yml

Deploying

To run the playbook, simply run:

ansible-playbook playbook.yml -Kk

If you have opted to install your ssh keys with this role, subsequent runs will not require -k.

If you have opted to configure passwordless_sudo, -K will not be required on subsequent runs.

The playbook should execute all the required actions to set up & configure MANS. Subsequent runs will of course be much faster.

You can target specific elements of the setup process with tags. For example:

# Only run mergerfs setup
ansible-playbook playbook.yml --tags mergerfs
# You can also use the shorthand version, `-t`
ansible-playbook playbook.yml -t mergerfs

You can use this in reverse, excluding any step with a given tag:

ansible-playbook playbook.yml --skip-tags mergerfs

You can see all available tags:

ansible-playbook playbook.yml --list-tags

You can list all tasks and their tags:

ansible-playbook playbook.yml --list-tasks

You can show all tasks that would be included with a given tag:

ansible-playbook playbook.yml --tags install_btrfs --list-tasks

Usage

After a successful deployment, you will have the following (dependent on config):

Mounts

  • /mnt/media — This is the 'cached' share (if any cache device was specified). This is where writes will go and samba is configured to serve to/from.

  • /mnt/media-cold — 'Non-cached' share. This is the pool of backing data disks.

  • /mnt/cache-pool — If multiple cache devices were defined, this is the mount point for the pooled cache devices.

  • /mnt/data-disks/dataxx — Mount points for data_disks.

  • /mnt/parity-disks/parityxx — Mount points for parity_disks.

  • /mnt/cache-disks/cachexx — Mount points for cache_disks.

Logs

  • /var/log/snapraid-btrfs-runner.log — Logs for snapraid-btrfs-runner runs.
  • /var/log/snapper.log — Logs for snapper.
  • /var/log/cache-mover.log — Logs for mergerfs-cache-mover runs.

Commands

  • sudo python3 /var/snapraid-btrfs-runner/snapraid-btrfs-runner.py -c /var/snapraid-btrfs-runner/snapraid-btrfs-runner.conf — Runs snapraid-btrfs-runner manually.

  • sudo python3 /opt/mergerfs-cache-mover/cache-mover.py --console-log — Runs mergerfs-cache-mover manually.

  • sudo snapper list-configs — Lists all valid snapper configs.

  • sudo snapraid-btrfs ls — List snapshots.

  • sudo btrfs subvolume list /mnt/data-disks/data0x — Show Btrfs subvolumes for a given data disk.

Ports

  • 8080 - Scrutiny web-ui (omnibus).

Data Recovery

MANS provides two layers of data protection:

  • Btrfs snapshots (managed by Snapper): for recovering accidentally deleted or corrupted files on healthy disks.
  • SnapRAID parity: for recovering data from failed/dead disks using parity reconstruction.

File & Data Recovery (Btrfs Snapshots)

Use this when files have been accidentally deleted, corrupted, or gone missing but the underlying disk is still healthy and mounted.

Identifying the Problem

Check the snapraid-btrfs-runner log for warnings:

sudo tail -50 /var/log/snapraid-btrfs-runner.log

Common signs that files need recovering:

  • WARNING! All the files previously present in disk 'dX' ... are now missing or have been rewritten!
  • Deleted files exceed delete threshold (the runner aborted because too many files disappeared)
  • Files missing from your media applications (Plex, Jellyfin, etc.)

Finding the Right Snapshot

List snapshots across all disks:

sudo snapraid-btrfs ls

Each disk shows its snapshots. Look for the one marked snapraid-btrfs=synced (this is the last known-good state). Note the snapshot number (the # column).
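If you want to grab the synced snapshot number programmatically, something like this works. It is shown here against a sample of the list output for illustration; on a live system you would pipe sudo snapper -c <disk> list instead:

```shell
# Sample 'snapper list' output for one disk (illustrative)
cat > /tmp/snapper-list.sample <<'EOF'
# | Type   | Pre # | Date                         | User | Cleanup | Description         | Userdata
--+--------+-------+------------------------------+------+---------+---------------------+----------------------
0 | single |       |                              | root |         | current             |
2 | single |       | Thu 17 Oct 2024 03:52:31 BST | root |         | snapraid-btrfs sync | snapraid-btrfs=synced
EOF

# Extract the number of the last known-good (synced) snapshot
synced=$(grep 'snapraid-btrfs=synced' /tmp/snapper-list.sample | awk '{print $1}')
echo "$synced"
```

This is the same extraction used by the all-disks comparison loop below.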

Browse a snapshot's contents to confirm the data you need is there:

# List available snapshots for a disk
sudo ls /mnt/data-disks/<disk>/.snapshots/

# Browse the snapshot contents (replace <disk>, <snap_num>, <category>)
sudo ls /mnt/data-disks/<disk>/.snapshots/<snap_num>/snapshot/<category>/

# Verify data is intact (not empty stubs)
sudo du -sh /mnt/data-disks/<disk>/.snapshots/<snap_num>/snapshot/<category>/

Comparing Snapshot vs Live Data

Quick comparison for a single disk:

# Compare item counts between snapshot and live
sudo ls /mnt/data-disks/<disk>/.snapshots/<snap_num>/snapshot/<category>/ | wc -l
sudo ls /mnt/data-disks/<disk>/<category>/ | wc -l

Or check all disks at once (adjust the category as needed):

category=movies  # change to tv, youtube, etc.
for d in /mnt/data-disks/data*/; do
  disk=$(basename "$d")
  synced=$(sudo snapper -c "$disk" list 2>/dev/null | grep synced | awk '{print $1}')
  [ -z "$synced" ] && continue
  snap_count=$(sudo ls "$d/.snapshots/$synced/snapshot/$category/" 2>/dev/null | wc -l)
  live_count=$(sudo ls "$d/$category/" 2>/dev/null | wc -l)
  diff=$((live_count - snap_count))
  [ "$diff" -ne 0 ] && echo "$disk: snapshot #$synced ($snap_count) -> live ($live_count)  diff=$diff"
done

Recovering Files from Snapshots

Btrfs snapshots use Copy-on-Write, so recovering with --reflink=auto is near-instant and uses no additional disk space. Data blocks are shared between the snapshot and the restored copy.

Step 1: Always test with a single item first:

# Pick something small from the snapshot
sudo du -sh '/mnt/data-disks/<disk>/.snapshots/<snap_num>/snapshot/<category>/<item>/'

# Copy it to the live filesystem
sudo cp -a --reflink=auto \
  '/mnt/data-disks/<disk>/.snapshots/<snap_num>/snapshot/<category>/<item>' \
  '/mnt/data-disks/<disk>/<category>/'

# Verify it's visible on mergerfs and permissions are correct
ls -la '/mnt/media/<category>/<item>/'

Step 2: Bulk recover all missing items (skips anything that already exists):

# Set these variables for your situation
DISK="data01"        # the affected disk
SNAP_NUM="5"         # the synced snapshot number
CATEGORY="movies"    # movies, tv, youtube, etc.

sudo bash -c "
count=0; failed=0
for dir in \"/mnt/data-disks/$DISK/.snapshots/$SNAP_NUM/snapshot/$CATEGORY/\"*/; do
  name=\$(basename \"\$dir\")
  if [ ! -d \"/mnt/data-disks/$DISK/$CATEGORY/\$name\" ]; then
    if cp -a --reflink=auto \"\$dir\" \"/mnt/data-disks/$DISK/$CATEGORY/\"; then
      count=\$((count+1))
      echo \"OK [\$count]: \$name\"
    else
      failed=\$((failed+1))
      echo \"FAILED: \$name\"
    fi
  fi
done
echo \"Done: \$count copied, \$failed failed\"
"

Step 3: Verify and resync parity:

# Confirm files are visible on mergerfs
ls /mnt/media/<category>/ | wc -l

# Resync snapraid parity
# Use --ignore-deletethreshold if there are still legitimate changes from before recovery
sudo snapraid-btrfs sync

Important Notes

  • This only works while the disk is healthy. If a disk has physically failed, see the Disk Failure Recovery section below.
  • cp -a --reflink=auto preserves permissions, ownership, and timestamps. The reflink means no extra disk space is used as long as the underlying data blocks haven't been overwritten.
  • Snapshots are read-only: you cannot accidentally damage them during recovery.
  • Existing files are skipped: the [ ! -d ... ] check prevents overwriting files that may have been placed by the cache mover or other processes since the snapshot was taken.
  • Recover before the next sync. Snapshots are cleaned up when a new snapraid-btrfs sync completes successfully. If the runner's delete threshold prevented a sync, your snapshots are safe, but don't manually force a sync until you've recovered what you need.

Disk Failure Recovery (SnapRAID Parity)

Use this when a data disk has physically failed, is producing I/O errors, or is no longer mountable. SnapRAID can reconstruct the contents of one failed disk (single parity) or two failed disks (dual parity) using parity data.

Important

Parity must have been synced before the disk failed. SnapRAID can only recover data that was included in the last successful snapraid-btrfs sync. Any files added after the last sync are not protected.

Before You Begin

  1. Do not run snapraid-btrfs sync. Syncing after a disk failure will update parity to reflect the missing data (an empty/new disk), destroying your ability to recover.
  2. Check Scrutiny / SMART data to confirm which disk has failed and whether it's a total failure or just errors.
  3. Identify the failed disk's SnapRAID label (e.g. d1, d2, etc.) from /etc/snapraid.conf.
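To list the labels without reading the whole file, you can grep the config. This is shown against a sample fragment, since the exact contents of /etc/snapraid.conf depend on your disks (paths here are illustrative):

```shell
# Sample /etc/snapraid.conf fragment (illustrative paths)
cat > /tmp/snapraid.conf.sample <<'EOF'
parity /mnt/parity-disks/parity01/snapraid.parity
content /var/snapraid.content
disk d1 /mnt/data-disks/data01/
disk d2 /mnt/data-disks/data02/
disk d3 /mnt/data-disks/data03/
EOF

# Print each data disk's SnapRAID label and path
# (on a live system: grep '^disk ' /etc/snapraid.conf)
grep '^disk ' /tmp/snapraid.conf.sample | awk '{print $2, $3}'
```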

Assessing the Damage

# Check which disks are mounted
mount | grep /mnt/data-disks

# Check disk health (smart/status are snapraid commands, not snapraid-btrfs)
sudo snapraid smart

# See the current state of the array
sudo snapraid status

Understanding the Sync Risk

SnapRAID parity is your only way to recover a failed disk's data. If a sync runs against the new empty disk before you've recovered, the parity will be updated to reflect the empty state and the data will be permanently lost.

The snapraid-btrfs-runner has a deletethreshold (default: 150) which will abort a sync if too many files are detected as deleted; this should catch an entire empty disk. However, you should not rely on this as your only protection.
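The threshold lives in the runner's config file (the .conf path shown in the Commands section). Assuming the snapraid-runner-style config format, the relevant fragment looks roughly like this; check your actual .conf for the exact section and key:

```ini
[snapraid]
; abort the sync if more than this many files were deleted since the last sync
deletethreshold = 150
```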

MANS will re-enable the snapraid-btrfs-runner timer when it runs (it ensures the timer is enabled and started). This means after running MANS with the new disk, the automated weekly sync (Sunday 03:00) will be active. You must either:

  • Complete the recovery before the next scheduled sync, or
  • Disable the timer after MANS completes until recovery is done:
    sudo systemctl stop snapraid-btrfs-runner.timer
    sudo systemctl disable snapraid-btrfs-runner.timer

Recovery Process

Step 1: Update MANS config and re-run the playbook.

Replace the failed disk physically, then update the disk ID in your vars file (e.g. vars.yml) under data_disks. Then run MANS with wipe_and_setup: true:

ansible-playbook playbook.yml

MANS will handle all the setup for the new disk automatically:

  • Format the disk with Btrfs and create the data subvolume
  • Mount it and update /etc/fstab
  • Create the snapper config
  • Create data_directories (movies, tv, etc.)
  • Reconfigure mergerfs and snapraid to include it

MANS does not run snapraid sync or touch parity data, so your recovery data is safe. However, it will re-enable the sync timer (see above).

Step 2: Disable the sync timer (if you need more time to recover):

sudo systemctl stop snapraid-btrfs-runner.timer
sudo systemctl disable snapraid-btrfs-runner.timer

Step 3: Recover data using SnapRAID parity:

# Verify parity integrity for the failed disk first
sudo snapraid-btrfs check -d <label>

# Recover all missing files (replace <label> with the disk label, e.g. d1)
sudo snapraid-btrfs fix -d <label> -m

# Or recover all files on the disk (missing + errored)
sudo snapraid-btrfs fix -d <label>

The -m flag targets only missing/deleted files. Without it, fix will also rewrite any files with errors. This reconstructs files using parity data and will take a long time depending on how much data needs to be rebuilt.

Step 4: Resync parity and re-enable the timer:

Once recovery is complete, sync parity to include the recovered data and re-enable automated syncs:

sudo snapraid-btrfs sync
sudo systemctl enable snapraid-btrfs-runner.timer
sudo systemctl start snapraid-btrfs-runner.timer

Parity Disk Failure

If a parity disk fails rather than a data disk, no data is lost, but you have no protection until it's replaced. Replace the disk, update the disk ID in your vars file, re-run MANS, and then rebuild parity:

sudo snapraid-btrfs sync --force-full

Limitations

  • SnapRAID can recover up to as many simultaneous disk failures as you have parity disks (1 parity disk = 1 disk failure, 2 parity disks = 2 disk failures).
  • Only files included in the last successful sync are recoverable. Unsynced changes are lost.
  • SnapRAID is not a real-time RAID. There is always a window between syncs where new data is unprotected.
  • Content files (snapraid.content) must be intact for recovery. MANS stores these on separate disks for redundancy.

Issues & Requests

Please report any issues with full logs (-vvv). If you have any requests or improvements, please feel free to raise an issue or submit a PR.

Snapper error "The config 'root' does not exist"

This is expected and unfortunately just a symptom of the way this is configured vs. how Snapper expects to be used.

The error presents itself like so:

$ sudo snapper list
The config 'root' does not exist. Likely snapper is not configured.

Most things can be done instead with snapraid-btrfs, for example:

$ sudo snapraid-btrfs list
data01 /mnt/data-disks/data01
# │ Type   │ Pre # │ Date                         │ User │ Cleanup │ Description         │ Userdata
──┼────────┼───────┼──────────────────────────────┼──────┼─────────┼─────────────────────┼──────────────────────
0 │ single │       │                              │ root │         │ current             │
2 │ single │       │ Thu 17 Oct 2024 03:52:31 BST │ root │         │ snapraid-btrfs sync │ snapraid-btrfs=synced

data02 /mnt/data-disks/data02
# │ Type   │ Pre # │ Date                         │ User │ Cleanup │ Description         │ Userdata
──┼────────┼───────┼──────────────────────────────┼──────┼─────────┼─────────────────────┼──────────────────────
0 │ single │       │                              │ root │         │ current             │
2 │ single │       │ Thu 17 Oct 2024 03:52:31 BST │ root │         │ snapraid-btrfs sync │ snapraid-btrfs=synced

data03 /mnt/data-disks/data03
# │ Type   │ Pre # │ Date                         │ User │ Cleanup │ Description         │ Userdata
──┼────────┼───────┼──────────────────────────────┼──────┼─────────┼─────────────────────┼──────────────────────
0 │ single │       │                              │ root │         │ current             │
2 │ single │       │ Thu 17 Oct 2024 03:52:32 BST │ root │         │ snapraid-btrfs sync │ snapraid-btrfs=synced

If necessary, you can run snapper commands via snapraid-btrfs, and this seems to work fine: snapraid-btrfs snapper <command>

Split Parity Migration

MANS now supports split parity files to overcome ext4's 16TB file size limitation. This allows using data disks larger than 16TB with ext4-formatted parity disks by splitting parity data across multiple files. Migration is only required if:

  • You have an existing single-file setup AND
  • You plan to use data disks larger than 16TB

Note

Whilst there is no real need to migrate if the above does not apply to your situation, there may come a time when this option is deprecated completely. New deployments are split, so this could technically stay here forever, but I can't see the future.

It would be best to migrate when you can.

To migrate:

Set MANS to migrate

In your vars.yml, set:

split_parity_migrate: true

Note

You will most likely need to add this variable if you have an existing MANS setup; please see vars_example.yml for any new vars you may be missing from updates.

Run the MANS playbook

ansible-playbook playbook.yml

After playbook completion:

Delete existing parity file

sudo rm /mnt/parity-disks/parity01/snapraid.parity

Warning

This may take significant time and look like it's hung; it's not. Do this in a tmux window, and in another session you can watch free/used space slowly change with df -h. For my 12TB parity file the delete took 14m 40s.

Note

You may have more parity files to delete on other disks.

Rebuild parity in split files

Important

If you have a lot of data this can take a significant amount of time. I highly recommend running the command below in a tmux window to run unattended. If you spawn sync in a normal SSH window and that connection is broken, it will break the sync.

sudo snapraid-btrfs sync --force-full

Note

The sync process can take significant time depending on array size. Do not interrupt the process, as above.

  • Each parity disk will have its own set of split files (e.g. snapraid-1.parity, snapraid-2.parity, snapraid-3.parity).
  • Files are filled sequentially: when one file is full, SnapRAID moves to the next.

Note

As below, this took me about 26h to do 72TB.

100% completed, 72481609 MB accessed in 26:10     0:00 ETA

     d1 20% | ************
     d2 22% | *************
     d3  9% | *****
     d4  5% | ***
     d5 10% | ******
     d6 10% | ******
     d7  0% |
 parity  7% | ****
   raid  3% | **
   hash 11% | ******
  sched  0% |
   misc  0% |
            |______________________________________________________________
                           wait time (total, less is better)

Changelog

See the full changelog here.

Ko-fi

Tip me on Ko-fi

