Commit Graph

1 Commits

Author SHA1 Message Date
Rusty Bird
1695a732b8
file-reflink, a storage driver optimized for CoW filesystems
This adds the file-reflink storage driver. It is never selected
automatically for pool creation, especially not the creation of
'varlibqubes' (though it can be used if set up manually).

The code is quite small:

               reflink.py  lvm.py      file.py + block-snapshot
    sloccount  334 lines   447 (134%)  570 (171%)

Background: btrfs and XFS (but not yet ZFS) support instant copies of
individual files through the 'FICLONE' ioctl behind 'cp --reflink'.
Which file-reflink uses to snapshot VM image files without an extra
device-mapper layer. All the snapshots are essentially freestanding;
there's no functional origin vs. snapshot distinction.

In contrast to 'file'-on-btrfs, file-reflink inherently avoids
CoW-on-CoW. Which is a bigger issue now on R4.0, where even AppVMs'
private volumes are CoW. (And turning off the lower, filesystem-level
CoW for 'file'-on-btrfs images would turn off data checksums too, i.e.
protection against bit rot.)

Also in contrast to 'file', all storage features are supported,
including

    - any number of revisions_to_keep
    - volume.revert()
    - volume.is_outdated
    - online fstrim/discard

Example tree of a file-reflink pool - *-dirty.img are connected to Xen:

    - /var/lib/testpool/appvms/foo/volatile-dirty.img
    - /var/lib/testpool/appvms/foo/root-dirty.img
    - /var/lib/testpool/appvms/foo/root.img
    - /var/lib/testpool/appvms/foo/private-dirty.img
    - /var/lib/testpool/appvms/foo/private.img
    - /var/lib/testpool/appvms/foo/private.img@2018-01-02T03:04:05Z
    - /var/lib/testpool/appvms/foo/private.img@2018-01-02T04:05:06Z
    - /var/lib/testpool/appvms/foo/private.img@2018-01-02T05:06:07Z
    - /var/lib/testpool/appvms/bar/...
    - /var/lib/testpool/appvms/...
    - /var/lib/testpool/template-vms/fedora-26/...
    - /var/lib/testpool/template-vms/...

It looks similar to a 'file' pool tree, and in fact file-reflink is
drop-in compatible:

    $ qvm-shutdown --all --wait
    $ systemctl stop qubesd
    $ sed 's/ driver="file"/ driver="file-reflink"/g' -i.bak /var/lib/qubes/qubes.xml
    $ systemctl start qubesd
    $ sudo rm -f /path/to/pool/*/*/*-cow.img*

If the user tries to create a fresh file-reflink pool on a filesystem
that doesn't support reflinks, qvm-pool will abort and mention the
'setup_check=no' option. Which can be passed to force a fallback on
regular sparse copies, with of course lots of time/space overhead. The
same fallback code is also used when initially cloning a VM from a
foreign pool, or from another file-reflink pool on a different
mountpoint.

'journalctl -fu qubesd' will show all file-reflink copy/rename/remove
operations on VM creation/startup/shutdown/etc.
2018-02-12 21:20:05 +00:00