Commit Graph

75 Commits

Author SHA1 Message Date
Rusty Bird
edda3a1734
storage/reflink: factor out _ficlone() 2018-09-11 23:50:13 +00:00
Rusty Bird
69af0a48ec
storage/reflink: inline and simplify _cmd() 2018-09-11 23:50:12 +00:00
Rusty Bird
fb06a8089a
storage/reflink: _update_loopdev_sizes() without losetup
Factor out a function, and use the LOOP_SET_CAPACITY ioctl instead of
going through losetup.
2018-09-11 23:50:10 +00:00
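
As a rough illustration of what fb06a8089a describes, the sketch below issues the LOOP_SET_CAPACITY ioctl directly from Python instead of spawning losetup. It is not the driver's actual code: the function name update_loopdev_size() and its injectable ioctl parameter are assumptions made for this example; only the ioctl itself (request number 0x4C07 in <linux/loop.h>) comes from the kernel.

```python
import fcntl
import os

# From <linux/loop.h>: ask the kernel to re-read the backing file's size.
LOOP_SET_CAPACITY = 0x4C07


def update_loopdev_size(loopdev_path, ioctl=fcntl.ioctl):
    """Tell one loop device to pick up its backing file's new size.

    Hypothetical helper. `ioctl` is injectable only so that the sketch
    can be exercised without a real loop device or root privileges.
    """
    fd = os.open(loopdev_path, os.O_RDONLY)
    try:
        ioctl(fd, LOOP_SET_CAPACITY)
    finally:
        os.close(fd)
```

On a real system this would be called with a path such as /dev/loop0 after the backing image has been resized, and it typically requires root.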
Rusty Bird
385ba91772
storage/reflink: resize(): don't look for loopdevs if clean 2018-09-09 20:01:22 +00:00
Rusty Bird
e7b7c253ac
storage/reflink: inline _require_self_on_stop() 2018-09-09 20:01:20 +00:00
Rusty Bird
6e8d7d4201
storage/reflink: no-op import_volume() if not save_on_stop
Instead of raising a NotImplementedError, just return self like 'file'
and lvm_thin. This is needed when Storage.clone() is modified in another
commit* to no longer swallow exceptions.

* "storage: factor out _wait_and_reraise(); fix clone/create"
2018-09-09 20:01:19 +00:00
Rusty Bird
60bf68a748
storage/reflink: add _path_import (don't reuse _path_dirty)
Import volume data to a new _path_import (instead of _path_dirty) before
committing to _path_clean. In case the computer crashes while an import
operation is running, the partially written file should not be attached
to Xen on the next volume startup.

Use <name>-import.img as the filename like 'file' does, to be compatible
with qubes.tests.api_admin/TC_00_VMs/test_510_vm_volume_import.
2018-09-09 20:01:18 +00:00
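
The import-then-commit idea in 60bf68a748 can be sketched as follows. This is a minimal illustration, not the driver's implementation: the function name import_then_commit() is hypothetical, and using fsync() followed by rename() as the commit step is an assumption of this example.

```python
import os


def import_then_commit(data, path_import, path_clean):
    """Write imported data to a separate file, then commit it.

    Hypothetical helper. If the machine crashes before the rename,
    only path_import is left half-written; path_clean is untouched.
    """
    with open(path_import, 'wb') as f:
        f.write(data)
        f.flush()
        os.fsync(f.fileno())  # data must be on disk before committing
    os.rename(path_import, path_clean)  # atomic within one filesystem
```

Because the partially written data never lives under the dirty or clean path, a crash mid-import cannot leave a file that would be attached to Xen on the next volume startup.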
Rusty Bird
d301aa2e50
storage/reflink: delete stale tempfiles on start and remove
When the AT_REPLACE flag for linkat() finally lands in the Linux kernel,
_replace_file() can be modified to use unnamed (O_TMPFILE) tempfiles.
Until then, make sure stale tempfiles from previous crashes can't hang
around for too long.
2018-09-09 20:01:17 +00:00
Rusty Bird
75a4a1340e
storage/reflink: don't recompute static properties per call 2018-09-09 20:01:15 +00:00
Rusty Bird
ef2698adb4
storage/reflink: make revisions() more readable, use iglob 2018-09-09 20:01:14 +00:00
Rusty Bird
18f9356c2c
storage/reflink: refuse to revert() dirty volume 2018-09-09 20:01:13 +00:00
Rusty Bird
677183d8a6
storage/reflink: add revision even if empty
It's sort of useful to be able to revert a volume that has only ever
been started once to its empty state. And the lvm_thin driver allows it
too, so why not.
2018-09-09 20:01:12 +00:00
Rusty Bird
850778b52a
storage/reflink: remove redundant format specifiers 2018-09-09 20:01:11 +00:00
Marek Marczykowski-Górecki
d6b422cc36
Merge remote-tracking branch 'qubesos/pr/207'
* qubesos/pr/207:
  storage/reflink: strictly increasing revision ID
2018-03-22 01:54:38 +01:00
Rusty Bird
6a303760e9
storage/reflink: strictly increasing revision ID
Don't rely on timestamps to sort revisions - the clock can go backwards
due to time sync. Instead, use a monotonically increasing natural number
as the revision ID.

Old revision example: private.img@2018-01-02-03T04:05:06Z (ignored now)
New revision example: private.img.123@2018-01-02-03T04:05:06Z
2018-03-21 16:00:13 +00:00
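
A small sketch of the numbering scheme from 6a303760e9: the next revision ID is derived from the IDs already present in the file names, never from the clock. The function name next_revision_id() and the choice to start counting at 1 are assumptions made for this example.

```python
import re


def next_revision_id(filenames, base):
    """Return the next strictly increasing revision ID for `base`.

    Hypothetical helper. Only names of the form <base>.<number>@<timestamp>
    are considered; old-style <base>@<timestamp> names are ignored.
    """
    pattern = re.compile(re.escape(base) + r'\.(\d+)@')
    ids = [int(m.group(1)) for m in map(pattern.match, filenames) if m]
    return max(ids, default=0) + 1
```

Sorting revisions by the integer ID, rather than by the timestamp suffix, keeps their order stable even if time sync moves the clock backwards.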
Marek Marczykowski-Górecki
e5413a3036
Merge branch 'storage-properties'
* storage-properties:
  storage: use None for size/usage properties if unknown
  tests: call search_pool_containing_dir with various dirs and pools
  storage: make DirectoryThinPool helper less verbose, add sudo
  api/admin: add 'included_in' to admin.pool.Info call
  storage: add Pool.included_in() method for checking nested pools
  storage: move and generalize RootThinPool helper class
  storage/kernels: refuse changes to 'rw' and 'revisions_to_keep'
  api/admin: implement admin.vm.volume.Set.rw method
  api/admin: include 'revisions_to_keep' and 'is_outdated' in volume info
2018-03-21 01:43:53 +01:00
Marek Marczykowski-Górecki
d40fae9756
storage: add Pool.included_in() method for checking nested pools
It may happen that one pool is inside a volume of another pool. This is
the case, for example, for the varlibqubes pool (file driver,
dir_path=/var/lib/qubes) and the default lvm pool (lvm_thin driver). The
latter includes the whole root filesystem, and so /var/lib/qubes too.
This is relevant for proper disk space calculation, so that the same
space is not counted twice.

QubesOS/qubes-issues#3240
QubesOS/qubes-issues#3241
2018-03-20 16:53:39 +01:00
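
To illustrate the double-counting concern behind d40fae9756 (Pool.included_in()), here is a minimal sketch. It does not use the real Pool API: pools are modelled as a plain dict mapping a pool name to a (size, name of containing pool or None) tuple, and total_size() is a hypothetical helper.

```python
def total_size(pools):
    """Sum pool sizes, skipping pools nested inside another pool.

    `pools` maps name -> (size_in_bytes, containing_pool_name_or_None).
    This data model is an assumption made for the example.
    """
    return sum(size for size, parent in pools.values() if parent is None)
```

With a varlibqubes pool that lives inside the default lvm pool, only the lvm pool's size contributes to the total.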
Rusty Bird
1743c76ca9
storage/reflink: reorder start() to be more readable
This also makes slightly more sense in the exotic (and currently unused)
case of restarting a crashed snap_on_start *and* save_on_stop volume.
2018-03-12 16:38:56 +00:00
Rusty Bird
31810db977
storage/reflink: simplify 2018-03-11 17:39:51 +00:00
Rusty Bird
c382eb3752
storage/reflink: let _remove_empty_dir() ignore ENOTEMPTY 2018-03-11 17:39:51 +00:00
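
The behaviour named in c382eb3752 can be sketched as below. This is an illustration under assumptions, not the driver's code: the function name and the boolean return value are made up for the example.

```python
import errno
import os


def remove_empty_dir(path):
    """Remove `path` if it is empty; quietly keep it if it is not.

    Hypothetical helper. Returns True if the directory was removed.
    """
    try:
        os.rmdir(path)
        return True
    except OSError as e:
        if e.errno != errno.ENOTEMPTY:
            raise  # any other failure is still an error
        return False
```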
Rusty Bird
023cb49293
storage/reflink: show size in refused volume shrink message
Like e6bb282 did for lvm.
2018-03-11 15:34:56 +00:00
Rusty Bird
c31d317c63
storage/reflink: fsync() after resizing existing file
Ensure that the updated metadata is written to disk.
2018-03-11 15:34:55 +00:00
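
A minimal sketch of the pattern in c31d317c63: resize an existing file, then fsync() it so the new size reaches the disk. The function name resize_file() is an assumption for this example.

```python
import os


def resize_file(path, size):
    """Grow or shrink an existing file and flush the change to disk."""
    with open(path, 'rb+') as f:
        f.truncate(size)
        os.fsync(f.fileno())  # make sure the updated metadata is written
```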
Rusty Bird
37e1aedfa3
reflink: style fix 2018-02-16 21:47:39 +00:00
Rusty Bird
c871424fb0
storage: typo fix 2018-02-16 21:47:37 +00:00
Rusty Bird
1695a732b8
file-reflink, a storage driver optimized for CoW filesystems
This adds the file-reflink storage driver. It is never selected
automatically for pool creation, especially not the creation of
'varlibqubes' (though it can be used if set up manually).

The code is quite small:

               reflink.py  lvm.py      file.py + block-snapshot
    sloccount  334 lines   447 (134%)  570 (171%)

Background: btrfs and XFS (but not yet ZFS) support instant copies of
individual files through the 'FICLONE' ioctl behind 'cp --reflink'.
file-reflink uses this ioctl to snapshot VM image files without an extra
device-mapper layer. All the snapshots are essentially freestanding;
there's no functional origin vs. snapshot distinction.
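
As a rough sketch of the FICLONE ioctl mentioned above, called from Python. This is not the driver's code: try_reflink() is a hypothetical name, and catching every OSError as "not supported" is a simplification made for the example. The request number 0x40049409 is _IOW(0x94, 9, int) from <linux/fs.h>.

```python
import fcntl

# _IOW(0x94, 9, int) from <linux/fs.h>
FICLONE = 0x40049409


def try_reflink(src_path, dst_path):
    """Try to make dst_path a reflink copy of src_path.

    Hypothetical helper. Returns True on success and False if the
    ioctl fails (for example, the filesystem has no reflink support
    or the two paths are on different mountpoints). On failure,
    dst_path is left behind as an empty file.
    """
    with open(src_path, 'rb') as src, open(dst_path, 'wb') as dst:
        try:
            fcntl.ioctl(dst.fileno(), FICLONE, src.fileno())
            return True
        except OSError:
            return False
```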

In contrast to 'file'-on-btrfs, file-reflink inherently avoids
CoW-on-CoW, which is a bigger issue now on R4.0, where even AppVMs'
private volumes are CoW. (And turning off the lower, filesystem-level
CoW for 'file'-on-btrfs images would turn off data checksums too, i.e.
protection against bit rot.)

Also in contrast to 'file', all storage features are supported,
including

    - any number of revisions_to_keep
    - volume.revert()
    - volume.is_outdated
    - online fstrim/discard

Example tree of a file-reflink pool - *-dirty.img are connected to Xen:

    - /var/lib/testpool/appvms/foo/volatile-dirty.img
    - /var/lib/testpool/appvms/foo/root-dirty.img
    - /var/lib/testpool/appvms/foo/root.img
    - /var/lib/testpool/appvms/foo/private-dirty.img
    - /var/lib/testpool/appvms/foo/private.img
    - /var/lib/testpool/appvms/foo/private.img@2018-01-02T03:04:05Z
    - /var/lib/testpool/appvms/foo/private.img@2018-01-02T04:05:06Z
    - /var/lib/testpool/appvms/foo/private.img@2018-01-02T05:06:07Z
    - /var/lib/testpool/appvms/bar/...
    - /var/lib/testpool/appvms/...
    - /var/lib/testpool/template-vms/fedora-26/...
    - /var/lib/testpool/template-vms/...

It looks similar to a 'file' pool tree, and in fact file-reflink is
drop-in compatible:

    $ qvm-shutdown --all --wait
    $ systemctl stop qubesd
    $ sed 's/ driver="file"/ driver="file-reflink"/g' -i.bak /var/lib/qubes/qubes.xml
    $ systemctl start qubesd
    $ sudo rm -f /path/to/pool/*/*/*-cow.img*

If the user tries to create a fresh file-reflink pool on a filesystem
that doesn't support reflinks, qvm-pool will abort and mention the
'setup_check=no' option. This option can be passed to force a fallback
to regular sparse copies, at the cost of considerable time and space
overhead. The same fallback code is also used when initially cloning a
VM from a foreign pool, or from another file-reflink pool on a different
mountpoint.
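
What a "regular sparse copy" might look like is sketched below. It is only an illustration of the general technique, not the fallback code referred to above: sparse_copy() and its chunk parameter are hypothetical, and skipping all-zero chunks is one simple way to keep holes in the destination.

```python
import os


def sparse_copy(src_path, dst_path, chunk=1024 * 1024):
    """Copy a file, leaving holes where the source has runs of zeros.

    Hypothetical helper. Unlike a reflink, this reads and rewrites
    all non-zero data, so it costs real time and disk space.
    """
    with open(src_path, 'rb') as src, open(dst_path, 'wb') as dst:
        while True:
            buf = src.read(chunk)
            if not buf:
                break
            if buf.count(0) == len(buf):
                dst.seek(len(buf), os.SEEK_CUR)  # skip zeros: leave a hole
            else:
                dst.write(buf)
        dst.truncate(dst.tell())  # fix up the size if the file ends in a hole
```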

'journalctl -fu qubesd' will show all file-reflink copy/rename/remove
operations on VM creation/startup/shutdown/etc.
2018-02-12 21:20:05 +00:00