This should allow importing a volume and changing the size at the
same time, without performing the resize operation on original
volume first.
The internal API has been renamed to internal.vm.volume.ImportBegin
to avoid confusion, and for symmetry with ImportEnd.
See QubesOS/qubes-issues#5239.
Don't fall back on 'cp' if the FICLONE ioctl gives an errno that's not
plausibly reflink specific, because in such a case any fallback could
theoretically mask real but intermittent system/storage errors.
Looking through ioctl_ficlone(2) and the kernel source, it should be
sufficient to do the fallback only on EBADF/EINVAL/EOPNOTSUPP/EXDEV.
(EISDIR/ETXTBSY don't apply to this storage driver, which will never
legitimately attempt to reflink a directory or an active - in the
storage domain - swap file.)
One alternative would look like
import ctypes
sizeof_int = ctypes.sizeof(ctypes.c_int)
FICLONE = (1073741824 % 256**sizeof_int) | 37897 | (sizeof_int << 16)
but, even if the above really(?) is a 100% correct Python port of
$ echo FICLONE | cpp -include linux/fs.h | tail -n 1
it still seems more likely that the ctypes package is somehow buggy
somewhere than for Qubes storage to run on an exotic architecture with
non 32 bit ints (in the foreseeable future).
So just document the baked in assumption.
The default (= text) mode for a loop device which contains a VM image
looked weird, even though it didn't make a difference here because the
dev_io object was never actually read from.
During regular VM shutdown, the VM should sync() anyway. (And
admin.vm.volume.Import does fdatasync(), which is also fine.) But let's
be extra careful.
There were (at least) five ways for the volume's nominal size and the
volume image file's actual size to desynchronize:
- loading a stale qubes.xml if a crash happened right after resizing the
image but before saving the updated qubes.xml (-> previously fixed)
- restarting a snap_on_start volume after resizing the volume or its
source volume (-> previously fixed)
- reverting to a differently sized revision
- importing a volume
- user tinkering with image files
Rather than trying to fix these one by one and hoping that there aren't
any others, override the volume size getter itself to always update from
the image file size. (If the getter is called though the storage API, it
takes the volume lock to avoid clobbering the nominal size when resize()
is running concurrently.)
And change the volume lock from an asyncio.Lock to a threading.Lock -
locking is now handled before coroutinization.
This will allow the coroutinized resize() and a new *not* coroutinized
size() getter from one of the next commits ("storage/reflink: preferably
get volume size from image size") to both run under the volume lock.
Successfully resize volumes without any currently existing image file,
e.g. cleanly stopped volatile volumes: Just update the nominal size in
this case.
Convert create(), verify(), remove(), start(), stop(), revert(),
resize(), and import_volume() into coroutine methods, via a decorator
that runs them in the event loop's thread-based default executor.
This reduces UI hangs by unblocking the event loop, and can e.g. speed
up VM starts by starting multiple volumes in parallel.
Instead of raising a NotImplementedError, just return self like 'file'
and lvm_thin. This is needed when Storage.clone() is modified in another
commit* to no longer swallow exceptions.
* "storage: factor out _wait_and_reraise(); fix clone/create"
Import volume data to a new _path_import (instead of _path_dirty) before
committing to _path_clean. In case the computer crashes while an import
operation is running, the partially written file should not be attached
to Xen on the next volume startup.
Use <name>-import.img as the filename like 'file' does, to be compatible
with qubes.tests.api_admin/TC_00_VMs/test_510_vm_volume_import.
When the AT_REPLACE flag for linkat() finally lands in the Linux kernel,
_replace_file() can be modified to use unnamed (O_TMPFILE) tempfiles.
Until then, make sure stale tempfiles from previous crashes can't hang
around for too long.
It's sort of useful to be able to revert a volume that has only ever
been started once to its empty state. And the lvm_thin driver allows it
too, so why not.
Don't rely on timestamps to sort revisions - the clock can go backwards
due to time sync. Instead, use a monotonically increasing natural number
as the revision ID.
Old revision example: private.img@2018-01-02-03T04:05:06Z (ignored now)
New revision example: private.img.123@2018-01-02-03T04:05:06Z
* storage-properties:
storage: use None for size/usage properties if unknown
tests: call search_pool_containing_dir with various dirs and pools
storage: make DirectoryThinPool helper less verbose, add sudo
api/admin: add 'included_in' to admin.pool.Info call
storage: add Pool.included_in() method for checking nested pools
storage: move and generalize RootThinPool helper class
storage/kernels: refuse changes to 'rw' and 'revisions_to_keep'
api/admin: implement admin.vm.volume.Set.rw method
api/admin: include 'revisions_to_keep' and 'is_outdated' in volume info
It may happen that one pool is inside a volume of other pool. This is
the case for example for varlibqubes pool (file driver,
dir_path=/var/lib/qubes) and default lvm pool (lvm_thin driver). The
latter include whole root filesystem, so /var/lib/qubes too.
This is relevant for proper disk space calculation - to not count some
space twice.
QubesOS/qubes-issues#3240QubesOS/qubes-issues#3241