The new property is meant for the management stack (Salt) to set which
DVM template should be used to maintain the given VM. Since the DispVM
based on it will be given ultimate control over the target VM
(qubes.VMShell service), it should be trusted. The one pointed to by
default_dispvm is not necessarily trusted.
The property defaults to the value from the template (if any), and then
to the global management_dispvm property. By default it is set to None.
Signed-off-by: Marek Marczykowski-Górecki <marmarek@invisiblethingslab.com>
It makes a lot of sense to perform long-running operations in that
event handler, including calling back into the VM. Allow that by using
fire_event_async, not just fire_event.
Also, document the event.
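For illustration, an async handler for such an event could look roughly
like this (a sketch only - the event name and the qrexec service called
back into the VM are made up for the example):

    import qubes.ext

    class ExampleExtension(qubes.ext.Extension):
        @qubes.ext.handler('domain-example-event')
        async def on_example_event(self, vm, event, **kwargs):
            # long-running work is fine here, because the emitter uses
            # fire_event_async and awaits the handlers
            await vm.run_service_for_stdio('example.Service')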
Commit 15cf593bc5 "tests/lvm: fix checking
lvm pool existence" attempted to fix handling '-' in the pool name by
using the /dev/VG/LV symlink. But those symlinks are not created for
thin pools. Change back to /dev/mapper, but include '-' mangling.
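The mangling follows the device-mapper naming convention: every '-'
inside a VG or LV name is doubled, and the two names are then joined
with a single '-'. A minimal sketch of the rule (not the actual test
code):

    def dm_path(vg_name, lv_name):
        # device-mapper escapes '-' in names by doubling it
        mangle = lambda name: name.replace('-', '--')
        return '/dev/mapper/{}-{}'.format(mangle(vg_name), mangle(lv_name))

    # e.g. VG 'qubes-dom0', LV 'pool00' -> /dev/mapper/qubes--dom0-pool00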
Related QubesOS/qubes-issues#4332
Restore the old code for calculating the subdir within the archive. The
new one had two problems:
- it set '/' for an empty input subdir, which caused qubes.xml.000 to
be named '/qubes.xml.000' (and then converted to '../../qubes.xml.000');
among other things, this resulted in the wrong path being used for the
encryption passphrase
- it resolved symlinks, which broke calculating the path for any
symlinks within the VM's directory (symlinks there should be treated as
normal files, to be sure that the actual content is included in the
backup)
This partially reverts 4e49b951ce.
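A minimal sketch of the two restored properties (the helper name is
illustrative, not the actual backup code):

    import os

    def subdir_in_archive(path, base_dir):
        # relpath, unlike realpath, does not resolve symlinks, so a
        # symlink inside the VM's directory keeps its own path
        subdir = os.path.relpath(os.path.dirname(path), base_dir)
        # an empty subdir must stay empty, not become '/'
        return '' if subdir == '.' else subdir + '/'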
Fixes QubesOS/qubes-issues#4493
Add a 'no-default-kernelopts' feature to skip the default hardcoded
Linux-specific kernelopts.
This is especially useful for non-Linux VMs (including Mirage OS).
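Assuming the standard per-VM features mechanism, enabling it would look
roughly like this (with 'vm' being a QubesVM instance in dom0):

    # mark a (non-Linux) VM so the hardcoded kernelopts are skipped
    vm.features['no-default-kernelopts'] = '1'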
Fixes QubesOS/qubes-issues#4468
vm.kill() will try to acquire vm.startup_lock, so it can't be called
while already holding it.
Fix this by extracting vm._kill_locked(), which expects the lock to be
already held by the caller.
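The resulting split looks roughly like this (a sketch of the pattern,
not the exact code):

    async def kill(self):
        # public entry point: take the lock, then do the actual work
        async with self.startup_lock:
            await self._kill_locked()

    async def _kill_locked(self):
        # the caller must already hold self.startup_lock
        ...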
* origin/pr/239:
storage: fix NotImplementedError message for import_data()
storage/reflink: make resize()/import_volume() more readable
storage/reflink: unblock import_data() and import_data_end()
Try to collect more details about why the test failed. This will help
only if qvm-open-in-dvm exits early. On the other hand, if it hangs, or
the remote side fails to find the right editor (which results in a GUI
error message), this change will not provide any more details.
The first boot of a whonix-ws based VM takes an extended period of
time, because a lot of files need to be copied to the private volume.
This takes even more time when verbose logging through the console is
enabled. Extend the timeout for that.
If a domain is set to autostart, the qubes-vm@ systemd service is used
to start it at boot. Clean up the service when the domain is removed,
and similarly enable the service when the domain is created with
autostart=True already set.
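A sketch of the hook (the unit name comes from the commit; the
systemctl invocation itself is illustrative):

    import subprocess

    def sync_autostart_unit(vm_name, autostart):
        # enable/disable the per-VM unit so systemd starts the domain
        # at boot only while it exists with autostart=True
        action = 'enable' if autostart else 'disable'
        subprocess.check_call(
            ['systemctl', action, 'qubes-vm@{}.service'.format(vm_name)])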
Fixes QubesOS/qubes-issues#4014
Clean up VMs in template reverse topological order, not network order.
Network can be set to None to break the dependency, but template can't.
For netvm to be changed, kill VMs first (kill doesn't check the network
dependency), so the netvm change will not trigger side effects (a
runtime change, which could fail).
This fixes cleanup for tests creating custom templates - previously the
order was undefined, and if removing a template was attempted before
the VMs based on it, it failed. All the relevant files were removed
later anyway, but it led to Python object leaks.
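The ordering itself amounts to a post-order walk over the template
dependency; an illustrative sketch (not the test-suite code):

    def cleanup_order(vms):
        # every VM is yielded before the template it is based on
        order, seen = [], set()
        def visit(vm):
            if vm in seen:
                return
            seen.add(vm)
            for other in vms:
                if getattr(other, 'template', None) is vm:
                    visit(other)
            order.append(vm)
        for vm in vms:
            visit(vm)
        return order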
Cleaning up after domain shutdown (domain-stopped and domain-shutdown
events) relies on libvirt events, which may be unreliable in some cases
(events may be processed with some delay, or, if libvirt was restarted
in the meantime, may not happen at all). So, instead of only ensuring
proper ordering between shutdown cleanup and the next startup, also
trigger the cleanup when we know for sure the domain isn't running:
- at vm.kill() - after libvirt confirms domain was destroyed
- at vm.shutdown(wait=True) - after successful shutdown
- at vm.remove_from_disk() - after ensuring it isn't running but just
before actually removing it
This fixes various race conditions:
- qvm-kill && qvm-remove: remove could happen before shutdown cleanup
was done and storage driver would be confused about that
- qvm-shutdown --wait && qvm-clone: the clone could happen before new
content was committed to the original volume, so the copy would contain
the previous VM state
(and probably more)
Previously it wasn't such a big issue in the default configuration,
because the LVM driver was fully synchronous, effectively blocking the
whole qubesd for the time the cleanup took.
To avoid code duplication, factor out an _ensure_shutdown_handled
function calling the actual cleanup (and possibly cancelling the one
scheduled from the libvirt event). Note that now the "Duplicated
stopped event from libvirt received!" warning may happen in normal
circumstances, not only because of a bug.
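In outline, the factored-out function behaves like this (a sketch of
the control flow only; the guard flag is illustrative, not the actual
attribute name):

    async def _ensure_shutdown_handled(self):
        # the cleanup may now be reached from the libvirt event and
        # from kill()/shutdown()/remove_from_disk(), so it has to be
        # idempotent
        if self._shutdown_handled:  # illustrative flag
            self.log.warning(
                'Duplicated stopped event from libvirt received!')
            return
        self._shutdown_handled = True
        await self.fire_event_async('domain-stopped')
        await self.fire_event_async('domain-shutdown')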
It is very important that the post-shutdown cleanup happens when the
domain is not running. To ensure that, take startup_lock and, under it,
1) ensure the domain is halted and only then 2) execute the cleanup.
This isn't necessary when removing it from disk, because it is already
removed from the collection at that time, which also prevents other
calls to it (see also the "vm/dispvm: fix DispVM cleanup" commit).
Actually, taking the startup_lock in the remove_from_disk function
would cause a deadlock in the DispVM auto cleanup code:
- vm.kill (or another trigger for the cleanup)
  - vm.startup_lock acquire   <====
    - vm._ensure_shutdown_handled
      - domain-shutdown event
        - vm._auto_cleanup (in the DispVM class)
          - vm.remove_from_disk
            - cannot take vm.startup_lock again
First unregister the domain from the collection, and only then call
remove_from_disk(). Removing it from the collection prevents further
calls being made to it. Or, if anything else keeps a reference to it
(for example as a netvm), abort the operation.
Additionally, this makes it unnecessary to take the startup lock when
cleaning it up in tests.
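In outline (a sketch; app.domains is the standard VM collection, and
removing an entry is expected to fail if the VM is still referenced):

    # unregister first: nothing new can reach the VM afterwards; this
    # step aborts if something still references it (e.g. as a netvm)
    del app.domains[vm.qid]
    await vm.remove_from_disk()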
vm.shutdown(wait=True) waited indefinitely for the shutdown, which made
it useless without some boilerplate handling the timeout. Since the
timeout may depend on the operating system inside, add a per-VM
property for it, with value inheritance from the template and then from
the global default_shutdown_timeout property.
When the timeout is reached, the method raises an exception - whether
to kill the domain or not is left to the caller.
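Caller-side handling would then look roughly like this (the exact
exception class is not named here; catching the qubes.exc.QubesVMError
base class is an assumption):

    try:
        await vm.shutdown(wait=True)
    except qubes.exc.QubesVMError:
        # timeout reached (assumed exception type) - the caller decides
        # whether to escalate
        await vm.kill()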
Fixes QubesOS/qubes-issues#1696