Commit Graph

474 Commits

Author SHA1 Message Date
Marek Marczykowski-Górecki
f13029219b
vm: disable/enable qubes-vm@ service when domain is removed/created
If domain is set to autostart, qubes-vm@ systemd service is used to
start it at boot. Cleanup the service when domain is removed, and
similarly enable the service when domain is created and already have
autostart=True.

Fixes QubesOS/qubes-issues#4014
2018-10-27 16:44:53 +02:00
Marek Marczykowski-Górecki
2c1629da04
vm: call after-shutdown cleanup also from vm.kill and vm.shutdown
Cleaning up after domain shutdown (domain-stopped and domain-shutdown
events) relies on libvirt events which may be unreliable in some cases
(events may be processed with some delay, of if libvirt was restarted in
the meantime, may not happen at all). So, instead of ensuring only
proper ordering between shutdown cleanup and next startup, also trigger
the cleanup when we know for sure domain isn't running:
 - at vm.kill() - after libvirt confirms domain was destroyed
 - at vm.shutdown(wait=True) - after successful shutdown
 - at vm.remove_from_disk() - after ensuring it isn't running but just
 before actually removing it

This fixes various race conditions:
 - qvm-kill && qvm-remove: remove could happen before shutdown cleanup
 was done and storage driver would be confused about that
 - qvm-shutdown --wait && qvm-clone: clone could happen before new content was
 commited to the original volume, making the copy of previous VM state
(and probably more)

Previously it wasn't such a big issue on default configuration, because
LVM driver was fully synchronous, effectively blocking the whole qubesd
for the time the cleanup happened.

To avoid code duplication, factor out _ensure_shutdown_handled function
calling actual cleanup (and possibly canceling one called with libvirt
event). Note that now, "Duplicated stopped event from libvirt received!"
warning may happen in normal circumstances, not only because of some
bug.

It is very important that post-shutdown cleanup happen when domain is
not running. To ensure that, take startup_lock and under it 1) ensure
its halted and only then 2) execute the cleanup. This isn't necessary
when removing it from disk, because its already removed from the
collection at that time, which also avoids other calls to it (see also
"vm/dispvm: fix DispVM cleanup" commit).
Actually, taking the startup_lock in remove_from_disk function would
cause a deadlock in DispVM auto cleanup code:
 - vm.kill (or other trigger for the cleanup)
   - vm.startup_lock acquire   <====
     - vm._ensure_shutdown_handled
       - domain-shutdown event
         - vm._auto_cleanup (in DispVM class)
           - vm.remove_from_disk
             - cannot take vm.startup_lock again
2018-10-26 23:54:08 +02:00
Marek Marczykowski-Górecki
5be003d539
vm/dispvm: fix DispVM cleanup
First unregister the domain from collection, and only then call
remove_from_disk(). Removing it from collection prevent further calls
being made to it. Or if anything else keep a reference to it (for
example as a netvm), then abort the operation.

Additionally this makes it unnecessary to take startup lock when
cleaning it up in tests.
2018-10-26 23:54:08 +02:00
Marek Marczykowski-Górecki
e1f65bdf7b
vm: add shutdown_timeout property, make vm.shutdown(wait=True) use it
vm.shutdown(wait=True) waited indefinitely for the shutdown, which makes
useless without some boilerplate handling the timeout. Since the timeout
may depend on the operating system inside, add a per-VM property for it,
with value inheritance from template and then from global
default_shutdown_timeout property.

When timeout is reached, the method raises exception - whether to kill
it or not is left to the caller.

Fixes QubesOS/qubes-issues#1696
2018-10-26 23:54:04 +02:00
Marek Marczykowski-Górecki
58bcec2a64
qubesvm: improve error message about same-pool requirement
Make it clear that volume creation fails because it needs to be in the
same pool as its parent. This message is shown in context of `qvm-create
-p root=MyPool` for example and the previous message didn't make sense
at all.

Fixes QubesOS/qubes-issues#3438
2018-10-18 00:03:05 +02:00
Marek Marczykowski-Górecki
ba210c41ee
qubesvm: don't crash VM creation if icon symlink already exists
It can be leftover from previous failed attempt. Don't crash on it, and
replace it instead.

QubesOS/qubes-issues#3438
2018-10-18 00:01:45 +02:00
Rusty Bird
bee69a98b9
Add default_qrexec_timeout to qubes-prefs
When a VM (or its template) does not explicitly set a qrexec_timeout,
fall back to a global default_qrexec_timeout (with default value 60),
instead of hardcoding the fallback value to 60.

This makes it easy to set a higher timeout for the whole system, which
helps users who habitually launch applications from several (not yet
started) VMs at the same time. 60 seconds can be too short for that.
2018-09-16 18:42:48 +00:00
Rusty Bird
b3983f5ef8
'except FileNotFoundError' instead of ENOENT check 2018-09-13 19:46:45 +00:00
Marek Marczykowski-Górecki
7f1e2741ec
Merge remote-tracking branch 'qubesos/pr/228'
* qubesos/pr/228:
  storage/lvm: filter out warning about intended over-provisioning
  tests: fix getting kernel package version inside VM
  tests/extra: add start_guid option to VMWrapper
  vm/qubesvm: fire 'domain-start-failed' event even if fail was early
  vm/qubesvm: check if all required devices are available before start
  storage/lvm: fix reporting lvm command error
  storage/lvm: save pool's revision_to_keep property
2018-09-07 01:06:59 +02:00
Marek Marczykowski-Górecki
ee25f7c7bb
vm: fix error reporting on PVH without kernel set
Fixes QubesOS/qubes-issues#4254
2018-09-03 00:23:05 +02:00
Jean-Philippe Ouellet
e95ef5f61d
Add domain-paused/-unpaused events
Needed for event-driven domains-tray UI updating and anti-GUI-DoS
usability improvements.

Catches errors from event handlers to protect libvirt, and logs to
main qubesd logger singleton (by default meaning systemd journal).
2018-08-01 05:41:50 -04:00
Marek Marczykowski-Górecki
e51efcf980
vm: document domain-start-failed event 2018-07-16 22:02:59 +02:00
Marek Marczykowski-Górecki
af2435c0d4
Make some properties default to template's value (if any)
Multiple properties are related to system installed inside the VM, so it
makes sense to have them the same for all the VMs based on the same
template. Modify default value getter to first try get the value from a
template (if any) and only if it fails, fallback to original default
value.
This change is made to those properties:
 - default_user (it was already this way)
 - kernel
 - kernelopts
 - maxmem
 - memory
 - qrexec_timeout
 - vcpus
 - virt_mode

This is especially useful for manually installed templates (like
Windows).

Related to QubesOS/qubes-issues#3585
2018-07-16 22:02:58 +02:00
Marek Marczykowski-Górecki
be2465c1f9
Fix issues found by pylint 2.0
Resolve:
 - no-else-return
 - useless-object-inheritance
 - useless-return
 - consider-using-set-comprehension
 - consider-using-in
 - logging-not-lazy

Ignore:
 - not-an-iterable - false possitives for asyncio coroutines

Ignore all the above in qubespolicy/__init__.py, as the file will be
moved to separate repository (core-qrexec) - it already has a copy
there, don't desynchronize them.
2018-07-15 23:51:15 +02:00
Marek Marczykowski-Górecki
6a191febc3
vm/qubesvm: fire 'domain-start-failed' event even if fail was early
Fire 'domain-start-failed' even even if failure occurred during
'domain-pre-start' event. This will make sure if _anyone_ have seen
'domain-pre-start' event, will also see 'domain-start-failed'. In some
cases it will look like spurious 'domain-start-failed', but it is safer
option than the alternative.
2018-04-13 16:07:32 +02:00
Marek Marczykowski-Górecki
ba82d9dc21
vm/qubesvm: check if all required devices are available before start
Fail the VM start early if some persistently-assigned device is missing.
This will both save time and provide clearer error message.

Fixes QubesOS/qubes-issues#3810
2018-04-13 16:03:42 +02:00
Marek Marczykowski-Górecki
93b2424867
vm/qubesvm: fix missing icon handling in clone_disk_files()
Check for icon existence, not a directory for it.
2018-04-06 12:10:50 +02:00
Marek Marczykowski-Górecki
2dee554ab7
vm/mix/net: make vm.gateway6 consistent with vm.gateway
Use VM's actual IP address as a gateway for other VMs, instead of
hardcoded link-local address. This is important for sys-net generated
ICMP diagnostics packets - those must _not_ have link-local source
address, otherwise wouldn't be properly forwarded back to the right VM.
2018-04-03 00:20:06 +02:00
Marek Marczykowski-Górecki
f4be284331
vm/qubesvm: handle libvirt reporting domain already dead when killing
If domain die when trying to kill it, qubesd may loose a race and try to
kill it anyway. Handle libvirt exception in that case and conver it to
QubesVMNotStartedError - as it would be if qubesd would win the race.

Fixes QubesOS/qubes-issues#3755
2018-04-02 23:56:03 +02:00
Marek Marczykowski-Górecki
1e9bf18bcf
Typo fix 2018-04-02 23:24:30 +02:00
Marek Marczykowski-Górecki
e2b70306e5
Fix error message when using invalid VM as a template for DispVM
Don't crash (producing misleading error) when checking if
template_for_dispvms=True.

Fixes QubesOS/qubes-issues#3341
2018-03-05 23:47:33 +01:00
Marek Marczykowski-Górecki
7c4566ec14
vm/qubesvm: allow 'features-request' to have async handlers
Some handlers may want to call into other VMs (or even the one asking),
but vm.run() functions are coroutines, so needs to be called from
another coroutine. Allow for that.
Also fix typo in documentation.
2018-03-02 01:16:38 +01:00
Marek Marczykowski-Górecki
ba5d19e1b4
vm: provide better error message for VM startup timeout
"Cannot execute qrexec-daemon!" error is very misleading for a startup
timeout error, make it clearer. This rely on qrexec-daemon using
distinct exit code for timeout error, but even without that, include its
stderr in the error message.
2018-02-27 04:35:05 +01:00
Marek Marczykowski-Górecki
173e7e4250
vm: fix calling vm.detach_network() when really needed
*oldvalue* argument to property event handler is provided only when the
value was not default. Check vm.netvm directly to resolve also default.
2018-02-26 02:52:27 +01:00
Marek Marczykowski-Górecki
81aea0c2c7
Merge remote-tracking branch 'qubesos/pr/198'
* qubesos/pr/198:
  backup.py: add vmN/empty file if no other files to backup
  Allow include=None to be passed to admin.backup.Info
  Add include_in_backups property for AdminVM
  Use !auto_cleanup as DispVM include_in_backups default
2018-02-25 00:09:12 +01:00
Rusty Bird
dbaf60ca24
Add include_in_backups property for AdminVM 2018-02-23 21:29:15 +00:00
Rusty Bird
84ca0c6df5
Use !auto_cleanup as DispVM include_in_backups default 2018-02-23 21:29:15 +00:00
Marek Marczykowski-Górecki
716114f676
Merge remote-tracking branch 'qubesos/pr/197'
* qubesos/pr/197:
  Don't fire domain-stopped/-shutdown while VM is still Dying
2018-02-22 21:14:55 +01:00
Rusty Bird
f96fd70f76
Don't fire domain-stopped/-shutdown while VM is still Dying
Lots of code expects the VM to be Halted after receiving one of these
events, but it could also be Dying or Crashed. Get rid of the Dying case
at least, by waiting until the VM has transitioned out of it.

Fixes e.g. the following DispVM cleanup bug:

    $ qvm-create -C DispVM --prop auto_cleanup=True -l red dispvm
    $ qvm-start dispvm
    $ qvm-shutdown --wait dispvm  # this won't remove dispvm
    $ qvm-start dispvm
    $ qvm-kill dispvm  # but this will
2018-02-22 19:53:29 +00:00
Christopher Laprise
75d8c553f9
Fix is_running non-boolean 2018-02-20 22:30:47 -05:00
Marek Marczykowski-Górecki
b00bbb73e4
Merge remote-tracking branch 'qubesos/pr/190'
* qubesos/pr/190:
  Missed one test, adding default-user in assert for test test_621_qdb_vm_with_network in TC_90
  replaced underscore by dash and update test accordingly
  Updated assert content for test_620_qdb_standalone in TC_90_QubesVM
  Added the default_user property from the Qube to the qubesdb so it is available when starting X. This is the 1st part of a fix for issue https://github.com/QubesOS/qubes-issues/issues/2372
2018-02-14 01:29:08 +01:00
Marek Marczykowski-Górecki
e3d3e149f0
Fix too long line 2018-02-13 11:27:59 +01:00
Marek Marczykowski-Górecki
209af07fd0
Merge remote-tracking branch 'qubesos/pr/188'
* qubesos/pr/188:
  file-reflink, a storage driver optimized for CoW filesystems
  Make AppVM/DispVM root volume rw to avoid CoW-on-CoW
2018-02-13 05:20:52 +01:00
Marek Marczykowski-Górecki
717bc4ace3
vm/appvm: forbid changing template if the are children DispVMs
Changing AppVM's template will not update root volume reference in
DispVMs based on it. For now forbid such change.

Fixes QubesOS/qubes-issues#3576
2018-02-13 04:53:39 +01:00
Rusty Bird
7a75e1090d
Make AppVM/DispVM root volume rw to avoid CoW-on-CoW 2018-02-12 21:20:04 +00:00
Yassine Ilmi
a0d45aac9c
replaced underscore by dash and update test accordingly 2018-02-01 00:50:42 +00:00
Yassine Ilmi
1c3b412ef8
Added the default_user property from the Qube to the qubesdb so it is available when starting X. This is the 1st part of a fix for issue https://github.com/QubesOS/qubes-issues/issues/2372 2018-02-01 00:12:51 +00:00
Marek Marczykowski-Górecki
a6a7efc9a7
vm/mix/net: fix handling network detach/attach on VM startup
- catch both QubesException and libvirtError - do not kill starting VM
just because an error while connecting _other_ VMs to it
- try to detach network first (and do not abort on error) - if
libvirt/libxl will manage to cleanup stale interface this way, the
attach operation below may succeed.

Fixes QubesOS/qubes-issues#3163
2018-01-29 23:06:21 +01:00
Marek Marczykowski-Górecki
86026e364f
Fix starting PCI-having HVMs on early system boot and later
1. Make sure VMs are started after dom0 actual memory usage is reported
to qmemman, otherwise dom0 will hold 4GB, even if just a little over 1GB
is needed at that time.

2. Request only vm.memory MB from qmemman, instead of vm.maxmem. While
HVM with PCI devices indeed do not support populate-on-demand, this is
already handled in libvirt XML.

The later may often cause VM startup fail on systems with 8GB of memory,
because maxmem is 4GB there and with dom0 keeping the other 4GB (see
point 1) there is not enough memory to start any sych VM.

Fixes QubesOS/qubes-issues#3462
2018-01-29 22:57:32 +01:00
Marek Marczykowski-Górecki
eb846f6647
Merge remote-tracking branch 'qubesos/pr/187'
* qubesos/pr/187:
  Don't fail create/clone if /var/lib/qubes/TYPE/NAME/ exists
  Make 'qvm-volume revert' really use the latest revision
  Fix wrong mocks of Volume.revisions
2018-01-22 15:39:13 +01:00
Marek Marczykowski-Górecki
74eb3f3208
Merge remote-tracking branch 'qubesos/pr/185'
* qubesos/pr/185:
  vm: remove doc for non-existing event `monitor-layout-change`
  vm: include tag/feature name in event name
  events: add support for wildcard event handlers
2018-01-22 15:32:57 +01:00
Rusty Bird
4ae854fdaf
Don't fail create/clone if /var/lib/qubes/TYPE/NAME/ exists 2018-01-21 22:28:47 +00:00
Marek Marczykowski-Górecki
dce3b609b4
qubesvm: do not try to define libvirt object in offline mode
The idea is to not touch libvirt at all.
2018-01-18 17:36:37 +01:00
Marek Marczykowski-Górecki
7905783861
qubesvm: PVH minor improvements
- use capital letters in acronyms in documentation to match upstream
documentation.
- refuse to start a PVH with without kernel set - provide meaningful
error message
2018-01-16 21:42:20 +01:00
Marek Marczykowski-Górecki
4ff53879a0
vm/qubesvm: default to PVH unless PCI devices are assigned
Fixes QubesOS/qubes-issues#2185
2018-01-15 03:34:46 +01:00
Marek Marczykowski-Górecki
d9da747ab0
vm/qubesvm: expose 'start_time' property over Admin API
It is useful at least for Qubes Manager.
2018-01-12 05:34:46 +01:00
Marek Marczykowski-Górecki
85e80f2329
vm/qubesvm: revert backup_timestamp to '%s' format
Human readable format `str(datetime.datetime)` is a nightmare for Admin
API level communication. Especially setting the property in a format
that it was read was not supported, and handling such format in
untrusted input handling code is a bad idea. Revert to a simple intiger
format.
2018-01-12 05:34:45 +01:00
Marek Marczykowski-Górecki
f0fe02998b
vm: remove doc for non-existing event monitor-layout-change 2018-01-06 15:10:54 +01:00
Marek Marczykowski-Górecki
50d34755fa
vm: include tag/feature name in event name
Rename events:
 - domain-feature-set -> domain-feature-set:feature
 - domain-feature-delete -> domain-feature-delete:feature
 - domain-tag-add -> domain-tag-add:tag
 - domain-tag-delete -> domain-tag-delete:tag

Make it consistent with property-* events. It makes more sense to
include tag/feature name in event name, so handler can watch a single
tag/feature - which is the most common case. Otherwise, most handlers
would begin with `if feature == '...'` anyway, wasting time on most
events.

In cases where multiple features/tags should be handled by a single
handler, it is now possible to register a handler with wildcard, for
example `domain-feature-set:*`.
2018-01-06 15:05:34 +01:00
Marek Marczykowski-Górecki
32c6083e1c
Make pylint happy
Fix thing detected by updated pylint in Travis-CI
2017-12-21 18:19:10 +01:00