Commit Graph

5460 Commits

Author SHA1 Message Date
Marek Marczykowski-Górecki
cf8b6219a9
tests: make use of vm.shutdown(wait=True) 2018-10-27 01:43:00 +02:00
Marek Marczykowski-Górecki
2c1629da04
vm: call after-shutdown cleanup also from vm.kill and vm.shutdown
Cleaning up after domain shutdown (domain-stopped and domain-shutdown
events) relies on libvirt events which may be unreliable in some cases
(events may be processed with some delay, of if libvirt was restarted in
the meantime, may not happen at all). So, instead of ensuring only
proper ordering between shutdown cleanup and next startup, also trigger
the cleanup when we know for sure domain isn't running:
 - at vm.kill() - after libvirt confirms domain was destroyed
 - at vm.shutdown(wait=True) - after successful shutdown
 - at vm.remove_from_disk() - after ensuring it isn't running but just
 before actually removing it

This fixes various race conditions:
 - qvm-kill && qvm-remove: remove could happen before shutdown cleanup
 was done and storage driver would be confused about that
 - qvm-shutdown --wait && qvm-clone: clone could happen before new content was
 commited to the original volume, making the copy of previous VM state
(and probably more)

Previously it wasn't such a big issue on default configuration, because
LVM driver was fully synchronous, effectively blocking the whole qubesd
for the time the cleanup happened.

To avoid code duplication, factor out _ensure_shutdown_handled function
calling actual cleanup (and possibly canceling one called with libvirt
event). Note that now, "Duplicated stopped event from libvirt received!"
warning may happen in normal circumstances, not only because of some
bug.

It is very important that post-shutdown cleanup happen when domain is
not running. To ensure that, take startup_lock and under it 1) ensure
its halted and only then 2) execute the cleanup. This isn't necessary
when removing it from disk, because its already removed from the
collection at that time, which also avoids other calls to it (see also
"vm/dispvm: fix DispVM cleanup" commit).
Actually, taking the startup_lock in remove_from_disk function would
cause a deadlock in DispVM auto cleanup code:
 - vm.kill (or other trigger for the cleanup)
   - vm.startup_lock acquire   <====
     - vm._ensure_shutdown_handled
       - domain-shutdown event
         - vm._auto_cleanup (in DispVM class)
           - vm.remove_from_disk
             - cannot take vm.startup_lock again
2018-10-26 23:54:08 +02:00
Marek Marczykowski-Górecki
5be003d539
vm/dispvm: fix DispVM cleanup
First unregister the domain from collection, and only then call
remove_from_disk(). Removing it from collection prevent further calls
being made to it. Or if anything else keep a reference to it (for
example as a netvm), then abort the operation.

Additionally this makes it unnecessary to take startup lock when
cleaning it up in tests.
2018-10-26 23:54:08 +02:00
Marek Marczykowski-Górecki
e1f65bdf7b
vm: add shutdown_timeout property, make vm.shutdown(wait=True) use it
vm.shutdown(wait=True) waited indefinitely for the shutdown, which makes
useless without some boilerplate handling the timeout. Since the timeout
may depend on the operating system inside, add a per-VM property for it,
with value inheritance from template and then from global
default_shutdown_timeout property.

When timeout is reached, the method raises exception - whether to kill
it or not is left to the caller.

Fixes QubesOS/qubes-issues#1696
2018-10-26 23:54:04 +02:00
Marek Marczykowski-Górecki
b65fdf9700
storage: convert lvm driver to async version
LVM operations can take significant amount of time. This is especially
visible when stopping a VM (`vm.storage.stop()`) - in that time the
whole qubesd freeze for about 2 seconds.

Fix this by making all the ThinVolume methods a coroutines (where
supported). Each public coroutine is also wrapped with locking on
volume._lock to avoid concurrency-related problems.
This all also require changing internal helper functions to
coroutines. There are two functions that still needs to be called from
non-coroutine call sites:
 - init_cache/reset_cache (initial cache fill, ThinPool.setup())
 - qubes_lvm (ThinVolume.export()

So, those two functions need to live in two variants. Extract its common
code to separate functions to reduce code duplications.

Fixes QubesOS/qubes-issues#4283
2018-10-23 16:53:35 +02:00
Marek Marczykowski-Górecki
299c514647
tests: fix asyncio usage in storage_lvm.TC_01_ThinPool
Both vm.create_on_disk() and vm.start() are coroutines. Tests in this
class didn't run them, so basically didn't test anything.

Wrap couroutine calls with self.loop.run_until_complete().

Additionally, don't fail if LVM pool is named differently.
In that case, the test is rather sily, as it probably use the same pool
for source and destination (operation already tested elsewhere). But it
isn't a reason for failing the test.
2018-10-23 16:53:35 +02:00
Marek Marczykowski-Górecki
6170edb291
storage: allow import_data and import_data_end be coroutines
On some storage pools this operation can also be time consuming - for
example require creating temporary volume, and volume.create() already
can be a coroutine.
This is also requirement for making common code used by start()/create()
etc be a coroutine, otherwise neither of them can be and will block
other operations.

Related to QubesOS/qubes-issues#4283
2018-10-23 16:53:35 +02:00
Marek Marczykowski-Górecki
295705a708
doc: document features, qvm-features-request and services
Fixes QubesOS/qubes-issues#2829
2018-10-23 16:53:35 +02:00
Marek Marczykowski-Górecki
d1f5cb5d15
ext/services: mechanism for advertising supported services
Support 'supported-service.*' features requests coming from VMs. Set
such features directly (allow only value '1') and remove any not
reported in given call. This way uninstalling package providing given
service will automatically remove related 'supported-service...'
feature.

Fixes QubesOS/qubes-issues#4402
2018-10-23 16:47:39 +02:00
Marek Marczykowski-Górecki
58bcec2a64
qubesvm: improve error message about same-pool requirement
Make it clear that volume creation fails because it needs to be in the
same pool as its parent. This message is shown in context of `qvm-create
-p root=MyPool` for example and the previous message didn't make sense
at all.

Fixes QubesOS/qubes-issues#3438
2018-10-18 00:03:05 +02:00
Marek Marczykowski-Górecki
ba210c41ee
qubesvm: don't crash VM creation if icon symlink already exists
It can be leftover from previous failed attempt. Don't crash on it, and
replace it instead.

QubesOS/qubes-issues#3438
2018-10-18 00:01:45 +02:00
Marek Marczykowski-Górecki
c01ae06fee
tests: add basic ServicesExtension tests 2018-10-17 17:37:02 +02:00
Marek Marczykowski-Górecki
133219f6d3
Do not generate R3 compat firewall rules if R4 format is supported
R3 format had limitation of ~40 rules per VM. Do not generate compat
rules (possibly hitting that limitation) if new format, free of that
limitation is supported.

Fixes QubesOS/qubes-issues#1570
Fixes QubesOS/qubes-issues#4228
2018-10-15 06:05:05 +02:00
Marek Marczykowski-Górecki
e8dc6cb916
tests: use smaller root.img in backupcompatibility tests
1GB image easily exceed available space on openQA instances. Use 100MB
instead.
2018-10-15 05:24:24 +02:00
Marek Marczykowski-Górecki
fd9f2e2a6c
tests: type commands into specific found window
Make sure events are sent to specific window found with xdotool search,
not the one having the focus. In case of Whonix, it can be first
connection wizard or whonixcheck report.
2018-10-15 05:18:42 +02:00
Marek Marczykowski-Górecki
9e81087b25
tests: use improved wait_for_window in various tests
Replace manual `xdotool search calls` with wait_for_window(), where
compatible.
2018-10-15 05:16:29 +02:00
Marek Marczykowski-Górecki
f1621c01e9
tests: add search based on window class to wait_for_window
Searching based on class is used in many tests, searching by class, not
only by name in wait_for_window will allow to reduce code duplication.
While at it, improve it additionally:
 - avoid active waiting for window and use `xdotool search --sync` instead
 - return found window id
 - add wait_for_window_coro() for use where coroutine is needed
 - when waiting for window to disappear, check window id once and wait
   for that particular window to disappear (avoid xdotool race
   conditions on window enumeration)

Besides reducing code duplication, this also move various xdotool
imperfections handling into one place.
2018-10-15 05:08:25 +02:00
Marek Marczykowski-Górecki
3f5618dbb0
tests/integ/network: make the tests independend of default netvm
Network tests create own temporary netvm. Make it disconnected from the
real netvm, to not interefere in tests.
2018-10-15 00:39:32 +02:00
Marek Marczykowski-Górecki
15140255d5
tests/integ/network: few more code style improvement
Remove unused imports and unused variables, add some more docstrings.
2018-10-15 00:38:38 +02:00
Marek Marczykowski-Górecki
375688837c
tests/integ/network: add type annotations
Make PyCharm understand what mixin those objects are for.
2018-10-15 00:30:17 +02:00
Marek Marczykowski-Górecki
fc3b28608e
version 4.0.32 2018-10-14 06:05:04 +02:00
Marek Marczykowski-Górecki
3b6703f2bd
tests: fix race condition in gui_memory_pinning test
Don't rely on top update timing, pause it updates for taking
screenshots.
2018-10-14 05:48:25 +02:00
Marek Marczykowski-Górecki
9887b925b4
tests: increase timeout for vm shutdown
start_standalone_with_cdrom_vm test essence is somewhere else, let not
fail it for just slow shutdown (LVM cleanup etc).
2018-10-14 05:29:05 +02:00
Marek Marczykowski-Górecki
29a26e7d69
tests: make timeout it shutdown test even longer
Reduce false positives when testing on busy machine.
2018-10-14 03:32:17 +02:00
Marek Marczykowski-Górecki
3e28ccefde
tests: fix cleanup after backup compatibility tests
Allow removing VMs based on multiple prefixes at once. Removing them
separately doesn't handle all the dependencies (default_netvm, netvm)
correctly. This is needed for backup compatibility tests, where VMs are
created with `test-` prefix and `disp-tests-`. Additionally backup code
will create `disp-no-netvm`, which also may need to be removed.
2018-10-14 03:29:30 +02:00
Marek Marczykowski-Górecki
00c0b4c69f
tests: cleanup tracebacks also for expectedFailure exception
Continuation of 5bc0baea "tests: do not leak objects in object leaks
checking function".
2018-10-12 14:41:35 +02:00
Marek Marczykowski-Górecki
c45ce78ee4
version 4.0.31 2018-10-10 02:02:27 +02:00
Marek Marczykowski-Górecki
d23636fa02
tests: migrate qvm-block tests to core3 2018-10-10 00:44:15 +02:00
Marek Marczykowski-Górecki
399d2138bd
tests: remove old hardware.py tests, migrated to devices_pci.py 2018-10-10 00:08:57 +02:00
Marek Marczykowski-Górecki
1245606453
tests: fix cleanup of dom0_update tests
Reset updatevm to None before removing VMs, otherwise removing updatevm
will fail.
2018-10-10 00:08:57 +02:00
Marek Marczykowski-Górecki
8dab298b89
tests: create testcases on module import if environment variable is set
If QUBES_TEST_TEMPLATES or QUBES_TEST_LOAD_ALL is set, create testcases
on modules import, instead of waiting until `load_tests` is called.
The `QUBES_TEST_TEMPLATES` doesn't require `qubes.xml` access, so it
should be safe to do regardless of the environment. The
`QUBES_TEST_LOAD_ALL` force loading tests (and reading `qubes.xml`)
regardless.

This is useful for test runners not supporting load_tests protocol. Or
with limited support - for example both default `unittest` runner and
`nose2` can either use load_tests protocol _or_ select individual tests.
Setting any of those variable allow to run a single test with those
runners.

With this feature used together load_tests protocol, tests could be
registered twice. Avoid this by not listing already defined test classes
in create_testcases_for_templates (according to load_tests protocol,
those should already be registered).
2018-10-10 00:08:29 +02:00
Marek Marczykowski-Górecki
c8929cfee9
tests: improve handling backups in core3 2018-10-07 19:51:55 +02:00
Marek Marczykowski-Górecki
7a607e3731
tests: add QUBES_TEST_TEMPLATES env variable
Allow easily list templates to be tested, without enumerating all the
test classes. This is especially useful with nose2 runner which can't
use load tests protocol _and_ select subset of tests.
2018-10-07 19:51:55 +02:00
Marek Marczykowski-Górecki
35c66987ab
tests: improve clearing tracebacks from Qubes* objects
Clear also tracebacks of chained exceptions.
2018-10-07 15:46:52 +02:00
Marek Marczykowski-Górecki
f33eca1e3f
tests: remove old 'hvm' test file
Those cases are already covered with basic core3 tests or are irrelevant
now.
2018-10-07 15:46:52 +02:00
Marek Marczykowski-Górecki
7c91e82365
tests: handle KWrite editor in DispVM tests 2018-10-07 15:46:52 +02:00
Marek Marczykowski-Górecki
02f9661169
tests: migrate mime handlers test to core3 2018-10-07 15:46:52 +02:00
Marek Marczykowski-Górecki
5bc0baeafa
tests: do not leak objects in object leaks checking function
If any object is leaked, QubesTestCase.cleanup_gc() raises an exception,
which have leaked objects list referenced in its traceback. This happens
after cleanup_traceback(), so isn't cleaned, causing cleanup_gc() fail
for all the further tests in the same test run.

Avoid this, by dropping list just before checking if any object is
leaked.
2018-09-29 02:40:43 +02:00
Marek Marczykowski-Górecki
b88fa39894
Fix mock-based build 2018-09-29 02:40:28 +02:00
Marek Marczykowski-Górecki
8b75b264ae
Merge remote-tracking branch 'origin/pr/235'
* origin/pr/235:
  Add default_qrexec_timeout to qubes-prefs
2018-09-19 06:03:10 +02:00
Marek Marczykowski-Górecki
1efac373dd
version 4.0.30 2018-09-19 05:46:07 +02:00
Marek Marczykowski-Górecki
b2387389d0
Update documentation for device-attach event 2018-09-19 05:44:02 +02:00
Rusty Bird
bee69a98b9
Add default_qrexec_timeout to qubes-prefs
When a VM (or its template) does not explicitly set a qrexec_timeout,
fall back to a global default_qrexec_timeout (with default value 60),
instead of hardcoding the fallback value to 60.

This makes it easy to set a higher timeout for the whole system, which
helps users who habitually launch applications from several (not yet
started) VMs at the same time. 60 seconds can be too short for that.
2018-09-16 18:42:48 +00:00
Marek Marczykowski-Górecki
eed2076722
Merge branch 'tests-20180913'
* tests-20180913:
  tests: fix time sync test
  tests: wait for DispVM's qubes.VMShell exit
  tests: exclude whonixcheck and NetworkManager from editor window search
  tests: reenable some qrexec tests, convert them to py3k/asyncio
  tests: skip tests not relevant on Whonix
  tests: improve shutdown timeout handling
  tests: drop qvm-prefs tests
2018-09-16 05:24:50 +02:00
Marek Marczykowski-Górecki
7feed2f680
Handle qubes.skip_autostart option on kernel command line
This allows to prevent automatically starting VMs at boot, mostly for
troubleshooting.

Fixes QubesOS/qubes-issues#4312
2018-09-16 05:22:30 +02:00
Marek Marczykowski-Górecki
e26655bc82
tests: fix time sync test
qvm-sync-clock no longer fetches time from the network, by design.
So, lets not break clockvm's time and check only if everything else
correctly synchronize with it.
2018-09-16 04:43:50 +02:00
Marek Marczykowski-Górecki
c4a84b3298
tests: wait for DispVM's qubes.VMShell exit
It isn't enough to wait for window to disappear, the service may still
be running. And if it is, test cleanup logic will complain about FD
leak.
To avoid deadlock on some test failure, do it with a timeout.
2018-09-16 04:43:50 +02:00
Marek Marczykowski-Górecki
240b1dd75e
tests: exclude whonixcheck and NetworkManager from editor window search
Those may pop up before actual editor is found, which fails the test as
it can't handle such "editor".
2018-09-16 04:43:50 +02:00
Marek Marczykowski-Górecki
ac8b8a3ad4
tests: reenable some qrexec tests, convert them to py3k/asyncio 2018-09-16 04:43:50 +02:00
Marek Marczykowski-Górecki
576bcb158e
tests: skip tests not relevant on Whonix 2018-09-15 05:12:41 +02:00