Commit Graph

1818 Commits

Author SHA1 Message Date
Marek Marczykowski-Górecki
51af2ed27c
vm: add domain-shutdown-failed event
Similar to domain-start-failed, add an event fired when
domain-pre-shutdown was fired but the actual operation failed.
Note it might not catch all the cases, as shutdown() may be called with
wait=False, which means it won't wait fot the actual shutdown. In that
case, timeout won't result in domain-shutdown-failed event.

QubesOS/qubes-issues#5380
2019-10-08 23:35:24 +02:00
Marek Marczykowski-Górecki
0300e895f2
Fix domain-start-failed event documentation. 2019-10-08 23:35:18 +02:00
Marek Marczykowski-Górecki
cc1ac0b859
tests: log stderr of paplay command 2019-09-29 06:43:34 +02:00
Marek Marczykowski-Górecki
7def96c248
tests: register syslog logger, log test start
Move this functionality from our custom runner (qubes.tests.run),
into base test class. This is very useful for correlating logs, so lets
have it with nose2 runner too.
2019-09-29 06:43:34 +02:00
Marek Marczykowski-Górecki
e0e0c7eaf9
tests: remove VM under startup_lock
Prevent starting a VM while it's being removed. Something could try to
start a VM just after it's being killer but before removing it (Whonix
example from previous commit is real-life case). The window specifically
is between kill() call and removing it from collection
(`del app.domains[vm.qid]`). Grab a startup_lock for the whole operation
to prevent it.
2019-09-29 06:14:21 +02:00
Marek Marczykowski-Górecki
732231efb0
vm: refuse to start a VM not present in a collection
Check early (but after grabbing a startup_lock) if VM isn't just
removed. This could happen if someone grabs its reference from other
places (netvm of something else?) or just before removing it.
This commit makes the simple removal from the collection (done as the
first step in admin.vm.Remove implementation) efficient way to block
further VM startups, without introducing extra properties.

For this to be effective, removing from the collection, needs to happen
with the startup_lock held. Modify admin.vm.Remove accordingly.
2019-09-29 06:14:21 +02:00
Marek Marczykowski-Górecki
6cfda328bf
tests: add workaround for Whonix re-starting VMs
Workaround for https://phabricator.whonix.org/T930
For now, unregister all the VMs to be killed manually.
2019-09-29 06:14:21 +02:00
Marek Marczykowski-Górecki
34e2f3a322
tests: fix sorting kernel version
Debian now has 4.9 and 4.19 kernels installed, so `sort -n` sorts them
wrong.
2019-09-27 16:29:21 +02:00
Marek Marczykowski-Górecki
5fa75d73be
Make pylint happy 2019-09-27 16:29:20 +02:00
Marek Marczykowski-Górecki
eb39f69882
api: improve handling destination removed just before the call
There are cases when destination domain doesn't exist when the call gets
to qubesd. Namely:
 1. The call comes from dom0, which bypasses qrexec policy
 2. Domain was removed between checking the policy and here
Handle the the same way as if the domain wouldn't exist at policy
evaluation stage either - i.e. refuse the call.

On the client side it doesn't change much, but on the server call it
avoids ugly, useless tracebacks in system journal.

Fixes QubesOS/qubes-issues#5105
2019-09-27 16:29:20 +02:00
Marek Marczykowski-Górecki
8ecf00bd0e
tests: add helpful decorator to wait before test cleanup
Allow to manual inspect test environment after test fails. This is
similar to --do-not-clean option we had in R3.2.

The decorator should be used only while debugging and should never be
applied to the code committed into repository.
2019-09-27 16:29:19 +02:00
Marek Marczykowski-Górecki
4fa45dbc91
tests: fix (hopefully) too short delay in 201_shutdown_event_race test
Domain shutdown handling may take extended amount of time, especially on
slow machine (all the LVM teardown etc). Take care of it by
synchronizing using vm.startup_lock, instead of increasing constant
delay. This way, the shutdown event handler needs to be started within
3s, not finish in this time.
2019-09-27 16:29:19 +02:00
Marek Marczykowski-Górecki
dfa0626cea
tests: check event handler re-registration after libvirt restart
QubesOS/qubes-issues#5303
2019-09-27 16:29:19 +02:00
Marek Marczykowski-Górecki
8e36a2ac61
app: re-register event handler after libvirt daemon restart
When libvirt daemon is restarted, qubesd attempt to re-connect to the
new instance transparently (through virConnect object wrapper). But the
code lacked re-registering event handlers.
Fix this by adding reconnect callback argument to virConnectWrappper, to
be called after new connection is established. This callback will
additionally get old connection as an argument, if any cleanup is
needed. The old connection is closed just after callback returns.

Use this to re-register event handler, but also unregister old handler
first. While full unregister wont work on since old libvirt daemon
instance is dead already, it will still cleanup client structures.

Since the old libvirt connection is closed now, adjust also domain
reconnection logic, to handle stale connection object. In that case
isAlive() call throws an exception.

Fixes QubesOS/qubes-issues#5303
2019-09-26 01:57:59 +02:00
Marek Marczykowski-Górecki
273238bd2a
tests: fix qrexec abort test
Exit code depends on when exactly the other end was terminated.
2019-09-22 23:30:38 +02:00
Marek Marczykowski-Górecki
784fd966a2
Merge branch 'devel20190819'
* devel20190819:
  tests: make libvirt mockup more robust
  tests: update for not needing custom kernel modules anymore
2019-09-21 02:20:00 +02:00
Marta Marczykowska-Górecka
9d20877a43
Fixed unexpected error on empty default template and incorrect error handling
Qubesd wrongly required default_template global property to be not None.
Furthermore, even without hard failure set, require_property method
raised an exception in case of a property having incorrect None value.
It now logs an error message instead, as designed.

fixes QubesOS/qubes-issues#5326
2019-09-19 20:20:36 +02:00
Marek Marczykowski-Górecki
c5aaf3abd9
tests: make libvirt mockup more robust
If not in offline_mode, return actual mock for libvirt connection object
instead of always raising exception.
2019-09-10 03:34:11 +02:00
Marek Marczykowski-Górecki
05e48748d2
tests: update for not needing custom kernel modules anymore
kernel-devel package isn't needed in VMs anymore.
2019-09-10 03:33:42 +02:00
Marek Marczykowski-Górecki
5d5f102378
Merge remote-tracking branch 'origin/pr/277'
* origin/pr/277:
  admin: add admin.deviceclass.List
  admin: replace single quote to double for docstring
2019-08-08 14:05:00 +02:00
Marek Marczykowski-Górecki
32b0d7df43
Merge remote-tracking branch 'origin/pr/275'
* origin/pr/275:
  api/stats: improve cpu_usage normalization, add cpu_usage_raw
  Use 128k blocks when importing volume data
2019-08-08 14:04:27 +02:00
ejose19
2602df5e19 Fix invalid timezone
This fixes an invalid response generated by get_timezone when the time zones are composed by 3 parts, for example:

America/Argentina/Buenos_Aires
America/Indiana/Indianapolis

Update utils.py
2019-08-06 18:23:00 -03:00
Frédéric Pierret (fepitre)
7ff01b631d
admin: add admin.deviceclass.List
QubesOS/qubes-issues#5213
2019-08-06 11:01:02 +02:00
Frédéric Pierret (fepitre)
7f670614c8
admin: replace single quote to double for docstring 2019-08-06 10:04:30 +02:00
Marek Marczykowski-Górecki
39d64eabc8
api/stats: improve cpu_usage normalization, add cpu_usage_raw
Give raw cpu_time value, instead of normalized one (to number of vcpus),
as documented.
Move the normalization to cpu_usage calculation. At the same time, add
cpu_usage_raw without it, if anyone needs it.

QubesOS/qubes-issues#4531
2019-08-01 04:51:05 +02:00
Brendan Hoar
a5378b5958
Update lvm.py 2019-07-31 19:40:33 -04:00
Marek Marczykowski-Górecki
205f6215b2
Merge remote-tracking branch 'origin/pr/271'
* origin/pr/271:
  Fix too long line
  Prevent 'qubesd' for crashing if any device backend is not available
2019-07-31 17:23:20 +02:00
Marek Marczykowski-Górecki
9e226ab5cf
Merge remote-tracking branch 'origin/pr/273'
* origin/pr/273:
  tests: check importing empty data into ReflinkVolume
  tests: check importing empty data into ThinVolume
  tests: check importing empty data into FileVolume
  tests: improve cleanup after LVM tests
2019-07-31 15:37:13 +02:00
Marek Marczykowski-Górecki
6c8bdfaa4b
Fix too long line 2019-07-31 15:36:03 +02:00
Marek Marczykowski-Górecki
afb0de43d4
tests: check importing empty data into ReflinkVolume
Verify if it really discards old content.

QubesOS/qubes-issues#5192
2019-07-28 22:08:52 +02:00
Marek Marczykowski-Górecki
19186f7840
tests: check importing empty data into ThinVolume
Verify if it really discards old content.

QubesOS/qubes-issues#5192
2019-07-28 22:08:37 +02:00
Marek Marczykowski-Górecki
790c2ad8cb
tests: check importing empty data into FileVolume
Verify if it really discards old content.

QubesOS/qubes-issues#5192
2019-07-28 22:06:30 +02:00
Marek Marczykowski-Górecki
8414d0153f
tests: improve cleanup after LVM tests
Remove test volumes - this way if a test fails, subsequent tests have a
chance to succeed.
2019-07-28 21:41:04 +02:00
Frédéric Pierret (fepitre)
74f8d4531e
Prevent 'qubesd' for crashing if any device backend is not available 2019-07-20 15:27:09 +02:00
Rusty Bird
aa931fdbf7
vm/qubesvm: call storage.remove() *before* removing dir_path
Give the varlibqubes storage pool, which places volume images inside the
VM directory, a chance to remove them (and e.g. to log that).
2019-06-28 10:29:34 +00:00
Rusty Bird
cfa1c7171e
storage: fix typo 2019-06-28 10:29:32 +00:00
Rusty Bird
6e592a56d7
storage/reflink: update comment and whitespace 2019-06-28 10:29:31 +00:00
Rusty Bird
1fe2aff643
storage/reflink: _fsync_path(path_from) in _commit()
During regular VM shutdown, the VM should sync() anyway. (And
admin.vm.volume.Import does fdatasync(), which is also fine.) But let's
be extra careful.
2019-06-28 10:29:30 +00:00
Rusty Bird
df23720e4e
storage/reflink: _fsync_dir() -> _fsync_path()
Make it work when passed a file, too.
2019-06-28 10:29:29 +00:00
Rusty Bird
35b4b915e7
storage/reflink: coroutinize pool setup() 2019-06-28 10:29:27 +00:00
Rusty Bird
1d89acf698
app: setup_pools() must be a coroutine
This is needed as a consequence of d8b6d3ef ("Make add_pool/remove_pool
coroutines, allow Pool.{setup,destroy} as coroutines"), but there hasn't
been any problem so far because no storage driver implemented pool
setup() as a coroutine.
2019-06-28 10:29:26 +00:00
Rusty Bird
5b52d23478
factor out utils.void_coros_maybe() 2019-06-28 10:29:25 +00:00
Rusty Bird
fe97a15d11
factor out utils.coro_maybe() 2019-06-28 10:29:24 +00:00
Marek Marczykowski-Górecki
883d324b6c
Revert "tests: do not use lazy unmount"
Revert to use umount -l in storage tests cleanup. With fixed permissions
in 4234fe51 "tests: fix cleanup after reflink tests", it shouldn't cause
issues anymore, but apparently on some systems test cleanup fails
otherwise.

Reported by @rustybird
This reverts commit b6f77ebfa1.
2019-06-25 05:51:33 +02:00
Rusty Bird
66e44e67de
api/admin: always save after self.dest.storage.resize() 2019-06-23 12:48:01 +00:00
Rusty Bird
2b4b45ead8
storage/reflink: preferably get volume size from image size
There were (at least) five ways for the volume's nominal size and the
volume image file's actual size to desynchronize:

- loading a stale qubes.xml if a crash happened right after resizing the
  image but before saving the updated qubes.xml (-> previously fixed)
- restarting a snap_on_start volume after resizing the volume or its
  source volume (-> previously fixed)
- reverting to a differently sized revision
- importing a volume
- user tinkering with image files

Rather than trying to fix these one by one and hoping that there aren't
any others, override the volume size getter itself to always update from
the image file size. (If the getter is called though the storage API, it
takes the volume lock to avoid clobbering the nominal size when resize()
is running concurrently.)
2019-06-23 12:48:00 +00:00
Rusty Bird
de159a1e1e
storage/reflink: don't run verify() under volume lock 2019-06-23 12:47:59 +00:00
Rusty Bird
4c9c0a88d5
storage/reflink: split @_unblock into @_coroutinized @_locked
And change the volume lock from an asyncio.Lock to a threading.Lock -
locking is now handled before coroutinization.

This will allow the coroutinized resize() and a new *not* coroutinized
size() getter from one of the next commits ("storage/reflink: preferably
get volume size from image size") to both run under the volume lock.
2019-06-23 12:47:58 +00:00
Rusty Bird
ab929baa38
Revert "storage/reflink: update volume size in __init__()"
Fixed more generally in one of the next commits ("storage/reflink:
preferably get volume size from image size").

This reverts commit 9f5d05bfde.
2019-06-23 12:47:57 +00:00
Rusty Bird
58a7e0f158
Revert "storage/reflink: update snap_on_start volume size in start()"
Fixed more generally in one of the next commits ("storage/reflink:
preferably get volume size from image size").

This reverts commit c43df968d5.
2019-06-23 12:47:56 +00:00