Registering event implementation in libvirt and then not calling it is
harmful, because libvirt expects it working. Known drawbacks:
- keep-alives are advertised as supported but not really sent (cause
dropping connections)
- connections are not closed (sockets remains open, effectively leaking
file descriptors)
So call libvirt.virEventRegisterDefaultImpl only when it will be really
used (libvirt.virEventRunDefaultImpl called), which means calling it in
QubesWatch. Registering events implementation have effect only on new
libvirt connections, so start a new one for QubesWatch.
FixesQubesOS/qubes-issues#1380
QubesVM.start() first creates domain as paused, completes its setup
(including starting qubesdb-daemon and creating appropriate entries),
then resumes the domain. So wait for that resume to be sure that
`qubesdb-daemon` is already running and populated.
QubesOS/qubes-issues#1110
QubesWatch._register_watches is called from libvirt event callback,
asynchronously to qvm-start. This means that `qubesdb-daemon` may
not be running or populated yet.
If first QubesDB connection (or watch registration) fails, schedule next
try using timers in libvirt event API (as it is base of QubesWatch
mainloop), instead of some sleep loop. This way other events will be
processed in the meantime.
QubesOS/qubes-issues#1110
Since libvirt do not support such events (at least for libxl driver), we
need some way to notify qubes-manager when device is attached/detached.
Use the same protocol as for connect/disconnect but on the target
domain.
Define it only when really needed:
- during VM creation - to generate UUID
- just before VM startup
As a consequence we must handle possible exception when accessing
vm.libvirt_domain. It would be a good idea to make this field private in
the future. It isn't possible for now because block_* are external for
QubesVm class.
This hopefully fixes race condition when Qubes Manager tries to access
libvirt_domain (using some QubesVm.*) at the same time as other tool is
removing the domain. Additionally if Qubes Manage would loose that race, it could
define the domain again leaving some unused libvirt domain (blocking
that domain name for future use).
Provide vm.refresh(), which will force to reconnect do QubesDB daemon,
and also get new libvirt object (including new ID, if any). Use this
method whenever QubesDB call returns DisconnectedError exception. Also
raise that exception when someone is trying to talk to not running
QubesDB - instead of returning None.
This makes easier to import right objects in submodules (only one
object). This also implement lazy connection - at first access, not at
module import, which speeds up tools, which doesn't need runtime
information (like qvm-prefs or qvm-service). In the future this will
ease migration from xenstore to QubesDB.
Also implement "offline mode" - operate on qubes.xml without connecting
to VMM - raise exception at such try.
This is needed to run tools during installation, where only minimal
set of services are started, especially no libvirt.
loop device parsing should have "dXpY_style = True" in order to
correctly parse partitions on loop devices.
Reasoning:
==========
Using losetup to create a virtual SD card disk into a loop device and
creating partitions for it results in new devices within an AppVM that
look like: /dev/loop0p1 /dev/loop0p2 and so on.
However as soon as they are created, Qubes Manager rises an exception
and becomes blocked with the following message (redacted):
"QubesException: Invalid device name: loop0p1
at line 639 of file /usr/lib64/python2.7/site-
packages/qubesmanager/main.py
Details:
line: raise QubesException....
func: block_name_to_majorminor
line no.: 181
file: ....../qubes/qubesutils.py
Simply get device major-minor from /dev/ device file.
This is only partial solution, because this will work only for dom0
devices, but the same problem can apply to VM.
Also some major cleanups: Reduce some more code duplication
(verify_hmac, simplify backup_restore_prepare). Rename
backup_dir/backup_tmpdir variables to better match its purpose. Rename
backup_do_copy back to backup_do. Require QubesVm object (instead of VM
name) as appvm param.
This way backup process won't need more than 1GB for temporary files and
also will give more precise progress information. For now it looks like
the slowest element is qrexec, so without such limit, all the data would
be prepared (basically making second copy of it in dom0) while only
first few files would be transfered to the VM.
Also backup progress is calculated based on preparation thread, so when
it finishes there is some other time needed to flush all the data to the
VM. Limiting this amount makes progress somehow more accurate (but still
off by 1GB...).
We can't wait for tar next volume prompt using stderr.readline(),
because tar don't output EOL marker after this prompt. The other way
would be switching file descriptor to non-blocking mode and using lower
level os.read(), but this looks like more error-prone way (races...).
So change idea of handling such archives: after switching to next
archive volume, simply send '\n' to tar (which will receive when
needed). When getting "*.000" file, assume that previous archive was
over and wait for previous tar process. Then start the new one.
Also don't give explicit tape length, only turn multi-volume mode on. So
will correctly handle all multi-volume archives, regardless of its size.
This is mostly revert of "3d1b40f backups: keep file without path in
inner tar archive" in terms of archive format, but the code is more
robust than old one. Especially reuse already computed dir paths. Also
restore only requested files (based on selected VMs and its qubes.xml
data). Change the restore workflow to restore files first to temporary
directory, then move to final dirs. This approach:
- will be compatible with hashed vm name in the archive path
- is required to handle dom0 home backup (directory outside of
/var/lib/qubes)
- it should be also more defensive - make any changes in /var/lib/qubes
only after successful extraction of files and creating Qubes*Vm object
Second change in this commit is implement of dom0 home backup/restore.
As qubes.xml now contains data about dom0, we have information whether
it is included in the backup (before getting actual files).
Ensure that outer tar/encryptor gets all the data *and EOF* before
signalling inner tar to continue. Previously it could happen that inner
tar begins to write next data chunk, while qvm-backup still holds
previous data chunk open.
It is senseless to have full file path in multiple locations:
- external archive
- qubes.xml
- internal archive
Also it is more logical to have only "private.img" file in archive
placed in "appvms/untrusted/private.img.000". Although this is rather
cosmetic change for VMs data, it is required to backup arbitrary
directory, like dom0 user home.
Also use os.path.* instead of manual string operations (split,
partition). It is more foolproof.