For a long time Qubes backup did not include symlinked files, which
apparently is quite common practice for users with multiple disks (for
example HDD + SSD). It is covered in documentation
(https://www.qubes-os.org/doc/secondary-storage/), but better solution
would be to simply include symlinked files.
Restore of such files would (of course) not preserve the symlinks -
normal files will be restored instead. But that's fine. If the user want
to move the data to another location, he/she can do that and restore the
symlink.
The only possible breakage from this change is having a copy (instead of
symlink) to a VM icon. But storing that symlink in a backup was broken
for some time (because of --xform usage) and it is handled during
restore, so not a real problem.
This doesn't cover all the problems with symlinked VM images - the other
one is qvm-block behaviour, which would treat such images as non-system
disks, so easily detachable (which would break VM operation). But that's
another story.
FixesQubesOS/qubes-issues#1384
In most cases it would be some leftover after failed restore, or even
the reason why the user is restoring a VM in the first place. Move it to
nearby directory, but do not remove - backup tool should _never_ remove
any data.
When the pre-existing directory would not be moved, restore utility
(`shutil.move`) would place the data inside of that directory, with
additional directory level (for example `/var/lib/qubes/appvms/work/work`),
which would be wrong and would later fail on `vm.verify_files`. And more
importantly - such VM would not work.
FixesQubesOS/qubes-issues#1386
Registering event implementation in libvirt and then not calling it is
harmful, because libvirt expects it working. Known drawbacks:
- keep-alives are advertised as supported but not really sent (cause
dropping connections)
- connections are not closed (sockets remains open, effectively leaking
file descriptors)
So call libvirt.virEventRegisterDefaultImpl only when it will be really
used (libvirt.virEventRunDefaultImpl called), which means calling it in
QubesWatch. Registering events implementation have effect only on new
libvirt connections, so start a new one for QubesWatch.
FixesQubesOS/qubes-issues#1380
There are some circular dependencies (TemplateVM.appvms,
NetVM.connected_vms, and probably more), which prevents garbage
collector from cleaning them.
FixesQubesOS/qubes-issues#1380
QubesVM.start() first creates domain as paused, completes its setup
(including starting qubesdb-daemon and creating appropriate entries),
then resumes the domain. So wait for that resume to be sure that
`qubesdb-daemon` is already running and populated.
QubesOS/qubes-issues#1110
QubesWatch._register_watches is called from libvirt event callback,
asynchronously to qvm-start. This means that `qubesdb-daemon` may
not be running or populated yet.
If first QubesDB connection (or watch registration) fails, schedule next
try using timers in libvirt event API (as it is base of QubesWatch
mainloop), instead of some sleep loop. This way other events will be
processed in the meantime.
QubesOS/qubes-issues#1110
This makes easier to handle some corner cases. One of them is having
entry without `dir_path` defined. This may happen when migrating from R2
(using backup+restore or in-place) while some DisposableVM was running
(even if not included in the backup itself).
Fixesqubesos/qubes-issues#1124
Reported by @doncohen, thanks @wyory for providing more details.
We use only one device-mapper layer for HVMs, and this isn't the same as
for PV - it is that one, which PV does in initramfs.
Device-mapper layers summary for template-based VMs:
PV: root.img+root-cow.img (dom0) -> xvda, xvda+volatile.img (VM)
HVM: root.img+volatile.img (dom0)
Since libvirt do not support such events (at least for libxl driver), we
need some way to notify qubes-manager when device is attached/detached.
Use the same protocol as for connect/disconnect but on the target
domain.
Define it only when really needed:
- during VM creation - to generate UUID
- just before VM startup
As a consequence we must handle possible exception when accessing
vm.libvirt_domain. It would be a good idea to make this field private in
the future. It isn't possible for now because block_* are external for
QubesVm class.
This hopefully fixes race condition when Qubes Manager tries to access
libvirt_domain (using some QubesVm.*) at the same time as other tool is
removing the domain. Additionally if Qubes Manage would loose that race, it could
define the domain again leaving some unused libvirt domain (blocking
that domain name for future use).
Provide vm.refresh(), which will force to reconnect do QubesDB daemon,
and also get new libvirt object (including new ID, if any). Use this
method whenever QubesDB call returns DisconnectedError exception. Also
raise that exception when someone is trying to talk to not running
QubesDB - instead of returning None.
The statement that unlock_db() is always called directly after save() is
no longer true - tests holds the lock all the time, doing multiple saves
in the middle.
When qfile-dom0-unpacker detects an error, it sends error report to
stdout and terminate (so stdout is closed). That close should be
transferred to the VM process (as EOF on its stdin), which will signal
it to stop sending the data and handle error report.
Also qrexec-client holds the connection until both stdin and
stdout are closed.
So when that EOF is missing, tar2qfile will not detect error report and
still tries to send the data and qrexec-client will hold the
connection while receiving process is long dead.
To prevent that deadlock from happening, close FD in python code, so
qfile-dom0-unpacker will be the last owner of write end of the pipe.
When it closes its stdout, qrexec-client will receive EOF at its stdin.
Otherwise deadlock could happen - the script will try to get read lock
on qubes.xml, while the calling tool can already hold the lock. If that
was write lock (which is in case of qfile-daemon-dvm), the deadlock
occurs.