When restarting VM (starting it just after it was shut down), it may
happen that previous `qubesdb-daemon` instance is still running - if VM
doesn't properly terminate the connection, dom0 part will not terminate
immediately, but at next alive check (every 10s). Such `qubesdb-daemon`,
when terminating, will remove pid file and socket file. In case of new
daemon already running it would be those of the new daemon, making the
whole QubesDB of this VM inaccessible for dom0 (`qubesdb-daemon` is
running, but its socket is removed).
To prevent this race, ensure that previous instance is terminated before
starting the new one.
There is no need to manually removing socket file, because if some stale
socket exists, it will be replaced by the new one when new
`qubesdb-daemon` starts up.
QubesOS/qubes-issues#1241
This makes easier to handle some corner cases. One of them is having
entry without `dir_path` defined. This may happen when migrating from R2
(using backup+restore or in-place) while some DisposableVM was running
(even if not included in the backup itself).
Fixesqubesos/qubes-issues#1124
Reported by @doncohen, thanks @wyory for providing more details.
This is required to create VMs in process of building Live system, where
libvirt isn't running.
Additionally there is no udev in the build environment, so needs to
manually create /dev/loop*p* based on sysfs info.
This way even when qrexec agent would timeout on connection, guid will
be already running.
Also use new -K guid option to terminate stubdom guid when the real guid
is connected (unless in debug mode - then both guid will be running).
There were two bugs:
1. Firewall configuration wasn't copied during qvm-clone (it is in
separate file, so now it is included in vm.clone_disk_files).
2. Non-default firewall configuration wasn't stored in qubes.xml. This
means that initially DispVM got proper configuration (inherited from
calling VM), but if anything caused firewall reload (for example
starting another VM), the firewall rules was cleared to default state
(allow all).
Fixesqubesos/qubes-issues#1032
Most of them do not need GUI (especially those started from dom0), so
speed the things up a little (no need to wait for guid). But if some
service will need GUI access, there is "gui" parameter.
We use only one device-mapper layer for HVMs, and this isn't the same as
for PV - it is that one, which PV does in initramfs.
Device-mapper layers summary for template-based VMs:
PV: root.img+root-cow.img (dom0) -> xvda, xvda+volatile.img (VM)
HVM: root.img+volatile.img (dom0)
Remove artificial attribute '_start_guid_first' and use
guiagent_installed directly. This way starting guid for stubdom in debug
mode, even if guiagent_installed is set is much clearer.
This allows to assign PCI device to the VM, even if it doesn't support
proper reset. The default behaviour (when the value is True) is to not
allow such attachment (VM will not start if such device is assigned).
Require libvirt patch for this option.
Using QID for DispVM ID was a bad idea in terms of anonymity:
1. It gives some clue about VMs count in the system. In case of large
numbers, this can be quite unique.
2. If new DispVM is started just after closing previous one, it will get
the same ID, and in consequence the same IP. In case of using TorVM,
this leads to use the same circuit as just closed DispVM.
Fixesqubesos/qubes-issues#983
At least to have there info about its backup.
This was already done in commit
dc6fd3c8f3, but later was erroneously
reverted during migration to libvirt.
Fixesqubesos/qubes-issues#958
This can happen when initially there was no default netvm, some domain
was started, then default netvm was set and started - then
netvm.connected_vms will contain domains which aren't really connected
there.
Especially this was happening in firstboot.
Otherwise it would point at the same object and for example changing
vm.services[] in one VM will change that also for another. That link
will be severed after reloading the VMs from qubes.xml, but at least in
case of DispVM startup its too late - vm.service['qubes-dvm'] is set for
the DispVM template even during normal startup, not savefile preparation.
This allows to specify tight network isolation for a VM, and finally
close one remaining way for leaking traffic around TorVM. Now when VM is
connected to for example TorVM, its DispVMs will be also connected
there.
The new property can be set to:
- default (uses_default_dispvm_netvm=True) - use the same NetVM/ProxyVM as the
calling VM itself - including none it that's the case
- None - DispVMs will be network-isolated
- some NetVM/ProxyVM - will be used, even if calling VM is network-isolated
Closesqubesos/qubes-issues#862
Define it only when really needed:
- during VM creation - to generate UUID
- just before VM startup
As a consequence we must handle possible exception when accessing
vm.libvirt_domain. It would be a good idea to make this field private in
the future. It isn't possible for now because block_* are external for
QubesVm class.
This hopefully fixes race condition when Qubes Manager tries to access
libvirt_domain (using some QubesVm.*) at the same time as other tool is
removing the domain. Additionally if Qubes Manage would loose that race, it could
define the domain again leaving some unused libvirt domain (blocking
that domain name for future use).
Provide vm.refresh(), which will force to reconnect do QubesDB daemon,
and also get new libvirt object (including new ID, if any). Use this
method whenever QubesDB call returns DisconnectedError exception. Also
raise that exception when someone is trying to talk to not running
QubesDB - instead of returning None.
Libvirt will replace domain XML when trying to define the new one with
the same name and UUID - this is exactly what we need. This fixes race
condition with other processes (especially Qubes Manager), which can try
to access that libvirt domain object at the same time.
It have nothing to do with xenstore, so change the name to not mislead.
Also get rid of unused "xid" parameter - we should use XID as little as
possible, because it is not a simple task to keep it current.
It is used by just started DispVM to notice when restore process
completed. Alternatively it could watch its own domid, but lets do it in
Xen-independent way.
When VM is started by root, config file is created with root owner and
user has no write access to it. As the directory is user-writable,
delete the file first.
Conflicts:
core-modules/000QubesVm.py
Do not load qubes.xml again, it can cause race conditions between two
instances of the same VM objects.
Especially when VM is starting ProxyVM to which it is connected,
firewall rules could not be loaded.
Long time ago passio=True was used to replace current process with
qrexec-client directly (qvm-run --pass-io was the called), but this
behaviour is not used anymore (qvm-run was the only user). And this
option was left untouched, with misleading name - one would assume that
using passio=False should disallow any I/O, but this isn't the case.
Especially qvm-sync-clock is calling clockvm.run('...', wait=True),
default value for passio=False. This causes to output data from
untrusted VM, without sanitising terminal sequences, which can be fatal.
This patch changes passio semantic to actually do what it means - when
set to True - VM process will be able to interact with
stdin/stdout/stderr. But when set to False, all those FDs will be
connected to /dev/null.
Conflicts:
core-modules/000QubesVm.py
Otherwise deadlock could happen - the script will try to get read lock
on qubes.xml, while the calling tool can already hold the lock. If that
was write lock (which is in case of qfile-daemon-dvm), the deadlock
occurs.
This is the only place where ID was used - all other places uses name.
Linux qrexec-client accepts both ID and name, but sticking to one option
will simplify things (especially Windows qrexec-client/daemon).
Currently getting Stubdom XID is (the last one?) read directly from
Xenstore as there is no libvirt function for it.
This means that even if HVM is running it can have not connection to
Xenstore. For now give -1 in such situation.
None of found existing portable locking module does support RW locks.
Use lowlevel system locking support - both Windows and Linux support
such feature.
Drop locking code in write_firewall_conf() b/c is is called with
QubesVmCollection lock held anyway.
Currently <vm-dir>/<vm-name>.conf file is used only for debugging
purposes - the real one is passed directly to libvirt, without storing
on disk for it.
In some cases (e.g. qvm-clone) QubesVM.create_config_file() can be
called before VM directory exists and in this case it would fail.
Because it isn't critical fail in any means (the config file will be
recreated on next occasion) just ignore this error.
Final version most likely will have this part of code removed
completely.
Mostly done. Things still using xenstore/not working at all:
- DispVM
- qubesutils.py (especially qvm-block and qvm-usb code)
- external IP change notification for ProxyVM (should be done via RPC
service)
libvirt_domain object needs to be recreated, so force it. Also fix
config path setting (missing extension) - create_config_file
uses it as custom config indicator (if such detected, VM settings -
especially name, would not be updated).
1. Fake dom0 object doesn't need proper maxmem nor vcpus - set
statically to 0 instead of getting from physical host.
2. QubesHVM doesn't preserve maxmem setting, so set it to self.memory
earlier (to suppress default total_memory/2 calculation).