Core3 keep information whether property have default value for all the
properties (not only few like netvm or kernel). Try to use this feature
as much as possible.
When user included/excluded some VMs for restoration, it may be
neceesarry to fix dependencies between them (for example when default
template is no longer going to be restored).
Also fix handling conflicting names.
Now, when file name is also integrity protected (prefixed to the
passphrase), we can make sure that input files are given in the same
order. And are parts of the same VM.
QubesOS/qubes-issues#971
This prevent switching parts of backup of the same VM between different
backups made by the same user (or actually: with the same passphrase).
QubesOS/qubes-issues#971
`openssl dgst` and `openssl enc` used previously poorly handle key
stretching - in case of `openssl enc` encryption key is derived using
single MD5 iteration, without even any salt. This hardly prevent
brute force or even rainbow tables attacks. To make things worse, the
same key is used for encryption and integrity protection which ease
brute force even further.
All this is still about brute force attacks, so when using long, high
entropy passphrase, it should be still relatively safe. But lets do
better.
According to discussion in QubesOS/qubes-issues#971, scrypt algorithm is
a good choice for key stretching (it isn't the best of all existing, but
a good one and widely adopted). At the same time, lets switch away from
`openssl` tool, as it is very limited and apparently not designed for
production use. Use `scrypt` tool, which is very simple and does exactly
what we need - encrypt the data and integrity protect it. Its archive
format have own (simple) header with data required by the `scrypt`
algorithm, including salt. Internally data is encrypted with AES256-CTR
and integrity protected with HMAC-SHA256. For details see:
https://github.com/tarsnap/scrypt/blob/master/FORMAT
This means change of backup format. Mainly:
1. HMAC is stored in scrypt header, so don't use separate file for it.
Instead have data in files with `.enc` extension.
2. For compatibility leave `backup-header` and `backup-header.hmac`. But
`backup-header.hmac` is really scrypt-encrypted version of `backup-header`.
3. For each file, prepend its identifier to the passphrase, to
authenticate filename itself too. Having this we can guard against
reordering archive files within a single backup and across backups. This
identifier is built as:
backup ID (from backup-header)!filename!
For backup-header itself, there is no backup ID (just 'backup-header!').
FixesQubesOS/qubes-issues#971
Have a generic function `handle_streams`, instead of
`wait_backup_feedback` with open coded process names and manual
iteration over them.
No functional change, besides minor logging change.
Use just introduced tar writer to archive content of LVM volumes (or
more generally: block devices). Place them as 'private.img' and
'root.img' files in the backup - just like in old format. This require
support for replacing file name in tar header - another thing trivially
supported with tar writer.
tar can't write archive with _contents_ of block device. We need this to
backup LVM-based disk images. To avoid dumping image to a file first,
create a simple tar archiver just for this purpose.
Python is not the fastest possible technology, it's 3 times slower than
equivalent written in C. But it's much easier to read, much less
error-prone, and still process 1GB image under 1s (CPU time, leaving
along actual disk reads). So, it's acceptable.
Old backup metadata (old qubes.xml) does not contain info about
individual volume sizes. So, extract it from tar header (using verbose
output during restore) and resize volume accordingly.
Without this, restoring volumes larger than default would be impossible.