Bug #17422
Broken upgrade path from 4.2 to 4.2.1: GDM doesn't start
100%
Description
While testing the 4.2.1 release, it appears the IUKv2 thing isn’t ready yet: starting from a 4.2 IMG, booted on either bare metal (X230, Intel, 8086:0166) or in a VM (QEMU, Cirrus, 1013:00b8), switching it to the test channel and deploying the upgrade to 4.2.1 seems to work fine. But upon reboot, GDM doesn’t start: an error message lists the graphics card and instructs users to troubleshoot using the https://tails.boum.org/gdm page.
At this stage, I’m considering it a blocker for 4.2.1 (which would ship Tor Browser 9.0.4).
History
#1 Updated by CyrilBrulebois 2020-01-11 00:28:20
Warning: You can safely pretend I don’t know anything about IUKs, upgrades, stack filesystems, and the like.
That being said, looking around in the devices, I’m seeing this after an upgrade:
4.2.1.squashfs
filesystem.squashfs
while a pristine 4.2 image only has:
filesystem.squashfs
Using unsquashfs as root on my host system, I’m getting similar contents in the resulting squashfs-root.{original,upgraded}/var/lib/gdm3:
drwxr-xr-x 4 112 200 4096 Jan 6 17:25 squashfs-root-filesystem.original/var/lib/gdm3/
drwxr-xr-x 3 112 200 4096 Jan 6 17:25 squashfs-root-filesystem.original/var/lib/gdm3/.local
drwxr-xr-x 3 112 200 4096 Jan 6 17:25 squashfs-root-filesystem.original/var/lib/gdm3/.local/share
drwxr-xr-x 2 112 200 4096 Feb 9 2019 squashfs-root-filesystem.original/var/lib/gdm3/.local/share/applications
drwxr-xr-x 3 112 200 4096 Jan 6 17:25 squashfs-root-filesystem.original/var/lib/gdm3/.config
drwxr-xr-x 2 112 200 4096 Feb 9 2019 squashfs-root-filesystem.original/var/lib/gdm3/.config/dconf
drwxr-xr-x 4 112 200 4096 Jan 6 17:25 squashfs-root-filesystem.upgraded/var/lib/gdm3/
drwxr-xr-x 3 112 200 4096 Jan 6 17:25 squashfs-root-filesystem.upgraded/var/lib/gdm3/.local
drwxr-xr-x 3 112 200 4096 Jan 6 17:25 squashfs-root-filesystem.upgraded/var/lib/gdm3/.local/share
drwxr-xr-x 2 112 200 4096 Feb 9 2019 squashfs-root-filesystem.upgraded/var/lib/gdm3/.local/share/applications
drwxr-xr-x 3 112 200 4096 Jan 6 17:25 squashfs-root-filesystem.upgraded/var/lib/gdm3/.config
drwxr-xr-x 2 112 200 4096 Feb 9 2019 squashfs-root-filesystem.upgraded/var/lib/gdm3/.config/dconf
with 112:200 matching Debian-gdm:Debian-gdm in the Tails environment.
But 4.2.1.squashfs, which I guess ends up being put on top of the probably-untouched “main” filesystem.squashfs, has:
drwxr-xr-x 4 root root 4096 Jan 9 17:09 squashfs-root-4.2.1/var/lib/gdm3/
drwxr-xr-x 3 root root 4096 Jan 9 17:09 squashfs-root-4.2.1/var/lib/gdm3/.local
drwxr-xr-x 2 root root 4096 Jan 9 17:09 squashfs-root-4.2.1/var/lib/gdm3/.local/share
drwxr-xr-x 2 root root 4096 Jan 9 17:09 squashfs-root-4.2.1/var/lib/gdm3/.config
which I guess shadows the ownership in the original filesystem, which in turn explains why GDM cannot open the X log file under the xorg subdirectory (that it has to create) under /var/lib/gdm3/.local/share. (As seen by setting a root password, and inspecting the reasons for the failed gdm.service unit.)
→ Fix the ownership issues, save the day?
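The shadowing guessed at above can be illustrated with a toy model. This is only a sketch, not how the union filesystem is implemented: it assumes that when a directory exists in several stacked layers, the metadata of the topmost layer is what the running system sees. The names Owner, Layer and visible_owner are hypothetical, and the ownership values are copied from the listings above.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class Owner:
    user: str
    group: str


# A layer maps a directory path to the owner recorded in that squashfs.
Layer = Dict[str, Owner]


def visible_owner(path: str, layers_top_first: List[Layer]) -> Optional[Owner]:
    """Return the owner recorded by the topmost layer that contains path.

    Assumption of this toy model: for a directory present in several
    layers, the topmost layer's metadata is the one that is visible.
    """
    for layer in layers_top_first:
        if path in layer:
            return layer[path]
    return None


GDM = Owner("Debian-gdm", "Debian-gdm")
ROOT = Owner("root", "root")

# Ownership as seen in the listings above (112:200 is Debian-gdm in Tails).
base_layer: Layer = {
    "/var/lib/gdm3": GDM,
    "/var/lib/gdm3/.local": GDM,
    "/var/lib/gdm3/.local/share": GDM,
    "/var/lib/gdm3/.local/share/applications": GDM,
    "/var/lib/gdm3/.config": GDM,
    "/var/lib/gdm3/.config/dconf": GDM,
}
diff_layer: Layer = {
    "/var/lib/gdm3": ROOT,
    "/var/lib/gdm3/.local": ROOT,
    "/var/lib/gdm3/.local/share": ROOT,
    "/var/lib/gdm3/.config": ROOT,
}

# 4.2.1.squashfs stacked on top of filesystem.squashfs.
stack = [diff_layer, base_layer]
print(visible_owner("/var/lib/gdm3/.local/share", stack))
print(visible_owner("/var/lib/gdm3/.local/share/applications", stack))
```

In this model, /var/lib/gdm3/.local/share resolves to root:root, so a process running as Debian-gdm could not create the xorg subdirectory there (mode drwxr-xr-x), while directories that only exist in the base layer keep their Debian-gdm ownership.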
#2 Updated by CyrilBrulebois 2020-01-11 00:36:06
(Same warning applies.)
And it seems Tails_amd64_4.2_to_4.2.1.iuk ships the file I was just looking at (seen under squashfs-root/overlay/live/4.2.1.squashfs after unsquashfs-ing the IUK), which seems to indicate fixing the IUK generation might be sufficient to unblock us, without having to tell users to fiddle with their un-automatically-upgradable 4.2 release?
#3 Updated by CyrilBrulebois 2020-01-11 01:41:17
(Same warning applies.)
Toying with config/chroot_local-includes/usr/src/iuk/lib/Tails/IUK.pm a little, particularly the create_squashfs_diff sub, it seems permissions are all fine with the mounting of the old and new root filesystems ($old_squashfs_mount and $new_squashfs_mount), also with the union mount ($union_mount).
But then, I’m seeing mksquashfs called, which happens to use those options:
method _build_mksquashfs_options () { [
qw{-no-progress -noappend},
qw{-all-root},
qw{-comp xz -Xbcj x86 -b 1024K -Xdict-size 1024K},
]}
Looks to me -all-root is not our friend:
-all-root
make all files owned by root.
Looking at the history of iuk.git (since that was introduced before the merge into the main Tails repository), this dates back to:
commit b578872ba2167da7dbc106a9d20c88e523ab73b3
Author: intrigeri <intrigeri@boum.org>
Date: Sun Nov 24 21:44:50 2019 +0000
Start implementing IUK format v2 (Feature #6876)
which seems consistent with a seemingly IUKv2-specific problem, which didn’t exist with IUKv1.
#4 Updated by CyrilBrulebois 2020-01-11 02:09:29
To verify, I’ve implemented these steps:
- dropped -all-root from mksquashfs options;
- dropped the last line, with compression options, so that I would get the default and quicker gzip compression (I was being a little impatient to know whether that would work);
- generated a new IUK from 4.2 to 4.2.1;
- extracted its /live/4.2.1.squashfs;
- tweaked the filesystem of the virtual machine that was already upgraded from 4.2 to 4.2.1, removing the broken /live/4.2.1.squashfs and replacing it with the brand new one extracted from the hacked IUK.
I’m seeing GDM start up just fine, Tails pretend it’s running 4.2.1 (looking at /etc/os-release), Tor Browser 9.0.4 running properly, etc.
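The manual ownership comparison from comment #1 could also be scripted to double-check a regenerated squashfs. The following is a minimal sketch, assuming both images were extracted with unsquashfs as root (so that ownership is preserved on disk); the function name ownership_mismatches is hypothetical and not part of the Tails code base.

```python
import os
from typing import Iterator, Tuple

Ids = Tuple[int, int]  # (uid, gid)


def ownership_mismatches(base_root: str, diff_root: str) -> Iterator[Tuple[str, Ids, Ids]]:
    """Yield (relative path, base ids, diff ids) for every entry of the
    extracted diff tree that also exists in the extracted base tree but
    with a different uid or gid."""
    for dirpath, dirnames, filenames in os.walk(diff_root):
        for name in dirnames + filenames:
            diff_path = os.path.join(dirpath, name)
            rel = os.path.relpath(diff_path, diff_root)
            base_path = os.path.join(base_root, rel)
            if not os.path.lexists(base_path):
                # New in the diff: nothing to compare against.
                continue
            # lstat so that symlinks are compared, not their targets.
            base_stat = os.lstat(base_path)
            diff_stat = os.lstat(diff_path)
            base_ids = (base_stat.st_uid, base_stat.st_gid)
            diff_ids = (diff_stat.st_uid, diff_stat.st_gid)
            if base_ids != diff_ids:
                yield rel, base_ids, diff_ids
```

Called as ownership_mismatches("squashfs-root-filesystem.original", "squashfs-root-4.2.1"), using the directory names from the listings in comment #1, it would be expected to report var/lib/gdm3 and its subdirectories for the broken squashfs. Note that a mismatch is not always a bug, since a release may legitimately change the ownership of a path.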
#5 Updated by intrigeri 2020-01-11 07:34:47
- Target version set to Tails_4.2.2
This matches the analysis I did last night (couldn’t sleep…).
Thanks a lot for testing the fix, this will save me some time!
I’m now preparing a branch with a fix based on this (we need -all-root in one place, just not everywhere).
#6 Updated by intrigeri 2020-01-11 07:45:12
- Status changed from Confirmed to In Progress
- Target version deleted (Tails_4.2.2)
- Feature Branch set to bugfix/17422-iuk-v2-ownership-in-squashfs-diff
#7 Updated by intrigeri 2020-01-11 07:53:06
- Status changed from In Progress to Needs Validation
- Assignee changed from intrigeri to CyrilBrulebois
- Target version set to Tails_4.2.2
I have pushed a tentative (really simple and tiny) fix for this (really stupid) bug, along with the test cases I wrote last night.
#8 Updated by CyrilBrulebois 2020-01-11 08:48:21
- Description updated
- Assignee changed from CyrilBrulebois to intrigeri
- Target version deleted (Tails_4.2.2)
- Private changed from Yes to No
#9 Updated by CyrilBrulebois 2020-01-11 08:54:05
- Target version set to Tails_4.2.2
(If only Redmine wouldn’t change metadata when one is just updating description + unmarking as private…)
The proposed change looks good to me: compared to what I tried, it should indeed take care of what you mentioned as needed → keeping -all-root for one of the two calls (while generating the IUK itself, and not while generating the SquashFS diff).
Feel free to merge, and to redo the IUKv2 dance.
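The idea behind the fix, as described in this comment, can be sketched as follows. This is an illustration in Python, not the actual Perl change (which is not shown in this thread): the function name and its all_root parameter are hypothetical, and it is an assumption that both mksquashfs calls otherwise share the options quoted in comment #3.

```python
from typing import List


def mksquashfs_options(all_root: bool) -> List[str]:
    """Build the mksquashfs option list, adding -all-root only on request."""
    options = ["-no-progress", "-noappend"]
    if all_root:
        # -all-root makes every file owned by root, discarding the
        # original ownership (such as Debian-gdm on /var/lib/gdm3).
        options.append("-all-root")
    options += ["-comp", "xz", "-Xbcj", "x86", "-b", "1024K", "-Xdict-size", "1024K"]
    return options


# SquashFS diff (the future /live/4.2.1.squashfs): ownership must be preserved.
diff_options = mksquashfs_options(all_root=False)

# Outer IUK archive: -all-root is kept, as stated in the comment above.
iuk_options = mksquashfs_options(all_root=True)
```

Making the ownership behaviour an explicit argument, rather than a shared default, keeps the two call sites from silently inheriting each other's requirements.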
#10 Updated by intrigeri 2020-01-11 09:02:23
- Status changed from Needs Validation to In Progress
Applied in changeset commit:tails|90d6dc1edb60de251a1e6f694b6925aaf01520da.
#11 Updated by intrigeri 2020-01-11 09:03:11
- Status changed from In Progress to Resolved
- % Done changed from 0 to 100
Applied in changeset commit:tails|28c7de61354aa3ea33da0e95c10dc52c14851dc2.
#12 Updated by intrigeri 2020-01-11 09:03:57
Thank you for reviewing.
> Feel free to merge
Done.
> and to redo the IUKv2 dance.
Let’s coordinate this on XMPP.