Bug #15988

Set unique UUID on first boot & adapt bootloader

Added by Anonymous 2018-09-28 10:21:52 . Updated 2018-12-02 20:53:05 .

Status:
Resolved
Priority:
Normal
Assignee:
Category:
Target version:
Start date:
2018-09-28
Due date:
% Done:

100%

Feature Branch:
feature/15292-repartition
Type of work:
Code
Blueprint:

Starter:
Affected tool:
Deliverable for:
316

Description

to avoid issues when plugging 2 Tails sticks at the same time in one computer.


Files


Subtasks


History

#1 Updated by Anonymous 2018-09-28 10:35:34

  • blocked by Bug #15991: Code review & rubber-duck for USB Image added

#2 Updated by intrigeri 2018-10-09 09:22:59

  • blocks deleted (Bug #15991: Code review & rubber-duck for USB Image)

#3 Updated by intrigeri 2018-10-09 12:01:33

Is this a duplicate of Feature #15317?

#4 Updated by intrigeri 2018-10-09 12:16:59

intrigeri wrote:
> Is this a duplicate of Feature #15317?

Actually not, Feature #15317 was about the disk GUID, while this is probably about the system partition’s UUID. Note that sgdisk has a --randomize-guids switch that does both.

#5 Updated by segfault 2018-11-20 09:57:16

  • Status changed from Confirmed to In Progress

Applied in changeset commit:tails|bbbe90de8fc4308a4ae2b090c8c64f5422af4d87.

#6 Updated by intrigeri 2018-11-26 10:59:16

Oops, I’ve been misleading above when I wrote “this is probably about the system partition’s UUID” and suggested using sgdisk --randomize-guids, which you then implemented in commit:bbbe90de8fc4308a4ae2b090c8c64f5422af4d87. I think this change is useful: code review passes and I confirm that it does what it’s supposed to (looking at the symlinks in /dev/disk/by-partuuid/). But I think it solves only a part of what this ticket is about.

This ticket is about “avoid issues when plugging 2 Tails sticks at the same time in one computer”: when we wrote that, I think we rather meant the filesystem UUID, as found in /dev/disk/by-uuid/. According to my tests, this one is fixed at image creation time and not updated on first boot. So for example:

  • If Tails 3.12 and Tails 3.13 are plugged before I start the computer and I choose to boot from the 3.13 device, we’ll enter undefined behavior territory, e.g. our partitioning initramfs script could very well repartition the 3.12 stick (which would be a regression and arguably a violation of our design goals) instead of the 3.13 one (which will boot but without being repartitioned, which will make it behave in surprising and unsupported ways).
  • If I have started Tails and then plug a 2nd Tails USB stick, some parts of our storage management stack may start to behave in a way that’s confusing to users: I don’t know if udev will replace the existing symlink in /dev/disk/by-uuid/, nor if GNOME Files’ “Tails” shortcut will be updated, and regardless of how this currently works, I suspect it’s undefined behavior that can change at any time.

So I think we do need to change the FAT filesystem’s UUID on first boot; I suspect we might need to trigger processing of the relevant events by udev and wait for it to have updated the symlinks before resuming the partitioning process, in order to get consistent & predictible behavior. The good news is that I don’t think we’ll need to adapt the bootloader config thanks to your (clever!) usage of FSUUID.

#7 Updated by intrigeri 2018-11-26 12:46:39

intrigeri wrote:
> […] sgdisk --randomize-guids, which you then implemented in commit:bbbe90de8fc4308a4ae2b090c8c64f5422af4d87. I think this change is useful: code review passes and I confirm that it does what it’s supposed to (looking at the symlinks in /dev/disk/by-partuuid/).

I’m not sure anymore, see Bug #16153 which could be related to this commit.

#8 Updated by intrigeri 2018-11-26 14:26:08

  • related to Bug #16153: System partition is not an EFI System Partition added

#9 Updated by intrigeri 2018-11-26 14:27:29

  • related to deleted (Bug #16153: System partition is not an EFI System Partition)

#10 Updated by segfault 2018-11-26 23:41:03

  • Feature Branch set to feature/15292-usb-image

Should be fixed on the branch, but I didn’t rebuild and test it yet.

#11 Updated by intrigeri 2018-11-27 08:03:24

segfault wrote:
> Should be fixed on the branch, but I didn’t rebuild and test it yet.

I’m not sure about passing "${PARENT_DEVICE}" (as opposed to the system partition) to mlabel but I’ll try this now.

#12 Updated by intrigeri 2018-11-27 08:03:43

  • Assignee changed from segfault to intrigeri
  • % Done changed from 0 to 10
  • QA Check set to Ready for QA

#13 Updated by intrigeri 2018-11-27 08:42:29

  • QA Check changed from Ready for QA to Dev Needed

intrigeri wrote:
> segfault wrote:
> > Should be fixed on the branch, but I didn’t rebuild and test it yet.
>
> I’m not sure about passing "${PARENT_DEVICE}" (as opposed to the system partition) to mlabel but I’ll try this now.

Indeed, mlabel spits out errors during boot. Trying to fix it.

#14 Updated by intrigeri 2018-11-27 10:58:41

I made some progress but mlabel -n (or something after it) still fails on boot, see attached screenshot. I’m starting to think that it might be simpler to write the bytes we want at the right location as you did previously in the IMG generation script. Not sure if that’ll trigger udev to update its symlinks though.

Also, once we get mlabel to succeed, the value of $SYSTEM_PARTITION will become outdated so the next steps will fail. I think we should compute a suitable random UUID, store it, pass it to mlabel -N, and then update $SYSTEM_PARTITION accordingly.

Possibly related: the filesystem has no label anymore after 1st boot; this will break the Upgrader, Persistence Setup, and possibly more, so that’s higher priority than the FS UIID thing. I didn’t check if that’s a side effect of running mlabel -n, or of mlabel failing, or if the original USB image already has this problem.

#15 Updated by intrigeri 2018-11-27 16:16:59

  • Feature Branch changed from feature/15292-usb-image to feature/15292-repartition

#16 Updated by intrigeri 2018-11-28 13:52:34

intrigeri wrote:
> I made some progress but mlabel -n (or something after it) still fails on boot, see attached screenshot.

I’ll retry with nls_cp850.ko in the initramfs.

#17 Updated by intrigeri 2018-11-28 15:06:06

  • Assignee changed from segfault to intrigeri

I think I understood the problem and am working on a fix.

#18 Updated by intrigeri 2018-11-28 15:23:47

intrigeri wrote:
> Also, once we get mlabel to succeed, the value of $SYSTEM_PARTITION will become outdated

That’s probably wrong, thanks to using readlink -f in the first place.

#19 Updated by segfault 2018-11-28 19:00:35

Sorry, I should have updated this ticket earlier with my findings. I already found out yesterday (using strace) that mlabel uses /usr/lib/x86_64-linux-gnu/gconv/IBM850.so, aka code page 850.

I tried adding it to the initramfs via copy_exec, but mlabel still fails with:

Error converting to codepage 850 Invalid argument

#20 Updated by intrigeri 2018-11-29 07:23:35

> Sorry, I should have updated this ticket earlier with my findings. I already found out yesterday (using strace) that mlabel uses /usr/lib/x86_64-linux-gnu/gconv/IBM850.so, aka code page 850.

I’ve rediscovered this independently yesterday.

> I tried adding it to the initramfs via copy_exec, but mlabel still fails with:

Indeed, that’s what I did initially but it was not sufficient.

I’ve pushed commit:80ae91bae0ef627cc1ac2a12dfc7d805c6a4e13e which solves this FS UUID problem but there’s more work needed here:

  • I’m copying too much stuff to the initrd, see XXX in there.
  • The FS still seems to lose its label somewhere along the way. I did not check if that’s already the case in the .img or if the partitioning boot time script is the culprit.

I might have time to work on this again today so keeping on my plate for now, but let’s coordinate on XMPP if/when we can :)

#21 Updated by intrigeri 2018-11-29 09:03:03

> * The FS still seems to lose its label somewhere along the way. I did not check if that’s already the case in the .img or if the partitioning boot time script is the culprit.

At least the .img was affected and given why it was, I’m pretty sure mlabel would mess things up in the partitioning script as well ⇒ fixed with commit:769cfebf4bb74367bd1899b1851348646ab25a32.

> * I’m copying too much stuff to the initrd, see XXX in there.

TBD. If I find time to work on this today, my approach will be to chroot into our current unpacked initrd and to move away files until I find out which ones are really needed for mlabel to do its job.

#22 Updated by intrigeri 2018-11-29 09:08:49

> At least the .img was affected and given why it was, I’m pretty sure mlabel would mess things up in the partitioning script as well ⇒ fixed with commit:769cfebf4bb74367bd1899b1851348646ab25a32.

Crap, ideally I would have pushed the image creation change to feature/15292-generate-usb-image instead. It’s too late to rewrite history since I’ve already merged feature/15292-repartition into another branch (Bug #16003) that depends on it, so I’ve cherry-picked only the relevant change into feature/15292-generate-usb-image. Please review it and if happy, merge it into master→stable→devel.

Everything that’s left to do here is about the initrd so shall live on feature/15292-repartition.

#24 Updated by segfault 2018-12-01 14:44:45

  • % Done changed from 10 to 0
  • QA Check deleted (Dev Needed)

intrigeri wrote:
> Please review it and if happy, merge it into master→stable→devel.

Done.

#25 Updated by segfault 2018-12-01 16:09:34

> TBD. If I find time to work on this today, my approach will be to chroot into our current unpacked initrd and to move away files until I find out which ones are really needed for mlabel to do its job.

Doing that now.

#26 Updated by segfault 2018-12-01 16:21:40

segfault wrote:
> > TBD. If I find time to work on this today, my approach will be to chroot into our current unpacked initrd and to move away files until I find out which ones are really needed for mlabel to do its job.
>
> Doing that now.

It seems to be enough to copy IBM850.so and gconv-modules. Building and testing now before I push my commit.

#27 Updated by intrigeri 2018-12-01 20:48:20

> It seems to be enough to copy IBM850.so and gconv-modules.

Great!

#28 Updated by segfault 2018-12-02 17:40:53

  • Assignee changed from segfault to intrigeri
  • QA Check set to Ready for QA

#29 Updated by intrigeri 2018-12-02 20:53:05

  • Status changed from In Progress to Resolved
  • Assignee deleted (intrigeri)
  • % Done changed from 0 to 100
  • QA Check changed from Ready for QA to Pass

Perfect, we’re done here! \o/