Bug #14622

Have the Installer detect when "Clone the current Tails" won't fit in the destination partition before deleting anything

Added by intrigeri 2017-09-11 18:26:50 . Updated 2018-01-23 19:50:29 .

Status:
Resolved
Priority:
High
Assignee:
Category:
Installation
Target version:
Start date:
2017-09-11
Due date:
% Done:

100%

Feature Branch:
bugfix/14622-installer-check-space-before-deleting, installer:bugfix/14622-installer-check-space-before-deleting, installer:debian_bugfix/14622-installer-check-space-before-deleting
Type of work:
Code
Blueprint:

Starter:
Affected tool:
Installer
Deliverable for:

Description

When working on Feature #12707 we noticed that “Clone & Upgrade” does that:

  1. delete all the known Tails system files on the target system partition
  2. check for the available free space (gui.py:self.live.check_free_space())
  3. copy the source file to the target system partition

So if the source system partition is larger than the target one, and enough automatic upgrades have been installed on the source system partition, then the check for the available free space will fail and we will abort the upgrade… after having deleted Tails from the target partition. This is not ideal, to say the least :] This situation will happen once the landspace of Tails devices is made of system partitions with a variety of sizes, filled by IUKs to various levels.

Thankfully we have some time to fix this after Feature #12707 goes live, and before it becomes a problem in practice:

  • “the source system partition is larger than the target one” is unlikely before Feature #12707 is deployed in the wild (there are surely Tails devices somewhere with a 1.5 GiB system partition, but then those are already affected by this problem anyway if they “Clone & Upgrade” from their friend’s more recent Tails device so it’s not a regression);
  • It’ll take some time until a newly installed Tails 3.2, with a 4 or 8 GiB system partition, gathers “enough automatic upgrades have been installed on the source system partition”.

Next step: figure out how much time we have, adjust “Target version” accordingly and set “Type of work” to “Code”. I want to do this during the 3.3 cycle.

I’ll handle this ticket with my Foundations Team hat.


Files

too small.png (32461 B) sajolida, 2018-01-19 18:17:50

Subtasks


Related issues

Related to Tails - Feature #12615: Consider basing Tails on quarterly snapshots of Debian testing In Progress 2017-06-13
Related to Tails - Bug #14958: Tails Installer 5.0 proposes to upgrade even if the destination stick is already up-to-date Confirmed 2017-11-11
Blocked by Tails - Feature #12705: Update the size of the system partition to >= 4 GiB Resolved 2017-06-15

History

#1 Updated by intrigeri 2017-09-12 09:26:43

#2 Updated by intrigeri 2017-09-12 09:26:55

  • blocked by Feature #12705: Update the size of the system partition to >= 4 GiB added

#3 Updated by intrigeri 2017-09-27 10:43:44

  • Target version changed from Tails_3.3 to Tails_3.5

anonym, can you please double-check my reasoning below? (I’m not explicitly reassigning to you as I don’t want to risk realizing late in the 3.4 cycle that this is still waiting on your plate and I still have a big chunk of dev work to do.)

intrigeri wrote:
> Thankfully we have some time to fix this after Feature #12707 goes live, and before it becomes a problem in practice:
> […]
> * It’ll take some time until a newly installed Tails 3.2, with a 4 or 8 GiB system partition, gathers “enough automatic upgrades have been installed on the source system partition”.
>
> Next step: figure out how much time we have

So the problem will happen once a source device has more than 2.5G of Tails data on its system partition. The earliest it can happen is when a newly installed Tails 3.2 (with a system partition that’s at least 4G big) will have accumulated enough automatic upgrades. I’ll reason for the worst case.

  • Let’s assume that installing a given IUK takes exactly that IUK’s size on the device (in practice it takes less space is we replace some files when installing it).
  • Let’s assume that any IUK to upgrade to a major release is 350M big if we stick to Debian stable: I don’t see any that’s bigger than 300M on http://dl.amnesia.boum.org/tails/stable/iuk/ currently. But if we base Tails on quarterly snapshots of Debian testing, we should expect larger IUKs: I’ve built a 3.0~beta2_to_3.0.iuk locally (3 months passed between these two releases) and it takes 588M.
  • Let’s assume a user that tests all our release candidates using automatic upgrades, i.e. for each major release they install, they instead of having the “previous release to major release” update installed, they have “previous release to RC + RC to major release” updates installed. The 3.1 → 3.2~rc1 → 3.2 path is 35% bigger than the 3.1 → 3.2 one.
  • Let’s assume that any IUK to upgrade to a bugfix release is 300M big: 3.0 to 3.1 was 200M, 2.10 to 2.11 was 300M but we upgraded the kernel for two archs back then.
  • Let’s assume we put out one emergency release every three months, and its IUK takes the same size as the one for a bugfix release i.e. 300M.
  • Tails Upgrader requires 3 times the size of an IUK to install it, so a 4G can’t be filled more than 4G - 3 * size of the smallest possible IUK. We’ve already published IUKs smaller than 100M so the resulting number will be bigger than 2.5G anyway, so forget it.

So, if we stick to Debian stable until at least next January:

  • after upgrading to 3.3: 1150 + 300 = 1450M
  • after upgrading to 3.3.1: 1450 + 300 = 1750M
  • after upgrading to 3.4~rc1 and then to 3.4: 1750 + 350*1.35 = 2225M
  • so after upgrading to 3.4.1 or 3.5 it will be larger than 2500M => we have to implement a fix for this bug in time for 3.4 (in case we have to release 3.4.1, it’ll be too late to do that on short notice)

But if we force everyone to manually upgrade to 3.4 based on Debian testing in January, then we have some more time; so depending on the outcome of Feature #12615 we may be able to postpone this.

#4 Updated by anonym 2017-09-27 13:16:01

I believe you have missed stating an assumption: you assume that the 4 GiB stick was upgraded to 3.2 immediately, and not 3.2~rc1 in between. Let’s check how your timeline would look if we include 3.2~rc1:

  • after upgrading to 3.2~rc1 and then to 3.3: 1150 + 350*1.35 = 1622.5M
  • after upgrading to 3.3.1: 1622.5 + 300 = 1922.5M
  • after upgrading to 3.4~rc1 and then to 3.4: 1922.5 + 350*1.35 = 2395.0M which is still < 2500M => we’re good!

So 3.4 indeed seems like the last release we can fix this in with the above expectations of emergency releases. However, I think it is worth noting that a single additional emergency release (which would not be unprecedented) would mean we’d have to plan to fix this in Tails 3.3, otherwise users that are testing our RC IUKs will get into trouble in Tails 3.4.

Any way, I think we’re running ahead of ourselves. It’s not clear to me what the exact implications (= risks for our users) of this Tails Installer bug. In the description you say:

> […] the check for the available free space will fail and we will abort the upgrade… after having deleted Tails from the target partition […]

So, to clarify: the next time the device is scanned by Tails Installer, it will be treated as an “empty” drive so the only operation allowed will be an “Install”, and that will remove any persistence partition without warning (I’m disregarding users that would click “Reinstall (delete all data)” since they literally are asking for the persistence partition to be overwritten). If so, we’re talking about data loss (although it should be doable to completely recover the partition), and then I’m not sure if we want to bet on only a single emergency release.

In other words: does Tails Installer fail closed in this instance w.r.t. data loss (or removal of the TailsData part from the partition table)?

OTOH, if we have two emergency releases, we know this happens, and we can warn our users. So, in the end I think I’m good with your plan, as long as we pay attention to this!

#5 Updated by anonym 2017-09-27 13:25:57

anonym wrote:
> In other words: does Tails Installer fail closed in this instance w.r.t. data loss (or removal of the TailsData part from the partition table)?

tl;dr: Tails Installer fails closed!

I actually had a test set up around so I could easily test this. When the problematic upgrade is done, only the contents of the target Tails partition is removed, so next time Tails Installer looks at it it will still suggest an “Upgrade”. That Tails installation is now broken, of course, but it can be restored simply by upgrading from a source that is small enough (e.g. the ISO, an install consisting of fewer IUKs, …). The persistence partition is kept at all times, and at no point is the users facing its removal without a warning.

So, IMHO, your plan is completely sane!

#6 Updated by anonym 2017-09-27 16:34:41

anonym wrote:
> you assume that the 4 GiB stick was upgraded to 3.2 immediately, and not 3.2~rc1 in between.

Gah, this is just wrong. It should be: “you assume that the (8 GiB) stick was initially installed with 3.2, and not 3.2~rc1”. So…

> * after upgrading to 3.2~rc1 and then to 3.3: 1150 + 350*1.35 = 1622.5M
> * after upgrading to 3.3.1: 1622.5 + 300 = 1922.5M
> * after upgrading to 3.4~rc1 and then to 3.4: 1922.5 + 350*1.35 = 2395.0M which is still < 2500M => we’re good!

… this is also wrong. You original calculation only misses this initial step:

  • after upgrading from 3.2~rc1 to 3.2: 1150 + 92 = 1242M

The “92” comes from the actual size of Tails_amd64_3.2~rc1_to_3.2.iuk. :) In the end this will just bump your numbers with 92, so in the end we have 2225 + 92 = 2317M which is < 2500M so we are still good. :)

#7 Updated by intrigeri 2017-09-28 05:50:37

> I believe you have missed stating an assumption:

Thanks a lot for finding issues in my reasoning! :)

> OTOH, if we have two emergency releases, we know this happens, and we can warn our users. So, in the end I think I’m good with your plan, as long as we pay attention to this!

ACK

#8 Updated by intrigeri 2017-09-28 05:52:43

  • related to Feature #12615: Consider basing Tails on quarterly snapshots of Debian testing added

#9 Updated by intrigeri 2017-09-28 05:54:18

> The “92” comes from the actual size of Tails_amd64_3.2~rc1_to_3.2.iuk. :) In the end this will just bump your numbers with 92, so in the end we have 2225 + 92 = 2317M which is < 2500M so we are still good. :)

Great.

I think we’ll be in a better position to take bets about Feature #12615 on October 9, and then if we feel it’s unlikely that our January release is based on Buster, I’ll probably try to work on this during the 3.3 cycle, just to be on the safe side. And then we’ll see if the required changes qualify for a bugfix release.

#10 Updated by intrigeri 2017-11-27 11:49:16

  • Feature Branch set to bugfix/14622-installer-check-space-before-deleting, installer:bugfix/14622-installer-check-space-before-deleting, installer:debian_bugfix/14622-installer-check-space-before-deleting

#11 Updated by intrigeri 2017-11-27 13:42:49

  • Status changed from Confirmed to In Progress
  • Assignee changed from intrigeri to sajolida
  • % Done changed from 0 to 10
  • QA Check set to Info Needed

sajolida, this is high priority because it really really needs to be shipped in Tails 3.5. If you can’t answer my question below by December 10, please reassign to me and I’ll go ahead with my proposal.

To reliably know whether an upgrade by cloning will work space-wise, we need to mount the candidate destination Tails system partition: checking its size is not enough, as we need to know how much room the OS files we’re going to delete occupy (some users put non-Tails files on their drive even if we discourage against it). So I see three options:

  1. don’t bother fixing the problem this ticket is about in case the user has added non-Tails files to their Tails system partition; then I think this ticket should be very easy to fix for everyone else with minimal code changes
  2. for each candidate destination Tails system partition, mount it before adding it to the list of validate candidates, so we can check how much room the OS files occupy; in many cases that would trigger a desktop notification, which is not great for a UX point of view I guess; note that Bug #14958 will require mounting these filesystems as well, if we ever want to address it; I seem to remember there’s a way to temporary disable these notifications (that we already display sometimes), not sure if it’s real and suitable here; in any case I’m pretty sure the resulting code will make an ugly, hard to understand/debug part of the code even worse;
  3. postpone the space check and do it after the user has selected a destination drive and clicked Upgrade; I guess it sucks from a UX perspective, because we’re proposing to the user something that we’re not sure can work, and only later we tell them it cannot; it sucks only for the users affected by the problem this ticket is about but at some point there will be many of them

I dislike the 3rd option because it breaks the flow and introduces a UX regression, while this ticket is part of a project that’s supposed to improve UX.

I propose I quickly look into the 2nd option and evaluate how much more expensive to implement it is than the 1st one, and how deep into ugly code one has to dive in order to implement it. If my analysis concludes “it’s not that bad”, then I’ll do it. Otherwise I’ll fallback to option 1.

sajolida, what do you think?

#12 Updated by intrigeri 2017-11-28 09:05:43

intrigeri wrote:
> I propose I quickly look into the 2nd option and evaluate how much more expensive to implement it is than the 1st one, and how deep into ugly code one has to dive in order to implement it. If my analysis concludes “it’s not that bad”, then I’ll do it. Otherwise I’ll fallback to option 1.
>
> sajolida, what do you think?

Additional notes:

  • if you tell me “don’t bother with option 2, go directly with option 1”, I’ll certainly won’t complain
  • an automated test case for option 2 will definitely be much more expensive to implement than one for option 1

#13 Updated by Anonymous 2018-01-15 16:00:46

@sajolida: your input is needed on this ticket it seems. See https://labs.riseup.net/code/issues/14622#note-12

#14 Updated by sajolida 2018-01-19 15:54:47

  • related to Bug #14958: Tails Installer 5.0 proposes to upgrade even if the destination stick is already up-to-date added

#15 Updated by sajolida 2018-01-19 17:22:32

  • Assignee changed from sajolida to intrigeri

Recap: We’re trying to clone a source Tails with a 4 GB system onto a destination Tails with a 2.5 GB system that don’t fit.

The question is how do we let the user know.

I think that all USB sticks should be listed as target USB sticks, whether or not it’s possible to clone to them. Otherwise how do we let the user know that it’s this particular USB stick is problematic, why, and what to do next?

I’m proposing the following interaction: Gray out the “Upgrade” button when a problematic USB stick is selected (or autoselected) from the target USB sticks and display some helpful message in the logs.

In your previous note you only considered doing the space check either when building the list of target USB sticks or clicking “Upgrade”, but not when selecting (or autoselecting) a USB stick in the list. Is there any reason for that?

If I understand your proposal correctly, there could be two different space checks:

  • Superficial: That doesn’t require mounting the system partition and only looks at the size of the partition to do the calculation. This superficial check would delete non-Tails files on the system partition.
  • Deep: That requires mounting the system partition and calculating beforehand how much space will be freed by removing Tails files.

But I think that the interactions could be the same independently of the type of space check, right?

If so, I propose to implement the superficial check first and the deep check whenever possible.

And honestly, I’m not really worried about non-Tails files on the system partition. I’m not 100% but I think that it’s hard to do (on Windows a final Tails doesn’t appear as mountable, right?) so people doing that are making extra effort to do so. It’s not about a flaky UX allowing people to copy stuff there without really knowing what they do, isn’t it?

#16 Updated by sajolida 2018-01-19 18:22:49

Over XMPP intrigeri clarified that my proposal implied too much work for 3.5 and that we should instead rely on the mechanism that we already have to dismiss USB sticks that are too small (< 8 GB) with a custom log message. See the picture in attachment.

Recap: Who are we talking to?

  • People who are in a Tails with a system partition of 4 GB and > 2.5 GB
    of system content. So people cloning from friends or people who have multiple Tails.But not people doing a manual upgrade.
  • “Cloning” is not impossible to the destination USB stick in general. It’s just not possible to clone this particular source to this particular destination.
  • The solution is to upgrade from ISO.

My proposal log message:

To upgrade device "2.0 GB USB DISK 2.0 device (/dev/sdc/)" from this Tails, you need
to use a downloaded Tails ISO image:
https://tails.boum.org/install/download

But that made me realize that a different list of USB sticks should be displayed whether the option “Clone the current Tails” is selected (“not listed”) or the “Use a downloaded Tails ISO image” is selected (“listed”). How would that work with the current code?

#17 Updated by intrigeri 2018-01-20 07:00:13

  • QA Check changed from Info Needed to Dev Needed

> Recap: Who are we talking to?

> * People who are in a Tails with a system partition of 4 GB and > 2.5 GB of system content. So people cloning from friends or people who have multiple Tails.But not people doing a manual upgrade.
> * “Cloning” is not impossible to the destination USB stick in general. It’s just not possible to clone this particular source to this particular destination.
> * The solution is to upgrade from ISO.

Agreed. It seems too complicated to explain that one could actually clone another Tails whose system partition is less full.

> My proposal log message:

Sounds great, thanks.

> But that made me realize that a different list of USB sticks should be displayed whether the option “Clone the current Tails” is selected (“not listed”) or the “Use a downloaded Tails ISO image” is selected (“listed”). How would that work with the current code?

It seems easy to refresh the list of target devices when the user toggles one of the ISO/clone radio buttons. I’ll try it and see how it works in the real world.

#18 Updated by intrigeri 2018-01-20 08:16:47

  • QA Check changed from Dev Needed to Info Needed

sajolida wrote:
> But I think that the interactions could be the same independently of the type of space check, right?

Yes, if the “deep” check can be done when populating or refreshing the list of candidate target devices without causing too much UX and/or implementation trouble.

> And honestly, I’m not really worried about non-Tails files on the system partition. I’m not 100% but I think that it’s hard to do (on Windows a final Tails doesn’t appear as mountable, right?) so people doing that are making extra effort to do so. It’s not about a flaky UX allowing people to copy stuff there without really knowing what they do, isn’t it?

I think there are 2 very different cases:

Tails devices installed with Tails Installer

Yes, we set the hidden flag on the system partition; but I didn’t verify this was actually the case for many years; in particular, I’ve not heard any of us reporting about this topic for recent versions of Windows. Crossing fingers.

Tails devices installed with UUI or similar

I don’t think these tools hide the partition where they install Tails; and they don’t empty it so if the user does not check the “format” checkbox, that partition can very well already hold non-Tails files. I believe this explains why most of the problematic situations users have met wrt. non-Tails files on the system partition were about devices that were installed with UUI or similar. Now, I’ve just realized that for such devices, the problem this ticket is about has always been a possibility because we don’t control the size of the system partition used/created by these tools: technically they could very well be used to install Tails to a 1.5 GiB partition even if that’s unlikely in practice. Still, even if Feature #12705 did not introduce the possibility of such problems for these sticks, it does increase the chances the problem happens.

So in this case, I suspect it is “about a flaky UX allowing people to copy stuff there without really knowing what they do”.

The cheapest way to solve this would be to improve the existing check we do after one has clicked “Upgrade”: it could be done after one has clicked “Upgrade” but before deleting anything, and it would check that free space + how much would be freed by deleting Tails system files >= total size of files we want to clone. That’s clearly worse UX than the “deep” check proposed earlier if that one can be implemented without itself causing UX trouble when populating or refreshing the list of candidate target devices.

Anyway, for now I’ll complete my WIP implementation of the “superficial” check and then we can prioritize the possible next steps (implementing a cheap check as suggested in the previous paragraph; taking a look at how the deep check could be implemented without harming UX; revamping the interactions as you’ve proposed).

#19 Updated by intrigeri 2018-01-20 08:47:02

Hold on, I have good news. I’ve taken a step back and thought “why exactly am I assuming that we have to mount a candidate target filesystem in order to know how much free space is in there?” and found the fatcat program that can output such info about a FAT filesystem without mounting it. It does require direct read access to the partition block device, which we have in Tails (/etc/udev/rules.d/99-make-removable-devices-user-writable.rules), which is what matters because we only care about cloning here (luckily, as we cannot count on such access outside of Tails). It only works reliably on unmounted filesystems, but when the filesystem is already mounted we can get the free space in another way already.

So my current plan is:

  1. complete my WIP implementation of the superficial check for Tails 3.5, including an automated test case
  2. then, either for Tails 3.5 (if time allows), or later, or never (we need to discuss that): use fatcat to implement the deep check and/or revamp the interactions as designed by sajolida

#20 Updated by intrigeri 2018-01-20 08:48:58

  • Subject changed from Have the Installer detect when "Clone & Upgrade" won't fit in the destination partition before deleting anything to Have the Installer detect when "Clone the current Tails" won't fit in the destination partition before deleting anything

#21 Updated by intrigeri 2018-01-20 10:07:21

  • QA Check changed from Info Needed to Dev Needed

intrigeri wrote:
> Hold on, I have good news. I’ve taken a step back and thought “why exactly am I assuming that we have to mount a candidate target filesystem in order to know how much free space is in there?” and found the fatcat program that can output such info about a FAT filesystem without mounting it.

It’s far from trivial to use fatcat to compute the size of system files on an (unmounted) candidate system partition: one has to run fatcat multiple times, recursively, parsing its output every time. Too bad, it felt this could be a better approach than fixing bugs in our code base, which I have to do in order to get the partition size I need :/

#22 Updated by intrigeri 2018-01-20 14:38:50

  • % Done changed from 10 to 20

@anonym: my branch will have much bigger changes than I would have hoped. Just like you noticed last time you touched this code base iirc, the way the code is designed (sic) and organized makes it basically impossible to modify anything without having to fix or clean up tons of other stuff. Welcome to domino effect land. I think it’ll make your review harder than I would have hoped, sorry! The good news is that the resulting code is cleaner and better documented.

#23 Updated by intrigeri 2018-01-20 16:27:15

sajolida wrote:
> * “Cloning” is not impossible to the destination USB stick in general. It’s just not possible to clone this particular source to this particular destination.
> * The solution is to upgrade from ISO.

@sajolida: while testing I realized that it’s a tiny bit more subtle than that: what’s impossible is upgrading by cloning. As long as the target device is 8GB or bigger, and has no persistent volume, another solution would be to “force reinstall”, which would be a nice way for people who don’t use persistence to reset their Tails to use the new default system partition size. My branch does not offer that option right now (whereas Tails 3.4 would) because as opposed to your proposed design, in the cheap approach I went with, target devices that have a too small system partition for upgrading by cloning are not listed at all, whereas we display the “Force reinstall” button only when a device we consider as valid is selected. I don’t think I can fix that for Tails 3.5. But it would be good to keep this in mind for when someone implements the new design you’ve proposed. So basically we’re making a trade-off here: we introduce a limitation in the upgrade path in some cases (without the check I’m adding, a device with a too small system partition for upgrading by cloning would be listed, and then either the user would have the possibility to force-reinstall it… even though chances are that the user would instead just click Upgrade and get a non-bootable stick as a result) in order to ensure we don’t make a Tails stick unbootable. IMO it’s a sensible trade-off but you’re the boss here.

It seems to me that this situation (having to make a trade off) is a consequence of the removal of the splash screen (that would have allowed us to tell the user "too small but you can ‘clone and install’): restricting the user’s choice, is setting the bar pretty high wrt. how clever our code must be to handle corner cases… and when we don’t, or can’t, meet this level of expectations, we get bugs and usability regressions. That’s tough given the current state of this code base. Anyway, we’re here and we’ll deal with it somehow.

#24 Updated by intrigeri 2018-01-20 21:18:28

  • Assignee changed from intrigeri to anonym
  • % Done changed from 20 to 50
  • QA Check changed from Dev Needed to Ready for QA

Passes the (non-full, but only the UEFI scenario is fragile in the usb_* tests) test suite on my Jenkins.

#25 Updated by anonym 2018-01-22 14:12:31

  • Status changed from In Progress to Fix committed
  • Assignee deleted (anonym)
  • % Done changed from 50 to 100
  • QA Check changed from Ready for QA to Pass

intrigeri wrote:
> Passes the (non-full, but only the UEFI scenario is fragile in the usb_* tests) test suite on my Jenkins.

Same! FWIW I also tested it manually (I installed an image built from the feature branch to USB, filled the Tails partition, inserted a Tails 3.0 install, and got the expected message). The code changes looks marvelous (kill, kill, kill)! I’ve merged this + built and uploaded Tails Installer 5.0.4.

#26 Updated by anonym 2018-01-23 04:52:43

  • Status changed from Fix committed to In Progress

Applied in changeset commit:7a5ca42da0df91730ffb53a39771098cd129dff0.

#27 Updated by intrigeri 2018-01-23 08:58:10

  • Status changed from In Progress to Fix committed

#28 Updated by anonym 2018-01-23 19:50:29

  • Status changed from Fix committed to Resolved