Bug #17425

“double-upgrades” after 4.1.1 → 4.2 (IUKv1) → 4.2.1 (IUKv2)

Added by CyrilBrulebois 2020-01-11 17:37:29 . Updated 2020-01-13 15:03:56 .

Status:
Resolved
Priority:
Urgent
Assignee:
Category:
Target version:
Start date:
Due date:
% Done:

100%

Feature Branch:
bugfix/17425-dont-propose-upgrading-to-running-version
Type of work:
Code
Blueprint:

Starter:
Affected tool:
Deliverable for:

Description

Here’s another issue from the “tests are important, really” department, with this scenario:

  • install an old version of Tails, which contains an “upgrader v1”, from scratch (in my test: 4.1.1)
  • an upgrade to the published 4.2 is offered, which contains an “upgrader v2”
  • upgrade, reboot, verify the advertised version is 4.2 (/etc/os-release)
  • enable the test channel
  • an upgrade to the to-be-published-one-day-maybe 4.2.1 is offered
  • upgrade, reboot, verify the advertised version is 4.2.1 (/etc/os-release)
  • enable the test channel just to be on the safe side
  • the upgrade to 4.2.1 is offered again!

I’ll re-run the whole thing to be extra sure.


Subtasks


Related issues

Blocks Tails - Feature #16209: Core work: Foundations Team Confirmed

History

#1 Updated by intrigeri 2020-01-11 18:16:48

  • Target version set to Tails_4.3
  • Feature Branch set to bugfix/17425-dont-propose-upgrading-to-running-version

Thank you for doing the extra test. It gives us the chance to fix this problem in 4.3 (latest).
I regret not having included this case in the tests I did while working on Feature #15281. I realize it was a serious mistake on my part.
I’m tentatively setting metadata under the (blind) assumption that we’ll release 4.2.1 as-is, document the known issue, and ship the fix in a month (4.3).

> I’ll re-run the whole thing to be extra sure.

Looking at the code, it’s pretty clear to me why this happens, so I don’t think it’s worth spending time on reproducing this once more.
It’s a bug in the Upgrader that’s in 4.2 and in current 4.2.1.

I’m working on a fix as we speak, which is the best I can do at the moment in order to leave all available options open, in case someone can do extra RM’ing & manual testing work in the next few days/weeks (putting further efforts into this myself would bring me to a very risky place so I cannot reasonably do that).

#2 Updated by intrigeri 2020-01-11 18:22:37

> I’m tentatively setting metadata under the (blind) assumption that we’ll release 4.2.1 as-is, document the known issue, and ship the fix in a month (4.3).

While thinking about whether it could possibly be useful that I prepare on a fix today, and before giving up on this line of reasoning and instead deciding I’ll work on a fix no matter what, I took some notes, which I’ll dump as-is:

  • (A) Release 4.2.1 as-is and move on until 4.3
    • Pros:
      • We fix ASAP the nasty security vuln 4.2.1 is all about.
      • I can breathe and focus on other important matters.
    • Cons:
      • Users who have initially installed Tails << 4.2 will have to refuse this upgrade offer every time they start Tails, until they upgrade to 4.3.
      • Our tech writers have to document this known issue.
  • (B) Wait for my fix and redo 4.2.1 from scratch
    • Pros:
      • Users who upgrade to 4.2.1 won’t be affected by this bug. Users who are still running 4.2 can upgrade to 4.2.1 to fix this bug.
      • No extra work for tech writers.
    • Cons:
      • Time to remediation grows further.
      • Someone has to RM 4.2.1 again.
      • Manual testers have to test 4.2.1 again.
      • I have to do the IUK+UDF v1 dance for 4.2.1 again (probably cheaper for me than documenting it or providing live support to the RM), diverting time I would otherwise spend to improve other things for our users and community.
  • © Ship 4.2.1 as-is, then do a 4.2.2 with the fix for this bug, this time skipping the IUK+UDF v1 dance, i.e. forcing users of Tails << 4.2 to first upgrade to 4.2 or 4.2.1.
    • Pros:
      • Users who upgrade to 4.2.2 won’t be affected by this bug.
      • Same as (A)
    • Cons:
      • Someone has to RM & manually test 4.2.2.
      • No direct upgrade path from << 4.2 to 4.2.2: users need to first go through 4.2.1 or 4.2.2.
  • (D) Ship 4.2.1 as-is, then do a 4.2.2 with the fix for this bug and provide direct upgrades, once again, from Tails << 4.2.
    • I don’t think I’ll have time to do the IUK+UDF v1 dance at that time, so it’s probably not a realistic option.

#3 Updated by CyrilBrulebois 2020-01-11 18:33:40

  • Subject changed from “double-upgrades” when following a IUKv1→IUKv2 path to “double-upgrades” after 4.1.1 → 4.2 (IUKv1) → 4.2.1 (IUKv2)

Adjusting title to better reflect the use case, as I filed Bug #17426 separately.

Given the similarity in results, I suppose it might be that the “upgrader v2” fails to determine the actual running version when the system was first installed with a version << 4.2.

I haven’t tested a 4.2 → 4.2.1 upgrade (including the final switch to the test channel) with the rebuilt IUKv2s (Bug #17422) myself yet though. Another possibility could be that the new upgrader cannot determine the version it’s running, regardless of the upgrade path that was followed.

Enough with speculations, I’ll get back to actual tests.

#4 Updated by CyrilBrulebois 2020-01-11 19:24:05

Given @intrigeri confirmed Bug #17426 to be a duplicate, filing my results here: 4.1.1 → 4.2.1 also exhibits the same issue.

#5 Updated by intrigeri 2020-01-11 20:40:46

  • Status changed from Confirmed to In Progress
  • Type of work changed from Research to Code

#6 Updated by CyrilBrulebois 2020-01-11 20:47:26

I’ve tried to take a step back to think a bit more about it, and I think (B) has my strong preference. Not because I like it fundamentally, but because I found it to be less undesirable than the other options.

  • (A) Leaving users out in the cold doesn’t seem like a reasonable option, plus pressure on technical writers/frontdesk.
  • ©+(D) Leaving fixing the issue to a later 4.2.2 means we have to do that work anyway (granted, on a different timescale, but the more we wait, the more people might hit the upgrade issue).
  • © Special-casing releases seems like a recipe for more issues.

I’m absolutely not happy with the again-increasing time-to-remediation that comes with the (B) solution. But I don’t buy any of the other cons: if the upgrader is the only moving part, that doesn’t invalidate emmapeel’s work regarding manual testing; we already have to redo the TR part because of this morning’s IUKv2 regeneration; regarding developer time, it should have been spent before this was deployed into a release, and while I can sympathize with your desire to work on other topics, I’d rather see the broken pieces fixed now.

[Refraining from commenting further; that can wait for a later wrap-up of this hmm let’s say suboptimal situation.]

#7 Updated by intrigeri 2020-01-11 22:26:30

Hi,

> I’ve tried to take a step back to think a bit more about it, and I think (B) has my strong preference.

If we go this way, I would suggest calling it 4.2.2, because trying to emulate “from scratch”, given the starting point we’re at, feels pretty risky.
I have very bad memories of attempts to redo a release, that was almost completed already, while keeping the same version number.

> * © Special-casing releases seems like a recipe for more issues.

FWIW, I don’t see where © implies any special casing: to me it seems that © is about following the release process doc as-is, without non-standard work.
Non-standard work is only required to do IUK+UDF v1 dance — that is, in option (B) and (D).

> I’m absolutely not happy with the again-increasing time-to-remediation that comes with the (B) solution. But I don’t buy any of the other cons: if the upgrader is the only moving part, that doesn’t invalidate emmapeel’s work regarding manual testing; we already have to redo the TR part because of this morning’s IUKv2 regeneration;

OK!

> while I can sympathize with your desire to work on other topics, I’d rather see the broken pieces fixed now.

I appreciate your desire to not leave things in a broken state for our users. I’m fine with dismissing option (A) for this sole reason, if that’s more sustainable for us than (A). And I understand that you would be really unhappy if we did (A), so yeah, let’s fix the Upgrader. I guessed that much, which is why I’ve just spent my evening working on a fix.

The question I’m asking myself is less about my desires than about where my time is better spent for our users and our community.

I can live with doing significant amounts of unpleasant work (I never dreamed of doing accounting or fundraising, for example). I’m fine with re-prioritizing my tasks to spend more time dealing with the fallout of my mistakes in the next couple days, if that’s best for our users & community. To me it’s not a given: IMO, fixing a mistake should not necessarily become top-priority as soon as it’s discovered. We need to carefully balance the short-term urgent and the long-term important. For example:

  • I have other Upgrader work on my plate that will have a beneficial UX impact repeatedly and on the long-term (vs. during a few weeks). I believe this would make a bigger difference to our users, on the mid-term, than a one-off UX problem like the one caused by the option (E) I’m proposing below.
  • I could work on some of the RM-impacting issues you’ve raised during the 4.2.1 release process. As you’re well aware, this is high-priority in order to make RM’ing work sustainable.

Another option would be (E) = (B) without doing the IUK+UDF +1 dance. Compared to (B):

  • cons: users who did not upgrade to 4.2 yet have to first upgrade to 4.2 (using the bits that are already in place), before they can upgrade to 4.2.x.
  • pros:
    • I can work on long-term important things instead of dealing with short-term urgent 4.x << 4.2 → 4.2.x upgrades.
    • No coordination needed between 2 RMs who each have an existing schedule already for the next few days, and have already spent a week RM’ing stuff, coordinating, and dealing with the fallout of Feature #15281, with significant impact on patience, communication and work atmosphere.

This alternate option would have my preference; it looks like a reasonable trade-off to me. How does it sound to you?

But I’d like @sajolida’s assessment: how bad it would it be to require users of Tails << 4.2 to first apply the existing automatic upgrade to 4.2, before they can apply the automatic upgrade to 4.2.x? That is, same thing as the plan you’ve already ACK’ed for 4.3, except it affects more users, because it happens earlier so more folks did not upgrade to 4.2 yet.

Finally, I can live with (B) but the coordination will be tricky: I have a few other commitments in the next few days, that I can’t all reschedule.

#8 Updated by intrigeri 2020-01-11 23:48:13

> I’m working on a fix as we speak

Pushed. I wrote regression tests but as all the incremental upgrade tests we have, they closely simulate a real situation, but it’s not a real situation.

Before submitting this for review, first thing tomorrow I’ll check what Jenkins thinks, and will do some manual testing:

  1. follow the steps kibi did when reporting this ticket and Bug #17426
  2. see the bug happen
  3. the output of sudo journalctl | grep up-to-date should be empty
  4. apply the code changes from this branch to the running system
  5. systemctl --user restart tails-upgrade-frontend.service
  6. ~1 minute later: sudo journalctl | grep up-to-date should tell that the system is up-to-date and I should see no Upgrader dialog

Any other test that could help us gain confidence in this branch?

#9 Updated by intrigeri 2020-01-12 12:10:53

  • Status changed from In Progress to Needs Validation
  • Target version changed from Tails_4.3 to Tails_4.2.2

(Adjusting metadata so that deciding how to proceed is on the 4.2.x radar.)

> Before submitting this for review, first thing tomorrow I’ll check what Jenkins thinks,

Looks good.

> and will do some manual testing:

OK for:

I got tired of waiting 2 extra minutes while rebooting after each upgrade (both during my manual tests and while running the corresponding automated tests), so I included a fix for Bug #17026 in the branch.

And I grew tired of having to re-run the full test suite every time I wanted to retry a single failed test on Jenkins, so I added the ability to pass arbitrary arguments to Cucumber when triggering a run of test_Tails_ISO_* manually (commit 521b5d212d77c216f4425050b3e4926334af7f32 in our jenkins-jobs repo).

#10 Updated by intrigeri 2020-01-12 12:21:04

  • Assignee deleted (intrigeri)

#11 Updated by intrigeri 2020-01-12 20:08:21

#12 Updated by intrigeri 2020-01-13 08:13:50

  • Status changed from Needs Validation to Resolved
  • % Done changed from 0 to 100

Applied in changeset commit:tails|7fe156258d457f684707e3d4bd0cb6ddb05b343c.

#13 Updated by sajolida 2020-01-13 15:03:56

> But I’d like @sajolida’s assessment: how bad it would it be to require users of Tails << 4.2 to first apply the existing automatic upgrade to 4.2, before they can apply the automatic upgrade to 4.2.x? That is, same thing as the plan you’ve already ACK’ed for 4.3, except it affects more users, because it happens earlier so more folks did not upgrade to 4.2 yet.

I’ve discussed this with intrigeri yesterday and, given the state of our
workers, it seems like a very smart compromise.