Bug #10720

Tails Installer freezes when calling system_partition.call_set_name_sync in partition_device

Added by intrigeri 2015-12-07 08:32:39. Updated 2016-07-29 06:55:45.

Status: Resolved
Priority: Elevated
Assignee:
Category: Installation
Target version:
Start date: 2015-12-07
Due date:
% Done: 100%
Feature Branch: bugfix/10720-installer-freezes-on-jenkins
Type of work: Code
Blueprint:
Starter:
Affected tool: Installer
Deliverable for:

Description

This ticket is now superseded by Bug #11590 and Bug #11588.

As reported on Bug #10717#note-1:

The root cause might live in UDisks, in QEMU, or in the Linux kernel, but realistically our best option is probably to act as if it was merely a race condition in UDisks, and add yet another workaround in Tails Installer. Doing it on the 4.x branch should be enough.


Files


Subtasks


Related issues

Related to Tails - Bug #10717: Concerning amount of test suite runs aborted on Jenkins due to timeout (Rejected, 2015-12-06)
Related to Tails - Bug #9691: Tails Installer has to workaround race conditions in UDisks2 (Resolved, 2015-07-05)
Blocked by Tails - Bug #10907: usb_install.feature fails when run as part of the entire test suite (Resolved, 2016-01-12)

History

#1 Updated by intrigeri 2015-12-07 08:32:51

  • related to Bug #10717: Concerning amount of test suite runs aborted on Jenkins due to timeout added

#2 Updated by intrigeri 2015-12-07 08:33:02

  • related to Bug #9691: Tails Installer has to workaround race conditions in UDisks2 added

#3 Updated by intrigeri 2015-12-07 08:47:32

  • Feature Branch set to bugfix/10720-installer-freezes-on-jenkins

#4 Updated by intrigeri 2015-12-09 14:05:22

I think I should tell the test suite to run the Installer with DEBUG=1, and to gather the debug log as part of the Jenkins artifacts somehow.

#5 Updated by intrigeri 2015-12-14 16:33:28

  • Status changed from Confirmed to In Progress

#6 Updated by intrigeri 2015-12-18 13:25:07

intrigeri wrote:
> I think I should tell the test suite to run the Installer with DEBUG=1, and to gather the debug log as part of the Jenkins artifacts somehow.

Done, so next runs of https://jenkins.tails.boum.org/view/Raw/job/test_Tails_ISO_bugfix-10720-installer-freezes-on-jenkins/ should hopefully tell me more about this problem :)
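
A minimal sketch of the idea, in Python for illustration only. How the test suite actually does this is not shown in this ticket, and `run_with_debug_log` and the log file name are hypothetical. It runs a command with DEBUG=1 in its environment and keeps the combined output in a file that a CI job could then archive:

```python
import os
import subprocess
import sys
import tempfile


def run_with_debug_log(command, log_path):
    """Run `command` with DEBUG=1 set and write stdout+stderr to `log_path`.

    Hypothetical helper: the name and the log location are assumptions.
    Returns the command's exit status.
    """
    env = dict(os.environ, DEBUG="1")
    with open(log_path, "wb") as log:
        completed = subprocess.run(
            command, env=env, stdout=log, stderr=subprocess.STDOUT
        )
    return completed.returncode


if __name__ == "__main__":
    # Stand-in command that only echoes the DEBUG variable it received.
    demo = [sys.executable, "-c", "import os; print(os.environ['DEBUG'])"]
    log_file = os.path.join(tempfile.mkdtemp(), "installer-debug.log")
    print(run_with_debug_log(demo, log_file), log_file)
```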

#7 Updated by intrigeri 2016-01-11 02:16:15

Strangely, I don’t see this failure anymore. But all recent builds fail “Scenario: Booting Tails from a USB drive without a persistent partition and creating one” and the next scenarios that use the “I have started Tails without network from a USB drive without a persistent partition and stopped at Tails Greeter’s login screen” snapshot: the video only shows “No bootable device”. What’s interesting is that “I can view and print a PDF file stored in persistent /home/amnesia/Persistent” and “Watching MP4 videos stored on the persistent volume should work as expected given our AppArmor confinement” pass, while they use a snapshot that is a child of the one that we fail to restore. I suspect there is a problem with the platform, possibly triggered only when we run the entire test suite at once.

#8 Updated by intrigeri 2016-01-11 02:19:13

Next debugging step: reorder features to run usb_install.feature first.

#9 Updated by intrigeri 2016-01-12 11:21:45

intrigeri wrote:
> Next debugging step: reorder features to run usb_install.feature first.

Done in the topic branch. The USB install tests then passed for the first time in a while, but a couple of other persistence-using tests started failing again the way they did a few weeks ago, before these “No bootable device” issues started to show up.

#10 Updated by intrigeri 2016-01-12 13:17:38

  • blocked by Bug #10907: usb_install.feature fails when run as part of the entire test suite added

#11 Updated by intrigeri 2016-01-16 13:54:11

  • Target version changed from Tails_2.0 to Tails_2.2

I still don’t see this bug anymore. I’ll look at the results again during the 2.2 cycle.

#13 Updated by intrigeri 2016-01-24 18:53:46

The line that triggers this error is: system_partition.call_set_name_sync(self.label, GLib.Variant('a{sv}', None))

#14 Updated by intrigeri 2016-01-24 18:57:18

  • related to Bug #10987: Tails Installer sometimes fails with: No support for modifying a partition a table of type `PMBR' added

#15 Updated by intrigeri 2016-01-24 19:17:05

> The line that triggers this error is: […]

I’ve asked Alan, who added this line in a commit that didn’t really document why, if he had any idea what it is useful for.

#16 Updated by intrigeri 2016-01-24 19:23:28

#17 Updated by intrigeri 2016-01-24 19:23:51

  • Subject changed from Tails Installer freezes on Jenkins to Tails Installer freezes in partition_device on Jenkins

#18 Updated by intrigeri 2016-01-24 19:25:18

  • related to deleted (Bug #10987: Tails Installer sometimes fails with: No support for modifying a partition a table of type `PMBR')

#19 Updated by intrigeri 2016-01-24 19:35:36

  • Subject changed from Tails Installer freezes in partition_device on Jenkins to Tails Installer freezes when calling system_partition.call_set_name_sync in partition_device

#20 Updated by intrigeri 2016-01-24 19:38:58

The affected code is:

        # XXX: sometimes fails (https://labs.riseup.net/code/issues/10987)
        system_partition.call_set_type_sync(ESP_GUID, GLib.Variant('a{sv}', None))
        # XXX: sometimes fails (https://labs.riseup.net/code/issues/10720)
        system_partition.call_set_name_sync(self.label, GLib.Variant('a{sv}', None))

… and given the error message, I wonder if we need to wait for something between these two statements.
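
One generic way to give UDisks some time between the two statements is to retry the second call a few times with a short pause. The helper below is only a sketch of that pattern: `call_with_retries` is a hypothetical name, not something Tails Installer is known to contain, and it only helps when the call fails with an exception rather than blocking forever.

```python
import time


def call_with_retries(func, attempts=5, delay=0.5, exceptions=(Exception,)):
    """Call func() until it succeeds or `attempts` calls have failed.

    Sleeps `delay` seconds between attempts and re-raises the last error.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(1, attempts + 1):
        try:
            return func()
        except exceptions:
            if attempt == attempts:
                raise
            time.sleep(delay)


# Hypothetical use around the second statement above:
# call_with_retries(
#     lambda: system_partition.call_set_name_sync(
#         self.label, GLib.Variant('a{sv}', None)))
```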

#21 Updated by intrigeri 2016-02-20 11:17:43

intrigeri wrote:
> > The line that triggers this error is: […]
>
> I’ve asked Alan, who added this line in a commit that didn’t really document why, if he had any idea what it is useful for.

… and he tells me that this line was added just in case it might be useful, but according to him, after reading udisks code it doesn’t seem to be useful. So I think I’ll remove that line, and upload a snapshot package to the topic branch’s APT suite, so we can see on Jenkins if it helps or not.

#22 Updated by intrigeri 2016-03-08 13:45:58

  • Target version changed from Tails_2.2 to Tails_2.3

#23 Updated by intrigeri 2016-04-16 15:41:06

  • Target version changed from Tails_2.3 to Tails_2.4

#24 Updated by intrigeri 2016-05-16 13:23:17

  • Target version changed from Tails_2.4 to Tails_2.5

#25 Updated by anonym 2016-06-16 08:09:11

I’ve refreshed the feature branch. I also pinned the fixed tails-installer version so we actually test this with intrigeri’s fix, so we should revert commit:5107c485ff29e48bf983c1432e84089ed706e6ab before potentially merging this branch.

#26 Updated by BitingBird 2016-06-26 09:51:31

  • % Done changed from 0 to 40

#27 Updated by intrigeri 2016-07-19 02:58:00

anonym wrote:
> I’ve refreshed the feature branch. I also pinned the fixed tails-installer version so we actually test this with intrigeri’s fix, so we should revert commit:5107c485ff29e48bf983c1432e84089ed706e6ab before potentially merging this branch.

Tried to fix that pinning with commit:3a243660ed3dc67b389fe57b6ca621807ef0f156, which should therefore be reverted as well.

#28 Updated by intrigeri 2016-07-20 01:48:22

> So I think I’ll remove that line, and upload a snapshot package to the topic branch’s APT suite, so we can see on Jenkins if it helps or not.

With the system_partition.call_set_name_sync call removed, the next statement (_set_partition_flags) fails (see attached screenshot). It’s interesting that this one can fail after call_set_type_sync has worked.

Next steps:

  • sync/settle/wait and retrieve a fresh partition object after call_set_type_sync;
  • fix my tweak to include Tails Installer’s debug log in the test suite’s, as it does not work.
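
The first step can be sketched as a small pattern: apply the change, let the system settle, then look the object up again instead of reusing the old one. Everything below is an assumption for illustration. `apply_and_refetch` and its three callables are hypothetical, and in the real installer the settle step might be something like the UDisks client’s `settle()` method and the lookup a fresh query by object path.

```python
def apply_and_refetch(apply_change, settle, lookup):
    """Run apply_change(), then settle(), then return a fresh lookup().

    All three arguments are caller-supplied callables (hypothetical here).
    Raises LookupError if the object cannot be found after settling.
    """
    apply_change()
    settle()
    fresh = lookup()
    if fresh is None:
        raise LookupError("object not found after settling")
    return fresh


if __name__ == "__main__":
    # Toy stand-ins: a dict plays the role of the object registry.
    registry = {"partition": "stale proxy"}

    def change():
        registry["partition"] = "fresh proxy"

    print(apply_and_refetch(change, lambda: None,
                            lambda: registry.get("partition")))
```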

#29 Updated by intrigeri 2016-07-20 02:16:14

> Next steps:
>
> * sync/settle/wait and retrieve a fresh partition object after call_set_type_sync;

Done in 4.4.10+dfsg-0tails1+bugfix.10720~1.gbp1473f2, building on Jenkins.

> * fix my tweak to include Tails Installer’s debug log in the test suite’s, as it does not work.

Still the case.

#30 Updated by intrigeri 2016-07-20 09:22:56

intrigeri wrote:
> > * sync/settle/wait and retrieve a fresh partition object after call_set_type_sync;
> Done in 4.4.10+dfsg-0tails1+bugfix.10720~1.gbp1473f2, building on Jenkins.

This makes partition_device exit successfully, but then switch_drive_to_system_partition is called and in turn calls _set_drive, which fails with “Cannot find device /dev/sda1”. That happens whenever self.drives.has_key(drive) is false.
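
If the new partition is simply not yet in the drives mapping when _set_drive runs, one possible workaround is to rescan until it shows up. This is a sketch under that assumption, not the fix that was applied: `wait_for_drive` and the `rescan` callable are hypothetical, and `device in drives` is the Python 3 spelling of `drives.has_key(device)`.

```python
import time


def wait_for_drive(drives, device, rescan, timeout=10.0, interval=0.5):
    """Return drives[device], calling rescan() until the key is present.

    `rescan` is expected to refresh `drives` in place. Raises TimeoutError
    if `device` is still unknown once `timeout` seconds have elapsed.
    """
    deadline = time.monotonic() + timeout
    while True:
        if device in drives:
            return drives[device]
        if time.monotonic() >= deadline:
            raise TimeoutError("Cannot find device %s" % device)
        rescan()
        time.sleep(interval)
```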

#31 Updated by intrigeri 2016-07-20 10:02:53

> > * fix my tweak to include Tails Installer’s debug log in the test suite’s, as it does not work.

Done in 4.4.10+dfsg-0tails1+bugfix.10720~3.gbpe9be10.

#32 Updated by intrigeri 2016-07-21 02:57:41

  • related to Bug #11582: Some upgrade test scenarios fail due to lack of disk space on Jenkins added

#33 Updated by intrigeri 2016-07-22 01:34:31

  • related to deleted (Bug #11582: Some upgrade test scenarios fail due to lack of disk space on Jenkins)

#34 Updated by intrigeri 2016-07-22 01:34:41

  • blocks Bug #11582: Some upgrade test scenarios fail due to lack of disk space on Jenkins added

#35 Updated by intrigeri 2016-07-22 03:26:08

Status: I’ve seen a couple successful test suite runs from that branch on Jenkins! I’d like to get the installer robustness improvements produced here into 2.6, so that they benefit human users even though they might not be good enough to mark the tests as non-fragile on Jenkins yet.

#36 Updated by intrigeri 2016-07-22 04:00:10

  • blocked by deleted (Bug #11582: Some upgrade test scenarios fail due to lack of disk space on Jenkins)

#37 Updated by intrigeri 2016-07-22 04:17:56

  • Target version changed from Tails_2.5 to Tails_2.6
  • % Done changed from 40 to 50

intrigeri wrote:
> Status: I’ve seen a couple successful test suite runs from that branch on Jenkins! I’d like to get the installer robustness improvements produced here into 2.6, so that they benefit human users even though they might not be good enough to mark the tests as non-fragile on Jenkins yet.

Merging most of these bits is now tracked as Bug #11590. I have not seen Tails Installer fail on Jenkins yet with these changes in, so technically this ticket should be closed at some point. I’ll do that once we have new tickets tracking the next set of robustness issues, which are being uncovered now that the Installer works (and then the comments about the Bug #10720 fragile tags in *.feature will need an update). I’ll wait until I have a bit more data before I start creating these tickets.

#38 Updated by intrigeri 2016-07-28 08:45:43

  • blocks Bug #11588: Sometimes fails to boot from USB on Jenkins with I/O errors added

#39 Updated by intrigeri 2016-07-29 06:51:42

  • blocked by deleted (Bug #11588: Sometimes fails to boot from USB on Jenkins with I/O errors)

#40 Updated by intrigeri 2016-07-29 06:54:39

  • Description updated
  • Status changed from In Progress to Resolved
  • Target version changed from Tails_2.6 to Tails_2.5
  • % Done changed from 50 to 100

This ticket is now superseded by Bug #11590 and Bug #11588.

#41 Updated by intrigeri 2016-07-29 06:55:45

  • Assignee deleted (intrigeri)