Bug #9691
Tails Installer has to workaround race conditions in UDisks2
100%
Description
Tails Installer with UDisks2 has race conditions in partition_device: the partition table or the partition often doesn’t exist right after its creation.
Related issues
Related to Tails - |
Resolved | 2015-03-15 | |
Related to Tails - |
Rejected | 2015-11-24 | |
Related to Tails - |
Resolved | 2015-12-07 | |
Related to Tails - |
Rejected | 2015-12-07 |
History
#1 Updated by alant 2015-07-05 10:08:31
- Parent task set to Feature #8290
#2 Updated by intrigeri 2015-07-05 10:53:15
- Subject changed from Tails installer with UDisks2 has race conditions to Tails Installer with UDisks2 has race conditions
- Status changed from New to Confirmed
- Priority changed from Normal to High
#3 Updated by intrigeri 2015-07-05 10:53:46
- Target version set to Tails_1.5
#4 Updated by intrigeri 2015-07-06 13:24:30
- Assignee set to intrigeri
#5 Updated by alant 2015-07-06 16:33:41
- File bug_dbus_nocache.py added
I recreated the dbus object before every method call and I still sometimes get an error when trying to work on the partition:
dbus.exceptions.DBusException: org.freedesktop.DBus.Error.UnknownMethod: No such interface
'org.freedesktop.UDisks2.Partition' on object at path /org/freedesktop/UDisks2/block_devices/sdb1
The example script is bug_dbus_nocache.py
#6 Updated by intrigeri 2015-07-08 08:14:09
- Parent task deleted (Feature #8290)
#7 Updated by intrigeri 2015-07-08 08:28:34
- Related to Feature #8290: Port Tails Installer to UDisks2 added
#8 Updated by intrigeri 2015-08-08 10:28:20
- Target version changed from Tails_1.5 to Tails_1.6
To clarify, the task I’ve committed to do is to reproduce locally thanks to Alan’s amazing set of scripts, and then to report this upstream. I thought I could perhaps have time to do it in the last few days before 1.5, but it now seems to be completely impossible => postponing.
#9 Updated by romeopapa 2015-08-11 15:21:30
- File bug_dbus_nocache_sleep.py added
FYI, I’ve been able to complete the process for bug_dbus_nocache.py.
After the changes, I no longer get the following error:
dbus.exceptions.DBusException: org.freedesktop.DBus.Error.UnknownMethod: No such interface
'org.freedesktop.UDisks2.Partition' on object at path /org/freedesktop/UDisks2/block_devices/sdb1
I’ve simply introduced a 3-second sleep (1 second didn’t seem to be enough) before calling SetType and SetName (both would fail in the same way).
Please note that this does not fix the following handled exception, triggered by a call to call_format_sync (it did not happen every time, but it did happen most of the time):
org.freedesktop.UDisks2.Error.Failed: Error wiping newly created partition /dev/sdb1: Command-line
`wipefs -a "/dev/sdb1"' exited with non-zero exit status 1: wipefs: error: /dev/sdb1: probing
initialization failed: No such file or directory
Don’t know if that helps or not. I’m going to try to have a look inside UDisks2’s code to see if I can see what’s going on there.
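As an illustration of an alternative to a fixed sleep, here is a minimal sketch that retries a call until it stops raising or a deadline passes. It is not taken from the attached scripts: the helper name `retry_call` and its parameters are hypothetical, and it assumes that the failing call can safely be repeated.

```python
import time


def retry_call(func, retry_on=(Exception,), timeout=10.0, interval=0.5):
    """Call func() until it succeeds or `timeout` seconds have elapsed.

    Hypothetical helper: exceptions listed in `retry_on` trigger another
    attempt; once the deadline has passed, the last one is re-raised.
    """
    deadline = time.monotonic() + timeout
    while True:
        try:
            return func()
        except retry_on:
            if time.monotonic() >= deadline:
                raise
            time.sleep(interval)
```

With dbus-python, one could presumably pass `retry_on=(dbus.exceptions.DBusException,)` and wrap the SetType or SetName call in a small function. This only hides the race the same way the sleep does; it does not fix the underlying problem.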
#10 Updated by intrigeri 2015-08-19 07:47:43
- Assignee changed from intrigeri to alant
Sorry to realize it this late, but these scripts don’t look like minimal test cases I can point upstream to (which was the part I committed to do). Could you please clean them up so that they do the smallest possible amount of work that demonstrates the UDisks2 bugs we’re hitting? (and remove the Tails-specific comments and workarounds)
#11 Updated by intrigeri 2015-08-19 08:44:36
romeopapa wrote:
> I’ve simply introduced a 3-second sleep (1 second didn’t seem to be enough) before calling SetType and SetName (both would fail in the same way).
Thanks. Sorry this ticket’s description was not very clear: we actually had to add many such sleep, sync, etc. workarounds in the Tails Installer source code (feature/jessie branch). The purpose of this ticket is to have the underlying bugs in UDisks2 fixed for real.
Next steps:
- clean up the reproducers
- reproduce on Debian sid
- reproduce on Fedora 22
- reproduce on Fedora 23 alpha
- file a bug report upstream
#12 Updated by intrigeri 2015-08-19 09:48:43
I could not reproduce the bug on a Fedora 22 VM (4 vCPUs, udisks2-2.1.5-1.fc22.i686), with a USB stick redirected with Spice.
Same on Fedora 23 alpha 2 (4 vCPUs, udisks2-2.1.6-1.fc23.x86_64), with a USB stick redirected with Spice.
In a sid VM (both with 2 and 4 vCPUs), I see bug_libudisks_async.py fail:
Creating partition table
...format: finished
Getting partition table
Creating partition
...create partition: finished
Rescanning device
...rescan: finished
Recreating object and reading partition table
partitions: []
Traceback (most recent call last):
  File "./bug_libudisks_async.py", line 89, in rescan_finished
    system_partition = partitions[0]
IndexError: list index out of range
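The trace above shows the partition list still empty right after the rescan. A minimal sketch of a defensive pattern for that case follows: poll the list until it is non-empty instead of indexing it immediately. The function `wait_for_partitions` and its `list_partitions` callable are hypothetical and are not part of bug_libudisks_async.py; how the list is obtained from UDisks2 is left to the caller.

```python
import time


def wait_for_partitions(list_partitions, timeout=10.0, interval=0.5):
    """Poll list_partitions() until it returns a non-empty list.

    Hypothetical helper: `list_partitions` is any zero-argument callable
    that returns the partitions currently known for the device.
    """
    deadline = time.monotonic() + timeout
    while True:
        partitions = list_partitions()
        if partitions:
            return partitions
        if time.monotonic() >= deadline:
            raise TimeoutError("no partition appeared within %.1f s" % timeout)
        time.sleep(interval)
```

A caller would then take `wait_for_partitions(...)[0]` rather than `partitions[0]`, so a partition that never shows up produces an explicit timeout rather than an IndexError.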
#13 Updated by intrigeri 2015-08-21 06:48:33
- File deleted (bug_libudisks.py)
#14 Updated by intrigeri 2015-08-21 06:49:17
- File bug_libudisks.py added
Replaced the reproducer with one that lacks the ugly workarounds that prevent the bug from occurring most of the time.
#15 Updated by intrigeri 2015-08-21 07:21:43
I’ve had another look, and actually it seems that all these reproducers:
- either only demonstrate the fact that udisks doesn’t know about changes it thinks it has failed to apply, which sounds right;
- or are about https://bugs.debian.org/767457 (aka. https://bugs.freedesktop.org/show_bug.cgi?id=85477) that I’ve already reported many months ago.
Maybe this ticket’s description is misleading me, but I could only reproduce the “No such interface ‘org.freedesktop.UDisks2.Partition’ on object […]” symptom when I’ve 1. kept our code that ignores partition creation errors; 2. commented out our follow-up workaround that rescans the partition table. In this case, it’s no big surprise that (2) is needed since udisks believes it has failed to create the partition.
So: could anyone please clarify what new bug these reproducers are supposed to demonstrate? In other words, if the aforementioned bug (partition creation) was fixed, what other bug would we be suffering from?
#16 Updated by alant 2015-08-23 07:25:44
- Assignee changed from alant to intrigeri
- QA Check set to Info Needed
intrigeri wrote:
> So: could anyone please clarify what new bug these reproducers are supposed to demonstrate?
None
> In other words, if the aforementioned bug (partition creation) was fixed, what other bug would we be suffering from?
There are actually two bugs:
- ‘Error synchronizing after initial wipe’ while calling block.call_format_sync(): https://bugs.freedesktop.org/show_bug.cgi?id=76178
- ‘Error wiping newly created partition’ while calling partition_table.call_create_partition_sync(): https://bugs.freedesktop.org/show_bug.cgi?id=85477
Do you still need cleaned-up scripts, and if yes, how am I supposed to demonstrate the 2nd bug without working around the 1st?
#17 Updated by intrigeri 2015-09-22 12:15:48
- Target version changed from Tails_1.6 to Tails_1.7
#18 Updated by intrigeri 2015-10-05 03:08:06
- Priority changed from High to Normal
- Target version changed from Tails_1.7 to Tails_1.8
This is not a blocker for uploading Tails Installer to Debian (we’ve already worked around these bugs), and these bugs are known upstream already => downgrading priority. I’m swamped => postponing once more. If I don’t get to it in time for 1.8 we’ll need to reconsider how this can realistically happen.
#19 Updated by intrigeri 2015-10-05 03:08:19
- Status changed from Confirmed to In Progress
- % Done changed from 0 to 10
#20 Updated by intrigeri 2015-10-05 03:08:40
- Subject changed from Tails Installer with UDisks2 has race conditions to Tails Installer has to workaround race conditions in UDisks2
#21 Updated by intrigeri 2015-12-02 03:47:06
- Related to Feature #10637: Fix DBus bug in Ubuntu added
#22 Updated by intrigeri 2015-12-07 08:33:03
- Related to Bug #10720: Tails Installer freezes when calling system_partition.call_set_name_sync in partition_device added
#23 Updated by intrigeri 2015-12-13 05:43:04
- Status changed from In Progress to Resolved
- Assignee deleted (intrigeri)
- % Done changed from 10 to 100
- QA Check deleted (Info Needed)
alant wrote:
> intrigeri wrote:
> > So: could anyone please clarify what new bug these reproducers are supposed to demonstrate?
>
> None
OK, I was confused about the goal of this ticket apparently.
> > In other words, if the aforementioned bug (partition creation) was fixed, what other bug would we be suffering from?
>
> There are actually two bugs:
>
> * ‘Error synchronizing after initial wipe’ while calling block.call_format_sync(): https://bugs.freedesktop.org/show_bug.cgi?id=76178
This one was closed upstream with an improved error message, and “you can’t sanely wipe a partition when it overlaps with another partition or the entire drive” so we’ll just have to live with our workaround.
> * ‘Error wiping newly created partition’ while calling partition_table.call_create_partition_sync(): https://bugs.freedesktop.org/show_bug.cgi?id=85477
I think I’ve already provided enough info on the upstream ticket.
> Do you still need cleaned up scripts,
Apparently not.
#24 Updated by intrigeri 2016-01-24 19:23:11
- Related to Bug #10988: Tails Installer workarounds for UDisks 2 bugs are not robust enough added