Bug #11592
Step "[...] has loaded in the Tor Browser" is fragile
100%
Description
We noticed while referencing failures on Jenkins that this step sometimes fails (see Bug #11087#note-9). It seems that the Tor Browser has trouble loading the startup page, probably because of the Tor bootstrapping issues that the Chutney integration did not completely solve. A workaround could be to add some retry logic here.
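As a rough illustration of the retry logic suggested above, here is a minimal Ruby sketch. The helper name `with_retries`, its parameters and the simulated page load are all hypothetical; nothing here is taken from the Tails test suite.

```ruby
# Hypothetical helper (not part of the Tails test suite): run the block up to
# `attempts` times, calling `recover` after each failed attempt before retrying.
def with_retries(attempts: 3, recover: -> {})
  tries = 0
  begin
    tries += 1
    yield tries
  rescue StandardError => e
    # Give up and re-raise once the attempt budget is exhausted.
    raise e if tries >= attempts
    recover.call
    retry
  end
end

# Stand-in for "the startup page has loaded": fails twice, then succeeds.
loads = [false, false, true]
result = with_retries(attempts: 3) do |n|
  raise "startup page not loaded (attempt #{n})" unless loads[n - 1]
  "loaded on attempt #{n}"
end
puts result
```

In a real step, the `recover` callable is where something like reloading the page or forcing a new Tor circuit would go; which of these is appropriate is an open question in this ticket.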
Related issues
Related to Tails - Bug #10381: The "I open the address" steps are fragile | Resolved | 2015-10-15 |
Related to Tails - Bug #17007: JavaScript sometimes blocked on Tor Browser first start ⇒ "Watching a WebM video over HTTPS" and "Playing an Ogg audio track" scenarios are fragile: blocked by NoScript click-to-play | Confirmed | |
Blocks Tails - Feature #16209: Core work: Foundations Team | Confirmed | |
History
#1 Updated by bertagaz 2016-07-22 05:22:23
- related to Bug #10381: The "I open the address" steps are fragile added
#2 Updated by bertagaz 2016-07-22 05:28:59
- Feature Branch set to test/11592-load-page-in-torbrowser-is-fragile
Marking related scenarios as fragile.
#3 Updated by intrigeri 2016-07-22 05:31:13
- Target version deleted (Tails_2.6)
#4 Updated by intrigeri 2016-07-22 05:54:42
- Feature Branch changed from test/11592-load-page-in-torbrowser-is-fragile to wip/test/11592-load-page-in-torbrowser-is-fragile
#5 Updated by intrigeri 2016-07-22 11:27:22
- Affected tool set to Browser
#6 Updated by intrigeri 2017-03-05 14:35:59
- Priority changed from Normal to Elevated
This affects most Tor Browser scenarios, which means we have to run them locally during the release process. Hence bumping priority.
#7 Updated by intrigeri 2018-07-03 12:28:17
Tor Browser 8 hides the stop/reload button. On the branch for Feature #15023 I’m applying a hack to display it in Tails in order to avoid breaking our test suite. Whenever we work on this ticket, let’s take this opportunity to drop the hack and stop relying on the Reload button.
#8 Updated by intrigeri 2019-08-19 05:33:40
While analyzing full test suite failures on the stable branch today, I noticed that this step is the primary reason for failures there, so I looked closer and here’s what I found out:
- I’ve not seen cases when the startup page did not load at all. That is, presumably, we’re not in one of these situations where some retrying / retry_tor would help.
- Sometimes the startup page does load but the piece of JS that sets document.title = "Tails" does not run, and then the browser window’s title is still “Tails - Trying a testing version of Tails - Tor Browser”, which makes the step fail. I guess this can be explained by:
  - a network issue preventing the JS from being loaded → unlikely since all other resources are nicely loaded
  - a bug in Tor Browser
  - the VM being too busy to execute the JS fast enough to satisfy the expectations set by this step
- The Journal was not saved for any of these failures, which makes it hard to analyze what’s going on. My understanding of the code is that it’s because the remote shell is not “up”, i.e. it’s not replying within 3 seconds, which can indicate the VM is too busy or has crashed.
- In this step, we do try_for(60) { torbrowser.child?(expected_title, roleName: 'frame') }, which suggests we’re inclined to retry the search if it fails. But in the debug log I see that the corresponding dogtail code is run only once, never returns, and then try_for times out after 60 seconds. I suspect that dogtail keeps the VM busy, which would explain the lack of a Journal artifact, but also possibly the fact the JS did not run. I don’t know why dogtail is taking so long but I suspect the browser, while loading the page (and then changing its title), is creating/deleting/updating tons of a11y objects, which dogtail might have a hard time keeping track of, therefore causing either huge CPU usage or consuming enough memory that the remote shell and the browser are stuck.
So I’m tempted to add a stupid sleep before this try_for, to give the browser some time to run the JS before we overload the VM with a dogtail search. This might or might not work, but at least it should help validate/invalidate my hunch.
Thoughts?
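The "sleep before the search" idea can be sketched in Ruby as follows. `poll_after_grace` is a simplified, hypothetical stand-in for a try_for-style helper, not the Tails implementation, and the lambda simulating the title change is made up for the example.

```ruby
# Simplified stand-in for a try_for-style helper (NOT the Tails implementation):
# wait `grace` seconds, then poll the block until it returns a truthy value or
# `timeout` seconds have elapsed since polling started.
def poll_after_grace(timeout:, grace: 0, delay: 0.1)
  sleep(grace)
  deadline = Process.clock_gettime(Process::CLOCK_MONOTONIC) + timeout
  loop do
    return true if yield
    return false if Process.clock_gettime(Process::CLOCK_MONOTONIC) >= deadline
    sleep(delay)
  end
end

# Simulated browser: the page title changes to "Tails" 0.2 s after page load.
started = Process.clock_gettime(Process::CLOCK_MONOTONIC)
current_title = lambda do
  elapsed = Process.clock_gettime(Process::CLOCK_MONOTONIC) - started
  elapsed >= 0.2 ? 'Tails' : 'Tails - Trying a testing version of Tails'
end

# The grace period lets the title change happen before the first check runs.
found = poll_after_grace(timeout: 2, grace: 0.3, delay: 0.05) do
  current_title.call == 'Tails'
end
puts found
```

Note that this loop only checks the deadline between attempts, so it could not interrupt a check that never returns, which is the behaviour described above for the dogtail call.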
#9 Updated by intrigeri 2019-08-19 07:47:10
- Status changed from Confirmed to In Progress
Applied in changeset commit:tails|c557e759cb7b6633f236c5868f706ae3406c2f48.
#10 Updated by intrigeri 2019-08-19 07:47:42
- Status changed from In Progress to Confirmed
- Feature Branch changed from wip/test/11592-load-page-in-torbrowser-is-fragile to test/11592-load-page-in-torbrowser-is-fragile+force-all-tests
intrigeri wrote:
> While analyzing full test suite failures on the stable branch today, I noticed that this step is the primary reason for failures there
Same on the devel branch.
> So I’m tempted to add a stupid sleep before this try_for, to give the browser some time to run the JS before we overload the VM with a dogtail search. This might or might not work, but at least it should help validate/invalidate my hunch.
Trying this on the branch, we’ll see how it goes on Jenkins.
#11 Updated by intrigeri 2019-08-19 08:43:18
- Status changed from Confirmed to In Progress
- Assignee set to intrigeri
- Target version set to Tails_3.16
(All these metadata changes are for evaluating the aforementioned workaround and, if it works, getting it merged.)
#12 Updated by intrigeri 2019-08-19 08:43:34
- blocks Feature #16209: Core work: Foundations Team added
#13 Updated by intrigeri 2019-08-21 13:37:18
- Status changed from In Progress to Confirmed
- Assignee deleted (intrigeri)
- Target version deleted (Tails_3.16)
- Feature Branch deleted (test/11592-load-page-in-torbrowser-is-fragile+force-all-tests)
intrigeri wrote:
> intrigeri wrote:
> > So I’m tempted to add a stupid sleep before this try_for, to give the browser some time to run the JS before we overload the VM with a dogtail search. This might or might not work, but at least it should help validate/invalidate my hunch.
>
> Trying this on the branch, we’ll see how it goes on Jenkins.
Sleeping 5s does not seem to help at all.
#14 Updated by intrigeri 2019-08-26 17:19:16
- Status changed from Confirmed to In Progress
Applied in changeset commit:tails|b69a4dc6ab5094b4682ddf75f77d34d7fa51f330.
#15 Updated by intrigeri 2019-08-26 17:20:56
- Assignee set to intrigeri
- Target version set to Tails_4.0
- Feature Branch set to test/11592-load-page-in-torbrowser-is-fragile+force-all-tests
Giving it another try in the hope this allows us to keep running the full test suite by default on the testing & devel branches.
I’ve seen this step pass on this branch locally with the two page titles that one can see in practice, which was my goal. Let’s see how it goes on Jenkins :)
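The ticket does not show what the changeset does, so the following Ruby sketch is only one possible reading of "pass with the two page titles": treat either window title as proof that the startup page loaded. The constant, the method name and the first title (which assumes the same " - Tor Browser" suffix once the JS has run) are assumptions for illustration.

```ruby
# Illustrative titles only. The second one is quoted earlier in this ticket;
# the first assumes the " - Tor Browser" suffix is kept after the JS has run.
EXPECTED_TITLES = [
  'Tails - Tor Browser',
  'Tails - Trying a testing version of Tails - Tor Browser'
].freeze

# Hypothetical check: true if the window title is any of the accepted titles.
def startup_page_title?(window_title, expected = EXPECTED_TITLES)
  expected.include?(window_title)
end

puts startup_page_title?('Tails - Tor Browser')
puts startup_page_title?('Tails - Trying a testing version of Tails - Tor Browser')
# Made-up title standing in for a page that failed to load.
puts startup_page_title?('Problem loading page - Tor Browser')
```

Accepting both titles would make the step independent of whether the title-setting JS has run, at the cost of no longer detecting that the JS was blocked.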
#16 Updated by intrigeri 2019-08-27 18:44:46
- Status changed from In Progress to Needs Validation
- Assignee deleted (intrigeri)
Same on 5 runs on Jenkins, which is unheard of: recently, almost every test suite run there triggers this bug. So it looks like my workaround works!
#17 Updated by segfault 2019-08-28 12:38:32
LGTM
#18 Updated by segfault 2019-08-28 12:38:44
- Status changed from Needs Validation to Resolved
- % Done changed from 0 to 100
Applied in changeset commit:tails|3e032ce42561ba831e49301e7a04992fb657be0c.
#19 Updated by intrigeri 2019-09-14 05:41:57
- related to Bug #17007: JavaScript sometimes blocked on Tor Browser first start ⇒ "Watching a WebM video over HTTPS" and "Playing an Ogg audio track" scenarios are fragile: blocked by NoScript click-to-play added