Bug #11521

The check_tor_leaks hook is fragile

Added by intrigeri 2016-06-10 12:48:31. Updated 2019-07-03 08:04:42.

Status:
Resolved
Priority:
Elevated
Assignee:
Category:
Test suite
Target version:
Start date:
2016-06-10
Due date:
% Done:

100%

Feature Branch:
test/11521-always-power-off-vm
Type of work:
Research
Blueprint:

Starter:
Affected tool:
Deliverable for:

Description

https://jenkins.tails.boum.org/job/test_Tails_ISO_feature-5650-rngd/8/ failed (pcap attached) with:

calling as root: systemctl is-system-running
call returned: [0, "running\n", ""]
    And Tor is ready                                                                           # features/step_definitions/common_steps.rb:373
calling as root: date --rfc-2822
call returned: [0, "Fri, 10 Jun 2016 18:18:47 +0000\n", ""]
    Then Tails clock is less than 5 minutes incorrect                                          # features/step_definitions/time_syncing.rb:40
      Time was 0.436531902 seconds off
      Unexpected connections were made:
        #<OpenStruct mac_saddr="50:54:00:f3:87:dc", mac_daddr="52:54:00:53:3c:df", protocol="tcp", saddr="10.2.1.92", daddr="10.2.1.1", sport=55150, dport=5024>.
      <false> is not true. (Test::Unit::AssertionFailedError)
      /usr/lib/ruby/vendor_ruby/test/unit/assertions.rb:55:in `block in assert_block'
      /usr/lib/ruby/vendor_ruby/test/unit/assertions.rb:1588:in `call'
      /usr/lib/ruby/vendor_ruby/test/unit/assertions.rb:1588:in `_wrap_assertion'
      /usr/lib/ruby/vendor_ruby/test/unit/assertions.rb:53:in `assert_block'
      /usr/lib/ruby/vendor_ruby/test/unit/assertions.rb:163:in `block in assert'
      /usr/lib/ruby/vendor_ruby/test/unit/assertions.rb:1593:in `call'
      /usr/lib/ruby/vendor_ruby/test/unit/assertions.rb:1593:in `_wrap_assertion'
      /usr/lib/ruby/vendor_ruby/test/unit/assertions.rb:130:in `assert'
      /var/lib/jenkins/workspace/test_Tails_ISO_feature-5650-rngd/features/support/helpers/firewall_helper.rb:68:in `assert_all_connections'
      /var/lib/jenkins/workspace/test_Tails_ISO_feature-5650-rngd/features/support/hooks.rb:274:in `After'

The pcap shows one initial packet (at time = 0) that’s precisely the one that violates the assertion, then a DHCP transaction 53s later, and then what looks like a Tor bootstrap with chutney 30s later. So, my theory is that the initial packet that triggers the failure is a leftover from the previous scenario, or something like that, and has nothing to do with the current scenario.
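For context, the failing assertion comes from assert_all_connections in features/support/helpers/firewall_helper.rb, which flags any sniffed connection it does not expect. Below is a purely illustrative Ruby sketch of that kind of check; the allow-list and helper names are assumptions for illustration, not the actual firewall_helper.rb code.

    # Purely illustrative sketch of a "no unexpected connections" check;
    # not the real firewall_helper.rb implementation.
    require 'ostruct'
    require 'test/unit/assertions'
    include Test::Unit::Assertions

    # Hypothetical allow-list of destination (address, port) pairs, e.g. the
    # simulated Tor relays started by chutney.
    ALLOWED_DESTINATIONS = [['10.2.1.1', 9050]].freeze

    def assert_all_connections_allowed(connections)
      leaks = connections.reject do |c|
        ALLOWED_DESTINATIONS.include?([c.daddr, c.dport])
      end
      assert(leaks.empty?,
             "Unexpected connections were made:\n  " +
             leaks.map(&:inspect).join("\n  "))
    end

    # A connection like the one in the log above trips the assertion
    # unless its destination is in the allow-list.
    leftover = OpenStruct.new(protocol: 'tcp',
                              saddr: '10.2.1.92', sport: 55150,
                              daddr: '10.2.1.1',  dport: 5024)
    assert_all_connections_allowed([leftover])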


Files


Subtasks


Related issues

Related to Tails - Bug #16148: ICMPv6 leaks detected by test suite Resolved 2018-11-23
Blocks Tails - Feature #16209: Core work: Foundations Team Confirmed

History

#1 Updated by intrigeri 2016-06-10 12:48:59

IIRC it’s the first time I’ve seen this one, so no target version / deliverable for.

#2 Updated by intrigeri 2016-07-21 13:29:43

  • Priority changed from Normal to Elevated

It has happened 4 times in the last 50 days on Jenkins.

#3 Updated by bertagaz 2016-07-22 03:35:55

intrigeri wrote:
> It has happened 4 times in the last 50 days on Jenkins.

That’s mentioned in Bug #11087#note-9. Attaching the pcap just in case.

#4 Updated by intrigeri 2016-07-29 13:51:09

Here’s the pcap for another similar failure.

#5 Updated by anonym 2017-03-03 15:15:42

  • Target version set to Tails_2.12

#6 Updated by anonym 2017-04-19 17:40:18

  • Target version changed from Tails_2.12 to Tails_3.0~rc1

#7 Updated by intrigeri 2017-05-16 16:20:39

  • Target version changed from Tails_3.0~rc1 to Tails_3.0

#8 Updated by intrigeri 2017-05-27 09:23:14

  • Target version changed from Tails_3.0 to Tails_3.1

It would be nice to fix this in 3.0 but that’s not a blocker IMO => postponing.

#9 Updated by anonym 2017-06-29 13:32:34

  • blocks Feature #13239: Core work 2017Q3: Test suite maintenance added

#10 Updated by anonym 2017-07-06 14:15:04

  • Target version changed from Tails_3.1 to Tails_3.2

#11 Updated by intrigeri 2017-09-07 06:35:24

  • Target version changed from Tails_3.2 to Tails_3.3

#12 Updated by intrigeri 2017-10-01 09:58:01

  • blocks Feature #13240: Core work 2017Q4: Test suite maintenance added

#13 Updated by intrigeri 2017-10-01 09:58:03

  • blocked by deleted (Feature #13239: Core work 2017Q3: Test suite maintenance)

#14 Updated by intrigeri 2017-11-10 14:16:29

  • Target version changed from Tails_3.3 to Tails_3.5

#15 Updated by anonym 2017-12-07 12:36:15

  • Target version deleted (Tails_3.5)

#16 Updated by intrigeri 2018-01-01 16:58:41

  • blocked by deleted (Feature #13240: Core work 2017Q4: Test suite maintenance)

#17 Updated by intrigeri 2019-02-16 06:06:30

  • related to Bug #16148: ICMPv6 leaks detected by test suite added

#18 Updated by anonym 2019-06-11 14:33:55

  • Status changed from Confirmed to In Progress

Applied in changeset commit:tails|3b9a83f0777d9845623dcb78d3f57831d0fd6e68.

#19 Updated by anonym 2019-06-11 14:37:21

  • Target version set to Tails_3.15
  • % Done changed from 0 to 40
  • Feature Branch set to test/11521-always-power-off-vm

First of all, I grepped for "Unexpected connections were made" over all debug.log files from the past three months, and it occurred 56 times. I randomly picked three of them, and all three had this bad first packet, so this bug is still affecting us.

intrigeri wrote:
> So, my theory is that the initial packet, that triggers the failure, is a leftover of the previous scenario, or something, and has nothing to do with the current scenario.

It is a great theory! :)

Note that this first packet is sent by the Tails under testing and not the router. Looking into our hooks (features/support/hooks.rb) I see that none of the After hooks shut down the VM; instead we seem to rely on either one of the generated “snapshot restore” steps or the [Given] a computer step to implicitly get rid of the old VM… in the next scenario. That means the old VM is still running during the new scenario’s Before hooks, where we start the sniffer. This finding is pretty conclusive: there is a window where the previous scenario’s network traffic leaks into the next one’s sniffer.

So, let’s just shut down the VM in the last After hook.
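Something roughly like the sketch below; the hook placement and the power_off method name are assumptions about the test suite’s hooks and VM helper, not the exact change on the feature branch:

    # Sketch only: a last After hook in features/support/hooks.rb
    # (hypothetical; see the feature branch for the real change).
    After do
      # Power off the scenario's VM so it cannot keep emitting traffic while
      # the next scenario's Before hook starts its sniffer.
      $vm.power_off
    end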

#20 Updated by anonym 2019-06-13 08:08:10

  • Status changed from In Progress to Needs Validation
  • Assignee deleted (anonym)

Out of 3 runs I only saw unrelated failures, so this branch doesn’t seem to introduce any regressions (and I don’t see why @fragile scenarios would behave any differently, so I’m ignoring that these were not full runs).

I propose we just merge this branch and close this ticket, and reopen (or file a new one) if the problem reappears.

#22 Updated by anonym 2019-06-13 08:15:59

#23 Updated by intrigeri 2019-06-14 08:46:43

  • Assignee set to intrigeri

#24 Updated by intrigeri 2019-06-14 08:52:43

  • Status changed from Needs Validation to In Progress
  • Assignee changed from intrigeri to anonym

LGTM, but just one thing: the only other place in this hook where we use $vm, we guard it with if $vm; the rationale you provided in commit:1c387ab37779308980c870157c8d6d0aa752294b seems to apply here as well, so maybe add if $vm here too?
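In other words, something like this sketch (again assuming a power_off method on the VM helper; names are illustrative):

    # Guard against scenarios where no VM was ever started, so the hook
    # does not crash on a nil $vm.
    After do
      $vm.power_off if $vm
    end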

#25 Updated by anonym 2019-06-14 13:00:05

  • Status changed from In Progress to Fix committed
  • % Done changed from 40 to 100

Applied in changeset commit:tails|310e3d9898cf3c6bcdc9005aaa6b9d5c1534da04.

#26 Updated by anonym 2019-06-17 07:41:35

  • Assignee deleted (anonym)

#27 Updated by intrigeri 2019-07-03 08:04:42

  • Status changed from Fix committed to Resolved
  • Target version changed from Tails_3.15 to Tails_3.14.1