Bug #16471

Drop time synchronization hacks that tor 0.3.5 and 0.4.x made obsolete

Added by hefee 2019-02-17 14:39:40. Updated 2020-05-08 14:01:13.

Status:
In Progress
Priority:
Normal
Assignee:
anonym
Category:
Time synchronization
Target version:
Start date:
2019-02-17
Due date:
% Done:

0%

Feature Branch:
bugfix/16471-drop-time-synchronization-hacks+force-all-tests, https://salsa.debian.org/tails-team/tails/merge_requests/21
Type of work:
Research
Blueprint:

Starter:
Affected tool:
Deliverable for:

Description

And here’s a list of little-t-tor changes that might be relevant, e.g. they may break our time sync hacks or instead make some of them obsolete:

Some of them were fixed in 0.3.5, some will be in 0.4.x but might be backported to 0.3.5 when we come back to this ticket. As part of this ticket, I expect we will:

  • look for breakage and fix it (subtasks of this ticket + report to Tor Network Team if the problem is on their side)
  • look for changes that make our code useless but not harmful and file new FT tickets for that (not blockers to close this ticket)
  • if it looks like 0.4.x will make a bigger difference, file a ticket about upgrading to 0.4.x and explain your findings there, so we don’t do all the exact same work again once we’re there

Files


Subtasks


Related issues

Related to Tails - Feature #16348: Upgrade to tor 0.3.5 Resolved 2019-01-12
Blocks Tails - Feature #16209: Core work: Foundations Team Confirmed
Blocked by Tails - Feature #16687: Upgrade to tor 0.4.x Resolved
Blocks Tails - Bug #9256: Don't restart Tor after setting the right clock Confirmed 2015-04-17
Blocked by Tails - Bug #16792: Upgrade our Chutney fork Resolved

History

#1 Updated by hefee 2019-02-17 14:40:00

#2 Updated by hefee 2019-02-17 14:40:20

#3 Updated by hefee 2019-02-17 14:45:33

  • Feature Branch set to hefee-bugfix-16349-tor-0.3.5-force-all-tests

As I still have no test setup, I can’t really check what will break if I remove the time sync hacks.

Probably we can clean up config/chroot_local-includes/etc/NetworkManager/dispatcher.d/20-time.sh, as tor now does not fail that often anymore. Maybe the function is_clock_way_off and the unverified-microdesc-consensus handling are also no longer needed. But from the tickets in the description it is not clear whether tor now never fails with such errors or whether it just runs into this error less often.

Jenkins fails for “Time syncing › Clock is one day in the future in bridge mode”.

https://jenkins.tails.boum.org/job/test_Tails_ISO_hefee-bugfix-16349-tor-0.3.5-force-all-tests/lastCompletedBuild/cucumberTestReport/time-syncing/clock-is-one-day-in-the-future-in-bridge-mode/

#4 Updated by intrigeri 2019-02-18 18:03:04

  • Subject changed from Cleanup time sync hacks. to Drop time synchronization hacks that tor 0.3.5 made obsolete (if any)
  • Category set to Time synchronization
  • Status changed from New to Confirmed

#5 Updated by intrigeri 2019-02-18 18:03:31

#6 Updated by intrigeri 2019-02-18 18:03:36

  • blocked by deleted (Feature #15507: Core work 2019Q1: Foundations Team)

#7 Updated by intrigeri 2019-02-18 18:05:11

  • Assignee deleted (intrigeri)
  • Target version changed from Tails_3.13 to Tails_3.14
  • Feature Branch deleted (hefee-bugfix-16349-tor-0.3.5-force-all-tests)
  • Type of work changed from Code to Research

(There’s very little chance I tackle this during the 3.13 dev cycle, which is why I had left Feature #16348 unassigned until the last FT meeting. And there’s a chance that @anonym finds this interesting enough to take it ⇒ let’s discuss this at the next FT meeting :)

#8 Updated by intrigeri 2019-04-05 13:12:56

  • Target version deleted (Tails_3.14)

#9 Updated by hefee 2019-04-16 07:30:31

  • Assignee set to hefee

#10 Updated by Anonymous 2019-04-16 11:46:28

  • Status changed from Confirmed to In Progress

Applied in changeset commit:tails|44d0d25ccee907e37db276047216c4a6574fbb95.

#11 Updated by hefee 2019-04-16 11:49:54

  • Feature Branch set to hefee/bugfix/16471-drop-time-synchronization-hacks+force-all-tests

#12 Updated by intrigeri 2019-05-03 15:20:57

  • Subject changed from Drop time synchronization hacks that tor 0.3.5 made obsolete (if any) to Drop time synchronization hacks that tor 0.3.5 and 0.4.x made obsolete (if any)

0.4.0.5 brings:

  o Minor bugfixes (client, clock skew):
    - Bootstrap successfully even when Tor's clock is behind the clocks
      on the authorities. Fixes bug 28591; bugfix on 0.2.0.9-alpha.
    - Select guards even if the consensus has expired, as long as the
      consensus is still reasonably live. Fixes bug 24661; bugfix
      on 0.3.0.1-alpha.

#13 Updated by hefee 2019-05-07 21:57:40

  • Assignee deleted (hefee)
  • Target version set to Tails_3.14
  • QA Check set to Ready for QA
  • Feature Branch changed from hefee/bugfix/16471-drop-time-synchronization-hacks+force-all-tests to hefee/bugfix/16471-drop-time-synchronization-hacks+force-all-tests, https://salsa.debian.org/tails-team/tails/merge_requests/21

Jenkins status looks fine. We may even be able to strip more stuff with 0.4.0.5.

#14 Updated by intrigeri 2019-05-08 08:24:59

  • Assignee set to intrigeri

#15 Updated by intrigeri 2019-05-08 08:57:28

  • Assignee changed from intrigeri to hefee
  • QA Check changed from Ready for QA to Dev Needed

Reviewed on Salsa! Once we have test coverage for the cases that this branch affects, I’m happy to merge it.

But that won’t fully resolve Bug #16471 as we still need to take benefit, in the more common cases (hardware clock is correct but in local timezone instead of UTC), from:

  • https://trac.torproject.org/projects/tor/ticket/24661 (when the client’s clock is ahead of the network by up to 1 day); that’s in 0.3.5
  • https://trac.torproject.org/projects/tor/ticket/28591 (when the client’s clock is behind the network); that’s in 0.4.x
  • https://trac.torproject.org/projects/tor/ticket/23605 (another case of client’s clock ahead of the network); in 0.4.x

To be clear, ideally we would be able to get rid of maybe_set_time_from_tor_consensus() entirely: that’s the part of our time sync system that is problematic from a security PoV. I suspect we’ll need 0.4.x to do that.

In any case, even if we can’t remove maybe_set_time_from_tor_consensus() yet, we should adjust 20-time.sh so that we only change the system clock when it’s actually needed, i.e. expose our users to a dangerous situation less often: currently we do that all the time, as long as time_is_in_valid_tor_range returns false, which nowadays includes cases when tor will have actually judged the system clock was good enough, and accepted using the downloaded consensus, even if the system clock might not be in the expected range. So I guess that when time_is_in_valid_tor_range returns false, we should check (probably several times, waiting a little bit between each try) whether Tor has actually bootstrapped. If it has, we return without setting the system clock; if it hasn’t, we set the system clock to the middle of the consensus validity range, just like we’ve been doing so far.
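The suggested flow could be sketched as follows. Apart from time_is_in_valid_tor_range and maybe_set_time_from_tor_consensus, all names are hypothetical, and the real Tails helpers are replaced by stubs so the sketch runs standalone:

```shell
#!/bin/sh
# Sketch of the proposed 20-time.sh behavior. Only the function names
# time_is_in_valid_tor_range and maybe_set_time_from_tor_consensus come
# from the ticket; the stubs below stand in for the real Tails code.

# Stub: in Tails this compares the system clock against the consensus
# validity range; hardcoded to "out of range" for demonstration.
time_is_in_valid_tor_range() { return 1; }

# Hypothetical helper: in Tails this would ask tor (e.g. via the control
# port) whether it has bootstrapped; hardcoded to "yes" here.
tor_has_bootstrapped() { return 0; }

# Stub for the existing fallback that sets the clock to the middle of
# the consensus validity range.
set_time_from_consensus_middle() { echo "clock set from consensus"; }

maybe_set_time_from_tor_consensus() {
    if time_is_in_valid_tor_range; then
        return 0    # clock looks fine, nothing to do
    fi
    # Check several times, waiting a bit between tries: tor may accept
    # the consensus despite the apparent skew.
    for _ in 1 2 3 4 5; do
        if tor_has_bootstrapped; then
            echo "tor bootstrapped despite skew; leaving clock alone"
            return 0
        fi
        sleep 2
    done
    # Only now do we fall back to the dangerous code path.
    set_time_from_consensus_middle
}

maybe_set_time_from_tor_consensus
```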

#16 Updated by hefee 2019-05-16 18:44:48

intrigeri wrote:
> Reviewed on Salsa! Once we have test coverage for the cases that this branch affects, I’m happy to merge it.

I created tests, but they are failing also for the released versions 3.13.2 and 3.12.1. After investigating the issue with a manual test, I see that 20-time.sh expects the validity time range for a certificate to be 3h and not 5min.
@anonym: where can we change the certificate time range for our test tor network?

Btw. the manual test was successful, so it is “only” an issue of our test network.

> But that won’t fully resolve Bug #16471 as we still need to take benefit, in the more common cases (hardware clock is correct but in local timezone instead of UTC) from:
>
> * https://trac.torproject.org/projects/tor/ticket/24661 (when the client’s clock is ahead of the network by up to 1 day); that’s in 0.3.5; I’m hoping this
> * https://trac.torproject.org/projects/tor/ticket/28591 (when the client’s clock is behind the network); that’s in 0.4.x
> * https://trac.torproject.org/projects/tor/ticket/23605 (another case of client’s clock ahead of the network); in 0.4.x
>
> To be clear, ideally we would be able to get rid of maybe_set_time_from_tor_consensus() entirely: that’s the part of our time sync’ system that is problematic from a security PoV). I suspect we’ll need 0.4.x to do that.

My tests show that we can’t get rid of it completely atm. I removed maybe_set_time_from_tor_consensus() and ended up with the attached screenshot. I’m unsure whether the mentioned tickets will help to get rid of it completely, as they “only extend” the range within which a consensus is accepted. But if we have a completely broken clock, e.g. one that is reset to timestamp=0, we still need maybe_set_time_from_tor_consensus(). But we will see: once we have a working test for it, we can investigate further.

> In any case, even if we can’t remove maybe_set_time_from_tor_consensus() yet, we should adjust 20-time.sh so that we only change the system clock when it’s actually needed, i.e. expose our users to a dangerous situation less often: currently we do that all the time, as long as time_is_in_valid_tor_range returns false, which nowadays includes cases when tor will have actually judged the system clock was good enough, and accepted using the downloaded consensus, even if the system clock might not be in the expected range. So I guess that when time_is_in_valid_tor_range returns false, we should check (probably several times, waiting a little bit between each try) whether Tor has actually bootstrapped. If it has, we return without setting the system clock; if it hasn’t, we set the system clock to the middle of the consensus validity range, just like we’ve been doing so far.

That sounds like a very reasonable approach, but this would make the script even longer, not shorter, and this ticket is about removing code ;D

What I don’t understand is why you think that the date -s "{vmid}" call is that dangerous. A user will need to wait until htpdate has updated the clock anyway, before they are able to access the internet. Or is this about it making Tails’ unique behavior visible within the tor network? And that it makes everything slower?

#17 Updated by hefee 2019-05-16 18:46:23

@intrigeri - I spent 2h on this ticket; maybe another 2h for getting the tests working, some manual testing, …

#18 Updated by anonym 2019-05-17 09:16:10

hefee wrote:
> I created tests, but they are failing also for the released versions 3.13.2 and 3.12.1. After investigating the issue with a manual test, I see that 20-time.sh expects the validity time range for a certificate to be 3h and not 5min.
> @anonym: where can we change the certificate time range for our test tor network? nickm?

You'll have to mess with the values in submodules/chutney/torrc_templates/authority.i. The tor man page is your friend. The little-t-tor people (perhaps Nick) can probably help.

#19 Updated by intrigeri 2019-05-18 06:46:09

  • Assignee changed from intrigeri to hefee

> @anonym: where can we change the certificate time range for our test tor network?

You’ve got a reply so next step here is to use this info and make the tests pass.

> Btw. the manual test was successful, so it is “only” an issue of our test network.

Great!

> That sounds like a very reasonable approach, but this would make the script even longer not shorter and this ticket is about removing code ;D

Sorry I’ve been unclear: the goal here is to remove dangerous behavior. We can hope it allows us to remove code at the end of the day (once we have Tor 0.4.x) but it’s OK if it actually adds code, if it protects our users better :)

> What I don’t understand, why you think, that the date -s "{vmid}", is that dangerous. A user will need to wait till htpdate updated the clock anyways, before they are able to access the internet. Or is this about, that it makes Tails unique behavior visible withing the tor network? And that it makes everything more slow?

It essentially disables Tor’s check for consensus freshness, which is there for a reason. Sure, htpdate might fix this after the fact, but htpdate can be run while tor is using a replayed (obsolete) consensus. The consequences are not well understood but:

  • At the very least, indeed that behavior is unique to Tails.
  • It’s been pointed out by Tor developers that our security analysis is incorrect. I can’t find the corresponding discussion anymore. We never took time to check this properly, because for years we’ve been operating under the assumption that Feature #5774 would be solved soon.

#20 Updated by intrigeri 2019-05-18 06:46:33

  • related to Bug #9256: Don't restart Tor after setting the right clock added

#21 Updated by intrigeri 2019-05-18 06:47:01

  • Assignee changed from hefee to intrigeri
  • Estimated time set to 8 h

#22 Updated by intrigeri 2019-05-18 06:47:36

  • Assignee changed from intrigeri to hefee

#23 Updated by intrigeri 2019-05-18 06:48:01

> You’ll have to mess with the values in submodules/chutney/torrc_templates/authority.i. The tor man page is your friend. The little-t-tor people (perhaps Nick) can probably help.

hefee, I’m happy to help decipher this before you ask Tor folks, if you have a hard time with it.

#24 Updated by intrigeri 2019-05-18 08:31:22

Also, note that we have a feature/tor-nightly-master branch that can be used as a basis to try stuff based on a more recent tor. If it turns out that tor master allows us to get rid of maybe_set_time_from_tor_consensus() entirely, we can skip the intermediary solution I suggested earlier here.

#25 Updated by intrigeri 2019-05-18 17:40:49

Taking another step back, given this likely won’t make it into 3.14, and there’s a good chance 3.15 (or worst case 3.16) ships with tor 0.4.x, I’m starting to think our time would be better spend doing this directly with 0.4.x, using https://deb.torproject.org/torproject.org/dists/tor-experimental-0.4.0.x-stretch/ for now. A branch for Feature #16687 could upgrade to that version of tor (that will already teach us something) and the branch for this ticket would be based on it. In other words: I recommend you forget about 0.3.5 and focus on 0.4.x. What do you think?

#26 Updated by anonym 2019-05-21 12:00:02

intrigeri wrote:
> > You’ll have to mess with the values in submodules/chutney/torrc_templates/authority.i. The tor man page is your friend. The little-t-tor people (perhaps Nick) can probably help.
>
> hefee, I’m happy to help decipher this before you ask Tor folks, if you have a hard time with it.

Me too!

#27 Updated by hefee 2019-05-21 22:28:50

intrigeri wrote:
> > You’ll have to mess with the values in submodules/chutney/torrc_templates/authority.i. The tor man page is your friend. The little-t-tor people (perhaps Nick) can probably help.
>
> hefee, I’m happy to help decipher this before you ask Tor folks, if you have a hard time with it.

I found the solution to make the validity check in 20-time.sh pass, also for 3.13.2. But I’m unsure how I can propose an update, as it is in the chutney submodule and I do not have write permissions.

@intrigeri: should I extract the additional tests + chutney updates for the stable branch, given that those tests should also pass for the current branch?

#28 Updated by hefee 2019-05-21 22:30:05

#29 Updated by hefee 2019-05-21 22:33:08

  • related to deleted (Bug #9256: Don't restart Tor after setting the right clock)

#30 Updated by hefee 2019-05-21 22:33:11

  • blocks Bug #9256: Don't restart Tor after setting the right clock added

#31 Updated by hefee 2019-05-21 22:37:05

intrigeri wrote:
> Taking another step back, given this likely won’t make it into 3.14, and there’s a good chance 3.15 (or worst case 3.16) ships with tor 0.4.x, I’m starting to think our time would be better spend doing this directly with 0.4.x, using https://deb.torproject.org/torproject.org/dists/tor-experimental-0.4.0.x-stretch/ for now. A branch for Feature #16687 could upgrade to that version of tor (that will already teach us something) and the branch for this ticket would be based on it. In other words: I recommend you forget about 0.3.5 and focus on 0.4.x. What do you think?

I’m now using feature/tor-nightly-master and, whoo, we can get rid of maybe_set_time_from_tor_consensus() entirely: we now have tests that set the system clock +40 days and −10 years, and they pass. I updated the blocks/blocked-by relations to make it visible that this is blocked by tor 0.4.x in Tails.

#32 Updated by hefee 2019-05-21 22:37:22

  • Assignee changed from hefee to intrigeri

#33 Updated by intrigeri 2019-05-23 08:34:32

  • Target version changed from Tails_3.14 to Tails_3.15

#34 Updated by intrigeri 2019-05-24 08:30:19

  • QA Check changed from Dev Needed to Ready for QA

@hefee, I assume you meant to set this “Ready for QA”; otherwise, please clarify what dev you expect me to do :)

#35 Updated by hefee 2019-05-24 15:24:50

intrigeri wrote:
> @hefee, I assume you meant to set this “Ready for QA”; otherwise, please clarify what dev you expect me to do :)

Yes, “Ready for QA” is the correct status; I forgot to update it.

#36 Updated by intrigeri 2019-05-29 17:04:56

  • Assignee changed from intrigeri to hefee
  • QA Check changed from Ready for QA to Dev Needed

Great job! See review on Salsa :)

#37 Updated by intrigeri 2019-06-03 13:56:38

  • Status changed from In Progress to Needs Validation
  • Assignee changed from hefee to intrigeri

(The MR was reassigned to me.)

#38 Updated by intrigeri 2019-06-08 08:38:50

  • Status changed from Needs Validation to In Progress
  • Assignee changed from intrigeri to hefee

#39 Updated by intrigeri 2019-06-08 18:07:14

FWIW I’ve pushed bugfix/16471-drop-time-synchronization-hacks+force-all-tests that differs from your branch in two ways:

  • I’ve rebased it onto the branch for Feature #16687 (itself based on stable), so that this can indeed be a candidate for 3.15. I think you should do the same, i.e. hard reset your branch to this one, modulo perhaps:
  • I’ve reverted the Chutney change, in order to see what happens in the test suite without it: as explained on the corresponding MR, I still don’t understand why we still need these changes with the code updates you’re proposing for tails.git. I’m curious, we’ll see what CI thinks :)

#40 Updated by hefee 2019-06-08 21:27:58

@intrigeri I think we should schedule a short meeting to clarify things about this issue: what we should concentrate on, what new issues we may need to create, etc. It seems to me we are not on the same page, and this ping-pong over comments on Salsa takes forever; I am starting to get frustrated about the progress. That’s why I want to try whether a Jabber discussion may help to solve this.

#41 Updated by intrigeri 2019-06-09 15:55:40

intrigeri wrote:
> FWIW I’ve pushed bugfix/16471-drop-time-synchronization-hacks+force-all-tests

Unsurprisingly, the tests fail. This made me notice Bug #16793. So I’ve merged my branch for Bug #16793 into bugfix/16471-drop-time-synchronization-hacks+force-all-tests, in order to see what fails without the Chutney changes :)

#42 Updated by intrigeri 2019-06-09 15:56:57

hefee wrote:
> @intrigeri I think we should schedule a short meeting to clarify things about this issue: what we should concentrate on, what new issues we may need to create, etc. It seems to me we are not on the same page, and this ping-pong over comments on Salsa takes forever; I am starting to get frustrated about the progress. That’s why I want to try whether a Jabber discussion may help to solve this.

I totally agree in principle but my availability this month makes it very hard to schedule (more) meetings. Worst case, let’s discuss this during the upcoming Buster sprint?

#43 Updated by intrigeri 2019-06-14 12:31:30

  • blocked by Bug #16792: Upgrade our Chutney fork added

#44 Updated by intrigeri 2019-06-14 12:33:57

After a meeting between hefee, anonym and me, the plan is:

  1. do manual tests on bugfix/16471-drop-time-synchronization-hacks+force-all-tests (see https://salsa.debian.org/tails-team/tails/merge_requests/21#note_90775 where I documented how to do such tests), for +/- 12h, 24h, 48h, 1 week, 1 month, 1 year (stop when it starts failing) [intrigeri]
  2. automatic tests for the same scenarios using Chutney (we have that already): https://jenkins.tails.boum.org/view/Tails_ISO/job/test_Tails_ISO_bugfix-16471-drop-time-synchronization-hacks-force-all-tests/
  3. refresh our chutney sources (Bug #16792) + update network configuration to be more similar to the real Tor network [anonym]
  4. compare (1) and (2) and (3) so we can finally have a good idea of whether we can rely on Chutney for these things [anonym]
  5. from (1), infer what tor currently allows in terms of clock skew
  6. ask tor devs to confirm our findings from (5)
  7. schedule a meeting where we’ll discuss our findings and will try to reach a conclusion wrt. what part of the “set clock from tor consensus” dirty hack we can remove.

#45 Updated by intrigeri 2019-06-14 12:35:46

  • Assignee changed from hefee to intrigeri

(Next step is on my plate.)

#46 Updated by intrigeri 2019-06-14 15:08:29

  • Assignee changed from intrigeri to anonym

intrigeri wrote:
> 1. do manual tests on bugfix/16471-drop-time-synchronization-hacks+force-all-tests (see https://salsa.debian.org/tails-team/tails/merge_requests/21#note_90775 where I documented how to do such tests), for +/- 12h, 24h, 48h, 1 week, 1 month, 1 year (stop when it starts failing) [intrigeri]

Test results with an ISO built at commit:b13a5f18cb, which has tor 0.4.0.5-1~d90.stretch+1 installed (OK means that tor bootstraps successfully, FAIL means that tor fails to bootstrap):

  • direct connection (no bridge):
    • –12h: OK
    • +12h: OK
    • –24h: OK
    • +24h: OK
    • –48h: FAIL
    • +48h: FAIL
  • using a bridge:
    • –12h: initially FAIL (Tor Launcher displays an error about clock skew), but OK after retrying config + connect in Tor Launcher → bug in Tor, in Tor Launcher, or in their interaction?
    • +12h: OK
    • –24h: initially FAIL (Tor Launcher displays an error about clock skew), but OK after retrying config + connect in Tor Launcher → bug in Tor, in Tor Launcher, or in their interaction?
    • +24h: OK
    • –48h: FAIL
    • +48h: FAIL

Apart from the weird behavior when using bridges on a client whose clock is in the past, which should be investigated and reported upstream, at first glance this essentially means that:

  • We don’t need to set the clock according to the tor consensus (dangerous) anymore, for clocks that are a bit off (up to 24h is fine, maybe a bit more); this is great as it covers the “clock is set to local time, not to UTC” big problem, i.e. anyone with an accurate hardware clock, regardless of what timezone it’s in, will not be exposed to dangerous code paths anymore. Except it’s more complicated, see below.
  • As expected, tor 0.4.0.x does not magically fix clocks that are very wrong. So until we do Feature #5774 (i.e. ask the user what time it is), we have to either drop support for these systems (I’m not looking forward to debating this) or to keep setting time according to whatever tor consensus we receive.

Problem is, as long as we do trust time from tor consensus in some cases, we still expose all users to replay attacks: we can’t differentiate between “our clock is correct but a 1-week consensus is being replayed to us” and “the clock is one week in the future”. But at least we could:

  • Avoid triggering the dangerous code path when the clock appears to be in the future: I assume that most hardware clocks that are very wrong (i.e. more than due to “hardware clock is set to local time”, which won’t prevent tor from bootstrapping anymore) are in the past. This should protect against consensus replay attacks.
  • Ensure we don’t trigger the dangerous code path in cases when we don’t need to, i.e. only trigger it when tor indeed fails to bootstrap due to clock issues. I believe tor now exposes this kind of info on the control port (so that Tor Launcher can tell the user something hopefully useful).
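For illustration, the bootstrap check mentioned above could parse the PROGRESS field of the reply that `GETINFO status/bootstrap-phase` yields on tor’s control port. The reply string below is a handcrafted sample in that format, not captured from a real session:

```shell
# Decide from a bootstrap-phase reply whether tor considers itself
# bootstrapped. The reply format follows tor's control protocol; the
# sample string is handcrafted for this sketch.

bootstrap_progress() {
    # Extract the numeric PROGRESS=<n> field from the reply line.
    printf '%s\n' "$1" | sed -n 's/.*PROGRESS=\([0-9]*\).*/\1/p'
}

reply='250-status/bootstrap-phase=NOTICE BOOTSTRAP PROGRESS=100 TAG=done SUMMARY="Done"'

if [ "$(bootstrap_progress "$reply")" -eq 100 ]; then
    echo "tor is bootstrapped: do not touch the clock"
else
    echo "tor not bootstrapped: consider the consensus-time fallback"
fi
```

In the real script this reply would come from talking to the control port (as Tor Launcher does), rather than from a hardcoded string.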

Reassigning to anonym, because the next step is on his plate.

#47 Updated by intrigeri 2019-07-05 09:29:45

  • Target version changed from Tails_3.15 to Tails_3.16

I doubt we’ll finish all this by July 9 given Bug #16792 is not done yet => postponing :)

#48 Updated by intrigeri 2019-07-06 11:28:02

  • Subject changed from Drop time synchronization hacks that tor 0.3.5 and 0.4.x made obsolete (if any) to Drop time synchronization hacks that tor 0.3.5 and 0.4.x made obsolete

#49 Updated by intrigeri 2019-08-27 18:47:37

  • Target version changed from Tails_3.16 to Tails_3.17

#50 Updated by intrigeri 2019-09-05 10:53:13

  • Assignee deleted (anonym)

#51 Updated by intrigeri 2019-09-05 14:40:21

  • Target version changed from Tails_3.17 to Tails_4.0

(The branches for Bug #16792 were rebased on top of devel, whose test suite is more robust, in order to collect more useful data.)

#52 Updated by intrigeri 2019-09-12 12:03:50

  • Target version changed from Tails_4.0 to Tails_4.1

Postponing: we have lots of higher priority stuff to do in the next 1.5 month.

#53 Updated by intrigeri 2019-09-15 07:53:23

> 2. automatic tests for the same scenarios using Chutney (we have that already): https://jenkins.tails.boum.org/view/Tails_ISO/job/test_Tails_ISO_bugfix-16471-drop-time-synchronization-hacks-force-all-tests/

“we have that already” was incorrect so I’ve fixed that in commit:571ed4a1d35ca99964feed0e111e1574d2aa5699 so we have data next time we come back to it.

#54 Updated by intrigeri 2019-09-15 07:56:43

  • Feature Branch changed from hefee/bugfix/16471-drop-time-synchronization-hacks+force-all-tests, https://salsa.debian.org/tails-team/tails/merge_requests/21 to bugfix/16471-drop-time-synchronization-hacks+force-all-tests, https://salsa.debian.org/tails-team/tails/merge_requests/21

> 1. do manual tests on bugfix/16471-drop-time-synchronization-hacks+force-all-tests (see https://salsa.debian.org/tails-team/tails/merge_requests/21#note_90775 where I documented how to do such tests), for +/- 12h, 24h, 48h, 1 week, 1 month, 1 year (stop when it starts failing) [intrigeri]

Meanwhile, we’ve upgraded to tor 0.4.1 and Feature #16356 gives us a new Tor Launcher (might be relevant wrt. the weird behavior I’ve seen with bridges last time), so we should redo these tests in a branch that has both Tor Browser 9 and the changes brought on this ticket.

#55 Updated by intrigeri 2019-09-16 15:47:55

  • blocks deleted (Bug #16792: Upgrade our Chutney fork)

#56 Updated by intrigeri 2019-09-16 15:49:44

  • Assignee set to intrigeri

>> 2. automatic tests for the same scenarios using Chutney (we have that already): https://jenkins.tails.boum.org/view/Tails_ISO/job/test_Tails_ISO_bugfix-16471-drop-time-synchronization-hacks-force-all-tests/

> “we have that already” was incorrect so I’ve fixed that in commit:571ed4a1d35ca99964feed0e111e1574d2aa5699 so we have data next time we come back to it.

I see the exact same results on my local Jenkins as when I tested these scenarios manually in Bug #16471#note-46.

So, I think we have enough data to draw conclusions about this, even though we haven’t completed (3) yet:

> 4. compare (1) and (2) and (3) so we can finally have a good idea of whether we can rely on Chutney for these things [anonym]

I’ve seen the exact same results with Chutney as in my manual tests, so yes, it seems we can rely on Chutney. This is reassuring! :)

Step (3) is harder than expected and was kind of a bonus step as long as (1) == (2), so let’s not block on it.

I’ll still want to redo the manual tests without Chutney as the last pre-merge step, but for now, let’s move on.

> 5. from (1), infer what tor currently allows in terms of clock skew

Apart from the weird behavior with bridges and a clock set 12-24h in the past: it seems that a ±24h clock skew is OK.

So, the next steps are:

> 6. ask tor devs to confirm our findings from (5)
> 7. schedule a meeting where we’ll discuss our findings and will try to reach a conclusion wrt. what part of the “set clock from tor consensus” dirty hack we can remove.

I’ll do both, either during the 4.1 dev cycle, or somewhat earlier as a structured procrastination opportunity :)

#57 Updated by intrigeri 2019-11-08 18:32:55

The branch for Bug #16792 is now in good shape so I’ve merged it into this one.

#58 Updated by intrigeri 2019-11-08 18:33:55

  • blocked by Feature #16782: Clarify who receives Code of Conduct reports added

#59 Updated by intrigeri 2019-11-08 18:34:02

  • blocks deleted (Feature #16782: Clarify who receives Code of Conduct reports)

#60 Updated by intrigeri 2019-11-08 18:34:07

  • blocks Bug #16792: Upgrade our Chutney fork added

#61 Updated by intrigeri 2019-11-08 18:34:12

  • blocked by deleted (Bug #16792: Upgrade our Chutney fork)

#62 Updated by intrigeri 2019-11-08 18:34:20

  • blocked by Bug #16792: Upgrade our Chutney fork added

#63 Updated by intrigeri 2019-11-09 07:25:12

Wrt. the weird behavior with bridges and a clock 12-24h in the past: I can reproduce this in a regular Debian sid system whose hardware clock is set 24h before current UTC time. The exact error message is:

Tor failed to establish a Tor network connection.

Loading authority certificates failed (Clock skew -81944 in microdesc flavor consensus from CONSENSUS - ?).

Attaching screenshot and Tor logs.

#64 Updated by intrigeri 2019-11-09 07:51:18

intrigeri wrote:
> Wrt. the weird behavior with bridges and a clock 12-24h in the past:

Reported upstream: https://trac.torproject.org/projects/tor/ticket/32438.

Note that I could reproduce this problem (outside of Tails) without using bridges, while in my first series of tests (Bug #16471#note-46) I saw the problem only when using bridges.

#65 Updated by intrigeri 2019-11-09 08:11:00

intrigeri wrote:
> After a meeting between hefee, anonym and me, the plan is:

> [… snipping steps 1-5 that have been done]
> 6. ask tor devs to confirm our findings from (5)

Done: https://lists.torproject.org/pipermail/tor-dev/2019-November/014076.html

If/once they confirm, next step will be:

> 7. schedule a meeting where we’ll discuss our findings and will try to reach a conclusion wrt. what part of the “set clock from tor consensus” dirty hack we can remove.

#66 Updated by intrigeri 2019-11-11 10:17:18

  • Target version changed from Tails_4.1 to Tails_4.2

intrigeri wrote:
> intrigeri wrote:
> > After a meeting between hefee, anonym and me, the plan is:
>
> > [… snipping steps 1-5 that have been done]
> > 6. ask tor devs to confirm our findings from (5)
>
> Done: https://lists.torproject.org/pipermail/tor-dev/2019-November/014076.html
>
> If/once they confirm,

asn (George Kadianakis) confirmed that “the ±24h value seems plausible”.
One exception is “v3 onion services only tolerate skews of maximum ±3 hours” but that’s fine in the context of this ticket: once tor has bootstrapped, we still have htpdate to set the clock more accurately.

So I think we’re good to go with the next step:

> > 7. schedule a meeting where we’ll discuss our findings and will try to reach a conclusion wrt. what part of the “set clock from tor consensus” dirty hack we can remove.

I think at least anonym and I will be pretty busy with other things (non-technical matters for the two of us, sprints and trips for me) during this cycle, so I’m postponing this to 4.2.

At first glance, my proposal there is likely to be that we should stop setting the clock from the consensus in two cases:

  • the clock appears to be in the future (by any value)
  • the clock appears to be in the past, by maximum 24h

And then:

  • If the clock is off by no more than 24h, tor will bootstrap successfully. This covers the case when the hardware clock is set to the correct local time, in any timezone. htpdate will later set the clock more accurately.
  • If the clock is more than 24h in the future, tor will fail to bootstrap (while as of Tails 4.0, we would set the clock to the consensus time and retry). This protects against consensus replay attacks. I think the security benefits are worth the UX regression: I expect most clocks that are wrong to be stuck in the past, e.g. due to a tired CMOS battery.
  • If the clock is more than 24h in the past, we keep fixing it using the consensus time. There’s no concern about a replayed old consensus so the security drawbacks are smaller in this case.
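The branching above could be sketched like this (a hypothetical illustration of the proposed policy, not the actual 20-time.sh code; the function name and argument convention are assumptions):

```shell
#!/bin/sh
# Proposed policy sketch: decide whether to set the clock from the Tor
# consensus. "offset" is system time minus consensus time, in seconds:
# positive means our clock is in the future, negative means in the past.
# Returns 0 (true) if we should set the clock from the consensus.
MAX_SKEW=$((24 * 60 * 60))  # tor tolerates roughly +/- 24h of skew

should_set_clock_from_consensus() {
    offset="$1"
    if [ "$offset" -ge 0 ]; then
        # Clock in the future, by any amount: never adopt the consensus
        # time, to avoid assisting consensus replay attacks.
        return 1
    elif [ $((0 - offset)) -le "$MAX_SKEW" ]; then
        # Behind by at most 24h: tor bootstraps anyway, and htpdate
        # will set the clock more accurately afterwards.
        return 1
    else
        # Behind by more than 24h: fix the clock from the consensus;
        # a replayed old consensus is less of a concern in this direction.
        return 0
    fi
}
```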

#67 Updated by intrigeri 2019-12-01 10:53:13

  • Target version changed from Tails_4.2 to Tails_4.3

I’d rather focus on other matters in December.

#68 Updated by intrigeri 2019-12-30 13:49:37

I’ve proposed meeting dates (late January) to @anonym and hefee.

#69 Updated by intrigeri 2020-01-31 14:27:58

  • Assignee changed from intrigeri to anonym

Most relevant starting points to catch up with what happened since June:

anonym, hefee and intrigeri met today.

We agreed that as far as Bug #16471 is concerned, we can’t fix All The Things: the best we can do is to make the security issues caused by our current time sync’ implementation affect fewer users and/or have less severe consequences.

That is the goal of the proposal drafted on https://redmine.tails.boum.org/code/issues/16471#note-66

Consequences of this proposal:

  • For most users, who have a mostly correct clock (±24h), even if it’s possibly set to local time (as opposed to UTC):
    • If they get a current, legit consensus, we won’t set the clock from the Tor consensus anymore ⇒ we don’t need to reason about whether what we’re doing is safe in this case anymore.
    • If they’re served a replayed, old consensus, then we fail to bootstrap, which protects users against the attack. But then, what’s the UX and what can the user do to fix it?
      • If an attacker persists in replaying the consensus, it is beyond the user’s power to fix the situation (other than moving to a network the attacker does not control).
      • We can notify the user when we fail to bootstrap, suggesting that they restart, set the hardware clock to the correct time (ideally UTC: we have doc for this) either in the BIOS config or in Tails via the command line, and retry. When this failure mode occurs, 20-time.sh has no way to tell whether the hardware clock is in the future or the user is under a replayed-consensus attack.
      • Ideally we would propose to fix the hardware clock ourselves, but that’s reaching a bit beyond the scope of this ticket, and belongs more to Feature #5774.
  • For users whose CMOS battery is tired and whose hardware clock is lagging behind by more than 24h, nothing changes in terms of UX and security. They’re still exposed to a replayed old consensus, as long as it’s newer than their broken clock. We can’t fix this in the scope of Bug #16471 (possible solutions that have been floated around include: prompting the user; doing a htpdate-style time query without Tor).
  • Users whose hardware clock is set more than 24h after the current UTC time, and who receive a current, legit consensus: we fail to bootstrap (while previously we would succeed). This causes a UX regression, which is discussed above in the replayed-consensus case (the UX is the same, because when this failure mode occurs, 20-time.sh can’t tell whether the clock is incorrectly set in the future or the user is under attack).

Next steps:

  1. anonym submits a proposal to sajolida and sees where he’d like to draw the line between investing code/UX/doc resources into the cases where UX regresses in this proposal and investing them elsewhere.
  2. Likely we’ll need some UX/doc work done.
  3. intrigeri checks the “I expect most clocks that are wrong to be stuck in the past, e.g. due to a tired CMOS battery” assumption with folks who support people with a broad range of hardware. That will tell us how frequently the UX regression would trigger (hefee does not know and thinks that if we implement this proposal, we’ll learn about such issues — but that would be a bit too late).
  4. anonym will check whether Windows 10 saves the time to the hardware clock (this would help gauge how often hardware clocks set more than 24h in the future would trigger the UX regression).

#70 Updated by anonym 2020-02-11 15:25:49

  • Target version changed from Tails_4.3 to Tails_4.4

#71 Updated by CyrilBrulebois 2020-03-12 09:55:52

  • Target version changed from Tails_4.4 to Tails_4.5

#72 Updated by CyrilBrulebois 2020-04-07 17:05:13

  • Target version changed from Tails_4.5 to Tails_4.6

#73 Updated by anonym 2020-05-05 09:51:53

  • Assignee deleted (anonym)

> Next steps:
>
> # anonym submits a proposal to sajolida and see where he’d like to draw the line between investing code/UX/doc resources into the cases when UX regresses in this proposal and investing them elsewhere.

Done: https://lists.autistici.org/message/20200427.125933.85cd319a.en.html

> # anonym will check if Windows 10 saves the time to the hardware clock (this would help minimize the frequency of the UX regression caused by hardware clocks set more than 24h in the future)

I tested on two Windows 10 machines (if it matters, one is <1yo, the other from the Windows 8 era) by first setting a bogus time and then manually making Windows sync the clock. This makes Windows 10 save my local time to the hardware clock, breaking Tails’ assumption that the hardware clock is in UTC. So it seems Windows will continue to give us headaches on this front.
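For context, on Linux the way the hardware clock is interpreted is recorded in /etc/adjtime: its third line reads UTC or LOCAL, and most tools assume UTC when the file is absent (see hwclock(8)). A small sketch for checking it (the helper name and the optional path parameter are my own, added to keep the example self-contained):

```shell
#!/bin/sh
# Report whether the RTC (hardware clock) is interpreted as UTC or local
# time on a Linux system, by reading the third line of /etc/adjtime.
# An optional path argument makes this testable against a fixture file.
rtc_mode() {
    adjtime="${1:-/etc/adjtime}"
    if [ -f "$adjtime" ]; then
        mode=$(sed -n '3p' "$adjtime")
        printf '%s\n' "${mode:-UTC}"
    else
        # No adjtime file: UTC is the conventional default.
        echo UTC
    fi
}
```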

#74 Updated by CyrilBrulebois 2020-05-06 04:28:54

  • Target version changed from Tails_4.6 to Tails_4.7

#75 Updated by intrigeri 2020-05-07 08:43:53

>> # anonym submits a proposal to sajolida and see where he’d like to draw the line between investing code/UX/doc resources into the cases when UX regresses in this proposal and investing them elsewhere.
>
> Done: https://lists.autistici.org/message/20200427.125933.85cd319a.en.html

So next steps on this front are: ensure this discussion reaches a conclusion, then report back here.

#76 Updated by anonym 2020-05-08 14:01:13

  • Assignee set to anonym

intrigeri wrote:
> So next steps on this front are: ensure this discussion reaches a conclusion, then report back here.

I’ll try to make sure it happens.