Bug #16972

tordate sometimes breaks obfs4 by messing with a correct clock

Added by intrigeri 2019-08-11 16:44:00. Updated 2019-09-05 00:03:22.

Status:
Resolved
Priority:
Normal
Assignee:
Category:
Time synchronization
Target version:
Start date:
Due date:
% Done:

100%

Feature Branch:
bugfix/16972-tordate-dont-fix-correct-clock+force-all-tests
Type of work:
Code
Blueprint:

Starter:
Affected tool:
Deliverable for:

Description

time_is_in_valid_tor_range() returns false for clocks that are in the last 30 minutes of the 3-hour Tor consensus validity range, and then we set the clock to the middle of the valid range. I’ve seen a case where the clock we set this way, in the hope it would help, actually broke obfs4, even though everything suggested it would have worked had we not set a wrong system time: tor had downloaded a consensus successfully and could verify it. I don’t know whether the original system clock was correct: the user says it was correct and in UTC, but there might have been a misunderstanding. In any case, after tordate changed the clock and restarted tor, obfs4proxy could no longer connect (apparently due to SSL problems that I can’t explain except by a clock mismatch).

I could not find why time_is_in_valid_tor_range() treats a clock that’s in the last 30 minutes of the 3-hour Tor consensus validity range as wrong and in need of fixing: we’ve had this code since the very beginning of tordate (commit:58e1b1a3835dfa84c831ad2a4e205ae8c129494e, 2011!) :) Could it be that, back then, tor wasn’t good at refreshing its consensus before it expired? In this case, for a clock set to 15:49:XY, I see “will expire at 2018-06-14 16:00:00; fetching the next one at 2018-06-14 15:54:23” in the logs.

In order to decrease the risk that we break stuff while trying to help, I think we should replace “30 minutes” with 5 or 10 minutes in this algorithm: it should still leave tor enough time to notice that its consensus is about to expire, and refresh it.
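
To make the effect of the margin concrete, here is a minimal Python sketch of the check described above. It is not the actual tordate code (which is a shell script): the function names, the inclusive comparison, and the `margin` parameter are hypothetical. The valid-after time of 13:00:00 is derived from the 3-hour validity range and the 16:00:00 expiry in the log line, and the seconds of the 15:49:XY clock are assumed to be 00.

```python
from datetime import datetime, timedelta


def time_is_in_valid_range(now, valid_after, valid_until, margin):
    """Return True if `now` is inside the consensus validity range,
    excluding the last `margin` of that range (hypothetical helper)."""
    return valid_after <= now <= valid_until - margin


def middle_of_valid_range(valid_after, valid_until):
    """Clock value we would fall back to when the check fails."""
    return valid_after + (valid_until - valid_after) / 2


# Values taken from the log line quoted above; valid_after is derived
# from the 3-hour validity range, and the seconds of "now" are assumed.
valid_until = datetime(2018, 6, 14, 16, 0, 0)
valid_after = valid_until - timedelta(hours=3)
now = datetime(2018, 6, 14, 15, 49, 0)

for minutes in (30, 10, 5):
    accepted = time_is_in_valid_range(
        now, valid_after, valid_until, timedelta(minutes=minutes)
    )
    if accepted:
        print(f"margin {minutes:2d} min: clock left alone")
    else:
        target = middle_of_valid_range(valid_after, valid_until)
        print(f"margin {minutes:2d} min: clock would be reset to {target}")
```

Under these assumptions, the 15:49 clock is rejected with a 30-minute margin (and would be moved back to 14:30:00), while it is left alone with a 10- or 5-minute margin.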


Related issues

Related to Tails - Bug #15548: Tails can't establish a connection with obfs4 bridges and a hardware clock too far away from UTC Confirmed 2018-05-09

History

#1 Updated by intrigeri 2019-08-11 16:49:46

  • Status changed from Confirmed to In Progress
  • Feature Branch set to bugfix/16972-tordate-dont-fix-correct-clock+force-all-tests

#2 Updated by intrigeri 2019-08-11 16:50:05

  • related to Bug #15548: Tails can't establish a connection with obfs4 bridges and a hardware clock too far away from UTC added

#3 Updated by intrigeri 2019-08-11 21:50:19

  • Description updated

#4 Updated by intrigeri 2019-08-11 21:51:27

  • Status changed from In Progress to Needs Validation
  • Assignee changed from intrigeri to anonym
  • Target version set to Tails_3.16

No relevant regressions in a full test suite run on my local Jenkins.

@anonym, what do you think?

#5 Updated by anonym 2019-08-16 14:03:50

  • Status changed from Needs Validation to Fix committed
  • Assignee deleted (anonym)
  • % Done changed from 0 to 100

intrigeri wrote:
> I could not find why time_is_in_valid_tor_range() treats a clock that’s in the last 30 minutes of the 3-hour Tor consensus validity range as wrong and in need of fixing: we’ve had this code since the very beginning of tordate (commit:58e1b1a3835dfa84c831ad2a4e205ae8c129494e, 2011!) :) Could it be that, back then, tor wasn’t good at refreshing its consensus before it expired?

Wow, good question! I looked for clues in its Liberté Linux origins, and the closest I found is a comment saying “Check whether current time is in (conservative) range”, which I’m not even sure how to interpret.

But yes, IIRC tor used to misbehave if it was using a consensus during the last hour of its validity. I definitely remember that tor used to happily let such a consensus become invalid without attempting to fetch a fresher one until way later, leaving you unable to build circuits in the meantime. It was as if tor had a minimum delay between consensus fetches, which was obviously wrong in this situation, and which the tor devs probably fixed at some point.

> In this case, for a clock set to 15:49:XY, I see “will expire at 2018-06-14 16:00:00; fetching the next one at 2018-06-14 15:54:23” in the logs.

Nice!

> In order to decrease the risk that we break stuff while trying to help, I think we should replace “30 minutes” with 5 or 10 minutes in this algorithm: it should still leave tor enough time to notice that its consensus is about to expire, and refresh it.

Makes a lot of sense! But I guess this still only reduces the chance of the problem happening, right? Still an improvement, and no need to waste more time on this since we will drop it with Bug #16471.

> No relevant regressions in a full test suite run on my local Jenkins.

No bootstrapping issues on “our” Jenkins either. Merging!

#6 Updated by intrigeri 2019-08-16 15:14:42

Hi @anonym,

> But I guess this still only reduces the chance of the problem happening, right?

Exactly.

> no need to waste more time on this since we will drop it with Bug #16471.

Note that my current conclusion on Bug #16471 is essentially: we’ll have to go through this code path in fewer situations but we still need it.

#7 Updated by CyrilBrulebois 2019-09-05 00:03:22

  • Status changed from Fix committed to Resolved