Feature #12411
Stop restarting tor if bootstrapping stalls
100%
Description
While investigating Bug #10238 preliminary tests indicate that tor
might have fixed the issues we used to experience with the bootstrap process stalling and requiring a restart to kickstart it (Feature #9516). Let’s see if we can get rid of this crap!
Subtasks
Related issues
Related to Tails - |
Rejected | 2015-09-23 | |
Related to Tails - Bug #13541: Tor still sometimes fails to bootstrap in the test suite | In Progress | 2017-07-31 |
History
#1 Updated by anonym 2017-03-29 15:18:36
- % Done changed from 0 to 20
- QA Check set to Ready for QA
- Feature Branch set to feature/12411-tor-bootstrap-fixes
Let’s see what Jenkins thinks. I intend to test this branch many times before considering it further for inclusion in Tails 2.12.
#2 Updated by anonym 2017-03-29 15:21:59
- Status changed from Confirmed to In Progress
Applied in changeset commit:729baa2b32b0d4c43c8cecbc5d5fc5a30ba3e40f.
#3 Updated by intrigeri 2017-03-30 10:28:56
anonym wrote:
> Let’s see what Jenkins thinks. I intend to test this branch many times before considering it further for inclusion in Tails 2.12.
Let me know if you want me to run it many times on a poorer Internet connection.
#4 Updated by anonym 2017-03-31 13:42:06
intrigeri wrote:
> anonym wrote:
> > Let’s see what Jenkins thinks. I intend to test this branch many times before considering it further for inclusion in Tails 2.12.
>
> Let me know if you want me to run it many times on a poorer Internet connection.
My branch seem to make /usr/local/sbin/tor-has-bootstrapped
always fail, so I have to try to understand how all that systemd state works again.
Until then I’d appreciate if you could build and run the test suite for these branches:
test/tor-bootstrap-robustness_tor-0.2.9.x
test/tor-bootstrap-robustness_tor-0.3.1.x
#5 Updated by intrigeri 2017-04-01 07:08:09
> My branch seem to make /usr/local/sbin/tor-has-bootstrapped
always fail, so I have to try to understand how all that systemd state works again.
I can give a hand if needed (over the week-end, or on Wednesday).
> Until then I’d appreciate if you could build and run the test suite for these branches:
ACK. On which ticket shall I report back?
#6 Updated by anonym 2017-04-01 11:41:23
intrigeri wrote:
> > My branch seem to make /usr/local/sbin/tor-has-bootstrapped
always fail, so I have to try to understand how all that systemd state works again.
>
> I can give a hand if needed (over the week-end, or on Wednesday).
I’ve fixed it, so please test the feature branch as well!
> > Until then I’d appreciate if you could build and run the test suite for these branches:
>
> ACK. On which ticket shall I report back?
Here is good, but I’ll treat those results with a grain of salt since they also kill tordate
. IMHO they still give good data about tor
-specific bootstrapping issues.
#7 Updated by intrigeri 2017-04-01 13:55:00
> I’ve fixed it, so please test the feature branch as well!
OK, I’m on it.
>> > Until then I’d appreciate if you could build and run the test suite for these branches:
>>
>> ACK. On which ticket shall I report back?
> Here is good, but I’ll treat those results with a grain of salt since they also kill tordate
. IMHO they still give good data about tor
-specific bootstrapping issues.
I’ve run the test suite from both branches and they completed successfully, with a (download) saturated and relatively slow Internet connection.
#8 Updated by intrigeri 2017-04-01 18:32:01
> I’ve fixed it, so please test the feature branch as well!
Passes!
#9 Updated by intrigeri 2017-04-04 05:56:36
Please reassign to me as soon as the branch builds again and is ready for code review (even if you want to run more tests on it).
#10 Updated by anonym 2017-04-06 09:25:14
- Assignee changed from anonym to intrigeri
- % Done changed from 20 to 50
So not a single bootstrap issue has been observed by you, me, sid-Jenkins and lizard-Jenkins — please review’n’merge!
#11 Updated by intrigeri 2017-04-06 09:33:04
- related to
Bug #10238: Investigate why Tor sometimes is slow at or even fails bootstrapping added
#12 Updated by intrigeri 2017-04-06 09:59:01
- Assignee changed from intrigeri to anonym
Code review at commit:4d3ba48037be1f42576c75b7092feda3079c8614:
- In
get_tor_control_socket_path()
, I think we should also look in/usr/share/tor/tor-service-defaults-torrc
(this is the place where the default socket is configured on Debian), otherwise looking in/etc/tor/torrc
is kinda futile. - It’s a bit sad to add a dependency on socat, while we already include nc that (iirc) can talk to Unix sockets.
- I’m curious why
tor_bootstrap_progress()
doesn’t need to pass-n
toecho
anymore. - I’d rather see extended regexp:s, because I would prefer not to have to learn another regexp language.
IMO none of these are real blockers for merging, but I would appreciate if the 3 first items on this list were fixed during the freeze.
So please make sure that Scenario: Clock with host's time in bridge mode
passes (it’s disabled on Jenkins), merge, and reassign to you for these fixes.
What’s “sid-Jenkins”?
#13 Updated by anonym 2017-04-06 11:48:38
- Assignee changed from anonym to intrigeri
- QA Check changed from Ready for QA to Info Needed
intrigeri wrote:
> Code review at commit:4d3ba48037be1f42576c75b7092feda3079c8614:
>
> * In get_tor_control_socket_path()
, I think we should also look in /usr/share/tor/tor-service-defaults-torrc
(this is the place where the default socket is configured on Debian), otherwise looking in /etc/tor/torrc
is kinda futile.
Good point! Will fix.
> * It’s a bit sad to add a dependency on socat, while we already include nc that (iirc) can talk to Unix sockets.
We install netcat-traditional
, which can’t talk to UNIX sockets. netcat-openbsd
can, though. I didn’t want to potentially break other stuff we use nc
/@netcat@ by switching implementation, but I realize that was perhaps a bit overly cautious since our only other uses are:
features/step_definitions/tor.rb: cmd = "echo | netcat #{host} #{port}"
features/step_definitions/tor.rb: cmd = "echo | netcat -u #{host} #{port}"
So, I could try switching to netcat-openbsd
and make sure that the tests affected still pass. Ok?
> * I’m curious why tor_bootstrap_progress()
doesn’t need to pass -n
to echo
anymore.
I’m curious why it was ever needed. Do you have an explanation?
IMHO, in general, if you need to worry about newlines in shell function “return values” (now I’m talking about what they write, not the return status) you are doing something wrong. At least with my current understanding of shell expansion rules. Am I missing something?
> * I’d rather see extended regexp:s, because I would prefer not to have to learn another regexp language.
As someone who learned regexes through via grep
and sed
without --extended-regexp
I could return your argument right back to you, but I get that probably more people than not just get confused by all those escapes. :)
> IMO none of these are real blockers for merging, but I would appreciate if the 3 first items on this list were fixed during the freeze.
I think I’ll try to do all this before merging any way. The amount of work needed to track this will be in the same order as actually fixing them. :)
> So please make sure that Scenario: Clock with host's time in bridge mode
passes (it’s disabled on Jenkins), merge, and reassign to you for these fixes.
Yes!
> What’s “sid-Jenkins”?
I meant si*b*-Jenkins. :)
#14 Updated by intrigeri 2017-04-06 12:01:30
- Assignee changed from intrigeri to anonym
- QA Check changed from Info Needed to Dev Needed
> intrigeri wrote:
>> * It’s a bit sad to add a dependency on socat, while we already include nc that (iirc) can talk to Unix sockets.
> We install netcat-traditional
, which can’t talk to UNIX sockets. netcat-openbsd
can, though.
FWIW we have switched to netcat-openbsd
on feature/stretch :)
> I didn’t want to potentially break other stuff we use nc
/@netcat@ by switching implementation,
Makes sense.
> but I realize that was perhaps a bit overly cautious since our only other uses are:
>
> features/step_definitions/tor.rb: cmd = "echo | netcat #{host} #{port}"
> features/step_definitions/tor.rb: cmd = "echo | netcat -u #{host} #{port}"
>
… and I bet we already have adapted the code to use netcat-openbsd
on feature/stretch :)
> So, I could try switching to netcat-openbsd
and make sure that the tests affected still pass. Ok?
Yes :)
>> * I’m curious why tor_bootstrap_progress()
doesn’t need to pass -n
to echo
anymore.
> I’m curious why it was ever needed. Do you have an explanation?
I guess someone (possibly me) wanted to ensure that we return an integer, not an integer followed by a newline.
> IMHO, in general, if you need to worry about newlines in shell function “return values” (now I’m talking about what they write, not the return status) you are doing something wrong. At least with my current understanding of shell expansion rules. Am I missing something?
I’m probably missing something wrt. how a function’s stdout is captured by $()
. Forget it.
>> * I’d rather see extended regexp:s, because I would prefer not to have to learn another regexp language.
> As someone who learned regexes through via grep
and sed
without --extended-regexp
I could return your argument right back to you, but I get that probably more people than not just get confused by all those escapes. :)
Forget it then (even though all the programming languages we use in Tails default to something like PCRE, so it’s not as if we were going to standardize on old-school regexps any time soon).
> I think I’ll try to do all this before merging any way. The amount of work needed to track this will be in the same order as actually fixing them. :)
OK.
#15 Updated by anonym 2017-04-07 14:56:26
- Assignee changed from anonym to intrigeri
- % Done changed from 50 to 60
- QA Check changed from Dev Needed to Ready for QA
intrigeri wrote:
> > intrigeri wrote:
> > but I realize that was perhaps a bit overly cautious since our only other uses are:
> > […]
>
> … and I bet we already have adapted the code to use netcat-openbsd
on feature/stretch :)
Good!
> > So, I could try switching to netcat-openbsd
and make sure that the tests affected still pass. Ok?
>
> Yes :)
Done, but then I found a bug in tor
, see commit:8b950d5902ad1a157a99cca234b5ddd32e661c89. So we’re back to TCP communication again, and I had to disable PrivateNetwork=yes
in a systemd unit file that were using the tor.sh
library.
> >> * I’d rather see extended regexp:s, because I would prefer not to have to learn another regexp language.
>
> > As someone who learned regexes through via grep
and sed
without --extended-regexp
I could return your argument right back to you, but I get that probably more people than not just get confused by all those escapes. :)
>
> Forget it then (even though all the programming languages we use in Tails default to something like PCRE, so it’s not as if we were going to standardize on old-school regexps any time soon).
Note that I was mostly teasing you! :) I have switched to extended regexes in this file, and will try to remember doing so for any future shell scripting I do.
> > I think I’ll try to do all this before merging any way. The amount of work needed to track this will be in the same order as actually fixing them. :)
>
> OK.
In the end I didn’t, and I wish I had at least tested “bridge mode” and Tor Launcher, cause it’s broken, and will be so in 2.12~rc1. The current state of the feature branch should fix this though => please review’n’re-merge into testing
and devel
!
#16 Updated by intrigeri 2017-04-09 11:30:43
- Assignee changed from intrigeri to anonym
- QA Check changed from Ready for QA to Dev Needed
Hi!
> Note that I was mostly teasing you! :) I have switched to extended regexes in this file, and will try to remember doing so for any future shell scripting I do.
Thanks! :)
> In the end I didn’t, and I wish I had at least tested “bridge mode” and Tor Launcher, cause it’s broken, and will be so in 2.12~rc1.
Ouch :/
> The current state of the feature branch should fix this though => please review’n’re-merge into testing
and devel
!
Merged, as I understand this does improve the current state of things.
Some bonus nitpicking (surprise!):
control_port="$(tor_rc_lookup ControlPort)"
assumes that no flag is set. Perhaps reintroduce| awk '{ print $1 }')
that you had added intor_control_socket_path
?- In theory,
ControlPort
acceptsunix:path
andauto
values: Maybe make sure it contains only digits?
Feel free to tell me that you have more important Foundations Team stuff to work on than addressing these, if that’s how you feel :) If that’s the case, mark this ticket as Fix Committed.
#17 Updated by anonym 2017-04-10 08:20:44
- Status changed from In Progress to Fix committed
- Assignee deleted (
anonym) - % Done changed from 60 to 100
- QA Check changed from Dev Needed to Pass
intrigeri wrote:
> > The current state of the feature branch should fix this though => please review’n’re-merge into testing
and devel
!
>
> Merged, as I understand this does improve the current state of things.
Thanks!
> Some bonus nitpicking (surprise!):
>
> * control_port="$(tor_rc_lookup ControlPort)"
assumes that no flag is set. Perhaps reintroduce | awk '{ print $1 }')
that you had added in tor_control_socket_path
?
> * In theory, ControlPort
accepts unix:path
and auto
values: Maybe make sure it contains only digits?
>
> Feel free to tell me that you have more important Foundations Team stuff to work on than addressing these, if that’s how you feel :) If that’s the case, mark this ticket as Fix Committed.
Yeah, I guess I should focus on other things: my inner compulsively refactoring perfectionist wanted me to do things like these, but I thought “we are not really trying to make a user-friendly general purpose library for others’ consumption, so let’s cheat by making some assumptions that likely will remain true for Tails in order to avoid over-engineering this”. If we are gonna do all this correctly, we should also be able to handle multiple ControlPort
directives, ControlPort unix:PATH
vs the equiavalent ControlSocket PATH
, ControlListenAddress
etc. Let’s not bother!
#19 Updated by anonym 2017-04-19 17:25:07
- Status changed from Fix committed to Resolved
#20 Updated by bertagaz 2017-07-31 10:41:05
- related to Bug #13541: Tor still sometimes fails to bootstrap in the test suite added