Bug #8686

Sometimes notification-daemon aborts, causing desktop notifications to not be displayed

Added by kytv 2015-01-13 13:47:03. Updated 2017-06-28 14:46:46.

Status: Rejected
Priority: Normal
Assignee:
Category:
Target version:
Start date: 2015-05-03
Due date:
% Done: 100%
Feature Branch:
Type of work: Code
Blueprint:
Starter: 0
Affected tool:
Deliverable for:

Description

Due to a race condition(?), notifications such as “Tor Is Ready” are occasionally not displayed. When this problem is seen, no notifications of any kind will be shown during the Tails session.

While this infrequent failure was last seen during the test suite with a Tails 1.2.2 ISO, I have also seen it “on bare metal”.


Subtasks

Feature #9332: Test that notification-daemon is running (Resolved, 100%)


History

#1 Updated by anonym 2015-01-13 14:18:09

  • related to Bug #8685: Adapt waiting for user notification facilities for Jessie added

#2 Updated by anonym 2015-01-13 14:19:53

  • Status changed from New to Confirmed

This seems like another instance of Bug #8685, which shows that it is actually a more general issue that doesn’t only affect the MAC spoofing panic mode notifications.

#3 Updated by intrigeri 2015-01-13 16:25:30

  • Target version changed from Tails_1.2.3 to Tails_1.3

#4 Updated by kytv 2015-01-16 11:10:43

As I run the test suite more often, I’m seeing this far more frequently than I used to.

#5 Updated by BitingBird 2015-03-12 21:02:36

  • Target version changed from Tails_1.3 to Tails_1.3.2

Finally moving to next milestone. Let’s hope someone adopts this lonely ticket :)

#6 Updated by intrigeri 2015-03-20 23:33:18

  • Assignee set to kytv
  • QA Check set to Info Needed

kytv wrote:
> When this problem is seen, no notifications of any kind will be shown during the Tails session.

Do you mean that the notifications handling is fully broken in such a situation? E.g. is a notification manually sent with notify-send(1) actually displayed?

#7 Updated by kytv 2015-03-20 23:47:43

  • Assignee changed from kytv to intrigeri

intrigeri wrote:
> kytv wrote:
> > When this problem is seen, no notifications of any kind will be shown during the Tails session.
>
> Do you mean that the notifications handling is fully broken in such a situation? E.g. is a notification manually sent with notify-send(1) actually displayed?

Correct. Well, at least I think notify-send will fail when run manually. I can say for certain that when I hit this bug, notifications will not be displayed when, for example, opening the I2P- or Unsafe-Browsers.

If/when I see this again, is there anything that I can do to generate useful information to debug this?

#8 Updated by intrigeri 2015-03-21 00:04:59

> Correct. Well, at least I think notify-send will fail when run manually. I can say for certain that when I hit this bug, notifications will not be displayed when, for example, opening the I2P- or Unsafe-Browsers.

Hmm, so that race condition isn’t just about our notifications being sent too early, before the notification handler is ready, and then not displayed (that was my understanding of the problem so far, glad I’ve asked for details). But rather, something we do (possibly sending notifications at the “wrong” time, or something) apparently just breaks the notification handler. Ooops.

Can you reproduce this in Tails/Jessie? (The notification handler there is GNOME Shell, which is probably less buggy than notification-daemon; the latter was basically abandonware way before Wheezy was out.)

> If/when I see this again, is there anything that I can do to generate useful information?

Yes: look in ~/.xsession-errors, and check if notification-daemon is running.

#9 Updated by intrigeri 2015-03-21 00:06:16

  • Assignee changed from intrigeri to kytv

#10 Updated by kytv 2015-03-23 02:38:22

  • Assignee changed from kytv to intrigeri
  • QA Check deleted (Info Needed)

intrigeri wrote:
> I wrote:
>
> > If/when I see this again, is there anything that I can do to generate useful information to debug this?
>
> Yes: look in ~/.xsession-errors, and check if notification-daemon is running.

First: I really like the script added by anonym, features/scripts/vm-execute. It is sooooooo handy!

Second, it looks like notification-daemon is not running.

# features/scripts/vm-execute 'ps aux |grep notification-daemon'
Return status: 1
STDOUT:
STDERR:

In ~/.xsession-errors:

(gnome-panel:4668): Gtk-CRITICAL **: gtk_accelerator_parse_with_keycode: assertion `accelerator != NULL' failed

** (gnome-panel:4668): WARNING **: Unable to parse mouse modifier '(null)'

Initializing nautilus-gdu extension
nautilus-wipe-Message: Initializing

(gnome-panel:4668): GLib-GObject-WARNING **: /build/buildd-glib2.0_2.33.12+really2.32.4-5-i386-eISom6/glib2.0-2.33.12+really2.32.4/./gobject/gsignal.c:2459: signal `size_request' is invalid for instance `0xa141ac0'
** Message: applet now embedded in the notification area

(nm-applet:4716): Gdk-CRITICAL **: gdk_window_thaw_toplevel_updates_libgtk_only: assertion `window->update_and_descendants_freeze_count > 0' failed

(notification-daemon:4719): Gdk-CRITICAL **: gdk_window_thaw_toplevel_updates_libgtk_only: assertion `window->update_and_descendants_freeze_count > 0' failed

(florence:4794): Gtk-WARNING **: gtk_widget_size_allocate(): attempt to allocate widget with width -23 and height 27
localuser:tails-upgrade-frontend being added to access control list
Prototype mismatch: sub Tails::IUK::Frontend::assert ($;$) vs none at /usr/share/perl5/Tails/IUK/Frontend.pm line 40
The system is up-to-date

(gnome-settings-daemon:4624): power-plugin-WARNING **: failed to turn the panel on: Display is not DPMS capable
localuser:tails-upgrade-frontend being removed from access control list
Window manager warning: Invalid WM_TRANSIENT_FOR window 0x220001f specified for 0x2200025 (Error Regi).

(gnome-settings-daemon:4624): power-plugin-WARNING **: failed to turn the panel on: Display is not DPMS capable
STDERR:

According to that, notification-daemon had a PID of 4719 but by this point it’s not running.

# features/scripts/vm-execute 'ps -fp 4719'
Return status: 1
STDOUT:
UID        PID  PPID  C STIME TTY          TIME CMD
STDERR:

As I suspected in Bug #8686#note-7, I cannot display any notifications with notify-send, which makes sense given that notification-daemon is not running. (I wasn’t familiar with notification-daemon; otherwise I would have reported that initially.)
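
For anyone repeating this check, the two steps above (find the PID that notification-daemon logged in ~/.xsession-errors, then see whether that PID still exists) could be sketched as a small POSIX shell helper. This is only a sketch: the function names are made up for illustration and are not part of the Tails test suite, and the log-line pattern is assumed from the GLib messages pasted above.

```shell
#!/bin/sh
# Sketch only: these function names are hypothetical, not Tails code.

# Print every distinct PID that NAME logged in FILE. Assumes GLib-style
# lines such as "(notification-daemon:4719): Gdk-CRITICAL **: ...".
logged_pids() {
    name=$1
    file=$2
    sed -n "s/.*(${name}:\([0-9][0-9]*\)):.*/\1/p" "$file" | sort -u
}

# Succeed if PID still exists. "kill -0" sends no signal, it only checks
# that the process can be signalled; the /proc test (Linux-specific)
# covers processes owned by another user.
pid_alive() {
    kill -0 "$1" 2>/dev/null || [ -d "/proc/$1" ]
}

# Report the state of every PID that NAME logged in FILE.
report_daemon() {
    for pid in $(logged_pids "$1" "$2"); do
        if pid_alive "$pid"; then
            echo "$1 (PID $pid) is still running"
        else
            echo "$1 (PID $pid) is gone"
        fi
    done
}
```

It would be called as `report_daemon notification-daemon ~/.xsession-errors` inside the affected session. A PID can in principle be reused by an unrelated process, so a "still running" answer is a hint rather than proof.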

#11 Updated by intrigeri 2015-03-23 14:36:15

Thanks. So this looks like one more bug in notification-daemon. If you can’t reproduce this with feature-jessie, I’m inclined to treat it as low-priority (and of course close once we move to Jessie) => Kill Your TV, can you reproduce this with feature-jessie?

#12 Updated by intrigeri 2015-03-23 14:36:54

  • Assignee changed from intrigeri to kytv
  • QA Check set to Info Needed

#13 Updated by kytv 2015-03-24 21:59:24

intrigeri wrote:
> Thanks. So this looks like one more bug in notification-daemon. If you can’t reproduce this with feature-jessie, I’m inclined to treat it as low-priority (and of course close once we move to Jessie) => Kill Your TV, can you reproduce this with feature-jessie?

Due to Bug #8778 I hadn’t done much with Jessie. Now that I run the test suite (woohoo), maybe I can use a custom, not-to-be-checked-in scenario to find out.

To be determined, though it may take a while.

#14 Updated by kytv 2015-03-25 14:12:46

The main reason I noticed this:

 Scenario: Anti test: Detecting IPv4 TCP leaks from the Unsafe Browser with the firewall leak detector # features/tor_enforcement.feature:22
    Given I capture all network traffic                                                                 # features/step_definitions/common_steps.rb:124
    When I successfully start the Unsafe Browser                                                        # features/step_definitions/unsafe_browser.rb:94
      FindFailed: can not find UnsafeBrowserStartNotification.png on the screen.
      Line ?, in File ? (RuntimeError)
      /usr/lib/ruby/vendor_ruby/cucumber/core_ext/instance_exec.rb:73:in `rescue in cucumber_run_with_backtrace_filtering'
      /usr/lib/ruby/vendor_ruby/cucumber/core_ext/instance_exec.rb:68:in `cucumber_run_with_backtrace_filtering'
      /usr/lib/ruby/vendor_ruby/cucumber/core_ext/instance_exec.rb:36:in `cucumber_instance_exec'
      /usr/lib/ruby/vendor_ruby/cucumber/rb_support/rb_step_definition.rb:97:in `invoke'
      /usr/lib/ruby/vendor_ruby/cucumber/step_match.rb:25:in `invoke'
      /usr/lib/ruby/vendor_ruby/cucumber/runtime/support_code.rb:60:in `invoke'
      /usr/lib/ruby/vendor_ruby/cucumber/rb_support/rb_world.rb:52:in `step'
      ./features/step_definitions/unsafe_browser.rb:98:in `/^I successfully start the Unsafe Browser$/'
      features/tor_enforcement.feature:24:in `When I successfully start the Unsafe Browser'
    And I open the address "https://check.torproject.org" in the Unsafe Browser                         # features/step_definitions/common_steps.rb:606
    And I see "UnsafeBrowserTorCheckFail.png" after at most 60 seconds                                  # features/step_definitions/common_steps.rb:418
    Then the firewall leak detector has detected IPv4 TCP leaks                                         # features/step_definitions/firewall_leaks.rb:1
Scenario failed at time 00:12:26

This problem (notification-daemon dying) affects the reliability of the test suite for me (at least) since I can reproduce this frequently.

#15 Updated by BitingBird 2015-04-06 21:14:11

  • Target version changed from Tails_1.3.2 to Tails_1.4

Postponing

#16 Updated by intrigeri 2015-04-25 05:48:10

kytv wrote:
> intrigeri wrote:
> > Kill Your TV, can you reproduce this with feature-jessie?
>
> Due to Bug #8778 I hadn’t done much with Jessie. Now that I run the test suite (woohoo), maybe I can use a custom, not-to-be-checked-in scenario to find out.

Any news on that one?

#17 Updated by kytv 2015-04-25 11:09:06

  • Status changed from Confirmed to In Progress

intrigeri wrote:
> kytv wrote:
> > intrigeri wrote:
> > > Kill Your TV, can you reproduce this with feature-jessie?
> >
> > Due to Bug #8778 I hadn’t done much with Jessie. Now that I run the test suite (woohoo), maybe I can use a custom, not-to-be-checked-in scenario to find out.
>
> Any news on that one?

None yet, because I had the 1.4 test writing at the must-be-done-before-anything-else priority (with Feature #9129 & Feature #9131 at the top). Now that my 1.4-targeted CI test writing is finished, other than my self-assigned electrum tests (with some tests/tickets still awaiting review), perhaps I should work on this (where “this” means determining whether this is a problem in feature/jessie or not).

#18 Updated by kytv 2015-04-26 06:22:21

  • Assignee changed from kytv to intrigeri
  • QA Check deleted (Info Needed)

Having run the following for hours (and hours (and hours))

    Given a computer
    And I start Tails from DVD with network unplugged and I login
    Then I see "WarningVirtualMachine.png" after at most 30 seconds

with the following results

180 scenarios (180 passed)
540 steps (540 passed)

I’m confident that this is not a problem in feature/jessie.

#19 Updated by kytv 2015-04-26 06:23:07

  • related to Feature #8539: Make the test suite robust enough to be run as part of a CI setup added

#20 Updated by kytv 2015-04-26 06:24:20

  • related to deleted (Feature #8539: Make the test suite robust enough to be run as part of a CI setup)

#21 Updated by kytv 2015-04-26 06:25:13

#22 Updated by kytv 2015-04-26 06:28:50

Adding Feature #8539 as a parent due to frequent failures in tests which have a step like “Then I see a notification about Blah”.

#23 Updated by intrigeri 2015-04-26 09:50:39

> I’m confident that this is not a problem in feature/jessie.

Good news!

> Adding Feature #8539 as a parent due to frequent failures in tests which have a step like “Then I see a notification about Blah”.

I don’t think that we can fix that problem (which is likely a bug in notification-daemon) without putting a lot of effort into it. Given the bug disappears once Tails/Jessie is out, the fact we lack the skills in-house to work on this bug, and the fact that notification-daemon is dead upstream, I don’t think we should treat a proper resolution of this problem as blocking Feature #8539, or high-priority in any other way.

Now, if this bug is making it too painful for you to work on the test suite, then probably we should try to find some possibly dirty workaround without spending too much time on it.

What do you think?

#24 Updated by intrigeri 2015-04-26 09:52:53

  • Assignee changed from intrigeri to kytv

#25 Updated by kytv 2015-04-27 15:30:49

  • Assignee changed from kytv to intrigeri

I’ve been able to work around this locally (mostly by commenting out the “And the notification for Blah is displayed” steps). I just figured that if it’s something I run into, others may run into it as well.

Waiting for the move to Jessie, however long that takes and as long as I’m seemingly the only one hitting this, is fine from my PoV, and however you want to handle this ticket is fine as well. :)

#26 Updated by intrigeri 2015-04-29 08:10:59

> I’ve been able to work around this locally (mostly by commenting out the “And the notification for Blah is displayed” steps),

I’m very wary of seeing each of us run their own customized version of the test suite, because it decreases the value of our reviews and test suite runs quite a bit. In this specific case, you’re basically disabling tests to make the test suite pass. If you’re applying any other local modifications to the test suite, please make sure they are tracked by tickets, because I’d rather not see such practices become standard.

In the case at hand, I think we should try to find a (possibly gory) workaround for the actual bug. If we can’t find any such workaround without spending too much time on it, too bad, and then you’ll have to resort to locally hiding the bug’s effects when running the test suite.

#27 Updated by intrigeri 2015-04-29 08:27:06

  • Assignee changed from intrigeri to kytv
  • QA Check set to Dev Needed

#28 Updated by intrigeri 2015-05-03 08:50:47

Not sure it’s related, please file a dedicated ticket if it’s not — testing 1.4~rc1 I’ve seen:

  Scenario: Anti test: Detecting IPv4 TCP leaks from the Unsafe Browser with the firewall leak detector # features/tor_enforcement.feature:22
    Given I capture all network traffic                                                                 # features/step_definitions/common_steps.rb:124
    When I successfully start the Unsafe Browser                                                        # features/step_definitions/unsafe_browser.rb:133
      FindFailed: can not find UnsafeBrowserStartNotification.png on the screen.
      Line ?, in File ? (RuntimeError)
      ./features/step_definitions/unsafe_browser.rb:137:in `/^I successfully start the Unsafe Browser$/'
      features/tor_enforcement.feature:24:in `When I successfully start the Unsafe Browser'

#29 Updated by kytv 2015-05-03 10:52:18

  • Subject changed from Desktop notifications are not always displayed to Sometimes notification-daemon aborts, causing desktop notifications to not be displayed

intrigeri wrote:
> Not sure it’s related, please file a dedicated ticket if it’s not — testing 1.4~rc1 I’ve seen:
>
> […]

I’m pretty sure it’s related. Perhaps it’d be good to have the test suite test that notification-daemon is running? This way there’d be less ambiguity as to why the notification wasn’t displayed.

#30 Updated by intrigeri 2015-05-03 11:15:58

> Perhaps it’d be good to have the test suite test that notification-daemon is running? This way there’d be less ambiguity as to why the notification wasn’t displayed.

Yes! This way, we can more or less ignore errors caused by notification-daemon bugs, and detect real problems that we may have in other places, that could cause notifications not to be displayed :)
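
A minimal sketch of such a check, written as plain shell rather than as an actual test suite step (the function name and the messages are hypothetical). It reads process names, one per line, on stdin and compares only the first 15 characters, because Linux truncates the process name (`comm`) to 15 characters, so notification-daemon shows up as notification-da.

```shell
#!/bin/sh
# Sketch only: hypothetical helper, not part of the Tails test suite.

# Explain why an expected notification was not seen.
# $1:    name of the daemon to look for
# stdin: process names, one per line (e.g. from "ps -eo comm=")
# Returns 0 if the daemon is listed, 1 otherwise.
diagnose_missing_notification() {
    daemon=$1
    if awk -v name="$daemon" '
        BEGIN { name = substr(name, 1, 15) }        # comm is cut to 15 chars
        substr($0, 1, 15) == name { found = 1 }
        END { exit !found }
    '; then
        echo "$daemon is running: the notification itself went missing"
    else
        echo "$daemon is not running: it probably aborted"
        return 1
    fi
}
```

Usage would be `ps -eo comm= | diagnose_missing_notification notification-daemon`. Matching on the process name rather than on the full command line avoids false positives from a grep or shell process that merely has the daemon's name among its arguments.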

#31 Updated by kytv 2015-05-11 16:16:45

  • Target version changed from Tails_1.4 to Tails_1.4.1

#32 Updated by kytv 2015-05-31 18:37:49

In trying to troubleshoot this I made an ISO which starts notification-daemon under strace and directs its output to /tmp.

notification-daemon WILL NOT abort when run with strace. :| After thousands of test runs it simply does not abort when run under strace :(
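
If strace changes the timing enough to hide the abort (a guess, not something established in this ticket), one lighter probe would be a wrapper that only records how the daemon ended. A POSIX shell reports a child killed by signal N as exit status 128+N, so a process ended by abort() (SIGABRT, signal 6 on Linux) would show up as status 134. The function name below is an assumption, and this approach has not been tried here.

```shell
#!/bin/sh
# Sketch only: hypothetical wrapper, not something used in this ticket.

# Run a command and describe how it ended.
# Usage: run_and_report COMMAND [ARGS...]
# Caveat: a command that itself exits with a status above 128 cannot be
# told apart from one killed by a signal.
run_and_report() {
    "$@"
    status=$?
    if [ "$status" -gt 128 ]; then
        echo "$1 killed by signal $((status - 128))"
    else
        echo "$1 exited with status $status"
    fi
    return "$status"
}
```

Starting the daemon through `run_and_report` with its output appended to a file under /tmp would leave a one-line record of the exit status, without tracing every system call.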

I’m anxiously waiting to see if Bug #7249 makes this better.

#33 Updated by kytv 2015-05-31 22:05:10

kytv wrote:

> I’m anxiously waiting to see if Bug #7249 makes this better.

It does not solve this problem.

#34 Updated by kytv 2015-06-07 11:05:24

In #tails-dev I entertained a theory that this might have the same root cause as Bug #7912.

Sometimes it does happen that notification-daemon isn’t running and the locale isn’t set at the same time, but I was able to establish that this is pure coincidence. There’s absolutely, positively no correlation between them.

#35 Updated by kytv 2015-06-28 13:03:04

  • Target version changed from Tails_1.4.1 to Tails_1.5

#36 Updated by kytv 2015-08-04 04:02:19

  • Target version changed from Tails_1.5 to Tails_1.6

#37 Updated by bertagaz 2015-09-23 01:32:11

  • Target version changed from Tails_1.6 to Tails_1.7

#38 Updated by kytv 2015-11-04 10:46:04

  • Target version changed from Tails_1.7 to Tails_1.8

#39 Updated by intrigeri 2015-11-10 11:20:25

  • Target version deleted (Tails_1.8)
  • Parent task deleted (Feature #8539)
  • QA Check deleted (Dev Needed)

We’ve known for 7 months that this doesn’t affect Jessie, no substantial progress has been made in months, and it’s not been reported as a cause of fragile tests on Jenkins AFAICT, so I say we should just give up on this one, and instead focus on other matters, e.g. releasing Tails 2.0 => unparenting and dropping the target version (which has been bumped basically every 6 weeks since 1.2.3, so it’s not as if it was very meaningful info).

#40 Updated by intrigeri 2015-11-10 11:20:59

  • related to deleted (Bug #8685: Adapt waiting for user notification facilities for Jessie)

#41 Updated by BitingBird 2016-07-01 11:32:34

  • Assignee deleted (kytv)

no news from kytv -> removing assignee

#42 Updated by Anonymous 2017-06-28 14:46:46

  • Status changed from In Progress to Rejected

We don’t even run this daemon anymore :) Closing.