Bug #12481

SSH_AUTH_SOCK occasionally not set in GNOME Terminal

Added by sajolida 2017-04-26 10:25:19. Updated 2017-07-05 19:03:51.

Status: Resolved
Priority: Normal
Assignee:
Category:
Target version:
Start date: 2017-04-26
Due date:
% Done: 100%
Feature Branch: bugfix/12481-set-SSH_AUTH_SOCK
Type of work: Code
Blueprint:
Starter:
Affected tool:
Deliverable for:

Description

With the 3.0 beta series (at least beta 3 and beta 4) I occasionally don’t get a working SSH agent. After starting my session, when doing git fetch on an SSH repo, I’m asked for the passphrase on the terminal and not through the usual GNOME popup. Then I’m asked for my passphrase every time I try to use my SSH key.

I sent a WhisperBack report and will point it out to intrigeri. I’ll be happy to provide additional information if needed even though I don’t know how to reproduce this bug and it happens to me once a week or so despite having quite a fixed starting ritual (heavily customized as you can imagine).


History

#1 Updated by intrigeri 2017-04-29 09:40:50

  • Assignee changed from intrigeri to sajolida

In the logs I see: gnome-keyring-d[12190]: couldn't access control socket: /run/user/1000/keyring/control: No such file or directory. I suspect there’s a race condition between the creation of one of the parent directories and the startup of gnome-keyring-daemon, but I’ll need more info to pinpoint the root cause of the problem, so next time this happens, please send me the output of ls -lA /run/user /run/user/1000 /run/user/1000/keyring. Thanks in advance!

#2 Updated by sajolida 2017-05-22 22:37:45

I still have this one in mind but haven’t seen it since then.

#3 Updated by sajolida 2017-05-26 18:06:05

I might have faced this in this session but I’m surprised to have noticed it only several hours after I started working.

So I’m still pasting the output you asked for but we should maybe take it with a grain of salt:

$ ls -lA /run/user /run/user/1000 /run/user/1000/keyring
/run/user:
total 0
drwx------ 11 amnesia    amnesia    240 May 26 12:00 1000
drwx------  8 Debian-gdm Debian-gdm 180 May 26 11:58 114

/run/user/1000:
total 0
srw-rw-rw- 1 amnesia amnesia   0 May 26 12:00 bus
drwx------ 2 amnesia amnesia  60 May 26 17:57 dconf
drwx--x--x 2 amnesia amnesia  60 May 26 12:00 gdm
drwx------ 3 amnesia amnesia  60 May 26 12:00 gnome-shell
drwx------ 2 amnesia amnesia 140 May 26 12:00 gnupg
drwx------ 2 amnesia amnesia  40 May 26 12:00 gvfs
drwx------ 2 amnesia amnesia  40 May 26 12:00 gvfs-burn
drwx------ 2 amnesia amnesia 100 May 26 12:00 keyring
drwx------ 2 amnesia amnesia  80 May 26 12:00 pulse
drwxr-xr-x 3 amnesia amnesia 100 May 26 12:00 systemd

/run/user/1000/keyring:
total 0
srwx------ 1 amnesia amnesia 0 May 26 12:00 control
srwx------ 1 amnesia amnesia 0 May 26 12:00 pkcs11
srwx------ 1 amnesia amnesia 0 May 26 12:00 ssh

#4 Updated by intrigeri 2017-05-27 08:03:14

Ouch, I didn’t ask for enough info. Next time it happens, please give me the output of the same ls command as before, plus:

  • ps aux | grep gnome-keyring
  • sudo journalctl | grep gnome-keyring

Thanks in advance!

#5 Updated by sajolida 2017-06-01 14:35:27

  • Assignee changed from sajolida to intrigeri

This time I’m pretty sure it happened right from the start:

amnesia@amnesia:~$ ls -lA /run/user /run/user/1000 /run/user/1000/keyring
/run/user:
total 0
drwx------ 11 amnesia    amnesia    240 Jun  1 14:08 1000
drwx------  8 Debian-gdm Debian-gdm 180 Jun  1 14:05 114

/run/user/1000:
total 0
srw-rw-rw- 1 amnesia amnesia   0 Jun  1 14:08 bus
drwx------ 2 amnesia amnesia  60 Jun  1 14:17 dconf
drwx--x--x 2 amnesia amnesia  60 Jun  1 14:08 gdm
drwx------ 3 amnesia amnesia  60 Jun  1 14:08 gnome-shell
drwx------ 2 amnesia amnesia 140 Jun  1 14:08 gnupg
drwx------ 2 amnesia amnesia  40 Jun  1 14:08 gvfs
drwx------ 2 amnesia amnesia  40 Jun  1 14:08 gvfs-burn
drwx------ 2 amnesia amnesia 100 Jun  1 14:08 keyring
drwx------ 2 amnesia amnesia  80 Jun  1 14:08 pulse
drwxr-xr-x 3 amnesia amnesia 100 Jun  1 14:08 systemd

/run/user/1000/keyring:
total 0
srwx------ 1 amnesia amnesia 0 Jun  1 14:08 control
srwx------ 1 amnesia amnesia 0 Jun  1 14:08 pkcs11
srwx------ 1 amnesia amnesia 0 Jun  1 14:08 ssh
amnesia@amnesia:~$ ps aux | grep gnome-keyring
amnesia  12350  0.0  0.0 286672  6992 tty2     SLl+ 14:08   0:00 /usr/bin/gnome-keyring-daemon --start --components=pkcs11
amnesia  15414  0.0  0.0  12720  1016 pts/7    S+   14:34   0:00 grep --color=auto gnome-keyring
amnesia@amnesia:~$ sudo journalctl | grep gnome-keyring
[sudo] password for amnesia: 
Jun 01 14:08:39 amnesia gnome-keyring-pkcs11.desktop[12346]: gnome-keyring-daemon: insufficient process capabilities, unsecure memory might get used
Jun 01 14:08:39 amnesia gnome-keyring-daemon[12350]: couldn't access control socket: /run/user/1000/keyring/control: No such file or directory
Jun 01 14:08:39 amnesia gnome-keyring-secrets.desktop[12348]: gnome-keyring-daemon: insufficient process capabilities, unsecure memory might get used
Jun 01 14:08:39 amnesia gnome-keyring-d[12350]: couldn't access control socket: /run/user/1000/keyring/control: No such file or directory
Jun 01 14:08:39 amnesia gnome-keyring-ssh.desktop[12349]: gnome-keyring-daemon: insufficient process capabilities, unsecure memory might get used
Jun 01 14:08:39 amnesia gnome-keyring-daemon[12351]: couldn't access control socket: /run/user/1000/keyring/control: No such file or directory
Jun 01 14:08:39 amnesia gnome-keyring-d[12351]: couldn't access control socket: /run/user/1000/keyring/control: No such file or directory
Jun 01 14:08:39 amnesia gnome-keyring-daemon[12352]: couldn't access control socket: /run/user/1000/keyring/control: No such file or directory
Jun 01 14:08:39 amnesia gnome-keyring-d[12352]: couldn't access control socket: /run/user/1000/keyring/control: No such file or directory
Jun 01 14:08:39 amnesia gnome-keyring-daemon[12350]: Gkm: using old keyring directory: /home/amnesia/.gnome2/keyrings
Jun 01 14:08:39 amnesia gnome-keyring-d[12350]: using old keyring directory: /home/amnesia/.gnome2/keyrings
Jun 01 14:08:39 amnesia gnome-keyring-daemon[12350]: Gkm: using old keyring directory: /home/amnesia/.gnome2/keyrings
Jun 01 14:08:39 amnesia gnome-keyring-d[12350]: using old keyring directory: /home/amnesia/.gnome2/keyrings
Jun 01 14:08:39 amnesia gnome-keyring-ssh.desktop[12349]: SSH_AUTH_SOCK=/run/user/1000/keyring/ssh

#6 Updated by intrigeri 2017-06-02 15:22:27

  • QA Check deleted (Info Needed)

#7 Updated by intrigeri 2017-06-03 07:53:42

  • Assignee changed from intrigeri to sajolida
  • QA Check set to Info Needed

Here’s some additional debugging info I’ll need, next time you see that happen. Better send it to me privately as some of it can be sensitive.

First, run sudo apt update && sudo apt install strace && sudo sysctl kernel.yama.ptrace_scope=0 to allow using strace; this will make your session a tiny bit less hardened against an adversary who already got inside your Tails.

Then let’s gather info about the initial status:

  • env | grep '^SSH'
  • sudo lsof | grep /run/user/1000/keyring
  • pgrep gnome-keyring
  • strace -ff -o ssh-add.strace ssh-add -l
  • ssh -vvv USER@SERVER (for the user/server you’re trying to connect to): what I’m interested in is whether and how the SSH client tries to talk to $SSH_AUTH_SOCK, and why it fails

Finally, let’s do the same again but while gathering some debug info about gnome-keyring-daemon:

  • Run pkill gnome-keyring && sleep 5 && strace -ff -o gkd.strace /usr/bin/gnome-keyring-daemon --foreground --components=pkcs11,secrets,ssh
  • Run the same set of commands as above (env, lsof, pgrep, ssh-add, ssh -vvv) again.
  • Send me the output of all these commands, and the content of the *.strace.* files.

When you send me all this info, please make it clear which output comes from which set of commands (before vs. after restarting gnome-keyring-daemon).

Thanks in advance!
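
To keep the two rounds of output apart, as requested above, the unprivileged checks could be wrapped in a small helper. This is only a sketch: the function name collect_state, the output directory layout and the file names are assumptions made for illustration, and the lsof, strace and ssh -vvv steps still need to be run by hand.

```shell
#!/bin/sh
# Sketch: save the output of the unprivileged checks under OUTDIR/LABEL/
# so that "before" and "after" runs cannot be mixed up.
# collect_state, OUTDIR and the file names are illustrative, not from Tails.
collect_state() {
    label="$1"
    outdir="${2:-./ssh-debug}"
    mkdir -p "$outdir/$label" || return 1

    # SSH-related variables of the current shell (grep exits 1 if none match)
    env | grep '^SSH' > "$outdir/$label/env.txt" || true

    # PIDs of any running gnome-keyring processes (empty file if none)
    pgrep gnome-keyring > "$outdir/$label/pgrep.txt" 2>/dev/null || true

    # Keys visible through the agent; error output is kept for inspection
    ssh-add -l > "$outdir/$label/ssh-add.txt" 2>&1 || true
}

# Usage:
#   collect_state before
#   ... restart gnome-keyring-daemon under strace as described above ...
#   collect_state after
```

Each label gets its own directory, so the resulting tree can be archived and sent as-is.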

#8 Updated by intrigeri 2017-06-09 20:19:44

  • Target version changed from Tails_3.0 to Tails_3.1

3.0 is now frozen.

#9 Updated by sajolida 2017-06-17 10:38:44

  • File SSH.tar.gz.pgp added
  • Assignee changed from sajolida to intrigeri

In attachment.

#10 Updated by intrigeri 2017-06-19 11:04:40

  • QA Check deleted (Info Needed)

#11 Updated by intrigeri 2017-06-23 16:12:09

I’ve just seen that happen on 3.0~beta4 (yeah!): $SSH_AUTH_SOCK wasn’t set in GNOME Terminal. But SSH_AUTH_SOCK=/run/user/1000/keyring/ssh ssh $SERVER works just fine, so it seems that (at least in my case) GNOME Keyring works just fine, and the problem is about setting the GNOME session environment properly. Interestingly:

$ systemctl --user show-environment  | grep '^SSH'
SSH_AUTH_SOCK=/run/user/1000/keyring/ssh

… so at least /usr/local/lib/start-systemd-desktop-target (started via /etc/xdg/autostart/systemd-desktop-target.desktop) was run with the correct environment.

/etc/xdg/autostart/gnome-keyring-ssh.desktop has X-GNOME-Autostart-Phase=PreDisplayServer, which was introduced precisely to ensure the variables set by gnome-keyring-ssh are picked up by GNOME Shell and its descendants. That’s all right, except GNOME Terminal is run from /usr/lib/systemd/user/gnome-terminal-server.service, which is D-Bus activated, i.e. it’s not started by GNOME Shell. And indeed, the GNOME session’s dbus-daemon has no SSH_AUTH_SOCK in its environment, which is not surprising as (in that session that exposes the bug) dbus-daemon was started before gnome-keyring.
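
One way to check which processes carry the variable is to read their start-up environment from /proc. The helper below is a sketch: environ_get is a hypothetical name, and it assumes the Linux /proc/PID/environ format, i.e. NUL-separated NAME=value pairs reflecting the environment the process was started with.

```shell
#!/bin/sh
# Sketch: print the value of variable NAME from a NUL-separated environ file.
# environ_get is an illustrative helper, not part of Tails or GNOME.
# Prints nothing if the variable is absent.
environ_get() {
    file="$1"
    name="$2"
    tr '\0' '\n' < "$file" | sed -n "s/^${name}=//p"
}

# Usage (Linux only):
#   for pid in $(pgrep -u "$(id -u)" -x dbus-daemon); do
#       printf '%s: %s\n' "$pid" "$(environ_get "/proc/$pid/environ" SSH_AUTH_SOCK)"
#   done
```

This only shows what a process inherited at start-up; it says nothing about variables later pushed into dbus-daemon's activation environment.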

A trivial workaround would be to set SSH_AUTH_SOCK=/run/user/1000/keyring/ssh in ~/.bashrc, and worst case I’ll do that for Tails 3.1. But ideally I’d like to understand the root cause of this problem as I suspect it may explain/cause other, similar bugs. I’ll now look into sajolida’s report.
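
A minimal sketch of that workaround, assuming the socket path quoted above. The function name and the guard conditions are illustrative additions, not the code that shipped in Tails: the fragment leaves an existing value alone and only exports the path when a socket is actually present.

```shell
# Sketch of a ~/.bashrc fragment (hypothetical; not the actual Tails fix).
# Export SSH_AUTH_SOCK only if it is unset or empty and the socket exists.
set_ssh_auth_sock_fallback() {
    sock="${1:-/run/user/$(id -u)/keyring/ssh}"
    if [ -z "${SSH_AUTH_SOCK:-}" ] && [ -S "$sock" ]; then
        SSH_AUTH_SOCK="$sock"
        export SSH_AUTH_SOCK
    fi
}

set_ssh_auth_sock_fallback
```

Because it runs from the shell's startup file, this only helps programs started from a terminal.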

#12 Updated by intrigeri 2017-06-23 16:17:45

intrigeri wrote:
> I’ll now look into sajolida’s report.

… which confirms that SSH_AUTH_SOCK was not set, and likely gnome-keyring would work just fine if the SSH client was talking to it.

#13 Updated by intrigeri 2017-06-23 16:18:06

  • Subject changed from SSH agent occasionally not working to SSH_AUTH_SOCK occasionally not set in GNOME Terminal

#14 Updated by intrigeri 2017-06-23 17:04:11

  • Status changed from Confirmed to In Progress
  • % Done changed from 0 to 10

intrigeri wrote:
> And indeed, the GNOME session’s dbus-daemon has no SSH_AUTH_SOCK in its environment, which is not surprising as (in that session that exposes the bug) dbus-daemon was started before gnome-keyring.

FWIW, it’s the case on my own system (where GNOME Terminal has SSH_AUTH_SOCK correctly set) as well. But there probably is a difference between “the environment dbus-daemon was started in” and “the current activation environment used by dbus-daemon”.

After searching the web for similar issues, the only possibly relevant one I’ve found is https://bugs.debian.org/804703: GNOME keyring fails to tell the session manager about the correct SSH_AUTH_SOCK due to a race condition. It would be interesting to see if something like ** Message: couldn't register in session: GDBus.Error:org.freedesktop.DBus.Error.ServiceUnknown: The name org.gnome.SessionManager was not provided by any .service files can be found in the Journal when this problem occurs.
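
A quick way to look for that message is to filter the journal for its fixed prefix. This is a sketch: has_session_register_error is a hypothetical helper name, and it reads log text from standard input so that it can be fed from journalctl or from a saved log.

```shell
# Sketch: exit 0 if the log text on stdin contains the gnome-keyring
# "couldn't register in session" message, non-zero otherwise.
# has_session_register_error is an illustrative name.
has_session_register_error() {
    grep -F -q "couldn't register in session"
}

# Usage:
#   sudo journalctl -b | has_session_register_error && echo "message found"
```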

Anyway, my current conclusion is that GNOME session initialization is in a weird state, in the middle of a transition: we’re still suffering from the drawbacks of old technologies (/etc/X11/Xsession.d, /etc/xdg/autostart), cannot fully benefit from the new ones (D-Bus activated services, systemd --user), and on top of that we suffer from race conditions caused by mixing old and new technologies. Yay! Once gnome-session relies on systemd --user and D-Bus only, this class of problem should entirely disappear and we’ll be good. I can’t wait.

This situation should eventually change during the Buster dev cycle, so I won’t invest more time debugging the current, soon to be obsolete stack, in order to understand the root cause of this problem. Instead, I’m going to look for the best (simple, working) workaround I can find. The one I suggested above (in ~/.bashrc) should work just fine for anything run in a Terminal, but it won’t be enough for graphical applications that are run from GNOME Shell and use the SSH client. Are there any in our default set of packages? Do we care? I’ll try running dbus-update-activation-environment --systemd SSH_AUTH_SOCK=/run/user/$UID/keyring/ssh in a unit WantedBy our desktop.target: this should solve the problem for all kinds of apps. And if it is harder than expected, I’ll fall back on the rough ~/.bashrc (or /etc/environment) trick.
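
The D-Bus part of that plan could look roughly like the helper below. It is a sketch, not the branch's code: the function name is made up, and it assumes dbus-update-activation-environment is on PATH and a session bus is reachable. The --systemd flag asks the tool to update the systemd --user environment as well as dbus-daemon's activation environment, so that D-Bus activated services started afterwards would see the variable.

```shell
# Sketch: publish SSH_AUTH_SOCK to the D-Bus and systemd --user activation
# environments. publish_ssh_auth_sock is an illustrative name.
publish_ssh_auth_sock() {
    sock="${1:-/run/user/$(id -u)/keyring/ssh}"
    if ! command -v dbus-update-activation-environment >/dev/null 2>&1; then
        echo "dbus-update-activation-environment not found" >&2
        return 1
    fi
    dbus-update-activation-environment --systemd "SSH_AUTH_SOCK=$sock"
}

# Usage:
#   publish_ssh_auth_sock
#   systemctl --user show-environment | grep '^SSH'
```

Services that are already running keep the environment they started with, so this has to run before anything that needs the variable is activated.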

#15 Updated by intrigeri 2017-06-25 09:55:48

  • Feature Branch set to bugfix/12481-set-SSH_AUTH_SOCK

#16 Updated by intrigeri 2017-06-25 15:51:15

  • % Done changed from 10 to 20
  • Type of work changed from Research to Code

desktop.target is started too late, so /etc/X11/Xsession.d/ it will be.

#17 Updated by intrigeri 2017-06-27 13:13:19

  • Assignee changed from intrigeri to bertagaz
  • % Done changed from 20 to 50
  • QA Check set to Ready for QA

Passes the relevant tests.

#18 Updated by intrigeri 2017-06-30 06:18:49

  • Target version changed from Tails_3.1 to Tails_3.0.1

#19 Updated by bertagaz 2017-06-30 13:33:06

  • Status changed from In Progress to Fix committed
  • Assignee deleted (bertagaz)
  • % Done changed from 50 to 100
  • QA Check changed from Ready for QA to Pass

intrigeri wrote:
> Passes the relevant tests.

Seems to work: I reproduced the bug with 3.0 too, but not with your branch. Merged!

#20 Updated by intrigeri 2017-07-05 19:03:51

  • Status changed from Fix committed to Resolved