Bug #12481
SSH_AUTH_SOCK occasionally not set in GNOME Terminal
100%
Description
With the 3.0 beta series (at least 3 and 4) I occasionally don’t get a working SSH agent. After starting my session, when doing git fetch on an SSH repo, I’m asked for the passphrase on the terminal and not through the usual GNOME popup. Then I’m asked for my passphrase every time I try to use my SSH key.
I sent a WhisperBack report and will point it out to intrigeri. I’ll be happy to provide additional information if needed even though I don’t know how to reproduce this bug and it happens to me once a week or so despite having quite a fixed starting ritual (heavily customized as you can imagine).
History
#1 Updated by intrigeri 2017-04-29 09:40:50
- Assignee changed from intrigeri to sajolida
In the logs I see: gnome-keyring-d[12190]: couldn't access control socket: /run/user/1000/keyring/control: No such file or directory. I suspect there’s a race condition between the creation of one of the parent directories and the startup of gnome-keyring-daemon, but I’ll need more info to pinpoint the root cause of the problem, so next time this happens, please send me the output of ls -lA /run/user /run/user/1000 /run/user/1000/keyring. Thanks in advance!
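A quick way to repeat that check is sketched below. It is only an illustration: missing_keyring_sockets is a hypothetical helper name, and the three socket names (control, pkcs11, ssh) are taken from the directory listings pasted further down in this ticket.

```shell
# missing_keyring_sockets [DIR]: print every expected keyring socket that is
# absent from DIR (default: the current user's keyring runtime directory).
# The socket names are an assumption based on the ls output in this ticket.
missing_keyring_sockets() {
    dir="${1:-/run/user/$(id -u)/keyring}"
    for name in control pkcs11 ssh; do
        # -S is true only if the path exists and is a socket.
        [ -S "$dir/$name" ] || printf '%s\n' "$dir/$name"
    done
}
```

Running missing_keyring_sockets right after login and printing nothing would suggest the sockets were created, which does not rule out a race earlier in session startup.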
#2 Updated by sajolida 2017-05-22 22:37:45
I still have this one in mind but haven’t seen it since then.
#3 Updated by sajolida 2017-05-26 18:06:05
I might have faced this in this session but I’m surprised to only have noticed it several hours after I started working.
So I’m still pasting the output you asked for but we should maybe take it with a grain of salt:
$ ls -lA /run/user /run/user/1000 /run/user/1000/keyring
/run/user:
total 0
drwx------ 11 amnesia amnesia 240 May 26 12:00 1000
drwx------ 8 Debian-gdm Debian-gdm 180 May 26 11:58 114
/run/user/1000:
total 0
srw-rw-rw- 1 amnesia amnesia 0 May 26 12:00 bus
drwx------ 2 amnesia amnesia 60 May 26 17:57 dconf
drwx--x--x 2 amnesia amnesia 60 May 26 12:00 gdm
drwx------ 3 amnesia amnesia 60 May 26 12:00 gnome-shell
drwx------ 2 amnesia amnesia 140 May 26 12:00 gnupg
drwx------ 2 amnesia amnesia 40 May 26 12:00 gvfs
drwx------ 2 amnesia amnesia 40 May 26 12:00 gvfs-burn
drwx------ 2 amnesia amnesia 100 May 26 12:00 keyring
drwx------ 2 amnesia amnesia 80 May 26 12:00 pulse
drwxr-xr-x 3 amnesia amnesia 100 May 26 12:00 systemd
/run/user/1000/keyring:
total 0
srwx------ 1 amnesia amnesia 0 May 26 12:00 control
srwx------ 1 amnesia amnesia 0 May 26 12:00 pkcs11
srwx------ 1 amnesia amnesia 0 May 26 12:00 ssh
#4 Updated by intrigeri 2017-05-27 08:03:14
Ouch, I didn’t ask for enough info. Next time it happens, please give me the output of the same command as before, and of:
ps aux | grep gnome-keyring
sudo journalctl | grep gnome-keyring
Thanks in advance!
#5 Updated by sajolida 2017-06-01 14:35:27
- Assignee changed from sajolida to intrigeri
This time I’m pretty sure it happened right from the start:
amnesia@amnesia:~$ ls -lA /run/user /run/user/1000 /run/user/1000/keyring
/run/user:
total 0
drwx------ 11 amnesia amnesia 240 Jun 1 14:08 1000
drwx------ 8 Debian-gdm Debian-gdm 180 Jun 1 14:05 114
/run/user/1000:
total 0
srw-rw-rw- 1 amnesia amnesia 0 Jun 1 14:08 bus
drwx------ 2 amnesia amnesia 60 Jun 1 14:17 dconf
drwx--x--x 2 amnesia amnesia 60 Jun 1 14:08 gdm
drwx------ 3 amnesia amnesia 60 Jun 1 14:08 gnome-shell
drwx------ 2 amnesia amnesia 140 Jun 1 14:08 gnupg
drwx------ 2 amnesia amnesia 40 Jun 1 14:08 gvfs
drwx------ 2 amnesia amnesia 40 Jun 1 14:08 gvfs-burn
drwx------ 2 amnesia amnesia 100 Jun 1 14:08 keyring
drwx------ 2 amnesia amnesia 80 Jun 1 14:08 pulse
drwxr-xr-x 3 amnesia amnesia 100 Jun 1 14:08 systemd
/run/user/1000/keyring:
total 0
srwx------ 1 amnesia amnesia 0 Jun 1 14:08 control
srwx------ 1 amnesia amnesia 0 Jun 1 14:08 pkcs11
srwx------ 1 amnesia amnesia 0 Jun 1 14:08 ssh
amnesia@amnesia:~$ ps aux | grep gnome-keyring
amnesia 12350 0.0 0.0 286672 6992 tty2 SLl+ 14:08 0:00 /usr/bin/gnome-keyring-daemon --start --components=pkcs11
amnesia 15414 0.0 0.0 12720 1016 pts/7 S+ 14:34 0:00 grep --color=auto gnome-keyring
amnesia@amnesia:~$ sudo journalctl | grep gnome-keyring
[sudo] password for amnesia:
Jun 01 14:08:39 amnesia gnome-keyring-pkcs11.desktop[12346]: gnome-keyring-daemon: insufficient process capabilities, unsecure memory might get used
Jun 01 14:08:39 amnesia gnome-keyring-daemon[12350]: couldn't access control socket: /run/user/1000/keyring/control: No such file or directory
Jun 01 14:08:39 amnesia gnome-keyring-secrets.desktop[12348]: gnome-keyring-daemon: insufficient process capabilities, unsecure memory might get used
Jun 01 14:08:39 amnesia gnome-keyring-d[12350]: couldn't access control socket: /run/user/1000/keyring/control: No such file or directory
Jun 01 14:08:39 amnesia gnome-keyring-ssh.desktop[12349]: gnome-keyring-daemon: insufficient process capabilities, unsecure memory might get used
Jun 01 14:08:39 amnesia gnome-keyring-daemon[12351]: couldn't access control socket: /run/user/1000/keyring/control: No such file or directory
Jun 01 14:08:39 amnesia gnome-keyring-d[12351]: couldn't access control socket: /run/user/1000/keyring/control: No such file or directory
Jun 01 14:08:39 amnesia gnome-keyring-daemon[12352]: couldn't access control socket: /run/user/1000/keyring/control: No such file or directory
Jun 01 14:08:39 amnesia gnome-keyring-d[12352]: couldn't access control socket: /run/user/1000/keyring/control: No such file or directory
Jun 01 14:08:39 amnesia gnome-keyring-daemon[12350]: Gkm: using old keyring directory: /home/amnesia/.gnome2/keyrings
Jun 01 14:08:39 amnesia gnome-keyring-d[12350]: using old keyring directory: /home/amnesia/.gnome2/keyrings
Jun 01 14:08:39 amnesia gnome-keyring-daemon[12350]: Gkm: using old keyring directory: /home/amnesia/.gnome2/keyrings
Jun 01 14:08:39 amnesia gnome-keyring-d[12350]: using old keyring directory: /home/amnesia/.gnome2/keyrings
Jun 01 14:08:39 amnesia gnome-keyring-ssh.desktop[12349]: SSH_AUTH_SOCK=/run/user/1000/keyring/ssh
#6 Updated by intrigeri 2017-06-02 15:22:27
- QA Check deleted (Info Needed)
#7 Updated by intrigeri 2017-06-03 07:53:42
- Assignee changed from intrigeri to sajolida
- QA Check set to Info Needed
Here’s some additional debugging info I’ll need, next time you see that happen. Better send it to me privately as some of it can be sensitive.
First, run sudo apt update && sudo apt install strace && sudo sysctl kernel.yama.ptrace_scope=0 to allow using strace; this will make your session a tiny bit less hardened against an adversary who already got inside your Tails.
Then let’s gather info about the initial status:
env | grep '^SSH'
sudo lsof | grep /run/user/1000/keyring
pgrep gnome-keyring
strace -ff -o ssh-add.strace ssh-add -l
ssh -vvv USER_at_SERVER
(for the user/server you’re trying to connect to): what I’m interested in is if/how the SSH client tries to talk to $SSH_AUTH_SOCK and why it fails.
Finally, let’s do the same again but while gathering some debug info about gnome-keyring-daemon:
- Run pkill gnome-keyring && sleep 5 && strace -ff -o gkd.strace /usr/bin/gnome-keyring-daemon --foreground --components=pkcs11,secrets,ssh
- Run the same set of commands as above (env, lsof, pgrep, ssh-add, ssh -vvv) again.
- Send me the output of all these commands, and the content of the *.strace.* files.
When you send me all this info, please make it clear which output comes from which set of commands (before vs. after restarting gnome-keyring-daemon).
Thanks in advance!
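One possible way to keep the two sets of output apart is sketched here. This is an assumption about how one might organise the collection, not something requested in the ticket: capture and DEBUG_DIR are hypothetical names, and the output files can contain sensitive data, so they should only be shared privately.

```shell
# Directory that receives one text file per captured command (hypothetical).
DEBUG_DIR="${DEBUG_DIR:-./ssh-agent-debug}"

# capture STAGE NAME CMD...: run CMD and save its stdout and stderr to
# $DEBUG_DIR/STAGE-NAME.txt, preceded by the command line and followed by
# the exit status, so "before" and "after" runs never get mixed up.
capture() {
    stage="$1"
    name="$2"
    shift 2
    mkdir -p "$DEBUG_DIR"
    {
        printf '$ %s\n' "$*"
        status=0
        "$@" 2>&1 || status=$?
        printf '[exit status: %s]\n' "$status"
    } > "$DEBUG_DIR/$stage-$name.txt"
}

# Example usage, first with STAGE=before, then again with STAGE=after once
# gnome-keyring-daemon has been restarted under strace:
#   capture before env     sh -c "env | grep '^SSH'"
#   capture before pgrep   pgrep gnome-keyring
#   capture before ssh-add strace -ff -o before-ssh-add.strace ssh-add -l
```

Giving the strace output files a stage prefix as well (before-ssh-add.strace versus after-ssh-add.strace) avoids overwriting the first run.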
#8 Updated by intrigeri 2017-06-09 20:19:44
- Target version changed from Tails_3.0 to Tails_3.1
3.0 is now frozen.
#9 Updated by sajolida 2017-06-17 10:38:44
- File SSH.tar.gz.pgp added
- Assignee changed from sajolida to intrigeri
In attachment.
#10 Updated by intrigeri 2017-06-19 11:04:40
- QA Check deleted (Info Needed)
#11 Updated by intrigeri 2017-06-23 16:12:09
I’ve just seen that happen on 3.0~beta4 (yeah!): $SSH_AUTH_SOCK wasn’t set in GNOME Terminal. But SSH_AUTH_SOCK=/run/user/1000/keyring/ssh ssh $SERVER works just fine, so it seems that (at least in my case) GNOME Keyring works just fine, and the problem is about setting the GNOME session environment properly. Interestingly:
$ systemctl --user show-environment | grep '^SSH'
SSH_AUTH_SOCK=/run/user/1000/keyring/ssh
… so at least /usr/local/lib/start-systemd-desktop-target (started via /etc/xdg/autostart/systemd-desktop-target.desktop) was run with the correct environment.
/etc/xdg/autostart/gnome-keyring-ssh.desktop has X-GNOME-Autostart-Phase=PreDisplayServer, which was introduced precisely to ensure the variables set by gnome-keyring-ssh are picked up by GNOME Shell and its descendants. That’s all right, except GNOME Terminal is run from /usr/lib/systemd/user/gnome-terminal-server.service, which is D-Bus activated, i.e. it’s not started by GNOME Shell. And indeed, the GNOME session’s dbus-daemon has no SSH_AUTH_SOCK in its environment, which is not surprising as (in that session that exposes the bug) dbus-daemon was started before gnome-keyring.
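For anyone wanting to repeat that observation, here is a minimal sketch of how the environment a process was started with can be inspected on Linux through /proc/PID/environ. env_file_var is a hypothetical helper name, and the pgrep invocation in the usage comment is an assumption: it may match several dbus-daemon processes, not only the session bus.

```shell
# env_file_var FILE NAME: print the value of NAME from a NUL-separated
# environment dump such as /proc/PID/environ; prints nothing if NAME is unset.
env_file_var() {
    tr '\0' '\n' < "$1" | sed -n "s/^$2=//p"
}

# Example usage (prints one line per dbus-daemon owned by the current user):
#   for pid in $(pgrep -u "$(id -u)" -x dbus-daemon); do
#       printf '%s: %s\n' "$pid" "$(env_file_var "/proc/$pid/environ" SSH_AUTH_SOCK)"
#   done
```

Note that /proc/PID/environ reflects the environment at process start, so it says nothing about variables pushed to the daemon later.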
A trivial workaround would be to set SSH_AUTH_SOCK=/run/user/1000/keyring/ssh in ~/.bashrc, and worst case I’ll do that for Tails 3.1. But ideally I’d like to understand the root cause of this problem as I suspect it may explain/cause other, similar bugs. I’ll now look into sajolida’s report.
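As an illustration of that workaround, a ~/.bashrc fragment could look roughly like the sketch below. It is not the change that was eventually merged: set_ssh_auth_sock is a hypothetical helper name, and the guard on the socket's existence is an added assumption so that an unrelated agent is never overridden.

```shell
# set_ssh_auth_sock [SOCKET]: export SSH_AUTH_SOCK only when it is currently
# unset or empty and SOCKET (default: the GNOME Keyring SSH socket path
# quoted in this ticket, for the current user) really is a socket.
set_ssh_auth_sock() {
    sock="${1:-/run/user/$(id -u)/keyring/ssh}"
    if [ -z "${SSH_AUTH_SOCK:-}" ] && [ -S "$sock" ]; then
        SSH_AUTH_SOCK="$sock"
        export SSH_AUTH_SOCK
    fi
}

set_ssh_auth_sock
```

This only helps programs started from an interactive shell, since nothing else reads ~/.bashrc.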
#12 Updated by intrigeri 2017-06-23 16:17:45
intrigeri wrote:
> I’ll now look into sajolida’s report.
… which confirms that SSH_AUTH_SOCK was not set, and likely gnome-keyring would work just fine if the SSH client was talking to it.
#13 Updated by intrigeri 2017-06-23 16:18:06
- Subject changed from SSH agent occasionally not working to SSH_AUTH_SOCK occasionally not set in GNOME Terminal
#14 Updated by intrigeri 2017-06-23 17:04:11
- Status changed from Confirmed to In Progress
- % Done changed from 0 to 10
intrigeri wrote:
> And indeed, the GNOME session’s dbus-daemon has no SSH_AUTH_SOCK in its environment, which is not surprising as (in that session that exposes the bug) dbus-daemon was started before gnome-keyring.
FWIW, it’s the case on my own system (where GNOME Terminal has SSH_AUTH_SOCK correctly set) as well. But there probably is a difference between “the environment dbus-daemon was started in” and “the current activation environment used by dbus-daemon”.
After searching the web for similar issues, the only possibly relevant one I’ve found is https://bugs.debian.org/804703: GNOME keyring fails to tell the session manager about the correct SSH_AUTH_SOCK due to a race condition. It would be interesting to see if something like ** Message: couldn't register in session: GDBus.Error:org.freedesktop.DBus.Error.ServiceUnknown: The name org.gnome.SessionManager was not provided by any .service files can be found in the Journal when this problem occurs.
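A small sketch of how that search could be scripted, assuming the message prefix quoted above is stable across versions (which is not verified here); has_session_register_error is a hypothetical helper name.

```shell
# has_session_register_error: read log text on stdin and succeed if it
# contains the "couldn't register in session" message quoted above.
has_session_register_error() {
    # -F matches the string literally, -q suppresses output.
    grep -F -q "couldn't register in session"
}

# Example usage, limited to the current boot:
#   sudo journalctl -b | has_session_register_error \
#       && echo "journal contains the message from Debian bug #804703"
```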
Anyway, my current conclusion is that GNOME session initialization is in a weird state, in the middle of a transition: we’re still suffering from the drawbacks of old technologies (/etc/X11/Xsession.d, /etc/xdg/autostart), cannot fully benefit from the new ones (D-Bus activated services, systemd --user), and on top of that we suffer from race conditions caused by mixing old and new technologies. Yay! Once gnome-session relies on systemd --user and D-Bus only, this class of problem should entirely disappear and we’ll be good. I can’t wait.
This situation should eventually change during the Buster dev cycle, so I won’t invest more time debugging the current, soon to be obsolete stack, in order to understand the root cause of this problem. Instead, I’m going to look for the best (simple, working) workaround I can find. The one I suggested above (in ~/.bashrc) should work just fine for anything run in a Terminal, but it won’t be enough for graphical applications that are run from GNOME Shell and use the SSH client. Are there any in our default set of packages? Do we care? I’ll try running dbus-update-activation-environment --systemd SSH_AUTH_SOCK=/run/user/$UID/keyring/ssh in a unit WantedBy our desktop.target: this should solve the problem for all kinds of apps. And if it is harder than expected, I’ll fall back on the rough ~/.bashrc (or /etc/environment) trick.
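The command mentioned above could be wrapped as in the following sketch. It is an illustration under assumptions, not the content of the feature branch: push_ssh_auth_sock is a hypothetical helper name, and $(id -u) stands in for $UID because the latter is only set by some shells.

```shell
# push_ssh_auth_sock [SOCKET]: publish SSH_AUTH_SOCK to the D-Bus activation
# environment and, via --systemd, to the systemd user manager, so that
# D-Bus activated services started afterwards inherit it.
push_ssh_auth_sock() {
    sock="${1:-/run/user/$(id -u)/keyring/ssh}"
    if ! command -v dbus-update-activation-environment >/dev/null 2>&1; then
        echo "dbus-update-activation-environment not found" >&2
        return 1
    fi
    dbus-update-activation-environment --systemd "SSH_AUTH_SOCK=$sock"
}
```

Services that are already running keep their old environment, so this has to happen before gnome-terminal-server is first activated.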
#15 Updated by intrigeri 2017-06-25 09:55:48
- Feature Branch set to bugfix/12481-set-SSH_AUTH_SOCK
#16 Updated by intrigeri 2017-06-25 15:51:15
- % Done changed from 10 to 20
- Type of work changed from Research to Code
desktop.target is started too late, so /etc/X11/Xsession.d/ it will be.
#17 Updated by intrigeri 2017-06-27 13:13:19
- Assignee changed from intrigeri to bertagaz
- % Done changed from 20 to 50
- QA Check set to Ready for QA
Passes the relevant tests.
#18 Updated by intrigeri 2017-06-30 06:18:49
- Target version changed from Tails_3.1 to Tails_3.0.1
#19 Updated by bertagaz 2017-06-30 13:33:06
- Status changed from In Progress to Fix committed
- Assignee deleted (bertagaz)
- % Done changed from 50 to 100
- QA Check changed from Ready for QA to Pass
intrigeri wrote:
> Passes the relevant tests.
Seems to work: I also reproduced the bug with 3.0, but not with your branch. Merged!
#20 Updated by intrigeri 2017-07-05 19:03:51
- Status changed from Fix committed to Resolved