Bug #8778
"Oh no!" / Xorg crash after logging in at the Greeter on Jessie
0%
Description
This is the problem referred to in Bug #8710.
So far, I have been able to reproduce this with every Jessie Tails ISO I’ve tried. This time I used tails-i386-feature_jessie-1.3-20150122T0511Z-0e61703.iso.
The steps to reproduce this are few:
1. Get a Tails Jessie ISO.
2. Boot it with kvm -m 4096 -cdrom tails-i386-feature_jessie-1.3-20150122T0511Z-0e61703.iso
3. Log in at the greeter.
The end result is the dreaded “Oh no!” GNOME 3 screen. If you switch to another VT and then go back to X you’ll be back at the Tails greeter.
Attached to this ticket is the result of journalctl -a > journal.txt. var/log/gdm only contained the file tails-greeter.errors. Its content:
day 022 of 2015 [18:45:28] Password variable not found.
Files
Subtasks
Related issues
Related to Tails - |
Rejected | 2014-06-12 |
History
#1 Updated by kytv 2015-01-22 19:13:50
kytv wrote:
> This is the problem referred to in Bug #8710.
I meant “that I referred to at Bug #8710#note-1”.
#2 Updated by intrigeri 2015-01-22 19:20:20
- Subject changed from [feature/jessie] "Oh no!" / Xorg crash after logging in at the greeter to "Oh no!" / Xorg crash after logging in at the Greeter on Jessie
- Assignee set to kytv
- Target version set to Tails_2.0
- QA Check set to Info Needed
I’ll need the output of journalctl -ax -o verbose to better differentiate between the various GDM, gnome-session and X.Org instances.
#3 Updated by kytv 2015-01-22 19:24:28
- File deleted (journal.txt)
#4 Updated by kytv 2015-01-22 19:25:44
- File journal.txt added (file now missing)
Attachment updated.
#5 Updated by kytv 2015-01-22 19:32:03
- File journal.txt added
I’m re-adding the file in case it was added too soon after the failures and it was missing info that may be of use in diagnosing this problem.
#6 Updated by kytv 2015-01-22 19:32:16
- File deleted (journal.txt)
#7 Updated by kytv 2015-01-22 19:32:41
- Assignee changed from kytv to intrigeri
- QA Check deleted (Info Needed)
#8 Updated by intrigeri 2015-01-22 20:02:21
Here are bits of the log I find useful to understand the timing of events:
Thu 2015-01-22 19:22:30.832007 UTC [s=ac50e06708e0463b84ea82d7e2f781cb;i=31b;b=a38adcd9c0164ccfaa853ad70505b6e6;m=3193e97;t=50d4297e39cc0;x=83e8427bee77a3c6]
MESSAGE=pam_unix(gdm-launch-environment:session): session opened for user Debian-gdm by (uid=0)
Thu 2015-01-22 19:23:03.506610 UTC [s=ac50e06708e0463b84ea82d7e2f781cb;i=59a;b=a38adcd9c0164ccfaa853ad70505b6e6;m=50bd314;t=50d4299d6313c;x=5e7e8b9172b96802]
SYSLOG_IDENTIFIER=nm-dispatcher
MESSAGE=Dispatching action 'up' for eth0
_PID=2546
Thu 2015-01-22 19:23:07.972295 UTC [s=ac50e06708e0463b84ea82d7e2f781cb;i=5a4;b=a38adcd9c0164ccfaa853ad70505b6e6;m=54ff49f;t=50d429a1a52c7;x=3cfa3c931d8799be]
MESSAGE=/etc/gdm3/PostLogin/Default: line 162: /var/lib/gdm3/tails.password: No such file or directory
Thu 2015-01-22 19:23:08.317312 UTC [s=ac50e06708e0463b84ea82d7e2f781cb;i=5a8;b=a38adcd9c0164ccfaa853ad70505b6e6;m=5553a3f;t=50d429a1f9867;x=fc333b5726d95dfe]
PRIORITY=6
_UID=0
_SYSTEMD_SLICE=system.slice
_BOOT_ID=a38adcd9c0164ccfaa853ad70505b6e6
_MACHINE_ID=fe471aac2ec3c2730d0d14a2069ad59c
_CAP_EFFECTIVE=3fffffffff
_TRANSPORT=syslog
SYSLOG_FACILITY=10
_HOSTNAME=amnesia
_SYSTEMD_CGROUP=/system.slice/gdm.service
_SYSTEMD_UNIT=gdm.service
_COMM=gdm-session-wor
_EXE=/usr/lib/gdm3/gdm-session-worker
_GID=1000
_PID=1956
_CMDLINE=gdm-session-worker [pam/gdm-autologin]
SYSLOG_IDENTIFIER=gdm-autologin]
MESSAGE=pam_unix(gdm-autologin:session): session opened for user amnesia by (unknown)(uid=0)
Thu 2015-01-22 19:23:08.386992 UTC [s=ac50e06708e0463b84ea82d7e2f781cb;i=5b6;b=a38adcd9c0164ccfaa853ad70505b6e6;m=556896c;t=50d429a20e794;x=3bf5168f2ea1d0e2]
MESSAGE=pam_unix(gdm-launch-environment:session): session closed for user Debian-gdm
Thu 2015-01-22 19:23:08.395664 UTC [s=ac50e06708e0463b84ea82d7e2f781cb;i=5b7;b=a38adcd9c0164ccfaa853ad70505b6e6;m=5568eea;t=50d429a20ed12;x=2c04f18f671beefa]
MESSAGE=pam_systemd(gdm-launch-environment:session): Failed to release session: Interrupted system call
Thu 2015-01-22 19:23:08.349504 UTC [s=ac50e06708e0463b84ea82d7e2f781cb;i=5aa;b=a38adcd9c0164ccfaa853ad70505b6e6;m=555b719;t=50d429a201541;x=7c87077f925fd853]
MESSAGE=pam_unix(systemd-user:session): session opened for user amnesia by (uid=0)
Thu 2015-01-22 19:23:08.508258 UTC [s=ac50e06708e0463b84ea82d7e2f781cb;i=5c3;b=a38adcd9c0164ccfaa853ad70505b6e6;m=558223a;t=50d429a228062;x=bfe1ecbf1b0d6e56]
MESSAGE=/etc/gdm3/Xsession: Beginning session setup...
Thu 2015-01-22 19:23:08.896561 UTC [s=ac50e06708e0463b84ea82d7e2f781cb;i=5c4;b=a38adcd9c0164ccfaa853ad70505b6e6;m=55e13a5;t=50d429a2871cd;x=c1c859bd7bfb89d8]
Thu 2015-01-22 19:23:12.916602 UTC [s=ac50e06708e0463b84ea82d7e2f781cb;i=5d0;b=a38adcd9c0164ccfaa853ad70505b6e6;m=59b6652;t=50d429a65c47a;x=43dc49f7ffaec0dd]
_CMDLINE=/usr/bin/dbus-daemon --fork --print-pid 4 --print-address 6 --session
MESSAGE=Successfully activated service 'org.a11y.atspi.Registry'
19:23:13: starting pulseaudio and gnome-keyring-daemon
Thu 2015-01-22 19:23:14.164419 UTC [s=ac50e06708e0463b84ea82d7e2f781cb;i=5e3;b=a38adcd9c0164ccfaa853ad70505b6e6;m=5ae719f;t=50d429a78cfc7;x=40be85e06733cf1b]
SYSLOG_IDENTIFIER=x-session-manager
MESSAGE=WARNING: App 'pulseaudio.desktop' exited with code 1
MESSAGE=Failure: Module initialization failed
Thu 2015-01-22 19:23:16.302979 UTC [s=ac50e06708e0463b84ea82d7e2f781cb;i=5e6;b=a38adcd9c0164ccfaa853ad70505b6e6;m=5cf13ac;t=50d429a9971d4;x=d481e1d3487ae1b6]
MESSAGE=[system] Activating via systemd: service name='org.freedesktop.UDisks2' unit='udisks2.service'
Thu 2015-01-22 19:23:16.887663 UTC [s=ac50e06708e0463b84ea82d7e2f781cb;i=5ec;b=a38adcd9c0164ccfaa853ad70505b6e6;m=5d7fe47;t=50d429aa25c6f;x=98978d11be15321e]
SYSLOG_IDENTIFIER=org.gtk.Private.AfcVolumeMonitor
MESSAGE=Volume monitor alive
Thu 2015-01-22 19:23:16.910460 UTC [s=ac50e06708e0463b84ea82d7e2f781cb;i=5ed;b=a38adcd9c0164ccfaa853ad70505b6e6;m=5d858f3;t=50d429aa2b71b;x=65294fbf64c52f8d]
MESSAGE=[system] Activating via systemd: service name='org.freedesktop.hostname1' unit='dbus-org.freedesktop.hostname1.service'
Thu 2015-01-22 19:23:17.740102 UTC [s=ac50e06708e0463b84ea82d7e2f781cb;i=5ef;b=a38adcd9c0164ccfaa853ad70505b6e6;m=5e50440;t=50d429aaf6268;x=4a5c261a35efbd6d]
MESSAGE=[system] Successfully activated service 'org.freedesktop.hostname1'
Thu 2015-01-22 19:23:17.870697 UTC [s=ac50e06708e0463b84ea82d7e2f781cb;i=5f0;b=a38adcd9c0164ccfaa853ad70505b6e6;m=5e6fe41;t=50d429ab15c69;x=b17ae10265a9cae4]
starting to kill GDM's xorg
Thu 2015-01-22 19:23:18.034150 UTC [s=ac50e06708e0463b84ea82d7e2f781cb;i=5f1;b=a38adcd9c0164ccfaa853ad70505b6e6;m=5ea3050;t=50d429ab48e78;x=2190e574b6bd985a]
SYSLOG_IDENTIFIER=cupsd
MESSAGE=Unable to change ownership of "/var/log/cups" - Permission denied
... and then more.
Thu 2015-01-22 19:23:18.996487 UTC [s=ac50e06708e0463b84ea82d7e2f781cb;i=613;b=a38adcd9c0164ccfaa853ad70505b6e6;m=5f82d51;t=50d429ac28b79;x=d19faeab5aaa01c6]
MESSAGE=[system] Activating via systemd: service name='org.freedesktop.locale1' unit='dbus-org.freedesktop.locale1.service'
Thu 2015-01-22 19:23:19.062520 UTC [s=ac50e06708e0463b84ea82d7e2f781cb;i=617;b=a38adcd9c0164ccfaa853ad70505b6e6;m=5f92f76;t=50d429ac38d9e;x=d4875f7734ab7838]
MESSAGE=[system] Successfully activated service 'org.freedesktop.locale1'
Thu 2015-01-22 19:23:24.402946 UTC [s=ac50e06708e0463b84ea82d7e2f781cb;i=61a;b=a38adcd9c0164ccfaa853ad70505b6e6;m=64aac12;t=50d429b150a3a;x=bd0b18c2a0f88a64]
MESSAGE=Registered Authentication Agent for unix-session:1 (system bus name :1.43 [/usr/bin/gnome-shell], object path /org/freedesktop/PolicyKit1/AuthenticationAgent, locale en_US.UTF-8)
Thu 2015-01-22 19:23:25.559911 UTC [s=ac50e06708e0463b84ea82d7e2f781cb;i=61d;b=a38adcd9c0164ccfaa853ad70505b6e6;m=65c536c;t=50d429b26b194;x=2cf00269694cb5b9]
MESSAGE=pam_unix(login:session): session opened for user root by LOGIN(uid=0)
_SYSTEMD_CGROUP=/system.slice/system-getty.slice/getty@tty2.service
Thu 2015-01-22 19:23:26.810898 UTC [s=ac50e06708e0463b84ea82d7e2f781cb;i=62e;b=a38adcd9c0164ccfaa853ad70505b6e6;m=66f69c4;t=50d429b39c7ec;x=7825eb86ce6c8c31]
_CMDLINE=sudo -n -u debian-tor /usr/local/sbin/tor-has-bootstrapped
MESSAGE=pam_unix(sudo:session): session opened for user debian-tor by (uid=0)
_PID=2902
Thu 2015-01-22 19:23:36.866314 UTC [s=ac50e06708e0463b84ea82d7e2f781cb;i=64c;b=a38adcd9c0164ccfaa853ad70505b6e6;m=708d7e2;t=50d429bd3360a;x=9c1d954af0f09984]
MESSAGE=amnesia : TTY=unknown ; PWD=/home/amnesia ; USER=debian-tor ; COMMAND=/usr/local/sbin/tor-has-bootstrapped
_CMDLINE=sudo -n -u debian-tor /usr/local/sbin/tor-has-bootstrapped
Thu 2015-01-22 19:23:37.317343 UTC [s=ac50e06708e0463b84ea82d7e2f781cb;i=651;b=a38adcd9c0164ccfaa853ad70505b6e6;m=70fbaaf;t=50d429bda18d7;x=fcdac88b95cafec7]
_CMDLINE=sudo -n -u debian-tor /usr/local/sbin/tor-has-bootstrapped
MESSAGE=pam_unix(sudo:session): session opened for user debian-tor by (uid=0)
Thu 2015-01-22 19:23:43.044585 UTC [s=ac50e06708e0463b84ea82d7e2f781cb;i=656;b=a38adcd9c0164ccfaa853ad70505b6e6;m=7671dc1;t=50d429c317be9;x=45ed00ab91852c9b]
MESSAGE=Gjs-Message: JS WARNING: [/usr/share/gnome-shell/extensions/launch-new-instance@gnome-shell-extensions.gcampax.github.com/extension.js 9]: assignment to undeclared variable _activateOriginal
Thu 2015-01-22 19:23:45.301564 UTC [s=ac50e06708e0463b84ea82d7e2f781cb;i=659;b=a38adcd9c0164ccfaa853ad70505b6e6;m=7898e14;t=50d429c53ec3c;x=9e75ffa87f63b789]
MESSAGE=Gjs-Message: JS WARNING: [/usr/share/gnome-shell/extensions/shutdown-helper@tails.boum.org/extension.js 137]: assignment to undeclared variable extension
Thu 2015-01-22 19:23:45.301564 UTC [s=ac50e06708e0463b84ea82d7e2f781cb;i=659;b=a38adcd9c0164ccfaa853ad70505b6e6;m=7898e14;t=50d429c53ec3c;x=9e75ffa87f63b789]
MESSAGE=(gnome-shell:2819): mutter-WARNING **: STACK_OP_RAISE_ABOVE: window 0x4f00c00016 not in stack
kytv:
- had GNOME crashed already when you opened the root session on tty2?
- may you please retry after removing the apparmor-related parameters on the kernel command-line, just to be sure?
#9 Updated by intrigeri 2015-01-22 20:02:46
- Assignee changed from intrigeri to kytv
- QA Check set to Info Needed
- Affected tool deleted (Greeter)
#10 Updated by intrigeri 2015-01-22 20:06:32
- related to Bug #7323: Wheezy's GNOME crashes randomly after Greeter login added
#11 Updated by intrigeri 2015-01-22 22:11:30
- would be good if you could take note of the exact second when the “oh no!” message appears, to correlate it with other events (most likely, a timeout expiring).
- may I have a log of processes open by uid 1000 (e.g. every second) and another journal from the same boot, with all options used above? best if it’s the same boot as for previous bullet point
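The per-second process log requested above could be gathered along these lines. This is only a minimal sketch, not something used in this ticket: the function names, the log path argument and the default of 60 rounds are hypothetical, and it assumes a procps-style ps that accepts repeated -o options.

```python
import subprocess
import time


def processes_for_uid(ps_output: str, uid: int) -> list[str]:
    """Return the 'pid args' part of every ps line whose first column is uid."""
    matches = []
    for line in ps_output.splitlines():
        fields = line.split(None, 1)  # first column (uid), then the rest
        if len(fields) == 2 and fields[0] == str(uid):
            matches.append(fields[1].strip())
    return matches


def log_processes(uid: int, path: str, interval: float = 1.0, rounds: int = 60) -> None:
    """Append a UTC-timestamped snapshot of the uid's processes to path."""
    with open(path, "a") as log:
        for _ in range(rounds):
            # Empty headers ("uid=" etc.) suppress the ps header line.
            ps = subprocess.run(
                ["ps", "-e", "-o", "uid=", "-o", "pid=", "-o", "args="],
                capture_output=True, text=True, check=True,
            ).stdout
            stamp = time.strftime("%Y-%m-%d %H:%M:%S", time.gmtime())
            for entry in processes_for_uid(ps, uid):
                log.write(f"{stamp} {entry}\n")
            log.flush()
            time.sleep(interval)
```

Logging in UTC keeps the timestamps directly comparable with the journal entries quoted in this ticket.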
#12 Updated by intrigeri 2015-01-22 23:13:18
I’m both sad and happy to have reproduced here:
- with the same kvm command-line
- with -cpu qemu32 (“QEMU Virtual CPU version 2.1.2”)
- with -cpu qemu64 (“QEMU Virtual CPU version 2.1.2”)
- with -machine pc-i440fx-2.0,accel=kvm,usb=off -cpu qemu32 -smp 2,sockets=2,cores=1,threads=1
- with -machine pc-i440fx-2.0,accel=kvm,usb=off -cpu qemu64 -smp 2,sockets=2,cores=1,threads=1
This means that I can probably debug this myself. OTOH, I’d be happy to have the info I’ve asked Kill Your TV to gather :)
However, I could not reproduce this with -machine pc-i440fx-2.0,accel=kvm,usb=off -cpu SandyBridge,+invtsc,+invpcid,+erms,+bmi2,+smep,+avx2,+bmi1,+fsgsbase,+abm,+pdpe1gb,+rdrand,+f16c,+osxsave,+movbe,+pcid,+pdcm,+xtpr,+fma,+tm2,+est,+smx,+vmx,+ds_cpl,+monitor,+dtes64,+pbe,+tm,+ht,+ss,+acpi,+ds,+vme (creds go to libvirt for generating these bits for me).
In all cases, I was running the test suite in parallel in another KVM guest, not tried without yet. On failures, in most cases I’ve briefly seen some of the desktop (e.g. desktop icons) appear with a black background before the “oh no!” screen.
#13 Updated by intrigeri 2015-01-23 11:10:52
Also, it would be useful to figure out where (in what .desktop file?) we can pass --debug to gnome-session. I guess that this flag would help us understand what exactly is going on.
#14 Updated by intrigeri 2015-01-23 12:15:12
intrigeri wrote:
> Also, it would be useful to figure out where (in what .desktop file?) we can pass --debug to gnome-session. I guess that this flag would help us understand what exactly is going on.
Appending --debug to the Exec and TryExec lines in /usr/share/xsessions/gnome-classic.desktop seems to produce the intended effect. Then, I see that the command run by gnome-shell-classic.desktop (that is, /usr/bin/gnome-shell) is exiting with code 1.
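The edit described in this comment can be sketched as a small text transformation. This is an illustration only: the function name is hypothetical, it mirrors the comment by touching both the Exec and TryExec keys, and actually writing the result back to a file under /usr/share/xsessions would need root and is left out.

```python
def add_debug_flag(desktop_text: str, flag: str = "--debug") -> str:
    """Append flag to the Exec and TryExec lines of a .desktop file's text.

    Lines that already carry the flag are left alone, so the function
    can be applied more than once without duplicating it.
    """
    out = []
    for line in desktop_text.splitlines():
        key, sep, value = line.partition("=")
        if sep and key.strip() in ("Exec", "TryExec") and flag not in value.split():
            line = f"{key}={value.rstrip()} {flag}"
        out.append(line)
    return "\n".join(out) + "\n"
```

Working on the text rather than on the file in place makes it easy to inspect the result before installing it.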
#15 Updated by intrigeri 2015-01-23 12:52:05
An error message I see just after the one about GNOME Shell dying is LLVM ERROR: Do not know how to split the result of this operator. I don’t see it with the slightly different CPU configuration where I cannot reproduce the bug. This looks like https://freedesktop.org/patch/34445/, http://llvm.org/bugs/show_bug.cgi?id=15929 and https://bugs.launchpad.net/ubuntu/+source/llvm-toolchain-3.5/+bug/1360241.
#16 Updated by intrigeri 2015-01-23 13:38:36
Works:
-machine pc-i440fx-2.0,accel=kvm,usb=off -cpu SandyBridge
-machine pc-i440fx-2.0,accel=kvm,usb=off -cpu pentium3
-machine pc-i440fx-2.0,accel=kvm,usb=off -cpu pentium2
-machine pc-i440fx-2.0,accel=kvm,usb=off -cpu pentium
Buggy:
-machine pc-i440fx-2.0,accel=kvm,usb=off -cpu qemu64,+invtsc,+invpcid,+erms,+bmi2,+smep,+avx2,+bmi1,+fsgsbase,+abm,+pdpe1gb,+rdrand,+f16c,+osxsave,+movbe,+pcid,+pdcm,+xtpr,+fma,+tm2,+est,+smx,+vmx,+ds_cpl,+monitor,+dtes64,+pbe,+tm,+ht,+ss,+acpi,+ds,+vme
-machine pc-i440fx-2.0,accel=kvm,usb=off -cpu qemu64,+invtsc,+invpcid,+erms,+bmi2,+smep,+avx2,+bmi1,+fsgsbase,+abm,+pdpe1gb,+rdrand,+f16c,+osxsave,+movbe,+pcid,+pdcm,+xtpr,+fma,+tm2,+est,+smx,+vmx,+ds_cpl,+monitor,+dtes64,+pbe,+tm,+ht,+ss,+acpi,+ds,+vme,+sse2,+sse
The difference between -cpu pentium2 and -cpu qemu32 is:
- pentium2 adds:
- mca, mtrr, pse36 (that pentium hasn’t, so probably irrelevant)
- vme
- qemu32 adds:
- EXT_POPCNT
- EXT_SSE3
- SSE
- SSE2
And then:
- qemu32,+vme doesn’t work
- qemu32,+vme,+mca,+mtrr,+pse36 doesn’t work either
- qemu32,-sse,-sse2,-sse3 works fine
=> it looks like we’re hitting a bug in how llvmpipe behaves with some very specific combination of CPU features. On the one hand, it’s a shame that we fail with the default QEMU virtual CPU. On the other hand, it seems that no real hardware is affected, and when running in QEMU, one (starting with our own test suite) should specify a CPU that’s closer to common bare metal ones and/or to the host CPU.
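The reasoning above (which flags were added without helping, and which removals made the desktop load) can be written down as a small bookkeeping sketch. The function names and the results dictionary layout are hypothetical; the sample data in the usage note is taken from the observations in this comment.

```python
def parse_cpu_spec(spec: str):
    """Split a QEMU -cpu value such as 'qemu32,-sse,-sse2' into
    (model, added_flags, removed_flags)."""
    model, *flags = spec.split(",")
    added = {f[1:] for f in flags if f.startswith("+")}
    removed = {f[1:] for f in flags if f.startswith("-")}
    return model, added, removed


def summarize(results: dict[str, bool], model: str):
    """results maps a -cpu value to True when the desktop loaded.

    Returns (fixes, no_help): the sets of removed flags that made the
    given model work, and all added flags that did not avoid the crash.
    """
    fixes = []
    no_help = set()
    for spec, works in results.items():
        spec_model, added, removed = parse_cpu_spec(spec)
        if spec_model != model:
            continue
        if works and removed:
            fixes.append(removed)
        elif not works:
            no_help |= added
    return fixes, no_help
```

Fed with the four qemu32 observations above, this reports that removing sse, sse2 and sse3 avoids the crash, while adding vme, mca, mtrr or pse36 does not.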
#17 Updated by Tails 2015-01-23 14:12:40
- Status changed from Confirmed to In Progress
Applied in changeset commit:2983b2ec0702642af5f2e538727e73de5abcd004.
#18 Updated by intrigeri 2015-01-23 14:23:33
- Priority changed from Normal to Elevated
- % Done changed from 0 to 20
- QA Check changed from Info Needed to Ready for QA
Aforementioned commit should fix this problem in our automated test suite. And commit:d0cdbce1a289919f4056ca32a60425bbd16c0ff5 documents it as a known issue, along with a workaround.
Kill Your TV, may you please confirm that this fixes the problem for you?
#19 Updated by kytv 2015-01-23 19:46:11
- Assignee changed from kytv to intrigeri
- QA Check changed from Ready for QA to Dev Needed
Unfortunately I still have this problem when running the test suite in a nested VM.
I reset the Level 1 guest (the one that I run the test suite in) to use the host CPU in its configs. The libvirt generated qemu command line:
qemu-system-x86_64 -enable-kvm -name TestSuite -S -machine pc-i440fx-2.1,accel=kvm,usb=off -cpu Opteron_G4,+invtsc,+perfctr_nb,+perfctr_core,+topoext,+nodeid_msr,+lwp,+wdt,+skinit,+ibs,+osvw,+cr8legacy,+extapic,+cmp_legacy,+fxsr_opt,+mmxext,+osxsave,+monitor,+ht,+vme -m 10240 -realtime mlock=off -smp 4,sockets=4,cores=1,threads=1 -uuid d7fb4a44-3cf7-4c4b-baa4-7c82172aa77b -no-user-config -nodefaults -chardev socket,id=charmonitor,path=/var/lib/libvirt/qemu/TestSuite.monitor,server,nowait -mon chardev=charmonitor,id=monitor,mode=control -rtc base=utc,driftfix=slew -global kvm-pit.lost_tick_policy=discard -no-hpet -no-shutdown -boot strict=on -device ich9-usb-ehci1,id=usb,bus=pci.0,addr=0x4.0x7 -device ich9-usb-uhci1,masterbus=usb.0,firstport=0,bus=pci.0,multifunction=on,addr=0x4 -device ich9-usb-uhci2,masterbus=usb.0,firstport=2,bus=pci.0,addr=0x4.0x1 -device ich9-usb-uhci3,masterbus=usb.0,firstport=4,bus=pci.0,addr=0x4.0x2 -drive file=/VMs/libvirt/images/testsuite.qcow2,if=none,id=drive-virtio-disk0,format=qcow2,cache=unsafe -device virtio-blk-pci,scsi=off,bus=pci.0,addr=0x6,drive=drive-virtio-disk0,id=virtio-disk0,bootindex=1 -drive if=none,id=drive-ide0-0-1,readonly=on,format=raw -device ide-cd,bus=ide.0,unit=1,drive=drive-ide0-0-1,id=ide0-0-1 -netdev tap,fd=24,id=hostnet0,vhost=on,vhostfd=25 -device virtio-net-pci,netdev=hostnet0,id=net0,mac=52:54:00:5d:da:73,bus=pci.0,addr=0x3 -chardev pty,id=charserial0 -device isa-serial,chardev=charserial0,id=serial0 -spice port=5900,addr=127.0.0.1,disable-ticketing,seamless-migration=on -device qxl-vga,id=video0,ram_size=67108864,vram_size=67108864,bus=pci.0,addr=0x2 -device virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x5 -msg timestamp=on
Level 2 (TailsToaster) libvirt generated command line looks like
qemu-system-x86_64 -name TailsToaster -S -machine pc-0.15,accel=kvm,usb=off -cpu qemu64,+fma4,+xop,+3dnowprefetch,+misalignsse,+sse4a,+abm,+lahf_lm,+pdpe1gb,+hypervisor,+avx,+osxsave,+xsave,+aes,+popcnt,+x2apic,+sse4.2,+sse4.1,+ssse3,+pclmuldq -m 1280 -realtime mlock=off -smp 1,sockets=1,cores=1,threads=1 -uuid f6c0e2d6-4260-43ef-b66a-9956124e2a23 -no-user-config -nodefaults -chardev socket,id=charmonitor,path=/var/lib/libvirt/qemu/TailsToaster.monitor,server,nowait -mon chardev=charmonitor,id=monitor,mode=control -rtc base=utc -no-shutdown -boot strict=on -device ich9-usb-ehci1,id=usb,bus=pci.0,addr=0x5.0x7 -device ich9-usb-uhci1,masterbus=usb.0,firstport=0,bus=pci.0,multifunction=on,addr=0x5 -drive file=/home/kytv/tails/tails-i386-feature_jessie-1.3-20150120.iso,if=none,id=drive-ide0-1-0,readonly=on,format=raw -device ide-cd,bus=ide.1,unit=0,drive=drive-ide0-1-0,id=ide0-1-0,bootindex=1 -netdev tap,fd=24,id=hostnet0,vhost=on,vhostfd=25 -device virtio-net-pci,netdev=hostnet0,id=net0,mac=52:54:00:ac:dd:ee,bus=pci.0,addr=0x3 -chardev socket,id=charserial0,host=127.0.0.1,port=1337,server,nowait -device isa-serial,chardev=charserial0,id=serial0 -device usb-tablet,id=input0 -vnc 127.0.0.1:1 -device qxl-vga,id=video0,ram_size=67108864,vram_size=9437184,bus=pci.0,addr=0x2 -device intel-hda,id=sound0,bus=pci.0,addr=0x4 -device hda-duplex,id=sound0-codec0,bus=sound0.0,cad=0 -device virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x6 -msg timestamp=on
/proc/cpuinfo on the host:
processor : 0
vendor_id : AuthenticAMD
cpu family : 21
model : 1
model name : AMD FX(tm)-6100 Six-Core Processor
stepping : 2
microcode : 0x600063d
cpu MHz : 3300.000
cache size : 2048 KB
physical id : 0
siblings : 6
core id : 0
cpu cores : 3
apicid : 16
initial apicid : 0
fpu : yes
fpu_exception : yes
cpuid level : 13
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt pdpe1gb rdtscp lm constant_tsc rep_good nopl nonstop_tsc extd_apicid aperfmperf pni pclmulqdq monitor ssse3 cx16 sse4_1 sse4_2 popcnt aes xsave avx lahf_lm cmp_legacy svm extapic cr8_legacy abm sse4a misalignsse 3dnowprefetch osvw ibs xop skinit wdt lwp fma4 nodeid_msr topoext perfctr_core perfctr_nb arat cpb hw_pstate npt lbrv svm_lock nrip_save tsc_scale vmcb_clean flushbyasid decodeassists pausefilter pfthreshold
bogomips : 6630.28
TLB size : 1536 4K pages
clflush size : 64
cache_alignment : 64
address sizes : 48 bits physical, 48 bits virtual
power management: ts ttp tm 100mhzsteps hwpstate cpb
I can work on getting the debug info requested earlier if it’d still be of use.
#20 Updated by intrigeri 2015-01-23 21:42:48
> Unfortunately I still have this problem when running the test suite in a nested VM.
Too bad :(
I’m curious if you can still replicate this bug with the -machine and -cpu combinations that work for me => can you please try giving e.g. a pentium, pentium2 or pentium3 CPU to the level 2 VM (hopefully that can work on an AMD host), or any combination that works for me?
> I can work on getting the debug info requested earlier if it’d still be of use.
It would be good to know (using gnome-session --debug, as explained above) what part of the GNOME session fails to load, and especially whether you see the same LLVM error as me. If that’s the case, then it would be good to try and reproduce this bug using:
- the same virtualization setup (same level 1 guest, and same domain configuration for the level 2 guest)
- regular Debian Jessie
- GNOME Shell in Classic mode
… and if it works fine, then retry with the same set of GNOME Shell extensions that we enable in feature/jessie.
#21 Updated by intrigeri 2015-01-23 22:16:05
- Assignee changed from intrigeri to kytv
- QA Check changed from Dev Needed to Info Needed
#22 Updated by kytv 2015-01-24 00:09:23
- QA Check changed from Info Needed to Dev Needed
Some results on the level 2 VM:
- -cpu SandyBridge yields a crash with the same LLVM message referenced above.
- -cpu pentium3 loads the desktop normally with no crashes
#23 Updated by kytv 2015-01-24 00:09:52
- QA Check changed from Dev Needed to Info Needed
#24 Updated by intrigeri 2015-01-24 08:29:18
It seems that Ubuntu had this llvmpipe bug, then reverted to building mesa against llvm-3.4 for a while; they now have a newer mesa and build it against llvm 3.5, just like Debian. So perhaps only the newer version of mesa works fine with llvm 3.5. So we should retry with:
- an ISO that has that set of packages rebuilt with llvm-3.4;
- an ISO that has the sid version of all binary packages we install that are built from the mesa source package.
And then we’ll have enough info to tell the Debian or upstream mesa folks that something’s wrong.
#25 Updated by intrigeri 2015-01-24 08:47:31
kytv wrote:
> Some results on the level 2 VM:
>
> * -cpu SandyBridge yields a crash with the same LLVM message referenced above.
> * -cpu pentium3 loads the desktop normally with no crashes
Can you please bisect that a bit and find a 64-bit Intel -cpu that works for you? (See kvm -cpu help for the full list.) Using it for our automated test suite would be a good enough stopgap measure until we have the llvmpipe bug fixed.
#26 Updated by kytv 2015-01-25 18:42:49
intrigeri wrote:
> Can you please bisect that a bit and find a 64-bit Intel -cpu that works for you?
Certainly! (It was already on my personal TODO list to satisfy my curiosity)
#27 Updated by kytv 2015-01-25 21:00:51
The command run for each of these: kvm -cpu $CPUTYPE -m 4096 -cdrom tails-i386-feature_jessie-1.3-20150120.iso
Working means “I login at the Greeter and get to the desktop.”
Not working means “I login at the Greeter and see the dreaded ‘Oh no!’ screen.”
None of these are working in the level 2 VM:
- SandyBridge
- core2duo
- coreduo
- Broadwell
- Haswell
- Westmere
- Nehalem
- Penryn
- Conroe
- n270
These are working:
- host
- kvm64
- Opteron_G1
- Opteron_G2
- Opteron_G3
- Opteron_G4
- Opteron_G5
- phenom
Interestingly, host as set in changeset 2983b2ec0702642af5f2e538727e73de5abcd004 did not work for me in the test suite (as I noted above at Bug #8778#19).
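A sweep over CPU models like the one reported in this comment can be prepared with a few lines of scripting. The sketch below only builds and prints the command lines, following the kvm invocation quoted at the top of the comment; the function names are hypothetical, and whether the desktop loads still has to be checked by hand after logging in at the Greeter.

```python
import shlex


def kvm_command(cpu: str, iso: str, memory_mb: int = 4096) -> list[str]:
    """Build the argument list for one test boot with the given -cpu model."""
    return ["kvm", "-cpu", cpu, "-m", str(memory_mb), "-cdrom", iso]


def print_test_plan(cpus: list[str], iso: str) -> None:
    """Print one shell-quoted command line per CPU model to try."""
    for cpu in cpus:
        print(shlex.join(kvm_command(cpu, iso)))
```

For example, print_test_plan(["SandyBridge", "kvm64"], "tails-i386-feature_jessie-1.3-20150120.iso") prints the two commands to run one after the other.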
#28 Updated by kytv 2015-01-26 02:24:11
- Assignee deleted (kytv)
- QA Check deleted (Info Needed)
FWIW, I tried SandyBridge and core2duo in a level 1 VM and they also crashed, so this doesn’t appear to be a nested VM problem.
#29 Updated by intrigeri 2015-01-26 10:45:22
- Assignee set to kytv
- QA Check set to Info Needed
kytv wrote:
> FWIW, I tried SandyBridge and core2duo in a level 1 VM and they also crashed, so this doesn’t appear to be a nested VM problem.
Would be good to know if these crashes are the same llvmpipe bug as I’ve seen (see Bug #8778#note-20).
#30 Updated by kytv 2015-01-26 15:37:10
intrigeri wrote:
> kytv wrote:
> > FWIW, I tried SandyBridge and core2duo in a level 1 VM and they also crashed, so this doesn’t appear to be a nested VM problem.
>
> Would be good to know if these crashes are the same llvmpipe bug as I’ve seen (see Bug #8778#note-20).
Seems to be the same. (This was with -cpu SandyBridge in a Level 1 VM on an AMD host).
Mon 2015-01-26 15:31:32.305172 UTC [s=8436e2e3d3f14492bea1592751a01c6a;i=687;b=09aad8b5a40a44bf9dfbc6dcf436d7b1;m=905294c;t=50d8fd538c514;x=4c87b186a0cdf245]
PRIORITY=6
_BOOT_ID=09aad8b5a40a44bf9dfbc6dcf436d7b1
_MACHINE_ID=417a22869e1544f2ba02cf3fad947857
_TRANSPORT=stdout
_HOSTNAME=amnesia
_CAP_EFFECTIVE=0
_EXE=/usr/bin/gnome-session
SYSLOG_IDENTIFIER=gnome-session
_GID=1000
_AUDIT_SESSION=2
_AUDIT_LOGINUID=1000
_SYSTEMD_OWNER_UID=1000
_SYSTEMD_SLICE=user-1000.slice
_UID=1000
_PID=2749
_SYSTEMD_CGROUP=/user.slice/user-1000.slice/session-2.scope
_SYSTEMD_SESSION=2
_SYSTEMD_UNIT=session-2.scope
_COMM=x-session-manag
_CMDLINE=x-session-manager
MESSAGE=LLVM ERROR: Do not know how to split the result of this operator!
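To check a saved journal for this message without reading it by eye, the verbose output can be split into entries and searched. This is a rough sketch that assumes the layout shown in the excerpts in this ticket (a timestamp header containing "[s=…", followed by FIELD=value lines); the function names are hypothetical, and multi-line field values are not handled. The JSON output mode of journalctl would be a more robust input if available.

```python
import re

# Header line of a "journalctl -o verbose" entry, e.g.
# "Mon 2015-01-26 15:31:32.305172 UTC [s=8436...;i=687;...]"
HEADER = re.compile(r"^\w{3} \d{4}-\d{2}-\d{2} [\d:.]+ \w+ \[s=[0-9a-f]+;")


def split_entries(journal_text: str) -> list[dict]:
    """Turn verbose journal text into a list of {FIELD: value} dicts."""
    entries = []
    current = None
    for raw in journal_text.splitlines():
        line = raw.strip()
        if HEADER.match(line):
            current = {"__HEADER": line}
            entries.append(current)
        elif current is not None and "=" in line:
            key, _, value = line.partition("=")
            current[key] = value
    return entries


def find_messages(journal_text: str, needle: str) -> list[dict]:
    """Return the entries whose MESSAGE field contains needle."""
    return [e for e in split_entries(journal_text) if needle in e.get("MESSAGE", "")]
```

Calling find_messages(text, "LLVM ERROR") on a journal from a failing boot should return the entry shown above, including its _PID and _CMDLINE fields.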
#31 Updated by intrigeri 2015-01-26 19:42:43
>> Would be good to know if these crashes are the same llvmpipe bug as I’ve seen (see Bug #8778#note-20).
> Seems to be the same.
Thanks! Next best steps are then the ones described in Bug #8778#note-24, I think.
Unless someone has a better idea? (I’d love it :)
#32 Updated by Tails 2015-02-25 13:33:27
Applied in changeset commit:8544f14485289653b8dde9a29dc496f79e768866.
#33 Updated by BitingBird 2015-02-25 20:54:45
- QA Check deleted (Info Needed)
#34 Updated by intrigeri 2015-02-26 08:59:15
- Assignee deleted (kytv)
#35 Updated by intrigeri 2015-03-08 16:21:17
Also see https://bugs.debian.org/770130 and merged bug reports.
#36 Updated by intrigeri 2015-03-08 17:50:40
- Category set to Hardware support
- Assignee set to kytv
- % Done changed from 20 to 30
- QA Check set to Ready for QA
I’ve rebuilt the mesa source package with the patch from https://freedesktop.org/patch/34445/, and uploaded the result to our feature-jessie package. Going to build an ISO and test locally. Kill Your TV, may you please do the same and report back whether it fixes the bug for you?
#37 Updated by intrigeri 2015-03-08 19:56:26
- % Done changed from 30 to 40
intrigeri wrote:
> Going to build an ISO and test locally.
It fixes the bug for me with qemu-system-x86_64 -enable-kvm -machine pc-i440fx-2.0,accel=kvm,usb=off -cdrom foo.iso -m 2048 -cpu qemu32
(and I could reproduce the bug again with the same command-line, and an older ISO built from feature/jessie).
#38 Updated by kytv 2015-03-08 22:41:39
- % Done changed from 40 to 50
intrigeri wrote:
> I’ve rebuilt the mesa source package with the patch from https://freedesktop.org/patch/34445/
GREAT find!
So far, so good with the default kvm command line that I included in my original report. I’ll try with my nested VM set-up later; maybe I’ll finally be able to do the test suite stuff for feature/jessie. ;)
#39 Updated by kytv 2015-03-09 18:49:51
Initial findings:
These did not work before. They still do not work.
- SandyBridge
- Broadwell
- Haswell
- Westmere
- Nehalem
- Penryn
These worked before and still work:
- host
- kvm64
- Opteron_G1
- Opteron_G2
- Opteron_G3
- Opteron_G4
- Opteron_G5
- phenom
These did not work before but they work now.
- core2duo
- coreduo
- Conroe
- n270
- qemu32
Logs will be forthcoming.
#40 Updated by intrigeri 2015-03-09 20:51:53
> Initial findings:
Thanks! May you please send this information to the corresponding Debian bug report, by replying to the email I’ve Cc’ed you? Otherwise, I can do it, just tell me.
#41 Updated by kytv 2015-03-09 21:17:17
intrigeri wrote:
> > Initial findings:
>
> Thanks! May you please send this information to the corresponding Debian bug report, by replying to the email I’ve Cc’ed you? Otherwise, I can do it, just tell me.
I did that ~5 minutes after posting it here :)
#42 Updated by kytv 2015-03-09 22:30:42
I’m assuming that the error logged will be the same for all of the configs that fail. With SandyBridge:
Mon 2015-03-09 22:02:18.006526 UTC [s=eb33466202884b30aef3339fee8ed7a2;i=748;b=298ef4f4f5c3475ebacdd57abc1aa2fa;m=da900d1;t=510e2300787fe;x=497f9a9f60f18c15]
PRIORITY=6
_BOOT_ID=298ef4f4f5c3475ebacdd57abc1aa2fa
_MACHINE_ID=c559c5e56b134ab5a5acf6c74eba069f
_TRANSPORT=stdout
_HOSTNAME=amnesia
_CAP_EFFECTIVE=0
_EXE=/usr/bin/gnome-session
SYSLOG_IDENTIFIER=gnome-session
_GID=1000
_AUDIT_SESSION=2
_AUDIT_LOGINUID=1000
_SYSTEMD_OWNER_UID=1000
_SYSTEMD_SLICE=user-1000.slice
_UID=1000
_PID=2728
_SYSTEMD_CGROUP=/user.slice/user-1000.slice/session-2.scope
_SYSTEMD_SESSION=2
_SYSTEMD_UNIT=session-2.scope
_COMM=x-session-manag
_CMDLINE=x-session-manager
MESSAGE=LLVM ERROR: Do not know how to split the result of this operator!
As mentioned on the Debian bug, my CPU doesn’t have support for all of the Sandy Bridge (and probably other CPU) features, but the failure should be more graceful, falling back to something that’s almost certainly going to work.
#43 Updated by intrigeri 2015-03-10 02:21:16
> As mentioned on the Debian bug, my CPU doesn’t have support for all of the Sandy
> Bridge (and probably other CPU) features, but the failure should be more graceful,
> falling back to something that’s almost certainly going to work.
I see what you mean. Arguably that’s a QEMU design problem, that may be hard to change now without breaking tons of stuff. Anyway: as far as the mesa bug is concerned, we should probably test only vcpus that can actually be fully emulated.
#44 Updated by kytv 2015-03-10 03:22:56
I meant that gnome-shell/mesa should just work. :) Failing to find a non-essential CPU flag (if that’s what’s happening) shouldn’t be able to halt the displaying of the desktop. I see this as gnome-shell giving up (ungracefully) when it doesn’t have to as opposed to it being a qemu problem, especially since this same crash was seen on real, non-broken hardware.
(So many of the changes in recent GNOME versions bother me but that’s a discussion for another time—if ever).
I’m still unable to run the test suite in Jessie. To be investigated…
#45 Updated by kytv 2015-03-10 03:59:01
- Assignee changed from kytv to intrigeri
- % Done changed from 50 to 80
- QA Check changed from Ready for QA to Pass
kytv wrote:
>
> I’m still unable to run the test suite in Jessie. To be investigated…
It looks like I finally can. When I first ran into this problem, first seen in the test suite, I set my lvl1 vm to “Host CPU”. That didn’t fix it but I kept that setting.
After removing it (in virt-manager, selecting Hypervisor Default) my lvl1 libvirt xml file changed thusly:
- <cpu mode='custom' match='exact'>
- <model fallback='allow'>Opteron_G4</model>
- <vendor>AMD</vendor>
- <feature policy='require' name='perfctr_core'/>
- <feature policy='require' name='monitor'/>
- <feature policy='require' name='skinit'/>
- <feature policy='require' name='ibs'/>
- <feature policy='require' name='mmxext'/>
- <feature policy='require' name='osxsave'/>
- <feature policy='require' name='vme'/>
- <feature policy='require' name='topoext'/>
- <feature policy='require' name='fxsr_opt'/>
- <feature policy='require' name='cr8legacy'/>
- <feature policy='require' name='ht'/>
- <feature policy='require' name='wdt'/>
- <feature policy='require' name='extapic'/>
- <feature policy='require' name='osvw'/>
- <feature policy='require' name='nodeid_msr'/>
- <feature policy='require' name='perfctr_nb'/>
- <feature policy='require' name='cmp_legacy'/>
- <feature policy='require' name='lwp'/>
- <feature policy='require' name='invtsc'/>
- </cpu>
Now I can see the desktop in the test suite within my nested VM set-up.
After all that I can say that the patched Mesa pkgs fixed my problem. intrigeri
#46 Updated by intrigeri 2015-03-10 09:54:24
> I meant that gnome-shell/mesa should just work. :) Failing to find a non-essential CPU flag (if that’s what’s happening) shouldn’t be able halt the displaying of the desktop.
Indeed, you’re right (note that there’s little that GNOME Shell can do about it if the underlying hardware drivers fail).
#47 Updated by intrigeri 2015-03-10 09:54:42
>> I’m still unable to run the test suite in Jessie. To be investigated…
> It looks like I finally can.
Woohoo!
#48 Updated by intrigeri 2015-05-16 09:35:18
- Status changed from In Progress to Resolved
- Assignee deleted (intrigeri)
- % Done changed from 80 to 100
Please open another ticket if the problem comes back.
#49 Updated by goupille 2016-02-09 15:11:48
- related to Bug #11096: "Oh no!" / Xorg crash after logging in at the Greeter with Intel 855GM graphics added
#50 Updated by intrigeri 2016-04-29 13:38:04
- related to deleted (Bug #11096: "Oh no!" / Xorg crash after logging in at the Greeter with Intel 855GM graphics)