Feature #14789

Upgrade to Linux 4.13

Added by intrigeri 2017-10-05 08:00:23. Updated 2017-11-15 11:35:26.

Status:
Resolved
Priority:
Elevated
Assignee:
Category:
Hardware support
Target version:
Start date:
2017-10-05
Due date:
% Done:

100%

Feature Branch:
feature/14789-linux-4.13
Type of work:
Research
Blueprint:

Starter:
Affected tool:
Deliverable for:

Description

Linux 4.13 made it into sid a few days ago, so 4.12 won’t get any more security updates => we should consider this upgrade for Tails 3.3 despite its theoretical bugfix-only status. If no relevant security issue is fixed in 4.12..4.13 when it’s time to make this decision, then we can postpone this to Tails 3.4 (and the same reasoning applies again since 3.4 is bugfix-only as well).

If we decide to go this way, we’ll need to use our freeze exception mechanism unless we bump the APT snapshots anyway for Feature #14714.


Related issues

Blocks Tails - Feature #13244: Core work 2017Q4: Foundations Team Resolved 2017-06-29
Blocked by Tails - Feature #14714: Consider upgrading to Stretch 9.2 in Tails 3.3 Resolved 2017-09-24

History

#1 Updated by intrigeri 2017-10-05 08:01:20

#2 Updated by intrigeri 2017-10-05 08:04:06

  • Status changed from Confirmed to In Progress
  • % Done changed from 0 to 10
  • Feature Branch set to feature/14789-linux-4.13

Note that I’ve based the topic branch on devel for now, that’s the simplest way to get CI results as it uses current snapshots of the Debian archive. But if we have to merge this into stable for 3.3, then I’ll need to rebase my work on top of stable first.

#3 Updated by intrigeri 2017-10-05 09:33:29

aufs module does not build due to https://bugs.debian.org/875620. Our build system does not complain despite set -e (because dkms skips it rather than failing), which causes extra useless load on our isotesters: it’ll take a while to verify N times that an ISO does not boot at all. I think we should check if aufs.ko can be found at the end of 50-dkms and abort the build if not. I’ll do that shortly.
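
A minimal sketch of the check proposed above, not the actual change made to 50-dkms. It assumes the DKMS-built module lands somewhere under /lib/modules as an uncompressed aufs.ko; the function names `module_was_built` and `require_module` are hypothetical.

```shell
# Hypothetical helper: succeed only if a file named "<module>.ko" exists
# somewhere under the given modules tree (default: /lib/modules).
module_was_built() {
    module="$1"
    modules_dir="${2:-/lib/modules}"
    [ -n "$(find "$modules_dir" -type f -name "${module}.ko" 2>/dev/null | head -n 1)" ]
}

# Hypothetical wrapper: print an error and return non-zero if the module
# is missing, so that a hook running under "set -e" aborts the build.
require_module() {
    if ! module_was_built "$1" "${2:-/lib/modules}"; then
        echo "E: $1.ko not found under ${2:-/lib/modules}; aborting the build" >&2
        return 1
    fi
}
```

Calling `require_module aufs` as the last command of a hook that runs under set -e would turn a silently skipped DKMS build into a build failure, instead of producing an ISO that does not boot.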

#4 Updated by intrigeri 2017-10-08 07:35:20

  • Priority changed from Normal to High

(This will affect the build of the devel branch.)

#5 Updated by intrigeri 2017-10-08 07:39:15

Full test suite passes except “Spoofing MAC addresses ǂ MAC address spoofing is disabled” and “Emergency shutdown ǂ Tails shuts down on DVD boot medium removal”, both failing because systemctl is-system-running fails; Bug #14772 will help understand what’s going on so I’ll start another run with that other branch merged in so I can investigate (if there are similar failures again).

#6 Updated by intrigeri 2017-10-08 12:37:58

“Emergency shutdown ǂ Tails shuts down on DVD boot medium removal”, “Spoofing MAC addresses ǂ MAC address spoofing is disabled” and “Localization ǂ The Unsafe Browser can be used in all languages supported in Tails” failed again due to systemctl is-system-running returning 1 (“starting”). In three out of these 4 cases, the debug output shows no failed service nor running job when we list them immediately after is-system-running returned non-zero; and in the last case, there were two jobs “2440 user-0.slice stop waiting” and “2439 user@0.service stop running”. So I think that’s a race condition. We could wrap is-system-running with try_for. Now, I’ve seen these problems happen a lot last time I triaged “false positives”, so it doesn’t seem to be a regression brought by this branch. I’ll re-run these scenarios locally and if they pass I’ll say it’s fine.

The other failure is in “Encryption and verification using GnuPG ǂ Encryption/signing and decryption/verification using OpenPGP Applet”:

01:20:41.549172926: calling as amnesia: xdotool key Super
01:20:41.697639351: call returned: [0, "", ""]
01:20:42.334228966: [log]  TYPE "gedit"
01:20:44.830131784: [log] ( Ctrl )  TYPE "#ENTER."
01:20:45.553284737: [log] CLICK on L(507,50)@S(0)[0,0 1024x768]
01:20:51.375694821: [log]  TYPE "ATTACK AT DAWN"
    When I type a message into gedit                                            # features/step_definitions/encryption.rb:34
01:20:52.240875212: [log] RIGHT CLICK on L(628,385)@S(0)[0,0 1024x768]
01:20:53.953284883: [log] CLICK on L(690,570)@S(0)[0,0 1024x768]
01:20:54.753262199: [log] RIGHT CLICK on L(628,385)@S(0)[0,0 1024x768]
01:21:07.383857920: [log] RIGHT CLICK on L(628,385)@S(0)[0,0 1024x768]
01:21:20.019018845: [log] RIGHT CLICK on L(628,385)@S(0)[0,0 1024x768]
01:21:43.153538114: [log] RIGHT CLICK on L(628,385)@S(0)[0,0 1024x768]
01:21:45.479901915: [log] RIGHT CLICK on L(628,385)@S(0)[0,0 1024x768]
    And I both encrypt and sign the message using my OpenPGP key                # features/step_definitions/encryption.rb:110
      try_for() timeout expired
      Last ignored exception was: FindFailed: can not find GeditCopy.png in S(0)[0,0 1024x768] (Timeout::Error)
      ./features/support/helpers/misc_helpers.rb:90:in `rescue in try_for'
      ./features/support/helpers/misc_helpers.rb:36:in `try_for'
      ./features/step_definitions/common_steps.rb:12:in `context_menu_helper'
      ./features/step_definitions/encryption.rb:60:in `gedit_copy_all_text'
      ./features/step_definitions/encryption.rb:69:in `/^I both encrypt and sign the message using my OpenPGP key$/'
      features/encryption.feature:29:in `And I both encrypt and sign the message using my OpenPGP key'
    Then I can decrypt and verify the encrypted message                         # features/step_definitions/encryption.rb:120

This seems to indicate a bug in context_menu_helper (that moves the mouse to a place where right-click won’t do anything useful) more than in the topic branch.
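
On the is-system-running race discussed earlier in this comment: the idea of wrapping the check in a retry can be sketched in shell as follows. This is only an illustration of the approach, not the test suite's Ruby try_for helper; `retry_for` and its arguments are hypothetical.

```shell
# Hypothetical retry helper: run a command repeatedly until it succeeds
# or the timeout expires.
# Usage: retry_for TIMEOUT_SECONDS DELAY_SECONDS COMMAND [ARGS...]
retry_for() {
    retry_timeout="$1"
    retry_delay="$2"
    shift 2
    retry_deadline=$(( $(date +%s) + retry_timeout ))
    while :; do
        # Success: stop retrying immediately.
        if "$@"; then
            return 0
        fi
        # Failure after the deadline: give up.
        if [ "$(date +%s)" -ge "$retry_deadline" ]; then
            return 1
        fi
        sleep "$retry_delay"
    done
}
```

For example, `retry_for 60 1 systemctl --quiet is-system-running` would tolerate a system that is still in the “starting” state for up to about a minute, and still fail if it never reaches “running”.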

#7 Updated by intrigeri 2017-10-08 13:28:42

  • Assignee changed from intrigeri to anonym
  • % Done changed from 20 to 50
  • QA Check set to Ready for QA

Seen all these pass locally on 1st try, except I can’t manage to see “Spoofing MAC addresses ǂ MAC address spoofing is disabled” succeed here: the hotplugged 2nd NIC is nowhere to be seen. I’ll blame my version of libvirt. Anyway, it passed on Jenkins: https://jenkins.tails.boum.org/view/Tails_ISO/job/test_Tails_ISO_feature-14789-linux-4.13/1/cucumberTestReport/spoofing-mac-addresses/

#8 Updated by intrigeri 2017-10-09 15:18:46

  • related to Feature #14714: Consider upgrading to Stretch 9.2 in Tails 3.3 added

#9 Updated by anonym 2017-10-09 16:23:22

  • % Done changed from 50 to 70
  • QA Check changed from Ready for QA to Info Needed
  • Feature Branch deleted (feature/14789-linux-4.13)

I’ve merged the feature branch into devel so it won’t FTBFS. I leave this ticket open until we know if we want to have this in Tails 3.3 (i.e. the stable branch ATM). In that case we’ll just cherry-pick the relevant commits (the branch won’t do since it is based on devel).

#10 Updated by anonym 2017-10-16 14:06:46

I checked what we are affected by currently like this:

# For each CVE mentioned in the changelog since 4.12.12-2, print its ID
# and its summary as returned by the cve.circl.lu API.
apt-get changelog linux-image-4.13.0-1-amd64 \
  | dpkg-parsechangelog -l - --since 4.12.12-2 \
  | grep --extended-regexp -o 'CVE-[0-9]+-[0-9]+' \
  | while read -r cve; do
      echo "${cve}"
      curl --silent "http://cve.circl.lu/api/cve/${cve}" | \
      ruby -ryaml -rfacets -e \
          'h = YAML.load(STDIN.read);
           puts h ? h["summary"].word_wrap(72) : "RESERVED"'
      echo
    done

and got:

CVE-2017-0786
A elevation of privilege vulnerability in the Broadcom wi-fi driver.
Product: Android. Versions: Android kernel. Android ID: A-37351060.
References: B-V2017060101.

CVE-2017-1000255
RESERVED

CVE-2017-12192
A vulnerability was found in the Key Management sub component of the
Linux kernel, where when trying to issue a KEYTCL_READ on negative key
would lead to a NULL pointer dereference. A local attacker could use
this flaw to crash the kernel.

CVE-2017-5123
RESERVED

CVE-2017-15265
RESERVED

CVE-2017-12188
arch/x86/kvm/mmu.c in the Linux kernel through 4.13.5, when nested
virtualisation is used, does not properly traverse guest pagetable
entries to resolve a guest virtual address, which allows L1 guest OS
users to execute arbitrary code on the host OS or cause a denial of
service (incorrect index during page walking, and host OS crash), aka an
"MMU potential stack buffer overrun."

CVE-2017-12188
arch/x86/kvm/mmu.c in the Linux kernel through 4.13.5, when nested
virtualisation is used, does not properly traverse guest pagetable
entries to resolve a guest virtual address, which allows L1 guest OS
users to execute arbitrary code on the host OS or cause a denial of
service (incorrect index during page walking, and host OS crash), aka an
"MMU potential stack buffer overrun."

CVE-2017-14954
The waitid implementation in kernel/exit.c in the Linux kernel through
4.13.4 accesses rusage data structures in unintended cases, which allows
local users to obtain sensitive information, and bypass the KASLR
protection mechanism, via a crafted system call.

CVE-2017-1000251
The native Bluetooth stack in the Linux Kernel (BlueZ), starting at the
Linux kernel version 3.3-rc1 and up to and including 4.13.1, are
vulnerable to a stack overflow vulnerability in the processing of L2CAP
configuration responses resulting in Remote code execution in kernel
space.

CVE-2017-14340
The XFS_IS_REALTIME_INODE macro in fs/xfs/xfs_linux.h in the Linux
kernel before 4.13.2 does not verify that a filesystem has a realtime
device, which allows local users to cause a denial of service (NULL
pointer dereference and OOPS) via vectors related to setting an
RHINHERIT flag on a directory.

CVE-2017-7558
RESERVED

CVE-2017-14051
An integer overflow in the qla2x00_sysfs_write_optrom_ctl function in
drivers/scsi/qla2xxx/qla_attr.c in the Linux kernel through 4.12.10
allows local users to cause a denial of service (memory corruption and
system crash) by leveraging root access.

CVE-2017-12153
A security flaw was discovered in the nl80211_set_rekey_data() function
in net/wireless/nl80211.c in the Linux kernel through 4.13.3. This
function does not check whether the required attributes are present in a
Netlink request. This request can be issued by a user with the
CAP_NET_ADMIN capability and may result in a NULL pointer dereference
and system crash.

CVE-2017-12154
The prepare_vmcs02 function in arch/x86/kvm/vmx.c in the Linux kernel
through 4.13.3 does not ensure that the "CR8-load exiting" and
"CR8-store exiting" L0 vmcs02 controls exist in cases where L1 omits the
"use TPR shadow" vmcs12 control, which allows KVM L2 guest OS users to
obtain read and write access to the hardware CR8 register.

CVE-2017-14156
The atyfb_ioctl function in drivers/video/fbdev/aty/atyfb_base.c in the
Linux kernel through 4.12.10 does not initialize a certain data
structure, which allows local users to obtain sensitive information from
kernel stack memory by reading locations associated with padding bytes.

CVE-2017-14489
The iscsi_if_rx function in drivers/scsi/scsi_transport_iscsi.c in the
Linux kernel through 4.13.2 allows local users to cause a denial of
service (panic) by leveraging incorrect length validation.

CVE-2017-14497
The tpacket_rcv function in net/packet/af_packet.c in the Linux kernel
before 4.13 mishandles vnet headers, which might allow local users to
cause a denial of service (buffer overflow, and disk and memory
corruption) or possibly have unspecified other impact via crafted system
calls.

CVE-2017-1000252
The KVM subsystem in the Linux kernel through 4.13.3 allows guest OS
users to cause a denial of service (assertion failure, and hypervisor
hang or crash) via an out-of bounds guest_irq value, related to
arch/x86/kvm/vmx.c and virt/kvm/eventfd.c.
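
CVE-2017-12188 is listed twice above, presumably because the changelog mentions that ID more than once and the pipeline looks up every match. A small sketch of a deduplicating filter that could sit between grep and the while loop; `unique_cves` is a hypothetical name, and this is not part of the command that produced the output above.

```shell
# Hypothetical filter: read text on stdin and print each CVE ID once,
# in order of first appearance.
unique_cves() {
    grep --extended-regexp --only-matching 'CVE-[0-9]+-[0-9]+' \
        | awk '!seen[$0]++'
}
```

The awk expression prints a line only the first time it is seen, so the order of the changelog is preserved, unlike with sort --unique.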

#11 Updated by intrigeri 2017-10-21 09:05:55

None of the above issues warrant an upgrade IMO.

Since then 4.13.4-2 was uploaded by a security team member. The most serious issues it fixes seem to be:

  • CVE-2017-5123 allows local attackers to write directly to kernel memory, but does not affect Linux 4.12.x (the bug was introduced in 4.13-rc1)
  • CVE-2017-15265, “unspecified impact” sounds scary, let’s call it a potential local priv. esc.

#12 Updated by intrigeri 2017-10-31 06:27:32

4.13.10-1 is now in sid.

#13 Updated by intrigeri 2017-11-06 15:49:36

  • Assignee changed from anonym to intrigeri

#14 Updated by intrigeri 2017-11-08 15:37:32

  • QA Check deleted (Info Needed)
  • Type of work changed from Research to Code

intrigeri wrote:
> None of the above issues warrant an upgrade IMO.

> Since then 4.13.4-2 was uploaded by a security team member. The most serious issues it fixes seem to be: […]

I’ve looked at these ones + the ones that lacked a description when anonym looked into this (“RESERVED” == embargoed, presumably) + the more recent ones fixed in 4.13.10-1:

  • CVE-2017-1000255: only relevant when running on PowerPC hardware
  • CVE-2017-15299: “The KEYS subsystem in the Linux kernel before 4.13.10 does not correctly synchronize the actions of updating versus finding a key in the "negative" state to avoid a race condition, which allows local users to cause a denial of service or possibly have unspecified other impact via crafted system calls.” → possible LPE, no DSA yet
  • CVE-2017-15265: as said above, possible LPE, no DSA yet
  • CVE-2017-15537: “The x86/fpu (Floating Point Unit) subsystem in the Linux kernel before 4.13.5, when a processor supports the xsave feature but not the xsaves feature, does not correctly handle attempts to set reserved bits in the xstate header via the ptrace() or rt_sigreturn() system call, allowing local users to read the FPU registers of other processes on the system, related to arch/x86/kernel/fpu/regset.c and arch/x86/kernel/fpu/signal.c.” → seems low severity
  • CVE-2017-15649: “net/packet/af_packet.c in the Linux kernel before 4.13.6 allows local users to gain privileges via crafted system calls that trigger mishandling of packet_fanout data structures, because of a race condition (involving fanout_add and packet_do_bind) that leads to a use-after-free, a different vulnerability than CVE-2017-6346.” → not sure what’s the impact, potential RCE, rated high severity by Red Hat, medium by Ubuntu; neither Ubuntu nor Debian released fixes yet so let’s assume it’s not a RCE
  • CVE-2017-15951: “The KEYS subsystem in the Linux kernel before 4.13.10 does not correctly synchronize the actions of updating versus finding a key in the "negative" state to avoid a race condition, which allows local users to cause a denial of service or possibly have unspecified other impact via crafted system calls.” → rated medium by Red Hat & Ubuntu; Ubuntu is going to release fixes, no DSA yet
  • CVE-2017-12190: “memory leak when merging buffers in SCSI IO vectors” → rated medium by Red Hat & Ubuntu; Ubuntu is going to release fixes, no DSA yet
  • CVE-2017-7558: “sctp: out-of-bounds read in inet_diag_msg_sctp{,l}addr_fill() and sctp_get_sctp_info()” → we blacklist the sctp module so we should be good; was fixed in Stretch via a DSA

So all in all, none of these seem super scary, but still that’s a lot of potential LPEs and other “unspecified impact”. I think we should upgrade and take the risk of hardware support regressions. The thing is, if we delay this to a future release, chances are that by then we won’t have a choice and will have to upgrade to 4.13 anyway to pick up more security fixes; and even if hardware support regressions are identified post-RC, it’s highly unlikely that we can get them fixed before the final release. So upgrading now does not make a big difference in practice.

Next step is to decide something on Feature #14714: if we’re going to bump our APT snapshots, then we can easily get 4.13; otherwise we’ll have to upload it to our custom APT repo.

#15 Updated by intrigeri 2017-11-08 15:37:47

  • related to deleted (Feature #14714: Consider upgrading to Stretch 9.2 in Tails 3.3)

#16 Updated by intrigeri 2017-11-08 15:37:54

  • blocked by Feature #14714: Consider upgrading to Stretch 9.2 in Tails 3.3 added

#17 Updated by intrigeri 2017-11-08 15:46:18

  • Feature Branch set to feature/14789-linux-4.13

I’ll run a full test suite on the topic branch that I’ve rebased on stable + feature/14714-bump-APT-snapshots (thaw APT snapshots + update Tor Browser profile patch).

#18 Updated by intrigeri 2017-11-08 15:48:22

  • Priority changed from High to Elevated

#19 Updated by intrigeri 2017-11-08 20:26:12

Full test suite run passes on first try except:

  • “Thunderbird can download the inbox with POP3”: local network issue I think.
  • “The persistent Tor Browser directory is usable”: known broken, ticket reported today

So I’m confident about getting this into 3.3.

#20 Updated by intrigeri 2017-11-08 20:31:26

  • Assignee changed from intrigeri to anonym
  • QA Check set to Ready for QA

#21 Updated by intrigeri 2017-11-08 20:31:55

Note that the branch has the commit for Bug #14923 cherry-picked.

#22 Updated by intrigeri 2017-11-09 10:16:48

Two full test suite runs have completed on Jenkins since.

  • in both cases 3 Tor Browser scenarios failed (startup page was not loaded) → not seen it locally so I think it’s a temporary network glitch
  • “A screenshot is taken when the PRINTSCREEN key is pressed” failed once → looks like Bug #13458 i.e. not a regression

I’ve seen all this pass locally at least once so this does not decrease my confidence in merging this for 3.3.

I’m now testing locally an additional commit on top of this branch, that drops our obsolete manual enabling of AppArmor on the kernel command-line: it’s now enabled by default. I should update this branch accordingly in a few hours. It’s a detail and should not block this review’n’merge though.
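
To see whether a kernel command line still carries explicit AppArmor settings, one can filter it for the relevant parameters. The sketch below assumes the manual enabling used the usual `apparmor=` and `security=` parameters, which this ticket does not spell out; `apparmor_cmdline_params` is a hypothetical name.

```shell
# Hypothetical helper: print the apparmor=... and security=... parameters
# found in the kernel command line passed as the first argument.
apparmor_cmdline_params() {
    printf '%s\n' "$1" | tr ' ' '\n' | grep -E '^(apparmor|security)=' || true
}
```

On a running system, `apparmor_cmdline_params "$(cat /proc/cmdline)"` should print nothing once the manual enabling is dropped, while `cat /sys/module/apparmor/parameters/enabled` should still report Y if the kernel enables AppArmor by default.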

#23 Updated by intrigeri 2017-11-09 13:41:02

intrigeri wrote:
> I’m now testing locally an additional commit on top of this branch, that drops our obsolete manual enabling of AppArmor on the kernel command-line: it’s now enabled by default. I should update this branch accordingly in a few hours.

Pushed. Full test suite run on my local Jenkins passes except “The persistent Tor Browser directory is usable” (Bug #14935).

#24 Updated by anonym 2017-11-11 14:41:38

  • Status changed from In Progress to Fix committed
  • Assignee deleted (anonym)
  • % Done changed from 70 to 100
  • QA Check changed from Ready for QA to Pass

#25 Updated by anonym 2017-11-15 11:35:26

  • Status changed from Fix committed to Resolved
  • Type of work changed from Code to Research