Feature #11143

Harden Tails kernel with security-related kernel parameters

Added by cypherpunks 2016-02-19 13:13:22. Updated 2016-06-08 01:25:42.

Status: Resolved
Priority: Normal
Assignee:
Category:
Target version:
Start date: 2016-02-19
Due date:
% Done: 100%
Feature Branch: feature/11143-harden-kernel
Type of work: Code
Blueprint:
Starter:
Affected tool:
Deliverable for:

Description

There are a few kernel parameters which can be safely added to the Tails boot command line which increase security at little to no cost, and some of which improve security pretty noticeably. Here I present a few kernel parameters which can improve the security of Tails against kernel exploits, their rationale, and rough cost in terms of performance, compatibility, or memory footprint. I have been adding these to Tails manually each time I boot for around a year on various machines and have never had any problem with any of them. I hope you’ll consider utilizing them to harden Tails against kernel exploits. If any additional information is needed on any of the options, I will be happy to do more research into them and provide relevant kernel code snippets if necessary.

slab_nomerge
Disables the merging of slabs of similar sizes. Many times some obscure slab will be used in a vulnerable way, allowing an attacker to mess with it more or less arbitrarily. Most slabs are not usable even when exploited, so this isn’t too big of a deal. Unfortunately the kernel will merge similar slabs to save a tiny bit of space, and if a vulnerable and useless slab is merged with a safe but useful slab, an attacker can leverage that aliasing to do far more harm than they could have otherwise. In effect, this reduces kernel attack surface area by isolating slabs from each other. The trade-off is a very slight increase in kernel memory utilization. “slabinfo -a” can be used to tell what the memory footprint increase would be on a given system.

slub_debug=FZ
Enables sanity checks (F) and redzoning (Z). Sanity checks are self-evident and come with a modest performance impact, but this is unlikely to be significant on an average Tails system. The checks are basic but are still useful both for security and as a debugging measure. Redzoning adds extra areas around slabs that detect when a slab is overwritten past its real size, which can help detect overflows. Its performance impact is negligible. I did consider adding the P value which enables poisoning. Poisoning writes an arbitrary value to freed objects, so any modification or reference to that object after being freed or before being initialized will be detected and prevented. This prevents many types of use-after-free vulns at little perf cost. Unfortunately, the default poison value points into userland and might make exploitation easier on systems without SMAP (aka most systems), so I excluded the P. I’ll look into it more to see if the trade-offs (increased vulnerability to dereferencing into userland memory in exchange for increased resistance to UAFs) are worth it, but until then I left it out to be safe. An additional note: any time slub_debug= is put in the kernel command line, slab_nomerge is implied. But having slab_nomerge explicitly declared can help prevent regressions where disabling of debugging features is desired but re-enabling of merging is not.

vsyscall=none
Virtual syscalls are the obsolete predecessor of vDSO calls. Unfortunately, both vsyscall=native and vsyscall=emulate (the default) have a negative security impact, with the latter a little less so. Namely, they provide a target for any attacker who has control of the return instruction pointer, which is increasingly common these days now that attackers need to resort to ROP and similar attacks which target a process’ control flow. The cost is reduced compatibility; however, only legacy statically compiled binaries and old versions of glibc use vsyscalls. All software on modern Tails uses vDSO instead. If for some reason a program does try to use a vsyscall, the process will crash with a memory access violation, and won’t bring the whole system down.
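One way to confirm the effect of vsyscall=none on a running system is to look for the [vsyscall] region in a process' memory map. The sketch below is only an illustration: the helper name has_vsyscall_mapping is hypothetical, and it assumes (not verified in this ticket) that the region is absent from /proc/<pid>/maps when vsyscall=none is in effect and present under the other modes.

```python
def has_vsyscall_mapping(maps_text):
    """Return True if any line of a /proc/<pid>/maps dump names [vsyscall].

    Hypothetical helper: it only inspects the text it is given, so it can
    be fed the contents of /proc/self/maps or a saved copy.
    """
    return any(
        line.rstrip().endswith("[vsyscall]")
        for line in maps_text.splitlines()
    )
```

On a live system this could be called as has_vsyscall_mapping(open("/proc/self/maps").read()).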

mce=0
Mostly useful for systems with ECC memory, setting mce to 0 will cause the kernel to panic on any uncorrectable errors detected by the machine check exception system. Corrected errors will just be logged. The default is mce=1, which will SIGBUS on many uncorrected errors. Unfortunately this means malicious processes which try to exploit hardware bugginess (such as rowhammer) will be able to try over and over, suffering only a SIGBUS at failure. Setting mce=0 should have no impact. Any hardware which regularly triggers a memory-based MCE is unlikely to even boot, and the default is 1 only for long-lived servers.

oops=panic
Sets the kernel to fail-fast, which is highly desirable from a security perspective (see https://en.wikipedia.org/wiki/Fail-fast for a succinct explanation of the reasoning). Many kernel exploits hit the kernel hard and fail many times before finally hitting the sweet spot and gaining full control over kernel space. A large percentage of these times, the failures result in a kernel oops, rather than a kernel panic. Setting oops=panic will trigger a true stop error instead. This may be problematic for machines using very buggy drivers which cause harmless oopses. These systems will simply crash. I think this is very unlikely on a Tails system though. The same behavior can also be set as a sysctl (kernel.panic_on_oops), which may be preferable because it could also allow a few other panic_on_* features to be enabled which for some reason do not have their own kernel parameters, such as panic_on_warn, panic_on_unknown_nmi, and panic_on_io_nmi. There’s also panic_on_oom which might be useful to prevent the system from locking up when memory pressure is high and not responding to a yanked out USB stick, but that’s another discussion…

Summary: slab_nomerge slightly increases memory footprint, but this shouldn’t matter for Tails because it’s not an embedded system. slub_debug=FZ increases memory footprint slightly, and has a moderate performance impact in benchmarks, but is unlikely to have any impact in the real world. Remove the “F” to remove the majority of that perf impact. vsyscall=none breaks very old apps but Tails uses none of these anyway. mce=0 prevents malicious programs from trying to exploit hardware bugs by giving them only one shot at it. oops=panic causes the system to fail-fast, which is desirable from a security perspective. Systems with very buggy drivers may crash with this option set.
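To check whether a booted system actually carries the parameters summarized above, something like the following could be used. It is a minimal sketch: the names PROPOSED, missing_params and read_cmdline are made up for illustration, and the parsing simply splits on whitespace, which assumes none of these parameters are quoted on the command line.

```python
# Parameters proposed in this ticket, as they would appear on the command line.
PROPOSED = ["slab_nomerge", "slub_debug=FZ", "vsyscall=none", "mce=0", "oops=panic"]


def missing_params(cmdline, wanted=PROPOSED):
    """Return the wanted parameters that are not whole tokens of cmdline."""
    present = set(cmdline.split())
    return [param for param in wanted if param not in present]


def read_cmdline(path="/proc/cmdline"):
    """Read the kernel command line of the running system."""
    with open(path) as f:
        return f.read()
```

Calling missing_params(read_cmdline()) on a running system returns an empty list when all five parameters are present.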

Additional options I am looking into are reboot=cold (may make certain types of cold-boot attacks harder if memory is not removed from the system), acpi=copy_dsdt (may harden the system slightly from buggy BIOSes), and elevator=deadline (might reduce kernel surface area, with a nice side effect of improving USB and SSD performance). I may post rationale for them as well if they turn out to be useful security-wise.


Files


Subtasks


History

#1 Updated by intrigeri 2016-02-19 20:33:30

Hey! Cool, thanks! DrWhax, ioerror, jvoisin: any recommendation about this?

#2 Updated by jvoisin 2016-02-21 10:01:36

First of all, thank you for this nicely documented issue. I agree with every point except the last one (oops=panic). On recent/barely-supported hardware, oopses happen frequently. I don’t think that we should panic on them, since this would prevent users with bleeding-edge computers from using Tails at all.

#3 Updated by cypherpunks 2016-02-21 11:50:02

That’s too bad. Do the people who get oopses from bleeding-edge hardware tend to get them immediately, or are they delayed or appear at random intervals? I think a nice, albeit slightly hacky compromise would be to have kernel.panic_on_oops=0 upon boot, and then set kernel.panic_on_oops=1 after the GNOME desktop starts up if the kernel has not oopsed by then. That would opportunistically provide a security improvement on supported hardware.
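The compromise described above could be sketched roughly as follows. This is not the patch attached to this ticket, only an illustration: the function names are hypothetical, and it assumes that bit 7 of /proc/sys/kernel/tainted (the "D" taint flag) reliably records whether the kernel has oopsed since boot. It would need to run as root, at whatever point is chosen as "after the desktop has started".

```python
TAINT_DIE = 1 << 7  # 'D' taint flag: the kernel hit an oops or BUG since boot


def kernel_has_oopsed(tainted):
    """Return True if the taint bitmask records an earlier oops or BUG."""
    return bool(tainted & TAINT_DIE)


def enable_panic_on_oops_if_clean(
    tainted_path="/proc/sys/kernel/tainted",
    sysctl_path="/proc/sys/kernel/panic_on_oops",
):
    """Set kernel.panic_on_oops=1 only if no oops has happened so far.

    Returns True if the sysctl was written, False if it was left alone.
    """
    with open(tainted_path) as f:
        tainted = int(f.read().strip())
    if kernel_has_oopsed(tainted):
        # Hardware that already oopsed keeps the permissive default.
        return False
    with open(sysctl_path, "w") as f:
        f.write("1\n")
    return True
```

Machines whose drivers oops during boot would then keep panic_on_oops=0, while everything else gets the fail-fast behavior.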

Anyway I’m in the process of benchmarking the various boot options on a spare laptop using unixbench. I’ll post the results for the following boot combinations when it finishes: “toram”, “toram slab_nomerge”, “toram slub_debug=F”, “toram slub_debug=Z”, “toram slub_debug=FZ”, and “toram slab_nomerge slub_debug=FZ vsyscall=none mce=0 oops=panic”. I used toram so the benchmark data would not be skewed by a cheap low-speed flash drive.

Does the Tails testing suite do any benchmarking to detect performance regressions or anything, or does it purely exercise Tails functionality?

Also, is anyone on the Tails team experienced with kernel exploitation? I’m not familiar enough with it myself to say for certain whether or not slab poisoning harms security more than it benefits it, for the reasons outlined in the original post.

#4 Updated by jvoisin 2016-02-21 11:58:49

On my shiny bleeding-edge work laptop, I get a lot of oopses caused by the graphics card, regularly, both during boot and during use.

> Does the Tails testing suite do any benchmarking to detect performance regressions or anything, or does it purely exercise Tails functionality?

I don’t think that there is any benchmark in the test suite yet :/

> Also, is anyone on the Tails team experienced with kernel exploitation? I'm not familiar enough with it myself to say for certain whether or not slab poisoning harms security more than it benefits it, for the reasons outlined in the original post.

Slab poisoning from vanilla kernel isn’t designed as a hardening mitigation, and will make it easier to get a working exploit (without SMAP/SMEP) since the poisonous value points to userspace. I wouldn’t recommend to enable it.

#5 Updated by cypherpunks 2016-02-21 12:09:51

> Slab poisoning from vanilla kernel isn’t designed as a hardening mitigation, and will make it easier to get a working exploit (without SMAP/SMEP) since the poisonous value points to userspace. I wouldn’t recommend to enable it.

Yeah that’s what I mentioned in the original post. But it does make UAFs harder, even if it was not designed intentionally to do so. I guess with trade-offs such as this, it’s better to be conservative and assume that changes would be for the worse. If only the poisoning value could be specified in a sysctl. Well, whenever the overlayfs+AppArmor problem is resolved and grsecurity is added (I’m not holding my breath…), it’ll be a moot point.

#6 Updated by intrigeri 2016-02-21 15:52:30

> Does the Tails testing suite do any benchmarking to detect performance regressions

It does not. We manually compare boot time on bare metal between the last version and the upcoming one at release time.

#7 Updated by cypherpunks 2016-02-27 09:40:53

I finished benchmarking it. Between the default settings and the extra kernel boot parameters, the performance changes are statistically insignificant for the most part. I’ve also attached a txz of the raw benchmark data. Does this seem like an acceptable perf impact? If so, I’ve attached a patch to do that. I hope I did the patch correctly.

Dhrystone 2 using register variables
old:  7445732.1 lps
new:  7474831.9 lps
old: 14874112.9 lps
new: 14883145.5 lps

Double-Precision Whetstone
old: 1828.3 MWIPS
new: 1828.7 MWIPS
old: 3656.9 MWIPS
new: 3662.3 MWIPS

System Call Overhead
old: 917844.0 lps
new: 919257.2 lps
old: 597726.8 lps
new: 598227.7 lps

Pipe Throughput
old: 744814.5 lps
new: 745264.2 lps
old: 480598.1 lps
new: 483130.1 lps

Pipe-based Context Switching
old: 71955.2 lps
new: 72032.8 lps
old: 49489.4 lps
new: 50579.0 lps

Process Creation
old: 7388.6 lps
new: 7397.5 lps
old: 4595.9 lps
new: 4673.1 lps

Execl Throughput
old: 2275.6 lps
new: 2230.6 lps
old: 3756.0 lps
new: 3789.4 lps

File Write 1024 bufsize 2000 maxblocks
old: 585275.2 KBps
new: 586603.1 KBps
old: 660361.3 KBps
new: 635322.3 KBps

File Read 1024 bufsize 2000 maxblocks
old: 1475338.0 KBps
new: 1421977.5 KBps
old: 2456849.7 KBps
new: 2425946.4 KBps

File Copy 1024 bufsize 2000 maxblocks
old: 370419.5 KBps
new: 367560.2 KBps
old: 616734.5 KBps
new: 614784.8 KBps

File Write 256 bufsize 500 maxblocks
old: 209665.2 KBps
new: 210262.5 KBps
old: 255039.5 KBps
new: 254540.8 KBps

File Read 256 bufsize 500 maxblocks
old: 442414.6 KBps
new: 442760.5 KBps
old: 878939.9 KBps
new: 883791.0 KBps

File Copy 256 bufsize 500 maxblocks
old: 117692.9 KBps
new: 117662.0 KBps
old: 217670.6 KBps
new: 218009.3 KBps

File Write 4096 bufsize 8000 maxblocks
old: 1361071.1 KBps
new: 1356683.9 KBps
old: 1491961.8 KBps
new: 1481835.6 KBps

File Read 4096 bufsize 8000 maxblocks
old: 3172588.8 KBps
new: 2999878.0 KBps
old: 5098517.3 KBps
new: 5051454.9 KBps

File Copy 4096 bufsize 8000 maxblocks
old: 895490.9 KBps
new: 889867.0 KBps
old: 320117.7 KBps
new: 312950.4 KBps

Shell Scripts (1 concurrent)
old: 3461.4 lpm
new: 3450.3 lpm
old: 4761.2 lpm
new: 4750.3 lpm

Shell Scripts (8 concurrent)
old: 624.0 lpm
new: 622.3 lpm
old: 629.8 lpm
new: 629.7 lpm

Shell Scripts (16 concurrent)
old: 313.8 lpm
new: 313.6 lpm
old: 315.5 lpm
new: 315.0 lpm

C Compiler Throughput (gcc)
old:  621.2 lpm
new:  619.6 lpm
old: 1098.1 lpm
new: 1097.9 lpm

Recursion Test -- Tower of Hanoi
old:  92766.0 lps
new:  93643.2 lps
old: 187073.1 lps
new: 187240.3 lps

Grep a large file (system's grep)
old: 17817.3 lpm
new: 18078.6 lpm
old: 31607.3 lpm
new: 32070.3 lpm

Exec System Call Overhead
old: 1568.5 lps
new: 1558.8 lps
old: 2551.9 lps
new: 2569.9 lps

2D graphics: rectangles
old: 2040.5 score
new: 2047.5 score

2D graphics: lines
old: 1038.8 score
new: 1033.8 score

2D graphics: circles
old: 1175.6 score
new: 1179.4 score

2D graphics: ellipses
old: 751.5 score
new: 751.6 score

2D graphics: polygons
old: 1408.8 score
new: 1382.4 score

2D graphics: aa polygons
old: 4574.7 score
new: 4393.6 score

2D graphics: complex polygons
old: 352.7 score
new: 353.5 score

2D graphics: text
old: 86835.7 score
new: 87125.9 score

2D graphics: images and blits
old: 232565.3 score
new: 233083.4 score

2D graphics: windows
old: 311.7 score
new: 315.0 score
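For anyone who wants to put a number on the old/new differences listed above, the relative change can be computed as below. This is just a sketch: percent_change and summarize are illustrative names, and the three sample pairs are copied from the list above rather than parsed from the attached raw data.

```python
def percent_change(old, new):
    """Relative change from old to new, in percent (negative means slower)."""
    return (new - old) / old * 100.0


# A few (old, new) pairs copied from the results above.
SAMPLES = {
    "Dhrystone 2 using register variables": (7445732.1, 7474831.9),
    "File Read 4096 bufsize 8000 maxblocks": (3172588.8, 2999878.0),
    "2D graphics: aa polygons": (4574.7, 4393.6),
}


def summarize(samples):
    """Map each benchmark name to its percent change, rounded to 2 places."""
    return {
        name: round(percent_change(old, new), 2)
        for name, (old, new) in samples.items()
    }
```

For these three samples the change works out to roughly +0.39%, -5.44% and -3.96% respectively.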

#8 Updated by intrigeri 2016-02-28 13:29:36

  • Tracker changed from Bug to Feature
  • Status changed from New to In Progress
  • Assignee set to intrigeri
  • Target version set to Tails_2.4
  • % Done changed from 0 to 10
  • QA Check set to Ready for QA
  • Feature Branch set to feature/11143-harden-kernel

cypherpunks wrote:
> Does this seem like an acceptable perf impact?

I think so, yes.

> If so, I’ve attached a patch to do that. I hope I did the patch correctly.

Thanks! Looks good. I’ve imported it into a Git branch, and will build + run our automated test suite on it, so we’ll see if it breaks anything :)

It’s too late for inclusion in our next major release (2.2), so the target is the one after, that is 2.4 (in ~3 months).

Bonus points if someone imports the discussion from this ticket into our design doc, on a Git branch based on the one I’m referencing here: it would be nice if the thinking process that leads to these changes was recorded in Git.

#9 Updated by intrigeri 2016-02-29 13:55:32

intrigeri wrote:
> will build + run our automated test suite on it, so we’ll see if it breaks anything :)

It passes the small subset of our test suite that we run on Jenkins. Will do a full run too.

#10 Updated by cypherpunks 2016-04-03 12:24:34

Good news regarding slab poisoning. With kernel 4.6, it will be possible to set the value to zero which clears the memory, so there will no longer be the problem of a poison value pointing into userspace. It may be a while before Tails is using a version of Debian that uses 4.6, but it’s something to keep in mind so we can make use of it in a timely fashion.

According to https://lwn.net/Articles/680566/

>Page poisoning has traditionally been a kernel debugging feature; it fills freed pages with a special pattern that is easy to spot when looking for things that went wrong. In 4.6, poisoning can be enabled independently of the debugging options, and the “poison” value can be set to zero; this results in pages being simply cleared when they are freed. This behavior, inspired by the grsecurity/PaX patches, reduces the chances of the kernel leaking sensitive data.

#11 Updated by intrigeri 2016-04-29 11:50:18

intrigeri wrote:
> passes the small subset of our test suite that we run on Jenkins. Will do a full run too.

Passed!

#12 Updated by intrigeri 2016-04-29 11:51:14

cypherpunks wrote:
> Good news regarding slab poisoning. With kernel 4.6, it will be possible to set the value to zero which clears the memory, so there will no longer be the problem of a poison value pointing into userspace. It may be a while before Tails is using a version of Debian that uses 4.6, but it’s something to keep in mind so we can make use of it in a timely fashion.
>
> According to https://lwn.net/Articles/680566/
>
> >Page poisoning has traditionally been a kernel debugging feature; it fills freed pages with a special pattern that is easy to spot when looking for things that went wrong. In 4.6, poisoning can be enabled independently of the debugging options, and the “poison” value can be set to zero; this results in pages being simply cleared when they are freed. This behavior, inspired by the grsecurity/PaX patches, reduces the chances of the kernel leaking sensitive data.

Cool! Can you please track that in a new, dedicated ticket? This one will most likely be resolved, thanks to your initial batch of proposals, in Tails 2.4.

#13 Updated by intrigeri 2016-04-29 11:52:42

  • Assignee changed from intrigeri to anonym
  • % Done changed from 10 to 50

Design doc added => please review & merge for 2.4 :)

#14 Updated by anonym 2016-05-09 03:22:24

  • Status changed from In Progress to Fix committed
  • Assignee deleted (anonym)
  • % Done changed from 50 to 100
  • QA Check changed from Ready for QA to Pass

#15 Updated by anonym 2016-06-08 01:25:42

  • Status changed from Fix committed to Resolved