Feature #12428

Ensure disk caches and aufs read-write branch are emptied during emergency shutdown

Added by intrigeri 2017-04-05 16:07:06 . Updated 2017-05-23 09:13:12 .

Status: Resolved
Priority: High
Assignee:
Category:
Target version:
Start date: 2017-04-05
Due date:
% Done: 100%
Feature Branch: bugfix/12354-drop-kexec-memory-wipe
Type of work: Code
Blueprint:
Starter:
Affected tool:
Deliverable for:

Description

… so that they are overwritten by kernel memory poisoning. “Emergency” shutdown = when one unplugs the boot device.

This needs to be done in a way that’s as reliable as possible: in particular, the storage medium may host the persistent filesystem. If unmounting doesn’t work well enough (or just as an additional safeguard), we should probably echo 3 > /proc/sys/vm/drop_caches at some point during the shutdown process.

Writing an automated test for the “Tails with persistent volume unlocked” and “aufs read-write branch” use cases would help confirm that it actually works. It probably requires implementing a /lib/systemd/system-shutdown/ hook that pauses for a while when debug=wipemem is passed on the kernel command line, so that we can dump memory after we’ve tried unmounting the filesystems.
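As a minimal sketch of the detection part only: a hook could check whether debug=wipemem appears as a whole word on the kernel command line. The function name cmdline_has_flag and its optional second argument (a file to read instead of /proc/cmdline, so the check can be exercised without rebooting) are assumptions made for illustration, not something this ticket specifies.

```shell
#!/bin/sh
# Hypothetical helper: succeed if FLAG appears as a whole word on the
# kernel command line. The file defaults to /proc/cmdline; passing another
# path is only meant for testing.
cmdline_has_flag() {
    flag="$1"
    cmdline_file="${2:-/proc/cmdline}"
    cmdline=""
    # read returns non-zero when the file lacks a trailing newline but
    # still fills the variable, so its exit status is ignored.
    IFS= read -r cmdline < "$cmdline_file" || true
    # Pad with spaces so that only whole words match
    # (debug=wipe must not match debug=wipemem).
    case " $cmdline " in
        *" $flag "*) return 0 ;;
        *) return 1 ;;
    esac
}
```

A hook could then guard its pause with something like: if cmdline_has_flag debug=wipemem; then sleep 60; fi (the 60 seconds being an arbitrary placeholder).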



History

#1 Updated by intrigeri 2017-04-05 16:11:28

  • Subject changed from Ensure filesystems are unmounted during emergency shutdown to Ensure disk caches are emptied during emergency shutdown
  • Description updated

#2 Updated by intrigeri 2017-04-05 16:12:12

  • Priority changed from Elevated to High

#3 Updated by intrigeri 2017-04-05 17:57:03

systemd-shutdown(8) (src/core/shutdown.c in the systemd source tree) tries hard to detach all DM & loop devices and unmount all filesystems. Then it runs everything found in /lib/systemd/system-shutdown/ before actually shutting down or rebooting. So /lib/systemd/system-shutdown/ indeed seems to be a good place to drop a script that runs echo 3 > /proc/sys/vm/drop_caches and then pauses if /run/tails_shutdown_debugging exists. Note that “All executables in this directory are executed in parallel, and execution of the action is not continued before all executables finished”, so what we want to do really needs to be in one single script.
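The single-script idea could look roughly like the sketch below. It is not the hook that was eventually merged: the variable names, the function name, and the 60-second default pause are all assumptions, and the paths are overridable only so the logic can be tried without root.

```shell
#!/bin/sh
# Sketch of one hook for /lib/systemd/system-shutdown/ that does both
# steps in sequence, since separate executables there run in parallel.
# All three variables are hypothetical knobs with illustrative defaults.
DROP_CACHES_FILE="${DROP_CACHES_FILE:-/proc/sys/vm/drop_caches}"
DEBUG_FLAG_FILE="${DEBUG_FLAG_FILE:-/run/tails_shutdown_debugging}"
DEBUG_PAUSE_SECONDS="${DEBUG_PAUSE_SECONDS:-60}"

drop_caches_then_maybe_pause() {
    # Writing 3 asks the kernel to free the page cache as well as
    # dentries and inodes.
    echo 3 > "$DROP_CACHES_FILE"
    # Pause only when the debugging flag file exists, so that a test
    # suite has time to dump memory.
    if [ -e "$DEBUG_FLAG_FILE" ]; then
        sleep "$DEBUG_PAUSE_SECONDS"
    fi
}

# A real hook would end by calling: drop_caches_then_maybe_pause
```

Keeping both steps in one function guarantees the pause happens after the caches were dropped, which separate hook scripts could not guarantee.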

#4 Updated by intrigeri 2017-04-06 06:36:19

  • Subject changed from Ensure disk caches are emptied during emergency shutdown to Ensure disk caches and aufs read-write branch are emptied during emergency shutdown
  • Description updated

#5 Updated by intrigeri 2017-04-06 06:48:46

  • Status changed from Confirmed to In Progress
  • % Done changed from 0 to 10

My initial tests show that the content of the aufs read-write branch is not erased from memory on shutdown. It’s not very surprising, as systemd’s umount_all function (called from systemd-shutdown) does not try to unmount the root filesystem. Now, systemd-shutdown has code to return to the initrd and run /shutdown in there. Next step: check if this facility works in the context of Tails (might it be that only dracut supports this?). If it does, great! But it happens after /lib/systemd/system-shutdown/* so we cannot automatically test this with a script dropped in there.

#6 Updated by intrigeri 2017-04-06 07:41:27

intrigeri wrote:
> Now, systemd-shutdown has code to return to the initrd and run /shutdown in there. Next step: check if this facility works in the context of Tails (might it be that only dracut supports this?).

This is indeed supported by dracut, but not by initramfs-tools. I’m assuming that fully switching to dracut requires more work than we’re ready to put in before Tails 3.0. I can think of two other solutions, and I don’t know which one would be cheaper:

  • hack support for this facility into our initramfs; requirements:
    • a /run/initramfs that systemd-shutdown can chroot into (when using dracut, dracut-shutdown.service executes /usr/lib/dracut/dracut-initramfs-restore which unpacks the initramfs to /run/initramfs), and that contains everything needed to run:
    • a /run/initramfs/shutdown executable, that systemd-shutdown will call after chroot’ing; it can access the old root filesystem in /oldroot; its main task would be to unmount the old root FS; presumably the shutdown script included in dracut-generated initrds would be an excellent source of inspiration
  • switch to a dracut-generated initramfs during shutdown; requirements:
    • install dracut-core
    • during ISO build, use dracut to build a second initramfs dedicated to shutdown; it can be very small, e.g. we don’t need any kernel module in there
    • possibly disable some dracut systemd units
    • ensure dracut-shutdown.service and /run/initramfs/shutdown work fine

Requirements common to both cases:

  • All this must work reliably during emergency shutdown as well: all the needed files must be locked into memory, and whatever code is responsible for unpacking the initramfs to /run/initramfs must either have been run already, or must work even when the root filesystem is not available anymore.
  • /run/initramfs/shutdown must sleep for a while when /run/tails_shutdown_debugging exists, so we can write automated tests. This should be easy both with the initramfs-tools option (we write the shutdown script ourselves so it can do whatever we want) and with the dracut one (its shutdown script runs administrator-defined custom hooks after unmounting the old root FS).
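For the initramfs-tools option, a hand-written /run/initramfs/shutdown could be sketched along these lines. This is an illustration under stated assumptions, not dracut’s script nor the one that was merged: the function names are hypothetical, the flag file is assumed to still be visible at /run/tails_shutdown_debugging once systemd-shutdown has switched root, and mount points containing escaped characters (such as \040 for a space in /proc/mounts) are not handled.

```shell
#!/bin/sh
# Hypothetical helper: print the mount points at or below PREFIX, taken
# from a /proc/mounts-style file, ordered so that children always come
# before their parents (reverse byte order guarantees this).
list_mounts_under() {
    prefix="$1"
    mounts_file="${2:-/proc/mounts}"
    awk -v p="$prefix" '$2 == p || index($2, p "/") == 1 { print $2 }' \
        "$mounts_file" | LC_ALL=C sort -r
}

# Hypothetical main routine: unmount everything under /oldroot, then
# pause when the debugging flag file exists.
shutdown_main() {
    for mount_point in $(list_mounts_under /oldroot); do
        umount "$mount_point" || echo "could not unmount $mount_point" >&2
    done
    if [ -e "${DEBUG_FLAG_FILE:-/run/tails_shutdown_debugging}" ]; then
        sleep "${DEBUG_PAUSE_SECONDS:-60}"
    fi
}

# A real /run/initramfs/shutdown would end by calling: shutdown_main
```

Unmounting children before parents matters because a parent mount point stays busy as long as anything is mounted beneath it.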

#7 Updated by intrigeri 2017-04-06 07:45:27

https://www.freedesktop.org/wiki/Software/systemd/InitrdInterface/ says that the ArchLinux initrd supports the initrd interface of systemd; see their:

And https://www.freedesktop.org/wiki/Software/systemd/RootStorageDaemons/ has some useful info.

#8 Updated by intrigeri 2017-04-06 09:28:48

> This is indeed supported by dracut, but not by initramfs-tools.

There’s a wishlist bug for it: https://bugs.debian.org/778849.

#9 Updated by anonym 2017-04-06 11:19:12

intrigeri wrote:
> * hack support for this facility into our initramfs; requirements:
> [vs]
> * switch to a dracut-generated initramfs during shutdown; requirements:

IMHO, if we plan to fully migrate to dracut soon (say, within a year) then let’s consider going that way, otherwise let’s not introduce yet another technology that only you understand and that we only partially use; I worry that since we won’t be in the “normal” use case of dracut, we will be a bit on our own, and any changes that cause breakage for us will frustrate me unless you take it as your responsibility to fix it. Also, it feels bloaty to have two different initramfs generation systems, and since you present no argument why the dracut approach is preferable to the one we have, I fail to see any reason for considering it.

#10 Updated by intrigeri 2017-04-06 11:47:34

Thanks for your feedback!

> IMHO, if we plan to fully migrate to dracut soon (say, within a year)

I would not bet on this, and given what I read below, I’m not going to lead this effort unless you are happy to use it and learn how it works (until external changes force us to evolve, at least).

> otherwise let’s not introduce yet another technology that only you understand and that we only partially use;

Got it. I understand the reluctance about having to learn new tools, and I empathize with the frustration that comes with breakage related to new tools, breakage that’s hard to understand before one has spent some time learning how they work. Now, IMO it’s an essential part of the Foundations Team job to learn new technologies when we (need to) switch to them (e.g. systemd). Conclusion: I agree that we should be careful when introducing new technologies, and carefully weigh whether they’re worth the learning time they require from us all.

> I worry that since we won’t be in the “normal” use case of dracut, we will be a bit on our own, and any changes that cause breakage for us will frustrate me unless you take it as your responsibility to fix it.

Right, valid point, but:

> and since you present no argument why the dracut approach is preferable to the one we have, I fail to see any reason for considering it.

There’s no such thing as “the one we have”: if we go the initramfs-tools way, we need to implement a brand new feature there, and then we’ll probably have to maintain it ourselves. That’s the main argument I was (implicitly) making in favour of the dracut approach, when I was comparing them above: it already does what we need (although surely nobody tested it in the same context yet). So in both cases, we will 1. have something Tails-specific to maintain; and 2. have to deal with breakage caused by external changes.

With this in mind, your argument in favour of initramfs-tools looks a bit like “let’s write our own stuff from scratch, so that we don’t have to learn about existing software that does essentially what we want already”, which can be relevant sometimes, but doesn’t feel very convincing in general :) In this case it’s somewhat relevant: relevant since the “existing software” doesn’t really support the context in which we want to use it. But only somewhat since writing our own code won’t prevent external changes that can break it, and the initial implementation might require more Tails-specific work.

So as you can see, it’s not clear cut to me what’s best. I think I’ll take your feedback/concerns into account, and will first give a try to the initramfs-tools option. If I realize that it involves reinventing too many wheels, I’ll want to reconsider.

Thanks again!

#11 Updated by anonym 2017-04-06 13:04:17

Meta: I feel that you misunderstood me a lot, so I’ll be overly clear to get my point across this time. Sorry for the verbosity!

intrigeri wrote:
> > otherwise let’s not introduce yet another technology that only you understand and that we only partially use;
>
> Got it. I understand the reluctance about having to learn new tools, and I empathize with the frustration that comes with breakage related to new tools, breakage that’s hard to understand before one has spent some time learning how they work. Now, IMO it’s an essential part of the Foundations Team job to learn new technologies when we (need to) switch to them (e.g. systemd).

Clarification: my “reluctance to learn new stuff” stems purely from time constraints. I feel excited about learning new stuff when I have the time to do it properly, which is rare. I hate learning stuff when I don’t have the time, since that degenerates into stressful, frustrating trial and error while trying to get something to work ASAP. That means taking shortcuts in the learning process and missing essential stuff, and finally ending up with something I’m seriously unsure does the right thing, plus a sour initial feeling towards the technology.

> Conclusion: I agree that we should be careful when introducing new technologies, and carefully weigh whether they’re worth the learning time they require from us all.

Exactly! Let’s just not forget that this is not only about the Foundations team learning new technologies, but about future contributors, auditors etc.

Also, let me refine my position like this: let’s only use dracut if it comes out on top in the cost-benefit analysis with enough margin to justify introducing a new tool.

> > I worry that since we won’t be in the “normal” use case of dracut, we will be a bit on our own, and any changes that cause breakage for us will frustrate me unless you take it as your responsibility to fix it.
>
> Right, valid point, but:
>
> > and since you present no argument why the dracut approach is preferable to the one we have, I fail to see any reason for considering it.
>
> There’s no such thing as “the one we have”:

Clarification: re-read the above with s/the one we have/the initramfs-tools approach/! That was what I meant, sorry for being unclear!

> if we go the initramfs-tools way, we need to implement a brand new feature there, and then we’ll probably have to maintain it ourselves. That’s the main argument I was (implicitly) making in favour of the dracut approach, when I was comparing them above: it already does what we need (although surely nobody tested it in the same context yet). So in both cases, we will 1. have something Tails-specific to maintain; and 2. have to deal with breakage caused by external changes.

So the choice boils down to picking between:

  • maintaining a new feature for initramfs-tools
  • maintaining a probably unsupported use case of dracut

> With this in mind, your argument in favour of initramfs-tools looks a bit like “let’s write our own stuff from scratch, so that we don’t have to learn about existing software that does essentially what we want already”

That is not my argument. My argument is: “Let’s extend the tool we already are using, so that we don’t have to learn about existing software that does essentially what we want already, but that we will use in an unusual (possibly unsupported) way, and it won’t replace the other tool we are already using, but now we will use two tools.”

And with my refined position, let’s append: “Unless extending the tool we already use turns out too costly.”

> In this case it’s somewhat relevant: relevant since the “existing software” doesn’t really support the context in which we want to use it. But only somewhat since writing our own code won’t prevent external changes that can break it, and the initial implementation might require more Tails-specific work.

Agreed, so (again) once these are weighed against each other, let’s pick dracut if it is superior enough to justify introducing another tool/technology.

> So as you can see, it’s not clear cut to me what’s best.

Exactly, and take into account that I have no idea what dracut is beyond an event-driven initramfs-tools replacement and no experience of it whatsoever, so I focused on what I know, which are just some general points:

  • Introducing a new technology imposes a cost in time for learning it, both for current and future contributors.
  • Introducing a new technology in parallel to a similar technology introduces bloat and complexity.

Let me end by saying that I fully trust that your choice will be the right one! :)

#12 Updated by intrigeri 2017-04-07 11:36:33

My (local) work addresses this but so far emergency shutdown on boot medium removal doesn’t return to the initramfs so the RAM is (presumably) not cleared. I’m working on this last part.

#13 Updated by anonym 2017-04-16 16:00:53

intrigeri wrote:
> My (local) work addresses this but so far emergency shutdown on boot medium removal doesn’t return to the initramfs so the RAM is (presumably) not cleared. I’m working on this last part.

What is the status on this?

#14 Updated by intrigeri 2017-04-16 16:03:27

anonym wrote:
> intrigeri wrote:
> > My (local) work addresses this but so far emergency shutdown on boot medium removal doesn’t return to the initramfs so the RAM is (presumably) not cleared. I’m working on this last part.
>
> What is the status on this?

Exactly what I wrote above (I generally keep my tickets up-to-date, as I prefer storing status on Redmine rather than in my brain :)

#15 Updated by intrigeri 2017-04-18 15:04:06

  • Target version changed from Tails_3.0 to Tails_3.0~rc1

#16 Updated by intrigeri 2017-05-18 09:07:22

  • Assignee changed from intrigeri to anonym
  • % Done changed from 10 to 50
  • QA Check set to Ready for QA

#17 Updated by anonym 2017-05-18 14:58:17

  • Assignee changed from anonym to intrigeri
  • % Done changed from 50 to 100
  • QA Check changed from Ready for QA to Pass

See Bug #12354 for the review (no blockers found). Please close when you merge!

#18 Updated by intrigeri 2017-05-18 15:37:41

  • Status changed from In Progress to Fix committed

Applied in changeset commit:5f588e526056ed6384659e7e04ef39edf0a93634.

#19 Updated by intrigeri 2017-05-18 15:38:14

  • Assignee deleted (intrigeri)

#20 Updated by intrigeri 2017-05-23 09:13:12

  • Status changed from Fix committed to Resolved