Feature #12238

Ship full APT lists in the live file system

Added by anonym 2017-02-14 18:32:40. Updated 2019-06-07 10:04:53.

Status:
Confirmed
Priority:
Normal
Assignee:
Category:
Target version:
Start date:
2017-02-14
Due date:
% Done:

0%

Feature Branch:
feature/12238-ship-apt-lists
Type of work:
Discuss
Blueprint:

Starter:
Affected tool:
Deliverable for:

Description

This would improve the UX when using APT since it will not be necessary to run apt update most of the time, and when it has to be run (will be most common when installing packages from testing/unstable) it will be faster since only a diff has to be downloaded. Users that install packages without persistence will like this, as will Tails Server users (and segfault, since he fears the initial apt update, that can take minutes, will put off users).

The drawback is that the lists occupy quite a bit of space => increased image size. It will also introduce data that will change quite a bit between each release => larger IUKs.


Subtasks


Related issues

Related to Tails - Feature #12237: Reduce apt update time during first start of Tails Server Resolved 2017-02-14
Related to Tails - Bug #6390: ISO size differ without an obvious reason Resolved 2013-10-29
Related to Tails - Feature #15584: Wrap apt to download lists if there are none Confirmed 2018-05-05

History

#1 Updated by anonym 2017-02-14 18:45:38

  • Feature Branch set to feature/12238-ship-apt-lists

The feature branch does this, and the image size is increased by 32 MiB.

Note 1: this branch has a separate commit (440b9cc73a Completely disable APT translations.) that we probably should merge into Tails ASAP.

Note 2: the current way this (but not the commit from “Note 1”) is implemented is non-reproducible. A real solution would be to convert the lists from our APT snapshots so they seem to come from the real Debian APT repo (perhaps by just renaming the files appropriately? Ours seem to be larger though.). Another would be to use snapshot.debian.org.

Note 3: given “Note 2”, the increased IUK size will only be an issue between major Tails versions.

#2 Updated by intrigeri 2017-02-14 18:49:03

> This would improve the UX when using APT since it will not be necessary to run apt update most of the time,

I’m curious where this “most of the time” comes from. Last time I checked, it was rather “during the 7-10 days after a given ISO was built”, i.e. in practice “during the 5-7 days after a given ISO was released”. Can you please clarify?

> as will Tails Server users (and segfault, since he fears the initial apt update, that can take minutes, will put off users).

Meta: I’d like the benchmark on Stretch, requested on Feature #12237, to be done before we take this argument into account.

#3 Updated by anonym 2017-02-15 09:06:33

  • related to Feature #12237: Reduce apt update time during first start of Tails Server added

#4 Updated by anonym 2017-02-15 11:14:52

anonym wrote:
> The feature branch does this, and the image size is increased by 32 MiB.

This was with gzip compression, I now realize. I guess it should be done again, with xz, but I’d suspect the impact to be similar when looking at the relative sizes.

#5 Updated by anonym 2017-02-15 11:21:38

intrigeri wrote:
> > This would improve the UX when using APT since it will not be necessary to run apt update most of the time,
>
> I’m curious where this “most of the time” comes from. Last time I checked, it was rather “during the 7-10 days after a given ISO was built”, i.e. in practice “during the 5-7 days after a given ISO was released”. Can you please clarify?

Since packages don’t change frequently in Debian stable (where users will install packages from by default) most package info in even old APT lists will remain valid, so those packages can be installed without an apt update.

> > as will Tails Server users (and segfault, since he fears the initial apt update, that can take minutes, will put off users).
>
> Meta: I’d like the benchmark on Stretch, requested on Feature #12237, to be done before we take this argument into account.

Done, at least for apt update only: Feature #12237#note-12

#6 Updated by intrigeri 2017-02-15 12:55:55

>> > This would improve the UX when using APT since it will not be necessary to run apt update most of the time,
>>
>> I’m curious where this “most of the time” comes from. Last time I checked, it was rather “during the 7-10 days after a given ISO was built”, i.e. in practice “during the 5-7 days after a given ISO was released”. Can you please clarify?

> Since packages don’t change frequently in Debian stable (where users will install packages from by default) most package info in even old APT lists will remain valid, so those packages can be installed without an apt update.

I still don’t get it, very sorry if I’m missing the obvious or if I’m reasoning based on wrong assumptions. I’ll try to clarify my assumptions here.

It seems to me that whether packages change frequently or not is orthogonal to the APT list expiration problem: what matters is the value of the Valid-Until field.

I (naively?) believe that one single expired APT list (e.g. testing/sid) was enough to prevent apt install from accepting to use any list at all (including those that are not expired yet). If I’m wrong, then right, one can still install packages from one given APT source as long as its lists are still valid, even if other lists have expired. Is this how it works?

#7 Updated by anonym 2017-02-15 13:26:23

intrigeri wrote:
> It seems to me that whether packages change frequently or not is orthogonal to the APT list expiration problem:

It is definitely not orthogonal, even if Valid-Until works like you assumed: if you try to install any package that has been updated and whose .deb has been cleaned up from the Debian APT mirrors, then you’ll get an error (trying to fetch a URL pointing to a non-existing .deb) until you run apt update and get the URL pointing to the new version.

> what matters is the value of the Valid-Until field.

I suspect it only matters when fetching the lists (to protect against replay attacks), i.e. only when running apt update, and not afterwards. I just booted a VM, saw via grep Valid-Until /var/lib/apt/lists/* that all lists were expired, but I could still successfully apt download cowsay, because its version hasn’t changed since I fetched the list some months ago, but I can not apt download icedove, because it has had a security update since then, and the old version referenced in my APT lists has been cleaned up. I also tried rm /var/cache/apt/{src,}pkgcache.bin but they were regenerated without issue indicating that the Valid-Until value is not relevant there either.

tl;dr: apt install will work as long as all packages’ versions it wants to install are still present on the Debian mirrors.
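
The Valid-Until check mentioned above (grepping the files under /var/lib/apt/lists/) can be sketched in Python. This is only an illustration and not how APT itself evaluates the field: the function name `valid_until_expired` and the sample Release excerpt are made up for the example, and it assumes the field carries an RFC 2822-style date as seen in Debian Release files.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime


def valid_until_expired(release_text, now=None):
    """Return True/False if a Valid-Until field is found, None otherwise."""
    now = now or datetime.now(timezone.utc)
    for line in release_text.splitlines():
        if line.startswith("Valid-Until:"):
            stamp = parsedate_to_datetime(line.split(":", 1)[1].strip())
            if stamp.tzinfo is None:
                # Assumption: treat a date without a zone as UTC.
                stamp = stamp.replace(tzinfo=timezone.utc)
            return stamp < now
    return None


# Hypothetical excerpt of a Release file, for illustration only.
SAMPLE = """Origin: Debian
Suite: stable
Date: Sat, 11 Feb 2017 14:41:50 UTC
Valid-Until: Sat, 18 Feb 2017 14:41:50 UTC
"""

print(valid_until_expired(SAMPLE))
```

A True result only says that the list is past its Valid-Until date. As observed above, that did not stop apt download from working for packages whose version was still on the mirrors.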

#8 Updated by intrigeri 2017-02-15 19:56:46

> It is definitely not orthogonal, even if […]

Indeed, thanks!

>> what matters is the value of the Valid-Until field.

> I suspect it only matters when fetching the lists (to protect against replay attacks), i.e. only when running apt update, and not afterwards. I just booted a VM, saw via grep Valid-Until /var/lib/apt/lists/* that all lists were expired, but I could still successfully apt download cowsay, because its version hasn’t changed since I fetched the list some months ago, but I can not apt download icedove, because it has had a security update since then, and the old version referenced in my APT lists has been cleaned up. I also tried rm /var/cache/apt/{src,}pkgcache.bin but they were regenerated without issue indicating that the Valid-Until value is not relevant there either.

Excellent!

> tl;dr: apt install will work as long as all packages’ versions it wants to install are still present on the Debian mirrors.

Woohoo, that’s very good news. Thanks a lot! :)

(I’ll assume that the “potential for running network services with known security issues” issue is taken into account on the Tails Server side, and won’t bother about it here.)

#9 Updated by sajolida 2017-03-04 20:41:33

I read through this quickly and won’t block on shipping them if you decide on doing so.

Still, I’d be very interested in seeing which fraction of our user base will actually make use of that data.

Right now the only clue I have is the 9% of WB reports that have additional packages amongst the 48% of reports with persistence, so that’s 4% of the total. But of course that’s a bare minimum and other users (people without persistence, people using the future GUI for additional software, people using the future Tails Server) would boost this up by some unknown factor.
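
For reference, the 4% figure is just the product of the two shares quoted above. A tiny sketch of the arithmetic (the variable names are only for illustration):

```python
# Shares quoted above: WB reports with persistence, and the fraction of
# those that have additional packages.
with_persistence = 0.48
with_additional_packages = 0.09

share_of_all_reports = with_persistence * with_additional_packages
print(f"{share_of_all_reports:.1%}")  # prints 4.3%
```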

#10 Updated by sajolida 2018-01-28 14:48:07

  • related to Bug #6390: ISO size differ without an obvious reason added

#11 Updated by sajolida 2018-01-28 14:56:36

  • Assignee changed from anonym to alant
  • Parent task set to Feature #14568

The lists were removed in Bug #6390.

While working on the UX of Additional Software we realized that not having these lists would be a big hurdle:

  • When running apt install mumble from the terminal I get an error saying “Unable to locate …” which is probably hard to relate to missing APT lists.
  • When running Synaptic, I see in the list of packages only the packages that are already installed in Tails, with no clue that other packages are also installable from Debian.

In Feature #12238#note-9, I said that this extra data might not be useful for many people, but that would change drastically once we have a GUI for Additional Software.

So if adding these lists solves the problems I’m describing here, I’m all in favor of doing this!

If we’re worried about additional space taken in our ISO image, we might keep in mind that Additional Software, if well implemented and easy to use, would allow us to remove software from Tails and get back these 32 MB :)

Assigning to Alan to make sure he moves the discussion forward as part of Feature #14568.

#12 Updated by sajolida 2018-01-28 15:26:31

By the way, how would this relate to build reproducibility? :)

#13 Updated by sajolida 2018-02-05 13:12:56

  • Deliverable for set to 299

#14 Updated by sajolida 2018-02-05 14:05:29

Alan: I think the first step could be to ask release managers and reproducible build folks if that’s going to be a problem. Maybe intrigeri can do that as he’s part of our team as well.

So unless you want to give it more thought yourself first, maybe reassign it to him with “Info Needed”.

Another issue I just thought about is IUK: the full APT lists would change for each release and they would always go in the IUK.

#15 Updated by sajolida 2018-02-05 14:05:57

Also target 3.9 for that unless you want to do it in time for 3.6.

#16 Updated by intrigeri 2018-02-06 17:05:05

> Alan: I think the first step could be to ask release managers and reproducible build folks if that’s going to be a problem. Maybe intrigeri can do that as he’s part of our team as well.

  • From a RM point of view, at first glance I see no obvious issue with this proposal.
  • Regarding reproducibility: I don’t know, but Alan can test this himself (set a branch as Ready for QA and it will be tested for reproducibility in Jenkins :)
  • I’ve already expressed my other concerns (of which the only relevant one is probably “potential for running network services with known security issues”) above.

So I think next steps are on Alan’s plate:

  1. think about the security consequences of enabling users to install a package from stable whose security issues have since been fixed via stable-security; we don’t enable them to do that right now
  2. if that is deemed acceptable, build a PoC and ensure it does not break other stuff such as reproducibility

#17 Updated by anonym 2018-02-06 17:49:12

sajolida wrote:
> Another issue I just thought about is IUK: the full APT lists would change for each release and they would always go in the IUK.

The lists are N MB compressed (from Feature #12238#note-1 it seems N ~= 32 MB), so the actual effect is that each IUK will become N MB larger than it is currently. This is equivalent to running a fresh apt update as part of the Tails Upgrader (e.g. if we shipped the IUKs as Debian packages… :)), and it is only (partially) redundant for users that activate persistent APT lists, and for users that will never use APT to install something.

#18 Updated by alant 2018-02-18 15:38:39

  • Assignee changed from alant to intrigeri
  • QA Check set to Info Needed

intrigeri wrote:
> * From a RM point of view, at first glance I see no obvious issue with this proposal.

Great!

> * Regarding reproducibility: I don’t know, but Alan can test this himself (set a branch as Ready for QA and it will be tested for reproducibility in Jenkins :)

OK

> * I’ve already expressed my other concerns (of which the only relevant one is probably “potential for running network services with known security issues”) above.
>
I don’t understand this issue. Do you mean that by shipping APT lists, a user will be able to install a network service from stable that has an update in stable-security, without noticing?

If that is the issue, then we might want to ensure that apt lists are updated before actually installing a new package if we have network access.

#19 Updated by intrigeri 2018-02-18 21:07:16

  • Assignee changed from intrigeri to alant

>> * I’ve already expressed my other concerns (of which the only relevant one is probably “potential for running network services with known security issues”) above.

> I don’t understand this issue. Do you mean that by shipping APT lists, a user will be able to install a network service from stable that has an update in stable-security, without noticing?

Exactly.

> If that is the issue, then we might want to ensure that apt lists are updated before actually installing a new package if we have network access.

I’m too lazy to check myself if this addresses all instances of this problem but it sounds like a good start!

#20 Updated by alant 2018-03-04 15:47:31

  • Assignee deleted (alant)
  • Parent task deleted (Feature #14568)
  • QA Check deleted (Info Needed)

> > If that is the issue, then we might want to ensure that apt lists are updated before actually installing a new package if we have network access.
>
> I’m too lazy to check myself if this addresses all instances of this problem but it sounds like a good start!

We’ve got an easier plan for the Additional Software UX: update APT lists automatically at package manager UI startup.

I’ve implemented that for synaptic in commit b34e8a5117.

#21 Updated by alant 2018-03-04 15:48:15

  • Deliverable for deleted (299)

#22 Updated by alant 2018-03-04 20:58:49

  • Status changed from Confirmed to In Progress

Applied in changeset commit:b34e8a511733a9dbd8718f93f4f28b05e486cbad.

#23 Updated by anonym 2018-03-05 09:22:21

  • Assignee set to alant
  • QA Check set to Info Needed

alant wrote:
> We’ve got an easier plan for the Additional Software UX: update APT lists automatically at package manager UI startup.

I agree that this change is good since it prevents the package manager from displaying outdated/partial info about the packages in Debian, but I do not see how it relates to this ticket. The way I see it, this ticket is purely about the UX of apt update taking probably something like five minutes on average.

#24 Updated by bertagaz 2018-03-05 09:45:14

Given one of the drawbacks is the growth of the IUK size, I was wondering if we could eventually ship the APT lists in the ISO (if that’s what is decided), but not in the IUKs.

I’m a bit concerned about reproducibility though.

#25 Updated by intrigeri 2018-03-05 10:43:55

> I do not see how it relates to this ticket

I think Alan merely meant that we don’t need the change this ticket is about for the ASP project. But if you want/need to work on it for other reasons, feel free to :)

#26 Updated by alant 2018-03-06 11:29:11

intrigeri wrote:
> > I do not see how it relates to this ticket
>
> I think Alan merely meant that we don’t need the change this ticket is about for the ASP project. But if you want/need to work on it for other reasons, feel free to :)

Exactly. We decided we don’t need that for ASP. It doesn’t mean it’s not cool, just that I’m not committed to making it happen within 3 months.

#27 Updated by anonym 2018-03-06 13:29:17

  • Assignee changed from alant to segfault

alant wrote:
> intrigeri wrote:
> > > I do not see how it relates to this ticket
> >
> > I think Alan merely meant that we don’t need the change this ticket is about for the ASP project. But if you want/need to work on it for other reasons, feel free to :)
>
> Exactly. We decided we don’t need that for ASP. It doesn’t mean it’s not cool, just that I’m not committed to making it happen within 3 months.

Got it! My confusion was that I couldn’t imagine a problem solved by this ticket that is also solved by `synaptic --update-at-startup`. Anyway, I guess you are still interested in this, segfault? I guess we might want to discuss this, and there is actually a contributors meeting tonight in case you have time to prepare the discussion.

BTW, for my own curiosity I measured 7 fresh `apt update` (with different guards) in Tails:

      TIME: 792.894223055
      TIME: 729.469853512
      TIME: 238.929281906
      TIME: 154.32395474
      TIME: 288.730508966
      TIME: 255.571447287
      TIME: 178.297504082

Average: 377 seconds. Two 12+ minute runs. Ouch.

(Note that these measurements were done before most relays updated to a tor mitigating the Great DDoS of 2017/2018.)
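
The summary figures can be re-derived from the seven timings listed above. The snippet below is only a quick check of that arithmetic. The median is not part of the original measurements write-up and is added here to show how much the two slow runs pull the mean up.

```python
from statistics import mean, median

# The seven wall-clock times (in seconds) listed above.
timings = [
    792.894223055, 729.469853512, 238.929281906, 154.32395474,
    288.730508966, 255.571447287, 178.297504082,
]

average = mean(timings)
# Runs that took longer than 12 minutes.
over_12_min = [t for t in timings if t > 12 * 60]

print(f"average: {average:.0f} s")
print(f"median:  {median(timings):.0f} s")
print(f"runs over 12 minutes: {len(over_12_min)}")
```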

#28 Updated by sajolida 2018-06-05 16:06:59

  • related to Feature #15584: Wrap apt to download lists if there are none added

#29 Updated by sajolida 2018-06-05 16:09:20

While working on Additional Software we proposed Feature #15584: wrapping apt and apt-get on interactive shells to download the APT lists if there are none yet.

This would prevent people from wondering what’s going on when trying APT which was an important issue when testing the Additional Software beta.

So once we have Feature #15584 we could reject this ticket (or seriously lower its priority).
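
To make the idea concrete, here is a minimal sketch of such a wrapper. It is not the implementation from Feature #15584. The function names (`have_package_lists`, `apt_with_lists`) are hypothetical, and treating "a file whose name contains `_Packages`" as proof that lists exist is an assumption of this example.

```python
import subprocess
from pathlib import Path

# Default location where APT keeps its downloaded lists.
LISTS_DIR = Path("/var/lib/apt/lists")


def have_package_lists(lists_dir=LISTS_DIR):
    """Heuristic: True if at least one *_Packages* index file is present."""
    if not lists_dir.is_dir():
        return False
    return any(p.is_file() and "_Packages" in p.name for p in lists_dir.iterdir())


def apt_with_lists(args, lists_dir=LISTS_DIR, run=subprocess.call):
    """Run `apt update` first when no lists exist, then run `apt <args>`."""
    if not have_package_lists(lists_dir):
        status = run(["apt", "update"])
        if status != 0:
            # Do not attempt the real command if the update failed.
            return status
    return run(["apt", *args])
```

With this, something like `apt_with_lists(["install", "mumble"])` would fetch the lists on first use instead of failing with “Unable to locate …”. The `run` parameter exists only so the logic can be exercised without actually calling apt.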

#30 Updated by intrigeri 2018-06-05 16:25:39

> So once we have Feature #15584 we could reject this ticket (or seriously lower its priority).

… unless it’s needed for something else :)

#31 Updated by segfault 2019-03-29 22:08:42

  • QA Check deleted (Info Needed)

#32 Updated by segfault 2019-06-06 16:45:35

  • Assignee deleted (segfault)

intrigeri:
> I’ve already expressed my other concerns (of which the only relevant one is probably “potential for running network services with known security issues”) above.

Note that the same is already true for users who enabled the additional software feature. They update the package index once and any time later, when they run apt install without updating first, they could install a package from stretch which has an updated package in stretch-security.

Anyway, I don’t think this issue is that pressing anymore, because Tor got faster again. I just timed an apt update on Tails and it “only” took 51 seconds, which is a lot better than anonym’s results from last year. Also, we don’t currently plan to release Tails Server, so I have no reason to push things forward here and am unassigning myself.

#33 Updated by intrigeri 2019-06-07 10:04:53

  • Status changed from In Progress to Confirmed