Feature #15290

Reduce IUK size

Added by anonym 2018-02-05 16:27:04 . Updated 2019-12-25 23:08:24 .

Status:
Resolved
Priority:
Low
Assignee:
Category:
Target version:
Start date:
2016-04-13
Due date:
% Done:

10%

Feature Branch:
feature/15281-single-squashfs-diff
Type of work:
Research
Blueprint:

Starter:
Affected tool:
Upgrader
Deliverable for:

Description

This is about researching low-hanging fruit to reduce the size of our IUKs, in order to be nicer to users with slow Internet connections. As for the increased memory consumption caused by the “single SquashFS diff” idea, another approach (Feature #6876) will solve that problem.

Current best ideas (collected initially on Bug #11211 and Feature #11345; see also Feature #6425) are:

  • apply the 99-set_mtimes trick to:
    • /usr/share/mime/application/*
    • /var/cache/debconf/*
    • /var/lib/dpkg/info
  • delete /var/lib/dkms/ during the build, then remove the dkms bits from config/chroot_local-hooks/99-zzzzzz_reproducible-builds-post-processing
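A minimal sketch of the idea behind the mtime trick, assuming it works by forcing a fixed mtime on the listed files so that two builds only differ where file content differs. This is an illustration, not the actual 99-set_mtimes hook: `set_fixed_mtimes`, `FIXED_MTIME` and the `root` argument are hypothetical names, and the timestamp value is a placeholder.

```python
import glob
import os

# Placeholder timestamp (assumption); the value used by the real hook is not given here.
FIXED_MTIME = 0


def set_fixed_mtimes(root, patterns, mtime=FIXED_MTIME):
    """Set the mtime of every regular file under root that matches patterns.

    patterns are absolute in-chroot globs such as '/var/cache/debconf/*'.
    Returns the sorted list of files whose mtime was set.
    """
    changed = []
    for pattern in patterns:
        # Resolve the in-chroot glob relative to the build root.
        for path in glob.glob(os.path.join(root, pattern.lstrip("/"))):
            # Skip directories and symlinks: only regular files are touched.
            if os.path.isfile(path) and not os.path.islink(path):
                atime = os.stat(path).st_atime
                os.utime(path, (atime, mtime))
                changed.append(path)
    return sorted(changed)
```

With the globs from the list above, a call could look like `set_fixed_mtimes(chroot, ["/usr/share/mime/application/*", "/var/cache/debconf/*", "/var/lib/dpkg/info/*"])`, where `chroot` is a hypothetical path to the build tree.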

To evaluate the impact of these ideas:

  1. Fork the tag for Tails version N, apply the aforementioned tricks. Build an ISO.
  2. Do the same for Tails version N+1.
  3. Build the IUK from N to N+1 using these ISOs.
  4. Compare the size of the IUK you’ve built with the size of the IUK we’ve published from N to N+1.
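Step 4 can be sketched as follows, assuming both IUKs are available as local files. The function name and the file paths are hypothetical.

```python
import os


def size_saving(published_iuk, rebuilt_iuk):
    """Return (bytes saved, percentage saved) of the rebuilt IUK vs. the published one."""
    published = os.path.getsize(published_iuk)
    rebuilt = os.path.getsize(rebuilt_iuk)
    saved = published - rebuilt
    # Guard against an empty reference file to avoid dividing by zero.
    percent = 100.0 * saved / published if published else 0.0
    return saved, percent
```

A negative result means the rebuilt IUK is larger than the published one.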

Subtasks


Related issues

Related to Tails - Feature #15289: Make Tails Upgrader suggest a manual upgrade to decrease future IUK sizes Confirmed 2018-02-05
Related to Tails - Feature #6425: Do not save some large files whose only modification is mtime, for smaller IUK Resolved 2013-11-16
Has duplicate Tails - Bug #11211: Don't include files from /usr/share/mime in IUKs if their only modification is mtime Duplicate 2016-03-09
Has duplicate Tails - Feature #11345: Ship less unneeded cache/generated files in the ISO and IUK Duplicate 2016-04-13

History

#1 Updated by anonym 2018-02-05 16:27:51

  • Target version set to 2018

#2 Updated by anonym 2018-02-05 16:29:57

  • related to Feature #15281: Stack one single SquashFS diff when upgrading added

#3 Updated by anonym 2018-02-05 16:31:02

This is related to Feature #15281 (1BigIUK) because of the larger IUKs. Once we have it, it really seems worth at least picking the low-hanging fruit among size optimizations.

#4 Updated by intrigeri 2018-02-06 15:37:17

  • Description updated

#5 Updated by Anonymous 2018-08-16 12:19:23

  • related to Feature #15289: Make Tails Upgrader suggest a manual upgrade to decrease future IUK sizes added

#6 Updated by anonym 2019-01-15 13:03:31

  • Target version changed from 2018 to 2019

#7 Updated by intrigeri 2019-03-07 15:36:53

  • blocked by Feature #15277: Update our survey of non-NIH system upgrade solutions added

#8 Updated by intrigeri 2019-03-07 15:38:32

  • blocks deleted (Feature #15277: Update our survey of non-NIH system upgrade solutions)

#9 Updated by intrigeri 2019-03-07 15:41:44

anonym wrote:
> This is related to Feature #15281 (1BigIUK) because of the larger IUKs. Once we have it, it really seems worth at least picking the low-hanging fruit among size optimizations.

Yes, maybe… but the marginal benefit (relative to the total upgrade download time) brought by such optimizations will actually be lower once IUKs get bigger. So I guess it could be worth quickly trying the low-hanging fruit I’ve identified on the subtasks and checking how much space (and thus download time) we would save, before even evaluating whether these tricks break anything we care about. That said, it’s no big deal if we don’t do it, IMO.

#10 Updated by intrigeri 2019-08-31 15:59:28

  • Description updated

(Import current best ideas from Bug #11211 and Feature #11345 so I can close them and simplify this set of tickets.)

#11 Updated by intrigeri 2019-08-31 16:00:25

  • related to Feature #6425: Do not save some large files whose only modification is mtime, for smaller IUK added

#12 Updated by intrigeri 2019-08-31 16:01:25

  • has duplicate Bug #11211: Don't include files from /usr/share/mime in IUKs if their only modification is mtime added

#13 Updated by intrigeri 2019-08-31 16:01:37

  • has duplicate Feature #11345: Ship less unneeded cache/generated files in the ISO and IUK added

#14 Updated by intrigeri 2019-08-31 16:09:40

  • Target version deleted (2019)

This would be nice, but technically it’s not on our roadmap.

In a build of current devel branch (Buster):

  • Compressed together with tar+bz2, /usr/share/mime/application/ + /var/cache/debconf/ + /var/lib/dpkg/info take 4.2M, so in the best case we’ll be saving a few MB on each IUK.
  • /var/lib/dkms is 862KB
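This kind of measurement can be reproduced with a short sketch like the one below, which archives a set of paths into an in-memory tar+bz2 stream and reports its size. It assumes the paths are readable from a mounted or unpacked build; `compressed_size` is a hypothetical helper name.

```python
import io
import tarfile


def compressed_size(paths, mode="w:bz2"):
    """Return the size in bytes of a compressed tar archive holding paths.

    mode is a tarfile write mode: 'w:bz2' for tar+bz2, 'w:xz' for tar+xz.
    """
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode=mode) as tar:
        for path in paths:
            # Directories are added recursively.
            tar.add(path)
    # The archive is complete once the tarfile is closed.
    return len(buf.getvalue())
```

Everything is held in memory, which is fine for a few MB but not for a whole filesystem.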

So implementing our current best ideas would definitely be nice, but that won’t improve UX much.

Now, if there are other low-hanging fruits we did not spot, that would impact the IUK size substantially, then it would be a completely different story!

#15 Updated by intrigeri 2019-12-25 12:02:19

  • Status changed from Confirmed to In Progress
  • Assignee changed from anonym to intrigeri
  • Feature Branch set to feature/15281-single-squashfs-diff

Tails_amd64_4.1_to_4.1.1.iuk is an interesting example in that it brought extremely few changes but is still 94MB large. Granted, 30% of it is the initrd, which we did mean to upgrade. But the 64M 4.1.1.squashfs feels too big to me and there could be something to save in there, so I took a closer look. The sizes below are for uncompressed files from the SquashFS.

Potentially low-hanging fruits that are unlikely to make a significant difference in isolation, but maybe together they’re worth it:

  • our website takes 45M; it compresses down to 8M with tar cJ; if most of it is present only due to changing mtimes, we should probably use the 99-set_mtimes trick
  • /usr/share/mime/{application,packages}/*
  • /var/cache/debconf/
  • /var/lib/dpkg/info

So I’ve simulated how applying the 99-set_mtimes trick to the last 3 sets of files would perform, by passing --ignore-if-same-content '/usr/share/mime/application/*' --ignore-if-same-content '/usr/share/mime/packages/*' --ignore-if-same-content '/var/cache/debconf/*' --ignore-if-same-content '/var/lib/dpkg/info/*' to tails-create-iuk (https://jenkins.tails.boum.org/job/build_IUKs/35/). I could not do this for the website because --ignore-if-same-content breaks stuff if the passed glob expands to directories. The resulting Tails_amd64_4.1_to_4.1.1.iuk is 1.5M (1.7%) smaller. IMO that makes it a lead that’s not worth pursuing, because pursuing it would require testing and reasoning about the risk of regressions.
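To illustrate what ignoring same-content files means for a diff, here is a simplified model in Python. It is not how tails-create-iuk is implemented: it assumes a diff ships every file that is new, has different content, or has a different mtime, and that an "ignore if same content" glob drops files whose only difference is the mtime. `changed_files` and its parameters are hypothetical names.

```python
import filecmp
import fnmatch
import os


def changed_files(old_root, new_root, ignore_if_same_content=()):
    """List files (as absolute in-tree paths) that this model of a diff would ship."""
    shipped = []
    for dirpath, _dirs, files in os.walk(new_root):
        for name in files:
            new_path = os.path.join(dirpath, name)
            rel = "/" + os.path.relpath(new_path, new_root).replace(os.sep, "/")
            old_path = os.path.join(old_root, rel.lstrip("/"))
            if not os.path.isfile(old_path):
                shipped.append(rel)  # new file
                continue
            same_content = filecmp.cmp(old_path, new_path, shallow=False)
            same_mtime = int(os.stat(old_path).st_mtime) == int(os.stat(new_path).st_mtime)
            if same_content and same_mtime:
                continue  # unchanged
            # Note: unlike shell globs, fnmatch lets '*' match '/' as well.
            ignorable = any(fnmatch.fnmatch(rel, g) for g in ignore_if_same_content)
            if same_content and ignorable:
                continue  # only the mtime changed and the glob says to ignore it
            shipped.append(rel)
    return sorted(shipped)
```

Only regular files are considered, so this model sidesteps the case of globs that expand to directories.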

But our website is a different beast: the savings should be higher and we understand the impact much better, i.e. the risk of regressions seems close to zero. So I’m trying the 99-set_mtimes trick there.

And the total size of the SquashFS diff is dominated by files that IMO are false leads:

  • /usr/share/thunderbird/omni.ja is 15M and is already compressed, so that probably accounts for 23% of the SquashFS diff; note that we did not upgrade Thunderbird between 4.1 and 4.1.1, but still, the content of its omni.ja differs.
  • Tor Browser’s 2 × omni.ja are in the SquashFS diff too; together, they take 23M i.e. another 36% of the SquashFS diff; but the content of these files has not changed, so in theory the 99-set_mtimes trick would be efficient. I’m worried about the potential for side-effects here, though.
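The two percentages above can be checked against the 64M SquashFS diff with a few lines of arithmetic. This assumes, as the text does, that these already-compressed files barely shrink further inside the SquashFS; `share_percent` is a hypothetical helper.

```python
def share_percent(part_mb, total_mb):
    """Return part_mb as a percentage of total_mb."""
    return 100.0 * part_mb / total_mb


SQUASHFS_DIFF_MB = 64

thunderbird = share_percent(15, SQUASHFS_DIFF_MB)  # Thunderbird's omni.ja
tor_browser = share_percent(23, SQUASHFS_DIFF_MB)  # Tor Browser's 2 x omni.ja

print(f"Thunderbird: {thunderbird:.0f}%, Tor Browser: {tor_browser:.0f}%")
```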

I’m saying false leads because with the single SquashFS diff scheme, most IUKs will include Thunderbird and Tor Browser upgrades: even if they’re upgrading to a Tails version that itself does not upgrade Thunderbird and Tor Browser, surely we upgraded this software since the initially installed Tails version. So the only case where improving things in this area would make a difference is a user who just installed a brand new Tails (e.g. 4.1) and we publish an emergency release that does not upgrade Thunderbird or Tor Browser (e.g. 4.1.1), which is a very rare corner case. I’m therefore not going to pursue this lead.

#16 Updated by intrigeri 2019-12-25 12:02:57

  • Target version set to Tails_4.2

#17 Updated by intrigeri 2019-12-25 23:08:25

  • Status changed from In Progress to Resolved
  • Assignee deleted (intrigeri)
  • Parent task set to Feature #15281

intrigeri wrote:
> But our website is a different beast: the savings should be higher and we understand the impact much better, i.e. the risk of regressions seems close to zero. So I’m trying the 99-set_mtimes trick there.

Done, and test suite passes. I don’t think it’s worth spending more time on this.