Feature #17099

Use trimage instead of optipng and advancecomp in compress-image.sh

Added by sajolida 2019-09-28 01:51:26. Updated 2019-12-30 13:38:52.

Status:
Resolved
Priority:
Normal
Assignee:
intrigeri
Category:
Target version:
Start date:
Due date:
% Done:

100%

Feature Branch:
contrib/17099-trimage
Type of work:
Code
Blueprint:

Starter:
Affected tool:
Deliverable for:

Description

Tip shared by @xin.

Trimage is available in Debian and can be used on the command line.
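For instance, trimage's command-line mode could be wrapped roughly like this (a sketch: the -f/--file flag is taken from trimage's usage output and may vary across versions, and compress_pngs is a hypothetical helper, not something from the Tails repo):

```shell
# Sketch: batch-process PNGs with trimage's command-line mode.
# compress_pngs is a hypothetical helper; the -f/--file flag is from
# trimage's usage output and may differ across versions.
compress_pngs() {
    if ! command -v trimage >/dev/null 2>&1; then
        echo "trimage is not installed" >&2
        return 1
    fi
    for png in "$@"; do
        trimage -f "$png"
    done
}
```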


Subtasks


Related issues

Blocks Tails - Feature #16711: Core work 2019Q3 → 2019Q4: Technical writing Resolved 2016-01-08

History

#1 Updated by sajolida 2019-09-28 01:54:30

  • blocks Feature #16711: Core work 2019Q3 → 2019Q4: Technical writing added

#2 Updated by sajolida 2019-09-28 16:50:53

  • Tracker changed from Bug to Feature
  • Subject changed from Replace compress-image.sh by trimage to Use trimage instead of optipng and advancecomp in compress-image.sh
  • Status changed from Confirmed to Needs Validation
  • Assignee deleted (sajolida)
  • Feature Branch set to contrib/17099-trimage

When run on /features/images:

                Size (KB)   Reduction
original            996         —
compress-image      980         2%
trimage             896        10%
trimage+mat         952         4%
mat+trimage         916         8%
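A note on methodology: figures like the ones above can be gathered with du; the helper below is a hypothetical sketch, not part of compress-image.sh.

```shell
# Sketch: total size of a directory in KB, as presumably measured above.
# size_kb is a hypothetical helper.
size_kb() {
    du -sk "$1" | cut -f1
}
```

For example, running `size_kb features/images` before and after each tool combination gives the sizes to compare.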

PNGs can contain a whole lot of metadata, so I propose keeping MAT in the loop along with our custom script, instead of using trimage only, even though trimage alone leads to slightly better compression.

See https://dev.exiv2.org/projects/exiv2/wiki/The_Metadata_in_PNG_files.
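The proposed order (MAT first, then trimage) boils down to something like the sketch below. This is hypothetical: the real compress-image.sh in the Tails repository may differ, and mat2's --inplace flag is assumed to be available.

```shell
# Hypothetical sketch of the mat+trimage order proposed above;
# the real compress-image.sh may differ.
compress_image() {
    mat2 --inplace "$1"   # strip metadata first
    trimage -f "$1"       # then recompress losslessly
}
```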

The new script is still deterministic (running it on two copies of the same file produces identical output) and idempotent (running it a second time on an already-processed file doesn't change it).
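Both properties can be checked mechanically; the sketch below tests idempotence for any in-place tool (is_idempotent is a hypothetical helper).

```shell
# Sketch: check that running an in-place tool a second time on a file
# leaves it unchanged. is_idempotent is a hypothetical helper.
is_idempotent() {
    tool="$1"; file="$2"
    "$tool" "$file"
    first=$(sha256sum "$file" | cut -d' ' -f1)
    "$tool" "$file"
    second=$(sha256sum "$file" | cut -d' ' -f1)
    [ "$first" = "$second" ]
}
```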

The branch is based on devel since compress-image.sh has been changed in devel already to adjust to mat2.

#3 Updated by sajolida 2019-09-29 00:28:16

When running the new script on all the images included in the ISO image,
I get a 752 KB reduction on 4788 KB of data. Not bad!

Shall I apply this on devel as well? It’s easy to automate but I’m
worried about the extra pollution in the Git history.

#4 Updated by intrigeri 2019-09-29 14:16:55

  • Assignee set to intrigeri

#5 Updated by intrigeri 2019-09-29 14:28:20

  • Assignee changed from intrigeri to sajolida

Thanks for working on this. The only reason why I did not merge this branch is that I’m slightly worried about running MAT before trimage: as I understand it, the reason why we run MAT in this context is to avoid leaking info about the contributor’s environment. I understand that running MAT before trimage at least protects us against info leaks that happened before one runs compress-image.sh. But it does not protect us from infoleaks caused by trimage itself, while the current version of compress-image.sh does.

I’m undecided at the moment wrt. whether this is a blocker. But I want us to at least consider this problem, which I don’t see mentioned in the Git branch or on this ticket.

What do you think?

> Shall I apply this on devel as well? It’s easy to automate but I’m worried about the extra pollution in the Git history.

Let’s please not do this: the main reason why we’re compressing some images (e.g. test suite) is to make our Git repo smaller to clone. Re-compressing them all would make our Git repo bigger to clone (because the history will contain both the old, bigger images, and the new, smaller ones).

And regarding website images, for the same reason, I’d rather not batch-recompress them all: in general, they’ll get smaller one after the other, as they are updated. If recompressing a few specific images now would make a significant difference (e.g. on the total amount of data needed to load our homepage or /home), then it might be worth doing it for those images, though :)

#6 Updated by sajolida 2019-10-03 17:49:41

  • Assignee changed from sajolida to intrigeri

> […] I’m slightly worried about running MAT before trimage: as I understand it, the reason why we run MAT in this context is to avoid leaking info about the contributor’s environment.

I considered this. I first tried trimage+mat but was surprised to see
that the result was much bigger than with mat only. I then proposed
mat+trimage because it compressed about twice as well overall.

I thought that doing mat+trimage could only reveal small variations in
the version of Trimage used by the contributor, like maybe their Debian
version or something, and that the tradeoff in compression was worth it.

I haven’t tested again because it takes quite some time, but I think
that the only metadata added back by Trimage was about the color palette
on some images. I don’t mind doing more tests and reporting on exactly
what metadata is being added.
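One way to report exactly what gets added back is to diff exiftool’s output before and after processing; here is a sketch (metadata_diff is a hypothetical helper; exiftool is packaged in Debian as libimage-exiftool-perl).

```shell
# Sketch: show which metadata fields a tool changes on a file.
# metadata_diff is a hypothetical helper; exiftool is packaged in
# Debian as libimage-exiftool-perl.
metadata_diff() {
    file="$1"; tool="$2"
    before=$(mktemp); after=$(mktemp)
    exiftool "$file" > "$before"
    "$tool" "$file"
    exiftool "$file" > "$after"
    diff "$before" "$after" || true   # empty output = no metadata change
    rm -f "$before" "$after"
}
```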

On the other hand, the original PNG files can in theory contain a whole
lot more crap. See “Textual information chunks” in the link from my last
comment.

But I don’t care that much and I’m fine if you prefer trimage+mat.

We might as well report this to MAT because it’s pretty weird that
running MAT on PNGs usually makes them bigger :)

> Let’s please not do this: the main reason why we’re compressing some images (e.g. test suite) is to make our Git repo smaller to clone. Re-compressing them all would make our Git repo bigger to clone (because the history will contain both the old, bigger images, and the new, smaller ones).

I asked because I thought that you would care about that :) So I won’t
recompress all our images.

Just so it’s being said: the main reason why I compress our images is to
make our website faster to load, especially for the thousands of daily
Tor and Tails users behind poor Internet connections, and to limit the
growth of our ISO image.

I don’t care as much about the few dozen contributors who might download
our repo each year.

#7 Updated by intrigeri 2019-10-04 06:41:52

  • Assignee changed from intrigeri to sajolida

> I thought that doing mat+trimage could only reveal small variations in the version of Trimage used by the contributor, like maybe their Debian version or something, and that the tradeoff in compression was worth it.

> I haven’t tested again because it takes quite some time, but I think that the only metadata added back by Trimage was about the color palette on some images. I don’t mind doing more tests and reporting on exactly what metadata is being added.

Knowing what metadata trimage leaves or adds on one example image would indeed make me more comfortable with the proposed implementation.

Alternatively, switch to trimage+mat and merge.

> Just so it’s being said: the main reason why I compress our images is to make our website faster to load, especially for the thousands of daily Tor and Tails users behind poor Internet connections, and to limit the growth of our ISO image.
> I don’t care as much about the few dozen contributors who might download our repo each year.

Yeah, I can agree with this. I’m trying to balance these factors but it looks like I’m weighing them slightly differently than you are.

I do care about the performance of our website. Apart from the work I put into this on the server side, a few years ago I did some measurements and applied some targeted improvements in this area, e.g. commit:74e340ce6af1a002e907aac8e57b1f663015f33f and commit:d2cf083f943a6370138d828f41c94e955879ae64. As a result, we now have a decent score on Google PageSpeed, and none of the issues and suggestions this tool raises are about the size of our images. That’s why I’m wondering whether recompressing images will really make a measurable difference in website visitors’ UX. I suspect that it won’t provide the improvements you’re expecting, and that there are other, much more efficient ways to make our website faster to load.

This being said, recompressing images certainly can’t hurt website performance and all in all, I’m fine if you do it. Just don’t expect too much out of optimizing aspects of our website that are not the actual performance culprits :)

#8 Updated by jvoisin 2019-10-13 18:04:32

It’s kind of expected for highly-compressed images to become larger when processed by mat2: mat2 is a bit paranoid, and renders the picture on a cairo surface before saving it to an image again, and it seems that cairo doesn’t compress as efficiently as optipng/pngcrush do.

The good news is that for Tails’ website, I think it’s ok to do mat2 then trimage:

- it’s no secret that Tails contributors like Debian, so even if it were possible to fingerprint an operating system from a trimage-processed image alone, it wouldn’t leak much to an attacker
- mat2 removes all metadata, and since trimage only compresses pictures, it shouldn’t add new metadata: I just checked a couple of optipng-compressed images, and everything looks good.

As a side-but-related note, taking a look at GTmetrix will show low-hanging optimization fruit: e.g. currently, t.b.o doesn’t make much use of browser caching :/

#9 Updated by intrigeri 2019-10-14 08:31:04

  • Assignee changed from sajolida to intrigeri

> The good news is that for Tails’ website, I think it’s ok to do mat2 then trimage:

You’re the expert ⇒ that’s good enough for me! I’ll do one last check and will merge.

#10 Updated by intrigeri 2019-10-14 12:23:32

  • Target version changed from Tails_4.0 to Tails_4.1

Gah, I’ve just tried to use the updated script and of course it wants me to install trimage, which made me notice that the version currently in Debian uses Qt4 (which had thankfully been purged from my system so far) and was removed from Debian testing a few weeks ago for this very reason: https://bugs.debian.org/935352.

Thankfully, there’s a new upstream version (1.0.6) that was ported to Qt5, so there’s hope. But the Debian maintainers seem kinda MIA and they did not upload it yet.

So I’d rather not merge this branch before that version is uploaded to Debian: I don’t really want to end up in the situation where in a year, we have to maintain trimage in Debian ourselves, because we’ve started relying on it today. I’ll come back to it during our next development cycle and we’ll see how reliable trimage is as a dependency.

Sorry I did not look at this aspect of things the first time I looked at this ticket!

#11 Updated by intrigeri 2019-10-24 07:49:25

  • Status changed from Needs Validation to In Progress
  • Type of work changed from Contributors documentation to Wait

No progress on the Debian front in 2 months.

#12 Updated by intrigeri 2019-10-31 18:08:50

  • Target version changed from Tails_4.1 to Tails_4.2

I’ll take another look during the 4.2 dev cycle.

#13 Updated by intrigeri 2019-12-01 10:51:58

  • Target version changed from Tails_4.2 to Tails_4.3

No reply from maintainers so far.

#14 Updated by intrigeri 2019-12-30 13:35:31

  • Target version changed from Tails_4.3 to Tails_4.2
  • Type of work changed from Wait to Code

OK, so a Qt5 version was uploaded to sid.

The discussion on https://bugs.debian.org/935352 is worrying though: apparently the official maintainer of the package lacks the skills to do things like updating to a new upstream release. So if things go south again, we may face the choice between reverting to optipng + advancecomp, or helping with the maintenance of trimage in Debian.

Worst case, reverting will be trivial so we have an emergency exit ⇒ I’m going to merge this branch.

#15 Updated by intrigeri 2019-12-30 13:37:10

  • Status changed from In Progress to Resolved
  • % Done changed from 0 to 100

Applied in changeset commit:tails|13e2807bae57ae09bdfc7f58774edfe61a1f0829.

#16 Updated by intrigeri 2019-12-30 13:38:52

> As a side-but-related note, taking a look at GTmetrix will show low-hanging optimization fruit: e.g. currently, t.b.o doesn’t make much use of browser caching :/

Thanks! This is tracked on Feature #16091.