Feature #15702

Adjust to uBlock web extension

Added by intrigeri 2018-07-03 12:54:41 . Updated 2018-08-14 05:48:18 .

Status:
Resolved
Priority:
Elevated
Assignee:
Category:
Target version:
Start date:
Due date:
2018-08-09
% Done:

100%

Feature Branch:
feature/15023-tor-browser-8
Type of work:
Code
Blueprint:

Starter:
Affected tool:
Browser
Deliverable for:

Description

We need to update how we generate its database as the format has changed (see https://bugs.debian.org/877040 for details and useful pointers).

  • Drop bin/convert-ublock-settings if it’s not needed (i.e. if it does not actually save space in .git)
  • Update release process and bin/convert-ublock-settings if needed.

Subtasks


Related issues

Blocks Tails - Feature #15334: Core work 2018Q3: Foundations Team Resolved 2018-02-20

History

#1 Updated by intrigeri 2018-07-03 12:54:50

#2 Updated by intrigeri 2018-07-03 13:33:16

  • Description updated

#3 Updated by intrigeri 2018-07-03 13:33:38

  • Description updated

#4 Updated by intrigeri 2018-07-03 13:36:49

  • Assignee set to segfault
  • Start date changed from 2018-07-03 to 2018-08-09
  • Estimated time set to 2 h

2h for the initial research and testing. Hopefully it’ll be enough to reach a Ready for QA state, if not, by then we’ll know what needs to be done and will be able to update the time allocation.

#5 Updated by intrigeri 2018-07-03 13:39:46

  • Priority changed from Normal to Elevated

#6 Updated by intrigeri 2018-07-03 13:41:06

  • Due date set to 2018-08-09
  • Start date deleted (2018-08-09)

#7 Updated by segfault 2018-08-08 15:27:00

> Drop bin/convert-ublock-settings if it’s not needed (i.e. if it does not actually save space in .git)

It seems like it does not actually save space in .git. Here is how I tested it:

  • Update the rules of the uBlock Origin web extension in Tor Browser 8
  • Copy the storage.js file (the JSON file which now contains the uBO rules) to a new, empty git repository, and commit the file
  • Create another empty git repository, commit there a copy of storage.js in which all “},” are replaces with “},\n” (this causes the file to have 67 lines instead of 1)
  • Remove two lines, and slightly change three other lines, and commit the result
  • Commit the same modified file in the first repository, but with all newlines removed
  • Run du -bs on both repositories

Results:

135464  without_newlines
139697  with_newlines

So the repository without newlines is actually a bit smaller. FWIW, the storage.js file itself has a size of 24508 Bytes.

#8 Updated by segfault 2018-08-08 15:29:01

segfault wrote:
> * Remove two lines, and slightly change three other lines, and commit the result

This is probably quite different from what an actual diff between two versions of this file looks like, but I don’t have another version of it to test this :/

#9 Updated by intrigeri 2018-08-08 15:36:18

> This is probably quite different from what an actual diff between two versions of this file looks like, but I don’t have another version of it to test this :/

I don’t know what’s the frequency of updates of the lists we ship. If at least some of them are updated daily, perhaps it’s worth waiting a day to test with actual data?

#10 Updated by intrigeri 2018-08-08 15:36:35

> It seems like it does not actually save space in .git.

Sounds promising!

#11 Updated by segfault 2018-08-08 15:45:30

  • Feature Branch set to segfault:feature/15023-tor-browser-8

The uBlock Origin addon id is “uBlock0@raymondhill.net”, not “uBlock0-webext@raymondhill.net” (maybe that was changed recently?). I fixed this in my branch of Feature #15702.

#12 Updated by segfault 2018-08-08 15:52:55

Note that the new rules file is smaller by an order of magnitude: Only 25K instead of 12M. I don’t know the reason for that.

#13 Updated by segfault 2018-08-08 15:56:15

segfault wrote:
> Note that the new rules file is smaller by an order of magnitude: Only 25K instead of 12M. I don’t know the reason for that.

If the file stays that small, I wonder whether it’s still worth it to update it manually during the release process instead of just letting uBlock Origin download it when Tor Browser is started.

#14 Updated by segfault 2018-08-09 08:53:04

  • Assignee changed from segfault to intrigeri
  • QA Check set to Ready for QA

#15 Updated by intrigeri 2018-08-09 09:19:40

  • Status changed from Confirmed to In Progress
  • Assignee changed from intrigeri to segfault
  • % Done changed from 0 to 60
  • QA Check changed from Ready for QA to Dev Needed

Looks good!

It looks like there are some obsolete leftovers though (git grep is your friend):

  • bin/convert-ublock-settings
  • the note about config/chroot_local-includes/usr/share/tails/ublock-origin/ublock0.dump in the release process doc
  • rm -rf /usr/share/tails/ublock-origin/ in config/chroot_local-hooks/98-remove_unwanted_files

I’ll merge this into my feature/15023-tor-browser-8 branch anyway after building+testing but let’s please solve this before we merge this branch into devel.

#16 Updated by segfault 2018-08-09 10:35:25

intrigeri wrote:
> Looks good!
>
> It looks like there are some obsolete leftovers though (git grep is your friend):
>
> * bin/convert-ublock-settings
> * the note about config/chroot_local-includes/usr/share/tails/ublock-origin/ublock0.dump in the release process doc
> * rm -rf /usr/share/tails/ublock-origin/ in config/chroot_local-hooks/98-remove_unwanted_files
>
> I’ll merge this into my feature/15023-tor-browser-8 branch anyway after building+testing but let’s please solve this before we merge this branch into devel.

Done in 0ff97b01c8cc20856fd47ef5d48ce9dc87269046.

#17 Updated by segfault 2018-08-09 10:48:26

  • Assignee changed from segfault to intrigeri
  • QA Check changed from Dev Needed to Ready for QA

#18 Updated by intrigeri 2018-08-09 11:30:23

  • Assignee changed from intrigeri to segfault
  • % Done changed from 60 to 70
  • QA Check changed from Ready for QA to Info Needed

LGTM, merged!

Now, it seems that the config you’ve saved has changed compared to what we had in 3.8:

  • I notice that “Auto-update filter lists” is enabled, while we’ve historically disabled it.
  • A few lists are enabled, that we previously did not enable.
  • Two regional lists are disabled, that we previously enaled (maybe this partly explains why the resulting file is smaller?)

Are these changes made on purpose? I think some of them (e.g. removing the regional lists) might make sense but at least it would be good to know if it’s on purpose and to check the impact.

#19 Updated by segfault 2018-08-09 12:00:31

  • Assignee changed from segfault to intrigeri

intrigeri wrote:
> LGTM, merged!
>
> Now, it seems that the config you’ve saved has changed compared to what we had in 3.8:
>
> * I notice that “Auto-update filter lists” is enabled, while we’ve historically disabled it.
> * A few lists are enabled, that we previously did not enable.
> * Two regional lists are disabled, that we previously enaled (maybe this partly explains why the resulting file is smaller?)
>
> Are these changes made on purpose? I think some of them (e.g. removing the regional lists) might make sense but at least it would be good to know if it’s on purpose and to check the impact.

No, it was not on purpose. I didn’t change anything in the file, I just did exactly what I described in the release process, i.e. copy the storage.js from Tor Browser. And I don’t see where we had previously changed any uBlock settings, I thought we were just using the default settings. Did I miss something?

I now noticed that there was this line in bin/convert-ublock-settings, which I don’t understand:

sed "/^INSERT INTO \"settings\" VALUES('\(remoteBlacklists\|cached_asset_entries\)'/"'s_,_,\r\n_g'

#20 Updated by intrigeri 2018-08-09 12:12:01

  • Assignee changed from intrigeri to segfault
  • QA Check changed from Info Needed to Dev Needed

> No, it was not on purpose […]

OK, so we’re back to the default settings then.

> And I don’t see where we had previously changed any uBlock settings, I thought we were just using the default settings. Did I miss something?

IIRC our settings used to be in the sqlite DB that you’ve replaced. Look at its Git history, there’s probably something there about when the settings were moved to sqlite.
And now the settings seem to be at the beginning of the new storage.js file.

> I now noticed that there was this line in bin/convert-ublock-settings, which I don’t understand:

>

> sed "/^INSERT INTO \"settings\" VALUES('\(remoteBlacklists\|cached_asset_entries\)'/"'s_,_,\r\n_g'
> 

I don’t understand it either (which is a reason why I generally oppose writing complex undocumented regexps without the “x” flag that allows inline commenting, at least in Perl). I don’t know if that’s related.

#21 Updated by segfault 2018-08-09 12:28:01

  • Assignee changed from segfault to intrigeri
  • QA Check changed from Dev Needed to Ready for QA

intrigeri wrote:
> > No, it was not on purpose […]
>
> OK, so we’re back to the default settings then.
>
> > And I don’t see where we had previously changed any uBlock settings, I thought we were just using the default settings. Did I miss something?
>
> IIRC our settings used to be in the sqlite DB that you’ve replaced. Look at its Git history, there’s probably something there about when the settings were moved to sqlite.
> And now the settings seem to be at the beginning of the new storage.js file.

The sqlite DB didn’t exist in the Tails repo, it was created from the .dump file by config/chroot_local-hooks/12-generate-ublock-origin-filter. And since the dump file was replaced during each release process with the current uBlock Origin defaults, I don’t see how any non-default settings would be stored - except by the bin/convert-ublock-settings, but it didn’t set any of the settings you noticed.

#22 Updated by intrigeri 2018-08-09 13:24:34

  • Assignee changed from intrigeri to segfault
  • QA Check changed from Ready for QA to Dev Needed

> The sqlite DB didn’t exist in the Tails repo, […]

Yeah, sorry: I meant the settings were in the text dump of the DB.

> And since the dump file was replaced during each release process with the current uBlock Origin defaults,

I believe this is wrong: instead, the dump file was replaced during each release process with a new dump produced by combining:

  • the version of the DB (generated from the dump) that was in Git previously
  • the filter lists updates

Did I miss anything that makes you think the dump file was “replaced during each release process with the current uBlock Origin defaults”?

> I don’t see how any non-default settings would be stored

The way I understand the update process, I think our custom settings did survive it just fine. Now, if you’re not convinced (and you’d be right, I might be guessing wrong and the settings we have in 3.8 may be the default upstream ones), it’s easy to check: boot 3.8, delete the DB, start Tor Browser, check the settings in the uBlock dashboard :)

FTR our custom settings came from adblock plus and were removed in commit:659ea1bf89f82e0807f4f455cbd581444b037b96. I believe spriver adequately ported them to uBlock when she did the conversion. Feature #9833 has further details.

#23 Updated by intrigeri 2018-08-10 07:54:14

  • Assignee changed from segfault to intrigeri

I’ll give a try to my hypothesis.

#24 Updated by intrigeri 2018-08-10 08:11:43

First, I’ve noticed that storage.js contains only the list of available filter lists and the settings (e.g. which lists are enabled) but it does not contain the actual filters. Looks like we now know why it’s vastly smaller than our previous DB dump.

#25 Updated by intrigeri 2018-08-10 09:12:42

  • Assignee changed from intrigeri to segfault
  • QA Check changed from Dev Needed to Info Needed

I’ve tried updating the list immediately before closing the browser, and then:

  • I see no relevant file modified in the profile directory.
  • If I restart the browser, in the uBlock dashboard, all lists are flagged as “Out of date”.

Same in a pristine Tor Browser 8.0a9 outside of Tails. In comparison, on a regular Firefox 60 or in a Tor Browser 7.5.6 tweaked to allow writing to disk, I see files updated in storage/default/moz-extension+++323a5b72-cf52-4841-a85e-29e95534d12c/idb/ when I update the lists.

So I think that the uBlock webext uses a storage mechanism that Tor Browser disables by default and as a consequence, only the filters shipped with the add-on (/usr/share/webext/ublock-origin/assets) are used unless one updates the filter lists manually. The most important one (ublock/filters.txt) is marked for expiration after 4 days so it does not matter much whether we update it at ISO build time or not.

At this point, IMO it’s not worth bothering more. I think we should:

  • use the default filter lists (whose content is shipped with the add-on) and only them
  • drop the entire “update the filters list and save them in the ISO” process
  • consider disabling the default filter lists we did not enable until Tails 3.8

The only downsides I can think of are:

  • the filter lists we ship in the ISO (coming straight from the Debian package) can be as old as the upstream release the package was built from, instead of as old as the ISO; given the latter is 0-8 weeks, I doubt adding a few more weeks has much impact
  • we lose the additional filter lists we previously enabled (EasyList DE and FR); arguably it makes little sense in 2018 to special-case DE and FR in Tails so I think it’s a reasonable move

What do you think?

The only alternate option I can think of is enabling auto-update. I’d rather not do that because 1. calling home always requires more thinking than that; 2. that’s lots of data to download every time one starts Tails, which can be painful on slow connections/circuits.

#26 Updated by intrigeri 2018-08-10 09:13:40

If you agree with my proposal: either you can implement it by the end of the week and then go for it; or just let me know and I’ll implement it today.

#27 Updated by segfault 2018-08-12 10:09:42

intrigeri wrote:
> > And since the dump file was replaced during each release process with the current uBlock Origin defaults,
>
> I believe this is wrong: instead, the dump file was replaced during each release process with a new dump produced by combining:
>
> * the version of the DB (generated from the dump) that was in Git previously
> * the filter lists updates
>
> Did I miss anything that makes you think the dump file was “replaced during each release process with the current uBlock Origin defaults”?

Doesn’t step 7 from the uBlock patterns and settings file section of the release process doc replace the dump?

7. Copy the `.tor-browser/profile.default/extension-data/ublock0.sqlite`
   from this Tor Browser instance into the root of Tails' Git repo and
   run the following command:

        ./bin/convert-ublock-settings ublock0.sqlite > \
            config/chroot_local-includes/usr/share/tails/ublock-origin/ublock0.dump

#28 Updated by segfault 2018-08-12 10:35:46

  • QA Check changed from Info Needed to Dev Needed

intrigeri wrote:
> First, I’ve noticed that storage.js contains only the list of available filter lists and the settings (e.g. which lists are enabled) but it does not contain the actual filters. Looks like we now know why it’s vastly smaller than our previous DB dump.

Ouch. I almost suspected something like that, but I didn’t thoroughly check.

> I’ve tried updating the list immediately before closing the browser, and then:
>
> * I see no relevant file modified in the profile directory.
> * If I restart the browser, in the uBlock dashboard, all lists are flagged as “Out of date”.
>
> Same in a pristine Tor Browser 8.0a9 outside of Tails. In comparison, on a regular Firefox 60 or in a Tor Browser 7.5.6 tweaked to allow writing to disk, I see files updated in storage/default/moz-extension+++323a5b72-cf52-4841-a85e-29e95534d12c/idb/ when I update the lists.
>
> So I think that the uBlock webext uses a storage mechanism that Tor Browser disables by default and as a consequence, only the filters shipped with the add-on (/usr/share/webext/ublock-origin/assets) are used unless one updates the filter lists manually.

I agree on that analysis.

> The most important one (ublock/filters.txt) is marked for expiration after 4 days so it does not matter much whether we update it at ISO build time or not.
>
> At this point, IMO it’s not worth bothering more. I think we should:
>
> * use the default filter lists (whose content is shipped with the add-on) and only them
> * drop the entire “update the filters list and save them in the ISO” process
> * consider disabling the default filter lists we did not enable until Tails 3.8
>
> The only downsides I can think of are:
>
> * the filter lists we ship in the ISO (coming straight from the Debian package) can be as old as the upstream release the package was built from, instead of as old as the ISO; given the latter is 0-8 weeks, I doubt adding a few more weeks has much impact
> * we lose the additional filter lists we previously enabled (EasyList DE and FR); arguably it makes little sense in 2018 to special-case DE and FR in Tails so I think it’s a reasonable move

I remember that in Adblock Plus times, I always had to add the regional EasyList to make the adblocker work well on non-english sites, but that doesn’t seem to be the case anymore, so I don’t think it would be a problem if we lose those.

> The only alternate option I can think of is enabling auto-update. I’d rather not do that because 1. calling home always requires more thinking than that; 2. that’s lots of data to download every time one starts Tails, which can be painful on slow connections/circuits.

I think we could reconsider this if the outdated lists turn out to be a problem, which I don’t expect.

> If you agree with my proposal: either you can implement it by the end of the week and then go for it; or just let me know and I’ll implement it today.

I agree. I can implement it, but I’m also quite busy with other stuff, so you can do it too if you want to.

#29 Updated by intrigeri 2018-08-12 10:37:37

  • Assignee changed from segfault to intrigeri
  • QA Check deleted (Dev Needed)

#30 Updated by intrigeri 2018-08-13 09:32:05

> Doesn’t step 7 from the uBlock patterns and settings file section of the release process doc replace the dump?

It replaces the dump with an updated version of itself based on the previous version, which includes our settings.

#31 Updated by intrigeri 2018-08-13 11:50:03

  • Assignee changed from intrigeri to segfault
  • % Done changed from 70 to 80
  • QA Check set to Ready for QA
  • Feature Branch changed from segfault:feature/15023-tor-browser-8 to feature/15023-tor-browser-8

Works fine here. Please review my last commits and if you’re happy enough with them I’ll merge this branch into devel.

#32 Updated by segfault 2018-08-13 21:45:30

  • Assignee changed from segfault to intrigeri
  • QA Check changed from Ready for QA to Pass

intrigeri wrote:
> > Doesn’t step 7 from the uBlock patterns and settings file section of the release process doc replace the dump?
>
> It replaces the dump with an updated version of itself based on the previous version, which includes our settings.

Oh. Right. [hides in shame]

> Works fine here. Please review my last commits and if you’re happy enough with them I’ll merge this branch into devel.

LGTM. Merge merge merge!

#33 Updated by intrigeri 2018-08-14 05:48:18

  • Status changed from In Progress to Resolved
  • Assignee deleted (intrigeri)
  • % Done changed from 80 to 100