Bug #12741

/lib/modules/*/modules.* not reproducible in some environments

Added by anonym 2017-06-19 10:21:56. Updated 2017-08-24 15:28:34.

Status:
Resolved
Priority:
Normal
Assignee:
Category:
Build system
Target version:
Start date:
2017-06-19
Due date:
% Done:

100%

Feature Branch:
Type of work:
Research
Blueprint:

Starter:
Affected tool:
Deliverable for:
289

Description

See Feature #12608#note-18 for arnaud’s diffoscope report. There are also .torrent files for the good ISO and for arnaud’s.

Ignoring the .bin files, diffoscope shows something interesting:

├── /lib/modules/4.9.0-3-amd64/modules.alias
│ │ @@ -19383,8 +19383,7 @@
│ │  alias vport-type-3 vport_gre
│ │  alias net-pf-40 vmw_vsock_vmci_transport
│ │  alias vmware_vsock vmw_vsock_vmci_transport
│ │  alias virtio:d00000013v* vmw_vsock_virtio_transport
│ │  alias fs-vboxsf vboxsf
│ │  alias pci:v000080EEd0000BEEFsv*sd*bc*sc*i* vboxvideo
│ │  alias pci:v000080EEd0000CAFEsv00000000sd00000000bc*sc*i* vboxguest
│ │ -alias fs-aufs aufs
[...]
├── /lib/modules/4.9.0-3-amd64/modules.dep
│ │ @@ -3395,8 +3395,7 @@
│ │  kernel/lib/mpi/mpi.ko:
│ │  kernel/lib/asn1_decoder.ko:
│ │  kernel/lib/oid_registry.ko:
│ │  kernel/virt/lib/irqbypass.ko:
│ │  updates/vboxsf.ko: updates/vboxguest.ko
│ │  updates/vboxvideo.ko: updates/vboxguest.ko kernel/drivers/gpu/drm/ttm/ttm.ko kernel/drivers/gpu/drm/drm_kms_helper.ko kernel/drivers/gpu/drm/drm.ko
│ │  updates/vboxguest.ko:
│ │ -kernel/fs/aufs/aufs.ko:

I wonder: is this what you’d expect if depmod wasn’t run after the aufs module was built?
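One quick way to check that hypothesis (a sketch, assuming the kernel version and paths from the diff above): depmod(8) regenerates the modules.* indexes from the .ko files on disk, so a module that was installed without a subsequent depmod run would be missing from modules.dep exactly like this.

```shell
# Sketch: check whether a module is indexed in modules.dep, which
# depmod(8) rebuilds from the installed .ko files. A module built and
# installed without a later depmod run would be absent from this index
# even though its .ko file exists on disk.
module_indexed() {
    # $1 = path to a modules.dep file, $2 = module name (e.g. "aufs")
    grep -q "/${2}\.ko:" "$1"
}

deps=/lib/modules/4.9.0-3-amd64/modules.dep
if [ -r "$deps" ] && module_indexed "$deps" aufs; then
    echo "aufs is indexed by depmod"
else
    echo "aufs is missing from modules.dep"
fi
```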


Files


Subtasks


Related issues

Blocked by Tails - Bug #13480: The Vagrant VM has too little memory for disk builds Resolved 2017-07-17

History

#1 Updated by anonym 2017-06-19 10:24:25

  • Assignee set to lamby
  • QA Check set to Info Needed

Could you have a look?

#2 Updated by anonym 2017-06-19 10:58:17

  • Description updated

#3 Updated by anonym 2017-06-19 11:01:10

arnaud, do you still have the .buildlog file from the image you built (the .iso’s SHA-256 is acbe13f4e88b7e2d9622d75a1e834cf81a38e30992559756d9890d81e6f6c14a)? If so, please attach it to this ticket. It’d be interesting to see whether something unusual was logged when the aufs module was built.

#4 Updated by arnaud 2017-06-19 11:15:53

Yes, I haven’t touched anything since the day I built. I can confirm that the ISO’s SHA-256 didn’t change and is acbe13f4e88b7e2d9622d75a1e834cf81a38e30992559756d9890d81e6f6c14a. Let me attach all the output files here.

#5 Updated by anonym 2017-06-19 12:26:28

  • Assignee changed from lamby to anonym
  • QA Check deleted (Info Needed)

So this is the excerpt of arnaud’s .buildlog from when config/chroot_local-hooks/50-dkms runs:

[...]
Loading new virtualbox-guest-5.1.22 DKMS files...
It is likely that 4.9.0-0.bpo.2-amd64 belongs to a chroot's host
Building for 4.9.0-3-amd64
Building initial module for 4.9.0-3-amd64
Error! Bad return status for module build on kernel: 4.9.0-3-amd64 (x86_64)
Consult /var/lib/dkms/virtualbox-guest/5.1.22/build/make.log for more information.
Setting up linux-headers-4.9.0-3-common (4.9.25-1) ...
Setting up linux-compiler-gcc-6-x86 (4.9.25-1) ...
Setting up linux-kbuild-4.9 (4.9.25-1) ...
Setting up aufs-dkms (4.9+20161219-1) ...
Loading new aufs-4.9+20161219 DKMS files...
It is likely that 4.9.0-0.bpo.2-amd64 belongs to a chroot's host
Building for 4.9.0-3-amd64
Building initial module for 4.9.0-3-amd64
Error! Bad return status for module build on kernel: 4.9.0-3-amd64 (x86_64)
Consult /var/lib/dkms/aufs/4.9+20161219/build/make.log for more information.
Setting up linux-headers-4.9.0-3-amd64 (4.9.25-1) ...
/etc/kernel/header_postinst.d/dkms:
Error! Bad return status for module build on kernel: 4.9.0-3-amd64 (x86_64)
Consult /var/lib/dkms/aufs/4.9+20161219/build/make.log for more information.

Kernel preparation unnecessary for this kernel.  Skipping...

Building module:
cleaning build area...
make -j8 KERNELRELEASE=4.9.0-3-amd64 -C /lib/modules/4.9.0-3-amd64/build M=/var/lib/dkms/virtualbox-guest/5.1.22/build......
cleaning build area...

DKMS: build completed.

vboxguest:
Running module version sanity check.

Good news! Module version 5.1.22_Debian for vboxguest.ko
exactly matches what is already found in kernel 4.9.0-3-amd64.
DKMS will not replace this module.
You may override by specifying --force.

vboxsf.ko:
Running module version sanity check.

vboxvideo.ko:
Running module version sanity check.

Good news! Module version 5.1.22_Debian for vboxsf.ko
exactly matches what is already found in kernel 4.9.0-3-amd64.
DKMS will not replace this module.
You may override by specifying --force.

Good news! Module version 5.1.22_Debian for vboxvideo.ko
exactly matches what is already found in kernel 4.9.0-3-amd64.
DKMS will not replace this module.
You may override by specifying --force.

depmod...

DKMS: install completed.

Let’s “fix” this by adding a sanity check to that build hook, making sure there were no errors during the DKMS run; otherwise we abort the build. If we get reports of builds failing because of this sanity check, then we can actually spend time investigating why this can fail at all.
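A minimal sketch of such a sanity check (hypothetical function and module names; the real hook knows its own kernel version and module list):

```shell
# Sketch of a sanity check for the 50-dkms hook: fail loudly if a
# DKMS-built module did not actually get installed, instead of letting
# the build continue and silently produce a different image.
check_dkms_modules() {
    # $1 = the kernel's updates/ directory, remaining args = module names
    dir="$1"; shift
    for module in "$@"; do
        if [ ! -e "${dir}/${module}.ko" ]; then
            echo "DKMS sanity check failed: ${module}.ko missing in ${dir}" >&2
            return 1
        fi
    done
    echo "DKMS sanity check passed"
}

# In the hook itself this would abort the build on failure, e.g.:
# check_dkms_modules "/lib/modules/${KERNEL_VERSION}-amd64/updates" \
#     aufs vboxguest vboxsf vboxvideo || exit 1
```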

arnaud, if it’s easy for you, can you rebuild 3.0~rc2 and see if you get the same ISO image as you did before, and check in the .buildlog whether DKMS failed as above? It’d also be interesting if you built 3.0 (expected SHA-256: 676f1322166536dc1e27b8db22462ae73f0891888cfcb09033ebc38f586e834a). In both cases, just checking out the Git tag (e.g. git checkout 3.0-rc2) and then building as usual should do the trick.

#6 Updated by arnaud 2017-06-19 13:15:39

Ok, I will rebuild 3.0-rc2 tonight and let you know what I get.

#7 Updated by arnaud 2017-06-20 00:14:59

Here’s the result of tonight’s build.

First, just to be sure, I checked where I was in the git maze. I’m on the branch feature/3.0-fake, and at the tag 3.0-rc2. Last commit is: 1678c015e8 Let's pretend we just released Tails 3.0~rc2 instead.

Then, I issued a git clean -dfx to ensure a clean state. This wiped out the result of my previous build, which was not very smart. But anyway, I had already uploaded it all to you.

Ok, so here we go for the SHA!

$ sha256sum tails-amd64-3.0~rc2.iso
b81a8b305b59446f93e05e5e793268272f3355bd50b7e5d065c63f74ec9c744a tails-amd64-3.0~rc2.iso

It seems to be a new SHA…

And here’s the snippet related to DKMS.

Loading new virtualbox-guest-5.1.22 DKMS files...
It is likely that 4.9.0-0.bpo.2-amd64 belongs to a chroot's host
Building for 4.9.0-3-amd64
Building initial module for 4.9.0-3-amd64
Done.

vboxguest:
Running module version sanity check.
 - Original module
   - No original module exists within this kernel
 - Installation
   - Installing to /lib/modules/4.9.0-3-amd64/updates/

vboxsf.ko:
Running module version sanity check.
 - Original module
   - No original module exists within this kernel
 - Installation
   - Installing to /lib/modules/4.9.0-3-amd64/updates/

vboxvideo.ko:
Running module version sanity check.
 - Original module
   - No original module exists within this kernel
 - Installation
   - Installing to /lib/modules/4.9.0-3-amd64/updates/

depmod.....

DKMS: install completed.
Setting up linux-headers-4.9.0-3-common (4.9.25-1) ...
Setting up linux-compiler-gcc-6-x86 (4.9.25-1) ...
Setting up linux-kbuild-4.9 (4.9.25-1) ...
Setting up aufs-dkms (4.9+20161219-1) ...
Loading new aufs-4.9+20161219 DKMS files...
It is likely that 4.9.0-0.bpo.2-amd64 belongs to a chroot's host
Building for 4.9.0-3-amd64
Building initial module for 4.9.0-3-amd64
Error! Bad return status for module build on kernel: 4.9.0-3-amd64 (x86_64)
Consult /var/lib/dkms/aufs/4.9+20161219/build/make.log for more information.
Setting up linux-headers-4.9.0-3-amd64 (4.9.25-1) ...
/etc/kernel/header_postinst.d/dkms:
Error! Bad return status for module build on kernel: 4.9.0-3-amd64 (x86_64)
Consult /var/lib/dkms/aufs/4.9+20161219/build/make.log for more information.

Kernel preparation unnecessary for this kernel.  Skipping...

Building module:
cleaning build area...
make -j8 KERNELRELEASE=4.9.0-3-amd64 -C /lib/modules/4.9.0-3-amd64/build M=/var/lib/dkms/virtualbox-guest/5.1.22/build......
cleaning build area...

DKMS: build completed.

vboxguest:
Running module version sanity check.

Good news! Module version 5.1.22_Debian for vboxguest.ko
exactly matches what is already found in kernel 4.9.0-3-amd64.
DKMS will not replace this module.
You may override by specifying --force.

vboxsf.ko:
Running module version sanity check.

Good news! Module version 5.1.22_Debian for vboxsf.ko
exactly matches what is already found in kernel 4.9.0-3-amd64.
DKMS will not replace this module.
You may override by specifying --force.

vboxvideo.ko:
Running module version sanity check.

Good news! Module version 5.1.22_Debian for vboxvideo.ko
exactly matches what is already found in kernel 4.9.0-3-amd64.
DKMS will not replace this module.
You may override by specifying --force.

depmod...

DKMS: install completed.

I attach here the files. Let me know if you also want the iso.

#8 Updated by intrigeri 2017-06-22 13:49:26

#9 Updated by lamby 2017-06-24 12:12:13

AIUI the problem is indeed that sometimes the aufs module doesn’t build successfully:

 Preparing to unpack .../0-linux-kbuild-4.9_4.9.25-1_amd64.deb ...
@@ -12819,8 +12819,32 @@
 It is likely that 4.9.0-0.bpo.2-amd64 belongs to a chroot's host
 Building for 4.9.0-3-amd64
 Building initial module for 4.9.0-3-amd64
-Error! Bad return status for module build on kernel: 4.9.0-3-amd64 (x86_64)
-Consult /var/lib/dkms/virtualbox-guest/5.1.22/build/make.log for more information.
+Done.
+
+vboxguest:
+Running module version sanity check.
+ - Original module
+   - No original module exists within this kernel
+ - Installation
+   - Installing to /lib/modules/4.9.0-3-amd64/updates/
+
+vboxsf.ko:
+Running module version sanity check.
+ - Original module
+   - No original module exists within this kernel
+ - Installation
+   - Installing to /lib/modules/4.9.0-3-amd64/updates/
+
+vboxvideo.ko:
+Running module version sanity check.
+ - Original module
+   - No original module exists within this kernel
+ - Installation
+   - Installing to /lib/modules/4.9.0-3-amd64/updates/
+
+depmod.....
+
+DKMS: install completed.
 Setting up linux-headers-4.9.0-3-common (4.9.25-1) ...
 Setting up linux-compiler-gcc-6-x86 (4.9.25-1) ...
 Setting up linux-kbuild-4.9 (4.9.25-1) ...
@@ -12856,14 +12880,14 @@
 vboxsf.ko:
 Running module version sanity check.

-vboxvideo.ko:
-Running module version sanity check.
-
 Good news! Module version 5.1.22_Debian for vboxsf.ko
 exactly matches what is already found in kernel 4.9.0-3-amd64.
 DKMS will not replace this module.
 You may override by specifying --force.

+vboxvideo.ko:
+Running module version sanity check.
+
 Good news! Module version 5.1.22_Debian for vboxvideo.ko
 exactly matches what is already found in kernel 4.9.0-3-amd64.
 DKMS will not replace this module.

We should try and get hold of a /var/lib/dkms/virtualbox-guest/5.1.22/build/make.log

#10 Updated by anonym 2017-07-07 20:06:06

lamby wrote:
> We should try and get hold of a /var/lib/dkms/virtualbox-guest/5.1.22/build/make.log

Absolutely! arnaud, if you are still up for it, can you please try the following:

Try building e.g. the stable branch. If DKMS fails again, then apply this diff (or something similar), commit and build again:

--- a/config/chroot_local-hooks/50-dkms
+++ b/config/chroot_local-hooks/50-dkms
@@ -25,6 +25,10 @@ dkms install \
     -a amd64 -k "${KERNEL_VERSION}-amd64" \
     -m virtualbox-guest -v "$MODULES_VERSION"

+echo
+echo PAUSED
+while true; do sleep 1; done
+
 # clean the build directory
 # rm -r /var/lib/dkms/virtualbox-guest/

I.e. the build will pause after DKMS attempted (successfully or not) to build the module. So once you’ve seen that the build has paused, ssh into the VM with rake vm:ssh and have a look for something like

/tmp/tails-build*/chroot/var/lib/dkms/virtualbox-guest/5.1.22/build/make.log

(Something lazy like find /tmp/tails-build* -name make.log will probably be enough)

Attach that log to this ticket!

#11 Updated by arnaud 2017-07-10 03:41:39

Hey,

I tried to build stable as suggested. The good news is that DKMS fails reliably. The bad news is that the logs don’t really help. They are attached; they’re the output straight from the console, with a newline or two added here and there for readability.

#12 Updated by arnaud 2017-07-10 04:00:21

> The bad news is that the logs don’t really help.

Or maybe they do. A quick search on the net suggests that if gcc gets killed, it’s likely due to a lack of RAM. Damn! I forgot to check the kernel logs for the OOM killer!

Ok, I’ll retry later today and update you on what I see.

#13 Updated by arnaud 2017-07-10 08:17:53

I can confirm the OOM killing:

[ 3077.099701] Out of memory: Kill process 23208 (cc1) score 126 or sacrifice child
[ 3077.099732] Killed process 23208 (cc1) total-vm:113196kB, anon-rss:65172kB, file-rss:0kB, shmem-rss:0kB
[ 3077.114620] oom_reaper: reaped process 23208 (cc1), now anon-rss:0kB, file-rss:0kB, shmem-rss:0kB
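Messages like these can also be spotted automatically; a sketch (reading from a saved kernel log, since console output scrolls away):

```shell
# Sketch: count OOM-killer events in a saved kernel log. A non-zero
# count during a build means gcc/cc1 was probably killed for lack of
# RAM, which would explain the failed DKMS module builds.
oom_events() {
    # $1 = path to a kernel log file (e.g. a saved dmesg dump)
    grep -c -E 'Out of memory|oom_reaper' "$1"
}
```

e.g. `oom_events kern.log` right after a failed build, with the log saved via `dmesg > kern.log` inside the VM.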

So I guess I should just bump the memory in vagrant/lib/tails_build_settings.rb to something like 768 and then retry.

VM_MEMORY_FOR_DISK_BUILDS = 768

Which tag should I build now to see if I get the expected SHA? 3.0.1?

#14 Updated by intrigeri 2017-07-13 19:00:22

> Which tag should I build now to see if I get the expected SHA? 3.0.1?

Yes, please build that one and check your ISO vs. the detached OpenPGP signature we published :)

#15 Updated by arnaud 2017-07-14 06:32:20

I just built 3.0.1.

The good news is that I don’t get the OOM killer anymore after increasing the VM memory to 768M. You can find the patch on GitLab:

However, the image is still different from the one you published.

$ gpg --keyid-format 0xlong --verify tails-amd64-3.0.1.iso.sig tails-amd64-3.0.1.iso
gpg: Signature made Tue 04 Jul 2017 09:41:53 PM +07
gpg:                using RSA key BA2C222F44AC00ED9899389398FEC6BC752A3DB6
gpg: BAD signature from "Tails developers (offline long-term identity key) <tails@boum.org>" [full]

I’m wondering if the fact that I added a commit can make a difference. I tricked the build system a bit: I deleted the original tag 3.0.1 and tagged my additional commit (the one that increases the RAM) as 3.0.1 before doing the build. I did that because, as far as I remember, the build system checks whether we’re on a tag and makes some decisions based on that.

#16 Updated by intrigeri 2017-07-17 15:18:36

  • blocked by Bug #13480: The Vagrant VM has too little memory for disk builds added

#17 Updated by intrigeri 2017-07-17 15:21:41

  • Assignee changed from anonym to arnaud
  • QA Check set to Info Needed

> The good news is that I don’t get the OOM killer anymore after increasing the VM memory to 768M. You can find the patch on GitLab:

Extracted this into Bug #13480, marked as a blocker for this ticket.

> However, the image is still different from the one you published.

Can you please diffoscope it? It would be interesting to see what exactly differs, i.e. if we can close this ticket once Bug #13480 is resolved.

> I’m wondering if the fact that I added a commit can make a difference.

Yes: git grep amnesia/version -- auto/config.

#18 Updated by arnaud 2017-07-19 03:58:24

Hey,

I had a first try with the diffoscope.

The first thing is that it can’t compare the two ISOs directly. When I try it with the --debug option, I see some lines about an unrecognized file format, followed by a hexadecimal comparison of the two files. I don’t know whether that’s expected or not.

So I unpacked the two ISOs, saw that only the two squashfs images differ, and compared those. This time diffoscope understands the squashfs format and does a proper comparison.

It took around 20 hours on my laptop, and then it spat out a huge diff, far bigger than my terminal’s scrollback. So the result is unusable; I’ll have to do it all over again, this time redirecting the output to a file, I guess. I’m not sure when I’ll have 20 hours free for that, because I tend to move around with my laptop, and I rarely leave it in one place for 20 hours in a row.

I’ll let you know. In the meantime, if you have good tips about the right way to use diffoscope, or how to make it quicker, let me know.

Cheers ;)

#19 Updated by lamby 2017-07-19 07:20:43

arnaud wrote:
> It took around 20 hours on my laptop

Which exact version of diffoscope are you using? I did a lot of work on performance with Tails in mind! :)

Also, if you do not have some of the auxiliary tools installed, it will fall back to a binary comparison. For example, if you didn’t have the tool to unpack squashfs images, it would do a stupid and useless diff. It does tell you this, but at the top of the output, which is about four miles back in your scrollback, if at all…

Hope that helps.

#20 Updated by arnaud 2017-07-19 07:50:03

> Which exact version of diffoscope are you using?

I used the diffoscope available in Debian stretch, i.e. version 78, I think.

> It does tell you this, but at the top of the output which is about four miles back in your scrollback, if at all…

Yep, I noticed that when it tried to binary-compare the two ISOs.

I think I’ll try again with diffoscope 84 in a container; I’ve seen one here: https://github.com/tianon/dockerfiles/blob/master/diffoscope/Dockerfile.

#21 Updated by lamby 2017-07-19 07:52:30

arnaud wrote:
> > Which exact version of diffoscope are you using?
>
> I used the diffoscope available in Debian stretch, i.e. version 78, I think.

Yes, I think that misses pretty much all the performance improvements I was referring to :)

You can just install diffoscope from Git and run it from there if you are perhaps thinking of contributing… grin

#22 Updated by arnaud 2017-07-19 08:00:03

Well, indeed, why not, now that the gigabyte of dependencies is installed on my machine anyway, let’s keep going … :/

#23 Updated by intrigeri 2017-07-19 11:53:29

FWIW on our CI infra we use diffoscope 84 with the following options:

diffoscope \
  --text diffoscope.txt \
  --html diffoscope.html \
  --max-report-size 262144000 \
  --max-diff-block-lines 10000 \
  --max-diff-input-lines 10000000 \
  tails-*.iso

#24 Updated by arnaud 2017-07-23 01:25:22

Thanks for the hints, so I used diffoscope 84 with the options mentioned above. I attach the html output (tell me if you also want the txt output).

The differences I see are:

- the file /etc/amnesia/version, which differs due to my additional commit on the VM RAM size
- all the gconf ordering stuff that has been discussed
- some .cache files

#25 Updated by intrigeri 2017-07-23 05:37:54

  • Assignee changed from arnaud to anonym
  • QA Check changed from Info Needed to Dev Needed

Thanks! I’ll let anonym analyze your results and decide if this ticket can be closed once your branch is merged.

#26 Updated by anonym 2017-08-07 14:02:48

  • Status changed from Confirmed to In Progress
  • Assignee deleted (anonym)
  • % Done changed from 0 to 100
  • QA Check changed from Dev Needed to Pass

arnaud wrote:
> Thanks for the hints, so I used diffoscope 84 with the options mentioned above. I attach the html output (tell me if you also want the txt output).
>
> The difference I see are:
> - the file /etc/amnesia/version which differs, due to my additional commit on VM RAM size

I.e. “expected”. :)

> - all the gconf ordering stuff that has been discussed

That is Bug #12738, which should be fixed in current stable, and so in Tails 3.1.

> - some .cache files

That is Bug #12740 and its children.

So it seems Bug #13480 is the fix for this ticket => closing!

Thanks a bunch, arnaud, both for your patience and for finding and fixing the problem yourself! You rock!

#27 Updated by intrigeri 2017-08-24 13:15:50

  • Status changed from In Progress to Resolved

#28 Updated by lamby 2017-08-24 14:17:29

Why does /etc/amnesia/version change depending on VM size?

#29 Updated by anonym 2017-08-24 14:51:44

lamby wrote:
> Why does /etc/amnesia/version change depending on VM size?

The file contains the commit Tails was built from, so any commit will change it (ignoring crazy things like commit id collisions :)).
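A sketch of the mechanism (hypothetical helper name; the real logic lives in the build system, cf. `git grep amnesia/version -- auto/config`): the version file records the commit the tree was built from, so any new commit, including one that only bumps the VM memory, changes the file and hence the image.

```shell
# Sketch: record the build commit into a version file, as Tails does
# for /etc/amnesia/version. Any new commit changes this output, which
# is why arnaud's extra RAM-bump commit changed his ISO.
record_build_commit() {
    # $1 = git work tree, $2 = output file
    git -C "$1" rev-parse HEAD > "$2"
}
```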

#30 Updated by lamby 2017-08-24 15:01:14

Oh, and the VM size was changed in a commit? If so, makes total sense. :)

Give me the SHA…collision!

#31 Updated by anonym 2017-08-24 15:28:34

lamby wrote:
> Oh, and the VM size was changed in a commit? If so, makes total sense. :)

It is set in vagrant/lib/tails_build_settings.rb which is tracked in Git, so yes. :)