Bug #16476

Allow building a branch using unpublished Git submodule changes

Added by anonym 2019-02-21 14:21:02 . Updated 2019-03-20 14:25:50 .

Status:
Resolved
Priority:
Normal
Assignee:
Category:
Build system
Target version:
Start date:
2019-02-21
Due date:
% Done:

100%

Feature Branch:
bugfix/16476-clone-submodules-locally-optimized
Type of work:
Code
Blueprint:

Starter:
Affected tool:
Deliverable for:

Description

If you commit something to a submodule without pushing, and point Tails to use this new commit of the submodule, then building will fail.

Explanation: in the build VM, the locally cloned repo we build from has these submodules configured to track the remote repos, not the host’s repo with contain these unpublushed commits.

Fix: in the build VM, the locally cloned repo we build from must have its submodules tracking the host’s local repos.


Subtasks


Related issues

Related to Tails - Bug #16607: FTBFS with "repository '/amnesia.git/.git/modules/submodules/aufs4-standalone' does not exist" Resolved 2019-03-23
Related to Tails - Bug #16730: Reproducible build CI job uses the HEAD of the current branch instead of the commit the 1st build was built from Resolved
Blocks Tails - Feature #15507: Core work 2019Q1: Foundations Team Resolved 2018-04-08

History

#1 Updated by anonym 2019-02-21 15:35:36

  • Status changed from Confirmed to In Progress

Applied in changeset commit:tails|df8a338ea6e078113ef6a03d7e3fbb3ffc7467d1.

#2 Updated by anonym 2019-02-21 15:52:28

  • Assignee deleted (anonym)
  • Target version deleted (Tails_3.13)
  • % Done changed from 0 to 50
  • QA Check set to Ready for QA
  • Feature Branch set to bugfix/16476-clone-submodules-locally

Fixed, and while I was at it I managed to optimize our Git usage a bit as well.

@intrigeri, I’ll count this as FT work, but I’m highlighting you in case you have a better idea.

The rationale for me doing this is that I personally have wasted an hour at two separate occasions from this type of build failure when I have worked on submodules. But I think it’s even worse than that. IMHO, this is a bug: so far we have failed to exactly replicate the Git state from the host into the builder VM. I bet this problem will bite (and waste the time of) contributors working on submodules in Tails, and the workaround to push to the submodules’ repos isn’t obvious. That workaround won’t work for contributor’s without push rights to the submodules repos, so they will have to go even further to be able to build and test their work (e.g. they can update their .gitmodules to point to a repo they control in a temporary commit… not very obvious and quite a lot of work).

#3 Updated by anonym 2019-02-21 15:53:04

#4 Updated by intrigeri 2019-02-21 16:15:18

  • Assignee set to intrigeri

#5 Updated by intrigeri 2019-02-22 07:07:59

  • Assignee changed from intrigeri to anonym
  • QA Check changed from Ready for QA to Dev Needed

This branch FTBFS on Jenkins: at first glance, it seems these changes break the “mergebasebranch” feature.

#6 Updated by anonym 2019-02-22 09:22:24

On it! I actually tried mergebasebranch successfully before I pushed, but I think all that is missing it a git fetch origin "$(cat config/base_branch)".

#7 Updated by anonym 2019-02-22 11:11:10

  • % Done changed from 50 to 60
  • QA Check changed from Dev Needed to Ready for QA

anonym wrote:
> On it! I actually tried mergebasebranch successfully before I pushed, but I think all that is missing it a git fetch origin "$(cat config/base_branch)".

For some reason that doesn’t work on Jenkins (works fine for me locally though). Whatever, I’ll just revert the fanciness I’m trying with only fetching the exact refs we need which would shave ~30 seconds off the build time. KISS!

Also, I worried about whether this branch might have some strange effects when building a release from a tag, but I managed to reproduce Tails 3.12.1 (I imported the changes into tails-build without commiting and built with ignorechanges) so it looks good!

Let’s see what Jenkins thinks: https://jenkins.tails.boum.org/job/build_Tails_ISO_bugfix-16476-clone-submodules-locally/4/

#8 Updated by anonym 2019-02-22 16:26:18

  • Assignee changed from anonym to intrigeri
  • % Done changed from 60 to 70

Success!

commit:d910948ef2f141a7469f7675dec1e29c06ec3712 is a bit of a shame, though, since a full fetch takes around 30 seconds. Do you have a quick answer (i.e. don’t investigate) why git fetch origin $(cat config/base_branch) fails on Jenkins?

#9 Updated by intrigeri 2019-02-22 16:51:20

  • Assignee changed from intrigeri to anonym

> Do you have a quick answer (i.e. don’t investigate) why git fetch origin $(cat config/base_branch) fails on Jenkins?

I don’t remember if the tweaks we do to change the meaning of “origin” and/or for mapping remote refs to local refs is 100% in tails.git or if there are other, special bits in Jenkins too (puppet-tails.git, jenkins-jobs.git). I suspect there’s Jenkins-specific stuff somewhere in those repos.

#10 Updated by anonym 2019-02-25 10:09:58

  • Assignee changed from anonym to intrigeri

intrigeri wrote:
> > Do you have a quick answer (i.e. don’t investigate) why git fetch origin $(cat config/base_branch) fails on Jenkins?
>
> I don’t remember if the tweaks we do to change the meaning of “origin” and/or for mapping remote refs to local refs is 100% in tails.git or if there are other, special bits in Jenkins too (puppet-tails.git, jenkins-jobs.git). I suspect there’s Jenkins-specific stuff somewhere in those repos.

Thanks! I had a look but quickly determined that I would have to RTFM way more than I am willing for a non-critical time optimization, which was only a bonus as far as this ticket/branch is concerned. Let’s skip the git fetch optimization! Please review’n’merge!

#11 Updated by intrigeri 2019-02-27 08:13:36

> Thanks! I had a look but quickly determined that I would have to RTFM way more than I am willing for a non-critical time optimization, which was only a bonus as far as this ticket/branch is concerned.

anonym, actually the problem you've fixed in commit:d910948ef2f141a7469f7675dec1e29c06ec3712 is not Jenkins-specific. You've not seen it locally only because you happened to have a local stable@ ref; anyone who builds a topic branch without having ever checked out its base branch locally would have been hit by this bug; the way Jenkins clones tails.git before building happens to have this problem. I think the only way to ensure we refer to an existing base branch ref would be to do git fetch origin origin/"$(cat config/base_branch)" or similar (which will point to the real online remote, not to the local one, because of the git config remote.origin.fetch +refs/remotes/origin/*:refs/remotes/origin/* tweak we’re doing). I’m not 100% sure this solution would work, in particular when one has a local ref with the base branch name too: I don’t know how Git behaves with our non-injective refs mapping, when 2 different refs are mapped to the same one.

> Let’s skip the git fetch optimization! Please review’n’merge!

Deal!

#12 Updated by anonym 2019-02-28 14:48:30

intrigeri wrote:
> > Thanks! I had a look but quickly determined that I would have to RTFM way more than I am willing for a non-critical time optimization, which was only a bonus as far as this ticket/branch is concerned.
>
> anonym, actually the problem you've fixed in commit:d910948ef2f141a7469f7675dec1e29c06ec3712 is not Jenkins-specific. You've not seen it locally only because you happened to have a local stable@ ref; anyone who builds a topic branch without having ever checked out its base branch locally would have been hit by this bug; the way Jenkins clones tails.git before building happens to have this problem. I think the only way to ensure we refer to an existing base branch ref would be to do git fetch origin origin/"$(cat config/base_branch)" or similar (which will point to the real online remote, not to the local one, because of the git config remote.origin.fetch +refs/remotes/origin/*:refs/remotes/origin/* tweak we’re doing). I’m not 100% sure this solution would work, in particular when one has a local ref with the base branch name too: I don’t know how Git behaves with our non-injective refs mapping, when 2 different refs are mapped to the same one.

Ah, now I get it (and something clicked about what cloning a repo actually means). But if it is that easy, let’s try it! I pushed bugfix/16476-clone-submodules-locally-optimized (Tails repo only) to test it.

> > Let’s skip the git fetch optimization! Please review’n’merge!
>
> Deal!

Feel free to go ahead with this. If so I’ll open a new ticket with a PR for the above (if it works).

#13 Updated by anonym 2019-02-28 15:06:38

It worked: https://jenkins.tails.boum.org/job/build_Tails_ISO_bugfix-16476-clone-submodules-locally-optimized/1/console

14:59:51 + git fetch origin origin/stable

14:59:52 From /amnesia.git/
14:59:52  * remote-tracking branch  origin/stable -> FETCH_HEAD
14:59:52  * [new ref]               origin/stable -> origin/stable
[...]
15:00:13 + echo Merging base branch origin/stable
15:00:13 + echo (at commit d72efd1f7baf2037e58813ffbdf3b45a7e809b5c)...
15:00:13 + faketime -f 2019-02-13 16:14:23 git merge --no-edit origin/stable
15:00:13 Already up-to-date.

So, IMHO, let’s merge the -optimized branch.

#14 Updated by anonym 2019-02-28 20:40:34

  • Assignee changed from intrigeri to anonym
  • QA Check changed from Ready for QA to Dev Needed

Actually, I want to spend some more time thinking about various scenarios before I push a time bomb that will cause mass-breakage on Jenkins some time soon.

I have a specific scenario in mind: using mergebasebranches when the base branch has bumped submodules. I just want to make sure that the new submodule commits actually are present in our submodule clones, or if we might have to do some extra fetching.

#15 Updated by intrigeri 2019-03-01 07:51:23

Also, I would suggest recommending — in contribute/git — developers to set the push.recurseSubmodules=check Git pref, to avoid issues like we’ve seen already recently (one forgets to push required commits from submodules).

#16 Updated by anonym 2019-03-01 10:44:01

  • Assignee changed from anonym to intrigeri
  • QA Check changed from Dev Needed to Ready for QA
  • Feature Branch changed from bugfix/16476-clone-submodules-locally to bugfix/16476-clone-submodules-locally-optimized

anonym wrote:
> Actually, I want to spend some more time thinking about various scenarios before I push a time bomb that will cause mass-breakage on Jenkins some time soon.
>
> I have a specific scenario in mind: using mergebasebranches when the base branch has bumped submodules. I just want to make sure that the new submodule commits actually are present in our submodule clones, or if we might have to do some extra fetching.

I tested this without issue, and haven’t been able to come up with any other potentially problematic scenarios. Let’s see what you can find!

intrigeri wrote:
> Also, I would suggest recommending — in contribute/git — developers to set the push.recurseSubmodules=check Git pref, to avoid issues like we’ve seen already recently (one forgets to push required commits from submodules).

That’s great! I just set it myself, and added the above recommendation.

#17 Updated by intrigeri 2019-03-04 11:58:58

  • Target version set to Tails_3.13

#18 Updated by intrigeri 2019-03-04 13:05:45

anonym wrote:
> But I think it’s even worse than that. IMHO, this is a bug: so far we have failed to exactly replicate the Git state from the host into the builder VM.

@anonym I’m not sure I understand the “worse than that” correctly. Indeed, in the failure mode this ticket was originally about (FTBFS when the tails.git branch references a submodule commit that’s not been pushed), we fail to replicate the Git state but then we fail loudly, so at least it’s obvious that something is wrong. What exactly is “worse than that”? Is there any case where we would successfully build but using the wrong code?

#19 Updated by intrigeri 2019-03-04 13:21:03

I guess it’s safe to drop the --tags Git fetch option because $GIT_REF is built from git_current_head_name, which returns the tag name when building from a tag. I assume git fetch origin 3.42 will forcefully replace an existing 3.42 tag that exists locally. Correct? For background, see commit:a6aa7504de1c3c139c660d299ec08349d8973229.

#20 Updated by intrigeri 2019-03-04 13:22:48

I can’t parse this sentence: “We work around this by fetching the ref that the host’s origin (most likely the official Tails repo) where the base branch refs must exist and be as up-to-date as possible”.

#21 Updated by intrigeri 2019-03-04 13:40:00

  • Assignee changed from intrigeri to anonym
  • QA Check changed from Ready for QA to Dev Needed

anonym, see above comments & questions. Apart of that, code review passes. I’ve triggered a build on Jenkins after updated the stable branch, which will trigger mergebasebranch and tell us how this code behaves when the branch lags behind its base branch. I’ll monitor that build and will report back if anything breaks.

#22 Updated by intrigeri 2019-03-04 15:00:45

intrigeri wrote:
> I’ve triggered a build on Jenkins after updated the stable branch, which will trigger mergebasebranch and tell us how this code behaves when the branch lags behind its base branch. I’ll monitor that build and will report back if anything breaks.

Worked fine.

#23 Updated by anonym 2019-03-05 14:27:31

  • Assignee changed from anonym to intrigeri
  • Target version deleted (Tails_3.13)
  • QA Check changed from Dev Needed to Ready for QA

intrigeri wrote:
> anonym wrote:
> > But I think it’s even worse than that. IMHO, this is a bug: so far we have failed to exactly replicate the Git state from the host into the builder VM.
>
> anonym I’m not sure I understand the “worse than that” correctly. Indeed, in the failure mode this ticket was originally about (FTBFS when the tails.git branch references a submodule commit that’s not been pushed), we fail to replicate the Git state but then we fail loudly, so at least it’s obvious that something is wrong. What exactly is “worse than that”?

Edit: I misread you first so might first answer (i.e. what’s in your email notification) might not make sense. Let me just avoid any confusion by answering this question:

> Is there any case where we would successfully build but using the wrong code?

No, running git submodule update under set -e ensures that we won’t ever be able to successfully build with the submodules in the wrong state, so there is/was never any risk for that. Just forget about the “worse” part. :)

> I guess it’s safe to drop the —tags Git fetch option because $GIT_REF is built from git_current_head_name, which returns the tag name when building from a tag. I assume git fetch origin 3.42 will forcefully replace an existing 3.42 tag that exists locally. Correct?

I tested, and actually no. I also RTFM:ed and think what we’d need is: git fetch origin --force tag ${REF} (so a special case vs when $REF is a branch).

> For background, see a6aa7504de1c3c139c660d299ec08349d8973229.

Indeed that used to be an RM annoyance before the rework of our build system we did during the reproducibility adventure. Before the rework the builder VM was never shutdown automatically so you would reuse it for the next build, and that is what enabled the annoyance to happen. But since the rework we shut down the VM on success, so I haven’t seen this issue ever since.

I definitely could implement the fix mentioned above if you think it is useful, though! Personally I think it isn’t worth it since it would only benefit those playing with TAILS_BUILD_OPTIONS=keeprunning (or rescue) and retagging, and it would “magically” be fixed after a rake vm:destroy, i.e. the advertised remedy for all problems. :)

> I can’t parse this sentence: “We work around this by fetching the ref that the host’s origin (most likely the official Tails repo) where the base branch refs must exist and be as up-to-date as possible”.

Sorry, fixed in commit:8ac09db5fc (s/that/from/).

#24 Updated by intrigeri 2019-03-05 14:52:20

anonym wrote:
> intrigeri wrote:
>> anonym wrote:
>> > But I think it’s even worse than that. IMHO, this is a bug: so far we have failed to exactly replicate the Git state from the host into the builder VM.
>>
>> anonym I’m not sure I understand the “worse than that” correctly. Indeed, in the failure mode this ticket was originally about (FTBFS when the tails.git branch references a submodule commit that’s not been pushed), we fail to replicate the Git state but then we fail loudly, so at least it’s obvious that something is wrong. What exactly is “worse than that”? Is there any case where we would successfully build but using the wrong code?

> We are on the exact same page. With “that” (in “worse than that”) I was referring to the previous sentence, where I was only talking about this issue being a time waste for me occasionally. I.e. I think building the wrong code is worse than wasting my time occasionally.

@anonym, I’m now even more confused :) I agree with “building the wrong code is worse than wasting my time occasionally” but as I said, I don’t understand how this can possibly happen. Don’t we ensure the submodules are in the state referenced by the tails.git branch we’re building (and fail if we can’t e.g. missing commits)?

#25 Updated by intrigeri 2019-03-05 14:53:53

  • Target version set to Tails_3.13

(I don’t want this to remain in a WIP state for too long; let’s get this into 3.13 so we can switch to other matters.)

#26 Updated by anonym 2019-03-05 17:25:54

  • Target version deleted (Tails_3.13)

intrigeri wrote:
> anonym, I’m now even more confused :)

Let me guess, you only read the email notification? I edited the message 1 minute after posting it because I realized I was confused, so please look at the online version! :)

#27 Updated by intrigeri 2019-03-06 06:52:37

  • Target version set to Tails_3.13

(Same rationale as last time.)

#28 Updated by intrigeri 2019-03-06 06:53:23

> Let me guess, you only read the email notification? I edited the message 1 minute after posting it because I realized I was confused, so please look at the online version! :)

ACK, everything is clarified now.

#29 Updated by intrigeri 2019-03-06 07:50:28

  • Assignee changed from intrigeri to anonym

> But since the rework we shut down the VM on success, so I haven’t seen this issue ever since.

@anonym, my understanding is that you can’t possibly have seen this issue, precisely because the command we have will overwrite a previously existing tag, which was the whole point of the aforementioned commit. So I’m not sure what relevant conclusion we can draw from “I haven’t seen this issue ever since”. Anyways, this does not matter much because:

> I definitely could implement the fix mentioned above if you think it is useful, though! Personally I think it isn’t worth it since it would only benefit those playing with TAILS_BUILD_OPTIONS=keeprunning (or rescue) and retagging, and it would “magically” be fixed after a rake vm:destroy, i.e. the advertised remedy for all problems. :)

I’m not a big fan of adding yet another potential failure mode in the release process, which I understand this branch does; the fact there’s a good chance the RM discovers the fix won’t make them much happier once they’ve stumbled upon this regression. This being said, I agree it’s probably not worth reworking the new code again (we’ve probably spent more than enough time on this already and I suspect we initially underestimated the cost/benefit of fixing this issue), so I’ve pushed a commit to the release process to clarify what a RM must do to avoid meeting this problem.

Please review & merge into stable → devel :)

#30 Updated by anonym 2019-03-06 10:36:13

  • Status changed from In Progress to Fix committed
  • % Done changed from 70 to 100

Applied in changeset commit:tails|365ade2ecc5f915cd74c5e97cb8557b42d0df56d.

#31 Updated by anonym 2019-03-06 10:58:01

  • Status changed from Fix committed to In Progress
  • Assignee changed from anonym to intrigeri
  • % Done changed from 100 to 90

intrigeri wrote:
> > But since the rework we shut down the VM on success, so I haven’t seen this issue ever since.
> >
> anonym, my understanding is that you can’t possibly have seen this issue, precisely because the command we have will overwrite a previously existing tag, which was the whole point of the aforementioned commit. So I’m not sure what relevant conclusion we can draw from “I haven’t seen this issue ever since”.

Eh, yeah that argument is just incorrect. :S

> Anyways, this does not matter much because:
>
> > I definitely could implement the fix mentioned above if you think it is useful, though! Personally I think it isn’t worth it since it would only benefit those playing with TAILS_BUILD_OPTIONS=keeprunning (or rescue) and retagging, and it would “magically” be fixed after a rake vm:destroy, i.e. the advertised remedy for all problems. :)
>
> I’m not a big fan of adding yet another potential failure mode in the release process, which I understand this branch does; the fact there’s a good chance the RM discovers the fix won’t make them much happier once they’ve stumbled upon this regression. This being said, I agree it’s probably not worth reworking the new code again (we’ve probably spent more than enough time on this already and I suspect we initially underestimated the cost/benefit of fixing this issue), so I’ve pushed a commit to the release process to clarify what a RM must do to avoid meeting this problem.
>
> Please review & merge into stable → devel :)

Merged!

But at least today the sound of “adding yet another potential failure mode in the release process” makes me more uneasy than when I thought about this last. So I pushed yet another commit fixing this. If you indeed feel like you’ve reached your limit with this ticket, feel free to ignore it and just close this ticket (and I’ll manually kill the branch since it won’t get garbage collected thanks to the extra commits), it was barely 10 minutes of hacking/testing so whatever. :)

#32 Updated by intrigeri 2019-03-06 11:32:10

  • Status changed from In Progress to Fix committed

> Merged!

:)

> But at least today the sound of “adding yet another potential failure mode in the release process” makes me more uneasy than when I thought about this last. So I pushed yet another commit fixing this. If you indeed feel like you’ve reached your limit with this ticket, feel free to ignore it and just close this ticket

Indeed, I’ve reached the limit of the time/energy I think this is worth.

> (and I’ll manually kill the branch since it won’t get garbage collected thanks to the extra commits), it was barely 10 minutes of hacking/testing so whatever. :)

It looks like you did not actually push your branch so there’s no clean up to do.

#33 Updated by intrigeri 2019-03-06 11:39:20

  • Assignee deleted (intrigeri)
  • % Done changed from 90 to 100
  • QA Check changed from Ready for QA to Pass

#34 Updated by CyrilBrulebois 2019-03-20 14:25:50

  • Status changed from Fix committed to Resolved

#35 Updated by intrigeri 2019-04-03 09:44:52

  • related to Bug #16607: FTBFS with "repository '/amnesia.git/.git/modules/submodules/aufs4-standalone' does not exist" added

#36 Updated by intrigeri 2019-05-19 05:10:48

  • related to Bug #16730: Reproducible build CI job uses the HEAD of the current branch instead of the commit the 1st build was built from added