Feature #15215

Iteration 1: Support unlocking VeraCrypt partitions in udisks

Added by segfault 2018-01-22 17:31:02 . Updated 2018-04-15 11:28:18 .

Status:
Resolved
Priority:
Normal
Assignee:
segfault
Category:
Target version:
Start date:
2018-01-22
Due date:
% Done:

50%

Feature Branch:
Type of work:
Code
Blueprint:

Starter:
Affected tool:
Deliverable for:
299

Description

Development repository: https://@github.com/segfault3/udisks
Branch: support-tcrypt


Subtasks


Related issues

Blocks Tails - Feature #15220: Iteration 1: Upstream unlocking VeraCrypt partitions in udisks Resolved 2018-01-25

History

#1 Updated by segfault 2018-01-22 17:38:44

  • Subject changed from Support unlocking partitions in udisks to Support unlocking VeraCrypt partitions in udisks
  • Target version set to Tails_3.6
  • Feature Branch deleted (github.com/segfault3/udisks support-tcrypt)
  • Deliverable for set to 299

#2 Updated by segfault 2018-01-22 17:39:31

  • Description updated

#3 Updated by segfault 2018-01-22 17:39:48

  • Description updated

#4 Updated by segfault 2018-01-22 17:41:58

  • Description updated

#5 Updated by segfault 2018-01-22 18:04:41

  • blocks Feature #15220: Iteration 1: Upstream unlocking VeraCrypt partitions in udisks added

#6 Updated by segfault 2018-01-22 22:53:44

  • Assignee changed from segfault to anonym
  • % Done changed from 0 to 50
  • QA Check set to Ready for QA

#7 Updated by segfault 2018-01-23 02:33:03

  • Assignee changed from anonym to sajolida
  • QA Check deleted (Ready for QA)

During the UX sprint, we decided to always use cryptsetup’s --veracrypt flag for TCRYPT devices, because it removes a user-visible option and the negative impact (longer waiting time than necessary if a TrueCrypt user enters a wrong password) is acceptable because TrueCrypt users should migrate to VeraCrypt anyway.

I’m not sure if we also discussed the --tcrypt-hidden flag, which has the following effect: In addition to trying to decrypt the header in all TrueCrypt and VeraCrypt modes, cryptsetup will also try to decrypt a hidden header in all these modes. This means that the waiting time if a wrong password is entered doubles (from 15 seconds to 30 seconds on my system).

In our UI design, we included a “Hidden volume” checkbox. In case we did not discuss already at the sprint (I really don’t remember), we should quickly discuss if we want to keep it that way or remove the checkbox and always use the flag.

I looked at what the official VeraCrypt application is doing and it does not offer a “Hidden volume” checkbox, and it does indeed take exactly 30 seconds on my system to notice that I entered a wrong password.

I think I would still prefer keeping the checkbox, because having to wait 15 seconds if a wrong password is entered is bad enough, and the UX would further decrease if it was 30 seconds instead. What do you think?

The same also applies to the --tcrypt-system option for unlocking encrypted Windows system partitions (so we could remove that one too for another 15 seconds waiting on wrong password). The official VeraCrypt application does have a “Mount partition using system encryption” checkbox.

#8 Updated by segfault 2018-01-25 16:01:00

  • Subject changed from Support unlocking VeraCrypt partitions in udisks to Iteration 1: Support unlocking VeraCrypt partitions in udisks

#9 Updated by anonym 2018-01-25 16:29:05

Here’s a code review. It looks very nice so far! :)

commit be47a325153b0568e82c5764ab997524aac06a5c

First of all, please #define the magic numbers as nicely named constants! That might be a good place to list the assumptions about these numbers, and their relationship.

> gfloat e = sizeof(buf) / 256;

This just works by coincidence at the moment: both operands are integral, so this becomes integer division. Please make sure floating point division is used, i.e.:

gfloat e = (gfloat)sizeof(buf) / 256.0;

(Actually it is strictly enough that one operand is a float to coerce float arithmetic, but IMHO thoroughness is just nice in this instance.)

> for ( i = 0; i < sizeof(buf); i) symbols[buf[i]];

This is only safe because of the type of buf happens to have a max value that is < sizeof(symbols). Adding a check/assertion is overkill, but a comment making this assumption explicit would be nice.

> chi_square += (symbols[i] - e) * (symbols[i] - e) / e;

Premature optimization: instead of dividing by e in each iteration we can divide the whole sum by e once (after the sum loop), thus avoiding 255 unnecessary float divisions. (I bet GCC will do this optimization itself! :P)

commit 6e84d7b2c8d5823f98ecd5fd5df2786b1a162b03

Double newline?

commit 9f8732e55ddc7b917114ce385d7b1657a3bb73be

Why remove udisks_linux_block_is_luks()?

(Meta: I think me, other reviewers and maintainers (and future you!) will be thankful if you spend just one minute extra on each commit writing a motivation/explanation for it.)

If it turns out you remove this function, which you introduced, IMHO it would make the PR less noisy if you rewrote the history to hide the fact that this function ever existed. But let’s wait with this until the last minute!

commit bd9375a162b734a83dfac7f36243b663c12678df

Stray newlines.

commit 3bb1f663433e4236ccd8fdaa17e0509613173fa9

This looks like another candidate for Git history rewriting to make the PR less noisy.

commit bb01674d6b94a95a249fc49b63573a0f0608e42b

> /* save old encryption type to be able to restore it */
> old_hint_encryption_type = udisks_encrypted_get_hint_encryption_type(encrypted);

This only saves a pointer to the hint, not the data, so…

>

else if (veracrypt)


> udisks_encrypted_set_hint_encryption_type(encrypted, “veracrypt”);
> else
> udisks_encrypted_set_hint_encryption_type(encrypted, “truecrypt”);

… these will overwrite the value that old_hint_encryption_type also points to …

> udisks_encrypted_set_hint_encryption_type(encrypted, old_hint_encryption_type);

… so this is always a noop; it will read the current value of encrypted’s encryption hint, and then write it back.

So you need to make old_hint_encryption_type point to a copy of the old value (preferably on the stack, possibly with a “variable length array” (ANSI C99)).

  1. commit f1bfe35d4b2d826da41fb7448af04411a99ca013

Why is this commit separate from bb01674d6b94a95a249fc49b63573a0f0608e42b? What were these original "truecrypt" and "veracrypt" values?

  1. commit 85faed1492952a3b2e4313b6a26984f2e62558e5

This looks like another candidate for Git history rewriting to make the PR less noisy.

  1. commit edbfe57297aa47082a4921fa5969e1e03ee5ecc6

> #define MAX_KEYFILES 256

I think I’d suggest MAX_TCRYPT_KEYFILES.

> gboolean is_luks = FALSE;
> gboolean is_tcrypt = TRUE;
> gboolean seems_encrypted = FALSE;

The initialization to TRUE threw me off. I suggest FALSE for this reason (even though it doesn’t matter for execution).

> keyfiles[i] = path;

I really think it’s worth the readability improvement by separating the indexing and incrementing to separate lines.

#10 Updated by segfault 2018-01-28 16:49:31

anonym wrote:
> Here’s a code review. It looks very nice so far! :)
>
> h3. commit be47a325153b0568e82c5764ab997524aac06a5c
>
> First of all, please #define the magic numbers as nicely named constants! That might be a good place to list the assumptions about these numbers, and their relationship.

Ack. I defined the magic numbers and wrote an explanation for the values on the blueprint, and linked to it in the code, because I think it is too long for a comment in the code. I also chose more substantiated numbers.

> > gfloat e = sizeof(buf) / 256;
>
> This just works by coincidence at the moment: both operands are integral, so this becomes integer division. Please make sure floating point division is used, i.e.:
>

gfloat e = (gfloat)sizeof(buf) / 256.0;

Done.

> (Actually it is strictly enough that one operand is a float to coerce float arithmetic, but IMHO thoroughness is just nice in this instance.)

Agreed.

> > for ( i = 0; i < sizeof(buf); i) symbols[buf[i]];
>
> This is only safe because of the type of buf happens to have a max value that is < sizeof(symbols). Adding a check/assertion is overkill, but a comment making this assumption explicit would be nice.

Done.

> > chi_square += (symbols[i] - e) * (symbols[i] - e) / e;
>
> Premature optimization: instead of dividing by e in each iteration we can divide the whole sum by e once (after the sum loop), thus avoiding 255 unnecessary float divisions. (I bet GCC will do this optimization itself! :P)

Right! I implemented it and couldn’t see a performance difference in my test, but I’m still in favor of keeping it.

> h3. commit 6e84d7b2c8d5823f98ecd5fd5df2786b1a162b03
>
> Double newline?

Fixed.

> h3. commit 9f8732e55ddc7b917114ce385d7b1657a3bb73be
>
> Why remove udisks_linux_block_is_luks()?

I think I had two points in favor of removing udisks_linux_block_is_luks():

  1. Less noise in the pull request. IMO, extracting this check into its function is cleaner code. For reasons unknown to me, the developers didn’t do this. The udisks code is in general written in a very different code style than what I prefer - but I think it is not on me to change this. I (reluctantly) tried to adopt this code style.
  2. The TCRYPT check is different than the LUKS check, in that it has to actually read the block device, instead of just checking UDisksBlock properties. I don’t want to read the block device more often than required, so in udisks_linux_block_update(), I store the result in a UDisksBlock property, and use this property everywhere else. This is not required for the LUKS check, so I would use udisks_linux_block_is_luks() everywhere in the code. So it could be confusing that I use these two check functions differently.

Now that I’m writing this and am forced to think about it again, I see that I could also scrap the is_tcrypt_candidate/@seems_encrypted@ property and only set id_type to "crypto_unknown", and then use a check similar to the LUKS check everywhere else.

I think I’d prefer that, because this way I don’t introduce another exported property, and handle things more closely to how it is currently done.

Maybe I should then also rename “crypto_unknown” to “crypto_TCRYPT_candidate”.

And then I could also re-add udisks_linux_block_is_luks() and udisks_linux_block_is_tcrypt_candidate() functions, because then they are at least used in the same way (although point 1 still remains).

What do you think?

> (Meta: I think me, other reviewers and maintainers (and future you!) will be thankful if you spend just one minute extra on each commit writing a motivation/explanation for it.)
>
> If it turns out you remove this function, which you introduced, IMHO it would make the PR less noisy if you rewrote the history to hide the fact that this function ever existed. But let’s wait with this until the last minute!

Yes, I would like to clean up the git history a bit anyway, and then I can also add a short motivation/explanation on each remaining commit (sorry, I probably should have done that before your review, but IIRC you also asked me to not rewrite git history if you already reviewed some of it).

> h3. commit bd9375a162b734a83dfac7f36243b663c12678df
>
> Stray newlines.

Fixed.

> h3. commit 3bb1f663433e4236ccd8fdaa17e0509613173fa9
>
> This looks like another candidate for Git history rewriting to make the PR less noisy.

Ack.

> h3. commit bb01674d6b94a95a249fc49b63573a0f0608e42b
>
> > /* save old encryption type to be able to restore it */
> > old_hint_encryption_type = udisks_encrypted_get_hint_encryption_type(encrypted);
>
> This only saves a pointer to the hint, not the data, so…
>
> > […]
> > udisks_encrypted_set_hint_encryption_type(encrypted, “veracrypt”);
> > else
> > udisks_encrypted_set_hint_encryption_type(encrypted, “truecrypt”);
>
> … these will overwrite the value that old_hint_encryption_type also points to …
>
> > udisks_encrypted_set_hint_encryption_type(encrypted, old_hint_encryption_type);
>
> … so this is always a noop; it will read the current value of encrypted’s encryption hint, and then write it back.
>
> So you need to make old_hint_encryption_type point to a copy of the old value (preferably on the stack, possibly with a “variable length array” (ANSI C99)).

This actually works as expected the way it currently is. I tested this again. When I try to unlock a TCRYPT candidate partition with an incorrect password, hint_encryption_type is first set to TCRYPT, and when the unlock fails, it is reset to the empty string. I guess udisks_encrypted_get_hint_encryption_type() already creates a copy of the property (and handles freeing it when the UDisksEncrypted object is freed). If you prefer to change this anyway, we could use udisks_encrypted_dup_hint_encryption_type() to get our own copy, and handle freeing it ourselves.

> # commit f1bfe35d4b2d826da41fb7448af04411a99ca013
>
> Why is this commit separate from bb01674d6b94a95a249fc49b63573a0f0608e42b?

Huh, I don’t know how it happened that those two commits have the exact same commit date and message. I will squash them when I fix the commit history.

>What were these original "truecrypt" and "veracrypt" values?

I used those before I noticed that we can decrypt both TC and VC volumes using the veracrypt flag, so we don’t have to distinguish between the two anymore.

> # commit 85faed1492952a3b2e4313b6a26984f2e62558e5
>
> This looks like another candidate for Git history rewriting to make the [PR less noisy.

Ack.

> # commit edbfe57297aa47082a4921fa5969e1e03ee5ecc6
>
> > #define MAX_KEYFILES 256
>
> I think I’d suggest MAX_TCRYPT_KEYFILES.

Agreed.

> > gboolean is_luks = FALSE;
> > gboolean is_tcrypt = TRUE;
> > gboolean seems_encrypted = FALSE;
>
> The initialization to TRUE threw me off. I suggest FALSE for this reason (even though it doesn’t matter for execution).

I can’t remember why I would choose these values for initialization. I removed the initializiation altogether, because those variables are set below anyway.

> > keyfiles[i] = path;
>
> I really think it’s worth the readability improvement by separating the indexing and incrementing to separate lines.

Ok.

Thanks a lot for the review!

#11 Updated by segfault 2018-01-28 16:59:26

  • Assignee changed from sajolida to anonym
  • QA Check set to Ready for QA

> Now that I’m writing this and am forced to think about it again, I see that I could also scrap the is_tcrypt_candidate/seems_encrypted property and only set id_type to “crypto_unknown”, and then use a check similar to the LUKS check everywhere else.
>
> I think I’d prefer that, because this way I don’t introduce another exported property, and handle things more closely to how it is currently done.

I implemented this now in commit 9d11fa184b17bca2044f5f1cf63c9e6237f92b3b.

> Maybe I should then also rename “crypto_unknown” to “crypto_TCRYPT_candidate”.

I didn’t do that yet. Whether we should rename it or not depends on how we want those volumes presented to the user. Do we want them to be presented as TrueCrypt/VeraCrypt candidates, or unknown, but possibly encrypted volumes? I think I prefer the latter, because it will cause less confusion in case of false positives, i.e. the volume is not actually a TC/VC volume.

I will let you, anonym, review the fixes I implemented, and then assign this to sajolida to get his opinion on the above question and note-7.

#12 Updated by anonym 2018-01-30 20:12:18

  • Status changed from Confirmed to In Progress
  • Assignee changed from anonym to sajolida
  • QA Check changed from Ready for QA to Info Needed

segfault wrote:
> anonym wrote:
> > h3. commit 9f8732e55ddc7b917114ce385d7b1657a3bb73be
> >
> > Why remove udisks_linux_block_is_luks()?
>
> I think I had two points in favor of removing udisks_linux_block_is_luks():
>
> # Less noise in the pull request. IMO, extracting this check into its function is cleaner code. For reasons unknown to me, the developers didn’t do this. The udisks code is in general written in a very different code style than what I prefer - but I think it is not on me to change this. I (reluctantly) tried to adopt this code style.

Ok. IMHO you should no lower your standards for this reason, but whatever.

> # The TCRYPT check is different than the LUKS check, in that it has to actually read the block device, instead of just checking UDisksBlock properties. I don’t want to read the block device more often than required, so in udisks_linux_block_update(), I store the result in a UDisksBlock property, and use this property everywhere else. This is not required for the LUKS check, so I would use udisks_linux_block_is_luks() everywhere in the code. So it could be confusing that I use these two check functions differently.

Makes sense!

> Now that I’m writing this and am forced to think about it again, I see that I could also scrap the is_tcrypt_candidate/@seems_encrypted@ property and only set id_type to "crypto_unknown", and then use a check similar to the LUKS check everywhere else.
>
> I think I’d prefer that, because this way I don’t introduce another exported property, and handle things more closely to how it is currently done.

Agreed!

> Maybe I should then also rename “crypto_unknown” to “crypto_TCRYPT_candidate”.

Perhaps. But crypto_unknown seems more future-proof, since other headerless disk encryption formats could use the same mechanism.

BTW, related: in udisks_linux_block_update (), I think you should remove the is_tcrypt_candidate boolean and just call udisks_linux_block_seems_encrypted() straight in the if’s conditional. After all, nothing “tcrypt”-specific going on so far. :)

> And then I could also re-add udisks_linux_block_is_luks() and udisks_linux_block_is_tcrypt_candidate() functions, because then they are at least used in the same way (although point 1 still remains).
>
> What do you think?

This sounds ideal to me.

> > (Meta: I think me, other reviewers and maintainers (and future you!) will be thankful if you spend just one minute extra on each commit writing a motivation/explanation for it.)
> >
> > If it turns out you remove this function, which you introduced, IMHO it would make the PR less noisy if you rewrote the history to hide the fact that this function ever existed. But let’s wait with this until the last minute!
>
> Yes, I would like to clean up the git history a bit anyway, and then I can also add a short motivation/explanation on each remaining commit (sorry, I probably should have done that before your review, but IIRC you also asked me to not rewrite git history if you already reviewed some of it).

Fair enough! :)

> > h3. commit bb01674d6b94a95a249fc49b63573a0f0608e42b
> >
> > > /* save old encryption type to be able to restore it */
> > > old_hint_encryption_type = udisks_encrypted_get_hint_encryption_type(encrypted);
> >
> > This only saves a pointer to the hint, not the data, so…
> >
> > > […]
> > > udisks_encrypted_set_hint_encryption_type(encrypted, “veracrypt”);
> > > else
> > > udisks_encrypted_set_hint_encryption_type(encrypted, “truecrypt”);
> >
> > … these will overwrite the value that old_hint_encryption_type also points to …
> >
> > > udisks_encrypted_set_hint_encryption_type(encrypted, old_hint_encryption_type);
> >
> > … so this is always a noop; it will read the current value of encrypted’s encryption hint, and then write it back.
> >
> > So you need to make old_hint_encryption_type point to a copy of the old value (preferably on the stack, possibly with a “variable length array” (ANSI C99)).
>
> This actually works as expected the way it currently is. I tested this again. When I try to unlock a TCRYPT candidate partition with an incorrect password, hint_encryption_type is first set to TCRYPT, and when the unlock fails, it is reset to the empty string.

I am not so worried about what “works”, I am worried about undefined behavior. For example: if calling udisks_encrypted_set_hint_encryption_type() results in the old hint being free()d and a new one is allocated and set to the new value, and then the hint pointer (or whatever) in UDisksEncrypted is updated to point to this new hint, then what you do is undefined behavior: you rely on free()d memory to not be overwritten, which is often the case, but not always. For instance, running that code in Tails should fail because we enable the kernel memory poisoning feature (page_poison slub_debug=P) so free()d memory is immediately overwritten.

BTW, I wanted to look up what udisks_encrypted_set_hint_encryption_type() actually does, but it seems like it is generated, and I cannot figure out how. And clues?

> I guess udisks_encrypted_get_hint_encryption_type() already creates a copy of the property (and handles freeing it when the UDisksEncrypted object is freed).

Possibly, but this is something that has to be carefully verified if relied upon, and clearly commented.

> If you prefer to change this anyway, we could use udisks_encrypted_dup_hint_encryption_type() to get our own copy and handle freeing it ourselves.

I greatly prefer it. Even if what you do is safe for some reason, I suspect it is safe because of how it is implemented, so a future change (e.g. the hint is not free()d + reallocated, but simply overwritten in-memory) will break this.

And now I noticed something I should have noticed in my previous review:

  if (CHI_SQUARE_LOWER_LIMIT < chi_square && chi_square < CHI_SQUARE_UPPER_LIMIT)
    return TRUE;
  else
    return FALSE;

This pattern can (and should!) always be replaced with simply return ${conditional}.

#13 Updated by segfault 2018-01-30 21:40:51

anonym wrote:
> segfault wrote:
> > anonym wrote:
> > > h3. commit 9f8732e55ddc7b917114ce385d7b1657a3bb73be
> > >
> > > Why remove udisks_linux_block_is_luks()?
> >
> > I think I had two points in favor of removing udisks_linux_block_is_luks():
> >
> > # Less noise in the pull request. IMO, extracting this check into its function is cleaner code. For reasons unknown to me, the developers didn’t do this. The udisks code is in general written in a very different code style than what I prefer - but I think it is not on me to change this. I (reluctantly) tried to adopt this code style.
>
> Ok. IMHO you should no lower your standards for this reason, but whatever.
>
> > # The TCRYPT check is different than the LUKS check, in that it has to actually read the block device, instead of just checking UDisksBlock properties. I don’t want to read the block device more often than required, so in udisks_linux_block_update(), I store the result in a UDisksBlock property, and use this property everywhere else. This is not required for the LUKS check, so I would use udisks_linux_block_is_luks() everywhere in the code. So it could be confusing that I use these two check functions differently.
>
> Makes sense!
>
> > Now that I’m writing this and am forced to think about it again, I see that I could also scrap the is_tcrypt_candidate/@seems_encrypted@ property and only set id_type to "crypto_unknown", and then use a check similar to the LUKS check everywhere else.
> >
> > I think I’d prefer that, because this way I don’t introduce another exported property, and handle things more closely to how it is currently done.
>
> Agreed!
>
> > Maybe I should then also rename “crypto_unknown” to “crypto_TCRYPT_candidate”.
>
> Perhaps. But crypto_unknown seems more future-proof, since other headerless disk encryption formats could use the same mechanism.
>
> BTW, related: in udisks_linux_block_update (), I think you should remove the is_tcrypt_candidate boolean and just call udisks_linux_block_seems_encrypted() straight in the if’s conditional. After all, nothing “tcrypt”-specific going on so far. :)

Agreed.

> > And then I could also re-add udisks_linux_block_is_luks() and udisks_linux_block_is_tcrypt_candidate() functions, because then they are at least used in the same way (although point 1 still remains).
> >
> > What do you think?
>
> This sounds ideal to me.

Ok, I will reimplement these functions.

> > > (Meta: I think me, other reviewers and maintainers (and future you!) will be thankful if you spend just one minute extra on each commit writing a motivation/explanation for it.)
> > >
> > > If it turns out you remove this function, which you introduced, IMHO it would make the PR less noisy if you rewrote the history to hide the fact that this function ever existed. But let’s wait with this until the last minute!
> >
> > Yes, I would like to clean up the git history a bit anyway, and then I can also add a short motivation/explanation on each remaining commit (sorry, I probably should have done that before your review, but IIRC you also asked me to not rewrite git history if you already reviewed some of it).
>
> Fair enough! :)
>
> > > h3. commit bb01674d6b94a95a249fc49b63573a0f0608e42b
> > >
> > > > /* save old encryption type to be able to restore it */
> > > > old_hint_encryption_type = udisks_encrypted_get_hint_encryption_type(encrypted);
> > >
> > > This only saves a pointer to the hint, not the data, so…
> > >
> > > > […]
> > > > udisks_encrypted_set_hint_encryption_type(encrypted, “veracrypt”);
> > > > else
> > > > udisks_encrypted_set_hint_encryption_type(encrypted, “truecrypt”);
> > >
> > > … these will overwrite the value that old_hint_encryption_type also points to …
> > >
> > > > udisks_encrypted_set_hint_encryption_type(encrypted, old_hint_encryption_type);
> > >
> > > … so this is always a noop; it will read the current value of encrypted’s encryption hint, and then write it back.
> > >
> > > So you need to make old_hint_encryption_type point to a copy of the old value (preferably on the stack, possibly with a “variable length array” (ANSI C99)).
> >
> > This actually works as expected the way it currently is. I tested this again. When I try to unlock a TCRYPT candidate partition with an incorrect password, hint_encryption_type is first set to TCRYPT, and when the unlock fails, it is reset to the empty string.
>
> I am not so worried about what “works”, I am worried about undefined behavior. For example: if calling udisks_encrypted_set_hint_encryption_type() results in the old hint being free()d and a new one is allocated and set to the new value, and then the hint pointer (or whatever) in UDisksEncrypted is updated to point to this new hint, then what you do is undefined behavior: you rely on free()d memory to not be overwritten, which is often the case, but not always. For instance, running that code in Tails should fail because we enable the kernel memory poisoning feature (page_poison slub_debug=P) so free()d memory is immediately overwritten.

If I’m not mistaken, that would only work if the same address was re-allocated between when it was freed and when I access it again, otherwise it should result in a segfault. I doubt that this is the case, so I’m pretty sure what I did was safe, but as I said, I’m also fine with using the duplicating function instead, which I will do now.

> BTW, I wanted to look up what udisks_encrypted_set_hint_encryption_type() actually does, but it seems like it is generated, and I cannot figure out how. And clues?

Yes, it’s generated and looks like this:

<code class="c">
void
udisks_encrypted_set_hint_encryption_type (UDisksEncrypted *object, const gchar *value)
{
  g_object_set (G_OBJECT (object), "hint-encryption-type", value, NULL);
}

</code>

I also looked at g_object_set which is defined in glib2.0/gobject/gobject.c and uses g_object_set_valist, which does complicated stuff I don’t want to waste a lot of time on to understand.

> > I guess udisks_encrypted_get_hint_encryption_type() already creates a copy of the property (and handles freeing it when the UDisksEncrypted object is freed).
>
> Possibly, but this is something that has to be carefully verified if relied upon, and clearly commented.

Agreed. I took a look at what udisks_encrypted_get_hint_encryption_type() does, and it uses g_dbus_proxy_get_cached_property(), which “Returns a reference to the #GVariant instance that holds the value for property_name or %NULL if the value is not in the cache.”. So what old_hint_encryption_type points to is a cache, which means that it is probably safe to use it as long as we don’t call udisks_encrypted_get_hint_encryption_type() again. Still, it might not be the best idea to rely on an invalid cache, so let’s just use the duplicating function.

> > If you prefer to change this anyway, we could use udisks_encrypted_dup_hint_encryption_type() to get our own copy and handle freeing it ourselves.
>
> I greatly prefer it. Even if what you do is safe for some reason, I suspect it is safe because of how it is implemented, so a future change (e.g. the hint is not free()d + reallocated, but simply overwritten in-memory) will break this.

Agreed.

> And now I noticed something I should have noticed in my previous review:
>
>

if (CHI_SQUARE_LOWER_LIMIT < chi_square && chi_square < CHI_SQUARE_UPPER_LIMIT)
  return TRUE;
else
  return FALSE;


>
> This pattern can (and should!) always be replaced with simply return ${conditional}.

I generally agree, but again, this is not how it is done in some many other gboolean type functions in the code. They always return the TRUE or FALSE macro. Do you think I should deviate from that?

Oh, and if you find my implementation overly complicated, look at this beautiful piece of code ;)

  gboolean ret = FALSE;

  /* could be partitioned by the kernel */
  if (g_strcmp0 (g_udev_device_get_devtype (block_object->device->udev_device), "partition") == 0)
    {
      ret = TRUE;
      goto out;
    }

  /* if blkid(8) already identified the device as a partition, it's all good */
  if (g_udev_device_has_property (block_object->device->udev_device, "ID_PART_ENTRY_SCHEME"))
    {
      ret = TRUE;
      goto out;
    }

 out:
  return ret;

#14 Updated by anonym 2018-01-30 22:53:02

segfault wrote:
> anonym wrote:
> > segfault wrote:
> > > anonym wrote:
> > > > h3. commit 9f8732e55ddc7b917114ce385d7b1657a3bb73be
> > > >
> > > > Why remove udisks_linux_block_is_luks()?
> > >
> > > I think I had two points in favor of removing udisks_linux_block_is_luks():
> > >
> > > # Less noise in the pull request. IMO, extracting this check into its function is cleaner code. For reasons unknown to me, the developers didn’t do this. The udisks code is in general written in a very different code style than what I prefer - but I think it is not on me to change this. I (reluctantly) tried to adopt this code style.
> >
> > Ok. IMHO you should no lower your standards for this reason, but whatever.
> >
> > > # The TCRYPT check is different than the LUKS check, in that it has to actually read the block device, instead of just checking UDisksBlock properties. I don’t want to read the block device more often than required, so in udisks_linux_block_update(), I store the result in a UDisksBlock property, and use this property everywhere else. This is not required for the LUKS check, so I would use udisks_linux_block_is_luks() everywhere in the code. So it could be confusing that I use these two check functions differently.
> >
> > Makes sense!
> >
> > > Now that I’m writing this and am forced to think about it again, I see that I could also scrap the is_tcrypt_candidate/@seems_encrypted@ property and only set id_type to "crypto_unknown", and then use a check similar to the LUKS check everywhere else.
> > >
> > > I think I’d prefer that, because this way I don’t introduce another exported property, and handle things more closely to how it is currently done.
> >
> > Agreed!
> >
> > > Maybe I should then also rename “crypto_unknown” to “crypto_TCRYPT_candidate”.
> >
> > Perhaps. But crypto_unknown seems more future-proof, since other headerless disk encryption formats could use the same mechanism.
> >
> > BTW, related: in udisks_linux_block_update (), I think you should remove the is_tcrypt_candidate boolean and just call udisks_linux_block_seems_encrypted() straight in the if’s conditional. After all, nothing “tcrypt”-specific going on so far. :)
>
> Agreed.
>
> > > And then I could also re-add udisks_linux_block_is_luks() and udisks_linux_block_is_tcrypt_candidate() functions, because then they are at least used in the same way (although point 1 still remains).
> > >
> > > What do you think?
> >
> > This sounds ideal to me.
>
> Ok, I will reimplement these functions.
>
> > > > (Meta: I think me, other reviewers and maintainers (and future you!) will be thankful if you spend just one minute extra on each commit writing a motivation/explanation for it.)
> > > >
> > > > If it turns out you remove this function, which you introduced, IMHO it would make the PR less noisy if you rewrote the history to hide the fact that this function ever existed. But let’s wait with this until the last minute!
> > >
> > > Yes, I would like to clean up the git history a bit anyway, and then I can also add a short motivation/explanation on each remaining commit (sorry, I probably should have done that before your review, but IIRC you also asked me to not rewrite git history if you already reviewed some of it).
> >
> > Fair enough! :)
> >
> > > > h3. commit bb01674d6b94a95a249fc49b63573a0f0608e42b
> > > >
> > > > > /* save old encryption type to be able to restore it */
> > > > > old_hint_encryption_type = udisks_encrypted_get_hint_encryption_type(encrypted);
> > > >
> > > > This only saves a pointer to the hint, not the data, so…
> > > >
> > > > > […]
> > > > > udisks_encrypted_set_hint_encryption_type(encrypted, “veracrypt”);
> > > > > else
> > > > > udisks_encrypted_set_hint_encryption_type(encrypted, “truecrypt”);
> > > >
> > > > … these will overwrite the value that old_hint_encryption_type also points to …
> > > >
> > > > > udisks_encrypted_set_hint_encryption_type(encrypted, old_hint_encryption_type);
> > > >
> > > > … so this is always a noop; it will read the current value of encrypted’s encryption hint, and then write it back.
> > > >
> > > > So you need to make old_hint_encryption_type point to a copy of the old value (preferably on the stack, possibly with a “variable length array” (ANSI C99)).
> > >
> > > This actually works as expected the way it currently is. I tested this again. When I try to unlock a TCRYPT candidate partition with an incorrect password, hint_encryption_type is first set to TCRYPT, and when the unlock fails, it is reset to the empty string.
> >
> > I am not so worried about what “works”, I am worried about undefined behavior. For example: if calling udisks_encrypted_set_hint_encryption_type() results in the old hint being free()d and a new one is allocated and set to the new value, and then the hint pointer (or whatever) in UDisksEncrypted is updated to point to this new hint, then what you do is undefined behavior: you rely on free()d memory to not be overwritten, which is often the case, but not always. For instance, running that code in Tails should fail because we enable the kernel memory poisoning feature (page_poison slub_debug=P) so free()d memory is immediately overwritten.
>
> If I’m not mistaken, that would only work if the same address was re-allocated between when it was freed and when I access it again, otherwise it should result in a segfault.

There will be no segfault. free() only marks previously allocated memory as “available for allocation”, and has nothing to do with memory access permissions; malloc() must only allocate accessible memory, so any pointer returned by it is always accessible, even after free().

To be extra sure that I am not missing some recent security enhancement in GCC or the kernel I tried it (copy-paste friendly):

cat > noseg.c <<EOF 
#include <stdlib.h>
#include <stdio.h>
#include <string.h>
int main() {
  char *s = malloc(8);
  strcpy(s, "magic");
  printf("Before: %s\n", s);
  free(s);
  printf("After : %s\n", s);
}
EOF
gcc -o noseg noseg.c
./noseg


and I got:

Before: magic
After : 


i.e. no access violation. Like I predicted for Tails above, the value was zeroed because I have memory poisoning enabled on my system.

> > > I guess udisks_encrypted_get_hint_encryption_type() already creates a copy of the property (and handles freeing it when the UDisksEncrypted object is freed).
> >
> > Possibly, but this is something that has to be carefully verified if relied upon, and clearly commented.
>
> Agreed. I took a look at what udisks_encrypted_get_hint_encryption_type() does, and it uses g_dbus_proxy_get_cached_property(), which “Returns a reference to the #GVariant instance that holds the value for property_name or %NULL if the value is not in the cache.”. So what old_hint_encryption_type points to is a cache, which means that it is probably safe to use it as long as we don’t call udisks_encrypted_get_hint_encryption_type() again. Still, it might not be the best idea to rely on an invalid cache, so let’s just use the duplicating function.

Eh, well that explains it. And yes, let’s not rely on invalid cache data! IMHO this is a bug in udisks, with the fix being that the setter invalidates the affected cache entry.

> > And now I noticed something I should have noticed in my previous review:
> >
> > […]
> >
> > This pattern can (and should!) always be replaced with simply return ${conditional}.
>
> I generally agree, but again, this is not how it is done in some many other gboolean type functions in the code. They always return the TRUE or FALSE macro. Do you think I should deviate from that?

Ah, don’t do it then. IMHO it would be nicer to use some int_to_gboolean macro/function to cast the value correctly, but my quick search didn’t find anything.

> Oh, and if you find my implementation overly complicated, look at this beautiful piece of code ;)
> […]

“In the future we potentially will have complex error handling here. Let’s just be done with it and introduce the complexity from the beginning. No easy mode!” :)

#15 Updated by segfault 2018-01-30 23:20:24

anonym wrote:
> segfault wrote:
> > If I’m not mistaken, that would only work if the same address was re-allocated between when it was freed and when I access it again, otherwise it should result in a segfault.
>
> There will be no segfault. free() only marks previously allocated memory as “available for allocation”, and has nothing to do with memory access permissions; malloc() must only allocate accessible memory, so any pointer returned by it is always accessible, even after free().
>
> To be extra sure that I am not missing some recent security enhancement in GCC or the kernel I tried it (copy-paste friendly):
> […]
> and I got:
> […]
> i.e. no access violation. Like I predicted for Tails above, the value was zeroed because I have memory poisoning enabled on my system.

Ok, you were right. And it makes sense, of course, because the page in the heap is still mapped (unless you freed enough memory to unmap up a whole page, in which case it would result in a segfault).

> > > > I guess udisks_encrypted_get_hint_encryption_type() already creates a copy of the property (and handles freeing it when the UDisksEncrypted object is freed).
> > >
> > > Possibly, but this is something that has to be carefully verified if relied upon, and clearly commented.
> >
> > Agreed. I took a look at what udisks_encrypted_get_hint_encryption_type() does, and it uses g_dbus_proxy_get_cached_property(), which “Returns a reference to the #GVariant instance that holds the value for property_name or %NULL if the value is not in the cache.”. So what old_hint_encryption_type points to is a cache, which means that it is probably safe to use it as long as we don’t call udisks_encrypted_get_hint_encryption_type() again. Still, it might not be the best idea to rely on an invalid cache, so let’s just use the duplicating function.
>
> Eh, well that explains it. And yes, let’s not rely on invalid cache data! IMHO this is a bug in udisks, with the fix being that the setter invalidates the affected cache entry.
>
> > > And now I noticed something I should have noticed in my previous review:
> > >
> > > […]
> > >
> > > This pattern can (and should!) always be replaced with simply return ${conditional}.
> >
> > I generally agree, but again, this is not how it is done in some many other gboolean type functions in the code. They always return the TRUE or FALSE macro. Do you think I should deviate from that?
>
> Ah, don’t do it then. IMHO it would be nicer to use some int_to_gboolean macro/function to cast the value correctly, but my quick search didn’t find anything.

We don’t have to cast int to gboolean, at least the compiler doesn’t complain if I just return 0 or 1.

> > Oh, and if you find my implementation overly complicated, look at this beautiful piece of code ;)
> > […]
>
> “In the future we potentially will have complex error handling here. Let’s just be done with it and introduce the complexity from the beginning. No easy mode!” :)

That’s probably it :D

#16 Updated by segfault 2018-01-30 23:46:43

I just pushed fixes for the discussed issues.

segfault wrote:
> anonym wrote:
> > segfault wrote:
> > > > This pattern can (and should!) always be replaced with simply return ${conditional}.
> > >
> > > I generally agree, but again, this is not how it is done in some many other gboolean type functions in the code. They always return the TRUE or FALSE macro. Do you think I should deviate from that?
> >
> > Ah, don’t do it then. IMHO it would be nicer to use some int_to_gboolean macro/function to cast the value correctly, but my quick search didn’t find anything.
>
> We don’t have to cast int to gboolean, at least the compiler doesn’t complain if I just return 0 or 1.

I found gboolean functions in glib which also simply return the result of logical operators, so I think it’s fine if we also do this. I implemented it and it works without compiler complaints.

#17 Updated by anonym 2018-02-01 10:18:28

segfault wrote:
> I just pushed fixes for the discussed issues.

All looks good until commit:c3c2cc883a7545ef3a351119ed3cf2b9379946c4!

> segfault wrote:
> > anonym wrote:
> > > segfault wrote:
> > > > > This pattern can (and should!) always be replaced with simply return ${conditional}.
> > > >
> > > > I generally agree, but again, this is not how it is done in some many other gboolean type functions in the code. They always return the TRUE or FALSE macro. Do you think I should deviate from that?
> > >
> > > Ah, don’t do it then. IMHO it would be nicer to use some int_to_gboolean macro/function to cast the value correctly, but my quick search didn’t find anything.
> >
> > We don’t have to cast int to gboolean, at least the compiler doesn’t complain if I just return 0 or 1.
>
> I found gboolean functions in glib which also simply return the result of logical operators, so I think it’s fine if we also do this.

That sounds like what I was looking for! (I used the term “cast” very loosely, I just wanted a compat layer between C “booleans” (i.e. unsigned int) and gboolean.

> I implemented it and it works without compiler complaints.

In my HEAD the result of && is returned, e.g. unsigned int, i.e. I don’t see this function used.

#18 Updated by segfault 2018-02-01 16:41:41

> All looks good until commit:c3c2cc883a7545ef3a351119ed3cf2b9379946c4!

Great! :)

> > I found gboolean functions in glib which also simply return the result of logical operators, so I think it’s fine if we also do this.
>
> That sounds like what I was looking for! (I used the term “cast” very loosely, I just wanted a compat layer between C “booleans” (i.e. unsigned int) and gboolean.
>
> > I implemented it and it works without compiler complaints.
>
> In my HEAD the result of && is returned, e.g. unsigned int, i.e. I don’t see this function used.

There seems to be a misunderstanding. What I was trying to say is that I found other functions of type gboolean which don’t give a crap about treating results of logical operators (i.e. unsigned int) any different than gboolean, so I decided to do the same.

#19 Updated by anonym 2018-02-02 16:56:11

segfault wrote:
> (unless you freed enough memory to unmap up a whole page, in which case it would result in a segfault).

Full disclosure: I am ignorant of how Linux’ virtual memory model actually works in practice, so I’d be delighted if you could elaborate on your above statement. I have (apparently incorrectly?) assumed that if a process can dereference a pointer p, then p will remain dereferencible for the duration of the process.

> > > I found gboolean functions in glib which also simply return the result of logical operators, so I think it’s fine if we also do this.
> >
> > That sounds like what I was looking for! (I used the term “cast” very loosely, I just wanted a compat layer between C “booleans” (i.e. unsigned int) and gboolean.
> >
> > > I implemented it and it works without compiler complaints.
> >
> > In my HEAD the result of && is returned, e.g. unsigned int, i.e. I don’t see this function used.
>
> There seems to be a misunderstanding. What I was trying to say is that I found other functions of type gboolean which don’t give a crap about treating results of logical operators (i.e. unsigned int) any different than gboolean, so I decided to do the same.

Ah. Yeah, let’s not care that much about being future-proof against crazy compiler updates or C standards changes. :)

#20 Updated by sajolida 2018-02-04 20:46:44

  • Assignee changed from sajolida to segfault
  • QA Check deleted (Info Needed)

Regarding the “System partition” and “Hidden volume” check boxes, I have no strong opinion. We can also see what GNOME says and these would be easy candidates for removal if GNOME thinks that we have too much options in our interface :)

#21 Updated by segfault 2018-02-08 10:43:48

  • Assignee changed from segfault to anonym
  • QA Check set to Ready for QA

I forgot to add support for TCRYPT system volumes. See commit 87189211b24f32ed59ca0665341ac0c94ea43e29.

#22 Updated by anonym 2018-02-08 10:54:21

segfault wrote:
> I forgot to add support for TCRYPT system volumes. See commit 87189211b24f32ed59ca0665341ac0c94ea43e29.

I see no new commits, so I suspect you forgot to push your branch.

#23 Updated by anonym 2018-02-08 13:32:23

  • Assignee changed from anonym to segfault
  • QA Check deleted (Ready for QA)

anonym wrote:
> segfault wrote:
> > I forgot to add support for TCRYPT system volumes. See commit 87189211b24f32ed59ca0665341ac0c94ea43e29.
>
> I see no new commits, so I suspect you forgot to push your branch.

Now I do! LGTM!

#24 Updated by segfault 2018-02-09 20:04:32

  • Assignee changed from segfault to anonym
  • QA Check set to Ready for QA

I cleaned up the git history and added support for VeraCrypt’s PIM value.

#25 Updated by anonym 2018-02-11 16:46:45

  • Assignee changed from anonym to segfault
  • QA Check deleted (Ready for QA)

LGTM!

#27 Updated by bertagaz 2018-03-14 11:32:19

  • Target version changed from Tails_3.6 to Tails_3.7

#28 Updated by segfault 2018-04-15 11:28:18

  • Status changed from In Progress to Resolved

The patches are ready, next step is upstreaming (Feature #15220).