Discussion:
[dm-crypt] LUKS header recovery attempt from apparently healthy SSD
protagonist
2017-04-21 14:26:30 UTC
Permalink
Hello all,
someone found his way into our local hackerspace looking for help and
advice with recovering his OS partition from a LUKS-encrypted INTEL SSD
(SSDSC2CT240A4), and I've decided to get onto the case. Obviously,
there is no backup, and he's aware of the consequences of this basic
mistake by now.

The disk refused to unlock on boot in the original machine from one day
to the next. Opening it from any of several other machines with
different versions of Ubuntu/Debian, including Debian Stretch with a
recent version of cryptsetup, has been completely unsuccessful,
indicating an MK digest mismatch and therefore a "wrong password". The
password is fairly simple and contains no special characters or
locale-sensitive characters and had been written down. Therefore I
assume it is known correctly and the header must be partially faulty.

After reading the header specification, the FAQs, relevant recovery
threads on here as well as going through the header with a hex editor
and deducing some of its contents by hand, it is obvious to me that
losing any significant portion (more than a few bytes) of the relevant
LUKS header sections, either the critical parts of the meta-area or the
actual key slot, would make the device contents provably irrecoverable,
as even brute forcing becomes exponentially hard with the number of
missing pseudo-randomly distributed bits.

Normally, one would move directly to grief stage number five -
"Acceptance" - if the storage device in question was known to have data
loss.

However, upon closer inspection, I can detect no obvious signs of
multiple-byte data loss. There had been no intentional changes to the
LUKS header, no Linux system upgrade, nor any other (known) relevant
event on the system between it booting one day and refusing to unlock
the day after. I realize that for *some* reason related to anti-forensics,
the LUKS header specification contains no checksum over the actual raw byte
fields at all, making it very hard to detect the presence of minor
defects in the header or to pinpoint their location.

Looking for major defects with the keyslot_checker reveals no obvious
problems:

parameters (commandline and LUKS header):
sector size: 512
threshold: 0.900000

- processing keyslot 0: keyslot not in use
- processing keyslot 1: start: 0x040000 end: 0x07e800
- processing keyslot 2: keyslot not in use
- processing keyslot 3: keyslot not in use
- processing keyslot 4: keyslot not in use
- processing keyslot 5: keyslot not in use
- processing keyslot 6: keyslot not in use
- processing keyslot 7: keyslot not in use

This is also the case if we increase the entropy threshold to -t 0.935:

parameters (commandline and LUKS header):
sector size: 512
threshold: 0.935000

- processing keyslot 0: keyslot not in use
- processing keyslot 1: start: 0x040000 end: 0x07e800
- processing keyslot 2: keyslot not in use
[...]

Going through the sectors reported with -v at a higher -t value, I'm
unable to find any suspicious groupings, for example unusual numbers of
00 00 or FF FF. Multi-byte substitution with a non-randomized pattern
seems unlikely.

------------------

The luksDump header information looks sane as well. The encryption had
been created by the Mint 17.1 installation in the second half of 2014 on
a fairly weak laptop and its password was later changed to a better one,
which accounts for the use of keyslot #1 and fairly low iteration counts.

LUKS header information for /dev/sda5

Version: 1
Cipher name: aes
Cipher mode: xts-plain64
Hash spec: sha1
Payload offset: 4096
MK bits: 512
MK digest: ff 5c 64 48 bc 1f b2 f2 66 23 d3 66 38 41 c9 60 8a 7e
de 0a
MK salt: 04 e3 04 8c 51 fd 07 ee d1 f3 4a 5e c1 8c b9 88
ab 0d cf dc 55 7c fa bc ca 1a b7 02 5a 55 ac 2c
MK iterations: 35125
UUID: 24e05704-f8ed-4391-9a3d-a59330a919d2

Key Slot 0: DISABLED
Key Slot 1: ENABLED
Iterations: 144306
Salt: b8 6f 20 a7 fe 8b 6a 9a 21 58 92 13 ce 1a 43 12 9c
4e a0 bf 7c 51 5e a1 78 47 05 ca b6 32 da a4
Key material offset: 512
AF stripes: 4000
Key Slot 2: DISABLED
Key Slot 3: DISABLED
Key Slot 4: DISABLED
Key Slot 5: DISABLED
Key Slot 6: DISABLED
Key Slot 7: DISABLED

The disabled key slot #0 salt is correctly filled with nulls, making
it unusable for any recovery attempt. All magic bytes of the key slots,
including slots 2 to 7, look good. The UUID is "version: 4 (random data
based)" according to uuid -d output and is therefore not of much help.
------------------

smartctl indicates fairly standard use for a 240GB desktop SSD, with
about 3.7TB written at 2650h runtime, 1 reallocated sector and 0
"Reported Uncorrectable Errors". The firmware version 335u seems to be
the latest available, from what I've read. smartctl tests with "-t
short", "-t offline" and "-t long" show no errors:
# 1  Extended offline   Completed without error   00%   2648   -
# 2  Offline            Completed without error   00%   2646   -
# 3  Short offline      Completed without error   00%   2572   -
The device also shows no issues during idle or read operations that
would hint at physical problems.

Checksumming the 240GB of data read blockwise from the device with dd
and sha512sum led to identical results on three runs, so the device
isn't mixing sectors or returning different content each time we ask
for the data.

All in all, the failure mode is still a mystery to me. I can think of
three main explanations:

I. silent data corruption events that have gone undetected by the
SSD-internal sector-wide checksumming, namely bit/byte level changes on
* MK salt / digest
* key slot #1 iterations count / salt
* key slot #1 AF stripe data

II. actual passphrase mistakes
* "constant" mistake or layout mismatch
This seems quite unlikely, as none of the characters change between a US
layout and the DE layout that was used. There are also no characters
that can be easily confused such as O/0.

III. some failure I've overlooked, like an OS-level bug or devilish
malware causing "intentional" writes to the first 2M of the drive.

Failure case #I is still the most likely, but from my understanding, a
four-digit number of system bootups and associated read events over the
lifetime of the header shouldn't be able to cause any kind of flash
wearout, let alone silent data corruption, unless the firmware is broken
in a subtle way. Assuming it is - what can be done besides brute-forcing the
AF section for bit flips?

I would be delighted to hear any advice or ideas for further tests to
narrow down whatever happened to this header.
Regards,
protagonist
David Christensen
2017-04-21 23:25:08 UTC
Permalink
Post by protagonist
someone found his way into our local hackerspace looking for help and
advice with recovering his OS partition from a LUKS-encrypted INTEL SSD
(SSDSC2CT240A4), and I've decided to get onto the case. Obviously,
there is no backup, and he's aware of the consequences of this basic
mistake by now.
Have you tested the drive with the Intel SSD Toolbox?

http://www.intel.com/content/www/us/en/support/memory-and-storage/ssd-software/intel-ssd-toolbox.html


David
Arno Wagner
2017-04-22 00:25:48 UTC
Permalink
Hi Protagonist,

this is an impressive analysis and I basically agree with
all of it.

Personally, I strongly suspect your option "I". The design
here is 5 years old and MLC. MLC requires the firmware to do
regular scanning, error correction and rewrites in order to
be reliable. 5 years ago, the state of the firmware for that
was more "experimental" than "stable".

For example, I have one old SSD from back then (OCZ trash),
that has silent single bit-errors on average in one of 5 full
reads. If such a bit-error happens on scrubbing or
garbage-collection or regular writes to a partial internal
(very large) sector, parts of the LUKS header may get rewritten
with a permanent bit-error, even if the LUKS header itself was
not written from outside at all.

Such corruption can of course also be due to a failing SSD
controller, bad RAM in the SSD, bus-problems, etc. In
particular, single-bit errors in an MLC-design will not
result from corrupted FLASH, but from other problems.

Now, are there any recovery options?

Assume 1 bit has been corrupted in a random place.
A key-slot is 256kB, i.e. 2Mbit. That means trying it
out (flip one bit, do an unlock attempt) would take
2 million seconds on the original PC, i.e. 23 days.
This can maybe be brought down by a factor of 5 or so
with the fastest available CPU (the iteration count of
150k is pretty low), i.e. still roughly 5 days.

This may be worth a try, but it requires some
serious coding with libcryptsetup and it will only
help on a single bit-error.

It may of course be a more complex error, especially
when ECC in the disk has corrected an error to the
wrong value, because the original was too corrupted.
A sane design prevents this by using a second,
independent checksum on the ECC result, but as I said,
5 years ago SSD design was pretty experimental and
beginner's mistakes were made.

The keyslot checker is no help here; it is intended
to find gross localized corruption, for example a
new MBR being written right into a keyslot. Checksums
on the LUKS level were not implemented because they are
not really needed, as classical HDDs are very good at
detecting read-errors. Unless you go to ZFS or the like,
filesystems do not do this either, for the same reasons.
There is one global "checksum" in LUKS though, exactly
the one that now tells you that there is no matching
keyslot - and on entry of a good passphrase, that means
the keyslot is corrupted.

My take is that apart from making absolutely sure
the passphrase is correct (it sounds very much like it
is though) and running the manufacturer's diagnostic
tools on the SSD, there is not much more you can do.

Regards,
Arno
Post by protagonist
[...]
--
Arno Wagner, Dr. sc. techn., Dipl. Inform., Email: ***@wagner.name
GnuPG: ID: CB5D9718 FP: 12D6 C03B 1B30 33BB 13CF B774 E35C 5FA1 CB5D 9718
----
A good decision is based on knowledge and not on numbers. -- Plato

If it's in the news, don't worry about it. The very definition of
"news" is "something that hardly ever happens." -- Bruce Schneier
Robert Nichols
2017-04-22 13:33:28 UTC
Permalink
Post by Arno Wagner
Assume 1 bit has been corrupted in a random place.
A key-slot is 256kB, i.e. 2Mbit. That means trying it
out (flip one bit, do an unlock attempt) would take
2 million seconds on the original PC, i.e. 23 days.
This can maybe be brought down by a factor of 5 or so
with the fastest available CPU (the iteration count of
150k is pretty low), i.e. still roughly 5 days.
This may be worth giving it a try, but it requires some
serious coding with libcryptsetup and it will only
help on a single bit-error.
It may of course be a more complex error, especially
when ECC in the disk has corrected an error to the
wrong value, because the original was too corrupted.
The drive would almost certainly have detected and corrected a single-bit error.
Post by Arno Wagner
The keyslot checker is no help here, it is intended
to find gross localized corruption,
It is still worth running the keyslot checker to detect gross corruption before spending 5+ days in a (probably futile) search for a single bit flip.
--
Bob Nichols "NOSPAM" is really part of my email address.
Do NOT delete it.
Arno Wagner
2017-04-22 13:45:58 UTC
Permalink
Post by Robert Nichols
[...]
The drive would almost certainly have detected and corrected a single-bit error.
Only when the error happened in FLASH. It can happen in
RAM or on a bus, and there it would not have been corrected.
It can even be a transient error (a charged cosmic particle
impacting a RAM cell, e.g.); these things happen.
Post by Robert Nichols
Post by Arno Wagner
The keyslot checker is no help here, it is intended
to find gross localized corruption,
It is still worth running the keyslot checker to detect gross corruption
before spending 5+ days in a (probably futile) search for a single bit
flip.
That has already been done. But I agree that
the chances for a single-bit error are not good.

Regards,
Arno
--
Arno Wagner, Dr. sc. techn., Dipl. Inform., Email: ***@wagner.name
GnuPG: ID: CB5D9718 FP: 12D6 C03B 1B30 33BB 13CF B774 E35C 5FA1 CB5D 9718
----
A good decision is based on knowledge and not on numbers. -- Plato

If it's in the news, don't worry about it. The very definition of
"news" is "something that hardly ever happens." -- Bruce Schneier
protagonist
2017-04-22 18:02:27 UTC
Permalink
Post by Arno Wagner
[...]
Only when the error happened in FLASH. It can happen in
RAM or on a bus, and there it would not have been corrected.
It can even be a transient error (a charged cosmic particle
impacting a RAM cell, e.g.); these things happen.
I agree: there is at least the remote possibility that data wasn't
adequately protected "during flight", such as during repair operations.

If the device really is telling the truth about its internal operations
and there has only been a single sector reallocation event on the
physical layer, the chances of this happening within our relevant 2Mbit
area on a 240GB disk are very slim for a start, and the lack of writes
anywhere near this area makes flash wearout extra unlikely, as described
before.
Yet if there was such an error, it would fit our scenario perfectly,
and it's not impossible that many errors have happened unnoticed and
unreported in other, less critical parts of the disk.

One thing to mention, perhaps for later readers with similar recovery
problems: if we knew the exact "logical" sector that was reallocated and
is thought to contain a flaw, we could limit our keyslot-bruteforcing
efforts to this area and reduce the effort by a factor of 500 (for an MK
size of 512 bits) for a given operation. Unfortunately, I don't see a way
to get the positions of reallocated sector(s) out of the drive. There is
no known serial debug header for low-level firmware access, as is
sometimes the case with HDDs. Perhaps there is an unofficial vendor
tool by Intel with verbose debugging output that can accomplish this,
but the available documentation for the Intel SSD Toolbox doesn't
mention anything beyond the SMART values (which only include error
counts), which is why I haven't bothered with this proprietary tool yet.

If I had actual bad blocks, the "long" SMART scan would have shown their
positions, and there might be the option to hammer the drive with the
badblocks utility to find the position of (further) defective spots,
similar to the memtest86+ utility for main memory. Given the current
health readings, I don't really think such a treatment will discover any
"helpful" clues about misbehavior in the first 2MB of the drive.
The March 2015 techreport.com test on SSD endurance suggests this drive
model+capacity can take about 600TB or more of writes before the reserved
sectors for wear leveling run out. I doubt heavy write loads will show
the issue we're looking for.

-----

I've spent some time researching likely bruteforce performance on
available machines and possible shortcuts.

Our bruteforce efforts are generally limited by the PBKDF2-sha1
iteration speed, which is hard to optimize - just as designed.
On an Intel Ivy Bridge i7 and cryptsetup 1.7.3 (Debian Stretch), the
"cryptsetup benchmark" counts about 1.25M iterations/s:
PBKDF2-sha1 1248304 iterations per second for 256-bit key

I've manually compiled https://github.com/mbroz/pbkdf2_cryptsetup_test
as well as cryptsetup itself to find possible improvements with
different crypto backends, gcc optimizations such as -Ofast and
-march=native, but I've been unable to improve on the 1.25M/s
number so far. OpenSSL beats gcrypt, but the default kernel backend
still seems faster.
(As a side note: what's up with the 9-year-old INSTALL file? It's no fun
scraping together the necessary libraries while deciphering
autogen.sh/configure errors!)
There might still be some side channel mitigations in the actual
implementation that could be omitted, as we don't really care about
that, but from what I read, this is much less of a low-hanging fruit for
SHA1 than it would be for AES. Also, the amount of data handled during
these iterations is small, which decreases the impact of optimized
memory access methods (cpu cache level hints) and negates others
(hugepages mmap).

Now, according to my current understanding, bruteforcing any meta-header
value or the actual key slot password requires us to iterate through the
following (a rough code sketch follows below):
* the key slot data (AF stripes) is kept fixed
1) perform 144306 PBKDF2-sha1 iterations
2) "decrypt" the AF sections using the resulting hash
3) do an AFmerge() to get the raw masterKeyCandidate
4) perform 35125 PBKDF2-sha1 iterations (at 160bit 'key' size) to
derive the digest of the masterKey
5) memcmp comparison against the digest to see if we've got a valid
master key.

See
https://gitlab.com/cryptsetup/cryptsetup/blob/master/docs/on-disk-format.pdf
for algorithm pseudocode.
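
To make the above concrete, here is a minimal C sketch of that flow. This
is only my own illustration, not cryptsetup's real internal API: the struct
and the helpers pbkdf2_sha1(), af_decrypt_area() and af_merge() are
hypothetical stand-ins, with the sizes and iteration counts taken from the
luksDump output above.

/* Hedged sketch of the keyslot-open flow (steps 1-5 above).  The
 * struct and the helpers pbkdf2_sha1(), af_decrypt_area() and
 * af_merge() are hypothetical stand-ins for cryptsetup internals,
 * kept here only to show the control flow. */
#include <stddef.h>
#include <string.h>

#define MK_BYTES     64      /* 512-bit master key */
#define DIGEST_BYTES 20      /* SHA-1 digest size  */
#define SALT_BYTES   32
#define AF_STRIPES   4000

struct luks_meta {                        /* relevant header fields only */
    unsigned char mk_digest[DIGEST_BYTES];
    unsigned char mk_salt[SALT_BYTES];
    unsigned int  mk_iterations;          /* 35125 in this header  */
    unsigned char slot1_salt[SALT_BYTES];
    unsigned int  slot1_iterations;       /* 144306 in this header */
};

/* hypothetical helpers standing in for cryptsetup internals */
void pbkdf2_sha1(const void *pw, size_t pwlen,
                 const unsigned char *salt, size_t saltlen,
                 unsigned int iterations,
                 unsigned char *out, size_t outlen);
void af_decrypt_area(const unsigned char *in, unsigned char *out,
                     size_t len, const unsigned char *key, size_t keylen);
void af_merge(const unsigned char *split, unsigned char *mk,
              size_t mklen, unsigned int stripes);

int try_passphrase(const char *pass, size_t passlen,
                   const struct luks_meta *hdr,
                   const unsigned char *keyslot_area) /* AF_STRIPES*MK_BYTES bytes */
{
    unsigned char slot_key[MK_BYTES], mk[MK_BYTES], digest[DIGEST_BYTES];
    static unsigned char split[AF_STRIPES * MK_BYTES];

    /* 1) derive the keyslot key from the passphrase (144306 iterations) */
    pbkdf2_sha1(pass, passlen, hdr->slot1_salt, SALT_BYTES,
                hdr->slot1_iterations, slot_key, MK_BYTES);
    /* 2) decrypt the AF area (aes-xts-plain64) with that key */
    af_decrypt_area(keyslot_area, split, sizeof(split), slot_key, MK_BYTES);
    /* 3) merge the 4000 stripes back into a master-key candidate */
    af_merge(split, mk, MK_BYTES, AF_STRIPES);
    /* 4) hash the candidate with the MK salt and iteration count (35125) */
    pbkdf2_sha1(mk, MK_BYTES, hdr->mk_salt, SALT_BYTES,
                hdr->mk_iterations, digest, DIGEST_BYTES);
    /* 5) compare against the MK digest stored in the header */
    return memcmp(digest, hdr->mk_digest, DIGEST_BYTES) == 0;
}

Note that step 1) depends only on the passphrase and the keyslot salt, so
its result can be computed once and cached when only the AF key material
varies - which is exactly the shortcut discussed next.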

Please correct me if I'm wrong, but while looking at the second case of
bruteforcing some bit changes in the key slot data, I've noticed that
things are likely 5 times faster:
1) the key slot data (AF stripes) is changed in cache/RAM
* no meta-header value has changed, so a pre-computed version of the
144306 PBKDF2-sha1 iterations can be used. The keyslot iteration count is
therefore largely irrelevant here.
2) "decrypt" the AF sections using the resulting hash
3) do an AFmerge to get the raw masterKeyCandidate
4) perform 35125 PBKDF2-sha1 iterations (at 160bit 'key' size) to
derive the digest of the masterKey
5) memcmp comparison against the digest to see if we've got a valid
master key.

This general speedup in the second case is irrelevant for attacks
against the password or salt of a key slot, but it might help significantly
in our case, as performance might increase by a factor of five, to something
in the order of 35 tries per second and core for pure sha1 iteration
performance (if we disregard the cost of all other computations). If this
scales well to all four real cores, or even all logical eight with the
available hyperthreading, then the overall runtime for a single-bit test
might be well within a day.

One could go even further and replace the time-consuming steps 5) and 6)
with a routine that decrypts an encrypted part of the disk with the
masterKeyCandidate and compares it to a known plaintext (or a more
elaborate heuristic, such as several entropy checks looking for
"correctly" decryptable areas on disk leading to low-entropy output),
which might be a lot faster given AES-NI support and AES-xts throughput
speeds, but since we don't know much about actual disk content, this
seems to be too much of a gamble to be worth the effort at this point.
It's easily thinkable one tests against several areas containing a
well-compressed movie or a similarly high-entropy area, leading to a
false-negative and missing the only correct master key.

There is also another shortcut I can think of: checking for multi-byte
errors in the MK digest is as easy as doing one default cryptsetup run
with the correct password + unmodified key slot and comparing the
computed MK digest to the value on disk.

For this, one can patch the LUKS_verify_volume_key() function in
lib/luks1/keymanage.c to print the buffer values before the memcmp() call:
LUKS_verify_volume_key checkHashBuf:
e6 8e 79 bf a5 51 cc fe 7d 12 4c 4c 8d 46 d3 6c ae 30 0d 28
LUKS_verify_volume_key hdr->mkDigest:
ff 5c 64 48 bc 1f b2 f2 66 23 d3 66 38 41 c9 60 8a 7e de 0a

As one can see, hdr->mkDigest is the MK digest as stored on disk, read
out in the same fashion as the luksDump output noted in my last post. The
master key digest of the key candidate derived from the current keyslot
data and the supposedly correct password does not match in even one byte
position (!), ruling out a partially corrupted MK digest field in the
meta-header data.
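
For reference, the debug patch itself can be as small as a hex-dump helper
called right before the comparison. This is only an illustrative sketch -
dump_hex() is my own helper, not upstream code; checkHashBuf, hdr->mkDigest
and LUKS_DIGESTSIZE are the names used in lib/luks1/keymanage.c:

/* Illustrative sketch only: dump_hex() is my own helper, not part of
 * cryptsetup.  The two calls go into LUKS_verify_volume_key() in
 * lib/luks1/keymanage.c, right before its memcmp(); checkHashBuf and
 * hdr->mkDigest are the buffers shown above, LUKS_DIGESTSIZE is the
 * 20-byte SHA-1 digest size. */
#include <stdio.h>

static void dump_hex(const char *label, const unsigned char *buf, size_t len)
{
        size_t i;

        printf("%s:\n", label);
        for (i = 0; i < len; i++)
                printf("%02x ", buf[i]);
        printf("\n");
}

/* ... then, inside LUKS_verify_volume_key(), before the comparison:
 *
 *      dump_hex("LUKS_verify_volume_key checkHashBuf",
 *               (const unsigned char *)checkHashBuf, LUKS_DIGESTSIZE);
 *      dump_hex("LUKS_verify_volume_key hdr->mkDigest",
 *               (const unsigned char *)hdr->mkDigest, LUKS_DIGESTSIZE);
 */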

Regards,
protagonist
protagonist
2017-04-23 20:03:28 UTC
Permalink
Dominic Raferd
2017-04-24 05:50:01 UTC
Permalink
Post by protagonist
I've manually compiled
[...]
This is pretty impressive stuff to someone like me who is new to dm-crypt.
But I wondered if the chances of the passphrase being misrecorded or
misread have been fully considered. In your OP you wrote: 'The password is
fairly simple and contains no special characters or locale-sensitive
characters and had been written down... none of the characters change
between a US layout and the DE layout that was used. There are also no
characters that can be easily confused such as O/0.'

I note the 'written down' but if by this you meant 'recorded in a Word
document', say, then perhaps a capitalisation error has crept in. By far
the most likely error is that the first character is recorded as capitalised
when it isn't (as Word likes to capitalise the letter at the beginning of a
sentence). Other possibilities include an extra space or spaces (at the
beginning or end?), or a period being read as part or not part of the
passphrase. It would also be worth re-reviewing the possibility that some
characters have been confused - if the passphrase was written down by hand
the chances greatly increase. And to be quite sure it isn't a keyboard
issue, can you try with a DE keyboard?

As it happens a single capitalisation error would be picked up by a brute
force method that tests for a single bit flip...
protagonist
2017-04-24 13:26:20 UTC
Permalink
Post by Dominic Raferd
Post by protagonist
I've manually compiled
[...]
This is pretty impressive stuff to someone like me who is new to
dm-crypt.
Thanks.
But I wondered if the chances of the passphrase being
misrecorded or misread have been fully considered.
You make a good point, but as the password has been written down on
paper the old-fashioned way, I have decided to take it as a "known good"
value.
One can speculate about the password being wrong on paper, or some
laptop-specific oddity, but as the owner had been entering it daily for
more than a year, I don't think a simple single-character swap for
neighboring keys or capitalization changes will help. In other
situations, they might, and bruteforce complexity only grows linearly
with the number of changes and password length, respectively, if one
looks for a single error, so it's definitely something to consider for
passwords that can't be remembered perfectly.
As it happens a single capitalisation error would be picked up by a
brute force method that tests for a single bit flip...
This is not the case for any of the bit error tests discussed earlier,
as they concern the necessary "decryption ingredients" on disk where bit
errors may have occurred, which of course don't include the password itself.

Regards,
protagonist
Dominic Raferd
2017-04-24 17:00:27 UTC
Permalink
Post by protagonist
as the password has been written down on
paper the old-fashioned way, I have decided to take it as a "known good"
value.
One can speculate about the password being wrong on paper, or some
laptop-specific oddity, but as the owner had been entering it daily for
more than a year, I don't think a simple single-character swap for
neighboring keys or capitalization changes will help. In other
situations, they might, and bruteforce complexity only grows linearly
with the number of changes and password length, respectively, if one
looks for a single error, so it's definitely something to consider for
passwords that can't be remembered perfectly.
You seem to have considered the options pretty thoroughly. If the original
owner has come to you - so s/he knows they have been typing in the same
passphrase until one day it stopped working - and they have told you that
passphrase, then an error in recording the passphrase can be discounted. If
the situation is otherwise then a wrong passphrase still seems to me more
likely than a corrupted LUKS header, especially when everything you can
test on the disk seems ok.

Is there any possibility that a malicious third party (disgruntled
ex-sysadmin perhaps) gained root access to the machine during its last
session and changed the passphrase? As an aside, of no help to OP I'm
afraid: is a prior backup of the LUKS header a protection against this
scenario (i.e. against a subsequently deleted, or changed and now unknown,
passphrase)?
Michael Kjörling
2017-04-24 17:44:04 UTC
Permalink
Post by Dominic Raferd
Is there any possibility that a malicious third party (disgruntled
ex-sysadmin perhaps) gained root access to the machine during its last
session and changed the passphrase?
Does that not require knowledge of a current passphrase? I believe it
does. Which of course said third party _could_ have.
Post by Dominic Raferd
As an aside, of no help to OP I'm afraid: is a prior backup of the
LUKS header a protection against this scenario (i.e. against a
subsequently deleted, or changed and now unknown, passphrase)?
Yes. A copy of the LUKS header and a passphrase that was valid at the
time the header copy was made will allow access, as long as the master
key is unchanged (no cryptsetup-reencrypt in the interim). The only
way to mitigate this threat AFAIK is to change the master key of the
container.
--
Michael Kjörling • https://michael.kjorling.se • ***@kjorling.se
“People who think they know everything really annoy
those of us who know we don’t.” (Bjarne Stroustrup)
protagonist
2017-04-24 23:49:46 UTC
Permalink
Post by Dominic Raferd
You seem to have considered the options pretty thoroughly. If the
original owner has come to you - so s/he knows they have been typing in
the same passphrase until one day it stopped working - and they have
told you that passphrase, then an error in recording the passphrase can
be discounted.
This is the case. The disk had been used in a privately owned laptop.
Is there any possibility that a malicious third party (disgruntled
ex-sysadmin perhaps) gained root access to the machine during its last
session and changed the passphrase? As an aside, of no help to OP I'm
afraid: is a prior backup of the LUKS header a protection against this
scenario (i.e. against a subsequently deleted, or changed and now
unknown, passphrase)?
If a malicious program had been able to run as root and deliberately
wrote into the LUKS header sectors to corrupt them, it definitely did so
in a very "plausible" fashion in terms of writing pseudo-random values
in the allowed areas. Given the fact that this was a fairly low-value
target, I doubt there was any reason to do this in such a "stealthy"
fashion if making the disk unusable had been the intention of such a
hypothetical malware. It's basically impossible to find out at this
point whether or not that was the case, but it's a scary thought that
should make everyone do header backups.

Regarding a "change" of passphrases:
A program would need access to the master key of the disk to create a
new, working key slot. As far as I know, a valid passphrase would be
needed during the normal cryptsetup procedures to open one of the
existing key slots, extract the master key and build the new keyslot
data containing a new copy of the master key.
However, I assume it is likely that a determined attacker running as
root might be able to extract the master key from RAM if the encrypted
volume in question is still open at the time of attack, so technically,
there would be a way to do this without the password.

I've asked the owner about mnemonics for the password, and they indeed
checked out, so I'd consider the passphrase integrity question as
settled in this case.

Regards,
protagonist
Robert Nichols
2017-04-25 13:14:52 UTC
Permalink
Post by protagonist
However, I assume it is likely that a determined attacker running as
root might be able to extract the master key from RAM if the encrypted
volume in question is still open at the time of attack, so technically,
there would be a way to do this without the password.
It's trivial. Just run "dmsetup table --showkeys" on the device.
--
Bob Nichols "NOSPAM" is really part of my email address.
Do NOT delete it.
Dominic Raferd
2017-04-25 13:44:50 UTC
Permalink
Post by Robert Nichols
[...]
It's trivial. Just run "dmsetup table --showkeys" on the device.
Wowzer. 'cryptsetup luksDump <device> --dump-master-key' can also provide
this info but it requires a passphrase, which 'dmsetup table --showkeys'
does not. So must we assume that anyone who has ever had root access while
the encrypted device is mounted can thereafter ​break through the
encryption regardless of passphrases? At least until cryptsetup-reencrypt
is run on the device, which is a big step.
Robert Nichols
2017-04-25 14:37:19 UTC
Permalink
Post by Dominic Raferd
Wowzer. 'cryptsetup luksDump <device> --dump-master-key' can also provide this info but it requires a passphrase, which 'dmsetup table --showkeys' does not. So must we assume that anyone who has ever had root access while the encrypted device is mounted can thereafter break through the encryption regardless of passphrases? At least until cryptsetup-reencrypt is run on the device, which is a big step.
It's in the FAQ, section 6.10, so not really a great revelation.

BTW, it's "--showkey", not "--showkeys". Minor typo there, sorry.

Also, anyone who has had access to the device has had the ability to save a copy of the LUKS header, so the ability to revoke passphrases really isn't as great as it's cracked up to be.
--
Bob Nichols "NOSPAM" is really part of my email address.
Do NOT delete it.
Robert Nichols
2017-04-25 14:43:50 UTC
Permalink
Post by Robert Nichols
BTW, it's "--showkey", not "--showkeys". Minor typo there, sorry.
Upon review, "dmsetup" accepts option abbreviations as long as there is no ambiguity, so just "dmsetup table --sh" is sufficient.
--
Bob Nichols "NOSPAM" is really part of my email address.
Do NOT delete it.
Ondrej Kozina
2017-04-25 14:45:48 UTC
Permalink
Post by Robert Nichols
BTW, it's "--showkey", not "--showkeys". Minor typo there, sorry.
In fact, it doesn't matter. As long as it's a unique substring of
"--showkeys" (unique wrt other --options known to dmsetup), dmsetup
accepts an even shorter version, e.g. --showk

my 2 cents:)
Sven Eschenberg
2017-04-25 16:16:17 UTC
Permalink
Post by Dominic Raferd
Wowzer. 'cryptsetup luksDump <device> --dump-master-key' can also
provide this info but it requires a passphrase, which 'dmsetup table
--showkeys' does not. So must we assume that anyone who has ever had
root access while the encrypted device is mounted can thereafter break
through the encryption regardless of passphrases? At least until
cryptsetup-reencrypt is run on the device, which is a big step.
Furthermore, everyone who had access to /dev/mem and was able to locate
the keys knows them. On second thought, this certainly holds true for
the 'new central kernel key storage' (I forgot the name) as well,
depending on the overall kernel configuration and userspace, that is.

At the end of the day, dm-crypt (etc.) needs to store the key somewhere
where it can be accessed at all times when an IO request comes in. There
are not that many options for that ;-).
Milan Broz
2017-04-25 16:30:00 UTC
Permalink
Post by Sven Eschenberg
Furthermore, everyone who had access to /dev/mem and was able to locate
the keys knows them. On second thought, this certainly holds true for
the 'new central kernel key storage' (I forgot the name) as well,
depending on the overall kernel configuration and userspace, that is.
At the end of the day, dm-crypt (etc.) needs to store the key somewhere
where it can be accessed at all times when an IO request comes in. There
are not that many options for that ;-).
Crypto API stores the key in memory as well (even the round keys etc), obviously.

We already have support for the kernel keyring in dm-crypt (so the key will
not be directly visible in the dmsetup table); this will be supported in the
next major version of cryptsetup/LUKS.

But as you said, if you have access to the kernel memory, it is there anyway...

Milan
Sven Eschenberg
2017-04-25 17:09:35 UTC
Permalink
Post by Milan Broz
[...]
Crypto API stores the key in memory as well (even the round keys etc), obviously.
We already have support for the kernel keyring in dm-crypt (so the key will
not be directly visible in the dmsetup table); this will be supported in the
next major version of cryptsetup/LUKS.
But as you said, if you have access to the kernel memory, it is there anyway...
Milan
Ah, thanks Milan, kernel keyring it is called. Anyhow, the only solution
would be to store the key in some device and retrieve it for IO ops,
but then again, it would make more sense to pass the IO blocks to that
(secured blackbox) device. That would in turn mean that such a device
needs computational power and massive IO bandwidth.

Maybe crypto acceleration cards with PCIe3 and 8+ lanes would be an
option, if they provide secured keyring storage etc. I am thinking
of something like the Intel QA 8950 with respect to the concept. (The
QA 8950 aims rather at communication streams, AFAIK, I am not sure how
keys are handled, i.e. if they are passed into the adapter during engine
initialization or if an additional permanent secured keyring service is
offered, or if the key needs to be passed in for every block together
with the data)

And yes, I know, it would increase the IO Latency a bit, but offload the
CPU at the same time.

Regards

-Sven
Hendrik Brueckner
2017-04-26 14:45:03 UTC
Permalink
Hi Sven,
Post by Sven Eschenberg
[...]
Ah, thanks Milan, kernel keyring it is called. Anyhow, the only
solution would be to store the key in some device and retrieve it
for IO ops, but then again, it would make more sense to pass the IO
blocks to that (secured blackbox) device. That would in turn mean
that such a device needs computational power and massive
IO bandwidth.
a colleague of mine and I investigated this kind of topic. For strong
security, having the clear key accessible in memory is not an option.
Of course, the alternative is to deal with hardware security modules (HSMs)
that perform the cryptographic operations, with the clear key never leaving
the HSM.

We worked on this area and provided some cryptsetup enhancements to support
wrapped keys for disk encryption to prevent having keys in clear text in the
memory of the operating system. Recently, we submitted this merge request:

https://gitlab.com/cryptsetup/cryptsetup/merge_requests/19

Basically, it seamlessly integrates support for ciphers that can use wrapped
keys instead of clear keys.

For Linux on z Systems (our background), there is a tamper-safe hardware
security module (HSM) that provides "secure keys". Secure keys are BLOBs
containing the clear key wrapped with a master key of the HSM. Of course,
the overhead is typically considerable because each cryptographic operation
comes at the cost of an I/O operation to the HSM. However, the z Systems
firmware provides acceleration for this by re-wrapping a secure key to a
protected key (that is valid for the Linux instance (LPAR) only). Then,
you can use some special instructions to perform, for example, AES with a
protected key at CPU speed. In both cases, the clear key resides in the
HSM/firmware only and is exposed to the OS in a wrapped form only.

The merge request above also introduces this protected-AES (paes) as a
sample wrapped-key cipher. (paes itself is an in-kernel crypto module.)
Post by Sven Eschenberg
Maybe crypto acceleration cards with PCIe3 and 8+ Lanes would be an
option, if they provide a secured keyring storage etc. . I am
thinking of something like the Intel QA 8950 with respects to the
concept. (The QA 8950 aims rather at communication streams, AFAIK, I
am not sure how keys are handled, i.e. if they are passed into the
adapter during engine initialization or if an additional permanent
secured keyring service is offered, or if the key needs to be passed
in for every block together with the data)
I am not familiar with the QA 8950, but a similar approach to what we did
with paes might be possible. Perhaps another kind of wrapped-key cipher
would fit into the concept.

Thanks and kind regards,
Hendrik
--
Hendrik Brueckner
***@linux.vnet.ibm.com | IBM Deutschland Research & Development GmbH
Linux on z Systems Development | Schoenaicher Str. 220, 71032 Boeblingen


IBM Deutschland Research & Development GmbH
Vorsitzender des Aufsichtsrats: Martina Koederitz
Geschaeftsfuehrung: Dirk Wittkopp
Sitz der Gesellschaft: Boeblingen
Registergericht: Amtsgericht Stuttgart, HRB 243294
Milan Broz
2017-04-26 18:46:38 UTC
Permalink
Post by Hendrik Brueckner
Hi Sven,
[...]
a colleague of mine and I investigated this kind of topic. For strong
security, having the clear key accessible in memory is not an option.
Of course, the alternative is to deal with hardware security modules (HSMs)
that perform the cryptographic operations, with the clear key never leaving
the HSM.
We worked on this area and provided some cryptsetup enhancements to support
wrapped keys for disk encryption to prevent having keys in clear text in the
memory of the operating system. Recently, we submitted this merge request:
https://gitlab.com/cryptsetup/cryptsetup/merge_requests/19
It would be better if you started a new thread about this, because this
thread is really about something else.

Anyway, I will handle that merge request later, but in short - your
approach works only on IBM z Systems (no other system implements this
wrapped encryption, so it is very platform specific).

LUKS1 is a portable format; we cannot bind the format to specific hardware.

So do not expect this to be merged as it is, and specifically not into the
LUKS1 format, which I consider stable and whose major advantage is
portability among all possible Linux distributions and architectures.

Anyway, the discussion could be interesting. But I do not think
the mainframe approach can be applied to the low-end systems where this
kind of FDE is mostly used. The FDE threat model is also usually focused
on offline attacks (a stolen disk), so here we do not need to care
whether the key is in memory while the system is online.

Milan
Post by Hendrik Brueckner
[...]
protagonist
2017-04-28 15:51:25 UTC
Permalink
Good news:
my improvised simulate-AF-bitflip-and-decode bruteforce program based on
cryptsetup core code has been working since yesterday. After finding several
issues in my initial code, including a failing call to the kernel crypto
API asking for an aes "xts-plain64" cipher (instead of the correct aes
"xts" version) that had me scratching my head and looking at strace
output for a while, I've managed to run it successfully against a
specially corrupted LUKS-encrypted disk containing an ext4 filesystem
(the first 512 bytes are guaranteed to be 0x00, as noted before) and
detected the error that was deliberately written to one of the AF
keyslot bits.
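
The outer loop of such a tool can stay very simple. The sketch below is
only my own rendering of the approach, not the actual program:
derive_and_check() is a hypothetical helper that re-runs the AF decryption
and AF_merge() on the modified in-RAM copy (reusing the cached first-stage
PBKDF2 result, since passphrase and salt never change) and then applies the
known-plaintext test.

/* Hedged sketch of the AF bit-flip brute force described above, not
 * the actual program.  keyslot is an in-RAM copy of the ~256kB key
 * material; derive_and_check() is a hypothetical helper that runs
 * AF-decrypt + AF_merge() on the (modified) copy - reusing the cached
 * first-stage PBKDF2 result - and tests the resulting master-key
 * candidate against the known plaintext. */
#include <stddef.h>
#include <stdio.h>

#define KEYSLOT_BYTES (4000 * 64)   /* 4000 AF stripes of a 512-bit key */

int derive_and_check(const unsigned char *keyslot, size_t len);  /* hypothetical */

long find_single_bitflip(unsigned char *keyslot)
{
    for (size_t byte = 0; byte < KEYSLOT_BYTES; byte++) {
        for (int bit = 0; bit < 8; bit++) {
            keyslot[byte] ^= (unsigned char)(1u << bit);   /* simulate the flip */
            if (derive_and_check(keyslot, KEYSLOT_BYTES)) {
                printf("candidate found: byte %zu, bit %d\n", byte, bit);
                return (long)(byte * 8 + bit);
            }
            keyslot[byte] ^= (unsigned char)(1u << bit);   /* undo the flip     */
        }
    }
    return -1;   /* no single-bit repair leads to a valid master key */
}

Splitting the byte range across several threads gives the per-core rates
mentioned below.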

A run against the actual target device was unsuccessful, which could
have several reasons, including the fact that the LVM header might not
start with 512 bytes of 0x00, which is part of the decryption check
currently used. I will add a test against the magic "LABELONE" bytes of
the LVM-specific label header, but given the presence of ECC on most SSDs,
it's very unlikely I'll actually manage to find any undetected simple
error pattern and recover the disk data.

--

The brute force speed yesterday was 250 AF-keyslot changes per second
and core against the target.
Today I've managed to further improve this to about 285 iterations per
second and core by limiting the first-stage decode of the AF-keyslot
sectors to exactly those that were influenced by the bit flip. These
values are for aes-xts plain64, sha1 and a 512 bit masterkey.

Given that a 512-bit masterkey leads to 4000 * 512 bits of AF-keyslot
data, this gives (4000⋅512)/(285⋅60⋅4) = ~30 minutes of total bruteforce
duration on a modern quadcore CPU using four threads for trying a fixed
localized error pattern such as a single bit flip against the disk, as
long as the partial contents of a fixed sector are known, such as
ext4/LVM file header constants in sector #0 and #1.

The main processing time during bruteforce is spent on AF_merge(), which
in turn spends most of its time in the diffuse() function that is
concerned with hashing the AF-key data (with the sha1 hash, in our
case). Note that there is no adjustable number of hashing operations for
this step, as opposed to other steps of the LUKS header decryption. It
is therefore obvious that an adversary with access to hashing
accelerators (GPU, FPGA, ASIC; see bitcoin hardware developments for
specialized hashing) can easily benefit from several orders of magnitude
of performance gain over the speed obtainable on a CPU, without any
parameter to make this harder except changing to a more demanding hash
algorithm. (This has little impact on a CPU, as the speed differences
there are within a factor of two.)

IMHO this approach is only interesting for recovery purposes, but not a
real attack vector in terms of opportunity for significantly improved
offline attacks against a drive, as it only applies to the very rare
cases where an attacker has the complete password as well as almost all
of the AF data in their original form but lacks a few bits or bytes
(depending on his knowledge of the corrupted sections) that he now has
to test against.
If the diffuse() function and therefore the hash have excellent
cryptographic properties (which might not be the case), "reconstructing"
AF-sections on disk that were properly overwritten on the physical level
is exponentially hard in a sense that quickly makes a brute-force
attack against the actual symmetric XTS key the better choice, at least
for this "naive" way of regarding the AF-keyslot as a large symmetric key.

A quick example of this:
* overwriting the first 4 bytes of your AF-keyslot with 0x42 0x42 0x42
0x42 with the intention of making it unusable will allow breaking it within
~43 days ( (256^4)/(285⋅4⋅60⋅60⋅24) ) on a single desktop machine at
the current performance, and half that time on average
* overwriting the first 64 bytes will lead to a similar complexity for
this bruteforce algorithm as trying the masterkey from scratch.
Note: there are likely other shortcuts, as hinted at before, so be sure
to overwrite as much of the LUKS header as possible in practice if you
intend to destroy it.

Looking for further opportunities to optimize this bruteforce process,
it is clear that, given the highly optimized hashing code of the nettle
library mentioned before, we're unlikely to push significantly more
unique hash operations through a given core. But most of the hashing
operations and their results are identical between the different
fault positions we're testing: as long as the processed data hasn't
changed compared to the last iteration (because the simulated fault is
further down in the AF keyslot), they don't all have to be performed
again each time. Maybe I'll manage to squeeze even more performance out
of this...

Regards,
protagonist
Post by protagonist
Post by protagonist
I've manually compiled https://github.com/mbroz/pbkdf2_cryptsetup_test
as well as cryptsetup itself to find possible improvements with
different crypto backends, gcc optimizations such as -Ofast and
-march=native, but I've been unsuccessful to improve on the 1.25M/s
number so far. Openssl beats gcrypt, but the default kernel backend
still seems faster.
Update: the nettle library is the fastest crypto backend by a
significant margin according to my tests. Also, contrary to my previous
remark, the kernel crypto appears to be slower than openssl, at least
under Debian Jessie with 3.16.
nettle > openssl > kernel 3.16 > gcrypt > nss
PBKDF2-sha1 1680410 iterations per second for 256-bit key
This value includes the benefit of switching the CFLAGS from "-O2" to
"-Ofast -march=native", which is mostly insignificant (<1% improvement)
and probably rarely worth the effort.
Switching from libnettle 4.7 to the freshest libnettle 6.3 brings only minor
improvements:
PBKDF2-sha1 1702233 iterations per second for 256-bit key
Given that they provide hand-optimized x86 assembler code for the sha
computation, this is not entirely surprising.
Post by protagonist
One could go even further and replace the time-consuming steps 5) and 6)
with a routine that decrypts an encrypted part of the disk with the
masterKeyCandidate and compares it to a known plaintext (or a more
elaborate heuristic, such as several entropy checks looking for
"correctly" decryptable areas on disk leading to low-entropy output),
which might be a lot faster given AES-NI support and AES-xts throughput
speeds, but since we don't know much about actual disk content, this
seems to be too much of a gamble to be worth the effort at this point.
It's easily thinkable one tests against several areas containing a
well-compressed movie or a similarly high-entropy area, leading to a
false-negative and missing the only correct master key.
Correction: I meant omitting steps 4) and 5).
While thinking about the disk layout, I've realized that the encrypted
storage almost definitely contains a LVM scheme commonly used by
full-disk encryption installer setups to store separate logical volumes
for both root and swap. In the related case of a manual setup of an
external storage device, this would often be plain ext4.
Now, there is more "known plaintext" available than I initially
suspected: both LVM and ext4 don't just contain special headers with
magic numbers, labels and checksums in well defined positions, but they
actually include at least one unused sector right in the beginning.
According to my tests, those bytes are set (or kept) to 0x00.
"The physical volume label is stored in the second sector of the
physical volume." [^1]
Given the SSD sector size of 512 bytes and the fact that they reserve
four sectors for the label, we have at least three sectors à 512 bytes
of known zeros at sector #0, #2 and #3, which should be more than plenty
for some fast & simple decryption check that doesn't assume much about
the specific version and configuration of the logical volumes.
For ext4, the "Group 0 Padding" of 1024 Bytes would serve a similar
purpose. [^2]
Now that this shortcut seemed attractive, I've started cannibalizing the
LUKS_open_key() function in lib/luks1/keymanage.c to host my bruteforce
approach and made some progress already.
Small-scale benchmarking with 10000 rounds of AF-decryption, AF_merge
and (unfinished) block target decryption calls take about 50s combined,
including "normal" cryptsetup program initialization, initial disk reads
and most other necessary computations.
This sets overall performance at something in the order of 150 to 200
individual checks per second and core for the final routine, which is a
good improvement over the naive bruteforce version once it's ready.
Regards,
protagonist
[^1]
https://github.com/libyal/libvslvm/blob/master/documentation/Logical%20Volume%20Manager%20%28LVM%29%20format.asciidoc#2-physical-volume-label
[^2] https://ext4.wiki.kernel.org/index.php/Ext4_Disk_Layout#Layout
protagonist
2017-04-30 15:06:57 UTC
Permalink
As a short update, I can confirm that when run with the default options,
pvcreate initializes the first 512 bytes of the LVM header block with
0x00, similarly to ext4, creating excellent known plaintext that is easy
to spot during debugging of decryption routines.

This is documented in the manpage of pvcreate:
"-Z, --zero {y|n}
Whether or not the first 4 sectors (2048 bytes) of the device should be
wiped. If this option is not given, the default is to wipe these sectors
unless either or both of the --restorefile or --uuid options were
specified." https://linux.die.net/man/8/pvcreate

My current memcmp of the first 512 bytes therefore works just as well on
LVM as on ext4 and has managed to find a bit flip on a deliberately
corrupted key slot.
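
For completeness, here is what the known-plaintext check itself can look
like when written against OpenSSL's EVP interface. This is only my own
illustration under the assumptions above (aes-xts-plain64, 512-bit master
key, all-zero first payload sector), not the code actually used here:

/* Hedged sketch of the known-plaintext test: decrypt one 512-byte
 * payload sector with a master-key candidate (aes-xts-plain64,
 * 512-bit key) via OpenSSL and compare it against the all-zero
 * plaintext that pvcreate/mkfs.ext4 leave in the first sector.
 * This is my own illustration, not the code used in this thread. */
#include <stdint.h>
#include <string.h>
#include <openssl/evp.h>

static int candidate_matches_zero_sector(const unsigned char mk[64],
                                         const unsigned char ciphertext[512],
                                         uint64_t sector)  /* 0 = first payload sector */
{
    unsigned char iv[16] = { 0 };           /* plain64: little-endian sector number */
    unsigned char plain[512];
    static const unsigned char zeros[512];  /* expected plaintext */
    int outl = 0, fin = 0, ok = 0;
    EVP_CIPHER_CTX *ctx = EVP_CIPHER_CTX_new();

    if (!ctx)
        return 0;
    for (int i = 0; i < 8; i++)
        iv[i] = (unsigned char)(sector >> (8 * i));

    if (EVP_DecryptInit_ex(ctx, EVP_aes_256_xts(), NULL, mk, iv) == 1 &&
        EVP_DecryptUpdate(ctx, plain, &outl, ciphertext, 512) == 1 &&
        EVP_DecryptFinal_ex(ctx, plain + outl, &fin) == 1)
        ok = (outl + fin == 512) && (memcmp(plain, zeros, 512) == 0);

    EVP_CIPHER_CTX_free(ctx);
    return ok;
}

A LABELONE or ext4-magic test can be added on top, but for an all-zero
sector the straight memcmp is already a very strong signal.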

However, this is bad news for my ultimate goal of recovering the
actual master key of the SSD in question, as it means my previous
1-error checks were valid, yet unsuccessful.
Regards,
protagonist
Arno Wagner
2017-04-30 18:39:21 UTC
Permalink
Post by protagonist
[...]
However, this is bad news for my ultimate goal of recovering the
actual master key of the SSD in question, as it means my previous
1-error checks were valid, yet unsuccessful.
Still impressive work. But it was a 10% thing at best.

Regards,
Arno
--
Arno Wagner, Dr. sc. techn., Dipl. Inform., Email: ***@wagner.name
GnuPG: ID: CB5D9718 FP: 12D6 C03B 1B30 33BB 13CF B774 E35C 5FA1 CB5D 9718
----
A good decision is based on knowledge and not on numbers. -- Plato

If it's in the news, don't worry about it. The very definition of
"news" is "something that hardly ever happens." -- Bruce Schneier