GPG intermittent self-test failure #3543

claudioandre-br · 2018-12-20T11:36:22Z

Intermitent

. Inside my Circle

Testing: gpg, OpenPGP / GnuPG Secret Key [32/64]... (3xOMP) FAILED (cmp_all(192))

The following issue is not GPG related. It is related to batch mode.

. GPG random test
. Or, you can also use ../run/john alltests.in --max-run=30 --format=sha512crypt

# wget http://openwall.info/wiki/_media/john/test.gpg.tar.gz
# ../run/gpg2john *.asc > ~/file4

# $JtR ~/file4
Using default input encoding: UTF-8
Loaded 4 password hashes with 4 different salts (gpg, OpenPGP / GnuPG Secret Key [32/64])
Cost 1 (s2k-count) is 65536 for all loaded hashes
Cost 2 (hash algorithm [1:MD5 2:SHA1 3:RIPEMD160 8:SHA256 9:SHA384 10:SHA512 11:SHA224]) is 2 for all loaded hashes
Cost 3 (cipher algorithm [1:IDEA 2:3DES 3:CAST5 4:Blowfish 7:AES128 8:AES192 9:AES256 10:Twofish 11:Camellia128 12:Camellia192 13:Camellia256]) is 3 for all loaded hashes
Will run 2 OpenMP threads
Proceeding with single, rules:Wordlist
recovery.c:424:28: runtime error: 1.63246e+18 is outside the range of representable values of type 'int'
SUMMARY: UndefinedBehaviorSanitizer: undefined-behavior recovery.c:424:28 in 
Press 'q' or Ctrl-C to abort, almost any other key for status
qwerty2          (Brandon User)
alamakota        (Original)
normal           (Łąśóęśżźć Normal)
Warning: Only 4 candidates buffered for the current salt, minimum 8
needed for performance.
Almost done: Processing the remaining buffered candidate passwords, if any
Warning: Only 1 candidates buffered for the current salt, minimum 8
needed for performance.
Proceeding with wordlist:../run/password.lst, rules:Wordlist
qwerty           (Random User)
4g 0:00:00:07 DONE 2/3 (2018-12-20 11:31) 0.5012g/s 8251p/s 8252c/s 8252C/s 123456..john
Use the "--show" option to display all of the cracked passwords reliably
Session completed

=================================================================
==80==ERROR: LeakSanitizer: detected memory leaks

Direct leak of 1024 byte(s) in 1 object(s) allocated from:
    #0 0x4ed437 in __interceptor_malloc (/cwd/bin/bleeding/run/john+0x4ed437)
    #1 0xff5fa3 in mem_alloc /cwd/src/memory.c:76:14
    #2 0xf6fd3f in crk_init /cwd/src/cracker.c:218:27
    #3 0x102bb31 in single_init /cwd/src/single.c:407:2
    #4 0x102a9bc in do_single_crack /cwd/src/single.c:839:2
    #5 0xf46fdd in do_single_pass /cwd/src/batch.c:25:2
    #6 0xf46d38 in do_batch_crack /cwd/src/batch.c:51:3
    #7 0xfa11ef in john_run /cwd/src/john.c:1875:4
    #8 0xf9f270 in main /cwd/src/john.c:2102:2
    #9 0x7f6405c7409a in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x2409a)

SUMMARY: AddressSanitizer: 1024 byte(s) leaked in 1 allocation(s).

The text was updated successfully, but these errors were encountered:

claudioandre-br · 2018-12-20T12:32:05Z

The ticket contains two unrelated issues.

If the leak is on purpose, just close the ticket.

magnumripper · 2018-12-21T09:17:21Z

Leak fixed now. What about the self-test fail?

claudioandre-br · 2018-12-21T16:32:30Z

What about the self-test fail?

It is intermittent. Happens once in a while.

magnumripper · 2018-12-21T16:59:23Z

Then let's check this out

kholia · 2018-12-23T03:15:00Z

These failures have been plaguing the GPG format for a long time it seems. I was never able to get to the bottom of the issue earlier.

Earlier, I was looking for the problem in the gpg-opencl format (when it had fall-back on CPU code functionality in it).

It now seems that the problem exists within the base CPU code itself? This makes sense to me and would explain those weird gpg-opencl failures earlier.

magnumripper · 2018-12-23T11:11:45Z

I bet something in OpenSSL is not thread safe. Have we ever seen problems with a non-omp build?

kholia · 2018-12-24T08:03:09Z

I cannot reproduce the GPG format failures with the following command,

$ while true; do OMP_NUM_THREADS=3 ../run/john --format=gpg --test; sleep 1; done

I am running Ubuntu 18.04.1 64-bit here.

magnumripper · 2018-12-24T09:13:08Z

@claudioandre-br what version of OpenSSL was that?

claudioandre-br · 2018-12-24T09:44:12Z

Crypto library: OpenSSL
OpenSSL library version: 01000208f
OpenSSL 1.0.2h  3 May 2016

kholia · 2018-12-26T15:32:58Z

@claudioandre-br How do I reproduce this problem locally? Are you able to reproduce the problem on Ubuntu 18.0.1 LTS or something similar?

magnumripper · 2018-12-26T20:27:47Z

Try --stress-test=0. How often does it fail? Once in 10? Once in 100? Using how many threads?

kholia · 2018-12-29T06:11:38Z

I am running ../run/john --format=gpg --stress-test=0 on Ubuntu 18.04.1 LTS which has OpenSSL 1.1.0g-2ubuntu4.3.

...
#529 Testing: gpg, OpenPGP / GnuPG Secret Key [32/64]... (4xOMP) PASS
#530 Testing: gpg, OpenPGP / GnuPG Secret Key [32/64]... (4xOMP) PASS
#531 Testing: gpg, OpenPGP / GnuPG Secret Key [32/64]... (4xOMP) PASS

I see no failures at all. I tried OMP_NUM_THREADS=3 but no failures again.

@claudioandre-br Can you please give some pointers on how to reproduce this bug?

solardiz · 2024-06-04T12:43:45Z

Looks like this (or similar) is still an issue, gpg format often failing on Linux cfarm29 6.1.0-21-powerpc64le #1 SMP Debian 6.1.90-1 (2024-05-03) ppc64le GNU/Linux: #5491 (comment)

solardiz · 2024-06-04T12:58:05Z

On the above system, I am unable to trigger the issue with --stress-test=0, but I am with a shell while loop around a --test-full=0. This suggests that maybe the problem is exposed/hidden by variations in memory layout (ASLR).

The failure is usually on cmp_all of the full size, but on one occasion it was on get_key(47) (with mkpc 64), which suggests we have heap memory corruption.

solardiz · 2024-06-04T13:19:46Z

I could also get it to fail in an --enable-asan build, without triggering ASan. This suggests we may have memory corruption via a system library (a bug in there, or in our usage of the library). Or a race condition in our code, but the one failure on get_key suggests otherwise.

solar@cfarm29:~/john/run$ while :; do ../run/john-asan --test-full=0 --format=gpg -tune=2 || break; done
Will run 32 OpenMP threads
Testing: gpg, OpenPGP / GnuPG Secret Key [32/64]... (32xOMP) PASS
Will run 32 OpenMP threads
Testing: gpg, OpenPGP / GnuPG Secret Key [32/64]... (32xOMP) PASS
Will run 32 OpenMP threads
Testing: gpg, OpenPGP / GnuPG Secret Key [32/64]... (32xOMP) PASS
Will run 32 OpenMP threads
Testing: gpg, OpenPGP / GnuPG Secret Key [32/64]... (32xOMP) PASS
Will run 32 OpenMP threads
Testing: gpg, OpenPGP / GnuPG Secret Key [32/64]... (32xOMP) PASS
Will run 32 OpenMP threads
Testing: gpg, OpenPGP / GnuPG Secret Key [32/64]... (32xOMP) PASS
Will run 32 OpenMP threads
Testing: gpg, OpenPGP / GnuPG Secret Key [32/64]... (32xOMP) PASS
Will run 32 OpenMP threads
Testing: gpg, OpenPGP / GnuPG Secret Key [32/64]... (32xOMP) PASS
Will run 32 OpenMP threads
Testing: gpg, OpenPGP / GnuPG Secret Key [32/64]... (32xOMP) PASS
Will run 32 OpenMP threads
Testing: gpg, OpenPGP / GnuPG Secret Key [32/64]... (32xOMP) FAILED (cmp_all(64))

solardiz · 2024-06-04T13:30:03Z

This hack didn't catch anything:

+++ b/src/gpg_common_plug.c
@@ -1220,6 +1220,10 @@ int gpg_common_check(unsigned char *keydata, int ks)
        // other things (like mpz integer creation) are needed, we know that
        // our sizes will not overflow.
        out = mem_alloc((gpg_common_cur_salt->datalen) + 0x10000);
+       unsigned char *out_last = &out[gpg_common_cur_salt->datalen + 0x10000 - 1];
+       *out_last = 'c';
+#undef MEM_FREE
+#define MEM_FREE(out) { assert(*out_last == 'c'); free(out); out = NULL; }
        // Quick Hack
        if (!gpg_common_cur_salt->symmetric_mode)
                memcpy(ivec, gpg_common_cur_salt->iv, gpg_common_blockSize(gpg_common_cur_salt->cipher_algorithm));

solardiz · 2024-06-04T13:33:13Z

In fact, even this works:

+++ b/src/gpg_common_plug.c
@@ -1220,6 +1220,10 @@ int gpg_common_check(unsigned char *keydata, int ks)
        // other things (like mpz integer creation) are needed, we know that
        // our sizes will not overflow.
        out = mem_alloc((gpg_common_cur_salt->datalen) + 0x10000);
+       unsigned char *out_last = &out[gpg_common_cur_salt->datalen];
+       *out_last = 'c';
+#undef MEM_FREE
+#define MEM_FREE(out) { assert(*out_last == 'c'); free(out); out = NULL; }

So the extra 0x10000 looks unneeded (and in general it was a bad idea to be allocating/freeing memory during cracking).

solardiz · 2024-06-04T13:36:07Z

So the extra 0x10000 looks unneeded (and in general it was a bad idea to be allocating/freeing memory during cracking).

This and many other over-allocations were added by JimF in 7147798 in 2016, with commit title "gpg: bugs found in -test-full=0 using ASAN".

solardiz · 2024-06-04T13:44:19Z

Testing: gpg, OpenPGP / GnuPG Secret Key [32/64]... (32xOMP) FAILED (cmp_all(64))

This triggers the same for -cost=0,-8,-7 vs. -cost=0,8,7, so the problem doesn't look specific to (usage of) any one S2K, hash, nor cipher.

Edit: perhaps I am wrong - I actually don't know if -cost affects our self-tests, or only later benchmark/usage. If so, that wasn't a valid test.

solardiz · 2024-06-04T13:51:31Z

perhaps I am wrong - I actually don't know if -cost affects our self-tests, or only later benchmark/usage. If so, that wasn't a valid test.

So I went to patch the test vectors in the source. With only the symmetric tests left (last 7 test vectors), the testing is really quick and it didn't fail for me. With the opposite change (removing these symmetric tests), after a while I got:

Testing: gpg, OpenPGP / GnuPG Secret Key [32/64]... (32xOMP) FAILED (get_key(57))

which is the second time I see this format fail on get_key. The build was with ASan, which didn't trigger. So this confirms memory corruption via library.

solardiz · 2024-06-04T14:01:07Z

The two gpg --homedir . --s2k-cipher-algo 3des --simple-sk-checksum --gen-key test vectors appear to be both required and sufficient to reproduce the issue.

solardiz · 2024-06-04T14:34:01Z

So the extra 0x10000 looks unneeded (and in general it was a bad idea to be allocating/freeing memory during cracking).

This and many other over-allocations were added by JimF in 7147798 in 2016, with commit title "gpg: bugs found in -test-full=0 using ASAN".

OK, this over-allocation was in fact needed - we have major over-reads (being reads, they were not caught by my canary). About the smallest that works is 0x2700 extra bytes. I wondered if these over-read of uninitialized memory could result in the cmp_all failure mode, but initializing this memory to 0 didn't appear to make a difference.

solardiz · 2024-06-04T22:01:12Z

I didn't look into this further yet, but just to write down my understanding from earlier today: when trying usually wrong passwords, we end up decrypting what isn't a valid key. Yet it has embedded length fields in it. Those may end up indicating bignums of up to 64 kbit each. This is why the over-reads. Then we pass those unusually huge bignums into OpenSSL routines, thereby torturing them (and wasting lots of CPU cycles, too, which I guess we'd rather avoid by detecting the over-reads instead). We also ask BN_bin2bn to do its own memory allocation for the output bignum - well, I'm not sure it's able to calculate its size correctly for an unusual input like this - will need to review/test OpenSSL code for that. This could very well be the issue.

solardiz · 2024-06-07T20:23:20Z

we pass those unusually huge bignums into OpenSSL routines, thereby torturing them (and wasting lots of CPU cycles, too, which I guess we'd rather avoid by detecting the over-reads instead). We also ask BN_bin2bn to do its own memory allocation for the output bignum - well, I'm not sure it's able to calculate its size correctly for an unusual input like this - will need to review/test OpenSSL code for that.

I reviewed current OpenSSL code - looks correct even for such inputs.

Our issue appears to be different. We have larger p and q fields in the salt struct to account for the up to 64kbit values, but our d field is tiny, yet we also write an up to 64kbit bignum in there (which we actually just skip).

I wonder why ASan does not trigger. The struct ends in unsigned char data[1]; which is idiomatic for a dynamically-allocated struct size. Are these maybe excluded from ASan protection?

Anyway, patching the code to skip this field without writing it to d makes the issue go away in my testing.

Importantly, such copying frequently overflowed the d[] array with a value that we skip anyway. Fixes openwall#3543

solardiz · 2024-06-08T14:25:24Z

I wonder why ASan does not trigger. The struct ends in unsigned char data[1]; which is idiomatic for a dynamically-allocated struct size. Are these maybe excluded from ASan protection?

That's not how ASan works, so was probably a wrong guess. Another probably wrong guess is the overwrite is somehow contained to the struct (which is valid behavior in C), but results in us making incorrect calls into OpenSSL later (because of the overwritten salt data, which maybe wouldn't have passed our valid). I do not have a really good idea as to what was going on, but I think the issue is now fixed - at least I cannot reproduce it anymore after my fixes.

claudioandre-br changed the title ~~gpg failure and leak~~ Batch mode leak Dec 20, 2018

magnumripper closed this as completed in e2b0efd Dec 21, 2018

magnumripper reopened this Dec 21, 2018

magnumripper changed the title ~~Batch mode leak~~ GPG intermittent self-test failure Dec 21, 2018

magnumripper assigned kholia Jan 16, 2019

solardiz unassigned kholia Jun 4, 2024

solardiz mentioned this issue Jun 4, 2024

crypt format occasionally fails a --test-full=0 --format=cpu run on Ubuntu 22 powerpc64le #5491

Open

solardiz added this to the Definitely 2.0.0 milestone Jun 5, 2024

solardiz added a commit to solardiz/john that referenced this issue Jun 7, 2024

gpg format: Avoid unnecessary copying into salt struct fields

f15e8ac

Importantly, such copying frequently overflowed the d[] array with a value that we skip anyway. Fixes openwall#3543

solardiz added a commit to solardiz/john that referenced this issue Jun 8, 2024

gpg formats: Avoid unnecessary copying into salt struct fields

1267ddc

Importantly, such copying frequently overflowed the d[] array with a value that we skip anyway. Fixes openwall#3543

solardiz closed this as completed in 80aed6f Jun 8, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

GPG intermittent self-test failure #3543

GPG intermittent self-test failure #3543

claudioandre-br commented Dec 20, 2018 •

edited

Loading

claudioandre-br commented Dec 20, 2018 •

edited

Loading

magnumripper commented Dec 21, 2018

claudioandre-br commented Dec 21, 2018

magnumripper commented Dec 21, 2018

kholia commented Dec 23, 2018 •

edited

Loading

magnumripper commented Dec 23, 2018

kholia commented Dec 24, 2018

magnumripper commented Dec 24, 2018

claudioandre-br commented Dec 24, 2018

kholia commented Dec 26, 2018

magnumripper commented Dec 26, 2018

kholia commented Dec 29, 2018

solardiz commented Jun 4, 2024 •

edited

Loading

solardiz commented Jun 4, 2024

solardiz commented Jun 4, 2024

solardiz commented Jun 4, 2024

solardiz commented Jun 4, 2024

solardiz commented Jun 4, 2024

solardiz commented Jun 4, 2024 •

edited

Loading

solardiz commented Jun 4, 2024

solardiz commented Jun 4, 2024

solardiz commented Jun 4, 2024

solardiz commented Jun 4, 2024

solardiz commented Jun 7, 2024

solardiz commented Jun 8, 2024

GPG intermittent self-test failure #3543

GPG intermittent self-test failure #3543

Comments

claudioandre-br commented Dec 20, 2018 • edited Loading

claudioandre-br commented Dec 20, 2018 • edited Loading

magnumripper commented Dec 21, 2018

claudioandre-br commented Dec 21, 2018

magnumripper commented Dec 21, 2018

kholia commented Dec 23, 2018 • edited Loading

magnumripper commented Dec 23, 2018

kholia commented Dec 24, 2018

magnumripper commented Dec 24, 2018

claudioandre-br commented Dec 24, 2018

kholia commented Dec 26, 2018

magnumripper commented Dec 26, 2018

kholia commented Dec 29, 2018

solardiz commented Jun 4, 2024 • edited Loading

solardiz commented Jun 4, 2024

solardiz commented Jun 4, 2024

solardiz commented Jun 4, 2024

solardiz commented Jun 4, 2024

solardiz commented Jun 4, 2024

solardiz commented Jun 4, 2024 • edited Loading

solardiz commented Jun 4, 2024

solardiz commented Jun 4, 2024

solardiz commented Jun 4, 2024

solardiz commented Jun 4, 2024

solardiz commented Jun 7, 2024

solardiz commented Jun 8, 2024

claudioandre-br commented Dec 20, 2018 •

edited

Loading

claudioandre-br commented Dec 20, 2018 •

edited

Loading

kholia commented Dec 23, 2018 •

edited

Loading

solardiz commented Jun 4, 2024 •

edited

Loading

solardiz commented Jun 4, 2024 •

edited

Loading