This repository has been archived by the owner on Mar 15, 2019. It is now read-only.

Critical - crash in multithreaded environment, when using nedrealloc (yes, again) #15

Open
Gerilgfx opened this issue Aug 22, 2013 · 40 comments

Comments

@Gerilgfx
Contributor

Critical - crash in multithreaded environment, when using nedrealloc (yes, again)

The crash appears when nedrealloc is called from multiple threads, repeatedly reallocating small (or NULL) memory areas to larger buffers. It mostly occurs before the test reaches the first percent; if the test gets past that point, the program usually survives. To reproduce the crash it helps to have other processes running too, for example an HD YouTube video playing in the foreground.

Crash type: memory corruption

Version affected: newest (older versions not yet tested)

compiler flag:

g++ nedmalloctester3.c -o nedmalloctester -O3 -s -lpthread -m64

compiler version:

g++ -v
Using built-in specs.
COLLECT_GCC=g++
COLLECT_LTO_WRAPPER=/usr/lib64/gcc/x86_64-suse-linux/4.7/lto-wrapper
Target: x86_64-suse-linux
Configured with: ../configure --prefix=/usr --infodir=/usr/share/info --mandir=/usr/share/man --libdir=/usr/lib64 --libexecdir=/usr/lib64 --enable-languages=c,c++,objc,fortran,obj-c++,java,ada --enable-checking=release --with-gxx-include-dir=/usr/include/c++/4.7 --enable-ssp --disable-libssp --disable-libitm --disable-plugin --with-bugurl=http://bugs.opensuse.org/ --with-pkgversion='SUSE Linux' --disable-libgcj --disable-libmudflap --with-slibdir=/lib64 --with-system-zlib --enable-__cxa_atexit --enable-libstdcxx-allocator=new --disable-libstdcxx-pch --enable-version-specific-runtime-libs --enable-linker-build-id --program-suffix=-4.7 --enable-linux-futex --without-system-libunwind --with-arch-32=i586 --with-tune=generic --build=x86_64-suse-linux
Thread model: posix
gcc version 4.7.1 20120723 [gcc-4_7-branch revision 189773] (SUSE Linux)

system version:

uname -r -a
Linux a1 3.4.6-2.10-desktop #1 SMP PREEMPT Thu Jul 26 09:36:26 UTC 2012 (641c197) x86_64 x86_64 x86_64 GNU/Linux

output:

g++ nedmalloctester3.c -o nedmalloctester -O3 -s -lpthread

./nedmalloctester

test 3 begins...
nedmalloc: nedprealloc() called with a block not created by nedmalloc!
Aborted

./nedmalloctester

test 3 begins...
0 percent finished
^C

./nedmalloctester

test 3 begins...
0 percent finished
^C

./nedmalloctester

test 3 begins...
0 percent finished
^C

./nedmalloctester

test 3 begins...
nedmalloc: nedprealloc() called with a block not created by nedmalloc!
Aborted

testcase:

// g++ nedmalloctester3.c -o nedmalloctester3 -O3 -s -pthread

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <pthread.h>

#define USE_LOCKS 1
#define USE_DL_PREFIX 1
#define NDEBUG
#define NO_NED_NAMESPACE

#include "nedmalloc/nedmalloc_2013_apr/ori/nedmalloc.h"
#include "nedmalloc/nedmalloc_2013_apr/ori/nedmalloc.c"

#define malloc_vpool nedmalloc
#define free_vpool nedfree
#define realloc_vpool nedrealloc

/*#define malloc_vpool malloc
#define free_vpool free
#define realloc_vpool realloc*/

#define TESTMEMMAX 1024*1024*2

void ** test=NULL;

int div_w=8; // block size to be sure that we touching pointers allocated from different thread ID-s

void malt(int thread){
    for(int iteracio=1;iteracio<80;iteracio+=4){
        for(int i=0;i<TESTMEMMAX;i++){
            if(((i/div_w)%10)!=thread) continue; // 10 threads; skip slots owned by other threads
            // printf("%d\n", i);
            test[i]=realloc_vpool(test[i], iteracio);
            memset(test[i], 1, iteracio);
        }
    }
}

// pthread entry points must return a value; falling off the end of a void* function is undefined behaviour in C++
void *malt2(void * threadid){malt(1); return NULL;}
void *malt3(void * threadid){malt(2); return NULL;}
void *malt4(void * threadid){malt(3); return NULL;}
void *malt5(void * threadid){malt(4); return NULL;}
void *malt6(void * threadid){malt(5); return NULL;}
void *malt7(void * threadid){malt(6); return NULL;}
void *malt8(void * threadid){malt(7); return NULL;}
void *malt9(void * threadid){malt(8); return NULL;}
void *malt10(void * threadid){malt(9); return NULL;}

void MallocStabTest3(){
printf("test 3 begins...\n");

test=(void**)malloc_vpool(128+(TESTMEMMAX*sizeof(void*)));
for(int i=0;i<(TESTMEMMAX);i++) test[i]=NULL;

for(int Z=0;Z<100;Z++){
    div_w=2+(rand()%40);  // random block size to be sure that we touching pointers allocated from different thread ID-s

    pthread_t TMP2=0;
    pthread_t TMP3=0;
    pthread_t TMP4=0;
    pthread_t TMP5=0;
    pthread_t TMP6=0;
    pthread_t TMP7=0;
    pthread_t TMP8=0;
    pthread_t TMP9=0;
    pthread_t TMP10=0;

    pthread_create(&TMP2, NULL, malt2, NULL);
    pthread_create(&TMP3, NULL, malt3, NULL);
    pthread_create(&TMP4, NULL, malt4, NULL);
    pthread_create(&TMP5, NULL, malt5, NULL);
    pthread_create(&TMP6, NULL, malt6, NULL);
    pthread_create(&TMP7, NULL, malt7, NULL);
    pthread_create(&TMP8, NULL, malt8, NULL);
    pthread_create(&TMP9, NULL, malt9, NULL);
    pthread_create(&TMP10, NULL, malt10, NULL);

    malt(0);

    pthread_join(TMP2, NULL);
    pthread_join(TMP3, NULL);
    pthread_join(TMP4, NULL);
    pthread_join(TMP5, NULL);
    pthread_join(TMP6, NULL);
    pthread_join(TMP7, NULL);
    pthread_join(TMP8, NULL);
    pthread_join(TMP9, NULL);
    pthread_join(TMP10, NULL);

    printf("%d percent finished\n", Z);
}

for(int i=0;i<(TESTMEMMAX);i++) if(test[i]) free_vpool(test[i]);
free_vpool(test);
printf("success.\n");

}

int main(){
MallocStabTest3();
}
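
The partitioning rule used by the test case can be isolated as a small pure function, which makes it easier to see why changing div_w between rounds hands a slot to a different thread than the one that last resized it. This is only a sketch: slot_owner and slots_changing_owner are hypothetical helper names, not part of the test case or of nedmalloc.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper (not in the original test case): returns the index of
 * the worker that handles slot i when slots are grouped into runs of
 * block_size and the runs are dealt out round-robin to nthreads workers.
 * This mirrors the test's check ((i / div_w) % 10) != thread. */
static int slot_owner(size_t i, int block_size, int nthreads)
{
    assert(block_size > 0 && nthreads > 0);
    return (int)((i / (size_t)block_size) % (size_t)nthreads);
}

/* Counts how many of the first nslots slots change owner when the block size
 * changes from old_bs to new_bs. Each such slot is reallocated by a different
 * thread than the one that handled it in the previous round. */
static size_t slots_changing_owner(size_t nslots, int old_bs, int new_bs,
                                   int nthreads)
{
    size_t changed = 0;
    for (size_t i = 0; i < nslots; i++) {
        if (slot_owner(i, old_bs, nthreads) != slot_owner(i, new_bs, nthreads))
            changed++;
    }
    return changed;
}
```

With 10 workers, moving from a block size of 8 to a block size of 2 changes the owner of most of the first 16 slots, so almost every pointer is passed to realloc by a thread that did not produce it.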

@Gerilgfx
Contributor Author

Previous versions from 2013 crash too.

@ned14
Owner

ned14 commented Aug 22, 2013

The last time I put together some form of release was for v1.10 beta 3 in 2012. I agree that nedmalloc definitely needs a regularly executed stress test suite, and in fact I have recently purchased a server for a Jenkins CI which you can see at https://ci.nedprod.com/.

As it happens, I was fired from BlackBerry on Monday, so I suddenly have some free time. I'll look into figuring out some form of automated solution to the many breakages which have slipped into nedmalloc over the years by accident.

Thanks for reporting the bug Geri. You're a trooper.

Niall

@Gerilgfx
Contributor Author

Thank you for creating and supporting this wonderful software. I suggest creating a stress test based on my multithreaded test cases, like the current one and those I posted before. They are simple enough, and they are easy to debug.

@Gerilgfx
Contributor Author

Hi. Were you able to reproduce the crash?

@ned14
Owner

ned14 commented Aug 25, 2013

It'll be a few days yet. I'm currently mentoring gsoc and I need to finish two work items to enable the student to proceed as he is waiting on me.

@Gerilgfx
Contributor Author

Gerilgfx commented Sep 4, 2013

I did some tests:
-both -m64 and -m32 crash
-it crashes with and without -O3, and with and without -s
so I guess it's an algorithmic bug

@ned14
Owner

ned14 commented Sep 6, 2013

It is on my radar. Integrating the new items from http://boostafio.uservoice.com/forums/218980-boost-afio-feature-request before GSoC ends in ten days has proved harder than expected.

@Gerilgfx
Contributor Author

Gerilgfx commented Sep 7, 2013

I have replaced nedrealloc in my code with a wrapper built on nedfree and nedmalloc until the fix is done, so there is no need to hurry.
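
A wrapper of this kind could look roughly like the sketch below. It is an illustration only, not the code actually used in the reporter's project: realloc_via_malloc_free is a hypothetical name, the caller is assumed to track the old block size itself, and plain malloc/free stand in for nedmalloc/nedfree so the snippet compiles on its own.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Stand-ins for the allocator under test. In the test case above these
 * macros map to nedmalloc/nedfree; plain malloc/free are used here so the
 * sketch is self-contained. */
#define malloc_vpool malloc
#define free_vpool   free

/* Hypothetical wrapper: emulates realloc using only malloc and free.
 * The caller must supply old_size because this sketch does not ask the
 * allocator for the size of an existing block. */
static void *realloc_via_malloc_free(void *old, size_t old_size, size_t new_size)
{
    assert(old != NULL || old_size == 0);
    if (new_size == 0) {            /* treat size 0 as a plain free */
        free_vpool(old);
        return NULL;
    }
    void *fresh = malloc_vpool(new_size);
    if (fresh == NULL)
        return NULL;                /* old block is left untouched on failure */
    if (old != NULL) {
        size_t keep = old_size < new_size ? old_size : new_size;
        memcpy(fresh, old, keep);   /* preserve the overlapping prefix */
        free_vpool(old);
    }
    return fresh;
}
```

The trade-off is that every resize copies the data and never grows a block in place, which is slower than a real realloc but avoids the failing code path entirely.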

@ned14
Owner

ned14 commented Oct 10, 2013

I should be able to look into this now. Can you put the test case above, which is too mangled to make much sense, into a gist so I can get it demangled? Thanks.

@Gerilgfx
Contributor Author

https://gist.github.com/Gerilgfx/6953861 i hope it was this one.

ned14 pushed a commit that referenced this issue Oct 17, 2013
@ned14
Owner

ned14 commented Oct 17, 2013

Bad news: I can't replicate this on my Ubuntu 12.04 x64 machine with an i7-3770K CPU. I tried:

GCC v4.6.4
GCC v4.7.3
GCC v4.8.1

It could be a timing issue where your CPU finds a race mine doesn't. Or it could be a bug in GCC 4.7.1 which has since been fixed. There is some order-sensitive code in the threadcache; a slight reordering from what is specified in the code would introduce exactly this kind of race. In theory a compiler shouldn't do such a reorder, but maybe there was a bug in GCC v4.7.1.

Niall
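
As a generic illustration of what "order sensitive" means here (this is not nedmalloc's threadcache code, and it uses C11 atomics, which are newer than the compilers discussed in this thread): a writer fills in a block header and only then publishes the pointer with release ordering, so a reader that loads the pointer with acquire ordering is guaranteed to see the completed header. If the two stores could be reordered, a reader might observe a published block whose size field is still zero. All names below are hypothetical, and whether anything like this happens inside nedmalloc is not established in this thread.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical block header; only the size field matters for the sketch. */
struct block_header {
    size_t size;
};

/* Shared slot through which one thread hands a block to another.
 * Static storage gives a valid zero (NULL) initial state. */
static _Atomic(struct block_header *) published;

/* Writer side: fill in the header first, then publish the pointer.
 * The release store keeps the size write from being ordered after it. */
static void publish_block(struct block_header *b, size_t size)
{
    assert(b != NULL && size > 0);
    b->size = size;
    atomic_store_explicit(&published, b, memory_order_release);
}

/* Reader side: the acquire load pairs with the release store above, so a
 * non-NULL pointer implies the size written before publication is visible.
 * Returns 0 when nothing has been published yet. */
static size_t read_published_size(void)
{
    struct block_header *b =
        atomic_load_explicit(&published, memory_order_acquire);
    return b ? b->size : 0;
}
```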

@ned14
Owner

ned14 commented Oct 17, 2013

Also, try setting THREADCACHEMAX to 0. That will help me determine if it's dlmalloc or the thread cache which is at fault.

@Gerilgfx
Contributor Author

with:
#define THREADCACHEMAX 0

test 3 begins...
nedmalloc: nedprealloc() called with a block not created by nedmalloc!
Aborted

@Gerilgfx
Contributor Author

changing:

test[i]=realloc_vpool(test[i], iteracio);

to:

if(test[i]) free_vpool(test[i]);
test[i]=malloc_vpool(iteracio);

works.

I think the bug is specifically in your realloc implementation.

@Gerilgfx
Contributor Author

        if(test[i]) free_vpool(test[i]);
        test[i]=NULL;
        test[i]=realloc_vpool(test[i], iteracio);

this works too.

@Gerilgfx
Contributor Author

if(!memsize)
{
    fprintf(stderr, "nedmalloc: nedprealloc() called with a block not created by nedmalloc!\n");
    abort();
}

changed to:

if(!memsize)
{
    fprintf(stderr, "nedmalloc: nedprealloc() called with a block not created by nedmalloc!\n");
    // abort();
}

result:

test 3 begins...
nedmalloc: nedprealloc() called with a block not created by nedmalloc!
*** glibc detected *** ./nedmalloctester3: munmap_chunk(): invalid pointer: 0x00007f5a29cf9010 ***
======= Backtrace: =========
/lib64/libc.so.6(+0x78b56)[0x7f5a3d97bb56]
./nedmalloctester3[0x408528]
./nedmalloctester3[0x408796]
/lib64/libpthread.so.0(+0x7e0e)[0x7f5a3dcafe0e]
/lib64/libc.so.6(clone+0x6d)[0x7f5a3d9e72bd]
======= Memory map: ========
00400000-0040c000 r-xp 00000000 08:21 125785 nedmalloctester3
0060b000-0060c000 r--p 0000b000 08:21 125785 nedmalloctester3
0060c000-0060d000 rw-p 0000c000 08:21 125785 nedmalloctester3
00a7e000-00a9f000 rw-p 00000000 00:00 0 [heap]
7f5a24000000-7f5a24021000 rw-p 00000000 00:00 0
7f5a24021000-7f5a28000000 ---p 00000000 00:00 0
7f5a29af9000-7f5a37bf9000 rw-p 00000000 00:00 0
7f5a37bf9000-7f5a37bfa000 ---p 00000000 00:00 0
7f5a37bfa000-7f5a383fa000 rw-p 00000000 00:00 0
7f5a383fa000-7f5a383fb000 ---p 00000000 00:00 0
7f5a383fb000-7f5a38ffb000 rw-p 00000000 00:00 0
7f5a38ffb000-7f5a38ffc000 ---p 00000000 00:00 0
7f5a38ffc000-7f5a398fc000 rw-p 00000000 00:00 0
7f5a398fc000-7f5a398fd000 ---p 00000000 00:00 0
7f5a398fd000-7f5a3a0fd000 rw-p 00000000 00:00 0
7f5a3a0fd000-7f5a3a0fe000 ---p 00000000 00:00 0
7f5a3a0fe000-7f5a3a8fe000 rw-p 00000000 00:00 0
7f5a3a8fe000-7f5a3a8ff000 ---p 00000000 00:00 0
7f5a3a8ff000-7f5a3b0ff000 rw-p 00000000 00:00 0
7f5a3b0ff000-7f5a3b100000 ---p 00000000 00:00 0
7f5a3b100000-7f5a3b900000 rw-p 00000000 00:00 0 [stack:9567]
7f5a3b900000-7f5a3b901000 ---p 00000000 00:00 0
7f5a3b901000-7f5a3c101000 rw-p 00000000 00:00 0
7f5a3c101000-7f5a3c102000 ---p 00000000 00:00 0
7f5a3c102000-7f5a3c902000 rw-p 00000000 00:00 0
7f5a3c903000-7f5a3d903000 rw-p 00000000 00:00 0
7f5a3d903000-7f5a3da9e000 r-xp 00000000 08:06 130903 /lib64/libc-2.15.so
7f5a3da9e000-7f5a3dc9e000 ---p 0019b000 08:06 130903 /lib64/libc-2.15.so
7f5a3dc9e000-7f5a3dca2000 r--p 0019b000 08:06 130903 /lib64/libc-2.15.so
7f5a3dca2000-7f5a3dca4000 rw-p 0019f000 08:06 130903 /lib64/libc-2.15.so
7f5a3dca4000-7f5a3dca8000 rw-p 00000000 00:00 0
7f5a3dca8000-7f5a3dcbf000 r-xp 00000000 08:06 130835 /lib64/libpthread-2.15.so
7f5a3dcbf000-7f5a3debe000 ---p 00017000 08:06 130835 /lib64/libpthread-2.15.so
7f5a3debe000-7f5a3debf000 r--p 00016000 08:06 130835 /lib64/libpthread-2.15.so
7f5a3debf000-7f5a3dec0000 rw-p 00017000 08:06 130835 /lib64/libpthread-2.15.so
7f5a3dec0000-7f5a3dec4000 rw-p 00000000 00:00 0
7f5a3dec4000-7f5a3ded9000 r-xp 00000000 08:06 133636 /lib64/libgcc_s.so.1
7f5a3ded9000-7f5a3e0d8000 ---p 00015000 08:06 133636 /lib64/libgcc_s.so.1
7f5a3e0d8000-7f5a3e0d9000 r--p 00014000 08:06 133636 /lib64/libgcc_s.so.1
7f5a3e0d9000-7f5a3e0da000 rw-p 00015000 08:06 133636 /lib64/libgcc_s.so.1
7f5a3e0da000-7f5a3e1cf000 r-xp 00000000 08:06 130868 /lib64/libm-2.15.so
7f5a3e1cf000-7f5a3e3cf000 ---p 000f5000 08:06 130868 /lib64/libm-2.15.so
7f5a3e3cf000-7f5a3e3d0000 r--p 000f5000 08:06 130868 /lib64/libm-2.15.so
7f5a3e3d0000-7f5a3e3d1000 rw-p 000f6000 08:06 130868 /lib64/libm-2.15.so
7f5a3e3d1000-7f5a3e4b9000 r-xp 00000000 08:06 655189 /usr/lib64/libstdc++.so.6.0.17
7f5a3e4b9000-7f5a3e6b9000 ---p 000e8000 08:06 655189 /usr/lib64/libstdc++.so.6.0.17
7f5a3e6b9000-7f5a3e6c1000 r--p 000e8000 08:06 655189 /usr/lib64/libstdc++.so.6.0.17
7f5a3e6c1000-7f5a3e6c3000 rw-p 000f0000 08:06 655189 /usr/lib64/libstdc++.so.6.0.17
7f5a3e6c3000-7f5a3e6d8000 rw-p 00000000 00:00 0
7f5a3e6d8000-7f5a3e6f9000 r-xp 00000000 08:06 140264 /lib64/ld-2.15.so
7f5a3e7a1000-7f5a3e8a6000 rw-p 00000000 00:00 0
7f5a3e8f6000-7f5a3e8f9000 rw-p 00000000 00:00 0
7f5a3e8f9000-7f5a3e8fa000 r--p 00021000 08:06 140264 /lib64/ld-2.15.so
7f5a3e8fa000-7f5a3e8fb000 rw-p 00022000 08:06 140264 /lib64/ld-2.15.so
7f5a3e8fb000-7f5a3e8fc000 rw-p 00000000 00:00 0
7fff27325000-7fff27346000 rw-p 00000000 00:00 0 [stack]
7fff273ff000-7fff27400000 r-xp 00000000 00:00 0 [vdso]
ffffffffff600000-ffffffffff601000 r-xp 00000000 00:00 0 [vsyscall]
Aborted

@ned14
Owner

ned14 commented Oct 17, 2013

That's very useful - it'll either be dlmalloc or my changes to dlmalloc. I'll try rewalking the code path.

@Gerilgfx
Contributor Author

pastelink.me/dl/15d838#sthash.Yt123CLX.dpuf

Here is a binary compiled on my computer from this source.
(I hope this crap site doesn't replace it with some fake crap.)

@ned14
Owner

ned14 commented Oct 18, 2013

I'm off to the GSoC mentors summit in California tomorrow, then onto Seattle returning in about a week. Thanks for the binaries, and I'll look into them when I get back.

@ned14
Owner

ned14 commented Oct 28, 2013

Crap, sorry pastelink.me/dl/15d838#sthash.Yt123CLX.dpuf deletes its files after 7 days. I'll be at home for the next week though, definitely can run it and see what happens.

@Gerilgfx
Contributor Author

Okay, leave me a message and I will re-upload it.

@ned14
Owner

ned14 commented Oct 30, 2013

Message to here, or do you want me to PM you or something?

@Gerilgfx
Contributor Author

Just leave a message here. It notifies me by email.

@Gerilgfx
Contributor Author

By the way, if you use Skype or a similar IM, I can reach you there too.

@Gerilgfx
Contributor Author

Gerilgfx commented Jan 4, 2014

any step forward?

@ned14
Owner

ned14 commented Jan 4, 2014

If you remember (see thread above) I was waiting on some precompiled binaries from you as I was unable to replicate the problem here. I needed to rule out compiler/platform differences.

Note that currently everything I own is in a container being shipped from Canada to Ireland, and so any ability to run anything will be delayed until the container arrives in February. In particular, right now my access to Linux is very restricted, but I may be able to borrow time on someone's server.

@Gerilgfx
Contributor Author

Gerilgfx commented Jan 5, 2014

I sent it to your mail account back then, but it seems you didn't receive it. I will recompile the binaries and upload them somewhere.

@Gerilgfx
Contributor Author

Gerilgfx commented Jan 5, 2014

http://www.sendspace.com/file/wae5fj (click on "Click here to start download from sendspace")

@ned14
Owner

ned14 commented Jan 5, 2014

I have the file, I'll see if I can arrange access to a Linux box. Thanks Geri.

@Gerilgfx
Contributor Author

Gerilgfx commented Jan 5, 2014

okay, i am curious to see if it crashes or not.

@Gerilgfx
Contributor Author

any success?

@ned14
Owner

ned14 commented Feb 12, 2014

Well, I only got a hard line to the internet this past week, and therefore a stable SSH connection. But I admit I forgot, thanks for the reminder.

On the machine I have access to, a dual core Intel(R) Core(TM) i3-3240 CPU @ 3.40GHz running Ubuntu 12.04 LTS x64, I ran nedmalloctester three times and saw nothing but success.

Given that you compiled it using your compiler, it's probably a CPU timing difference. They're very tricky to track down especially as the usual race condition testing tools don't work so well with nedmalloc. Sorry to be so unhelpful.

Niall

@Gerilgfx
Contributor Author

I will have a chance to test it on a VIA CPU too in a few weeks, but that one has only one core and is 32-bit.

@ned14
Owner

ned14 commented Feb 13, 2014

Generally timing bugs appear most on faster-clocked CPUs with as many cores as possible. If you can lay your hands on a 32 core 3.8 GHz CPU, that would be the most useful!

Probably better is to mark up nedmalloc with the metadata for a race condition solver. That's a lot of work though, and for me nedmalloc is nearing EOL.

Niall

@Gerilgfx
Contributor Author

I have downclocked my computer to ~3.3 GHz (previously I ran it at ~3.4 GHz).
The bug still happens.

@Gerilgfx
Contributor Author

for(int iteracio=1;iteracio<90;iteracio+=1){

Try replacing the outer for loop with this one; it makes the crash happen even faster for me.

@Gerilgfx
Contributor Author

I tested two dlmalloc versions:

  • Version 2.8.6 Wed Aug 29 06:57:58 2012 Doug Lea
  • some random dlmalloc from 2006

Both work fine.

I also found a nedmalloc from 2010, and that crashes too.

@Gerilgfx
Contributor Author

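After setting affinity masks:

#include <sched.h>  // cpu_set_t, CPU_ZERO, CPU_SET (g++ defines _GNU_SOURCE by default; plain C must define it before any #include)

cpu_set_t cpuset;
CPU_ZERO(&cpuset);
for(int j=0;j<1;j++) CPU_SET(j, &cpuset);  // the loop runs once, so only CPU 0 ends up in the mask
pthread_setaffinity_np(pthread_self(), sizeof(cpu_set_t), &cpuset);

the crash goes away.
Maybe somewhere your header data isn't properly aligned?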

@eravon

eravon commented Jan 1, 2016

I have exactly the same issue. In a real-world multithreaded app (1 main thread, 3 workers) on a 2-core processor, it randomly crashes with the same error:
nedmalloc: nedprealloc() called with a block not created by nedmalloc!
The problem occurs with a Lua engine, which exclusively uses realloc functions.

Could it be a MALLOC_GLOBAL_LOCK issue?

@ned14
Owner

ned14 commented Jan 1, 2016

I'll be honest, nedmalloc is pretty much EOL for me, system allocators are nowadays plenty fast enough. https://github.com/jemalloc/jemalloc is likely an excellent substitute for nedmalloc.
