coarray "get": I got a SIGSEGV, but I do not understand why... #6
Comments
Can you provide an example? I'm not sure I understand what is happening. I know that the compiler can do some IO buffering and there was potentially a problem shutting down the Fortran runtime library gracefully, so it's possible that you may need a
As for the SIGSEGV error, how are you installing OpenCoarrays? I'm guessing that
I haven't looked through the code in detail yet, but if it's not a bug in OpenCoarrays, then it's possible that it's a parallel programming bug. For example, if you have a put, you must ensure that there is an image control statement separating any get of that memory location so that the execution statements are ordered w.r.t. each other.
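The put/get ordering rule mentioned above can be sketched as follows. This is only an illustration under assumptions: the program and variable names (put_then_get, mailbox, right, fetched) are made up and do not come from HASTY. Each image puts its index into its right neighbour, and the sync all between the put and the get is the image control statement that orders the two accesses.

```fortran
program put_then_get
  implicit none
  integer :: mailbox[*]   ! one integer per image (hypothetical name)
  integer :: right        ! index of the right neighbour, wrapping around
  integer :: fetched

  right = merge(1, this_image() + 1, this_image() == num_images())

  mailbox[right] = this_image()   ! put: define the neighbour's mailbox
  sync all                        ! image control statement separating put and get
  fetched = mailbox[right]        ! get: read the same location back

  print '(a,i0,a,i0,a,i0)', 'image ', this_image(), &
        ' fetched ', fetched, ' from image ', right
end program put_then_get
```

Without the sync all, one image could read a mailbox while another image is still defining it, and the result would not be predictable.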
@zbeekman Thank you very much for your help, it is very appreciated!
Sure, there is already a test, but it is not small/clean; today I'll clean it up for you. Essentially, I was surprised that
use iso_fortran_env
sync all
write(error_unit, *) ' I am image: ', this_image()
sync all
! some stuff
sync all
write(error_unit, *) ' Hello world from image: ', this_image()
sync all
will generate an output like
stdbuf -i0 -o0 -e0 caf_hello
I am image: 3
I am image: 2
Hello world from image: 2
I am image: 1
... namely mixing
Argggggh, Gandalf always has the right answer! I have to check, but it is quite likely that I installed the release version without the debugging symbols... As soon as I arrive at the office I'll try to install OpenCoarrays with debugging symbols (alongside the release one 😄)
This is likely the case: this is the first CAF program (of minimal complexity) that I am trying to write. It is probably a parallel programming bug.
There should be a pollution of Thank you very very much! |
I'm about to go to bed, but it may not be obvious getting traceback and debugging symbols activated. I thought traceback was on by default but I need to double check. I can guide you with CMake if you get stuck
This may in fact be a gfortran/OpenCoarrays bug... I need to take a look at the standard.... I'm guessing GFortran is buffering IO to stderr and stdout, so sync all may not be flushing these... I'll try a small reproducer on my systems and see what happens.
I wouldn't be so sure... OpenCoarrays needs more users to torture test it.... I'll give it even odds that it's a OpenCoarrays/GFortran bug.
Sorry... it means With Respect To. |
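If the runtime does buffer stderr, one thing worth trying is an explicit flush before each synchronization. The sketch below is an assumption-laden illustration, not a confirmed fix: the program name is made up, and images take turns writing so that each write is flushed before the next image starts. Even so, the launcher may still merge the per-image streams in a different order, so this should be read as a debugging aid rather than a guarantee.

```fortran
program ordered_output
  use, intrinsic :: iso_fortran_env, only: error_unit
  implicit none
  integer :: i

  ! Images take turns: image i writes and flushes, then all images synchronize.
  do i = 1, num_images()
    if (this_image() == i) then
      write(error_unit, '(a,i0)') ' I am image: ', this_image()
      flush(error_unit)   ! push the record out of the runtime buffer
    end if
    sync all
  end do
end program ordered_output
```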
On Nov 30, 2016, at 9:26 PM, Izaak Beekman ***@***.***> wrote:
I wouldn't be so sure... OpenCoarrays needs more users to torture test it.... I'll give it even odds that it's a OpenCoarrays/GFortran bug.
Oh ye of little faith! I haven’t read the details of this thread but the above quote jumped out at me. :) I’ve been teaching a class for 10 weeks in which we use CAF in nearly every example in lecture and nearly every problem in homework assignments, and I only rarely encounter bugs. I will say, however, that CRITICAL is not a feature I’ve used because of its negative performance implications and I’ve only used unlimited polymorphism rarely because it just feels too limited to be of great value so this code is mixing two features that I’d imagine are quite rarely mixed.
Stefano, as you know, if you’d like to book some time to explain to me what you’re doing and especially the motivations for using CRITICAL, I’d be glad to offer any insights that come to mind. I can’t help but wonder if there’s a better design that would eliminate the need for CRITICAL. I think of that as a last resort only to be used when absolutely necessary — kind of like locks and atomics, which have better alternatives in Fortran 2015 (events).
Damian
|
Can you make a MCVE? I cannot reproduce because In particular, I want to be able to |
Dear Damian,
This is also my feeling: the issue is more related to my poor-fortranish than to a possible (improbable) OC bug.
Arggggghhhhh, do not consider exactly that test... the one uploaded is just the last meaningless modification of the baseline test, the addition of
You are very very kind, but as we experienced that last time my spoken English is very bad, I do not like to waste your time for a not-so-important talk. Anyhow, soon I'll probably bother you for some CAF teachings (aside I invited @afanfa at my Institute for a lecture on CAF hoping that he will have the patience to talk with oompa loompa 😄 ) Cheers. |
I'll try today to dump a Minimal Complete and Verifiable Example, but I feel that the problem is really into
Thank you very very much for your help!
P.S. do you know if the standard says anything about buffering on the standard error unit?
Yep, I'll add such bias soon, sorry for the bothering. |
@szaghi Thanks. I was hoping to try Cray and Intel implementations to see if this issue is GCC-specific or not. |
@zbeekman @rouson @jeffhammond Dear All, I have made a small step forward. I failed to build OpenCoarrays in debug mode: all the debug options I found in the build/download scripts seem to refer to debugging the build/download scripts themselves, not to enabling the debug flags for building OpenCoarrays. However, this is not very important for the moment (in the near future I would like to have OC with all debug symbols activated), because I think I found my error.
call self%bucket(b)[i]%get_clone(key=key, content=content)
Note that the bucket index
src/lib/hasty_hash_table.f90:237:0:
call self%bucket(b)[i]%get_clone(key=key, content=content)
internal compiler error: in gfc_get_tree_for_caf_expr, at fortran/trans-expr.c:1818
Please submit a full bug report,
with preprocessed source if appropriate.
See <https://bugs.archlinux.org/> for instructions.
Before asking for help about this ICE, I tried to circumvent it with the following workaround
call dictionary_get_clone(self=self%bucket(b)[i], key=key, content=content)
where dictionary_get_clone is the (publicly exposed) get_clone method of the dictionary type. This non-TBP version compiles correctly. However, I think my error is (probably) here.
What does it mean to pass the remote copy bucket(b)[i] as a dummy argument to a local procedure? I really do not know if this is allowed and what it implies. Anyhow, in case this is allowed, I think that a (temporary) copy of the remote data must be made and, if so, the copy could be a mess... because it is a derived type containing pointers! Consider the following snippet
type :: mess
integer :: val=0
type(mess), pointer :: next=>null()
end type mess
...
type(mess), pointer :: foo=>null()
type(mess), pointer :: foo1=>null()
type(mess) :: foo2
allocate(foo)
foo%val = 0
allocate(foo1)
foo1%val = 1
foo1%next=> foo
foo2 = foo1
After the last copy foo2 = foo1 I think that foo2%next is not pointing to foo, is this right? I will check this now while trying to dump a MCVE for Jeff...
Dear Jeff, I added a makefile to build the test I am now playing with (only the branch). However, do not waste your time with it now: there is a great problem on the algorithm I am using to distributing the work-load among the CAF images that must be resolved before trying to fix this SIGSEGV bug... I simply like to say that I am not forgetting to create the MCVE for you, but I need more time. Cheers. |
On Dec 1, 2016, at 12:42 AM, Stefano Zaghi ***@***.***> wrote:
You are very very kind, but as we experienced that last time my spoken English is very bad, I do not like to waste your time for a not-so-important talk.
I have no recollection of difficulty understanding your English so please don’t let that stop you if I can be of any assistance.
(aside I invited @afanfa <https://github.com/afanfa> at my Institute for a lecture on CAF hoping that he will have the patience to talk with oompa loompa 😄 )
That’s great news! I hope it works out for him to visit.
Damian
|
On Dec 1, 2016, at 4:50 AM, Stefano Zaghi ***@***.***> wrote:
call self%bucket(b)[i]%get_clone(key=key, content=content)
@szaghi
As I’m sure you know an internal compiler error is always a compiler bug so you might submit this via the GCC Bugzilla <https://gcc.gnu.org/bugzilla/> site. If you do, then you should also email the bug report to [email protected] <mailto:[email protected]>. I’m pretty certain the above code is invalid, which doesn’t change the fact that an ICE is a compiler bug because the compiler should inform you of the invalid code.
I haven’t checked the standard for the exact language, but one image is not allowed to execute code in another image, which could be one interpretation of the above code if it were standard-conforming code. Alternatively, if the intention is to get an object from another image and then invoke a TBP on that object, then I’m pretty certain you have to copy that object into a local data structure in a prior statement and then invoke the TBP on the local data structure or do something like what you do below.
Before asking for help about this ICE, I tried to circumvent it with the following workaround
call dictionary_get_clone(self=self%bucket(b)[i], key=key, content=content)
where dictionary_get_clone is the (publically exposed) get_clone method of the dictionary type. This non-TBP version compile correctly. However, I think here is (probably) my error.
What means passing the remote copy bucket(b)[i] as a dummy argument to a local procedure?
I see nothing wrong with your second version. It simply means to get the remote data and pass it as the actual argument to the keyword argument named “self”. It might be nice to see a short, compilable example so we can see the full procedure interface.
I really do not know if this is allowed and what implies. Anyhow, in the case this is allowed, I think that a (temporary) copy of the remote data must be done and, if so, the copy could be a mess... because it is a derived type containing pointers! Consider the following snippet
type :: mess
integer :: val=0
type(mess), pointer :: next=>null()
end type mess
...
type(mess), pointer :: foo=>null()
type(mess), pointer :: foo1=>null()
type(mess) :: foo2
allocate(foo)
foo%val = 0
allocate(foo1)
foo1%val = 1
foo1%next=> foo
foo2 = foo1
After the last copy foo2 = foo1 I think that foo2%next is not pointing to foo, is this right? I check this now trying to dump a MCVE for Jeff...
Oh boy… you probably know how much I try to avoid pointers. They are especially dangerous with coarrays. If you communicate an object between images and that object contains a pointer, I’m pretty sure the pointer becomes undefined. I don’t see clearly that you’re doing this above, but it sounds like it might be happening based on the context. (On a possibly related note, I don’t think the standard allows associating a pointer with a coarray.)
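On the question of what foo2 = foo1 does on a single image: as far as I know, intrinsic assignment of a derived type pointer-assigns its pointer components rather than deep-copying them, so foo2%next would end up associated with the same target as foo1%next. A minimal sketch to check this locally (the program name is made up; the type mirrors the quoted snippet, with the pointer attribute on next that a recursive component needs):

```fortran
program pointer_component_copy
  implicit none
  type :: mess
    integer :: val = 0
    type(mess), pointer :: next => null()
  end type mess
  type(mess), pointer :: foo => null()
  type(mess), pointer :: foo1 => null()
  type(mess) :: foo2

  allocate(foo)
  foo%val = 0
  allocate(foo1)
  foo1%val = 1
  foo1%next => foo

  foo2 = foo1   ! val is copied by value, next is pointer-assigned (shallow copy)

  ! Two-argument associated() tests whether both pointers share the same target.
  print *, 'foo2%next associated with foo: ', associated(foo2%next, foo)

  deallocate(foo1)
  deallocate(foo)
end program pointer_component_copy
```

This only covers a copy within one image. When the right-hand side is a coindexed object, the pointer would refer to memory that belongs to another image, which is the case warned about above.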
And I'm guessing from the “next” name above that you’re doing this for purposes of constructing a linked list. Linked lists are next on my avoid list right after pointers, but at least in the rare case in which I found it useful to construct a linked list, we constructed it using arrays and indirect addressing rather than using pointers. It makes life much easier. Did you see the video <https://www.youtube.com/watch?v=YQs6IC-vgmo> I posted to c.l.f a while back in which C++ language inventor Bjarne Stroustrup argues that there is almost always a better choice of data structures than a linked list?
Damian
|
Dear Damian,
Yes, I know what an ICE means but, as you said, I was almost sure that the first version is invalid, so I preferred to understand whether it was really invalid or not. Submitting a GCC report is not so easy and I would like to write a correct, meaningful report to help my GCC superheroes and not waste their time. Now that you have confirmed that it is invalid I'll try to create a MCVE for the GCC team.
This was my intuition when I tried this workaround, but my CAF knowledge is growing empirically. (OT: please consider the idea of writing a new book devoted to CAF, it is really necessary...)
I am working on a MCVE for (all of) you. Indeed, it is not so easy to reduce everything: HASTY is rather stupid, but not so simple... yesterday I finished a MCVE for @jeffhammond that I supposed had all the needed ingredients, but it works right without raising the SIGSEGV 😢
Me too, this was my mantra. Nevertheless, when I started to develop my Adaptive Mesh Refinement methods I needed a different data structure from arrays... and I had to play with pointers, my bad.
Ohhhh, do they surely become undefined? Are there no solutions?
Yes, I was careful to avoid associating pointers between images. The hash table of HASTY currently has 2 getters,
I use a linked list (doubly linked in this case) for the chaining collision resolution that is always needed when the hashing function is not fully injective. I am not really an expert, but at least for my use case I cannot use perfect hashing without collisions (the tables become huge), thus I need to resolve the collisions. To my knowledge, linked lists work very well for this aim.
I have seen some of these approaches, but I found them rather more complex (for my poor software-engineering level) than a plain linked list. Can you give me some good references about these approaches? Your own works (books, papers, reports) are preferred, I find your English more understandable 😄
Mmm, I partially agree. As I said, I briefly (superficially) read about some of these approaches and I found them not so simple. However, this was not the main reason why I preferred a plain linked list (if I had to discard all the things that I do not understand the first time, I should probably go and take care of my garden...). My main concern for the AMR data structure is to have reasonably good access efficiency, namely something near O(1) on average, while keeping efficient O(1) insertion/removal of nodes, which is typically a feature of linked lists. With indirect array indexes, is this put/remove efficiency possible?
No, I missed it. I'll see it when Angelica will sleep 😄 Damian, thank you very very much, as always, great teachings! |
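Regarding O(1) insertion/removal with arrays and indirect addressing: one possible layout, sketched here under assumptions, keeps the values and the next/prev links in plain integer arrays and recycles removed slots through a free list. All names (index_list_module, index_list, nil, push_front, remove) are hypothetical and are not taken from HASTY or from any library; the capacity is fixed and assumed to be at least 1.

```fortran
module index_list_module
  implicit none
  private
  public :: index_list, nil

  integer, parameter :: nil = 0   ! index playing the role of a null pointer

  type :: index_list
    integer, allocatable :: val(:), next(:), prev(:)
    integer :: head = nil   ! first slot in use
    integer :: free = nil   ! first unused slot
  contains
    procedure :: initialize
    procedure :: push_front
    procedure :: remove
  end type index_list
contains
  subroutine initialize(self, capacity)
    class(index_list), intent(out) :: self
    integer, intent(in) :: capacity   ! assumed >= 1
    integer :: i
    allocate(self%val(capacity), self%next(capacity), self%prev(capacity))
    self%val = 0
    self%prev = nil
    self%next = [(i + 1, i = 1, capacity)]   ! free list: 1 -> 2 -> ... -> capacity
    self%next(capacity) = nil
    self%head = nil
    self%free = 1
  end subroutine initialize

  function push_front(self, val) result(slot)   ! O(1) insertion
    class(index_list), intent(inout) :: self
    integer, intent(in) :: val
    integer :: slot
    slot = self%free
    if (slot == nil) return   ! list is full: caller must check for nil
    self%free = self%next(slot)
    self%val(slot) = val
    self%prev(slot) = nil
    self%next(slot) = self%head
    if (self%head /= nil) self%prev(self%head) = slot
    self%head = slot
  end function push_front

  subroutine remove(self, slot)   ! O(1) removal of a slot currently in use
    class(index_list), intent(inout) :: self
    integer, intent(in) :: slot
    if (self%prev(slot) /= nil) then
      self%next(self%prev(slot)) = self%next(slot)
    else
      self%head = self%next(slot)
    end if
    if (self%next(slot) /= nil) self%prev(self%next(slot)) = self%prev(slot)
    self%next(slot) = self%free   ! hand the slot back to the free list
    self%free = slot
  end subroutine remove
end module index_list_module

program index_list_demo
  use index_list_module
  implicit none
  type(index_list) :: list
  integer :: a, b, c, i

  call list%initialize(capacity=4)
  a = list%push_front(10)
  b = list%push_front(20)
  c = list%push_front(30)
  call list%remove(b)

  i = list%head                 ! walk the list through the index links
  do while (i /= nil)
    print *, list%val(i)
    i = list%next(i)
  end do
end program index_list_demo
```

Because the links are integers rather than pointers, they stay meaningful when the arrays are copied, which is the property that makes this layout attractive next to coarrays. The trade-off in this sketch is the fixed capacity; growing the arrays would need a reallocation step that is not shown.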
Dear Damian, I created a (potential) bug report for the ICE I got with GNU gfortran, see it here, number 78682. I wrote a MCVE for raising the ICE, it is the following
module core_module
implicit none
type :: core
integer :: core_value
contains
procedure :: core_value_print
end type
contains
subroutine core_value_print(self)
class(core), intent(in) :: self
print*, 'image: ', this_image(), ' core value: ', self%core_value
end subroutine core_value_print
end module core_module
program gfortran_ice_caf
use core_module
implicit none
type(core), allocatable :: core_caf[:]
allocate(core_caf[*])
if (mod(this_image(), 2)==0) then
core_caf%core_value = 2
else
core_caf%core_value = 1
endif
if (this_image()==2) call core_caf[1]%core_value_print
end program gfortran_ice_caf
Building it on my workstation with GNU gfortran 6.2.1 I obtain
stefano@zaghi(04:30 PM Mon Dec 05)
~/fortran/compilers_bug/gfortran-ice-caf-derived_type 1 files, 12Kb
→ gfortran -fcoarray=lib gfortran_ice_caf.f90
gfortran_ice_caf.f90:31:0:
if (this_image()==2) call core_caf[1]%core_value_print
internal compiler error: in gfc_get_tree_for_caf_expr, at fortran/trans-expr.c:1818
Please submit a full bug report,
with preprocessed source if appropriate.
See <https://bugs.archlinux.org/> for instructions.
As soon as it is confirmed to be a bug (maybe it is not), I can add it to your AdHoc if you like. My best regards.
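For comparison, the alternative suggested earlier in the thread (copy the remote object into a local variable first, then invoke the type-bound procedure on the copy) could look like the sketch below. It reuses core_module from the MCVE; the program name and the local_copy variable are made up for illustration, and the sync all is an addition so that image 1 has defined its value before image 2 fetches it.

```fortran
module core_module
  implicit none
  type :: core
    integer :: core_value
  contains
    procedure :: core_value_print
  end type
contains
  subroutine core_value_print(self)
    class(core), intent(in) :: self
    print*, 'image: ', this_image(), ' core value: ', self%core_value
  end subroutine core_value_print
end module core_module

program caf_tbp_on_local_copy
  use core_module
  implicit none
  type(core), allocatable :: core_caf[:]
  type(core) :: local_copy   ! hypothetical local buffer for the remote object

  allocate(core_caf[*])
  if (mod(this_image(), 2) == 0) then
    core_caf%core_value = 2
  else
    core_caf%core_value = 1
  end if
  sync all   ! order the definitions above before any remote get

  if (this_image() == 2) then
    local_copy = core_caf[1]            ! get the remote object first
    call local_copy%core_value_print    ! then call the TBP on the local copy
  end if
end program caf_tbp_on_local_copy
```

The printed image number is that of the calling image (2), since the procedure runs locally on the copy.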
Thanks for letting me know. I just added a comment on the bug report.
I read it. Thank you too for your great help. I have rebuilt mpich/openmpi/opencoarrays with the latest 7.0.0 trunk and, as Janus said, the ICE vanished, thus this is an issue for the versions before 7.x. However, the 7.0.0 testing version compiles the invalid code without any warnings/errors, and this is another issue... @rouson @zbeekman I have another question: the gfortran versions before 7.0.0 complain about derived-type coarrays having allocatable members (but not about pointer ones), thus I had to do some tricks with gfortran 6.2.1, while the new 7.0.0 also accepts derived types with allocatable members. Are allocatable members in coarrays allowed by the 2008/2015 standards? I am thinking of adopting the 7.x trunk as the base version for HASTY... Cheers.
On Dec 8, 2016, at 8:30 AM, Stefano Zaghi ***@***.***> wrote:
@rouson <https://github.com/rouson>
I read it. Thank you too for your great help.
I have re-build mpich/openmpi/opencoarrays with the latest 7.0.0 trunk and as Janus said the ICE vanished thus this is an issue for the versions before 7.x However, the 7.0.0 testing version compile the invalid code without any warnings/errors, and this is another issue...
The standard defines certain “constraints” that tell compiler vendors what they are required to detect and report at compile-time. I don’t think there is a constraint related to this issue and I can’t imagine there could be one. Except in trivial cases, this would be difficult or impossible for a compiler to detect without executing the code.
@rouson <https://github.com/rouson> @zbeekman <https://github.com/zbeekman> I have another question: the gfortran versions before 7.0.0 complain with derived type caf having allocatable members (but not for pointer ones) thus I had to do some tricks with gfortran 6.2.1, while the new 7.0.0 accepts also derived type with allocatable members. Is allocatable members into caf allowed by the standard 2008/2015? I am thinking to adopt 7.x trunk as the base version for HASTY...
Yes, allocatable components of derived-type coarrays are allowed in 2008 and 2015. The implementation of that feature in gfortran/OpenCoarrays is immature. It might work in some cases, but not in others so proceed with caution. I hope that the support will be more complete within a few weeks.
It’s probably not a great idea to adopt a pre-release version as a requirement because pre-release versions can be unstable. I’m a fan of staying at the bleeding edge and I regularly use pre-release versions of gfortran. In fact, I just taught a whole graduate course based on pre-release versions of gfortran, but it was very tricky and only worked out because I was able to get quick responses on bug fixes and update the students to a new version of the compiler midway through the course. The major benefit of using the virtual machine is that I can roll a whole new environment out to students at any time and know that each of them is working in the exact same environment even though each works on the system of their choosing.
Damian
|
Dear Damian,
Sure, what I meant is that it is another issue for me, not for the compiler: I am learning CAF empirically, so if the compiler compiles and runs invalid code, that is an issue for my way of learning.
Good, I am going with allocatables.
Sure, I agree, but HASTY is an experiment: I have to demonstrate to my bosses that CAF is possible... by the time HASTY is finished I will have gcc 8...
I am alone, I handle my system alone, I have a rolling release GNU Linux OS, so it is no problem to follow gcc updates 😄 Cheers
I have done a lot of dry-cleaning on a MCVE and I finally understand that the SIGSEGV is caused by allocatable members in coarrays: gfortran lower than 7.0.0 complains directly at compile time (but I used a workaround that hides the issue), while gcc trunk 7.0.0 compiles the coarray with an allocatable member but generates a SIGSEGV at runtime: if I make the member static the SIGSEGV vanishes and the results are as expected. The MCVE is not yet so minimal (more than 1000 slocs), but in a few hours I hope to reduce it to a minimum: as soon as I have a MCVE can you test it with Cray?
The MCVE is not yet so minimal (more than 1000 slocs), but in a few hours I hope to reduce it to a minimum: as soon as I have a MCVE can you test it with Cray?
SLOC isn't that important. I just want to be able to type "make FC=ftn" and then "srun .. $BIN" and have it work, or something close to that.
|
Sure, it is going to be one file test 😄 |
Dear Jeff and Damian, in the following there is the most minimal example that I was able to create. Unfortunately, it behaves slightly differently from the HASTY-dry example... read the following. @rouson Damian: this example generates an ICE with all the gcc versions I have... do you think I should submit it to the same ticket I opened for the other issue? Or is this more related to allocatable members of coarrays?
The MCVE
I cannot attach Fortran code here, thus please cut/paste the following code
module node_module
use, intrinsic :: iso_fortran_env
implicit none
private
public :: node
type :: node
#ifdef RAISE_ERROR
integer(int32), allocatable :: storage
#else
integer(int32) :: storage
#endif
contains
procedure :: add
procedure :: get
end type node
contains
subroutine add(self, storage)
class(node), intent(inout) :: self
integer(int32), intent(in) :: storage
#ifdef RAISE_ERROR
if (.not.allocated(self%storage)) allocate(self%storage)
#endif
self%storage = storage
end subroutine add
subroutine get(self, storage)
class(node), intent(in) :: self
integer(int32), allocatable, intent(out) :: storage
#ifdef RAISE_ERROR
if (allocated(self%storage)) allocate(storage, source=self%storage)
#else
allocate(storage, source=self%storage)
#endif
end subroutine get
end module node_module
module caf_module
use, intrinsic :: iso_fortran_env
use node_module
implicit none
private
public :: caf
type :: caf
private
type(node), allocatable :: array[:]
contains
procedure :: add
procedure :: get
procedure :: initialize
end type caf
contains
subroutine add(self, storage)
class(caf), intent(inout) :: self
integer(int32), intent(in) :: storage
call self%array%add(storage=storage)
end subroutine add
subroutine get(self, image, storage)
class(caf), intent(in) :: self
integer(int32), intent(in) :: image
integer(int32), allocatable, intent(out) :: storage
type(node) :: copy_of_remote_node
if (image/=this_image()) then
copy_of_remote_node = self%array[image]
call copy_of_remote_node%get(storage=storage)
else
call self%array%get(storage=storage)
endif
end subroutine get
subroutine initialize(self)
class(caf), intent(inout) :: self
if (allocated(self%array)) deallocate(self%array)
allocate(self%array[*])
end subroutine initialize
end module caf_module
program sigsegv_caf_dt
use, intrinsic :: iso_fortran_env
use caf_module
implicit none
type(caf) :: caf_storage
integer(int32), allocatable :: storage
call caf_storage%initialize
sync all
call caf_storage%add(storage=int(this_image(), int32))
sync all
if (this_image()==1) then
print*, 'hello from image: ', this_image()
print*, 'test get from image 2 by image 1'
call caf_storage%get(storage=storage, image=2_int32)
if (allocated(storage)) then
print*, 'storage cloned: ', storage
else
print*, 'get_clone failed'
endif
endif
end program sigsegv_caf_dt
Save it with .F90 extension: it has one cpp macro to enable/disable the SIGSEGV (namely, to enable/disable allocatable
My log
I compiled it with GNU Fortran (GCC) 7.0.0 20161206 (experimental), MPICH 3.2.0 (compiled by gcc 7.0.0) and OpenCoarrays 1.7.5 (compiled by gcc 7.0.0).
Working good
If I compile the static version with
→ caf -fcoarray=lib sigsegv_caf_dt.F90
I obtain the expected result
stefano@zaghi(05:34 PM Fri Dec 09) desk {opencoarrays-1.7.5-gnu-7.0.0 - OpenCoarrays 1.7.5 with gcc 7.0.0 environment}
~/fortran/compilers_bug/gfortran_sigsegv_caf_dt_allocatable_member 4 files, 80Kb
→ cafrun -np 2 a.out
hello from image: 1
test get from image 2 by image 1
storage cloned: 2
If I enable the allocatability of
→ caf -fcoarray=lib sigsegv_caf_dt.F90 -DRAISE_ERROR
I obtain... an ICE (sigh, my bad... with the HASTY example it compiles right, but generates a runtime SIGSEGV)
stefano@zaghi(05:34 PM Fri Dec 09) desk {opencoarrays-1.7.5-gnu-7.0.0 - OpenCoarrays 1.7.5 with gcc 7.0.0 environment}
~/fortran/compilers_bug/gfortran_sigsegv_caf_dt_allocatable_member 4 files, 80Kb
→ caf -fcoarray=lib sigsegv_caf_dt.F90 -DRAISE_ERROR
sigsegv_caf_dt.F90:87:0:
end module caf_module
internal compiler error: Segmentation fault
0xc0db4f crash_signal
/opt/arch/gcc/opencoarrays/prerequisites/downloads/trunk/gcc/toplev.c:333
0xeb1764 recompute_tree_invariant_for_addr_expr(tree_node*)
/opt/arch/gcc/opencoarrays/prerequisites/downloads/trunk/gcc/tree.c:4317
0xeb1d7c build1_stat(tree_code, tree_node*, tree_node*)
/opt/arch/gcc/opencoarrays/prerequisites/downloads/trunk/gcc/tree.c:4414
0x92c76c build1_stat_loc
/opt/arch/gcc/opencoarrays/prerequisites/downloads/trunk/gcc/tree.h:3903
0x92c76c fold_build1_stat_loc(unsigned int, tree_code, tree_node*, tree_node*)
/opt/arch/gcc/opencoarrays/prerequisites/downloads/trunk/gcc/fold-const.c:12139
0x6f204f gfc_build_addr_expr(tree_node*, tree_node*)
/opt/arch/gcc/opencoarrays/prerequisites/downloads/trunk/gcc/fortran/trans.c:298
0x70532b structure_alloc_comps
/opt/arch/gcc/opencoarrays/prerequisites/downloads/trunk/gcc/fortran/trans-array.c:8329
0x7827b3 gfc_trans_deallocate(gfc_code*)
/opt/arch/gcc/opencoarrays/prerequisites/downloads/trunk/gcc/fortran/trans-stmt.c:6477
0x6f1bf7 trans_code
/opt/arch/gcc/opencoarrays/prerequisites/downloads/trunk/gcc/fortran/trans.c:1942
0x7742f3 gfc_trans_if_1
/opt/arch/gcc/opencoarrays/prerequisites/downloads/trunk/gcc/fortran/trans-stmt.c:1303
0x77c39a gfc_trans_if(gfc_code*)
/opt/arch/gcc/opencoarrays/prerequisites/downloads/trunk/gcc/fortran/trans-stmt.c:1334
0x6f1ce7 trans_code
/opt/arch/gcc/opencoarrays/prerequisites/downloads/trunk/gcc/fortran/trans.c:1878
0x7742f3 gfc_trans_if_1
/opt/arch/gcc/opencoarrays/prerequisites/downloads/trunk/gcc/fortran/trans-stmt.c:1303
0x77c39a gfc_trans_if(gfc_code*)
/opt/arch/gcc/opencoarrays/prerequisites/downloads/trunk/gcc/fortran/trans-stmt.c:1334
0x6f1ce7 trans_code
/opt/arch/gcc/opencoarrays/prerequisites/downloads/trunk/gcc/fortran/trans.c:1878
0x77e271 gfc_trans_simple_do
/opt/arch/gcc/opencoarrays/prerequisites/downloads/trunk/gcc/fortran/trans-stmt.c:1924
0x77e271 gfc_trans_do(gfc_code*, tree_node*)
/opt/arch/gcc/opencoarrays/prerequisites/downloads/trunk/gcc/fortran/trans-stmt.c:2057
0x6f1cba trans_code
/opt/arch/gcc/opencoarrays/prerequisites/downloads/trunk/gcc/fortran/trans.c:1890
0x723038 gfc_generate_function_code(gfc_namespace*)
/opt/arch/gcc/opencoarrays/prerequisites/downloads/trunk/gcc/fortran/trans-decl.c:6271
0x6f6949 gfc_generate_module_code(gfc_namespace*)
/opt/arch/gcc/opencoarrays/prerequisites/downloads/trunk/gcc/fortran/trans.c:2164
Please submit a full bug report,
with preprocessed source if appropriate.
Please include the complete backtrace with any bug report.
See <http://gcc.gnu.org/bugs.html> for instructions.
So my questions are:
Summary
As always, your help is priceless, you are my heroes! Cheers.
Dear All,
I am sorry for bothering you...
In the just-uploaded feature/add-coarray-buckets branch I obtain a SIGSEGV that I am not able to debug... any hints are much more than welcome. In the following there is a full report.
The test
The test is minimal
The get_clone method is
The statement call dictionary_get_clone(self%bucket(b)[i], key=key, content=content) is where all evil starts.
Results using OpenCoarrays/GNU gfortran
The call to get_clone raises a SIGSEGV if the number of images is greater than 1
Valgrind inspection
There are memory leaks, but I cannot understand why
Digging deeper
I think that the final memory leak happens when I try to check if a node has a key here
Note that self is not defined as pointer, but when I invoke has_key as method it is likely a pointer into a list. Moreover, before calling has_key on a pointer-node I check if the node is associated, see here
The call iterator... statement is where I actually pass the has_key iterator check on pointer-node `p'.
@LadaF @jeffhammond @MichaelSiehl @zbeekman @rouson have some suggestions? (do not think I want you to force to read all, just what do you make in such situations?).
In such situation I generally try other Compilers, but as you know for this project I have to stick on GNU gfortran (OpenCoarrays).
O.T. @rouson @MichaelSiehl @zbeekman I am failing to force a sync all for debugging output: even echoing on the standard error unit and disabling all IO buffering of my shell, the write(error_unit...) statements of my tests seem to be unaffected by sync all: is sync all really like an MPI barrier, or am I misunderstanding (a lot)?