
Print gauges in a critical section, using a single unit number #536

Merged: 3 commits merged into clawpack:master on May 31, 2022

Conversation

rjleveque (Member):

I changed the print_gauges_and_reset_nextLoc function to print all gauge data using the same output file unit number OUTGAUGEUNIT, rather than creating a unit number from this base value plus the OMP thread number. I also put this code in an OMP critical block so that only one thread can be writing gauges at a time.

The other approach allows writing in parallel, in principle, but recently I've been getting seg faults when using lots of threads that seem related to this. It also seems like potentially bad practice, since we don't know in advance which unit numbers will be used for writing gauges, and we use other unit numbers for other purposes, so there are potential conflicts.

I don't think this will introduce any bottlenecks.
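
For concreteness, here is a minimal sketch of the pattern described above. OUTGAUGEUNIT = 89 matches the module parameter, but the subroutine name, arguments, and write format are simplified placeholders rather than the actual gauges_module.f90 code:

    ! Sketch only, not the actual gauges_module.f90 code: all threads share
    ! the single unit OUTGAUGEUNIT, and the open/write/close is wrapped in
    ! an OMP critical block so only one thread does gauge I/O at a time.
    subroutine print_gauge_buffer(fname, nvals, buffer)
        implicit none
        character(len=*), intent(in) :: fname
        integer, intent(in) :: nvals
        real(kind=8), intent(in) :: buffer(nvals)
        integer, parameter :: OUTGAUGEUNIT = 89   ! single shared unit number
        integer :: j

    !$omp critical (gauge_io)
        ! No other thread can be inside this block, so reusing the same
        ! unit number on every thread cannot cause conflicts.
        open(unit=OUTGAUGEUNIT, file=fname, status='unknown', position='append')
        do j = 1, nvals
            write(OUTGAUGEUNIT, '(e26.16)') buffer(j)
        enddo
        close(OUTGAUGEUNIT)
    !$omp end critical (gauge_io)
    end subroutine print_gauge_buffer

Because gauge output is buffered and only flushed occasionally, the hope is that the time spent inside this critical section is small.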

mandli (Member) commented Apr 24, 2022:

There's a simple fix for this that does not require us to serialize this process: use a fixed set of unit numbers equal to the number of threads and reuse them. That would give us a pool of units that are available as required but never exceed the number of threads.
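
One way such a pool could look, keyed off the OpenMP thread number (a hypothetical sketch, not existing GeoClaw code; the module and function names are illustrative):

    ! Hypothetical sketch of a bounded unit pool: thread i always reuses
    ! unit OUTGAUGEUNIT + i, so at most omp_get_max_threads() units are
    ! ever taken for gauge output.
    module gauge_unit_pool
        use omp_lib, only: omp_get_thread_num
        implicit none
        integer, parameter :: OUTGAUGEUNIT = 89   ! base unit for gauges
    contains
        integer function get_gauge_unit()
            ! Each thread gets a fixed unit from the pool, so no two
            ! threads ever hold the same unit at the same time.
            get_gauge_unit = OUTGAUGEUNIT + omp_get_thread_num()
        end function get_gauge_unit
    end module gauge_unit_pool

As the next comment notes, this thread-indexed scheme is essentially what the code did before this change.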

rjleveque (Member, Author):

@mandli: Thanks for reviewing. I think what you are suggesting is the way it was before I changed it: there was a unit number for each thread defined by

   myunit = OUTGAUGEUNIT + mythread

But I was still running into segmentation faults for some runs with 20 threads that seemed to be caused by this, and some people use many more threads, so we have to declare all unit numbers above 89 off limits for other purposes (since OUTGAUGEUNIT = 89).

Do you think this could be a bottleneck if it's serialized?

mandli (Member) commented Apr 26, 2022:

...and it still causes you issues. I do recall you mentioning this. It definitely would not be great to serialize this if we have a large number of gauges, but we do buffer the output, which mitigates the problem. Is this reproducible, by chance?

In the end it only serializes the output, so maybe that's not so bad, as long as the buffers keep multiple gauges from writing at the same time, though I would expect those writes to still overlap, for instance when a bunch of gauges are in the same area. It would be better to understand why there is a seg fault than anything else.

rjleveque (Member, Author):

OK, we can hold off on merging this and I'll try to look into the seg fault issue some more.

rjleveque added a commit to rjleveque/geoclaw that referenced this pull request May 24, 2022
This part is still WIP, see discussion at clawpack#536
mandli merged commit 4de31f7 into clawpack:master on May 31, 2022
bolliger32 added a commit to ClimateImpactLab/geoclaw that referenced this pull request May 31, 2022
* put print_gauges in critical block and use same output unit number for all

* change a comment about gauge critical section

* declare variable that should be integer

* support binary gauge output in gauges_module.f90

* added binary gauge output to tests/dtopo1

note that pyclaw.gauges changes are required to read binary file

* redo tests/dtopo1 so binary gauge data produced is compared with archived ascii data for portability

* update claw_git_status for tests/dtopo1

* revert code that prints gauges in critical section

This part is still WIP, see discussion at clawpack#536

* add support for binary32 gauge output

Co-authored-by: Randy LeVeque <[email protected]>
Co-authored-by: Kyle Mandli <[email protected]>
mjberger (Contributor) commented Oct 11, 2022 via email

rjleveque (Member, Author):

This change was reverted, but I just opened a new issue #543 to remind us to revisit this at some point.
