-
Notifications
You must be signed in to change notification settings - Fork 88
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Print gauges in a critical section, using a single unit number #536
Print gauges in a critical section, using a single unit number #536
Conversation
There's a simple fix for this that does not require us to serialize this process. Use a set number of unit numbers equal to the number of threads and reuse them. We would therefore create a pool of units that would be available as required but would not go above the number of threads available. |
@mandli: Thanks for reviewing. I think what you are suggesting is the way it was before I changed it: there was a unit number for each thread defined by
But still I was running into segmentation faults for some runs with 20 threads that seemed to be caused by this, and some people use many more threads so we have to declare all unit numbers above 89 off limits for other purposes (since Do you think this could be a bottleneck if it's serialized? |
...and it still causes you issues. I guess I remember that you have mentioned this. It definitely would not be great to serialize this if we have a large number of gauges but we also buffer the output to mitigate the problem as well. Is this reproducible by chance? In the end it is only serializing the output so maybe that's not so bad as long as we have buffers that avoid times when multiple gauges will be writing but I would expect that these are going to still overlap. For instance, if a bunch of gauges are in the same area. I would be better to understand why there is a seg-fault than anything. |
OK, we can hold off on merging this and I'll try to look into the seg fault issue some more. |
This part is still WIP, see discussion at clawpack#536
* put print_gauges in critical block and use same output unit number for all * change a comment about gauge critical section * declare variable that should be integer * support binary gauge output in gauges_module.f90 * added binary gauge output to tests/dtopo1 note that pyclaw.gauges changes are required to read binary file * redo tests/dtopo1 so binary gauge data produced is compared with archived ascii data for portability * update claw_git_status for tests/dtopo1 * revert code that prints gauges in critical section This part is still WIP, see discussion at clawpack#536 * add support for binary32 gauge output Co-authored-by: Randy LeVeque <[email protected]> Co-authored-by: Kyle Mandli <[email protected]>
i think it could be a bottleneck. People can use hundreds of gauges. Maybe we need a new approach.
— Marsha
… On Apr 24, 2022, at 11:43 PM, Randall J. LeVeque ***@***.***> wrote:
@mandli <https://github.com/mandli>: Thanks for reviewing. I think what you are suggesting is the way it was before I changed it: there was a unit number for each thread defined by
myunit = OUTGAUGEUNIT + mythread
But still I was running into segmentation faults for some runs with 20 threads that seemed to be caused by this, and some people use many more threads so we have to declare all unit numbers above 89 off limits for other purposes (since OUTGAUGEUNIT=89).
Do you think this could be a bottleneck if it's serialized?
—
Reply to this email directly, view it on GitHub <#536 (comment)>, or unsubscribe <https://github.com/notifications/unsubscribe-auth/AAGUGC2Q4MF4W6LOGRFH4YTVGYIF3ANCNFSM5UD3WTAA>.
You are receiving this because you are subscribed to this thread.
|
This change was reverted, but I just opened a new issue #543 to remind us to revisit this at some point. |
I changed the
print_gauges_and_reset_nextLoc
function to print all gauge data using the same output file unit numberOUTGAUGEUNIT
rather than creating a unit number based on this base plus the OMP thread number. And I put this code in an OMP critical block so that only one thread can be doing this at a time.The other approach allows writing in parallel, in principle, but recently I've been having seg faults when using lots of threads that seems related to this. It also seems potentially bad practice since we don't know what unit numbers will be used for writing gauges, and we use other unit number for other purposes, so there are potentially conflicts.
I don't think this will introduce any bottlenecks.