-
Notifications
You must be signed in to change notification settings - Fork 229
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
misc: Add MPI0 logging level #2130
Conversation
Codecov ReportAttention: Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## master #2130 +/- ##
==========================================
+ Coverage 79.44% 79.46% +0.01%
==========================================
Files 232 232
Lines 43607 43618 +11
Branches 8072 8073 +1
==========================================
+ Hits 34645 34660 +15
+ Misses 8207 8203 -4
Partials 755 755 ☔ View full report in Codecov by Sentry. |
devito/mpi/distributed.py
Outdated
@@ -190,6 +191,10 @@ def __init__(self, shape, dimensions, input_comm=None, topology=None): | |||
global init_by_devito | |||
init_by_devito = True | |||
|
|||
# If MPI0 logging is selected, only emit from rank 0 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think this is a safe way to do it. You also end up here quite often when devito is called from an external program that initialized MPI already and that would completely turn off the logging on all ranks including 0.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If this is something we want it needs to be done outside MPI.Is_initialized()
and on a per-rank
base using get_Rank
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I second Mathias. Here you're potentially altering the overarching application behavior.
Also you probably don't need to add a new log level... just restrict logging to rank 0 (if MPI active) inside op._emit_perf_profile or whatever is called. No?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this is addressed now
devito/mpi/distributed.py
Outdated
@@ -190,6 +191,10 @@ def __init__(self, shape, dimensions, input_comm=None, topology=None): | |||
global init_by_devito | |||
init_by_devito = True | |||
|
|||
# If MPI0 logging is selected, only emit from rank 0 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I second Mathias. Here you're potentially altering the overarching application behavior.
Also you probably don't need to add a new log level... just restrict logging to rank 0 (if MPI active) inside op._emit_perf_profile or whatever is called. No?
ef111d5
to
76cca39
Compare
devito/operator/operator.py
Outdated
|
||
# In case 'MPI0' is selected for logging, restrict result printing to one rank | ||
if configuration['mpi']: | ||
set_log_level(configuration['log-level'], comm=args.comm) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I still don't really like changing configuration like that unless it's a temporary swithconfig
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Based on this comment: #2130 (comment)
I thought that we would be happy to happen here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
no, this would change user-level setup after the first Operator run, so afterwards we would only see output from one rank
76cca39
to
e7f0d97
Compare
6a21332
to
3544c7a
Compare
devito/operator/operator.py
Outdated
|
||
# In case 'MPI0' is selected for logging, restrict result printing to one rank | ||
if configuration['mpi']: | ||
set_log_level(configuration['log-level'], comm=args.comm) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
no, this would change user-level setup after the first Operator run, so afterwards we would only see output from one rank
devito/operator/operator.py
Outdated
@@ -952,7 +957,12 @@ def _emit_apply_profiling(self, args): | |||
if a in args: | |||
perf_args[a] = args[a] | |||
break | |||
perf("Performance[mode=%s] arguments: %s" % (self._mode, perf_args)) | |||
|
|||
if configuration['mpi']: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nitpicking: the following way more easily generalisable to any number of items
items = []
items.append('mode=%s' % self._mode)
if configuration['mpi']: #Do we really want this `if` ? maybe we just spell it out regardless
items.append('mpi=%s' % configuration['mpi'])
perf("Performance[%s] arguments: %s" % (', '.join(items), perf_args))
To add: |
3544c7a
to
c571185
Compare
c571185
to
a5ef5eb
Compare
Would this PR help: #2175 |
a5ef5eb
to
51fb9f6
Compare
51fb9f6
to
97b9804
Compare
423e563
to
4543339
Compare
Is this PR fine? |
devito/operator/operator.py
Outdated
else: | ||
perf("Performance[mode=%s] arguments: %s" % (self._mode, perf_args)) | ||
|
||
# Restore logging configuration to all ranks |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why don't u use switchconfig?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Converting to draft as apprantly switchcongfig does not work |
00a4c34
to
c67c473
Compare
devito/operator/operator.py
Outdated
# In case MPI is used restrict result logging to one rank only | ||
if configuration['mpi']: | ||
# Only temporarily change configuration | ||
with switchconfig(mpi=True): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
what does mpi=True mean?
devito/operator/operator.py
Outdated
if configuration['mpi']: | ||
# Only temporarily change configuration | ||
with switchconfig(mpi=True): | ||
set_log_level('DEBUG', comm=args.comm) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Again: you're switching to DEBUG and then it remains DEBUG once control is handed back to the user
it shouldn't be DEBUG either, it should be whatever was set up to this point
and finally we shouldn't switch to mpi=True (why btw?) because it will trigger a compiler rebuild which we should (can) avoid: https://github.com/devitocodes/devito/blob/master/devito/__init__.py#L74
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh right. Makes sense. Thanks
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I suggest to add a logger context manager as per https://devitocodes.slack.com/archives/C7JMLMSG0/p1711357031925169?thread_ts=1711125928.651539&cid=C7JMLMSG0
9dcab9f
to
5ff0adf
Compare
devito/logger.py
Outdated
# Add extra logging levels (note: INFO has value=20, WARNING has value=30) | ||
DEBUG = logging.DEBUG | ||
# Add extra logging levels | ||
DEBUG = logging.DEBUG # value=10 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
implementation detail
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure I understand what to change here
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the numbers should be dropped
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am reverting as before, some detail is useful
devito/operator/operator.py
Outdated
# Output summary of performance achieved | ||
return self._emit_apply_profiling(args) | ||
# In case MPI is used restrict result logging to one rank only | ||
if configuration['mpi']: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this shouldn't be needed
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is needed only if the user accidentally asks for copies of an execution.
i.e. DEVITO_MPI=0 mpirun -n 2.
if not it will break with:
Attempting to use an MPI routine (internal_Comm_rank) before initializing or after finalizing MPICH
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I can get away with a try-except in mpi_switch_log ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
probably
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
added another condition in set_log_level
devito/operator/operator.py
Outdated
return self._emit_apply_profiling(args) | ||
# In case MPI is used restrict result logging to one rank only | ||
if configuration['mpi']: | ||
with mpi_switch_log(log_level='DEBUG', comm=args.comm): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why make the "switch log" mpi-specific?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wanted to avoid confusion between having sth like set_log_level and switch_log
I am happy if you propose a name for that
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
switch_log_level
is fine
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
devito/parameters.py
Outdated
|
||
def __init__(self, **params): | ||
self.params = {k.replace('_', '-'): v for k, v in params.items()} | ||
self.previous = {} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this is never updated, how can this even work?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it was using the parent class enter
devito/parameters.py
Outdated
""" | ||
|
||
def __init__(self, **params): | ||
self.params = {k.replace('_', '-'): v for k, v in params.items()} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
what's params for?
this function should expect, imho, two arguments: a level and a comm (the latter being optional)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, maybe I can do it simpler
@@ -77,7 +77,7 @@ def set_log_level(level, comm=None): | |||
""" | |||
from devito import configuration | |||
|
|||
if comm is not None: | |||
if comm is not None and configuration['mpi']: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why is this addition needed?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To elegantly alleviate the case of DEVITO_MPI=0 mpirun -n 2 (running copies of the problem) as discussed
devito/operator/operator.py
Outdated
# In case MPI is used restrict result logging to one rank only | ||
with switch_log_level(comm=args.comm): | ||
return self._emit_apply_profiling(args) | ||
|
||
return self._emit_apply_profiling(args) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
emitting twice ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ha! Actually not emmiting twice, but dead code never executed! dropping
devito/parameters.py
Outdated
class switch_log_level(object): | ||
""" | ||
A context manager subclassing `switchconfig` to temporarily change | ||
MPI logging. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
old docstring
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fixed
devito/parameters.py
Outdated
self.comm = comm | ||
|
||
# Limit logging to rank 0 | ||
set_log_level(self.level, comm) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
theoretically shouldn't this be inside __enter__
? though in practice it might not change much
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fixed
devito/parameters.py
Outdated
@@ -258,6 +258,29 @@ def wrapper(*args, **kwargs): | |||
return wrapper | |||
|
|||
|
|||
class switch_log_level(object): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
doesn't this belong to logger.py
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
moved, but inserted import configuration
in in __init__
to avoid circular imports
e48ff09
to
3529090
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
couple of nitpicks, but this is now GTG basically
devito/logger.py
Outdated
|
||
def __exit__(self, *args): | ||
set_log_level(self.level) | ||
logger.addHandler(stream_handler) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
shouldn't this line go inside set_log_level
in an else
branch representing the comm is None
case? this way, a set_log_level
in user code will behave as expected
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
note: the above is nitpicking
3529090
to
c2d85ed
Compare
c2d85ed
to
52cb671
Compare
52cb671
to
e459d52
Compare
|
||
Until Devito v3.5 included, domain decomposition occurs along the fastest axis. As of later versions, domain decomposition occurs along the slowest axis, for performance reasons. And yes, it is possible to control the domain decomposition in user code, but this is not neatly documented. Take a look at `test_custom_topology` in [this file](https://github.com/devitocodes/devito/blob/master/tests/test_mpi.py). In essence, `Grid` accepts the optional argument `topology`, which allows the user to pass a custom topology as an n-tuple, where `n` is the number of distributed dimensions. For example, for a two-dimensional grid, the topology `(4, 1)` will decompose the slowest axis into four partitions, one partition per MPI rank, while the fastest axis will be replicated over all MPI ranks. | ||
Until Devito v3.5 included, domain decomposition occurs along the fastest axis. As of later versions, domain decomposition occurs along the slowest axis, for performance reasons. And yes, it is possible to control the domain decomposition in user code, but this is not neatly documented. Take a look at `class CustomTopology` in [distributed.py](https://github.com/devitocodes/devito/blob/master/devito/mpi/distributed.py) and `test_custom_topology` in [this file](https://github.com/devitocodes/devito/blob/master/tests/test_mpi.py). In essence, `Grid` accepts the optional argument `topology`, which allows the user to pass a custom topology as an n-tuple, where `n` is the number of distributed dimensions. For example, for a two-dimensional grid, the topology `(4, 1)` will decompose the slowest axis into four partitions, one partition per MPI rank, while the fastest axis will be replicated over all MPI ranks. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"Up to and including Devito v3.5," would be a more natural wording
|
||
# Triggers a callback to `_set_log_level` | ||
configuration['log-level'] = level | ||
|
||
|
||
class switch_log_level(object): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
object
is no longer needed. Just class switch_log_level:
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nitpicking: class names start with upercase and don't use underscore
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For context managers lower case is probably acceptable
|
||
# Triggers a callback to `_set_log_level` | ||
configuration['log-level'] = level | ||
|
||
|
||
class switch_log_level(object): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nitpicking: class names start with upercase and don't use underscore
A context manager to temporarily change MPI logging. | ||
""" | ||
|
||
def __init__(self, comm): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
comm is confusing we use it everywhere for the MPI comm, rename to level
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it is actually an MPI comm
# Output summary of performance achieved | ||
return self._emit_apply_profiling(args) | ||
# In case MPI is used restrict result logging to one rank only | ||
with switch_log_level(comm=args.comm): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
remind me why switchconfig doesn't work here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
you have to pass the comm
This logging mode helps to reduce the output of MPI runs to only emit performance numbers once per MPI rank.
It is mostly aimed to reduce the redundant verbose performance output.
e.g. instead of:
we get: