Introduce a parallel loop to do esmpy regridding for many levels #773

Closed
wants to merge 6 commits into from

Conversation

valeriupredoi
Contributor

Before you start, please read our contribution guidelines.

Tasks

  • Create an issue to discuss what you are going to do, if you haven't done so already (and add the link at the bottom)
  • This pull request has a descriptive title that can be used in a changelog
  • Add unit tests
  • Public functions should have a numpy-style docstring so they appear properly in the API documentation. For all other functions a one line docstring is sufficient.
  • If writing a new/modified preprocessor function, please update the documentation
  • Circle/CI tests pass. Status can be seen below your pull request. If the tests are failing, click the link to find out why.
  • Codacy code quality checks pass. Status can be seen below your pull request. If there is an error, click the link to find out why. If you suspect Codacy may be wrong, please ask by commenting.
  • Please use yamllint to check that your YAML files do not contain mistakes
  • If you make backward incompatible changes to the recipe format, make a new pull request in the ESMValTool repository and add the link below

If you need help with any of the tasks above, please do not hesitate to ask by commenting in the issue or pull request.


Closes #724

This PR introduces a parallel loop that assembles the esmpy regridders in multiple processes. This is needed because with, e.g., 75 levels, building them serially means waiting forever and a half. It speeds up the regridder assembly from 400s to 1s 🍺

@valeriupredoi valeriupredoi marked this pull request as draft September 9, 2020 16:59
@valeriupredoi
Contributor Author

OK, found a bug: I was printing a list built from a generator, and that was exhausting the generator (no wonder it was so fast, hahah). Here's a dilemma: even if this is faster by a factor of 5-6 with, e.g., 8 processes, it means we can't set max_parallel_tasks to more than 1, since daemonic processes can't have children (I love this error message 😁). So we'd take a serious hit on the other tasks, which would have to stay serial. Is there any way the 3d regridding can be made faster w/o parallelization, @zklaus? 🍺
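To illustrate the generator pitfall mentioned above, here is a minimal sketch. It is only one reading of the bug, not the code from this PR: `build_regridder` and `lazy_regridders` are hypothetical stand-ins for the real per-level regridder construction.

```python
def build_regridder(level):
    """Hypothetical stand-in for an expensive per-level regridder build."""
    return f"regridder-{level}"


def lazy_regridders(levels):
    """Return a generator; nothing is built until it is iterated."""
    return (build_regridder(level) for level in levels)


regridders = lazy_regridders(range(3))

# Turning the generator into a list (e.g. to print it) consumes it.
first_pass = list(regridders)

# A second pass over the same generator finds nothing left to do,
# so any loop that runs afterwards finishes almost instantly.
second_pass = list(regridders)

print(first_pass)
print(second_pass)
```

If the results are needed more than once, materialising them a single time (`regridders = list(...)`) and reusing that list avoids the problem.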

@valeriupredoi
Contributor Author

OK so the reason why this is still a Draft is that it doesn't actually work - the parallel loop stalls and I've managed to identify where it stalls:

def build_regridder_2d(iter_pack, regrid_method, mask_threshold):
    """Build regridder for 2d regridding."""
    src_rep, dst_rep = iter_pack
    dst_field = cube_to_empty_field(dst_rep)
    src_field = cube_to_empty_field(src_rep)
    regridding_arguments = {
        'srcfield': src_field,
        'dstfield': dst_field,
        'regrid_method': regrid_method,
        'unmapped_action': ESMF.UnmappedAction.IGNORE,
        'ignore_degenerate': True,
    }

-> this block runs fine in parallel!

    if np.ma.is_masked(src_rep.data):
        src_field.data[...] = ~src_rep.data.mask.T
        src_mask = src_field.grid.get_item(ESMF.GridItem.MASK,
                                           ESMF.StaggerLoc.CENTER)
        src_mask[...] = src_rep.data.mask.T
        center_mask = dst_field.grid.get_item(ESMF.GridItem.MASK,
                                              ESMF.StaggerLoc.CENTER)
        center_mask[...] = 0
        mask_regridder = ESMF.Regrid(
            src_mask_values=MASK_REGRIDDING_MASK_VALUE[regrid_method],
            dst_mask_values=np.array([]),
            **regridding_arguments)
        regr_field = mask_regridder(src_field, dst_field)
        dst_mask = regr_field.data[...].T < mask_threshold
        center_mask[...] = dst_mask.T
    else:
        dst_mask = False

-> there is something in this block that triggers a wait/acquire call that stalls all the processes; if you comment it out (well, of course it's silly, but for prototyping purposes) it goes on and builds the field_regridder fine, only to stop at:

    def regridder(src):
        """Regrid 2d for irregular grids."""
        res = get_empty_data(dst_rep.shape, src.dtype)
        data = src.data
        if np.ma.is_masked(data):
            data = data.data
        src_field.data[...] = data.T
        regr_field = field_regridder(src_field, dst_field)
        res.data[...] = regr_field.data[...].T
        res.mask[...] = dst_mask
        return res

with reducing (pickling) issues. This is expected, since a function defined inside another function can't be pickled; if it is converted to a normal execution block rather than a nested function, everything goes through nicely in parallel. Thing is, I am not confident I can make the necessary code changes since I am not that familiar with the esmpy regridding, so @bouweandela, @zklaus and I should probably have a chat about it 🍺
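The pickling problem with a function defined inside a function can be reproduced without esmpy. The sketch below is an assumption-laden illustration, not a proposed patch: `make_nested_regridder`, `ScaleRegridder` and `can_pickle` are hypothetical names, and the "regridding" is just a multiplication. It contrasts a closure with a module-level callable class, which is one common way to keep per-regridder state while staying picklable.

```python
import pickle


def make_nested_regridder(scale):
    """Return a closure, mirroring the nested ``regridder`` pattern."""
    def regridder(value):
        return value * scale
    return regridder


class ScaleRegridder:
    """Module-level callable holding the same state as the closure."""

    def __init__(self, scale):
        self.scale = scale

    def __call__(self, value):
        return value * self.scale


def can_pickle(obj):
    """Return True if ``obj`` survives ``pickle.dumps``, else False."""
    try:
        pickle.dumps(obj)
    except (pickle.PicklingError, AttributeError, TypeError):
        return False
    return True


if __name__ == "__main__":
    # The closure cannot be pickled; the class instance can.
    print(can_pickle(make_nested_regridder(2.0)))
    print(can_pickle(ScaleRegridder(2.0)))
```

Whether the real regridder could be restructured this way depends on whether the ESMF objects it holds are themselves picklable, which this sketch does not address.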

@valeriupredoi
Contributor Author

OK, I've dug even deeper into this: 2-dim lat objects are hard to pass around to processes. E.g. lat.ndim or lat.points are fine and are accessed instantly by the child process, but lat.bounds results in deadlocks; there are no such problems with 1-dim coords. Hmmm, this is harder to parallelize than I first thought it'd be 🤦‍♂️

@valeriupredoi
Contributor Author

This is not working and is a bit of a dead end, so let's keep the discussion going in #775

@valeriupredoi valeriupredoi deleted the speedup_esmpy branch September 30, 2020 15:39
Labels
preprocessor Related to the preprocessor