
Add Laplacian on irregular Cartesian grid #23

Merged

Conversation

@NoraLoose (Member)

This PR adds a Laplacian on an irregular Cartesian grid: IrregularCartesianLaplacianWithLandMask.

@iangrooms, @jakesteinberg: FYI, this is exactly the finite-volume Laplacian2D (more specifically, the corrected version Laplacian2D_FV) that we started with, but it

  • works for a general grid with varying dx and dy (i.e., it does not assume constant dy and zonally constant dx, as our NW2 implementation does), and
  • is written much more cleanly, and now works with dask and on GPUs.

The test that I added for this Laplacian still runs on a regular grid (with dx = dy = 1). I will follow up with a better test on an irregular grid (maybe a very simple lat-lon grid). This will also involve changing test_conservation in test_kernels.py to an area-weighted integral conservation test; a rough sketch of what I have in mind is below.
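The sketch (function and argument names are illustrative, not the actual test code):

import numpy as np

def check_area_weighted_conservation(field, filtered, area, rtol=1e-12):
    # The filter should preserve the area-weighted integral of the field.
    before = np.sum(field * area)
    after = np.sum(filtered * area)
    np.testing.assert_allclose(after, before, rtol=rtol)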

@NoraLoose (Member Author) commented Feb 11, 2021

My idea is that a lot of the gcm kernels can inherit from IrregularCartesianLaplacianWithLandMask.

As an example, the POP kernel (on a B grid) can use the IrregularCartesianLaplacianWithLandMask with the following choices:

  • U fields (on corner points): dxw = HTN, dyw = HUW, dxs = HUS, dys = HTE, area = UAREA.
  • Tracer fields: dxw = np.roll(HUS, -1, axis=1), dyw = np.roll(HTE, -1, axis=1), dxs = np.roll(HTN, -1, axis=0), dys = np.roll(HUW, -1, axis=0), area = TAREA,

where the order of the dimensions is (nlat, nlon); a code sketch of the tracer-field case follows.
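This is a sketch with placeholder metrics; in practice HTN, HUW, HUS, HTE, and TAREA come from the POP grid file, and the dictionary merely illustrates how the pieces line up:

import numpy as np

# Placeholder POP grid metrics with dimension order (nlat, nlon)
nlat, nlon = 384, 320
HTN, HUW, HUS, HTE = (np.ones((nlat, nlon)) for _ in range(4))
TAREA = np.ones((nlat, nlon))

# Tracer-point grid variables for IrregularCartesianLaplacianWithLandMask:
# the U-cell edge lengths are shifted onto the tracer cell with np.roll.
tracer_grid_vars = {
    'dxw': np.roll(HUS, -1, axis=1),   # axis=1 is nlon
    'dyw': np.roll(HTE, -1, axis=1),
    'dxs': np.roll(HTN, -1, axis=0),   # axis=0 is nlat
    'dys': np.roll(HUW, -1, axis=0),
    'area': TAREA,
}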

Similarly, in my tests I fed MOM6 grid variables (dxw, dyw, dxs, dys, area) to IrregularCartesianLaplacianWithLandMask, so I am linking this issue, too. Something similar could be done for MOM5.

What remains to be done in all these cases is proper handling of the boundary conditions (e.g., tripolar).

@rabernat (Contributor)

In order for the new tutorial to show up in the docs, you need to link to it from the index.rst page.

@NoraLoose (Member Author)

@arthurBarthe, @ElizabethYankovsky - I updated the tutorial notebook to illustrate how to filter with a fixed filter length scale vs. a fixed filter factor. I hope this helps! Let me know if anything is unclear.

@codecov-io commented Feb 12, 2021

Codecov Report

Merging #23 (8e01b52) into master (8b92035) will increase coverage by 0.67%.
The diff coverage is 100.00%.


@@            Coverage Diff             @@
##           master      #23      +/-   ##
==========================================
+ Coverage   95.55%   96.22%   +0.67%     
==========================================
  Files           7        7              
  Lines         225      265      +40     
==========================================
+ Hits          215      255      +40     
  Misses         10       10              
Flag | Coverage Δ
unittests | 96.22% <100.00%> (+0.67%) ⬆️

Flags with carried forward coverage won't be shown.

Impacted Files | Coverage Δ
gcm_filters/kernels.py | 100.00% <100.00%> (ø)
tests/test_filter.py | 100.00% <100.00%> (ø)
tests/test_kernels.py | 100.00% <100.00%> (ø)


Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data

@rabernat (Contributor)

I have some comments about your tutorial:

# run and end_time are defined earlier in the notebook
st = xr.open_dataset('/glade/p/univ/unyu0004/neerajab/NeverWorld2/run%i/static.nc' % (run),
                     decode_times=False)
chunks = {'time': 20}
av = xr.open_dataset('/glade/p/univ/unyu0004/neerajab/NeverWorld2/run%i/averages_000%i.nc' % (run, end_time - 500 + 2),
                     chunks=chunks, decode_times=False)

For the tutorials, it would be really nice to use public data, such as data we can pull from an OPeNDAP server or from cloud storage. Of course I understand why you are doing it on Casper. But perhaps we could also point to the OSN data that Neeraja produced?

dxw = xr.DataArray(data=st.dxCu.isel(xq=slice(0, Nx)), coords={'yh': av.yh, 'xh': av.xh}, dims=('yh', 'xh'))
dyw = xr.DataArray(data=st.dyCu.isel(xq=slice(0, Nx)), coords={'yh': av.yh, 'xh': av.xh}, dims=('yh', 'xh'))
dxs = xr.DataArray(data=st.dxCv.isel(yq=slice(0, Ny)), coords={'yh': av.yh, 'xh': av.xh}, dims=('yh', 'xh'))
dys = xr.DataArray(data=st.dyCv.isel(yq=slice(0, Ny)), coords={'yh': av.yh, 'xh': av.xh}, dims=('yh', 'xh'))

It's somewhat poor xarray style to create a new DataArray out of an old one like this. I understand why you are doing it (you need all the grid_vars to have the same size and dimensions), but I would prefer to use rename on the dimensions to accomplish this. Something like the sketch below is what I have in mind.
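A rough, untested sketch, reusing st, av, Nx, and Ny from your snippet; assign_coords supplies the tracer-point coordinate values that rename alone does not:

# Rename the staggered dimension to the tracer-point name,
# then overwrite the coordinate with the tracer-point values.
dxw = st.dxCu.isel(xq=slice(0, Nx)).rename({'xq': 'xh'}).assign_coords(xh=av.xh)
dyw = st.dyCu.isel(xq=slice(0, Nx)).rename({'xq': 'xh'}).assign_coords(xh=av.xh)
dxs = st.dxCv.isel(yq=slice(0, Ny)).rename({'yq': 'yh'}).assign_coords(yh=av.yh)
dys = st.dyCv.isel(yq=slice(0, Ny)).rename({'yq': 'yh'}).assign_coords(yh=av.yh)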

This actually highlights a big difference between what we are doing in this package and the xgcm approach. With xgcm, we explicitly track the positions of the variables relative to the grid cell. Here, we are sweeping that under the rug, labeling everything to have the same dimensions, and letting the different kernels accept staggered variables. This could lead to confusion and errors.

An alternative would be to add xgcm as a dependency and use its machinery to handle staggered grids.

@NoraLoose (Member Author)

Thanks for the comments!

For the tutorials, it would be really nice to use public data, such as data we can pull from an OPeNDAP server or from cloud storage.

I agree. I mostly made this notebook to get other CPT members started (e.g., outlining how to do fixed-factor vs. fixed-scale filtering). But if we want to keep this tutorial beyond the development phase, I will look into the OSN data.

It's somewhat poor xarray style to create a new DataArray out of an old one like this. I understand why you are doing it (you need all the grid_vars to have the same size and dimensions), but I would prefer to use rename on the dimensions to accomplish this.

Before choosing this "nuclear option", I actually started out using rename for the dimensions. But I ran into issues because, in addition to renaming, one also has to pass the values for the new coordinates. Xarray probably has an option to do this for you, which I will look into. Thanks!

Here, we are sweeping that under the rug, labeling everything to have the same dimensions, and letting the different kernels accept staggered variables. This could lead to confusion and errors. An alternative would be to add xgcm as a dependency and use its machinery to handle staggered grids.

I agree. Not only do we let the kernels accept staggered variables; sometimes you also have to find work-arounds to "trick" xarray into believing you are not on a staggered grid (see my previous comment). We should make this as easy as possible for the user.

Is your idea to use xgcm only at the user level, e.g., in a fashion that I outlined in #24 (comment)? Under the hood, we would still have to bring all variables to the same dimensions to take full advantage of the generalized ufuncs + dask, correct?
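To spell out the constraint I mean, here is a toy sketch (the kernel below is a stand-in, not our actual Laplacian): xr.apply_ufunc matches inputs by dimension name, so the field and the grid variables need identical core dimensions.

import numpy as np
import xarray as xr

def toy_kernel(field, area):
    # Stand-in for a Laplacian kernel: receives plain numpy/dask arrays.
    return field * area

field = xr.DataArray(np.ones((4, 5)), dims=('yh', 'xh'))
area = xr.DataArray(np.ones((4, 5)), dims=('yh', 'xh'))  # must share dims with field

filtered = xr.apply_ufunc(
    toy_kernel, field, area,
    input_core_dims=[['yh', 'xh'], ['yh', 'xh']],
    output_core_dims=[['yh', 'xh']],
    dask='parallelized',
    output_dtypes=[field.dtype],
)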

@rabernat (Contributor) commented Feb 15, 2021

Is your idea to use xgcm only at the user level, e.g., in a fashion that I outlined in #24 (comment)? Under the hood, we would still have to bring all variables to the same dimensions to take full advantage of the generalized ufuncs + dask, correct?

Yes, I think we are thinking the same thing. What we would use from xgcm is its ability to "understand" grids. Xgcm knows whether a variable is cell-centered, left-shifted, right-shifted, etc. It also understands metrics like cell area, distances, etc.

We could use that high-level part to check and align the inputs, while keeping the low level of gcm-filters the same. And yes, we would ultimately have to rename the dimensions to align them before passing to apply_ufunc. However, we can do this more carefully than the nuclear option. 💣 😉

Before choosing this "nuclear option", I actually started out using rename for the dimensions. But I ran into issues because, in addition to renaming, one also has to pass the values for the new coordinates.

I think we faced this in xgcm once too: see xgcm/xgcm#153 for discussion.

@rabernat (Contributor) commented Feb 17, 2021

I'd like to suggest a path forward so we can keep making progress.

Short Term

Let's require, for now, that field and all grid_vars share the same dimensions (e.g. nlat, nlon). This will require the user to manually do what you are doing in some cases: rename/relabel the dimensions and coordinates of the grid vars. We recognize that this opens the door to potential user errors, such as misunderstanding what sort of inputs the Laplacians require (e.g. cell-centered, vorticity point, etc.). There is useful discussion on this in #24. For now, we solve this problem with careful documentation.

Long Term

We refactor gcm-filters to leverage xgcm to "understand" the grid more deeply. Each Laplacian would have to define the grid positions of all its input variables, and the grid_vars could be inferred from the xgcm metrics (rough sketch below). @jbusecke, does this sound doable?
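To make this concrete, here is a rough sketch of how xgcm encodes that information. The toy dataset, coordinate names, and metric names below are made up for illustration, not a worked-out design:

import numpy as np
import xarray as xr
from xgcm import Grid

# Toy MOM6-flavored dataset; names and values are illustrative only.
ds = xr.Dataset(
    coords={'xh': np.arange(5) + 0.5, 'xq': np.arange(5),
            'yh': np.arange(4) + 0.5, 'yq': np.arange(4)},
)
ds['dxCu'] = xr.DataArray(np.ones((4, 5)), dims=('yh', 'xq'))
ds['dyCv'] = xr.DataArray(np.ones((4, 5)), dims=('yq', 'xh'))
ds['area_t'] = xr.DataArray(np.ones((4, 5)), dims=('yh', 'xh'))

# The Grid object knows each variable's axis position and its metrics.
grid = Grid(
    ds,
    coords={'X': {'center': 'xh', 'left': 'xq'},
            'Y': {'center': 'yh', 'left': 'yq'}},
    metrics={('X',): ['dxCu'], ('Y',): ['dyCv'], ('X', 'Y'): ['area_t']},
    periodic=False,
)
# Each Laplacian could then declare the grid positions of its inputs, and
# the grid_vars could be looked up via grid.get_metric(field, ('X', 'Y')).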

If folks agree with this plan (leave a 👍 or 👎 ), I think we can move forward and merge this PR.

@NoraLoose (Member Author)

You left some comments regarding the tutorial notebook that I submitted as part of this PR. If the plan is to add the notebook to the documentation (as suggested in this PR), I would like to put a bit more effort into the notebook, including fixing the things that you mentioned in your comments. I'm also okay with removing this notebook from the index.rst page until it is ready.

@rabernat (Contributor)

There were two main issues I identified in the tutorial:

  • Can we use public data? Yes, but we don't need to resolve this now.
  • Can we avoid that somewhat awkward way of relabeling the input DataArrays? This is basically unavoidable if we go with my "short term" suggestion above, so I'm not sure there is much to do at this point.

I'm also okay with removing this notebook from the index.rst page until it is ready.

At this point, we are the only "users" of this package, so we don't have to worry too much about what shows up on the documentation site.

So to summarize: if you want to continue tweaking the tutorial, go ahead! But IMO it's already good enough to merge.

@NoraLoose (Member Author)

Cool. Let's merge it, and I will get back to fine-tuning the documentation once we have resolved some more fundamental issues in gcm-filters.

@rabernat rabernat merged commit ed1a007 into ocean-eddy-cpt:master Feb 17, 2021