-
Notifications
You must be signed in to change notification settings - Fork 8
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add regridding section #9
Add regridding section #9
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Some food for thought.
I may have more to add later, but I want to get these suggestions out
(and I want a link to reference in another issue!)
```python | ||
# Consider a bonus task to measure the relative time taken. | ||
# import time... etc | ||
``` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The sensible way is probably to use the notebook "magic" for this %timeit
**Step 4:** Plot the results and the difference. | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe worth making clear here that we can (+expect to) use standard Iris/matplotlib tools, not VTK
|
||
**Step 4:** Create a regridder from `mesh_cube` to the single celled cube you created. | ||
|
||
*Note:* ESMF represents all lines as sections of great circles rather than lines of constant latitude. It also cannot represent sections of great circles with angle equal or greater than 180 degrees. `MeshToGridESMFRegridder` would fail to properly handle the single celled cube without the `resolution` keyword. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this section is effectively addressing two different points at once.
At least on first reading, the "cannot represent..." part seems irrelevant to the main point (though now I've read more, I see it is really not !).
I think, effectively, the latitude approximation and unrepresentable-cells problems are separate, even though the resolution
key addresses both.
So perhaps the first example could have a smaller longitude range, to avoid the "unrepresentable" problem + only address the latitude-bounds question ?
Personally I also don't much like "would fail to properly handle .. without ..." bit.
- what would happen without the
resolution
keyword - what difference does it make.
So, I think we should maybe state this in a more positive way, describing what we do, how + why.
E.G. something like
We provide a
resolution
keyword to divide each target cell into so many multiple sub-cells : this approximates a single cell whose upper+lower boundaries follow lines of latitude.
I guess you could then also say (something like)..
With no
resolution
key, the upper+lower boundaries of the (single) target cell are great circle lines, which gives a substantially different result (and not what was intended).
And also something like ..
In addition, ESMF cannot handle any cells covering >=180 degrees, e.g. longitudes -180..+180 : such cases will give errors : These cases can be fixed by specifying
resolution=2
.
It would also be great to link to a built-docs description here,
which would be here, but which is not currently up-to-date.
The source docstring tells it here.
Note the difference in value. Also note that it takes more time to initialise a regridder with higher resolution. | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It may be too complicated, but it occurs to me that you might suggest the user tries several values for 'resolution' : A graph of time-required vs. residual error should demonstrate the issues quite nicely.
It definitely is worth simply stating that "the time/accuracy is a tradeoff, which the user needs to decide."
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd still like it to clearly explain the existence of a time/accuracy tradeoff somewhere.
**Step 7:** Repeat steps 1 - 6 for latitude bounds `[[-90, 90]]`, longitude bounds `[[-40, 40]]` and resolutions 2 and 10. | ||
|
||
*Note:* Unlike lines of constant latitude, lines of constant longitude are already great circle arcs. Since these arcs are 180 degrees it is necessary to have a resolution argument. However, an increase in resolution will not affect the accuracy since a resolution of 2 will already have maximum accuracy. Note how the results are the equal. | ||
|
||
```python |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't fully understand this.
It says "Since these arcs are 180 degrees it is necessary to have a resolution argument." But this now applys only to the longitudinal edges, so I guess it is saying we require a resolution>1 to resolve full-range cells in either lat or lon, which are unrepresentable in ESMF ??
But, that seems also to imply that resolution is dividing all (4) edges of target cells.
This is not clear, since the docstring suggests it is specific to latitude, as in
"represents the amount of latitude slices".
I guess this is not the place to explain it all, but we clearly need to be very careful not to say anything misleading or not-quite-true. Which may well apply to my suggested texts above.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So internally, I believe the logic is that vertical resolution is only used in the specific case where the bounds are [-90, 90] in which case it is effectively set to 2 and in all other cases horizontal resolution is the only one used. I suppose the exercise here kind of suggests otherwise, but I think this might be a bit too deep into the internals to get into here.
plt.show() | ||
``` | ||
|
||
We can then plot the difference between the UM data and the data regridded from LFRic. Since our data is now on a latlon grid we can do this with matplotlib as normal. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it may be more important to say that "Since our data is now on a latlon grid" we can now subtract to get the difference between this and the UM equivalent data.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have made further some comments + suggestions, but I think we could get on and merge this, and address those issues later.
Quick list -
- align sample code with prompt
- clearly state cost/accuracy tradeoff
- small fixes to sample code here
- ... and to same here
- review lengthy calc
- ?? convert all to C48 data
- ??convert all to use (mostly) user-provided code in examples
```python | ||
lat_band_mean_10 = lat_band_mean_calculator_10(mesh_cube) | ||
print(lat_band_mean_10.data) | ||
iqplt.pcolormesh(lat_band_mean_10) | ||
plt.gca().coastlines() | ||
plt.show() | ||
``` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This isn't doing what the prompt asks for (plotting instead of printing).
Note the difference in value. Also note that it takes more time to initialise a regridder with higher resolution. | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd still like it to clearly explain the existence of a time/accuracy tradeoff somewhere.
**Step 1:** Load a temperature cube using the `testdata_fetching` function `lfric_temp`. Extract a single pressure slice using the `cube.extract` method with a constraint `iris.Constraint(pressure=850)` as the argument (we choose this level because it has noticable details). | ||
|
||
|
||
from testdata_fetching import fric_temp |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this code needs to go in a code cell.
In the current version, it is rendered in markdown, but we need the actual results below.
**Step 1:** Load a temperature cube using the `testdata_fetching` function `lfric_temp`. Extract a single pressure slice using the `cube.extract` method with a constraint `iris.Constraint(pressure=850)` as the argument (we choose this level because it has noticable details). | ||
|
||
|
||
from testdata_fetching import fric_temp |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Misprint here too
from testdata_fetching import
(missing - l
)fric_temp
**Step 3:** Create a `MeshToGridESMFRegridder` regridder from the slice of the temperature cube onto the target cube. Set the resolution keyword to 2 (this should be sufficient since these are bands of constant longitude). Use this regridder to create a resulting cube. | ||
|
||
```python | ||
um_lon_band_mean_calculator = MeshToGridESMFRegridder(temp_slice, target_cube_lons, resolution=2) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For me, this step is currently taking an extremely long time ~1m40, , and about half the VDI memory (~2Gb).
So I think that needs reviewing for practicality.
NB with latest code we can "switch" to C48-sized data, which is over 10x smaller (221k -> 14k points).
This may well be worth doing (we should probably make it standard).
But it will affect the current results + so may require some adjustment to be made (e.g. might then be too fast to demonstrate efficiency changes !)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So it looks like this problem is specific to longitude bands and performance is a lot better with latitude bands. I think this might have something to do with ESMF not liking long, thin cells. I've raised an issue here SciTools/iris-esmf-regrid#234 . Unfortunately, when taking the latitude band means there is a lot less visible detail in the resulting plot. I might try experimenting with other variables, but I think this is a reasonable trade off compared to minute long run times or C48 data.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since merging this, I've also tested it with the newer C48 data.
Unfortunately, that is possibly too small, since the operations get so quick as to make the measurements less certain.
This op took about 1.6 seconds, instead of ~100.
Outstanding suggestions here : #11 |
No description provided.