Striping caused by vertical interpolation routine #66

alejandrobodas · 2019-09-03T22:47:24Z

alejandrobodas
Sep 3, 2019
Maintainer

CALIPSO (and Cloudsat 3D diagnostics show striping, potentially caused caused by the vertical interpolation from the model's native grid. This problem was identified years ago (e.g Figure 2 https://agupubs.onlinelibrary.wiley.com/doi/full/10.1029/2012GL053153).

G. Cesana noticed that the issue disappears when the interpolation routine is deactivated (in both IPSL and GISS models), so it has to be related to the interpolation. However, when trying to pinpoint the issue, he used this interpolation routine in a separate program and the striping didn't occur…

The problem seems to become less relevant as the model's vertical resolution increases, as Figure 9 in https://agupubs.onlinelibrary.wiley.com/doi/full/10.1002/2017MS001115 suggests. HadGEM3-GC3 has a higher vertical resolution and it only shows a discontinuity at high altitudes.

The evidence suggests that the vertical interpolation routine is not working properly for thick layers (thicker than the target layer of 480m deep).

rodrigoguzman-lmd · 2019-09-04T13:12:00Z

rodrigoguzman-lmd
Sep 4, 2019

Indeed, there seems to be a problem with the vertical interpolation routine called « cosp_change_vertical_grid », which is defined in ./src/cosp_stats.F90 in the current COSPv2 code. As said by @alejandrobodas , this routine is used by two simulators, the lidar simulator (./src/simulator/actsim/lidar_simulator.F90) and the radar simulator (./src/simulator/quickbeam/quickbeam.F90), to output the computed vertically resolved diagnostics (reflectivity profiles for radar and SR and ATB profiles for lidar) from the model levels into a same regular vertical grid having 40 levels of 480 m-thick layers from 0 to 19.2 km altitude above sea level. The number of levels for this output vertical grid is controlled by the parameter « nlvgrid » defined in ./driver/run/cosp2_input_nl.txt.

I am currently writting the code for a new lidar aerosols simulator in COSPv2 (https://github.com/rodrigoguzman-lmd/COSPv2.0/tree/Adding_lidar_aerosols) in collaboration with Po-Lun Ma based on one of his publications (https://www.nature.com/articles/s41467-018-05028-4). The main difference between the currently existing lidar simulator cloud diagnostics and the lidar aerosols simulator diagnostics we are working on is that the vertical resolution of the outputs has to be much finer for the aerosols diagnostics than for the cloud diagnostics.

In the lidar aerosols simulator code we are currently developing, we define a similar parameter to « nlvgrid » called « nlvgrid_aerosols » which determines how many constant layers the aerosols outputs will have. So far, we would like to test a vertical output grid having 320 levels of 60 m-thick layer each. And here is where the problem appears for us. Figure 1 attached to this message shows the mean 532nm lidar Attenuated Total Backscatter (ATB) computed by this new aerosols simulator, which uses the « cosp_change_vertical_grid » vertical interpolation routine, for « nlvgrid_aerosols » having values equal to 40, 80, 160, 320 and 640 vertical layers. The input sample data comes from the DOE model and has 30 unevenly-spaced vertical levels. Figure 1 shows the interpolation is not performed correctly when the outputs vertical resolution gets finer than the original model vertical resolution.

So, as concluded by @alejandrobodas in the previous message, when interpolating from the vertical model levels to an equivalently coarse or a coarser vertical grid (which is what was usually done by COSP users so far), this bug does not clearly appear, as we can see in the 40-level vertical grid output shown in Fig. 1a. However, when interpolating the computed signal by the simulator to finer vertical grids using the « cosp_change_vertical_grid » vertical interpolation routine, then this constant value striping appears, especially where the model layers are significantly coarser than the output vertical grid (above ~2 km in our case, Fig. 1b-e).

I think the code we are developing for this new lidar aerosols simulator should allow us to easily test a new version of the vertical interpolation routine. I will be happy to perform these tests if you agree on that.

0 replies

rodrigoguzman-lmd · 2020-01-31T11:57:43Z

rodrigoguzman-lmd
Jan 31, 2020

@alejandrobodas @dustinswales @RobertPincus
I have adapted a linear interpolation code that comes from John Burkardt’s website which has a lot of interesting Fortran code, and particularly an interpolation routine which would perfectly do the job for COSPv2 :

https://people.sc.fsu.edu/~jburkardt/f_src/interp/interp.f90

For the lidar aerosols simulator outputs I talked about in the previous post, the result is what we expect (see attached figure and compare it to the one above). I’m currently adapting the lidar simulators and the CloudSat simulator code to this new interpolation routine. I will soon post results on that but first I would like to know how should I cite/acknowledge the author of the original code I am adapting when the following statement appears on each piece of code ?

This code is distributed under the GNU LGPL license

I don’t know the rules that apply for the COSPv2 license.

0 replies

alejandrobodas · 2020-02-07T09:24:37Z

alejandrobodas
Feb 7, 2020
Maintainer Author

Hi @rodrigoguzman-lmd , thanks for looking into this. The results look great, but I have concerns over the GNU LGPL license. We cannot normally accept code into the UM with a GPL licence. LGPL might be a special case, but I would like to check first. It would be good if we could understand the implications of this license for all models before we incorporate this into COSP.

0 replies

RobertPincus · 2020-02-07T12:37:04Z

RobertPincus
Feb 7, 2020
Maintainer

@rodrigoguzman-lmd I agree this looks promising, thanks so much. Would it be possible to re-implement the ideas in new code, so there isn't an issue with licensing? Or maybe that's a lot of work.

0 replies

rodrigoguzman-lmd · 2020-02-10T07:55:06Z

rodrigoguzman-lmd
Feb 10, 2020

@alejandrobodas @RobertPincus
I can re-write the code to make it fit COSPv2 needs only, but that is going to be some more work indeed. There is also another issue concerning the interpolation in the lower atmosphere levels that I was not aware of until last week (not losing any signal from all the geometrically thin layers near the surface). I'm am working on a new solution which should address all this and I'll come back to you once I will have a satisfying new piece of code and figures to discuss about it.

0 replies

rodrigoguzman-lmd · 2020-03-11T11:36:38Z

rodrigoguzman-lmd
Mar 11, 2020

@alejandrobodas @RobertPincus @dustinswales

A new interpolation routine has been developed, which is giving satisfying results for the lidar aerosols simulator (see Figure in my comment on January 31st 2020). However, when applying this new routine to the already existing vertically resolved lidar diagnostics, it appeared that some information from the lower atmosphere layers, which can be as thin as a few tens of meters, can be missed when interpolating from the native model grid to the coarser 480 m thick layers of the statistical vertical output grid. Indeed, the interpolation algorithm only considers the 2 points of the native grid which are the closest to the final point on the interpolation grid, which results in not taking into account some values from the native grid and then missing part of the original information.

In order to avoid that, the best solution found so far is the following : Keep using the original regrid routine for lower layers (where model vertical resolution is higher than output vertical grid resolution) and perform a linear interpolation for the upper layers. A specific routine has been added to find the lowest altitude where the interpolation has to start (« COSP_FIND_GRID_INDEXES » in cosp_stats.F90). This combination of former regrid routine and new interpolation routine is only needed for high variability vertical fields (as lidar ATB or radar Reflectivities). Low variability vertical fields (as lidar ATB_mol or Pressure) can be directly interpolated with the new interpolation routine.

Results for the CALIPSO simulator are shown in Figures 1 and 2.

Figure 1 shows COSPv2 offline sample data mean Cloud Fraction (CF) profiles (« clcalipso » diagnostic). Figure 1c clearly shows that the largest differences between the new and the old regrid routines occur in the higher troposphere (above 10 km), where the UKMO model vertical resolution is lower than the 480 m-thick layers of the COSPv2 output vertical grid. As expected, no difference appears in the lower layers of the atmosphere until an altitude of ~4 km. From that altitude upwards, the red signal in Figure 1b has been calculted with the new interpolation routine instead of using the former regridding routine (blue signal).

Figure 2 shows one specific gridpoint profile (number 28) from the sample data having only 1 subcolumn. Figure 2a shows the Atenuated Total Backscatter (ATB, green) and the molecular ATB (blue) at the model levels. The CALIPSO simulator regrids these fields to the 40-level output grid before computing the Scattering Ratio (SR) and all the 3D output diagnostics as the Cloud Fraction. Figure 2b shows what the SR would look like before performing the regridding (green) and the SR from the interpolated ATB and ATB_mol (red). Figure 2c shows the resulting Cloud Fraction computed with the former regrid routine (dashed green) and the one computed with the new regrid routines (brown). The only possible values of CF here are 0, 100 or Fill Value because there is only 1 subcolumn per gridpoint (corresponding to the model gridpoint column). As we can see in Figure 2b and 2c, the former regridding routine did not seem to spread correctly the signal accross the new output layers. Indeed, between 13 km and 14 km altitude, we see that the model SR would have detected a cloud at that layer of the model, but when considering the interpolated signal, we see that the 2 closest layers of the output grid are below the SR=5 threshold and hence should not declare a cloud in none of the two output grid layers. Conversely, between 9 km and 12 km altitude, the one model layer strong signal declares only 2 cloud layers in the output grid with the former regrid routine, while the new regrid routine declares 4 cloud layers in the output grid, which seems consistent with the interpolated SR (red) in Figure 2b.

Results for the CloudSat simulator are shown in Figures 3 and 4.

Figure 3 shows COSPv2 offline sample data CloudSat simulator 2D histograms of Reflectivity occurrences (CFAD, « cfadDbze94 » diagnostic) differences between the former and the new regrid routines. As for the CALIPSO simulator, Figure 3c shows that the largest differences between the two regridding routines occur above 9 km. Two main features can be noticed from this figure. On one hand, there are clearly less counts (blue color) in the new regrid routine CFAD compared to the reference CFAD. On the other hand, there is a high concentration of these missing counts in the lowest Reflectivity bin (< -45 dBZ) between 14 km and 16.5 km (dark blue).

Figure 4 shows the same specific gridpoint profile (number 28) shown in Figure 2 from the sample data having only 1 subcolumn. Figure 4a shows the Reflectivity profile at the model levels (« dbze94 » diagnostic). This same signal is reported in Figures 4b and 4c over the CFAD of that gridpoint. Similarly to the CALIPSO CF in Figure 2c, CFAD values can only be 0 or 1 because there is only 1 subcolumn per gridpoint. This allows us to have the dbze94 signal regridded on the 40-level output vertical grid of the CFAD. One reason that partly explains having less counts in the CFAD with the new regrid routine is that no interpolation can be performed beyond a « vertical limit value » (with one Fill Value just above or just below). This is highlighted with a cyan ellipse in Figure 4c where no count exists with the new regrid routine while the former regrid routine counted one Reflectivity value at that same CFAD box. Nevertheless, the former regrid routine seems sometimes to add Reflectivity values where they do not exist. This is highlighted with green ellipses in Figure 4b where, even though the counts at 12 km and 14 km could somewhat be justified, the one at 11 km should not be considered as a valid count. Indeed, there is no valid signal between 11 km and 12.5 km, conversely to the 5.5 km altitude Reflectivity value which, considering the original signal, seems to be better interpolated with the new regrid routine CFAD (Fig. 4c) than with the reference one (Fig. 4b).

Last but not least, another solution was tried to solve this problem (having an intermediate high-resolution interpolated signal) but it was adding too much computing time (+50 % ~ +100 % with respect to the reference). The solution presented here only adds a few percents of computing time with respect to the reference code. To compute this estimation, COSPv2 offline was runned 10 times for each version with all the output diagnostics on (except « rttov ») and with 20 subcolumns per gridpoint. Only the 3 fastest runs out of the 10 performed were used to calculate a mean computing time estimate. Here are the numbers (in seconds) :

Reference regrid routine mean computing time =
(0.3109520 + 0.3009540 + 0.2919550)/3 = 0.301287

New regrid routine mean computing time =
(0.3089530 + 0.3059530 + 0.3029540)/3 = 0.305954

Resulting in an approximate 1.5 % additional time to run COSPv2 offline with respect to the reference.

If you want to give a try to this new vertical regrid code, here is the link to it :

https://github.com/rodrigoguzman-lmd/COSPv2.0/tree/implementing_new_vertical_interpolation_routine

0 replies

klein21 · 2020-03-13T17:48:07Z

klein21
Mar 13, 2020

Rodrigo,

Thanks so much for all of this work. I have a few questions ....

With your new method, it seems like you are always assuming that a model will have higher vertical resolution near the surface than it does aloft. While I know that that is the typical situation, I am wondering if your decision code for which interpolation routine to use can be written more generally so as to remove that assumption - which would allow for more generality. Have you have thought about that?
Also I wonder if you have run your new code for the full model to see the impact on climate statistics. For example, run for 1 year and show the climatological distribution of cloud occurrence, and height-resolved SR/dBZ histograms. Can you show those?

Thanks,

Steve

0 replies

rodrigoguzman-lmd · 2020-04-02T19:23:21Z

rodrigoguzman-lmd
Apr 2, 2020

@klein21
Hi Steve,

I havn’t really thought about how to write more generally the new code. Indeed, it is a strong assumption to consider that models will always have a higher resolution at lower layers and will progressively have coarser atmosphere layers as we go higher up. I had the feeling that this was always true in GCMs as long as we were only considering the troposphere (output vertical grid goes till ~19 km altitude).
For me, this was the best way to keep using the former regrid routine where it was giving satisfying results (lower troposphere) and replace it by something else where the striping features appeared (mid-high troposphere). But to be honest, it’s hard for me to see how we could make my code more general without re-thinking/revisiting the whole regrid routine.
Unfortunately, I can’t perform such a run with the LMDZ model and COSPv2 for the time being. I might be able to do that in a few months, and once I’ll have those results I will be happy to share them with you. But I think it is important to be aware of the following.

The actual impact on each diagnostic caused by this bug fix is difficult to estimate because it is model dependent. Indeed, the errors induced by the previous regridding routine depended on the model vertical grid (both on the vertical resolution and its phasing/vertical point grid location with respect to the fixed 40-level output grid on which all these dignostics depend on), which can be significantly different from one atmospheric model to another, and also from one version to another of a same model. In order to have a better idea of the impact of this bug fix on a particular diagnostic for a given model, one way to proceed is, as you suggested, to perform a one-year run with the original vertical regrid routine, and the same run with the new vertical regrid routine, and to compare the two outputs.

For the lidar simulator outputs, there should not be significant changes in the 2D (maps) diagnostics because vertically integrated quantities are likely to stay the same. For the 3D (profiles) diagnostics, changes might be relatively significant at mid-high troposphere but only at particular layers because this will depend on the specific features (resolution + location) of the model vertical grid.

0 replies

klein21 · 2020-04-20T16:20:50Z

klein21
Apr 20, 2020

Hi Rodrigo,

Thanks for your answers. And sorry for my delay.

I agree that results may be model dependent.

Regarding the assumption, I think you can remove it in a fairly easy way. Here's what I would try. You essentially have 2 interpolation methods. Method A, the original method, works fine when model resolution is finer than output statistic resolution (dz_mod < dz_out). Method B works fine in the reverse condition.

I propose that for each vertical level you compute a weighted average of the interpolation from the two methods. The weight w is a function F of the relative vertical resolution between the model and output grid. For example, F could be F = dz_out / (dz_mod + dz_out). If so, the final interpolated value would AF + B(1-F). Of course, the function F would be arbitrary, just as long is it stays between 0 and 1 and smoothly goes to 1 when dz_mod << dz_out, and goes to 0 when dz_mod >> dz_out. By multiplying dz_mod by an arbitrary constant, you can change the relative vertical resolution at which the interpolation methods are equally weighted. I mention this just in case you don't want to do equal weighting for dz_mod = dz_out.

As long as the model vertical resolution doesn't wildly oscillate - which I don't think any model will do, this will provide a smooth transition between the two interpolation methods, and it will not make the assumption about where in the atmosphere fine and coarse vertical resolution occurs.

I don't think this would be that hard to implement, so I hope you would give it a try.

Steve

0 replies

rodrigoguzman-lmd · 2020-04-21T15:14:49Z

rodrigoguzman-lmd
Apr 21, 2020

@klein21
Hi Steve,

Thanks for your comment. I will give a try to the method you describe in order to have a smooth transition between the two regridding routines regardless of the atmospheric level considered. Computing time will be a bit longer with your method than what I had so far, but I think it should still be ok.

I’ll let you know once this new code is ready for some tests/reviews.

Rodrigo

0 replies

rodrigoguzman-lmd · 2020-05-12T15:10:24Z

rodrigoguzman-lmd
May 12, 2020

@klein21 , @dustinswales , @alejandrobodas , @RobertPincus

I have implemented Steve’s method described above to mix both the legacy and the new linear interpolation vertical regridding routines. You can clone and test this new commit from my « implementing_new_vertical_interpolation_routine » branch (commit appearing just above this post) in order to give me some feedback on it :

https://github.com/rodrigoguzman-lmd/COSPv2.0/tree/implementing_new_vertical_interpolation_routine

This new implementation looks ok from my side so far, but feel free to give me any comment to improve the structure of the code, algorithm, bug suspicion, etc.

I am quite confident that the results are satisfying for the lidar simulator, but since I’m less familiar with the CloudSat simulator signals, a more thorough look on that part of the code from you would be particularly welcome.

In order to give you a first glance of the changes brought by this new mixing method, here are the same four figures shown in my post from the 11th of March updated with this new code :

Finally, the extra computing time estimate. The solution presented here adds around ~20 % of computing time with respect to the reference code. To compute this estimation, COSPv2 offline was run 10 times for each version with all the output diagnostics on (except « rttov ») and with 20 subcolumns per gridpoint. Only the 3 fastest runs out of the 10 performed were used to calculate a mean computing time estimate. Here are the numbers (in seconds) :

Reference regrid routine mean computing time =
(0.2239660 + 0.2239660 + 0.2209660)/3 = 0.222966
New mixing regrid routines mean computing time =
( 0.2599610 + 0.2609610 + 0.2609600)/3 = 0.260627

Resulting in an approximate 17 % additional time to run COSPv2 offline with respect to the reference.

0 replies

klein21 · 2020-05-12T15:55:04Z

klein21
May 12, 2020

Rodrigo,

Thanks for your efforts. It looks like the new regrid routine is working fine (but I encourage others to look this over). The surprising thing was that the additional run time is now ~20% for this routine, whereas it was ~2% for the previous implementation. Is that expected?

Steve

0 replies

rodrigoguzman-lmd · 2020-05-12T17:59:22Z

rodrigoguzman-lmd
May 12, 2020

@klein21
Hi Steve,

There might be some optimizations to do in the code that would make this new version to run faster (thanks for encouraging the others to have a look into this new code in order to do that as well). But in any case, 1) we are vertically regridding twice more profiles (at the subcolumn level for most of them) than before, 2) we are performing an extra calculation to weight at each level the contribution of the two regridding routines, and 3) the computing time « delta » between the two versions of this new regridding routines (the one just before and this one) is to be multiplied by the number of times we call this new mixing regrid routine (6 times for the lidar simulators and 3 times for the CloudSat simulator if I remember well). That’s why in a previous post I said that the computing time would be a bit longer than before, but I thought (and I hope) that we can still reduce this extra computing time to consider it acceptable by the community, knowing that it brings a more general and robust solution to the problem.

Maybe a good way to evaluate all this would be to perform (as we already discussed before) a one-year run over the globe to have both, 1) the global figures we would like to have to make sure everything looks ok at that scale, and 2) a better estimate of the extra computing time for such a run compared to the reference (legacy regrid routine only, showing the striping features), which should give us a more accurate estimate of the final impact on the computing resources needed for future model intercomparison exercises with COSPv2.

0 replies

klein21 · 2020-05-12T19:00:11Z

klein21
May 12, 2020

Hi Rodrigo,

Thanks for the explanation. Yes, I agree that the next step is to perform a 1-year simulation with the full model. This will allow one to confirm that it is working as an intended, and will provide an estimate of the computational time in the setting where the code needs to run.

Steve

0 replies

alejandrobodas · 2020-05-13T11:05:59Z

alejandrobodas
May 13, 2020
Maintainer Author

@rodrigoguzman-lmd @klein21
Thanks for the discussion. I have gone through all the figures again and I have some comments about the proposed approach.
The old routine produces discontinuities in the 1st derivative because it retains the concept of 'layer', each point in the vertical grid has an associated depth, assuming that we are dealing with discontinuous quantities like cloud condensate. Having this in mind, my interpretation of the figures above is different from Rodrigo's. The new routine artificially adds clouds in the surroundings of a model cloud layer when the model layer produces a strong signal: this happens between 10 and 12km in profile 28. On the other hand, it can remove model cloud when the signal from the model layer is just above the detection threshold: this happens around 13.5km height in profile 28.
When the model resolution is higher than the target grid (lower Troposphere), then the interpolation approach has other issues: the result in the target grid will be based on two points whose layer depths may be much smaller than the depth of the target layer, and therefore the result may be not representative of the target layer.
With the proposed approach, I think we need to be clear that we are interpreting model information in different ways in different parts of the atmosphere, and not fixing a bug. The old approach is consistent across the entire atmosphere in interpreting model grid points as volumes, it does a weighted average using all the overlapping volumes in the model and target grids.
I am not opposed to changing the approach, but I think we need to carefully consider the pros and cons.

0 replies

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Striping caused by vertical interpolation routine #66

{{title}}

Replies: 15 comments

{{title}}

{{title}}

{{title}}

{{title}}

{{title}}

{{title}}

{{title}}

{{title}}

{{title}}

{{title}}

{{title}}

{{title}}

{{title}}

{{title}}

{{title}}

Select a reply

Striping caused by vertical interpolation routine #66

alejandrobodas Sep 3, 2019 Maintainer

Replies: 15 comments

alejandrobodas Feb 7, 2020 Maintainer Author

RobertPincus Feb 7, 2020 Maintainer

alejandrobodas May 13, 2020 Maintainer Author

alejandrobodas
Sep 3, 2019
Maintainer

alejandrobodas
Feb 7, 2020
Maintainer Author

RobertPincus
Feb 7, 2020
Maintainer

alejandrobodas
May 13, 2020
Maintainer Author