
A quasi-monotone flux limiter for isopycnal diffusion (Redi) #53

Conversation

@dengwirda dengwirda commented Apr 28, 2023

This PR addresses a range of issues concerning the isopycnal diffusion operator (Redi):

  1. Updating the use of the triad slopes in one of the Redi fluxes.
  2. Correcting the implementation of the Redi fluxes.
  3. Adding a new quasi-monotone flux-limiting strategy that attempts to prevent the development of non-monotone tracer values.

Tested on cori-haswell + pm-cpu and gnu + openmp.

Formulation

Isopycnal diffusion of a tracer $\psi$ is achieved via a rotated diffusion operator

$\partial_{t} \psi = \dots + \nabla \cdot \big( \kappa \mathbf{K} \nabla \psi \big)$

where $\kappa$ is the diffusivity and $\mathbf{K}$ is a tensor built from the isopycnal slopes $S$. Integrating over layers via finite volumes gives

$\partial_{t} h \psi = \dots + \nabla \cdot (h F_{x}) + \partial_{z} F_{z}$

$F_{x} = \kappa \nabla \psi + \kappa S \partial_{z} \psi, \quad (\text{horizontal flux})$

$F_{z} = \kappa S \cdot \nabla \psi + \kappa |S|^2 \partial_{z} \psi, \quad (\text{vertical flux})$

Substituting $F_{x}$ and $F_{z}$ and expanding leads to the 'four-term' expression that is implemented in MPAS-O:

$\partial_{t} h \psi = \dots + \nabla \cdot (\kappa h \nabla \psi) + \nabla \cdot (\kappa h S \partial_{z} \psi) + \partial_{z} (\kappa S \cdot \nabla \psi) + \partial_{z} (\kappa |S|^2 \partial_{z} \psi).$

These four terms are denoted $\tau_{1,2,3,4}$, with the first three evaluated as part of the horizontal operator and the fourth as part of the implicit vertical mixing.

Evaluating $\tau_{3}$

Evaluating the cross-term fluxes $\tau_{2,3}$ is tricky, and a 'slope-triad' formulation is often used to handle the interaction between the isopycnal slopes and the (active) tracer gradients. MPAS-O currently uses the triad formulation.

This PR updates the way the triad slopes are used to evaluate the $\tau_{3}$ term, which needs to be computed at layer interfaces. The new code averages $S \cdot \nabla \psi$ (with up-down triads) using the two layers on either side of an interface, compared to the one-sided approach used previously. Centering the evaluation at the layer interface appears to improve the accuracy of the scheme, helping to maintain the shape of tracer profiles being diffused along isopycnals that are steeply sloped with respect to the model grid.
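
Schematically, the interface-centred evaluation can be read as (my notation, a sketch of the averaging rather than the exact code):

$\left( S \cdot \nabla \psi \right)_{k-1/2} \approx \tfrac{1}{2} \left[ \left( S^{\mathrm{down}} \cdot \nabla \psi \right)_{k-1} + \left( S^{\mathrm{up}} \cdot \nabla \psi \right)_{k} \right]$

with the down-triad contribution taken from the layer above the interface and the up-triad contribution from the layer below.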

Remapping $\tau_{3}$

The $\tau_{3}$ term is evaluated first on edges, and then remapped onto cells. This PR updates the remapping weights, which previously included an extra factor of 2.0. This led to serious errors that would generate spurious 'stripe' instabilities and contribute a kind of 'anti-diffusion' effect.

Flux limiter

It's well known that the cross-terms $\tau_{2,3}$ can generate non-monotone behaviour under certain conditions, so this PR also adds a cross-term flux limiter that attempts to prevent the development of non-monotone tracer values. The existing approach enforces a global lower bound (typically zero) and allows diffusion to be turned off in the upper layers of the ocean (config_Redi_min_layers_diag_terms). The new approach instead assembles a min/max bound for every cell in the ocean and deactivates the (potentially non-monotone) cross-term fluxes wherever a tracer value would otherwise violate these bounds.

The limiter bounds are taken as the min/max tracer values over the stencil of the operator --- all cells that are edge neighbours of a given cell over the upper, self, and lower layers.

The limiter may be configured using the new config_Redi_quasi_monotone_safety_factor option, which enables cross-term fluxes to be ramped toward zero proportional to the size of the monotonicity violation, with smaller safety_factors $\gamma$ ramping more aggressively.

$\psi^{*} = \psi^{n} + \frac{\Delta t}{h} (\tau_1 + \tau_2 + \tau_3) \quad \text{(predictor)}$

$\beta = \gamma \left( 1 - \frac{|\psi^{*} - \psi_{\mathrm{bnds}}|}{|\Delta \psi|} \right)$

$\tau_{2}^{*} = \beta \, \tau_{2}, \quad \tau_{3}^{*} = \beta \, \tau_{3} \quad \text{(cross-term scaling)}$

$\psi^{n+1} = \psi^{n} + \frac{\Delta t}{h} (\tau_1 + \tau_{2}^{*} + \tau_{3}^{*}) \quad \text{(corrector)}$

A hard on/off limiter may be obtained with config_Redi_quasi_monotone_safety_factor = 0.0.
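
To make the predictor/corrector concrete, here is a minimal single-column Fortran sketch of the limiter logic. It is illustrative only: the array names, the k-1..k+1 stencil, and the reading of $\psi_{\mathrm{bnds}}$ as the nearest violated bound and $\Delta \psi$ as the cross-term increment are assumptions for the sketch, not the MPAS-O implementation.

    program quasi_monotone_sketch
       ! single-column sketch of the quasi-monotone predictor/corrector
       implicit none
       integer, parameter :: n = 8
       real(8) :: psi(n), psi_new(n), tau1(n), tau2(n), tau3(n), h(n)
       real(8) :: dt, gamma, psi_star, psi_min, psi_max, viol, dpsi, beta
       integer :: k

       dt = 100.0d0
       gamma = 1.0d0     ! safety factor; 0.0 recovers the hard on/off limiter
       h = 10.0d0
       call random_number(psi)
       call random_number(tau1); call random_number(tau2); call random_number(tau3)

       do k = 1, n
          ! bounds over the operator stencil (here just the k-1, k, k+1 column)
          psi_min = minval(psi(max(k-1, 1):min(k+1, n)))
          psi_max = maxval(psi(max(k-1, 1):min(k+1, n)))

          ! predictor: provisional value with unlimited fluxes
          psi_star = psi(k) + dt / h(k) * (tau1(k) + tau2(k) + tau3(k))

          ! ramp the cross-term fluxes toward zero on a bounds violation
          if (psi_star < psi_min .or. psi_star > psi_max) then
             viol = max(psi_min - psi_star, psi_star - psi_max)
             dpsi = abs(dt / h(k) * (tau2(k) + tau3(k))) + 1.0d-16
             beta = max(0.0d0, gamma * (1.0d0 - viol / dpsi))
          else
             beta = 1.0d0
          end if

          ! corrector: tau_1 untouched, cross terms tau_2, tau_3 scaled by beta
          psi_new(k) = psi(k) + dt / h(k) * (tau1(k) + beta * (tau2(k) + tau3(k)))
       end do
       psi = psi_new
    end program quasi_monotone_sketch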

Global spin-up

Using the new approach, a 1-yr MPAS-O QU30km standalone spin-up shows effectively no departure from the initial temperature and salinity bounds, with e.g. SST remaining > -2 deg C and <= 31 deg C throughout. This run used the new config_Redi_use_quasi_monotone_limiter = .true. and did not disable any surface layers, with config_Redi_min_layers_diag_terms = 0. Previous spin-ups have shown a strong dependence on the number of surface layers disabled, with runs often developing SSTs in excess of 70 deg C, exhibiting spurious temperature hot-spots in shelf regions, and ultimately crashing due to strong T/S gradients.

While the previous method and analysis have focused on surface effects and layers, non-monotone T/S behaviour should be prevented throughout the ocean in general. Disabling surface layers may also be undesirable in regions where isopycnals outcrop.

QU30km 1-yr standalone-spinup SST:

[image: sst_qu30_redi_limiter]

[image: minmax_qu30_new_math]

Idealised tests

Behaviour in the idealised 2d test case has been significantly improved, with arbitrary, sharp tracer distributions now being diffused smoothly along steeply tilted isopycnals for both linear and nonlinear equations-of-state (EoS = jm is used here, with Redi applied to T/S + passive tracers):
[image: output_jmEOS_limiter1p0_kappa3600_newICs_CVmix_newMath]

This compares to previous behaviour, in which issues with the $\tau_{3}$ flux led to instabilities:
[image: output_jmEoS_kappa3600_newICs_CVmix_slp0p005]

Even with the fixes to $\tau_{3}$, it appears the flux limiter is still needed to prevent spurious min/max violations and instabilities. The updated code with config_Redi_use_quasi_monotone_limiter = .false. shows instabilities in T/S:
[image: output_jmEoS_kappa3600_newICs_CVmix_newMath]

@dengwirda dengwirda requested review from vanroekel and xylar April 28, 2023 04:04

xylar commented Apr 28, 2023

@dengwirda, this is fantastic! I'll test it now.

One thing is that your branch name should have ocean/ or mpas-ocean/ in the name and does not need to have dengwirda in it (since it's on your fork). That will need to be changed before it goes to E3SM. It might be good to add some mention of redi to the name, too, but I don't really care that much.

@dengwirda dengwirda marked this pull request as ready for review April 28, 2023 09:40

xylar commented Apr 28, 2023

@dengwirda, in the testing you posted above, did you use the default config_Redi_quasi_monotone_safety_factor = 0.5?


xylar commented Apr 28, 2023

Testing here in case anyone wants to follow along:

/lcrc/group/e3sm/ac.xylar/compass_1.2/chrysalis/test_20230428/ec_with_redi_mono

I'm running the usual EC30to60 dynamic adjustment, which has been showing some pretty serious nonmonotonicity even with the top 8 layers excluded from Redi diagonal terms in recent runs. I'm excited to see how it behaves with this branch.


dengwirda commented Apr 28, 2023

@xylar, yes, so far I've tried safety_factor = 0.5 and safety_factor = 0.0 (which should be the safer, but discontinuous, on/off limiter). The Redi block I've run with (for QU30km) is:

&Redi_isopycnal_mixing
    config_use_Redi = .true.
    config_Redi_closure = 'constant'
    config_Redi_constant_kappa = 300.0
    config_Redi_maximum_slope = 0.3
    config_Redi_use_slope_taper = .true.
    config_Redi_use_surface_taper = .true.
    config_Redi_N2_based_taper_enable = .true.
    config_Redi_N2_based_taper_min = 0.1
    config_Redi_N2_based_taper_limit_term1 = .true.
    config_Redi_use_quasi_monotone_limiter = .true.
    config_Redi_quasi_monotone_safety_factor = 0.5
    config_Redi_min_layers_diag_terms = 0
    config_Redi_horizontal_taper = 'ramp'
    config_Redi_horizontal_ramp_min = 3e3
    config_Redi_horizontal_ramp_max = 30e3

I also have config_submesoscale_enable = .true. but config_use_GM = .false., with fairly standard settings elsewhere. I typically haven't enabled Redi in the first damped-adjustment steps (due to the strong gravity waves and potentially unbalanced stratification), but flip it on at 30 days into the forward spin. I'm not sure if the details of this sequence are important though.

The behaviour with safety_factor > 0.0 definitely needs more testing --- this is a fairly experimental feature, an attempt to make the limiter less discontinuous. So far it has worked better than I expected, but I'll be interested to see if the results generalise to other configurations.


@xylar xylar left a comment


Great! I think this is going to work really well, and I'd love to see it go to E3SM as a stealth feature for now and then become the default once we can run some more tests.

In my dynamic adjustment run with config_Redi_quasi_monotone_safety_factor = 0.5, I see:

As desired, max of temperatureMax: 31.203329075339408 <= 33.0 in 
   damped_adjustment_1/analysis_members/globalStats.0001-01-01_00.00.00.nc
As desired, max of temperatureMax: 31.189221745144472 <= 33.0 in
   damped_adjustment_2/analysis_members/globalStats.0001-01-11_00.00.00.nc
As desired, max of temperatureMax: 31.189627181602162 <= 33.0 in
   simulation/analysis_members/globalStats.0001-01-31_00.00.00.nc

@dengwirda
Copy link
Collaborator Author

Excellent, thanks very much for looking at this so quickly @xylar.
I'm keen to push this on toward E3SM soon as well, as I'm a bit worried about what effect the non-monotone departures in the current approach may be having, but I feel a few things should be chased down first:

  • Can the safety_factor be pushed further into the [0.5, 1] range? In theory, any safety_factor <= 1 may work, though I expect monotonicity violations may appear as safety_factor approaches 1. I suspect that making the limiter as smooth as possible may be advantageous, so finding the max safety_factor that works in practice may be useful. I'll investigate further using QU30km 1-yr spin-ups, and update the namelist defaults as makes sense.

  • Is the new local min/max limiter doing too much in places where the existing global min-only approach allows for larger fluxes? In the eyeball norm I feel I can see that the new scheme is doing what it should --- adding dissipation and smoothing sharp T/S gradients --- but more detailed tests should be done re: diffusing along isopycnal structures directly, either in idealised channels and/or coupled E3SM. Do we have functionality to do this currently?

What do you think @vanroekel and all?

@vanroekel

@dengwirda This is indeed great. I was waiting to comment here for testing from @xylar. I think we should get some tests going that move us toward full integration into E3SM. If we go stealth, this requires a 10-year fully coupled test. But I'd suggest we just plan to run a full 100 years with it. However, it likely makes sense to run some simpler tests first, as you suggest.

For the second testing (diffusing on isopycnals and such) @lconlon and @mark-petersen have some redi tests that are simple diffusion of passive tracers along fixed T&S gradients that you could leverage / expand upon.

While that testing is ongoing, I can get a version of the code the coupled team would like to use for PR testing (i.e., what their current baseline config is).

@dengwirda

Based on further QU30km testing I'm finding that running with safety_factor = 1.0 seems to lead to monotone behaviour as well, very similar to that obtained with smaller safety_factors, and I've updated the default to 1.0 (to minimise the amount of limiting that's done) as a result.

With this result seeming suspiciously good, I investigated further and found:

  • The original code would apply the flux scaling twice for any edges and/or vertical levels sandwiched between two cells that needed limiting, and this would result in fluxes being scaled down more than they needed to be. I've rearranged the code to prevent this --- first accumulating the flux scaling on cells, and then interpolating to edges / levels afterward (see the sketch after this list). I've tried two kinds of 'interpolation' for the flux scaling --- taking the min value of the two neighbouring cells (the strictly monotone option), and taking the harmonic mean (a smoother but not strictly monotone option). Both options lead to similar, apparently-monotone QU30km spin-up results, and I've opted for the harmonic mean at present, which should lead to smoother numerics.

  • I've confirmed the limiter is not achieving monotone behaviour simply by turning the Redi fluxes off everywhere. I output min / max / mean flux-scaling statistics and saw that, while some cells do scale fluxes down close to zero, the mean flux scaling remains very close to 1.0, suggesting that only a minority of fluxes are scaled down overall.

  • The min/max T/S values can be extracted from the global stats to give a time history of monotonicity behaviour, which for the current approach (config_Redi_min_layers_diag_terms = 6) and the updated config_Redi_use_quasi_monotone_limiter = .true. scheme leads to (dashed lines show min/max of ICs):
    [image: minmax_qu30]
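
For reference, a sketch of the two 'interpolation' options for moving the cell-based scaling to edges (a sketch only: the names and the small-denominator guard are illustrative, not the MPAS-O variables):

    subroutine scale_to_edges(nCells, nEdges, cellsOnEdge, cellScale, &
                              useHarmonicMean, edgeScale)
       implicit none
       integer, intent(in)  :: nCells, nEdges
       integer, intent(in)  :: cellsOnEdge(2, nEdges)
       real(8), intent(in)  :: cellScale(nCells)
       logical, intent(in)  :: useHarmonicMean
       real(8), intent(out) :: edgeScale(nEdges)
       real(8) :: s1, s2
       integer :: iEdge
       do iEdge = 1, nEdges
          s1 = cellScale(cellsOnEdge(1, iEdge))
          s2 = cellScale(cellsOnEdge(2, iEdge))
          if (useHarmonicMean) then
             ! smoother numerics, but not strictly monotone
             edgeScale(iEdge) = 2.0d0 * s1 * s2 / (s1 + s2 + 1.0d-16)
          else
             ! strictly monotone option
             edgeScale(iEdge) = min(s1, s2)
          end if
       end do
    end subroutine scale_to_edges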

Thanks very much everyone for your feedback on this --- @lconlon and @mark-petersen if there are some idealised verification cases available that would be fantastic, and @vanroekel it'd be great to see what effect this has in coupled runs too.

@vanroekel

@dengwirda here is the guidance for testing for the coupled group:

  1. Clone the https://github.com/E3SM-Project/v3atm repo
  2. Check out the NGD_v3atm_v2.1ocnice_alt branch
  3. Implement your changes in this branch
  4. Run the simulation

@mark-petersen

@dengwirda when I first implemented the Redi operator, I added these two test cases:

  1. Verification tests against analytic polynomial solutions
  2. Qualitative tests: Diffusion along isopycnals

These were originally developed in what is now compass legacy, and I never moved them over. But compass legacy still works if you want to use them. I'm not sure if (1) is relevant for a flux limiter, but I think at least (2) would be helpful for simple qualitative comparisons that can be visualized in 2D. The instructions for creating those test cases are in those comments.

@dengwirda

Thanks @mark-petersen, I'll see if I can get the isopycnal diffusion test up and running and will add results here.


xylar commented May 4, 2023

Ugh! Finding (or creating) a conda environment that works with legacy compass can be quite a chore. Let me know how it goes and if I can help.

@dengwirda

Thanks @xylar, but no problems so far --- I've skipped all of the xml business by hacking the mesh-build step into the python IC builder script, and everything can then be run from the cmd-line with a fairly recent set of compass modules.


dengwirda commented May 5, 2023

@vanroekel, @xylar, @mark-petersen and all,
Working with the idealised tests I uncovered issues with one of the Redi flux terms in the underlying MPAS-O implementation, so I have fixed that and updated the PR description above. It's a bit of a read, but it summarises the various changes and where we're at re: test-case and global spin-up behaviour --- I think this may be a substantially different, and hopefully improved, Redi now.

(RediKappaScaling(k, iCell)*fluxRediZTop(iTr, k) &
* invAreaCell - RediKappaScaling(k + 1, iCell) &
*fluxRediZTop(iTr, k + 1) * invAreaCell)
flux_term3 = rediKappaCell(iCell) * ( &

@mark-petersen mark-petersen Jun 12, 2023


@dengwirda, the C-grid formula for a dot-product definitely requires a factor of 2.0 to average from edges to cells (I said factor of 1/2 on line 366 above; that should be a factor of 2). So here on term 3:

$\partial_{t} h \psi = \dots + \nabla \cdot (\kappa h \nabla \psi) + \nabla \cdot (\kappa h S \partial_{z} \psi) + \partial_{z} (\kappa S \cdot \nabla \psi) + \partial_{z} (\kappa |S|^2 \partial_{z} \psi).$

we are computing $S \cdot \nabla \psi$ on each edge of the cell and interpolating it from edge to cell center. That operation does include a factor of 2.

The general formula for the dot product at the cell center, $(a \cdot b)_c$, based on C-grid normal components $a_e$ and $b_e$ at the $N$ edges, is:
$(a \cdot b)_c = \frac{2}{N}\sum_{e=1}^{N} a_e b_e$

You can see that with two simple examples. Take two identical vectors $a=(1,0)$ and $b=(1,0)$ in x-y Cartesian coordinates, spatially constant. On a quad C-grid, $a_e b_e$ is 1 on the right and left edges and zero on the upper and lower edges. Averaging the product of the normal components over the four edges gives $(1+1+0+0)/4 = 1/2$, so you need the factor of two to get the correct value of $(a \cdot b)_c = 1$ at the center.

The same example on a hex, with $a=(1,0)$ and $b=(1,0)$: both have normal components of 1 on the right and left edges and $\pm 1/2$ on the other four edges:
[image: hex edge normal components]
Taking the product of normal components $a_e b_e$ at every edge and averaging, we have
$\frac{1}{6}\left(2 \cdot 1 + 4 \cdot \frac{1}{4}\right) = \frac{3}{6} = \frac{1}{2}$
so again we need the factor of 2 to get the correct solution of $(a \cdot b)_c = 1$ at the center.

That's not a proof, but it was enough to convince me that the factor of two is required.

Of course, it could be that there is a bug with a factor somewhere else. Next I will redo the convergence test for every term using exact polynomial solutions, and that will tell us if the code is correct.
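
A quick standalone numerical check of the $2/N$ relation for the quad and hex examples above (a sketch, not MPAS-O code):

    program cgrid_dot_check
       ! verify (a.b)_c = (2/N) * sum_e a_e*b_e for a = b = (1,0)
       implicit none
       real(8), parameter :: pi = 4.0d0 * atan(1.0d0)
       real(8) :: a(2), b(2), ne(2), s
       integer :: e
       a = (/ 1.0d0, 0.0d0 /)
       b = (/ 1.0d0, 0.0d0 /)
       s = 0.0d0
       do e = 0, 3     ! quad: edge normals at 0, 90, 180, 270 degrees
          ne = (/ cos(e * pi / 2.0d0), sin(e * pi / 2.0d0) /)
          s = s + dot_product(a, ne) * dot_product(b, ne)
       end do
       print *, 'quad: (2/4)*sum =', 2.0d0 / 4.0d0 * s   ! gives 1.0
       s = 0.0d0
       do e = 0, 5     ! hex: edge normals at 0, 60, ..., 300 degrees
          ne = (/ cos(e * pi / 3.0d0), sin(e * pi / 3.0d0) /)
          s = s + dot_product(a, ne) * dot_product(b, ne)
       end do
       print *, 'hex:  (2/6)*sum =', 2.0d0 / 6.0d0 * s   ! gives 1.0
    end program cgrid_dot_check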

@dengwirda

Thanks @mark-petersen I know what you mean, but that extra 2.0 currently seems to be associated with some strange deformation in the test case. I'll revisit this and double check.

! includes weights for edge-to-cell remap
fluxRediZTop(iTr, k) = fluxRediZTop(iTr, k) + 0.125_RKIND*dvEdge(iEdge)*( &
slopeTriadDown(k - 1, iCellSelf, iEdge)*(tracers(iTr, k - 1, cell2) - tracers(iTr, k - 1, cell1)) + &
slopeTriadUp(k, iCellSelf, iEdge)*(tracers(iTr, k, cell2) - tracers(iTr, k, cell1)))

@mark-petersen

@dengwirda I think your altered version here is correct. From Griffies 1998:
[image: Griffies 1998 triad flux expression]
We are looking at $S \cdot \nabla \psi$, which corresponds to the red and green boxes in this sum. Note that the green box has a vertical index $k+kr$. The $ip$ and $kr$ indices both run from 0 to 1 to sum over the four triad directions, with $kr$ the vertical index. This shows that the tracer should be taken from the same vertical level as the corner of the triad, which is what you have done. Thanks! I'll try both versions for my convergence tests and see if they differ.

redi_flux_scale(iTr, k, iCell) + 1.0E-16_RKIND)

tend(iTr, k, iCell) = tend(iTr, k, iCell) + &
rediKappaCell(iCell) * ( &

@mark-petersen

Note that the factor of two in term 3 was also removed here. I just wanted to mark that in this PR, as it would be easy to miss. The factors of two on lines 508 and 514 above are for the harmonic mean only and are unrelated.


xylar commented Aug 8, 2023

@mark-petersen and @dengwirda, what is the status here? It would be amazing to have this in compass so the dynamic adjustment no longer produces out-of-bounds temperatures (MPAS-Dev/compass#331).

@dengwirda

@xylar and all, currently we're running G-case sensitivity tests with these changes, and as a result the $K_{3,3}$ flux term is also under investigation...
This PR can go through on its own, but it'd be great to merge a proper Redi formulation in full, I think.
On the other hand, if you're waiting on this for the v3 mesh spin-up @xylar, we could move forward with it as is.


xylar commented Aug 8, 2023

I don't want to do E3SM v3 mesh spin-up with code that isn't on E3SM master. But if the only purpose of merging this would be for initial-condition dynamic adjustment in compass, that seems like a lot of work for a stop-gap, and it would be better to wait until the reformulation is there. That leaves us running in compass with what we have and keeping an eye on the temperature extrema in the hope that they're not too serious, because I don't think we can wait for this work to be completed before we create the meshes.

@vanroekel

@dengwirda @mark-petersen @xylar I think with the slope changes and limiter changes in #65 the Redi formulation should be ready for updating. What is necessary to start pushing this one through to master? I think it would be best to try to get this PR and the other I just opened pushed through in sequence quickly.

@mark-petersen

Testing now. This successfully compiles standalone with OPENMP=true on chrysalis with gnu and intel.


mark-petersen commented Sep 18, 2023

Using gnu debug on chrysalis on the head of this PR (1184e34), the nightly suite passes all except:

FAIL ocean_global_ocean_QU240_WOA23_RK4_decomp_test
FAIL ocean_global_ocean_QU240_WOA23_RK4_threads_test
FAIL ocean_global_ocean_QUwISC240_WOA23_performance_test

The first two simply hang when using compass run. If I use srun, I get a traceback to the screen:

srun -c 1 -N 1 -n 4 ./ocean_model

Program received signal SIGFPE: Floating-point exception - erroneous arithmetic operation.

Backtrace for this error:
#0  0x15554f15a3ff in ???
#1  0xcf9316 in __ocn_tracer_hmix_redi_MOD_ocn_tracer_hmix_redi_tend._omp_fn.4
	at /home/ac.mpetersen/repos/E3SM/fix-diffusion/components/mpas-ocean/src/shared/mpas_ocn_tracer_hmix_redi.F:540
#2  0x15554fb79811 in GOMP_parallel
	at /tmp/svcbuilder/spack-stage-gcc-9.2.0-ugetvbp5jl5kgy7jwjloyf73vnhhw7db/spack-src/libgomp/parallel.c:171
#3  0xce3df9 in __ocn_tracer_hmix_redi_MOD_ocn_tracer_hmix_redi_tend
	at /home/ac.mpetersen/repos/E3SM/fix-diffusion/components/mpas-ocean/src/shared/mpas_ocn_tracer_hmix_redi.F:490
#4  0xcd72ef in __ocn_tracer_hmix_MOD_ocn_tracer_hmix_tend
	at /home/ac.mpetersen/repos/E3SM/fix-diffusion/components/mpas-ocean/src/shared/mpas_ocn_tracer_hmix.F:157
#5  0xd3f909 in __ocn_tendency_MOD_ocn_tend_tracer
	at /home/ac.mpetersen/repos/E3SM/fix-diffusion/components/mpas-ocean/src/shared/mpas_ocn_tendency.F:723
#6  0xb9b62c in ocn_time_integrator_rk4_compute_tracer_tends
	at /home/ac.mpetersen/repos/E3SM/fix-diffusion/components/mpas-ocean/src/mode_forward/mpas_ocn_time_integration_rk4.F:1076
#7  0xb9d329 in __ocn_time_integration_rk4_MOD_ocn_time_integrator_rk4
	at /home/ac.mpetersen/repos/E3SM/fix-diffusion/components/mpas-ocean/src/mode_forward/mpas_ocn_time_integration_rk4.F:522
#8  0xb47916 in __ocn_time_integration_MOD_ocn_timestep
	at /home/ac.mpetersen/repos/E3SM/fix-diffusion/components/mpas-ocean/src/mode_forward/mpas_ocn_time_integration.F:131
#9  0xb44765 in __ocn_forward_mode_MOD_ocn_forward_mode_run
	at /home/ac.mpetersen/repos/E3SM/fix-diffusion/components/mpas-ocean/src/mode_forward/mpas_ocn_forward_mode.F:728
#10  0xb43751 in __ocn_core_MOD_ocn_core_run
	at /home/ac.mpetersen/repos/E3SM/fix-diffusion/components/mpas-ocean/src/driver/mpas_ocn_core.F:111
#11  0x40f4aa in __mpas_subdriver_MOD_mpas_run
	at /home/ac.mpetersen/repos/E3SM/fix-diffusion/components/mpas-framework/src/driver/mpas_subdriver.F:358
#12  0x40e223 in mpas
	at /home/ac.mpetersen/repos/E3SM/fix-diffusion/components/mpas-framework/src/driver/mpas.F:20
#13  0x40e28e in main
	at /home/ac.mpetersen/repos/E3SM/fix-diffusion/components/mpas-framework/src/driver/mpas.F:10
srun: error: chr-0494: task 0: Floating point exception
^Csrun: interrupt (one more within 1 sec to abort)
srun: step:392264.34 tasks 1-3: running
srun: step:392264.34 task 0: exited abnormally
^Csrun: interrupt (one more within 1 sec to abort)
srun: step:392264.34 tasks 1-3: running
srun: step:392264.34 task 0: exited abnormally
^Csrun: sending Ctrl-C to job 392264.34
srun: Job step aborted: Waiting up to 92 seconds for job step to finish.
slurmstepd: error: *** STEP 392264.34 ON chr-0494 CANCELLED AT 2023-09-18T14:32:10 ***
^Csrun: forcing job termination
The traceback is complaining about `mpas_ocn_tracer_hmix_redi.F:540`. I can look into that. I don't know why all the other tests pass but these do not.

@vanroekel
Copy link
Collaborator

Are these all related to RK4, or does the third failure use split-explicit? The error message you pasted is very confusing. It seems to suggest an issue with redi_flux_scale, but this should be bounded between 0 and 1. Perhaps if the tracer is so far out of bounds it doesn't preserve this? The do-loop block above has very similar operations but interpolates in z instead of the horizontal, so perhaps jCell sometimes reaches something out of bounds? Let me know if you want any help debugging this.

@mark-petersen

This is confusing. The third one, FAIL ocean_global_ocean_QUwISC240_WOA23_performance_test simply fails because the variable dataLandIceFreshwaterFlux is in the streams file and not in the model. I just need to rebase for that.

The first two pass with intel debug. Let me rebase on master and retest, and I will post again.

- Add a new config_Redi_use_quasi_monotone_limiter option for the
  isopycnal diffusion operator to better control non-monotone
  departure of tracer values.

- Limiter expands on existing strategy of disabling cross-fluxes,
  but uses a more comprehensive test based on min/max values over
  stencil of neighbouring cells in the adjacent layers.
- Scale cross-fluxes toward zero proportional to the degree of
  non-monotonicity. Enables softer limiter compared to default
  on/off switch. Control ramp with quasi_monotone_safety_factor.
- Ensure fluxes are only scaled once by accumulating scaling on
  cells and then interpolating scaling to edges/levels.

- Change default safety_factor to 1.0, based on QU30km spin-ups
  showing fully monotone behaviour.

- Add local variables to OMP directives.
- Update use of triad slope in Redi term-3.

- Fix edge-to-cell remapping weights for Redi term-3.

- Switch to module-level config. variables.
@mark-petersen

I rebased locally. That fixed the problem with ocean_global_ocean_QUwISC240_WOA23_performance_test. Gnu debug still fails these two:

FAIL ocean_global_ocean_QU240_WOA23_RK4_decomp_test
FAIL ocean_global_ocean_QU240_WOA23_RK4_threads_test

I'll look into that.

@mark-petersen

@vanroekel do you mind if I force push the rebased branch?

@vanroekel

This isn't my branch / PR so it is probably a better question for @dengwirda


xylar commented Sep 19, 2023

@mark-petersen, ocean_global_ocean_QUwISC240_WOA23_performance_test is failing because of stuff fixed in:
MPAS-Dev/compass#694
and
E3SM-Project#5910
and unrelated to this branch.

@dengwirda

@mark-petersen you can push a rebase here if that helps.

@vanroekel

@mark-petersen I was able to fix the floating point error you show above by adding this block just below the redi_flux_scale allocation:

      ! initialise the flux scaling to 1.0 (no limiting) everywhere,
      ! so that no entry is left uninitialised
      !$omp parallel
      !$omp do schedule(runtime) private(k,iTr)
      do iCell=1,nCells
         do k=1,nVertLevels
            do iTr = 1,nTracers
               redi_flux_scale(iTr,k,iCell) = 1.0_RKIND
            end do
         end do
      end do
      !$omp end do
      !$omp end parallel

@mark-petersen mark-petersen force-pushed the dengwirda/fix-diffusion branch from 1184e34 to b8c3c99 Compare September 21, 2023 17:10
@mark-petersen

Thanks @vanroekel that solved it. With the rebased code I just pushed up, which includes the code in @vanroekel's last comment, this now passes the full nightly suite on chrysalis with gnu debug and intel debug.


@mark-petersen mark-petersen left a comment


Based on the above discussion and testing, this is now ready to move to E3SM-Project/E3SM repo.


@vanroekel vanroekel left a comment


This is ready to move to the E3SM repo

@mark-petersen

moved to E3SM-Project#5945
