Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Have the FATES parameter file created at runtime in the buildnml #2126

Open
2 of 10 tasks
ekluzek opened this issue Aug 30, 2023 · 21 comments
Open
2 of 10 tasks

Have the FATES parameter file created at runtime in the buildnml #2126

ekluzek opened this issue Aug 30, 2023 · 21 comments
Assignees
Labels
enhancement new capability or improved behavior of existing capability next this should get some attention in the next week or two. Normally each Thursday SE meeting.

Comments

@ekluzek
Copy link
Collaborator

ekluzek commented Aug 30, 2023

This was an idea talked about here:

NGEET/fates#1073

We think a better way to handle FATES parameter files is to have the buildnml use the FATES tools to create them from the CDL files in FATES. A shorter way to do this would be for that to happen in a testmod in the shell_commands file. But, longer term we probably want it as part of the configure process for a CESM case. This also gets people in the habit of using the tools to make FATES parameter files themselves which sets them up to configure their files for their situation from the get go. That's a more FATES like way to run, and sets people up for doing things like PPE work, or site specific changes.

Definition of done: FATES parameter files are created from the CDL file stored in FATES

NOTE: The longer term solution that is desired is to store the default parameter file in FATES as XML. With this python tools that modify the file will just have standard python packages and NOT also require an extended Python environment using conda and activating ctsm_pylib as we require now.

Steps to get there:

@ekluzek ekluzek added enhancement new capability or improved behavior of existing capability type: -discussion next this should get some attention in the next week or two. Normally each Thursday SE meeting. labels Aug 30, 2023
@ekluzek
Copy link
Collaborator Author

ekluzek commented Aug 30, 2023

There's several considerations for this that we need to think about:

  • Will this take enough time that we should add a switch for it so it's not always done (note one way to do that is to skip the namelist generation step, which CESM has several ways of doing already)?
  • Should there be a way to do this for multi-instance cases?
  • Should there be a way to put a CDL file in your case directory to be used in place of the default one in FATES?
  • Should there be any other control variables (namelist or XML) to control this process?
  • Should we still store FATES parameter files in svn inputdata -- or always rely on this process?
  • If we do store FATES parameter files in svn inputdata how often do we do that?
  • Are there any robustness concerns with this process, or python package or versioning issue concerns?

@adrifoster @glemieux @rgknox @billsacks

@ekluzek
Copy link
Collaborator Author

ekluzek commented Aug 30, 2023

I think the main script for this is here:

https://github.com/NGEET/fates/blob/main/tools/UpdateParamAPI.py

Any scripts used would be in that directory.

Looking at dependencies, I see that scipy and numpy are being used, so that's pretty basic, but beyond the current barebones requirements that cime currently has.

@rgknox
Copy link
Collaborator

rgknox commented Aug 30, 2023

@ekluzek and all:

When I'm running FATES and making perturbations to parameters in the FATES parameter file, there are two things I typically do, depending on the scenario.

Scenario 1: I'm running a simulation where I've already calibrated a set of parameters, and this set of parameters is not encapsulated in the default file. This set of parameters could be associated with the default PFT set, or a subset, or a completely new set of PFTs.

  1. run BatchPatchParams.py , which will interpret an xml file that describes this set of parameter file modifications to the default. That xml file is kind of like a patch or diff file for a base parameterization.
  2. copy the new binary to my case directory and point to it in user_nl_clm

Scenario 2: I want to do a "one-off" perturbation to a parameter, or multiple parameters. The basis of my parameter set may be the default, or a different file perhaps generated from Scenario 1:

  1. if not done already, copy the binary form of the fates parameter file (default or other) to my case directory and point to it in user_nl_clm
  2. run the python module: modify_fates_paramfile.py for each specific parameter I want modified. This will act on the binary form of the parameter file, and runs in an over-write fashion

I could picture both of these methods being used in a testing workflow, or normal build and run workflow. The combination of them would certainly give us better testing coverage of FATES. The BatchPatchParams.py could potentially over-complicate a testing workflow, but might come in handy. The most direct method to modifying fates parameter for testing, might be calling modify_fates_paramfile.py in some type of shell or existing execution script.

Either way, both could be worked into a system where we build the file at run-time.

@ekluzek
Copy link
Collaborator Author

ekluzek commented Aug 30, 2023

@rgknox that's very helpful. Yes, we should incorporate both scenarios into our testing workflow and make sure they are options that can be done from buildnml. As I think about this I think there should be some XML variables that control this operation. I'd like to talk about this more at tomorrow's software meeting.

I have one more idea I'm going to put in another comment...

@ekluzek
Copy link
Collaborator Author

ekluzek commented Aug 30, 2023

Since, there are python module dependencies that are beyond what cime normally requires (which right now is only the python version) we should add some error checking for those modules and python version dependencies.

This should also eventually go into the tool in FATES, but to start with we should have a graceful fail in the CTSM side. Here's the issue to put it into FATES...

NGEET/fates#1076

As a name for an XML field I'm thinking of something like:

CLM_FATES_PARAMFILE_BUILD = DEFAULT,NCGEN,BATCH_PATCH

The BATCH_PATH option will also have to assume a name for the input file it'll use we suggest: user_nl_FATES_param_changes.xml below.

@ekluzek
Copy link
Collaborator Author

ekluzek commented Sep 1, 2023

We had some discussion on this today. I modified the above proposal a bit based on that. It does seem clear that transitioning to always building the parameter file would be the best way to separate the HLM's from needing to update whenever new parameters are added to the FATES parameter file. But, I think we should do this as a transition, so start allowing it, and then transition to one of the parameter file creation options is the only choice.

I tried to see if there might be a way to have the HLM still work with an updated version of FATES that has new parameters on the FATES parameter file. But, that doesn't seem workable. So working towards always creating the FATES parameter file at buildnml time is a better long term solution.

I also realized that an option to use straight up ncgen should be one of the options. The nice thing about that is that's part of NetCDF so will be loaded in the modules to run the model. So it's not an new requirement.

@billsacks billsacks removed the next this should get some attention in the next week or two. Normally each Thursday SE meeting. label Sep 7, 2023
@slevis-lmwg
Copy link
Contributor

slevis-lmwg commented Sep 8, 2023

Copied from google meet chat:

  • user_nl_FATES_param_changes.xml
  • maybe we should add a Unit Tests section to the Design Doc

@glemieux
Copy link
Collaborator

glemieux commented Sep 8, 2023

From today's meeting on parameter files we discussed how a future PR could utilize this infrastructure to provide a testmod for the fates seed dispersal code, since it needs reasonable values based on the resolution type. As requested, I've attached a potential xml file (adding a txt suffix since github doesn't like xml).

seed-dispersal-f45.xml.txt

Seed dispersal PR for future reference: NGEET/fates#1005 and #2077

@ekluzek
Copy link
Collaborator Author

ekluzek commented Sep 11, 2023

I looked up in cime, and we need to have a "user_nl_" prefix for it to be handled correctly. So we should have the name...

user_nl_FATES_param_changes.xml

@ekluzek
Copy link
Collaborator Author

ekluzek commented Sep 11, 2023

Some details that came out in our discussion. The ncgen part needs to always be done. And then the batch_path or modify script can be run on top. The nice thing about ncgen is that its guaranteed to be loaded as NetCDF is loaded in the cime case environment. The modify script only does one thing at a time, so typically you have to run it many multiple of times to get the changes needed. Since, the batch_patch option just bundles those changes into one xml file, it seems best to just enable the batch_patch option as the general tool to take a user XML file and apply a list of desired changes to the default file.

@ekluzek
Copy link
Collaborator Author

ekluzek commented Sep 11, 2023

In our discussion we also thought this should be broken up into small steps. Here's my thinking of that. Feel free to suggest changes.

  1. Do Add some testmods for FATES to test some different FATES parameter file options #2151 to do this in the simplest way possible
  2. fates parameter file auto-build for all tests #2336 which makes parameter files on the fly for FATES tests
  3. Implement just the new XML variable, and the DEFAULT and NCGEN option for it (DEFAULT is the default option until the step further down)
  4. Add the BATCH_PATCH option, that runs NCGEN and then batch_patch on top of it. This will require checking the python environment.
  5. Convert the tests to use the BATCH_PATCH option
  6. Make the NCGEN option the default, but still allow the DEFAULT option
  7. Remove the DEFAULT option, once we feel the community is OK with that and it's robust
  8. Have E3SM and NorESM evaluate using this methodology as well
  9. Implement it into E3SM
  10. Implement it into NorESM#2151

Separate from this HLM work, will be work inside of FATES to improve the FATES parameter file manipulation tools. None of the above is dependent on that, but we may have to modify the interface here in response to changes there.

I think the two hardest steps are 1 and 3 and especially 3 which is the real proof of concept test for this.

@adrifoster @slevis-lmwg @glemieux @samsrabin any thoughts on this schedule above? Missing steps? Or things that could be broken up? Some of the steps could be done together depending on who does them, but it's nice to break it up into as small of steps as possible.

@adrifoster
Copy link
Collaborator

I looked up in cime, and we need to have a "user_nl_" prefix for it to be handled correctly. So we should have the name...

user_nl_FATES_param_changes.xml

I guess I am confused. The shell_commands and includes files work in CIME. Is there a reason why these are different? I am a bit reluctant to call something a user_nl that isn't a namelist (that is what nl stands for?).

@ekluzek
Copy link
Collaborator Author

ekluzek commented Sep 11, 2023

Yes, valid point. At the first step everything is contained in shell_commands and includes, so that all works until we get to step 3 above. At that point for clones to work, the filename either needs to be an assumed name that cime will automatically copy over (which right now is user_nl_*) -- or it needs to be explicitly named with another XML variable. But, if it's named with an XML variable, cime won't automatically copy it for create_clone, unless we figure out how to explicitly add that in somehow.

You are right that "nl" stands for namelist, but I really think of it as user lists of control file changes. The user_nl_* files mostly deal with namelists, but they also operate on config files (for driver), XML files (for CDEPS), and mizuRoute control files. So for datm you have user_nl_datm_streams and mizuRoute user_nl_mizuroute_control. This is perhaps why the "user_nl_" should become something more general, but because that would require a lot of coordinated changes we've left it alone.

Does that help? Or should we do more discussion?

@ekluzek
Copy link
Collaborator Author

ekluzek commented Sep 13, 2023

The changes in steps 2 and 3 above could be implemented in either the buildnml or the build-namelist scripts. Because we want to deprecate and replace the build-namelist, as in #585 we should implement this at that point in the buildnml. This could be an incremental step in moving towards having the build-namelist functionality fully inside the buildnml script.

@ekluzek
Copy link
Collaborator Author

ekluzek commented Dec 5, 2024

Work I did in ctsm5.3.014 helps with this. I'm taking the above task list and turning it into checkboxes that I'll put at the top.

ctsm5.3.014 enables the ctsm_pylib environment if it can, or adds graceful error checking for some FATES tests, and this can be extended for more. This would get us further down the road into this at least for testing. It could also be moved to user-mods for FATES so would be active in more compsets.

We've had some discussion on this in CSEG, and there is concern about the requirement of having a conda environment needed for a CESM case. So we'll have a subgroup discuss this once we have a document that outlines ideas and options.

Here's a link to the discussion there:

https://docs.google.com/document/d/186U6-dt_wWZZGU9NzYQ5zNlMnpx9XX6oweuTXzQY-oo/edit?tab=t.0#heading=h.tydat4isnshc

I just added some action items there and additional notes.

I've got a start at that document that we'll present to CSEG about different options here. This is only an overview right now.

https://docs.google.com/document/d/1XUjOGcBVbW8dY_0F2kmK8QJaB2nJn1th_474lpQPS_Q/edit?tab=t.0#heading=h.9ifrjra9jr1h

@ekluzek ekluzek added the next this should get some attention in the next week or two. Normally each Thursday SE meeting. label Dec 5, 2024
@ekluzek
Copy link
Collaborator Author

ekluzek commented Dec 5, 2024

Note an alternative that @glemieux talked about was changing the FATES modify parameter file tool, to NOT use anything beyond base python modules required by CIME. This is actually the desired long term solution for FATES for the parameter file to be stored in XML.

@mvdebolskiy for NorESMF is also planning on having the python module environment needed for this as part of the machine setup for the machines they support.

@glemieux
Copy link
Collaborator

glemieux commented Dec 6, 2024

I realized we didn't have an explicit issue or thread on the FATES-side so I'm referencing this here: NGEET/fates#1296

@mvdebolskiy
Copy link
Contributor

@ekluzek my setup won't work, since for the tests ./shell_commands are executed before case.setup, so the test does not know about modules on the machine, only about the current user environment. I can adjust the shell commands in the tests to check for Sigma2 machines and load a python bundle with scipy. But that is not the best solution for the long run since it will stop working if there is an OS update and modules are no longer accessible.

@ekluzek
Copy link
Collaborator Author

ekluzek commented Dec 6, 2024

@mvdebolskiy thanks for confirming the order for shell_commands. That's good to know for sure.

That might be a hard requirement for CIME to do it that way because shell_commands can also be in user-mods and create_newcase needs to execute them and it doesn't normally do case.setup as part of what it does. We could maybe modifty the behavior for create_test, but we wouldn't want it to do something that different than how create_newcase functions.

Another way to do this would be to require the user to rerun shell_commands after case.setup is done. So it would die with an error and then tell the user to run case.setup and then shell_commands again. How does that sound?

@ekluzek
Copy link
Collaborator Author

ekluzek commented Dec 6, 2024

I'd like to see this for usability in the CESM3 release, so marked it that way. But, I also added next on this to discuss as it's not that strictly required.

@mvdebolskiy
Copy link
Contributor

@ekluzek It's not a huge problem. I can just add machine specific things to the ctsmpy_env creation thing and the script that checks if it is activated. The environment is not huge right now, so it won't be that bad for most users on sigma2 machines.

Reruning shell commands does not sound good to me since it's adding confusion.
The easiest way, I think is the one above. If modules get outdated -- no big deal. I think different conda can pick up an old environment if package and env files are still in the user-space. I'll test this.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement new capability or improved behavior of existing capability next this should get some attention in the next week or two. Normally each Thursday SE meeting.
Projects
Status: No status
Development

No branches or pull requests

7 participants