Improvements to single-variable time series generation #86

Open
10 tasks
TeaganKing opened this issue Mar 18, 2024 · 2 comments

TeaganKing commented Mar 18, 2024

There are a few outstanding comments from #78 that are being moved to a new issue so that we can try to get the basic timeseries functionality merged in. Please see that PR for more details/context on these comments!

  • From @wwieder, "I wonder if a more generic approach [than using lev] may be to copy all the coordinate dimensions from a history file onto the single variable time series?"
  • Ensure code is generic enough for different history file types (e.g. patch or landunit level output from CLM, as opposed to grid cell averages)
  • vars_to_derive might be updated when "Sympy derived quantities from string equations" (ADF#278) comes in. From @nusbaume, "we could potentially use the sympy package to allow users to write their own equations for derived variables without having to write new python code for each individual derived quantity."
  • From @TeaganKing, "Each component will need to adjust the variables they want to generate (unless processing all vars), the relevant history string, and the height dimension they are using (lev is currently the default) in config.yml."
  • Process_all should be updated; I think there's some issue going on with vars not being available in all files; this might have to do with needing to change hist_string.
  • We also want to include an alternative to num_procs so that we don't surprise (especially non-NCAR) users with a somewhat hidden request for a particular number of processors
  • In order to manage which timeseries scripts we run (for which components), we can use the compute_scripts feature. This will also allow us to specify that timeseries should be run first. For details, see comments from @rmshkv in "API updates to specify components" (#88) and "Add single-variable time series file generation function from ADF" (#78). @mnlevy1981 and I had also discussed including a --timeseries-only flag in order to run timeseries without running the notebooks, and renaming --timeseries to --timeseries-first to make clear that this flag runs timeseries AND diagnostics notebooks. However, I think that using the compute_scripts ploomber capabilities would be ideal in this case.
  • Include **timeseries_params and use something like if type(arg1) == str: arg1 = [arg1] in timeseries.py instead of run.py in order to implement kwargs.
  • hist_str list (#90)
  • From @wwieder: Can we include additional data for some components? For example, having the information about area and landfrac makes calculating global or regional sums easier.
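
A minimal sketch of the **timeseries_params item above, assuming a hypothetical entry point named create_time_series. The argument names (case_names, hist_str, height_dim, vars_to_derive) are illustrative and may not match the real signature in timeseries.py. It uses isinstance rather than type(arg1) == str so that str subclasses are wrapped too:

```python
def as_list(value):
    """Return a list: wrap a lone string, copy any other iterable."""
    if isinstance(value, str):
        return [value]
    return list(value)


def create_time_series(case_names, hist_str, **timeseries_params):
    """Hypothetical stand-in for the function in timeseries.py."""
    # Normalize inside the function so run.py can pass config values as-is.
    case_names = as_list(case_names)
    hist_str = as_list(hist_str)

    # Optional settings arrive as keyword arguments; "lev" is the default
    # height dimension mentioned in this issue.
    height_dim = timeseries_params.get("height_dim", "lev")
    vars_to_derive = as_list(timeseries_params.get("vars_to_derive", []))

    # A real implementation would generate files here; this sketch only
    # returns the normalized arguments.
    return {
        "case_names": case_names,
        "hist_str": hist_str,
        "height_dim": height_dim,
        "vars_to_derive": vars_to_derive,
    }
```

With something like this, run.py would not need to know which config values are strings and which are lists; it could forward them unchanged.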
@TeaganKing

We'd also like to use the case name from the global parameters rather than specifying a case within the timeseries parameters to avoid duplication.
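
One way this could look, as a sketch only: the function name resolve_case_name and the "case_name" key are assumptions, not names taken from the existing config. The global value wins, and a timeseries-level value is accepted only as a fallback so older configs keep working.

```python
def resolve_case_name(global_params, timeseries_params):
    """Use the global case name; fall back to a timeseries-level one if present."""
    case_name = global_params.get("case_name")
    if case_name is None:
        # Fallback for configs that still set the case under timeseries.
        case_name = timeseries_params.get("case_name")
    if case_name is None:
        raise KeyError("case_name is not set in the global parameters")
    return case_name
```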

mnlevy1981 self-assigned this Oct 1, 2024

dabail10 commented Dec 6, 2024

Also, by default the timeseries tool puts the post-processed files in $case/$comp/proc/tseries. This path should also include the output frequency, for example month_1.
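
A small sketch of how the path could be built. The helper name tseries_dir is hypothetical, and placing the frequency as a trailing subdirectory is an assumption about the intended layout:

```python
from pathlib import PurePosixPath


def tseries_dir(case_root, comp, freq):
    """Build $case/$comp/proc/tseries/$freq as a POSIX-style path."""
    return PurePosixPath(case_root) / comp / "proc" / "tseries" / freq
```

For example, tseries_dir("/scratch/mycase", "atm", "month_1") gives /scratch/mycase/atm/proc/tseries/month_1.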
