Add your own datasets to the C-ESM-EP: simulations or references
If your datasets cannot be accessed via the existing CliMAF projects (such as IGCM_OUT, CMIP5...), you can easily add your own project to access your data.
A CliMAF project is:
- a definition of path/filename pattern(s)
- including a set of standard and optional keywords
- and definitions of aliases and default values
You can find an example of how to declare your own project in standard_comparison/datasets_setup.py:
# -- Declare a 'CMIP5_bis' CliMAF project (a replicate of the CMIP5 project)
# ---------------------------------------------------------------------------- >
cproject('CMIP5_bis', ('frequency','monthly'), 'model', 'realm', 'table', 'experiment', ensemble=['model','simulation'], separator='%')
# --> systematic arguments = simulation, frequency, variable
# -- Set the aliases for the frequency
cfreqs('CMIP5_bis', {'monthly':'mon'})
# -- Set default values
cdef('simulation' , 'r1i1p1' , project='CMIP5_bis')
cdef('experiment' , 'historical' , project='CMIP5_bis')
cdef('table' , '*' , project='CMIP5_bis')
cdef('realm' , '*' , project='CMIP5_bis')
# -- Define the pattern
pattern="/prodigfs/project/CMIP5/output/*/${model}/${experiment}/${frequency}/${realm}/${table}/${simulation}/latest/${variable}/${variable}_${table}_${model}_${experiment}_${simulation}_YYYYMM-YYYYMM.nc"
# --> Note that the YYYYMM-YYYYMM string means that the period is described in the filename and that CliMAF can
# --> perform period selection among the files it found in the directory (can be YYYY, YYYYMM, YYYYMMDD).
# --> You can use an argument like ${years} instead if you just want to do string matching (no smart period selection)
# -- call the dataloc CliMAF function
dataloc(project='CMIP5_bis', organization='generic', url=pattern)
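To make the role of the `${...}` keywords more concrete, here is a minimal sketch in plain Python (it does not use CliMAF) of how the defaults, the frequency alias and the pattern from the declaration above could combine into a search path. The `expand` helper and the `defaults` and `freq_aliases` dictionaries are hypothetical names introduced for this illustration, and treating the `('frequency','monthly')` tuple as a default value is an assumption. CliMAF performs its own substitution internally, and it may differ in the details.

```python
from string import Template

# Same pattern as in the declaration above
pattern = (
    "/prodigfs/project/CMIP5/output/*/${model}/${experiment}/${frequency}/"
    "${realm}/${table}/${simulation}/latest/${variable}/"
    "${variable}_${table}_${model}_${experiment}_${simulation}_YYYYMM-YYYYMM.nc"
)

# Defaults mirroring the cdef() calls; 'frequency' is assumed to default to
# 'monthly' because of the ('frequency','monthly') tuple given to cproject()
defaults = {
    "simulation": "r1i1p1",
    "experiment": "historical",
    "table": "*",
    "realm": "*",
    "frequency": "monthly",
}

# Alias mirroring cfreqs('CMIP5_bis', {'monthly':'mon'})
freq_aliases = {"monthly": "mon"}


def expand(pattern, **facets):
    """Hypothetical helper: fill the ${keyword} placeholders of a pattern."""
    values = dict(defaults, **facets)
    # Translate the frequency through the alias table when an alias exists
    values["frequency"] = freq_aliases.get(values["frequency"], values["frequency"])
    return Template(pattern).substitute(values)


path = expand(pattern, model="CNRM-CM5", variable="tas")
print(path)
```

The YYYYMM-YYYYMM part is deliberately left untouched here: it is not a keyword but the marker that tells CliMAF where the period sits in the filename.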
A good way to make sure that your project definition is correct is to test it from a notebook.
Here is an example of how to do it for two different projects.
The idea is to test your project definition from a notebook to take advantage of its interactivity. Once the definition works, copy the code that declares your project and paste it either into your datasets_setup.py file (if you want it to be available for all the atlases/components of the C-ESM-EP) or into the parameter file you work with (if, for instance, you work mainly with one atlas/component like ORCHIDEE or AtlasExplorer).
The period that appears in the filenames matters if you want to use, for instance, clim_period='last_10Y' and/or ts_period='full': at the moment, the time_manager mechanism determines the period covered by the simulation from the period found in the filename. If your files follow the same syntax as the IGCM_OUT filenames (e.g. CM61-LR-pd-01_1M_200001_21991231_histmth.nc), the time manager will automatically use this syntax to get the period, provided that your project name contains the string 'IGCM'. Similarly, if the period covered by your files follows the CMIP DRS (with the dates at the end of the filename, like tas_Amon_CNRM-CM5_historical_r1i1p1_198001-200512.nc or tas_Amon_IPSL-CM6-LR_historical_r1i1p1f1_gr_198001-200512.nc), the time manager will automatically use this syntax to get the period, provided that your project name contains the string 'MIP'.
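The filename-based period detection described above can be sketched as follows. This is only an illustration, not the actual time_manager code: the `period_from_filename` helper, the two regular expressions and the order in which the project name is tested are all assumptions made for this example.

```python
import re

# Assumed layouts, derived from the example filenames in the text:
#   IGCM_OUT-like: ..._<start>_<end>_<suffix>.nc (dates separated by '_')
#   CMIP DRS-like: ..._<start>-<end>.nc          (dates at the end, separated by '-')
IGCM_PERIOD = re.compile(r"_(\d{6,8})_(\d{6,8})_")
MIP_PERIOD = re.compile(r"_(\d{4,8})-(\d{4,8})\.nc$")


def period_from_filename(project, filename):
    """Hypothetical helper: return (start, end) as strings, or None.

    The filename syntax is chosen from the project name, mimicking the rule
    described above ('IGCM' or 'MIP' must appear in the project name).
    """
    if "IGCM" in project:
        match = IGCM_PERIOD.search(filename)
    elif "MIP" in project:
        match = MIP_PERIOD.search(filename)
    else:
        # Unknown naming convention: no period can be deduced
        match = None
    return match.groups() if match else None


igcm = period_from_filename("IGCM_OUT", "CM61-LR-pd-01_1M_200001_21991231_histmth.nc")
cmip = period_from_filename("CMIP5_bis", "tas_Amon_CNRM-CM5_historical_r1i1p1_198001-200512.nc")
other = period_from_filename("MyProject", "tas_Amon_CNRM-CM5_historical_r1i1p1_198001-200512.nc")
print(igcm, cmip, other)
```

In this sketch a project named 'MyProject' yields no period even though its files follow the CMIP DRS, which is why including 'IGCM' or 'MIP' in the project name matters when you rely on clim_period='last_10Y' or ts_period='full'.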