How to use Registry
This document should help developers learn how to use the Registry format within MPAS. MPAS uses an XML-based Registry format, which can define anything from dimensions and streams to namelist options and variables.
Registry provides an easy-to-modify method of auto-generating data structures and variable definitions. In addition to auto-generating data structures, it also auto-generates some subroutines which can be used to initialize or destroy these data structures.
Below is a description of the Registry.xml format, along with some of the rules associated with Registry.
As mentioned above, Registry uses an XML format. This choice was made for a variety of reasons, but mostly because it is:
- Easily extensible
- Self documenting
- Easily parsed
A Registry file has to be defined in a particular fashion. To begin, it must follow a standard XML format, with all of the subsequent definitions placed within a registry block, as follows:
<?xml version="1.0"?>
<registry model="mpas" core="core" version="ver">
... other definitions ...
</registry>
The registry block is the highest level construct that exists in a registry file. The model, core, and version attributes allow each dynamic core (atmosphere, ocean, land ice, etc.) to name its registry block. These attributes are also written to output files, so a user or developer knows which core and version of MPAS an output file was generated from.
Within the registry block, several other blocks can be defined. These are:
- dims
- nml_record
- packages
- streams
- var_struct
Each of these blocks defines a different part of MPAS, and will be described in their own sections.
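The overall layout described above can be checked with a few lines of Python. This is only a minimal sketch using the standard library XML parser, not the parser MPAS itself uses. The registry contents (core name "ocean", version "0.0", and the empty child blocks) are hypothetical values chosen for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical, minimal registry used only for illustration.
REGISTRY_XML = """<?xml version="1.0"?>
<registry model="mpas" core="ocean" version="0.0">
    <dims>
        <dim name="nCells"/>
    </dims>
    <nml_record name="rec_name" in_defaults="true"/>
    <packages/>
    <streams/>
    <var_struct name="mesh" time_levs="0"/>
</registry>
"""


def summarize_registry(xml_text):
    """Return the registry header attributes and the tags of its child blocks."""
    root = ET.fromstring(xml_text)
    if root.tag != "registry":
        raise ValueError("top-level block must be <registry>")
    header = {key: root.get(key) for key in ("model", "core", "version")}
    blocks = [child.tag for child in root]
    return header, blocks


header, blocks = summarize_registry(REGISTRY_XML)
print(header)
print(blocks)
```

Listing the child tags this way is a quick sanity check that a registry file contains only the block types named above.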
Nested under the registry block, a dims block can be defined. The dims block is used to group together all dim blocks which are used to define dimensions. Below is an example of how to define the dims block.
<registry model="mpas" core="core" version="ver">
<dims>
... other definitions ...
</dims>
</registry>
The dims block does not contain any attributes; it is just used as an overarching structure that contains all dim blocks.
Beneath the dims block a dim block can be defined. A dim block defines an individual dimension. Each dimension is contained within the var_struct named "mesh" which will be described later. Below is an example of how to define a dim block.
<dims>
<dim name="dimension_name" definition="definition_string"
units="dimension_units"
description="dimension_description"
/>
</dims>
When defining a dim block, there are four available attributes: name, definition, units, and description. Each of these attributes is described below:
The name attribute is the only required attribute for a dim block. It defines the name of the dimension in any streams that it is used in, as well as the var_struct named "mesh".
The definition attribute is optional. It allows a dimension to take its value from somewhere other than the input file. By default, if definition is not given, the dimension is read from the input stream and has the same value as in the input file.
The definition attribute can be used to define a dimension as a constant, or as being dependent on the value of another dimension. It can even be used to define a dimension based on a namelist option.
For example, if the dimension:
<dim name="nCells"/>
is defined, the following are all valid uses of the definition attribute.
<dim name="nCellsP1" definition="nCells+1"/>
<dim name="FIVE" definition="5"/>
<dim name="nCellsNML" definition="namelist:config_number_of_cells"/>
In this case, nCellsP1 has the value of nCells (read from the input file) plus 1, FIVE has the constant value of 5, and nCellsNML has whatever value config_number_of_cells has.
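The three forms of the definition attribute can be illustrated with a small resolver. This is a hedged sketch, not the logic Registry actually generates: the function name resolve_dims, the input_dims and namelist dictionaries, and the restriction to "name+N" or "name-N" expressions are all assumptions made for this example. It also assumes a dimension is declared before any dimension that depends on it.

```python
import re
import xml.etree.ElementTree as ET

# The dim blocks from the example above.
DIMS_XML = """<dims>
    <dim name="nCells"/>
    <dim name="nCellsP1" definition="nCells+1"/>
    <dim name="FIVE" definition="5"/>
    <dim name="nCellsNML" definition="namelist:config_number_of_cells"/>
</dims>"""


def resolve_dims(dims_xml, input_dims, namelist):
    """Resolve each dim to an integer (hypothetical helper, not MPAS code)."""
    resolved = {}
    for dim in ET.fromstring(dims_xml).findall("dim"):
        name = dim.get("name")
        definition = dim.get("definition")
        if definition is None:
            # No definition: the value comes from the input file.
            resolved[name] = input_dims[name]
        elif definition.startswith("namelist:"):
            # Value taken from a namelist option.
            resolved[name] = int(namelist[definition[len("namelist:"):]])
        elif definition.isdigit():
            # Constant value.
            resolved[name] = int(definition)
        else:
            # Assumed form: another dimension plus or minus an integer.
            match = re.fullmatch(r"\s*(\w+)\s*([+-])\s*(\d+)\s*", definition)
            if match is None:
                raise ValueError(f"unsupported definition: {definition!r}")
            base, op, offset = match.groups()
            delta = int(offset) if op == "+" else -int(offset)
            resolved[name] = resolved[base] + delta
    return resolved


dims = resolve_dims(DIMS_XML, {"nCells": 100}, {"config_number_of_cells": 42})
print(dims)
```

With an input file that provides nCells = 100 and a namelist that sets config_number_of_cells = 42 (both made-up values), nCellsP1 resolves to 101, FIVE to 5, and nCellsNML to 42.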
The units attribute is optional. Its main use is for documentation purposes. It allows developers to associate units with a dimension which can then be referenced later to better understand what a dimension is for.
The description attribute is optional. It is used for documentation purposes and can be an arbitrary length string describing the purpose of the dimension.
The nml_record block is used to define a record in a namelist file. A namelist record is a sensible grouping of namelist options. Below is an example of how to define a nml_record block in a registry file:
<registry model="mpas" core="core" version="ver">
<nml_record name="rec_name" in_defaults="true">
... other definitions ...
</nml_record>
</registry>
The nml_record block has two attributes, name, and in_defaults.
The name attribute is required. This defines the name of the namelist record in a namelist file. A namelist record is defined by
&nml_rec_name
... options...
/
in a namelist file.
The in_defaults attribute is optional but defaults to true if not specified. At build time, registry generates a default namelist for the core that is being built. In order to build the "minimum set" of required namelist options, registry allows a developer to flag each record and option with the in_defaults attribute. If this attribute has a value of "true" the record and all options not explicitly given an in_defaults value of "false" are written to the default namelist. If it has a value of "false" then the namelist record is not written to the default namelist.
The nml_option block is used to define an individual namelist option. The namelist option must be nested within a namelist record. Below is an example of how to define a nml_option block.
<nml_record name="rec_name" in_defaults="true">
<nml_option name="opt_name" type="opt_type" default_value="opt_val" units="opt_units"
description="opt_description"
possible_values="opt_values" in_defaults="true"
/>
</nml_record>
All namelist options are defined as variables in the mpas_configure module. If a module needs access to any namelist options, it should use mpas_configure.
The nml_option block has seven attributes, which are described below:
The name attribute is required. It defines the name of the namelist option both in the namelist file, and the mpas_configure module.
The type attribute is required. It defines the type of the namelist option in the mpas_configure module. This attribute can have any of the following values:
- integer
- real
- logical
- character
The default_value attribute is required, and defines the value written to the namelist option both when writing the default namelist, and to initialize the namelist option in mpas_configure.
The units attribute is optional and describes the units of the namelist option. It is used for documentation purposes.
The description attribute is optional and allows for a long description of the namelist option. It is used for documentation purposes.
The possible_values attribute is optional and lists the possible values the namelist option might take. Currently it is only used for documentation purposes, but in the future this attribute could be used for validation of namelist options.
The in_defaults attribute is optional, but defaults to false. It can be used to override the in_defaults value defined at the nml_record level for an individual nml_option. The use is the same as the nml_record attribute with the same name. This attribute can have the value of either "true" or "false".
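The interaction between the record-level and option-level in_defaults attributes can be sketched as a filter over the options of one record. This is an illustration only, under two stated assumptions: a record whose in_defaults is "false" contributes nothing to the default namelist, and an option inside a written record is included unless it explicitly sets in_defaults="false". The option names config_a, config_b, and config_c are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical record with three options; names and values are made up.
RECORD_XML = """<nml_record name="rec_name" in_defaults="true">
    <nml_option name="config_a" type="integer" default_value="1"/>
    <nml_option name="config_b" type="logical" default_value=".false."
                in_defaults="false"/>
    <nml_option name="config_c" type="character" default_value="file.nc"
                in_defaults="true"/>
</nml_record>"""


def options_in_default_namelist(record_xml):
    """List option names that would appear in the default namelist.

    Assumption: options are written unless explicitly flagged "false",
    and nothing is written when the record itself is flagged "false".
    """
    record = ET.fromstring(record_xml)
    if record.get("in_defaults", "true") != "true":
        return []
    return [
        opt.get("name")
        for opt in record.findall("nml_option")
        if opt.get("in_defaults") != "false"
    ]


written = options_in_default_namelist(RECORD_XML)
skipped_record = options_in_default_namelist(
    RECORD_XML.replace('name="rec_name" in_defaults="true"',
                       'name="rec_name" in_defaults="false"')
)
print(written, skipped_record)
```

Only config_b is dropped from the first call, because it is the only option that explicitly opts out.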
The packages block wraps the definition of package blocks. It does not have any attributes associated with it. Below is an example of how to define a packages block.
<registry model="mpas" core="core" version="ver">
<packages>
... other definitions ...
</packages>
</registry>
The package block defines a package. A package is an abstract grouping of variables. Later, there will be examples of how to attach variables to packages and it will be explained what this does.
Below is an example of how to define a package in registry.
<packages>
<package name="pkgName" description="pkgDescription"/>
</packages>
The package block has two possible attributes, name and description. Below are descriptions of each of these attributes.
The name attribute is required on a package block definition. It defines the name of the package. Defining a package like this creates a variable in the mpas_packages module named "pkgNameActive" (using the previous example). This variable is a logical variable with a default value of ".false.".
Each core has a subroutine named mpas_core_setup_packages. This subroutine can be used to modify the value of these "pkgNameActive" variables prior to allocation of variables.
The description attribute is optional. It is intended to be used for documentation purposes. It can contain a long description of the intended use of a package.
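The naming rule for the generated package flags can be shown with a short Python model. This is a sketch of the convention only, not of the generated Fortran: the dictionary stands in for the logical variables in mpas_packages, and the setup_packages function is a hypothetical stand-in for what a core might do in mpas_core_setup_packages.

```python
import xml.etree.ElementTree as ET

# Hypothetical package definitions.
PACKAGES_XML = """<packages>
    <package name="pkg1" description="First hypothetical package"/>
    <package name="pkg2" description="Second hypothetical package"/>
</packages>"""


def initial_package_flags(packages_xml):
    """Model the '<name>Active' logicals, all starting as False."""
    packages = ET.fromstring(packages_xml).findall("package")
    return {pkg.get("name") + "Active": False for pkg in packages}


def setup_packages(flags, enabled):
    """Stand-in for a core's setup routine: switch on selected packages."""
    updated = dict(flags)
    for name in enabled:
        updated[name + "Active"] = True
    return updated


flags = initial_package_flags(PACKAGES_XML)
flags_after_setup = setup_packages(flags, ["pkg1"])
print(flags)
print(flags_after_setup)
```

Every flag starts as false, so a package has no effect until the core explicitly enables it before variables are allocated.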
The streams block wraps the definitions of stream blocks. Stream blocks must only exist within a streams block. The streams block does not have any attributes associated with it. Below is an example of how to define a streams block.
<registry model="mpas" core="core" version="ver">
<streams>
... other definitions ...
</streams>
</registry>
The stream block defines an I/O stream. Presently only four streams can be defined, but this will be modified in the future. I/O streams are groupings of variables for input/output purposes. Below is an example of how to define a stream block.
<streams>
<stream name="streamName" type="streamType">
<var name="varName"/>
</stream>
</streams>
A stream block has two attributes and a nested block. These will be described below.
The name attribute on a stream block is required. It defines the name of the stream within MPAS. Presently, streams can only have one of four names.
- input
- output
- restart
- surface
In the future, these will be expanded to allow more versatile stream definitions.
The type attribute is required. It defines the type of the stream, which can also be thought of as the "direction" of the stream. It can take one of the following three values.
- input
- output
- restart
Input streams can only be read in. Output streams can only be written out. Restart streams are streams that can both be read and written.
A stream by default does not contain any variables. If a stream without any variables is read or written, nothing will happen. In order to attach variables to a stream, the variables must be listed within the stream block. In the previous example, the variable with a unique I/O name of varName has been attached to the stream named streamName. Additional variables can be attached to each stream in the same fashion.
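The stream rules above (four allowed names, three types, and variables attached by listing them) can be summarized in a small validation sketch. It is illustrative only: the stream contents, the variable name otherVar, and the stream_table helper are assumptions made for this example.

```python
import xml.etree.ElementTree as ET

# Hypothetical streams block; "surface" is left empty on purpose.
STREAMS_XML = """<streams>
    <stream name="output" type="output">
        <var name="varName"/>
        <var name="otherVar"/>
    </stream>
    <stream name="restart" type="restart">
        <var name="varName"/>
    </stream>
    <stream name="surface" type="input"/>
</streams>"""

ALLOWED_NAMES = {"input", "output", "restart", "surface"}
READABLE_TYPES = {"input", "restart"}    # streams that can be read in
WRITABLE_TYPES = {"output", "restart"}   # streams that can be written out


def stream_table(streams_xml):
    """Map each stream name to its type, direction, and attached variables."""
    table = {}
    for stream in ET.fromstring(streams_xml).findall("stream"):
        name, kind = stream.get("name"), stream.get("type")
        if name not in ALLOWED_NAMES:
            raise ValueError(f"unsupported stream name: {name}")
        if kind not in READABLE_TYPES | WRITABLE_TYPES:
            raise ValueError(f"unsupported stream type: {kind}")
        table[name] = {
            "type": kind,
            "vars": [var.get("name") for var in stream.findall("var")],
            "readable": kind in READABLE_TYPES,
            "writable": kind in WRITABLE_TYPES,
        }
    return table


table = stream_table(STREAMS_XML)
print(table)
```

The empty surface stream ends up with an empty variable list, which matches the statement that reading or writing such a stream does nothing.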
A var_struct block defines a new data type in MPAS. This data type is a large
grouping of variables. Each data type is defined in the mpas_grid_types module.
Above these data types, there are two larger groupings, the first being a
domain type, and the second being a block type. Within a block type, one
instance of each type defined by a var_struct block exists. For example,
domain % blocklist % mesh
is an example of a var_struct with the name mesh.
Below is an example of how to define a var_struct block.
<registry model="mpas" core="core" version="ver">
<var_struct name="structName" time_levs="N">
... other definitions ...
</var_struct>
</registry>
A var_struct also has two options for nested blocks:
- var
- var_array
The var_struct block has two attributes, name and time_levs. These attributes will be described below.
The name attribute is a required attribute that defines the name of the data type. This defines both the name of the derived data type, and of the particular instance within a block.
For example
<var_struct name="mesh" time_levs="0">
... other definitions ...
</var_struct>
defines a new data type named mesh_type, as well as adding a variable within domain % blocklist named mesh that is of type mesh_type.
The time_levs attribute is a required attribute that defines how many time levels should be set up within this new data type. It also slightly modifies how a developer can access the fields within a var_struct.
For example
<var_struct name="mesh" time_levs="0">
... other definitions ...
</var_struct>
<var_struct name="state" time_levs="2">
... other definitions ...
</var_struct>
defines two new data types. As in the name attribute section, a mesh_type is defined, as well as an additional state_type. In this case, mesh_type will have a single time level, while state_type will have two time levels. If time_levs is zero (e.g. mesh), the lowest level type can be accessed via domain % blocklist % mesh. If time_levs is non-zero (e.g. state), the lowest level type can be accessed via domain % blocklist % state % time_levs(:) % state.
In this case, the time_levs array is allocated with a size equal to the value of the time_levs attribute for that var_struct. This is true even if time_levs is set to 1.
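The effect of time_levs on the access path can be written down as a small helper that builds the Fortran expression as a string. This is purely an illustration of the rule stated above. The function name access_path is hypothetical, and the 1-based level numbering is an assumption based on default Fortran array bounds.

```python
def access_path(struct_name, time_levs, level=1):
    """Build the Fortran access expression for a var_struct instance.

    Assumption: time levels are numbered from 1, as with default
    Fortran array bounds.
    """
    if time_levs == 0:
        # No time levels: the struct hangs directly off the block.
        return f"domain % blocklist % {struct_name}"
    if not 1 <= level <= time_levs:
        raise ValueError(f"level {level} outside 1..{time_levs}")
    return (
        f"domain % blocklist % {struct_name} "
        f"% time_levs({level}) % {struct_name}"
    )


mesh_path = access_path("mesh", 0)
state_path = access_path("state", 2, level=2)
print(mesh_path)
print(state_path)
```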
A var_array block is a way of combining several variables into a single data structure. The var_array block can only be used within a var_struct block. Below is an example of how to define a var_array inside a var_struct.
<var_struct name="mesh" time_levs="0">
<var_array name="varrName" type="varrType" dimensions="dim1 dim2">
<var name="varName" array_group="varGroup" units="varUnits"
description="varDescription" packages="pkg1;pkg2"
/>
</var_array>
</var_struct>
A var block nested within a var_array block is referred to as a constituent variable. All constituent variables must have the same type and dimensions.
A var_array has three direct attributes, name, type, and dimensions. A constituent var block has five attributes, name, array_group, units, description, and packages. Below these will be described.
The name attribute is required for a var_array block. It defines the name
given within MPAS to the grouping of variables. In the above
example, a variable will be created as:
domain % blocklist % mesh % varrName
The type attribute is required for a var_array block. It defines the type of the created variable and can have one of the following values:
- real
- integer
- text
The dimensions attribute is required for a var_array block. It defines the dimensions of the constituent variables. It should be a space delimited list of dimensions defined as dim blocks. In addition to any dimensions defined in dim blocks, a Time dimension is available.
For example, <var_array name="varrName" type="varrType" dimensions="nCells Time">
will cause each of the constituents to have the dimension nCells. In terms of
array size, the Time dimension is ignored. It is only used for I/O purposes to
let MPAS know that each I/O frame should contain an independent value for this
variable.
NOTE: The dimensions attribute on a var_array block does not define the dimensions of the var_array itself. A var_array always has one additional dimension, whose size is equal to the number of constituents defined at run time.
The name attribute is required on all constituent var blocks. It defines the name of the variable within all I/O files. This name must be unique, and should not be the same as any other name given to a constituent or non-constituent var block in registry. The addition of a constituent defines a variable that is given the name index_varName. This variable lives within the lowest level var_struct type and is equal to the index in the var_array for the particular constituent named varName. If a constituent is defined to be inactive at run time, this index variable has a value of -1.
The array_group attribute is required on all constituent var blocks. It defines how constituents should be grouped within a var_array. It also creates two variables within the lowest level var_struct type. These are named array_group_start and array_group_end. They are equal to the start and end indices for the array group. If all constituents within an array_group are defined as inactive at run time, the start and end indices should both be equal to -1.
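The index and array group bookkeeping described above can be modelled in a few lines. This is a sketch under stated assumptions rather than the generated MPAS code: indices are assumed to be 1-based and assigned to active constituents in declaration order, constituents of one group are assumed to be declared next to each other, and the constituent and group names (temperature, salinity, tracer1, dynamics, testing) are hypothetical.

```python
def constituent_indices(constituents, active):
    """Compute index_<name> values and per-group (start, end) indices.

    constituents: list of (name, array_group) pairs in declaration order.
    active: set of constituent names that are active at run time.
    Inactive constituents, and groups with no active member, get -1.
    """
    indices = {}
    groups = {}
    next_index = 1  # assumed 1-based, as in Fortran
    for name, group in constituents:
        bounds = groups.setdefault(group, [-1, -1])
        if name in active:
            indices["index_" + name] = next_index
            if bounds[0] == -1:
                bounds[0] = next_index  # first active member of the group
            bounds[1] = next_index      # last active member seen so far
            next_index += 1
        else:
            indices["index_" + name] = -1
    return indices, {g: tuple(b) for g, b in groups.items()}


# Hypothetical constituents of one var_array.
CONSTITUENTS = [
    ("temperature", "dynamics"),
    ("salinity", "dynamics"),
    ("tracer1", "testing"),
]

indices, groups = constituent_indices(CONSTITUENTS, {"temperature", "salinity"})
print(indices)
print(groups)
```

With tracer1 inactive, its index and both bounds of the testing group are -1, while the dynamics group spans indices 1 to 2.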
The units attribute is optional on all constituent var blocks. It is used for documentation purposes to define the units of each constituent variable.
The description attribute is optional on all constituent var blocks. It is used for documentation purposes to provide a long description of each constituent variable.
The packages attribute is optional on all constituent var blocks. It is used to attach a constituent variable to a package defined in a package block. A constituent variable can be attached to multiple packages by providing a semicolon delimited list of package names. Package names must be identical to those defined within the package blocks.
In the above example, the package blocks for pkg1 and pkg2 should have defined two variables named pkg1Active and pkg2Active. If either of these variables has a value of ".true." at the time of allocation, the constituent variable will be active. If both of them are false, the constituent variable will be inactive.
Inactive constituent variables will not be allocated, and will not be acted on in I/O streams.
A var block defines an individual variable at the lowest level var_struct type. Below is an example of how to define a var block.
<var_struct name="mesh" time_levs="0">
<var name="varName" type="varType" dimensions="dim1 dim2"
units="varUnits" description="varDescription" packages="pkg1;pkg2"
name_in_code="codeVarName"
/>
</var_struct>
A var block can have seven attributes, name, type, dimensions, units, description, packages, and name_in_code. These will be described below.
The name attribute on a var block is a required attribute. It defines the name given to the variable in I/O streams. It should be unique, and should not be the same as any active or inactive constituent or non-constituent variables.
In the above example, the var block would produce a field in an output file named varName.
The type attribute is required on all var blocks. It defines the type of the variable both in MPAS and in input/output files. It can have one of the following values:
- real
- integer
- text
The dimensions attribute is required on all var blocks. It defines the dimensions of the variable both in MPAS and in input/output files. It should be a space delimited list of dimensions defined as dim blocks. In addition to all dimensions defined in dim blocks, the Time dimension is available as an option. The Time dimension is ignored within MPAS however, and does not represent an actual dimension in the array size. Its only purpose is to allow the variable to contain an unlimited dimension in output files.
Dimensions should be listed in Fortran order, and the array in Fortran will have its dimensions in the order specified. For example, the above example will produce a variable named varName with dimensions (dim1, dim2).
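How the dimensions attribute maps to an array shape can be sketched as follows. This is an illustration only: the helper names and the dimension sizes are made up, and the shape is reported in the same order as the attribute, matching the Fortran ordering described above.

```python
def array_shape(dimensions, dim_sizes):
    """Return the in-memory shape for a space delimited dimensions string.

    The Time dimension is skipped because it does not contribute to
    the array size.
    """
    return tuple(dim_sizes[d] for d in dimensions.split() if d != "Time")


def has_time_dimension(dimensions):
    """True when the variable should get an unlimited dimension in output."""
    return "Time" in dimensions.split()


# Hypothetical dimension sizes.
SIZES = {"nVertLevels": 10, "nCells": 100}

shape = array_shape("nVertLevels nCells Time", SIZES)
time_varying = has_time_dimension("nVertLevels nCells Time")
static = has_time_dimension("nCells")
print(shape, time_varying, static)
```

Here the variable is stored as a 10 by 100 array, and Time only marks it as having an independent value in each I/O frame.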
The units attribute is optional on all var blocks. It is only used for documentation purposes to allow developers to define the units associated with a variable.
The description attribute is optional on all var blocks. It is only used for documentation purposes to allow developers to write a longer description for each variable.
The packages attribute is optional on all var blocks. It is used to attach a variable to packages defined within package blocks. A variable can be attached to multiple packages by providing a semicolon delimited list of package names. Package names must be identical to those defined within the package blocks.
In the above example, the package blocks for pkg1 and pkg2 should have defined two variables named pkg1Active and pkg2Active. If either of these variables has a value of ".true." at the time of allocation, the variable will be active. If both of them are false, the variable will be inactive.
Inactive variables will not be allocated, and will not be acted on in I/O streams.
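The activation rule for the packages attribute amounts to a logical OR over the attached package flags. The sketch below models that rule in Python. The is_active helper is hypothetical, and treating a variable with no packages attribute as always active is an assumption of this example.

```python
def is_active(packages_attr, flags):
    """Decide whether a variable is active at allocation time.

    packages_attr: the semicolon delimited packages string, or None.
    flags: mapping of '<pkgName>Active' names to booleans.
    Assumption: a variable with no packages attribute is always active.
    """
    if not packages_attr:
        return True
    names = [name.strip() for name in packages_attr.split(";") if name.strip()]
    # Active if at least one attached package is switched on.
    return any(flags.get(name + "Active", False) for name in names)


one_on = is_active("pkg1;pkg2", {"pkg1Active": False, "pkg2Active": True})
all_off = is_active("pkg1;pkg2", {"pkg1Active": False, "pkg2Active": False})
unattached = is_active(None, {"pkg1Active": False})
print(one_on, all_off, unattached)
```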
The name_in_code attribute is optional on all var blocks. It allows the name
within MPAS to differ from the name in all input/output files. In the above
example, the variable will be read or written as varName, but within MPAS it
will be accessible as:
domain % blocklist % mesh % codeVarName
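The split between the I/O name and the in-code name can be illustrated by reading both from a var block. This is a minimal sketch: the io_and_code_names helper is hypothetical, and falling back to the name attribute when name_in_code is absent is an assumption of this example.

```python
import xml.etree.ElementTree as ET

# A var block like the one above, with placeholder type and dimensions.
VAR_XML = (
    '<var name="varName" type="real" dimensions="nCells Time" '
    'name_in_code="codeVarName"/>'
)


def io_and_code_names(var_xml):
    """Return (name used in I/O files, name used inside MPAS)."""
    var = ET.fromstring(var_xml)
    io_name = var.get("name")
    # Assumption: without name_in_code, the code uses the I/O name.
    return io_name, var.get("name_in_code", io_name)


io_name, code_name = io_and_code_names(VAR_XML)
same_io, same_code = io_and_code_names(
    '<var name="plainVar" type="real" dimensions="nCells"/>'
)
print(io_name, code_name, same_io, same_code)
```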