How to use Registry

This document should help developers learn how to make use of the Registry format used within MPAS. MPAS makes use of an XML-based Registry format, which can be used to define dimensions, streams, namelist options, and variables.

Registry provides an easy-to-modify method of auto-generating data structures and variable definitions. In addition to auto-generating data structures, it also auto-generates some subroutines which can be used to initialize or destroy these data structures.

Below is a description of the Registry.xml format, along with some of the rules associated with Registry.

XML Format

As mentioned above, Registry uses an XML format. This choice was made for a variety of reasons, but mostly because it is:

Easily extensible
Self documenting
Easily parsed

A Registry file has to be defined in a particular fashion. To begin, it must follow a standard XML format, with all of the subsequent definitions defined within a registry block, as follows:

<?xml version="1.0"?>  
<registry model="mpas" core="core" version="ver">  
	... other definitions ...  
</registry>

The registry block is the highest level construct that exists in a registry file. The model, core, and version attributes allow each dynamic core (atmosphere, ocean, land ice, etc.) to name its registry block. These attributes are also written to output files, so a user/developer knows which core and version of MPAS an output file was generated from.

Within the registry block, several other blocks can be defined. These are:

dims
nml_record
packages
streams
var_struct

Each of these blocks defines a different part of MPAS, and will be described in their own sections.
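
As a rough sketch of how these blocks fit together, a registry file might be laid out as follows. The core name, version string, and every name inside the blocks are hypothetical placeholders, and the nested blocks are described in the sections below. The ordering simply follows the list above and is not meant to imply a required order.

```xml
<?xml version="1.0"?>
<registry model="mpas" core="example_core" version="0.0">
	<!-- Dimensions available to variables -->
	<dims>
		<dim name="nCells"/>
	</dims>
	<!-- One record of namelist options -->
	<nml_record name="example_record" in_defaults="true">
		<nml_option name="config_example_option" type="integer" default_value="1" in_defaults="true"/>
	</nml_record>
	<!-- Packages that variables can be attached to -->
	<packages>
		<package name="examplePkg" description="Hypothetical package."/>
	</packages>
	<!-- I/O streams and the variables they contain -->
	<streams>
		<stream name="output" type="output">
			<var name="exampleVar"/>
		</stream>
	</streams>
	<!-- A data type holding variables -->
	<var_struct name="mesh" time_levs="0">
		<var name="exampleVar" type="real" dimensions="nCells"/>
	</var_struct>
</registry>
```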

dims block

Nested under the registry block, a dims block can be defined. The dims block is used to group together all dim blocks which are used to define dimensions. Below is an example of how to define the dims block.

<registry model="mpas" core="core" version="ver">  
	<dims>
		... other definitions ...
	</dims>
</registry>

The dims block does not contain any attributes, and is just used as an overarching structure that contains all dim blocks.

dim block

Beneath the dims block a dim block can be defined. A dim block defines an individual dimension. Each dimension is contained within the var_struct named "mesh" which will be described later. Below is an example of how to define a dim block.

<dims>
	<dim name="dimension_name" definition="definition_string"
		 units="dimension_units"
		 description="dimension_description"
	/>
</dims>

When defining a dim block, there are 4 available attributes. These are name, definition, units, and description. Each of these attributes is described below:

name attribute

The name attribute is the only required attribute for a dim block. It defines the name of the dimension in any streams that it is used in, as well as the var_struct named "mesh".

definition attribute

The definition attribute is optional. It allows a dimension to take its value from somewhere other than the input file. By default, if definition is not specified, a dimension is read from the input stream and is defined as having the same value as it has in the input file.

The definition attribute can be used to define a dimension as a constant, or as being dependent on the value of another dimension. It can even be used to define a dimension based on a namelist option.

For example, if the dimension:

<dim name="nCells"/>

is defined, the following are all valid uses of the definition attribute.

<dim name="nCellsP1" definition="nCells+1"/>
<dim name="FIVE" definition="5"/>
<dim name="nCellsNML" definition="namelist:config_number_of_cells"/>

In this case, nCellsP1 has the value of nCells (read from the input file) plus 1, FIVE has the constant value of 5, and nCellsNML has whatever value config_number_of_cells has.

units attribute

The units attribute is optional. Its main use is for documentation purposes. It allows developers to associate units with a dimension which can then be referenced later to better understand what a dimension is for.

description attribute

The description attribute is optional. It is used for documentation purposes and can be an arbitrary length string describing the purpose of the dimension.
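
For illustration, a dims block that uses all four attributes might look like the following. The dimensions nVertLevels and nVertLevelsP1, along with the units and description strings, are assumptions made for this sketch and are not taken from an actual core.

```xml
<dims>
	<dim name="nCells" units="unitless"
		 description="Hypothetical description: number of cells, read from the input file."
	/>
	<dim name="nVertLevels" units="unitless"
		 description="Hypothetical description: number of vertical levels, read from the input file."
	/>
	<dim name="nVertLevelsP1" definition="nVertLevels+1" units="unitless"
		 description="Hypothetical description: number of vertical levels plus one."
	/>
</dims>
```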

nml_record block

The nml_record block is used to define a record in a namelist file. A namelist record is a sensible grouping of namelist options. Below is an example of how to define a nml_record block in a registry file:

<registry model="mpas" core="core" version="ver">  
	<nml_record name="rec_name" in_defaults="true">
		... other definitions ...
	</nml_record>
</registry>

The nml_record block has two attributes, name, and in_defaults.

name attribute

The name attribute is required. This defines the name of the namelist record in a namelist file. A namelist record is defined by

&nml_rec_name
	... options...
/

in a namelist file.

in_defaults attribute

The in_defaults attribute is optional but defaults to true if not specified. At build time, registry generates a default namelist for the core that is being built. In order to build the "minimum set" of required namelist options, registry allows a developer to flag each record and option with the in_defaults attribute. If this attribute has a value of "true" the record and all options not explicitly given an in_defaults value of "false" are written to the default namelist. If it has a value of "false" then the namelist record is not written to the default namelist.

nml_option block

The nml_option block is used to define an individual namelist option. The namelist option must be nested within a namelist record. Below is an example of how to define a nml_option block.

<nml_record name="rec_name" in_defaults="true">
	<nml_option name="opt_name" type="opt_type" default_value="opt_val" units="opt_units"
				description="opt_description"
				possible_values="opt_values" in_defaults="true"
	/>
</nml_record>

All namelist options are defined as variables in the mpas_configure module. If a module needs access to any namelist options, it should use mpas_configure.

The nml_option block has 7 attributes which will be described below:

name attribute

The name attribute is required. It defines the name of the namelist option both in the namelist file, and the mpas_configure module.

type attribute

The type attribute is required. It defines the type of the namelist option in the mpas_configure module. This attribute can have any of the following values:

integer
real
logical
character

default_value attribute

The default_value attribute is required, and defines the value written to the namelist option both when writing the default namelist, and to initialize the namelist option in mpas_configure.

units attribute

The units attribute is optional and describes the units of the namelist option. It is used for documentation purposes.

description attribute

The description attribute is optional and allows for a long description of the namelist option. It is used for documentation purposes.

possible_values attribute

The possible_values attribute is optional and lists the possible values the namelist option might take. Currently it is only used for documentation purposes, but in the future this attribute could be used for validation of namelist options.

in_defaults attribute

The in_defaults attribute is optional, but defaults to false. It can be used to override the in_defaults value defined at the nml_record level for an individual nml_option. The use is the same as the nml_record attribute with the same name. This attribute can have the value of either "true" or "false".
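
The interaction between the record level and option level in_defaults values can be sketched as follows. The record names, option names, types, and default values are hypothetical, and every in_defaults value is written out explicitly so the sketch does not rely on any default.

```xml
<registry model="mpas" core="core" version="ver">
	<!-- This record is written to the default namelist -->
	<nml_record name="example_model" in_defaults="true">
		<!-- Written to the default namelist -->
		<nml_option name="config_example_steps" type="integer" default_value="10"
					units="unitless" description="Hypothetical number of steps."
					possible_values="Any positive integer" in_defaults="true"
		/>
		<!-- Overrides the record level value: left out of the default namelist -->
		<nml_option name="config_example_factor" type="real" default_value="0.5"
					units="unitless" description="Hypothetical tuning factor."
					possible_values="Any real number between 0 and 1" in_defaults="false"
		/>
	</nml_record>
	<!-- This record is not written to the default namelist -->
	<nml_record name="example_debug" in_defaults="false">
		<nml_option name="config_example_level" type="integer" default_value="0"
					in_defaults="false"
		/>
	</nml_record>
</registry>
```

Under the rules described above, the default namelist generated from this sketch would be expected to contain only the example_model record, with config_example_steps as its single option.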

packages block

The packages block wraps the definition of package blocks. It does not have any attributes associated with it. Below is an example of how to define a packages block.

<registry model="mpas" core="core" version="ver">  
	<packages>
		... other definitions ...
	</packages>
</registry>

package block

The package block defines a package. A package is an abstract grouping of variables. Later, there will be examples of how to attach variables to packages and it will be explained what this does.

Below is an example of how to define a package in registry.

<packages>
	<package name="pkgName" description="pkgDescription"/>
</packages>

The package block has two possible attributes, name and description. Below are descriptions of each of these attributes.

name attribute

The name attribute is required on a package block definition. It defines the name of the package. Defining a package like this creates a variable in the mpas_packages module named "pkgNameActive" (using the previous example). This variable is a logical variable with a default value of ".false.".

Each core has a subroutine named mpas_core_setup_packages. This subroutine can be used to modify the value of these "pkgNameActive" variables prior to allocation of variables.

description attribute

The description attribute is optional. It is intended to be used for documentation purposes. It can contain a long description of the intended use of a package.

streams block

The streams block wraps the definitions of stream blocks. Stream blocks must only exist within a streams block. The streams block does not have any attributes associated with it. Below is an example of how to define a streams block.

<registry model="mpas" core="core" version="ver">  
	<streams>
		... other definitions ...
	</streams>
</registry>

stream block

The stream block defines an I/O stream. Presently only four streams can be defined, but this will be modified in the future. I/O streams are groupings of variables for input/output purposes. Below is an example of how to define a stream block.

<streams>
	<stream name="streamName" type="streamType">
		<var name="varName"/>
	</stream>
</streams>

A stream block has two attributes and a nested block. These will be described below.

name attribute

The name attribute on a stream block is required. It defines the name of the stream within MPAS. Presently, streams can only have one of four names.

input
output
restart
surface

In the future, these will be expanded to allow more versatile stream definitions.

type attribute

The type attribute is required. It defines the type of the stream, which can also be thought of as the "direction" of the stream. It can take one of the following three values.

input
output
restart

Input streams can only be read in. Output streams can only be written out. Restart streams are streams that can both be read and written.

nested blocks

A stream by default does not contain any variables. If a stream without any variables is read or written, nothing will happen. In order to attach variables to a stream, the variables must be listed within the stream block. In the previous example, the variable with a unique I/O name of varName has been attached to the stream named streamName. Additional variables can be attached to each stream in the same fashion.
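
As a sketch, attaching several variables to the output and restart streams might look like this. The variable names are hypothetical, and each one is assumed to be defined by a var block elsewhere in the registry file.

```xml
<streams>
	<!-- Can only be written out -->
	<stream name="output" type="output">
		<var name="exampleTemperature"/>
		<var name="exampleThickness"/>
	</stream>
	<!-- Can be both read and written -->
	<stream name="restart" type="restart">
		<var name="exampleTemperature"/>
		<var name="exampleThickness"/>
		<var name="exampleVelocity"/>
	</stream>
</streams>
```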

var_struct block

A var_struct block defines a new data type in MPAS. This data type is a large grouping of variables. Each data type is defined in the mpas_grid_types module. Above these data types, there are two larger groupings, the first being a domain type, and the second being a block type. Within a block type, one instance of each type defined by a var_struct block exists. For example, domain % blocklist % mesh is an example of a var_struct with the name mesh. Below is an example of how to define a var_struct block.

<registry model="mpas" core="core" version="ver">  
	<var_struct name="structName" time_levs="N">
		... other definitions ...
	</var_struct>
</registry>

A var_struct also has two options for nested blocks:

var
var_array

The var_struct block has two attributes, name and time_levs. These attributes will be described below.

name attribute

The name attribute is a required attribute that defines the name of the data type. This defines both the name of the derived data type, and of the particular instance within a block.

For example

<var_struct name="mesh" time_levs="0">
	... other definitions ...
</var_struct>

defines a new data type named mesh_type as well as adding a variable within domain % blocklist named mesh that is of type mesh_type.

time_levs attribute

The time_levs attribute is a required attribute that defines how many time levels should be set up within this new data type. It also slightly modifies how a developer can access the fields within a var_struct.

For example

<var_struct name="mesh" time_levs="0">
	... other definitions ...
</var_struct>
<var_struct name="state" time_levs="2">
	... other definitions ...
</var_struct>

defines two new data types. As in the name attribute section, a mesh_type is defined, as well as an additional state_type. In this case, mesh_type will have a single time level, while state_type will have two time levels. If time_levs is zero (e.g. mesh), the lowest level type can be accessed via domain % blocklist % mesh. If time_levs is non-zero (e.g. state), the lowest level type can be accessed via domain % blocklist % state % time_levs(:) % state.

In this case, the time_levs array is allocated with a size equal to the value of the time_levs attribute for that var_struct. This is true even if time_levs is set to 1.

var_array block

A var_array block is a way of combining several variables into a single data structure. The var_array block can only be used within a var_struct block. Below is an example of how to define a var_array inside a var_struct.

<var_struct name="mesh" time_levs="0">
	<var_array name="varrName" type="varrType" dimensions="dim1 dim2">
		<var name="varName" array_group="varGroup" units="varUnits"
			 description="varDescription" packages="pkg1;pkg2"
		/>
	</var_array>
</var_struct>

A var block nested within a var_array block is referred to as a constituent variable. All constituent variables must have the same type and dimensions.

A var_array has three direct attributes, name, type, and dimensions. A constituent var block has five attributes, name, array_group, units, description, and packages. Below these will be described.

name attribute

The name attribute is required for a var_array block. It defines the name that the grouping of variables will be given within MPAS. In the above example, a variable will be created as:
domain % blocklist % mesh % varrName

type attribute

The type attribute is required for a var_array block. It defines the type of the created variable and can have one of the following values:

real
integer
text

dimensions attribute

The dimensions attribute is required for a var_array block. It defines the dimensions of the constituent variables. It should be a space delimited list of dimensions defined as dim blocks. In addition to any dimensions defined in dim blocks, a Time dimension is available.

For example, <var_array name="varrName" type="varrType" dimensions="nCells Time"> will cause each of the constituents to have the dimension nCells. In terms of array size, the Time dimension is ignored. It is only used for I/O purposes to let MPAS know that each I/O frame should contain an independent value for this variable.

NOTE: The dimensions attribute on a var_array block does not define the dimensions of the var_array itself. A var_array always has one additional dimension, whose size is equal to the number of constituents defined at run time.

constituent var name attribute

The name attribute is required on all constituent var blocks. It defines the name of the variable within all I/O files. This name must be unique, and should not be the same as any other name given to a constituent or non-constituent var block in registry. The addition of a constituent defines a variable that is given the name index_varName. This variable lives within the lowest level var_struct type and is equal to the index in the var_array for the particular constituent named varName. If a constituent is defined to be inactive at run time, this index variable has a value of -1.

constituent var array_group attribute

The array_group attribute is required on all constituent var blocks. It defines how constituents should be grouped within a var_array. It also creates two variables within the lowest level var_struct type. These are named: array_group_start and array_group_end. They are equal to the start and end indices for the array group. If all constituents within an array_group are defined as inactive at run time, the start and end indices should both be equal to -1.
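
To make the grouping concrete, here is a sketch of a var_array with three constituents split across two array groups. The names tracers, temperature, salinity, dye, dynamicGRP, and passiveGRP are chosen for illustration only, and nVertLevels and nCells are assumed to be defined in dim blocks.

```xml
<var_struct name="state" time_levs="2">
	<var_array name="tracers" type="real" dimensions="nVertLevels nCells Time">
		<!-- Two constituents in the first array group -->
		<var name="temperature" array_group="dynamicGRP"/>
		<var name="salinity" array_group="dynamicGRP"/>
		<!-- One constituent in a second array group -->
		<var name="dye" array_group="passiveGRP"/>
	</var_array>
</var_struct>
```

Following the description above, this sketch would be expected to produce the index variables index_temperature, index_salinity, and index_dye, along with a pair of start and end indices for each of the two array groups.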

constituent var units attribute

The units attribute is optional on all constituent var blocks. It is used for documentation purposes to define the units of each constituent variable.

constituent var description attribute

The description attribute is optional on all constituent var blocks. It is used for documentation purposes to provide a long description of each constituent variable.

constituent var packages attribute

The packages attribute is optional on all constituent var blocks. It is used to attach a constituent variable to a package defined in a package block. A constituent variable can be attached to multiple packages by providing a semicolon delimited list of package names. Package names must be identical to those defined within the package blocks.

In the above example, package blocks named pkg1 and pkg2 should have been defined, which creates two variables named pkg1Active and pkg2Active. If either of these variables has a value of ".true." at the time of allocation, the constituent variable will be active. If both of them are false, the constituent variable will be inactive.

Inactive constituent variables will not be allocated, and will not be acted on in I/O streams.

var block

A var block defines an individual variable at the lowest level var_struct type. Below is an example of how to define a var block.

<var_struct name="mesh" time_levs="0">
	<var name="varName" type="varType" dimensions="dim1 dim2"
		 units="varUnits" description="varDescription" packages="pkg1;pkg2"
		 name_in_code="codeVarName"
	/>
</var_struct>

A var block can have seven attributes, name, type, dimensions, units, description, packages, and name_in_code. These will be described below.

name attribute

The name attribute on a var block is a required attribute. It defines the name given to the variable in I/O streams. It should be unique, and should not be the same as the name of any other active or inactive constituent or non-constituent variable.

In the above example, the var block would produce a field in an output file named varName.

type attribute

The type attribute is required on all var blocks. It defines the type of the variable both in MPAS and in input/output files. It can have one of the following values:

real
integer
text

dimensions attribute

The dimensions attribute is required on all var blocks. It defines the dimensions of the variable both in MPAS and in input/output files. It should be a space delimited list of dimensions defined as dim blocks. In addition to all dimensions defined in dim blocks, the Time dimension is available as an option. The Time dimension is ignored within MPAS however, and does not represent an actual dimension in the array size. Its only purpose is to allow the variable to contain an unlimited dimension in output files.

Dimensions should be listed in Fortran order, and the resulting Fortran array will have its dimensions in the order specified. For example, the above example will produce a variable named varName with dimensions (dim1, dim2).
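
A small sketch of how the dimensions attribute maps onto array shape, assuming nVertLevels and nCells are defined as dim blocks. The variable names are hypothetical.

```xml
<var_struct name="mesh" time_levs="0">
	<!-- Array shape in MPAS: (nCells) -->
	<var name="exampleArea" type="real" dimensions="nCells"/>
	<!-- Array shape in MPAS: (nVertLevels, nCells); Time only affects I/O -->
	<var name="exampleTracer" type="real" dimensions="nVertLevels nCells Time"/>
</var_struct>
```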

units attribute

The units attribute is optional on all var blocks. It is only used for documentation purposes to allow developers to define the units associated with a variable.

description attribute

The description attribute is optional on all var blocks. It is only used for documentation purposes to allow developers to write a longer description for each variable.

packages attribute

The packages attribute is optional on all var blocks. It is used to attach a variable to packages defined within package blocks. A variable can be attached to multiple packages by providing a semicolon delimited list of package names. Package names must be identical to those defined within the package blocks.

In the above example, package blocks named pkg1 and pkg2 should have been defined, which creates two variables named pkg1Active and pkg2Active. If either of these variables has a value of ".true." at the time of allocation, the variable will be active. If both of them are false, the variable will be inactive.

Inactive variables will not be allocated, and will not be acted on in I/O streams.
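
Putting the package pieces together, a sketch might look like the following. The package names examplePkgA and examplePkgB and the variable names are hypothetical.

```xml
<registry model="mpas" core="core" version="ver">
	<packages>
		<package name="examplePkgA" description="Hypothetical package A."/>
		<package name="examplePkgB" description="Hypothetical package B."/>
	</packages>
	<var_struct name="mesh" time_levs="0">
		<!-- No packages attribute -->
		<var name="plainVar" type="real" dimensions="nCells"/>
		<!-- Active if examplePkgAActive is true at allocation -->
		<var name="singlePkgVar" type="real" dimensions="nCells" packages="examplePkgA"/>
		<!-- Active if either examplePkgAActive or examplePkgBActive is true -->
		<var name="eitherPkgVar" type="real" dimensions="nCells" packages="examplePkgA;examplePkgB"/>
	</var_struct>
</registry>
```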

name_in_code attribute

The name_in_code attribute is optional on all var blocks. It allows the name within MPAS to differ from the name in all input/output files. In the above example, the variable will be read or written as varName, but within MPAS it will be accessible as:
domain % blocklist % mesh % codeVarName
