units_metadata attribute and other clarifications of units #480

Merged: 16 commits, Dec 4, 2023
6 changes: 5 additions & 1 deletion appc.adoc
@@ -20,7 +20,7 @@ In the __Units__ column, __u__ indicates units dimensionally equivalent to those
The use of this modifier is deprecated and the standard_name number_of_observations is preferred to describe this type of metadata variable.

| `standard_error` | __u__
| The uncertainty of the data value.
| *The uncertainty of the data value.
The standard error includes both systematic and statistical uncertainty.
By default it is assumed that the values supplied are for one standard error.
If the values supplied are for some multiple of the standard error, the `standard_error` ancillary variable should have an attribute **`standard_error_multiplier`** stating the multiplication factor.
@@ -30,3 +30,7 @@ If the values supplied are for some multiple of the standard error, the `standar
The variable should have **`flag_values`** or **`flag_masks`** (or both) and **`flag_meanings`** attributes to show how it should be interpreted (<<flags>>).
The use of this modifier is deprecated and the standard_name status_flag is preferred to describe this type of metadata variable.
|===============

*The definition of this modifier implies that if _u_ is either a unit of temperature, or a unit of temperature multiplied by some other unit, the temperature in _u_ must be interpreted as a temperature difference.
Therefore the **`units_metadata`** attribute, if present, must have the value `temperature: difference`, even if the corresponding data variable without the modifier would have `units_metadata="temperature: on_scale"`.
See <<temperature-units>> for explanation.
10 changes: 6 additions & 4 deletions appe.adoc
@@ -41,15 +41,17 @@ This is the default method for a quantity that is extensive with respect to the

| `mode` | __u__ | Mode (most common value)

| `range` | __u__ | Absolute difference between maximum and minimum
| `range` | __u__ | *Absolute difference between maximum and minimum

| `root_mean_square` | __u__ | Root mean square (RMS)

| `standard_deviation` | __u__ | Standard deviation
| `standard_deviation` | __u__ | *Standard deviation

| `sum_of_squares` | __u^2^__ | Sum of squares

| `variance` | __u^2^__ | Variance
| `variance` | __u^2^__ | *Variance
|===============


*The definition of this method implies that if _u_ is either a unit of temperature, or a unit of temperature multiplied by some other unit, the temperature in _u_ must be interpreted as a temperature difference.
Therefore the **`units_metadata`** attribute, if present, must have the value `temperature: difference`.
See <<temperature-units>> for explanation.
107 changes: 93 additions & 14 deletions ch03.adoc
@@ -14,20 +14,29 @@ But since it is an optional attribute, applications that implement these standar
=== Units

The **`units`** attribute is required for all variables that represent dimensional quantities (except for boundary variables defined in <<cell-boundaries>> and climatology variables defined in <<climatological-statistics>>).
The value of the **`units`** attribute is a string that can be recognized by the UDUNITS package <<UDUNITS>>, with a few exceptions that are given below.
The **`units`** attribute is permitted but not required for dimensionless quantities (see <<dimensionless-units>>).

The value of the **`units`** attribute is a string that can be recognized by the UDUNITS package <<UDUNITS>>, with the exceptions that are given in <<dimensionless-units>> and <<units-multiples>>.
Note that case is significant in the **`units`** strings.
Note also that CF depends on UDUNITS only for the definition of legal **`units`** strings.
CF does not assume or require that the UDUNITS software will be used for **`units`** conversion.
In most **`units`** conversions, the sole operation on the data is multiplication by a scale factor.
Special treatment is required in converting the **`units`** of variables that involve temperature (<<temperature-units>>) and the **`units`** of time coordinate variables (<<time-coordinate>>).

The COARDS convention prohibits the unit `degrees` altogether, but this unit is not forbidden by the CF convention because it may in fact be appropriate for a variable containing, say, solar zenith angle.
The unit `degrees` is also allowed on coordinate variables such as the latitude and longitude coordinates of a transformed grid.
In this case the coordinate values are not true latitudes and longitudes, which must always be identified using the more specific forms of `degrees` as described in <<latitude-coordinate>> and <<longitude-coordinate>>.

Units are not required for dimensionless quantities.

[[dimensionless-units, Section 3.1.1, "Dimensionless units"]]
==== Dimensionless units

A variable with no **`units`** attribute is assumed to be dimensionless.
However, a **`units`** attribute specifying a dimensionless unit may optionally be included.
The canonical unit (see also <<standard-name>>) for dimensionless quantities that represent fractions, or parts of a whole, is `1`.
When a dimensionless quantity is a ratio of dimensional quantities, CF suggests that it may be informative to users of data if the **`units`** are given as a ratio of dimensional units, for instance `mg kg-1` for a mass ratio of 1e-6, or `microlitre litre-1` for a volume ratio of 1e-6.

The UDUNITS package defines a few dimensionless units, such as `percent`, `ppm` (parts per million, 1e-6), and `ppb` (parts per billion, 1e-9).
As an alternative to the canonical **`units`** of `1` or some other unitless number, the **`units`** for a dimensionless quantity may be given as a ratio of dimensional units, for instance `mg kg-1` for a mass ratio of 1e-6, or `microlitre litre-1` for a volume ratio of 1e-6. Data-producers are invited to consider whether this alternative would be more helpful to the users of their data.

The CF convention supports dimensionless units that are UDUNITS compatible, with one exception, concerning the dimensionless units defined by UDUNITS for volume ratios, such as `ppmv` and `ppbv`.
These units are allowed in the **`units`** attribute by CF only if the data variable has no **`standard_name`**.
These units are prohibited by CF if there is a **`standard_name`**, because the **`standard_name`** defines whether the quantity is a volume ratio, so the **`units`** are needed only to indicate a dimensionless number.
@@ -41,9 +50,78 @@ The UDUNITS syntax that allows scale factors and offsets to be applied to a unit
The application of any scale factors or offsets to data should be indicated by the **`scale_factor`** and **`add_offset`** attributes.
Use of these attributes for data packing, which is their most important application, is discussed in detail in <<packed-data>>.

UDUNITS recognizes the following prefixes and their abbreviations.

[[temperature-units, Section 3.1.2, "Temperature units"]]
==== Temperature units

The **`units`** of temperature imply an origin (i.e. zero point) for the associated measurement scale.
When the temperature value is the degree of warmth with respect to the origin of the measurement scale, we call it an _on-scale temperature_.
When **`units`** of on-scale temperature are converted, the data may require the addition of an offset as well as multiplication by a scale factor, because the physical meaning of a numerical value of zero for an on-scale temperature depends on the unit of measurement.
On-scale temperature is _unique_ among quantities in the respect that the origin and the unit of measurement are both defined by the **`units`** and therefore cannot be chosen independently.
For all other quantities, the origin and the unit of measurement are independent.
Converting the unit of measurement alone, without changing the origin, does not change the meaning of zero.
For example (using **bold** to indicate a numerical data value), **0** `kilogram` is the same mass as **0** `pound`, and **0** `seconds since 1970-1-1` means the same as **0** `days since 1970-1-1`, but **0** `degC` is not the same temperature as **0** `degF` (= **-17.8** `degC`), because these two temperature **`units`** implicitly refer to measurement scales which have different origins.

On the other hand, when the temperature value is a _temperature difference_, which compares two on-scale temperatures with the same origin, the value of that origin is irrelevant as it cancels out when taking the difference.
Therefore to convert the **`units`** of a temperature difference requires only multiplication by a scale factor, without the addition of an offset.
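
The distinction between the two kinds of conversion can be sketched in Python.
This is only an illustration for `degC` to `degF`; the function names are hypothetical and are not part of CF or UDUNITS.

```python
def on_scale_degC_to_degF(value_degC):
    """Convert an on-scale temperature: scale factor AND offset."""
    return value_degC * 9.0 / 5.0 + 32.0


def difference_degC_to_degF(delta_degC):
    """Convert a temperature difference: scale factor only."""
    return delta_degC * 9.0 / 5.0


# Two on-scale temperatures and the difference between them.
t1_degC, t2_degC = 10.0, 15.0
t1_degF = on_scale_degC_to_degF(t1_degC)
t2_degF = on_scale_degC_to_degF(t2_degC)
delta_degF = difference_degC_to_degF(t2_degC - t1_degC)

# The offset cancels when the two converted on-scale values are subtracted,
# so the result agrees with the scale-factor-only conversion.
assert abs((t2_degF - t1_degF) - delta_degF) < 1e-12
```

Applying the on-scale conversion to a difference would wrongly add 32 to it, which is why the two cases need to be distinguished.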

The **`units`** attribute does not distinguish between on-scale temperatures and temperature differences.
This ambiguity also affects units of temperature raised to some power e.g. `K^2` or multiplied by other units e.g. `W m-2 K-1`, `degF/foot` or `degC m s-1`.
A **`standard_name`** (<<standard-name>>) or **`standard_name`** modifier (<<standard-name-modifiers>>) may clarify the intention, but they are optional.
Some statistical operations described by the **`cell_methods`** attribute (<<cell-methods>>; <<appendix-cell-methods>>) imply that temperature must be interpreted as temperature difference, but this attribute is optional too.

In order to convert the **`units`** correctly, it is essential to know whether a temperature is on-scale or a difference.
Therefore this standard strongly recommends that any variable whose **`units`** involve a temperature unit should also have a **`units_metadata`** attribute to make the distinction.
This attribute must have one of the following three values: `temperature: on_scale`, `temperature: difference`, `temperature: unknown`.
The **`units_metadata`** attribute, **`standard_name`** modifier (<<standard-name-modifiers>>) and **`cell_methods`** attribute (<<appendix-cell-methods>>) must be consistent if present.
A variable must not have a **`units_metadata`** attribute if it has no **`units`** attribute or if its **`units`** do not involve a temperature unit.

[[use-of-units-metadata-ex]]
[caption="Example 3.1. "]
.Use of **`units_metadata`** to distinguish temperature quantities
====
----
variables:
float Tonscale;
Tonscale:long_name="global-mean surface temperature";
Tonscale:standard_name="surface_temperature";
Tonscale:units="degC";
Tonscale:units_metadata="temperature: on_scale";
Tonscale:cell_methods="area: mean";
float Tdifference;
Tdifference:long_name="change in global-mean surface temperature relative to pre-industrial";
Tdifference:standard_name="surface_temperature";
Tdifference:units="degC";
Tdifference:units_metadata="temperature: difference";
Tdifference:cell_methods="area: mean";
----
====

With `temperature: unknown`, correct conversion of the **`units`** cannot be guaranteed.
This value of **`units_metadata`** indicates that the data-writer does not know whether the temperature is on-scale or a difference.
If the **`units_metadata`** attribute is not present, the data-reader should assume `temperature: unknown`.
The **`units_metadata`** attribute was introduced in CF 1.11.
In data written according to versions before 1.11, `temperature: unknown` should be assumed for all **`units`** involving temperature, if it cannot be deduced from other metadata.
We note (for guidance only for `temperature: unknown`, not as a CF convention) that the UDUNITS software assumes `temperature: on_scale` for **`units`** strings containing only a unit of temperature, and `temperature: difference` for **`units`** strings in which a unit of temperature is raised to any power other than unity, or multiplied or divided by any other unit.
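
The guidance above for `temperature: unknown` can be approximated by a rough sketch such as the following.
It is not the UDUNITS parser: the tokenisation is deliberately crude, the list of temperature unit spellings is an incomplete assumption, prefixed units are not handled, and the function name is hypothetical.

```python
import re

# Assumed, incomplete set of temperature unit spellings (illustration only).
TEMPERATURE_UNITS = {
    "K", "kelvin", "degC", "degree_C", "degree_Celsius", "celsius",
    "degF", "degree_F", "fahrenheit",
}


def guess_temperature_kind(units):
    """Return 'on_scale', 'difference', or None if no temperature unit is found."""
    text = units.strip()
    # A units string consisting only of a temperature unit: on-scale.
    if text in TEMPERATURE_UNITS:
        return "on_scale"
    # Otherwise split into factors and look for a temperature unit that is
    # raised to a power, or multiplied or divided by other units.
    for token in re.split(r"[\s/*]+", text):
        match = re.match(r"[A-Za-z_]+", token)
        if match and match.group(0) in TEMPERATURE_UNITS:
            return "difference"
    return None


for example in ("degC", "K^2", "W m-2 K-1", "degF/foot", "kg m-2"):
    print(example, "->", guess_temperature_kind(example))
```

A guess of this kind is a fallback only; an explicit **`units_metadata`** attribute remains the reliable source of the distinction.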

With `temperature: on_scale`, correct conversion can be guaranteed only for pure temperature **`units`**.
If the quantity is an on-scale temperature multiplied by some other quantity, it is not possible to convert the data from the **`units`** given to any other **`units`** that involve a temperature with a different origin, given only the **`units`**.
For instance, when temperature is on-scale, a value in `kg degree_C m-2` can be converted to a value in `kg K m-2` only if we know separately the values in `degree_C` and `kg m-2` of which it is the product.
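
A small numerical sketch may make this concrete.
The values below are invented for illustration: two different pairs of temperature and areal mass give the same product in `kg degree_C m-2` but different products in `kg K m-2`, so the product alone cannot be converted.

```python
def degC_to_K(value_degC):
    """On-scale conversion from degree_C to K."""
    return value_degC + 273.15


# (temperature in degree_C, areal mass in kg m-2); hypothetical values.
cases = [(10.0, 2.0), (20.0, 1.0)]

products_degC = [t * m for t, m in cases]
products_K = [degC_to_K(t) * m for t, m in cases]

# The products are identical in kg degree_C m-2 ...
assert products_degC[0] == products_degC[1]
# ... but differ in kg K m-2, so no single function of the product
# can perform the conversion.
assert abs(products_K[0] - products_K[1]) > 1.0
```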


[[units-multiples, Section 3.1.3, "Scale factors and offsets"]]
==== Scale factors and offsets

UDUNITS recognises the SI prefixes shown in <<table-supported-units>> for decimal multiples and submultiples of units, and allows them to be applied to non-SI units as well.
UDUNITS offers a syntax for indicating arbitrary scale factors and offsets to be applied to a unit.
(Note that this is different from the scale factors and offsets used for converting between **`units`**, as discussed for temperature in <<temperature-units>>.)
This UDUNITS syntax for arbitrary transformation of **`units`** is not supported by the CF standard, except for the case of specifying reference time (<<time-coordinate>>).
The application of any scale factors or offsets to data should be indicated by the **`scale_factor`** and **`add_offset`** attributes.
Use of these attributes for data packing, which is their most important application, is discussed in detail in <<packed-data>>.
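
As a minimal sketch of how the **`scale_factor`** and **`add_offset`** attributes are applied when reading packed data (multiply by the scale factor, then add the offset), consider the following.
The function name and the sample values are hypothetical; <<packed-data>> remains the authoritative description.

```python
def unpack(packed_values, scale_factor=1.0, add_offset=0.0):
    """Recover data values from packed values.

    Each packed value is multiplied by scale_factor, then add_offset is added.
    A missing attribute is treated as 1.0 (scale) or 0.0 (offset).
    """
    return [p * scale_factor + add_offset for p in packed_values]


# Hypothetical packed integers with attributes scale_factor=0.5, add_offset=1.0.
packed = [2, 4, 6]
unpacked = unpack(packed, scale_factor=0.5, add_offset=1.0)
assert unpacked == [2.0, 3.0, 4.0]
```

Keeping the transformation in these attributes, rather than in the **`units`** string, leaves the **`units`** describing the unpacked data only.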

[[table-supported-units]]
.Supported Units
.Prefixes for decimal multiples and submultiples of units
[options="header",caption="Table 3.1. "]
|===============
| Factor | Prefix | Abbreviation | | Factor | Prefix | Abbreviation
@@ -59,6 +137,7 @@ UDUNITS recognizes the following prefixes and their abbreviations.
| 1e24 | yotta | Y | | 1e-24 | yocto | y
|===============


[[long-name, Section 3.2, "Long Name"]]
=== Long Name

@@ -93,7 +172,7 @@ Unless it is dimensionless, a variable with a **`standard_name`** attribute must

description:: The description is meant to clarify the qualifiers of the fundamental quantities such as which surface a quantity is defined on or what the flux sign conventions are.
We don't attempt to provide precise definitions of fundamental physical quantities (e.g., temperature) which may be found in the literature.
The description may define rules on the variable type, attributes and coordinates which must be complied with by any variable carrying that standard name (such as in example 3.4).
The description may define rules on the variable type, attributes and coordinates which must be complied with by any variable carrying that standard name (such as in Example 3.5).

When appropriate, the table entry also contains the corresponding GRIB parameter code(s) (from ECMWF and NCEP) and AMIP identifiers.

@@ -116,7 +195,7 @@ Other types of quantity modifiers are expressed using the optional modifier part
The permissible values of these modifiers are given in <<standard-name-modifiers>>.

[[use-of-standard-name-ex]]
[caption="Example 3.1. "]
[caption="Example 3.2. "]
.Use of **`standard_name`**
====
@@ -148,7 +227,7 @@ The dimensions of an ancillary variable must be the same as or a subset of the d
If an ancillary variable of a data variable that has been compressed by gathering (<<compression-by-gathering>>) does not span the compressed dimension, then its dimensions may be any subset of the data variable's uncompressed dimensions, i.e. any of the dimensions of the data variable except the compressed dimension, and any of the dimensions listed by the **`compress`** attribute of the compressed coordinate variable.

[[instrument-data-ex]]
[caption="Example 3.2. "]
[caption="Example 3.3. "]
.Ancillary instrument data
====
@@ -184,7 +263,7 @@ Several examples are listed below:
The following example illustrates the use of three of these flags to represent two independent quality control tests and an aggregate flag that combines the results of the two tests.

[[quality-flag-ex]]
[caption="Example 3.3. "]
[caption="Example 3.4. "]
.Ancillary quality flag data
====
@@ -226,7 +305,7 @@ If multi-word phrases are used to describe the flag values, then the words withi
The following example illustrates the use of flag values to express a speed quality with an enumerated status code.

[[flag-variable-flag-values-ex]]
[caption="Example 3.4. "]
[caption="Example 3.5. "]
.A flag variable, using **`flag_values`**
====
@@ -255,7 +334,7 @@ The following example illustrates the use of flag_masks to express six sensor st


[[flag-variable-flag-masks-ex]]
[caption="Example 3.5. "]
[caption="Example 3.6. "]
.A flag variable, using **`flag_masks`**
====
@@ -279,7 +358,7 @@ The following example illustrates this using integer flag values for a variable


[[region-variable-flag-values-ex]]
[caption="Example 3.6. "]
[caption="Example 3.7. "]
.A region variable, using **`flag_values`**
====
@@ -303,7 +382,7 @@ Each **`flag_values`** and **`flag_masks`** value must coincide with a **`flag_m
The following example illustrates the use of **`flag_masks`** and **`flag_values`** to express two sensor status conditions and one enumerated status code.

[[flag-variable-flag-masks-flag-values-ex]]
[caption="Example 3.7. "]
[caption="Example 3.8. "]
.A flag variable, using **`flag_masks`** and **`flag_values`**
====
8 changes: 2 additions & 6 deletions ch04.adoc
@@ -97,9 +97,7 @@ Optionally, the longitude type may be indicated additionally by providing the **
Coordinates of longitude with respect to a rotated pole should be given units of **`degrees`**, not **`degrees_east`** or equivalents, because applications which use the units to identify axes would have no means of distinguishing such an axis from real longitude, and might draw incorrect coastlines, for instance.




[[vertical-coordinate]]
[[vertical-coordinate, Section 4.3, "Vertical Coordinate"]]
=== Vertical (Height or Depth) Coordinate

Variables representing dimensional height or depth axes must always explicitly include the **`units`** attribute; there is no default value.
@@ -214,9 +212,7 @@ The `computed_standard_name` attribute indicates that the values in variable
`p` would have a `standard_name` of `air_pressure`.




[[time-coordinate]]
[[time-coordinate, Section 4.4, "Time Coordinate"]]
=== Time Coordinate

Variables representing reference time must always explicitly include the **`units`** attribute; there is no default value.
5 changes: 5 additions & 0 deletions conformance.adoc
@@ -147,11 +147,16 @@ Exceptions are boundary and climatology variables.
* The type of the **`units`** attribute is a string that must be recognizable by the UDUNITS package.
Exceptions are the units **`level`**, **`layer`**, and **`sigma_level`**.
* Dimensionless units for volume fractions defined by UDUNITS (**`ppv`**, **`ppmv`**, **`ppbv`**, **`pptv`**, **`ppqv`**) are not allowed in the **`units`** attribute of any variable which also has a **`standard_name`** attribute.
* If present, the **`units_metadata`** attribute must have one of these values: `temperature: on_scale`, `temperature: difference`, `temperature: unknown`.
* The **`units`** of a variable that specifies a **`standard_name`** must be physically equivalent to the canonical units given in the standard name table, as modified by the **`standard_name`** modifier, if there is one, according to Appendix C, and then modified by all the methods listed in order by the **`cell_methods`** attribute, if one is present, according to Appendix E.
* If the **`standard_name`** attribute includes the `standard_error` modifier, the **`units_metadata`** attribute, if present, must have the value `temperature: difference`.
* If the **`cell_methods`** attribute includes any entry with any of the methods `range`, `standard_deviation` or `variance`, the **`units_metadata`** attribute, if present, must have the value `temperature: difference`.
* A variable must not have a **`units_metadata`** attribute if it has no **`units`** attribute or if its **`units`** do not involve a temperature unit.
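
The **`units_metadata`** requirements listed above could be checked along the following lines.
This is a sketch under stated assumptions, not a reference checker: the function name is hypothetical, the variable's attributes are assumed to be available as a plain dictionary, the caller is assumed to know whether the **`units`** involve a temperature unit, and **`cell_methods`** is scanned by simple whitespace tokenisation.

```python
ALLOWED_VALUES = {
    "temperature: on_scale",
    "temperature: difference",
    "temperature: unknown",
}
DIFFERENCE_METHODS = {"range", "standard_deviation", "variance"}


def check_units_metadata(attrs, units_involve_temperature):
    """Return a list of problems found for one variable's attributes."""
    problems = []
    value = attrs.get("units_metadata")
    if value is None:
        return problems  # the attribute is optional

    if value not in ALLOWED_VALUES:
        problems.append("units_metadata has a value that is not allowed")
    if "units" not in attrs or not units_involve_temperature:
        problems.append("units_metadata present without temperature units")

    # A standard_name modifier follows the name, separated by a space.
    modifiers = attrs.get("standard_name", "").split()[1:]
    methods = set(attrs.get("cell_methods", "").split())
    needs_difference = "standard_error" in modifiers or bool(
        methods & DIFFERENCE_METHODS
    )
    if needs_difference and value in ALLOWED_VALUES and value != "temperature: difference":
        problems.append("units_metadata must be 'temperature: difference'")
    return problems


ok = {"units": "K", "units_metadata": "temperature: on_scale"}
bad = dict(ok, cell_methods="time: variance")
print(check_units_metadata(ok, True))
print(check_units_metadata(bad, True))
```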

*Recommendations:*

* The units **`level`**, **`layer`**, and **`sigma_level`** are deprecated.
* Any variable whose **`units`** involve a temperature unit should also have a **`units_metadata`** attribute.

[[section-8]]

2 changes: 2 additions & 0 deletions history.adoc
@@ -7,6 +7,8 @@

=== Working version (most recent first)

* {issues}481[Issue #481]: Introduce **`units_metadata`** attribute and clarify some other aspects of **`units`**
* {issues}458[Issue #147]: Clarify the use of compressed dimensions in related variables
* {issues}486[Issue #486]: Fix PDF formatting problems and invalid references
* {issues}490[Issue #490]: Simple correction to Example 6.1.2
* {issues}457[Issue #457]: Creation date of the draft Conventions document