Adds a CF-compliant unlimited time dimension
  - adds support for an unlimited time dimension
  - adds the CF-compliant time field with appropriate units
  - adds additional Field properties to support time-dependent fields
  - extends the support for non-distributed field IO into IOStream
  - updates documentation

This still only supports one time slice per file. A subsequent modification
will add support for multiple slices in a file.
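
The CF time coordinate mentioned above pairs a numeric value with a units string of the form `seconds since <reference time>`. The sketch below is a standalone illustration of building such a string. `cfTimeUnits` is a hypothetical helper, not part of Omega, and the zero-padded `YYYY-MM-DD hh:mm:ss` layout is an assumption about what the start-time string looks like.

```c++
#include <cassert>
#include <iomanip>
#include <sstream>
#include <string>

// Hypothetical helper (not part of Omega): builds a CF-style units string
// "seconds since YYYY-MM-DD hh:mm:ss" from a reference date and time.
std::string cfTimeUnits(int Year, int Month, int Day, int Hour, int Minute,
                        int Second) {
   // Basic range checks on the reference date
   assert(Month >= 1 && Month <= 12);
   assert(Day >= 1 && Day <= 31);

   std::ostringstream Units;
   // setfill persists for the stream; setw applies only to the next item
   Units << "seconds since " << std::setfill('0') << std::setw(4) << Year
         << '-' << std::setw(2) << Month << '-' << std::setw(2) << Day << ' '
         << std::setw(2) << Hour << ':' << std::setw(2) << Minute << ':'
         << std::setw(2) << Second;
   return Units.str();
}
```

For example, `cfTimeUnits(1, 1, 1, 0, 0, 0)` gives `seconds since 0001-01-01 00:00:00`.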
philipwjones committed Nov 25, 2024
1 parent 871ea4f commit 6230c93
Showing 11 changed files with 288 additions and 64 deletions.
31 changes: 24 additions & 7 deletions components/omega/doc/devGuide/Field.md
@@ -39,7 +39,8 @@ Fields are created with standard metadata using
ValidMax, ///< [in] max valid field value field data)
FillValue, ///< [in] scalar used for undefined entries
NumDims, ///< [in] number of dimensions (int)
Dimensions ///< dim names for each dim (vector of strings)
Dimensions, ///< [in] dim names (vector of strings)
InTimeDependent ///< [in] (opt, default true) if time varying
);
```
This interface enforces a list of required metadata. If a CF standard name does
@@ -49,8 +50,15 @@ for some intermediate calculations or unique analyses. If there is no
restriction on valid range, an appropriately large range should be provided for
the data type. Similarly, if a FillValue is not being used, a very unique
number should be supplied to prevent accidentally treating valid data as a
FillValue. Actual field data stored in an array is attached in a separate
call as described below. Fields without a data array can be created with:
FillValue. The optional InTimeDependent argument can be omitted and defaults
to true. Time-dependent fields are written with the unlimited time dimension
added. Time should not be listed explicitly in the dimension list since it is
added during I/O. Fields that do not change with time should pass false for
this argument so that the time dimension is not added. Actual field data
stored in an array is attached in a separate call as described below. Scalar
fields can be added by setting NumDims to zero (the Dimensions argument is
then ignored). Scalar data is attached using a 1D array with size 1. Fields
without a data array can be created with:
```c++
std::shared_ptr<Field> MyField =
Field::create(FieldName ///< [in] Name of field
@@ -108,9 +116,10 @@ captured correctly. If the location of the data changes (eg the time
level changes and the pointer points to a different time slice), the data must
be updated by calling the attach routine to replace the pointer to the new
location. It is up to the developer to insert the appropriate call to reattach
the data. The attach function primarily sets the pointer to the data location
but it also sets the data type of the variable and its memory location using
two enum classes:
the data. As mentioned previously, scalar data should be attached using the
appropriate 1D HostArray with a size of 1. The attach function primarily sets
the pointer to the data location but it also sets the data type of the variable
and its memory location using two enum classes:
```c++
enum class FieldType {Unknown, I4, I8, R4, R8};
enum class FieldMemLoc {Unknown, Device, Host, Both};
@@ -155,7 +164,15 @@ The dimension information can be retrieved using:
int Err = MyField->getDimNames(MyDimNames);
```
Once the dimension names have been retrieved, the Dimension class API can be
used to extract further dimension information.
used to extract further dimension information. Two other field quantities
can be retrieved, but are used only by the IOStream capability:
```c++
bool IsTimeDependent = MyField->isTimeDependent();
bool IsDistributed = MyField->isDistributed();
```
The first determines whether the unlimited time dimension should be added
during IO operations. The second determines whether any of the dimensions
are distributed across MPI tasks so that parallel IO is required.
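
As a rough illustration of how an IO layer might consume these two flags, the standalone sketch below uses a hypothetical `FieldInfo` struct and helper functions (`outputDimNames`, `chooseWritePath`) that are not part of the Omega API. It assumes the unlimited time dimension is placed first in the dimension list, as the netCDF classic format requires for its record dimension. The dimension names in the usage note are only examples.

```c++
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for the Field properties discussed above
struct FieldInfo {
   std::vector<std::string> DimNames; // declared dims, without "time"
   bool TimeDependent;                // add unlimited time dim during IO?
   bool Distributed;                  // any dim distributed across tasks?
};

// Hypothetical helper: dimension list an IO layer might use on output
std::vector<std::string> outputDimNames(const FieldInfo &Info) {
   // Time must not be listed explicitly; it is added here instead
   for (const auto &Name : Info.DimNames)
      assert(Name != "time");

   std::vector<std::string> Out;
   if (Info.TimeDependent)
      Out.push_back("time"); // assumption: unlimited dim goes first
   Out.insert(Out.end(), Info.DimNames.begin(), Info.DimNames.end());
   return Out;
}

// Hypothetical helper: choose between parallel and non-distributed IO
enum class WritePath { Parallel, NonDistributed };

WritePath chooseWritePath(const FieldInfo &Info) {
   return Info.Distributed ? WritePath::Parallel : WritePath::NonDistributed;
}
```

With this sketch, a time-dependent field defined on `{"NCells", "NVertLevels"}` would be written with dimensions `time, NCells, NVertLevels`, while a scalar field created with the time-dependent flag set to false would have an empty dimension list.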

The data and metadata stored in a field can be retrieved using several
functions. To retrieve a pointer to the full Field, use:
3 changes: 3 additions & 0 deletions components/omega/src/base/IO.h
@@ -50,6 +50,9 @@ namespace IO {
/// ID for global metadata (ie metadata not associated with a variable)
constexpr int GlobalID = PIO_GLOBAL;

/// Length for unlimited dimensions
constexpr int Unlimited = PIO_UNLIMITED;

/// Choice of parallel IO rearranger algorithm
enum Rearranger {
RearrBox = PIO_REARR_BOX, ///< box rearranger (default)
44 changes: 42 additions & 2 deletions components/omega/src/infra/Field.cpp
@@ -16,6 +16,7 @@
#include "Field.h"
#include "DataTypes.h"
#include "Dimension.h"
#include "IO.h"
#include "Logging.h"
#include <iostream>
#include <map>
@@ -30,7 +31,8 @@ std::map<std::string, std::shared_ptr<FieldGroup>> FieldGroup::AllGroups;

//------------------------------------------------------------------------------
// Initializes the fields for global code and simulation metadata
int Field::init() {
int Field::init(const Clock *ModelClock // [in] default model clock
) {

int Err = 0;

@@ -39,6 +41,24 @@ int Field::init() {
std::shared_ptr<Field> CodeField = create(CodeMeta);
std::shared_ptr<Field> SimField = create(SimMeta);

// Define an unlimited time dimension for many time-dependent fields
// for CF-compliant output
std::shared_ptr<Dimension> TimeDim =
Dimension::create("time", IO::Unlimited);

// Define a time field with required metadata for CF-compliant output
// It is defined here as a scalar field but the time axis will be added
// during IO
TimeInstant StartTime = ModelClock->getStartTime();
std::string StartTimeStr = StartTime.getString(4, 0, " ");
std::string UnitString = "seconds since " + StartTimeStr;
CalendarKind CalKind = Calendar::getKind();
std::string CalName = CalendarCFName[CalKind];
std::vector<std::string> DimNames; // empty dim names vector
std::shared_ptr<Field> TimeField =
create("time", "time", UnitString, "time", 0.0, 1.e20, 0.0, 0, DimNames);
TimeField->addMetadata("calendar", CalName);

return Err;
}

@@ -70,7 +90,8 @@ Field::create(const std::string &FieldName, // [in] Name of variable/field
const std::any ValidMax, // [in] max valid field value
const std::any FillValue, // [in] scalar for undefined entries
const int NumDims, // [in] number of dimensions
const std::vector<std::string> &Dimensions // [in] dim names
const std::vector<std::string> &Dimensions, // [in] dim names
bool InTimeDependent // [in] flag for time dependent field
) {

// Check to make sure a field of that name has not already been defined
@@ -107,16 +128,24 @@ Field::create(const std::string &FieldName, // [in] Name of variable/field
ThisField->FieldMeta["FillValue"] = FillValue;
ThisField->FieldMeta["_FillValue"] = FillValue;

// Set the time-dependent flag
ThisField->TimeDependent = InTimeDependent;

// Number of dimensions for the field
ThisField->NDims = NumDims;

// Dimension names for retrieval of dimension info
// These must be in the same index order as the stored data
// Also determine whether this is a distributed field - true if any of
// the dimensions are distributed.
ThisField->Distributed = false;
ThisField->DimNames;
if (NumDims > 0) {
ThisField->DimNames.resize(NumDims);
for (int I = 0; I < NumDims; ++I) {
ThisField->DimNames[I] = Dimensions[I];
if (Dimension::isDistributedDim(Dimensions[I]))
ThisField->Distributed = true;
}
}

@@ -300,6 +329,17 @@ bool Field::isFieldOnHost(const std::string &FieldName // [in] name of field
// Returns the number of dimensions for the field
int Field::getNumDims() const { return NDims; }

//------------------------------------------------------------------------------
// Determines whether the field is time dependent and requires the unlimited
// time dimension during IO
bool Field::isTimeDependent() const { return TimeDependent; }

//------------------------------------------------------------------------------
// Determines whether a field is distributed across tasks or whether a copy
// is entirely local. This is needed to determine whether a parallel IO or
// a non-distributed IO will be used.
bool Field::isDistributed() const { return Distributed; }

//------------------------------------------------------------------------------
// Returns a vector of dimension names associated with each dimension
// of an array field. Returns an error code.
27 changes: 25 additions & 2 deletions components/omega/src/infra/Field.h
@@ -19,6 +19,7 @@
#include "DataTypes.h"
#include "Dimension.h"
#include "Logging.h"
#include "TimeMgr.h"
#include <any>
#include <map>
#include <memory>
@@ -107,6 +108,14 @@ class Field {
/// Location of data
FieldMemLoc MemLoc;

/// Flag for whether this is a time-dependent field that needs the
/// Unlimited time dimension added during IO
bool TimeDependent;

/// Flag for whether this is a field that is distributed across tasks
/// or whether it is entirely local
bool Distributed;

/// Data attached to this field. This will be a pointer to the Kokkos
/// array holding the data. We use a void pointer to manage all the
/// various types and cast to the appropriate type when needed.
@@ -117,7 +126,9 @@ class Field {
// Initialization
//---------------------------------------------------------------------------
/// Initializes the fields for global code and simulation metadata
static int init();
/// It also initializes the unlimited time dimension needed by most fields
static int init(const Clock *ModelClock ///< [in] the default model clock
);

//---------------------------------------------------------------------------
// Create/destroy/query fields
@@ -142,7 +153,8 @@ class Field {
const std::any ValidMax, ///< [in] max valid field value
const std::any FillValue, ///< [in] scalar for undefined entries
const int NumDims, ///< [in] number of dimensions
const std::vector<std::string> &Dimensions ///< dim names for each dim
const std::vector<std::string> &Dimensions, ///< [in] dim names
const bool InTimeDependent = true ///< [in] opt flag for unlim time
);

//---------------------------------------------------------------------------
@@ -222,6 +234,17 @@ class Field {
std::vector<std::string> &Dimensions ///< [out] list of dimensions
) const;

//---------------------------------------------------------------------------
// Query for other properties
/// Determine whether this is a time-dependent field that requires the
/// unlimited time dimension for IO
bool isTimeDependent() const;

/// Determine whether this is a distributed field or whether it is entirely
/// local. This is needed to determine whether IO uses parallel read/write
/// or an undistributed read/write.
bool isDistributed() const;

//---------------------------------------------------------------------------
// Metadata functions
//---------------------------------------------------------------------------