Proposed VIC workflow

This is based on the text in VIC repo issue #7, but expanded further.

What and why?

I propose a reorganization of the VIC code that would allow multiple VIC drivers to use a single science core, so that science-based changes to the code can easily be shared among different VIC configurations.

VIC is currently implemented in a number of different configurations, both offline and coupled. For example:

offline, time-first mode (VIC classic): This is the configuration used by the code that is distributed by the UW team through its own web site and through the GitHub repository.
offline, space-first mode (VIC image mode): This is the configuration used in the VIC implementation in NASA's Land Information System (LIS), which uses parts of the Earth System Modeling Framework (ESMF).
coupled, space-first mode (VIC image mode): This is the configuration used in the VIC implementation in the Regional Arctic System Model (RASM), which uses the Community Earth System Model (CESM). CESM in turn uses parts of the Model Coupling Toolkit (MCT).

In addition, there are likely many project-specific configurations of VIC in use by other teams.

The challenge with the current implementation is that the computational or scientific core of VIC is not cleanly separated from the driver. As a result, once a VIC version is implemented within a certain configuration, it is difficult to keep that configuration updated with the latest science-based changes that are made to VIC. In other words, after porting VIC version X so that it works in anything other than the default configuration (VIC classic), it is difficult to accommodate changes made as part of VIC versions X.1, X.2, etc.

The goal of the proposed reorganization is limited to cleanly separating the VIC computational / scientific core from the driver. The driver determines whether the model is run in uncoupled, coupled, time-first, or space-first mode. This will allow all configurations to benefit from updates to VIC's core. Note that changes to VIC's core may require additional changes to the driver (for example, if new parameters need to be read).

The proposed reorganization includes changes to the organization of the VIC source code tree and changes to the source code itself. Note that these changes should not affect the results. That is, VIC classic should produce the same results before and after the reorganization. Similarly, when using the same model parameters and driven with the same meteorological forcings, all drivers should produce the same results. This will allow us to put some rigorous tests in place to ensure that the reorganization does not introduce unwanted effects and to ensure that the results are correct if someone creates a new driver.

The reorganization will allow us to add further drivers to the VIC code repository, so that VIC can be operated in time-first or space-first mode, uncoupled or coupled.

The following are some preliminary ideas (feedback requested) for how to implement this reorganization.

Driver and VIC core separation

Each driver must call at a minimum the following functions:

vic_start() -- Initialize global parameters for simulation
vic_alloc() -- Allocate memory for VIC structures
vic_init() -- Initialize model parameters
vic_restore() -- Restore model state
vic_run() -- Run VIC for a single grid cell, for a single time step.
vic_write() -- Write VIC model output for a single time step (either for an entire domain or for a single grid cell).
vic_save() -- Save a VIC model state.
vic_finalize() -- Final cleanup.

In essence, the most important call will be vic_run(), which will run VIC for a single time step and for a single grid cell. All the code that is invoked as part of the vic_run() call is by definition part of VIC core and will be the same regardless of the driver. The arguments to the vic_run() call will include the complete set of meteorological forcings for that particular time step. That is, any manipulation of meteorological forcings will be done in the driver. The memory allocated in vic_alloc() will need to persist in the driver between successive calls to vic_run() until vic_finalize() is called.
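To make this division of labor concrete, here is a minimal sketch in C. The struct names, their fields, and all function signatures are assumptions made for illustration only, since the proposal does not fix them, and the arithmetic inside vic_run() is a placeholder rather than VIC science. The sketch only shows the contract described above: the driver allocates state that persists between calls, prepares the forcings for one time step, and hands both to vic_run() for one grid cell.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical forcing record for ONE time step; field names are illustrative. */
typedef struct {
    double air_temp;  /* air temperature (deg C) */
    double precip;    /* precipitation (mm per time step) */
} forcing_t;

/* Hypothetical per-cell state that must persist between vic_run() calls. */
typedef struct {
    double soil_moisture;  /* toy state variable (mm) */
} cell_state_t;

/* Driver side: allocate zero-initialized state for ncells grid cells. */
cell_state_t *vic_alloc(size_t ncells)
{
    return calloc(ncells, sizeof(cell_state_t));
}

/*
 * Core side: advance ONE grid cell by ONE time step.
 * The forcings arrive fully prepared by the driver.
 * Returns 0 on success, -1 on invalid arguments.
 */
int vic_run(const forcing_t *forcing, cell_state_t *state)
{
    if (forcing == NULL || state == NULL) {
        return -1;
    }
    /* Placeholder water balance, NOT the VIC physics. */
    double evap = forcing->air_temp > 0.0 ? 0.5 * forcing->air_temp : 0.0;
    state->soil_moisture += forcing->precip - evap;
    if (state->soil_moisture < 0.0) {
        state->soil_moisture = 0.0;
    }
    return 0;
}

/* Driver side: release the state once the simulation is over. */
void vic_finalize(cell_state_t *states)
{
    free(states);
}
```

A driver would call vic_alloc() once, then call vic_run() repeatedly with the same state pointer and a fresh forcing_t per time step, and finally call vic_finalize(). Passing the state explicitly, rather than keeping it in globals inside the core, is one possible design choice that would keep the core free of driver-specific assumptions.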

The other vic_ functions, as well as any other functionality, will be implemented as part of each driver and will consequently vary between VIC implementations, even though there may be a significant amount of overlap between drivers. Note that the drivers will need to implement additional functionality, for example, how to deal with meteorological forcings, how to deal with time-varying model parameters that are read from file, and how to deal with changes to a model state as part of a data assimilation scheme.

Because only the code that is invoked by vic_run() is part of VIC's core, it could be argued that only the call to vic_run() should be required by each driver. However, requiring calls to the other vic_ functions as well will likely promote code reuse among drivers and facilitate the implementation of new drivers. For example, two different drivers that both operate in image mode and write NetCDF output may be able to use the same vic_write() and vic_save() functions and parts of the same vic_init() and vic_finalize() functions.

In short, pseudo-code for VIC classic would look something like:

vic_start()
foreach gridcell:
    vic_alloc()
    vic_init()
    vic_restore()
    foreach timestep:
        vic_run()
        if output:
            vic_write()
        if save:
            vic_save()
    vic_finalize()

Pseudo-code for an image mode implementation would look something like:

vic_start()
vic_alloc()
vic_init()
vic_restore()
foreach timestep:
    foreach gridcell:
        vic_run()
    if output:
        vic_write()
    if save:
        vic_save()
vic_finalize()
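The two loop orders above can also be sketched in C to illustrate why both drivers should give the same results when they share one core. Everything here is a toy: vic_run() is a stand-in recurrence rather than VIC physics, get_forcing() is a hypothetical helper that replaces real forcing input, and the grid size and number of time steps are arbitrary. The only point is that each cell sees the same sequence of vic_run() calls regardless of which loop is outermost, as long as the core does not exchange information between cells.

```c
#include <assert.h>

#define NCELLS 3
#define NSTEPS 4

/* Toy stand-in for the shared core: one cell, one time step. */
static void vic_run(double forcing, double *state)
{
    *state = 0.9 * (*state) + forcing;
}

/* Hypothetical forcing lookup; a real driver would read or receive these. */
static double get_forcing(int cell, int step)
{
    return 1.0 + cell + 0.5 * step;
}

/* Time-first (classic) order: finish all time steps for a cell, then move on. */
void run_time_first(double state[NCELLS])
{
    for (int c = 0; c < NCELLS; c++) {
        for (int t = 0; t < NSTEPS; t++) {
            vic_run(get_forcing(c, t), &state[c]);
        }
    }
}

/* Space-first (image) order: finish all cells for a time step, then move on. */
void run_space_first(double state[NCELLS])
{
    for (int t = 0; t < NSTEPS; t++) {
        for (int c = 0; c < NCELLS; c++) {
            vic_run(get_forcing(c, t), &state[c]);
        }
    }
}

/* Returns 1 if both orders give the same final state within a small tolerance. */
int drivers_agree(void)
{
    double a[NCELLS] = {0.0};
    double b[NCELLS] = {0.0};
    run_time_first(a);
    run_space_first(b);
    for (int c = 0; c < NCELLS; c++) {
        double diff = a[c] - b[c];
        if (diff < 0.0) {
            diff = -diff;
        }
        if (diff > 1e-12) {
            return 0;
        }
    }
    return 1;
}
```

A check along the lines of drivers_agree() is the kind of test that a test driver could run against the real core, comparing the output of two drivers for the same parameters and forcings.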

In both cases there would be a large amount of additional code between consecutive vic_ calls. This code would be specific to each driver. For example, before each call to vic_run(), atmospheric forcings would need to be updated, time-varying model parameters may need to be updated, and the model state may need to be updated in a data assimilation scheme. Whether predefined names should be used for each of these steps is up for discussion (as is the rest of this document).

VIC drivers may need to be implemented in Fortran to interact with other model components (for example as part of RASM). In that case it is important that the vic_ functions are callable from Fortran, in particular the vic_start(), vic_alloc(), vic_run(), and vic_finalize() functions. Alternatively, these functions may need to be wrapped in another set of functions that are callable from Fortran. I am not yet sure how this should be implemented, how driver-dependent it will be, or how it will affect the above functions.
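One possible shape for the wrapper option is sketched below in C. It is an assumption, not a decision: the wrapper name vic_run_f, its argument list, the fixed-size state array, and the toy vic_run() are all hypothetical. The sketch relies on two facts about Fortran and C interoperability: Fortran passes arguments by reference unless the VALUE attribute is used, so the wrapper takes pointers, and Fortran array indices conventionally start at 1, so the wrapper converts to a 0-based index. A matching Fortran 2003 interface using bind(C) is shown in the comment.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_CELLS 4

/* Toy persistent state owned by the C side (hypothetical layout). */
static double cell_state[MAX_CELLS];

/* Stand-in for the core routine; the real signature is still undecided. */
static int vic_run(double precip, double *state)
{
    *state += precip;
    return 0;
}

/*
 * Hypothetical Fortran-facing wrapper. A matching Fortran interface
 * could look like this:
 *
 *   interface
 *     subroutine vic_run_f(cell, precip, status) bind(C, name="vic_run_f")
 *       use iso_c_binding, only: c_int, c_double
 *       integer(c_int), intent(in)  :: cell
 *       real(c_double), intent(in)  :: precip
 *       integer(c_int), intent(out) :: status
 *     end subroutine vic_run_f
 *   end interface
 *
 * status is set to 0 on success and -1 on invalid input.
 */
void vic_run_f(const int *cell, const double *precip, int *status)
{
    if (status == NULL) {
        return;
    }
    if (cell == NULL || precip == NULL) {
        *status = -1;
        return;
    }
    int i = *cell - 1;  /* convert the 1-based Fortran index to 0-based */
    if (i < 0 || i >= MAX_CELLS) {
        *status = -1;
        return;
    }
    *status = vic_run(*precip, &cell_state[i]);
}
```

Keeping such wrappers in the driver directory rather than in the core would leave vic_run() itself unchanged, which is in line with the goal of a single shared core, but whether that is the right place for them is open for discussion.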

Source tree organization

In the current version of VIC, the source code is all located in a single directory. I propose that we separate the code into directory trees that reflect the code separation outlined above. For example:

|-drivers
|---classic
|-----include
|-----src
|---lis
|-----include
|-----src
|---rasm
|-----include
|-----src
|---test
|-----include
|-----src
|-vic_run
|---include
|---src

For all directories I propose that we cleanly split the header files from the source code (hence the include and src directories in each of the subdirectories). All the core VIC code would go in the vic_run directories, while all the other code would be specific to each driver. There must be a better way to ensure that code is reused, so I am looking for suggestions. For example, should the mtclim code go into the drivers/classic directory or somewhere else? How can we set it up so that read and write functions can be shared more easily? Alternatively, we could just keep the code split as above and allow for duplication, since each driver will be tailored to a specific environment and/or application.
