General, Output and Processing Instructions

Rose Pearson edited this page Jan 14, 2025 · 22 revisions

These options control the behaviour of DEM generation in each stage and are grouped under the general, output and processing keys.

General

The general section controls the general code flow associated with each stage in the framework for behaviour related to generating and manipulating the DEM. Defaults exist for all values. The accepted options are listed below. Note that None is specified as null in JSON (e.g. in the instruction files).

| keyword | type | default | description |
| --- | --- | --- | --- |
| z_labels | dict | {"waterways": "z", "rivers": "z", "lakes": "z", "stopbanks": "z", "ocean": None} | A dict specifying, for each feature type that can be included for hydrological conditioning, the column name holding the z values when these are not included within the geometry. If None is specified for one of the categories (e.g. ocean, rivers or waterways) it is assumed that the depth information is included in the geometry. |
| drop_offshore_lidar | bool or dict | True | Defines whether offshore LiDAR is discarded or kept. Set to True to discard offshore LiDAR that reflects the ocean surface. It also specifies whether any added background DEM is set to zero in the foreshore. If there are multiple LiDAR datasets, this can be defined as a dict of bools keyed by dataset name, e.g. "drop_offshore_lidar": {"dataset_1": true, "dataset_2": false} |
| zero_positive_foreshore | bool | True | Paired with drop_offshore_lidar. If set, any positive LiDAR values along the foreshore are replaced with zeros. |
| lidar_classifications_to_keep | list | [2] | Defines whether and how LiDAR points are filtered, so that only points with the specified classification values are retained. The standard LAS/LAZ classification values can be found in the LAS 1.4 specification. The subset of classifications used can also be found in the survey summary for OpenTopography datasets (e.g. Wellington_2013 or NZ20_Westport). |
| interpolation | dict | {"lidar": "idw", "rivers": "rbf", "waterways": "cubic", "ocean": "rbf", "lakes": "linear", "stopbanks": "nearest", "no_data": None} | Defines the interpolation method used for each data source category. The no_data option is applied at the end to any missing values in the final raster. The no_data options are: None, linear, nearest or cubic. The lidar options are: idw (inverse distance weighted), mean (arithmetic mean), median, linear (linear interpolation as calculated by scipy.interpolate.griddata), min (minimum value), max (maximum value), std (standard deviation of the elevations) and count (number of points in the grid cell). The other categories have the options: rbf, cubic and linear. |
| elevation_range | list | None | A list of the form [minimum_elevation, maximum_elevation], where the two values define the range of allowable elevations. If this is not defined then all elevations are kept. |
| download_limit_gbytes | float | 100 | |
| lidar_buffer | float | 0 | The number of cells around the LiDAR data over which to interpolate to any added coarse DEM values. The default of 0 means coarse DEM values are added directly next to LiDAR values. |
| filter_waterways_by_osm_ids | list | [] | |
| ignore_clipping | bool | False | If True, the LiDAR DEM is not clipped in the raw LiDAR generation stage. This will change the output if drop_offshore_lidar is also set. |
| compression | int | 1 | The level of compression applied to the final output netCDF file. This is ignored if the output files are in TIFF format, as only a fixed level of compression is applied to TIFF files. Common options are integer values from 1 through 9. |
| use_edge | dict | {"ocean": False, "lakes": False} | Defines whether the surrounding LiDAR elevations are used when interpolating across the ocean and lakes. |
| is_depth | dict | {"ocean": False} | Defines whether the ocean values are converted from depths to elevations relative to an assumed 0 m. |
| nearest_k_for_interpolation | dict | {"ocean": 40, "lakes": 500, "rivers": 100} | Defines the number of surrounding values used when interpolating from the ocean, lake or river dataset. |
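
As a rough illustration of how a few of the options above might be written, the sketch below builds a general section as a Python dict and serialises it to JSON. The dataset names are taken from the drop_offshore_lidar example, the elevation range values are placeholders, and where this section sits within a full instruction file is not shown and is left as an assumption.

```python
import json

# Hypothetical general section using only keywords described above.
general = {
    # One bool per LiDAR dataset, keyed by dataset name (names are placeholders)
    "drop_offshore_lidar": {"dataset_1": True, "dataset_2": False},
    "lidar_classifications_to_keep": [2],
    # The documented default interpolation settings, written out in full
    "interpolation": {
        "lidar": "idw",
        "rivers": "rbf",
        "waterways": "cubic",
        "ocean": "rbf",
        "lakes": "linear",
        "stopbanks": "nearest",
        "no_data": None,
    },
    # [minimum_elevation, maximum_elevation]; placeholder values
    "elevation_range": [-10, 4000],
}

# json.dumps writes Python None/True/False as JSON null/true/false
general_json = json.dumps({"general": general}, indent=2)
print(general_json)
```

The printed text shows the None and bool values in the form expected in a JSON instruction file, i.e. null, true and false.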

Output [Required]

The output section contains information about the resolution and CRS of the DEM generated by the GeoFabricsGenerator class. All output keywords are mandatory unless specified otherwise. Accepted keywords are:

  • crs [Optional] - The CRS is optional with default values of horizontal=2193 (NZTM2000 - EPSG:2193) and vertical=7839 (NZVD2016 - EPSG:7839)
  • grid_params - The resolution is not optional and must be specified. This defines the DEM grid geometry in metres, where the grid is square.
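
A minimal sketch of an output section follows, assuming that horizontal and vertical are nested under crs and that resolution is nested under grid_params. The nesting is an assumption based on the descriptions above, and the 10 m resolution is a placeholder.

```python
import json

# Hypothetical output section; the nesting of keys is assumed, not confirmed.
output = {
    # Optional: these are the stated defaults (NZTM2000 and NZVD2016)
    "crs": {"horizontal": 2193, "vertical": 7839},
    # Required: the grid resolution in metres (placeholder value)
    "grid_params": {"resolution": 10},
}

output_json = json.dumps({"output": output}, indent=2)
print(output_json)
```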

Processing [Required]

The processing section contains information used by Dask to allocate CPU cores and to chunk up the DEM into separate processing tasks.

  • chunk_size [Default is None] - This is the number of DEM pixels to have along each side of a chunk of the DEM that is processed separately. This equates to a square area with sides of resolution x chunk_size. Reduce the chunk size if you are getting memory errors in the log file. A good initial value is 1 to 1.5x a single LiDAR tile (e.g. a 1km x 1km LiDAR tile with a resolution of 10m will equate to a chunk_size of 100). The default is None, which will only work if there is only one LiDAR file being processed.
  • number_of_cores [Default is 1] - The number of separate CPU cores or processes to run at the same time. This should not exceed the number of cores on your device. If running on your own device (i.e. not NeSI), it can be good to leave 1-2 cores unused by geofabrics for other background tasks.
  • memory_limit [Default is 10GiB] - The maximum memory to be used by a single Dask task.
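
The chunk size guidance above can be sketched as a small calculation. The helper suggest_chunk_size below is hypothetical and not part of geofabrics; it simply divides the tile side length by the resolution and applies the suggested 1 to 1.5x scale. The core count follows the advice to leave a couple of cores free on a personal device.

```python
import json
import math
import os


def suggest_chunk_size(tile_side_m: float, resolution_m: float, scale: float = 1.0) -> int:
    """Hypothetical helper: pixels per chunk side for a chunk of `scale` x one tile side."""
    if tile_side_m <= 0 or resolution_m <= 0 or scale <= 0:
        raise ValueError("tile_side_m, resolution_m and scale must all be positive")
    return math.ceil(scale * tile_side_m / resolution_m)


# A 1 km x 1 km LiDAR tile at 10 m resolution gives a chunk_size of 100
chunk_size = suggest_chunk_size(tile_side_m=1000, resolution_m=10)

# Leave two cores unused for background tasks, but always use at least one
number_of_cores = max(1, (os.cpu_count() or 1) - 2)

# memory_limit is omitted here so that its default applies
processing = {"chunk_size": chunk_size, "number_of_cores": number_of_cores}
print(json.dumps({"processing": processing}, indent=2))
```

Passing scale=1.5 gives the upper end of the suggested range for the same tile and resolution.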