-
Notifications
You must be signed in to change notification settings - Fork 4
JSON configuration
This is a quick overview of the most popular JSON attributes (JSON sections/keys) used for NMDataParser configuration. Full description of the NMDataParser JSON configuration is given in wiki section Parser configuration via JSON
The JSON config file consists of several major sections (objects on the first level of the JSON schema):
Defines basic info for the data format being imported. Intended mainly for internal usage. This section contains the options: NAME, VERSION and TYPE.
defines the basic access to the excel file
is used to set up simultaneous access to several sheets of a given excel file.
Defines the reading (data locations) of the basic fields of a substance record.
Defines how to read an array of ProtocolApplication objects which are included in the SubtsanceRecord object defined in the previous section.
is a key concept for the JSON configuration of excel parser reader. Excel data location is required for each data component of the Substance representation e.g. PUBLIC_NAME
, CITATION_TITLE
, ENDPOINT
etc.
The Excel data location is defined by means of several options:
Defines the iteration mode - how the data from this location is accessed. The fFollowing iteration modes are supported:
ROW_SINGLE - data is accessed treating each excel table row as a separate data unit (e.g. Substance record),
ROW_MULTI_FIXED - a fixed number of rows are treated as a separate Substance record,
ROW_MULTI_DYNAMIC - a dynamic number of rows are used to load a Substance record (the number of rows may vary for each record),
ABSOLUTE_LOCATION - the data component is read from absolute location fro the excel file e.g. sheet, row and column must be defined
JSON_VALUE - the data component is taken directly from the JSON config file
JSON_REPOSITORY - the data component is taken directly from the JSON config file but special section REPOSOTORY is used.
define respectively excel sheet, column and row for reading data from a single excel cell. Depending on the ITERATION mode, some of these fields are not required (if supplied they are ignored). For example in ROW_SINGLE mode only column index will be used while in ABSOLUTE_LOCATION mode all of the above inddices will be used. When particular index is needed for the current iteration mode but not supplied, a parsing error is obtained. Typically the default values of the fields ITERATION and SHEET_INDEX (when not supplied explicitly) are taken globally from the DATA_ACCESS section for the primary iteration sheet or from the corresponding PARALLEL_SHEET.
Excel data location may define an array of cells. For this purpose boolean JSON field IS_ARRAY is set to true (default value if missing is false). Accordingly fields: ROW_INDICES and COLUMN_INDICES are used to define arrays of row and column indices. Since all row/column indices are given explicitly it is not needed to be consequent numbers hence more complex sets of cells can set this way.
Column indices may be defined in two ways by integers (designating the column numbers) or as column labels (as used in the Excel standart addressing). For example: "COLUMN_INDICES" :[2,3,4] or "COLUMN_INDICES" :["B","C","D"] are both valid specifications.
Defines the basic parameters for data access and iteration of the primary sheet. This section describes the basic approach for reading data i.e. these are the default reading parameters. When particular parameter in a given Excel Data Location is ommited the default value is taken from this section. Data access for other sheets (addional secondary sheets) is set in JSON array section PARALLEL_SHEETS analogously to this section.
ITERATION Defines the iteration mode. Possible iteration modes were already described above: ROW_SINGLE, ROW_MULTI_FIXED, ROW_MULTI_DYNAMIC, ABSOLUTE_LOCATION, JSON_VALUE, JSON_REPOSITORY, VARIABLE.
SHEET_INDEX The primary sheet for iteration
SHEET_NAME The primary sheet name
START_ROW The starting row for iteration
START_HEADER_ROW The first (starting) header row
END_HEADER_ROW The last (ending) header row
ALLOW_EMPTY Flag that defines whether empty cells are allowed. Default value is true
RECOGNITION The mode for sheet/column/row recognition. These elements can be recognized by index or by name.
DYNAMIC_ITERATION Defines how dynamic iteration is performed in mode ROW_MULTI_DYNAMIC. Several rows are read at once where the criterion for row group recognition is: NEXT_NOT_EMPTY or NEXT_DIFFERENT_VALUE.
DYNAMIC_ITERATION_COLUMN_INDEX The index of the column used for dynamic iteration.
VARIABLES is a subsection that defines an array of excel locations that are read and stored into a work list of variables later used in the reading process. The variables can be used for more complex routing of data e.g. one value can be used/loaded in several components of the substance record.
This is an array of sections similar to section DATA_ACCESS that define the simultaneous reading of several sheets together with the primary sheet.
Defines the excel locations for reading of the basic fields of a Substance Record:
SUBSTANCE_NAME
, OWNER_NAME
, SUBSTANCE_TYPE
, OWNER_UUID
, SUBSTANCE_UUID
, PUBLIC_NAME
, COMPOSITION
This is an array of sections , defining the excel data locations for reading of Protocol Application data. Each section includes following fields: CITATION_TITLE
, CITATION_YEAR
, CITATION_OWNER
, INTERPRETATION_RESULT
, INTERPRETATION_CRITERIA
, PROTOCOL_GUIDELINE
,PARAMETERS
(an array of data locations), EFFECTS
(an array of sections)
This is an array of sections. Each section defines data structures (effect record) for particular measurements and includes following excel data locations: SAMPLE_ID
, ENDPOINT
, LO_VALUE
, UP_VALUE
, ERR_VALUE
, TEXT_VALUE
, VALUE
, LO_QUALIFIER
, UP_QUALIFIER
, ERR_QUALIFIER
, UNIT
, CONDITIONS
(an array of data locations)
A JSON structure for defining preconfigured data (e.g. protocol, parameters) to be read directly from the JSON file into the data classes.
- Home
- Quick start
- Data templates
- eNanoMapper Data Model
- Parser configuration via JSON
- Available templates
- How to
- Additional information