Fabian Cretton edited this page Nov 21, 2024 · 25 revisions

The Cube Designer has two goals:

  1. Add the necessary metadata to the initial cube, which either results from the latest cube transformation or was imported from another system. The metadata is used by the visualize.admin.ch tool and for the automated publication to opendata.swiss.

  2. Verify that all data was imported correctly and check whether any information is missing.

For projects created from CSV files, the cube is displayed as a table and is the result of the transform operation based on the Cube table defined in the CSV Mapping step. Each observation is one table row.

Adding or Adapting the Metadata

Each cube has metadata at the level of the cube itself. The cube is also its own dataset in the database and, once published, on opendata.swiss. Furthermore, it is crucial to complete the metadata for each dimension so that external users and visualize.admin.ch can interpret the content of each dimension.

Dataset / Cube Metadata

Click the 🖊️ icon to the right of the cube title to access the metadata of the cube / dataset. The fields have descriptions that give hints on the expected content. If you are unsure how to use a field, you can also open other projects for comparison.

Most of the fields are described in the interface. Please provide as much information as possible; some fields are mandatory.

Status

The publication status influences the visibility of a cube in the listings of the applications it is published to. The behaviour when statuses are published in different orders is listed below:

  • Draft only

    The Draft is shown in the applications in "draft" status.

  • Published after Draft

    The published version supersedes the draft version; the draft version is no longer listed in applications.

  • Draft after Published

    The published version remains the main version; the new draft version is shown in draft status.
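The three cases above can be sketched as a small function. This is only an illustration of the listing rules as described here, not Cube Creator code: the function name `listed_versions` and the status strings are hypothetical, and the handling of any sequence other than the three listed is an assumption.

```python
def listed_versions(publication_order):
    """Return the statuses an application would list, in display order.

    publication_order is the sequence of statuses in which versions of one
    cube were published, e.g. ["draft", "published"].
    """
    listed = []
    for status in publication_order:
        if status == "published":
            # A published version supersedes everything listed before it.
            listed = ["published"]
        elif status == "draft":
            # A draft is shown next to an existing published version
            # (assumption: a newer draft replaces an older draft).
            listed = [s for s in listed if s != "draft"] + ["draft"]
        else:
            raise ValueError(f"unknown status: {status!r}")
    return listed


print(listed_versions(["draft"]))
print(listed_versions(["draft", "published"]))
print(listed_versions(["published", "draft"]))
```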

Publish to

Currently, two applications can be explicitly selected as publication targets, in addition to LINDAS, which is the public database.

visualize.admin.ch

If this flag is added, the published cube will be shown in the listing of the datasets on visualize.admin.ch.
Some metadata influence the possible visualizations and options:

  • It is crucial to complete the metadata per dimension, so that visualize.admin.ch can interpret the content of each dimension
  • The scale of measure determines which charts are proposed. See the Scale of Measure section below for further details
  • A "creator" is recommended; it is set in the metadata of the cube, in the "Organisation" field of the "Opendata.swiss" tab
  • The cube can be checked in the cube-checker tool provided by visualize.admin.ch.
    In that tool, choose the Data source (TEST/INT/PROD) corresponding to the platform on which the cube was published.
    Then provide the URL of the cube version that was published, for instance:
    https://environment.ld.admin.ch/foen/UBD003002/5
    The URL of that cube version can be seen in the SPARQL query result, which opens when you press the "SPARQL end-point with graph preselection" button in the publication job's details:
    [Screenshot: "SPARQL end-point with graph preselection" button]
    Query result:
    [Screenshot: query result]
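When copying a cube version URL into the cube-checker, it can help to confirm that the URL really points at a version and not at the cube base. The following is a minimal sketch, assuming that the last path segment is the numeric version, as in the example URL above. The helper name `split_cube_version` is hypothetical and not part of any tool mentioned here.

```python
from urllib.parse import urlsplit


def split_cube_version(url):
    """Split a cube version URL into (cube base URL, version number).

    Assumption: the last path segment is the numeric version.
    """
    parts = urlsplit(url)
    base_path, _, version = parts.path.rstrip("/").rpartition("/")
    if not version.isdigit():
        raise ValueError(f"no numeric version at the end of {url!r}")
    return f"{parts.scheme}://{parts.netloc}{base_path}", int(version)


base, version = split_cube_version(
    "https://environment.ld.admin.ch/foen/UBD003002/5"
)
print(base, version)
```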

opendata.swiss

If this flag is added, the published cube will be harvested and listed on opendata.swiss. Compare also how the fields are used in the opendata.swiss handbook.

Dimensions

Specify the details for every dimension inside your cube with the 🖊️ icon for each dimension (presented as a column). The metadata provided on the dimension influences the possible charts available on visualize.admin.ch.

Name

Each dimension should have a fitting name that describes its content. For Key Dimensions this usually comes naturally, as the name is the class of the entries (e.g. Cantons for Berne, Zurich, ...).

For Measurement Dimensions it is important to provide a name that defines the values (e.g. population, temperature, spending). Even if the meaning is often clear in the context of a cube, do not use generic names like values, measurement or count. A cube can also have multiple Measurement Dimensions, which then need to be distinguished by name.

Description

The description allows for a full sentence and potentially more detailed information on the content of the dimension.

Languages

You should add names and descriptions in the official languages and in English. Metadata fields should in most cases be provided in multiple languages. Each language is added as a new string with its own language tag.
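The idea of one string per language tag can be sketched as follows. This is an illustration only: the set of required tags (`de`, `fr`, `it`, `en`) is an assumption, and the helper names are hypothetical. The `"text"@lang` notation is the usual way language-tagged strings are written in RDF Turtle.

```python
# Assumption: these tags stand in for the languages a project requires.
REQUIRED_LANGUAGES = ("de", "fr", "it", "en")


def missing_languages(labels, required=REQUIRED_LANGUAGES):
    """Return the required language tags that have no non-empty string."""
    return [lang for lang in required if not labels.get(lang, "").strip()]


def tagged_literals(labels):
    """Render each string as a language-tagged literal, e.g. "Canton"@en."""
    literals = []
    for lang, text in sorted(labels.items()):
        escaped = text.replace("\\", "\\\\").replace('"', '\\"')
        literals.append(f'"{escaped}"@{lang}')
    return literals


# Hypothetical dimension name that has only been entered in two languages.
name = {"de": "Kanton", "en": "Canton"}
print(missing_languages(name))
print(tagged_literals(name))
```

A check like `missing_languages` mirrors what you would otherwise do by switching the interface language and looking for empty names.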

Dimension Type

The dimension type is a crucial setting: it is needed to correctly interpret how the cube can be filtered.

Measurement Dimension

The Measurement Dimension provides the data, values or observations in the cube. The values are mostly integers or decimals, but can also be ordinal or even nominal concepts.

Key Dimension

A Key Dimension will be used to construct the filter to select a specific slice in the cube for the visualization. The filter for a Key Dimension cannot be deleted by the user in the visualization.

If neither of the above dimension types is chosen, the dimension will be provided as an optional filter.

Scale of Measure

The Scale of Measure classifies the nature of the information in a Dimension. On the one hand, this information helps data consumers better understand your data. On the other hand, it influences which charts are proposed on visualize.admin.ch.

Nominal (named variables)

Most Concepts are of nominal nature. They can be named but not put in a natural order (e.g. cantons, colors, woods).

Ordinal (+ ordered variables)

If Concepts can be put in an order (e.g. big, medium, small or urgent, normal, low-priority), they are of ordinal nature. It is important that Concepts of ordinal nature provide a schema:position in the concept table so that the order is machine readable.
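Why an explicit position matters can be shown with a short sketch. The concept names and position values below are hypothetical. The point is only that sorting by name does not recover the intended order, while sorting by a position value does.

```python
# Hypothetical ordinal concepts; "position" plays the role of schema:position.
concepts = [
    {"name": "medium", "position": 2},
    {"name": "high", "position": 3},
    {"name": "low", "position": 1},
]


def by_position(items):
    """Order concept names by their explicit position."""
    return [c["name"] for c in sorted(items, key=lambda c: c["position"])]


def by_name(items):
    """Order concept names alphabetically (what a tool can do without positions)."""
    return sorted(c["name"] for c in items)


print(by_position(concepts))
print(by_name(concepts))
```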

Interval (+ proportionate interval between variables)

The values of Measurement Dimensions are in general of interval or ratio nature. Most values about points in time and durations are also on the interval scale.

Interval means that the differences between values are proportional (e.g. temperatures in Celsius: 10°C, 20°C, 30°C, dates: 2001, 2002, 2003, or directions from north: 10°, 20°, 30°). However, you cannot state that 20°C is double 10°C or that the year 2000 is double the year 1000.

Ratio (+ can accommodate absolute zero)

Measurement Dimensions of types such as mass, length, duration, plane angle, energy and electric charge, or temperature in kelvin, are of ratio nature. The zero point is meaningful, so we can state that 2 kg is twice 1 kg.

If in doubt between interval and ratio, choose interval.
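The difference between the interval and ratio scales can be checked with a small calculation. The sketch below converts two Celsius temperatures to kelvin: the difference is the same on both scales, but the ratio only becomes meaningful on the kelvin scale, which has an absolute zero. The variable names are illustrative.

```python
import math


def celsius_to_kelvin(celsius):
    """Convert a temperature from degrees Celsius to kelvin."""
    return celsius + 273.15


low_c, high_c = 10.0, 20.0
low_k, high_k = celsius_to_kelvin(low_c), celsius_to_kelvin(high_c)

# Differences agree on both scales (interval property).
difference_c = high_c - low_c
difference_k = high_k - low_k
print(math.isclose(difference_c, difference_k))

# Ratios do not agree: 2.0 in Celsius is an artifact of the arbitrary zero,
# while the kelvin ratio (about 1.035) is the physically meaningful one.
ratio_c = high_c / low_c
ratio_k = high_k / low_k
print(ratio_c, round(ratio_k, 3))
```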

Unit

Every Measurement Dimension needs to provide a Unit. The Cube Creator provides an extensive list of units, based on QUDT, from which you can choose.

Percentages and Counts

Please make sure that you select:

  • Number (#) for counts and statistics
  • Percent (%) for percentages

Missing units can be requested by creating an issue.

Data Kind

If a dimension provides a specific kind of data, e.g. of a temporal or geographical nature, the data kind can be specified.

Geographic coordinates

If a dimension provides concepts with geographic coordinates, specify this data kind.

Geographic shape

If a dimension provides concepts with geographic shapes, specify this data kind. Only concepts from Shared Dimensions provide shapes.

Time description

If a dimension has a temporal data type (date, dateTime), specify this data kind. Additionally, you can hint at a suitable Time precision for visualisation purposes.

Linking to Shared Dimensions

Only nominal and ordinal concepts can be mapped to Shared Dimensions. Therefore, you first need to make sure that the Scale of Measure is selected for the dimension. Once this is done, a 🔗 symbol shows up in the row header. Clicking the 🔗 opens a dialog in which you can select a Shared Dimension and map the input values to the concepts in the Shared Dimension.

After changing the mapping, you need to re-transform the cube for the changes to take effect. Once the transformation is done, the strings should be replaced by concepts.

It is possible to link to multiple Shared Dimensions within a single concept (e.g. if the data contains different levels of administrative units, like municipalities, cantons and countries).
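Conceptually, the mapping replaces each input string with the identifier of a concept in the Shared Dimension. The sketch below illustrates that substitution and how unmapped input values could be spotted. The IRIs under `example.org` are placeholders, the function name is hypothetical, and leaving unmapped values as plain strings is an assumption of this sketch, not a statement about how the transformation behaves.

```python
# Placeholder concept IRIs; real Shared Dimension IRIs will differ.
CANTON_MAPPING = {
    "Zurich": "https://example.org/canton/zurich",
    "Bern": "https://example.org/canton/bern",
}


def apply_mapping(values, mapping):
    """Substitute input strings with concept IRIs and report unmapped strings."""
    mapped, unmapped = [], []
    for value in values:
        if value in mapping:
            mapped.append(mapping[value])
        else:
            # In this sketch an unmapped value is kept as a plain string.
            mapped.append(value)
            if value not in unmapped:
                unmapped.append(value)
    return mapped, unmapped


mapped, unmapped = apply_mapping(["Zurich", "Bern", "Berne"], CANTON_MAPPING)
print(mapped)
print(unmapped)
```

A spelling variant such as "Berne" shows up in `unmapped`, which is the kind of value to look for in the mapping dialog before re-transforming.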

Hierarchies: if a hierarchy description already exists for that Shared Dimension and you want to include the hierarchy in your published cube, the hierarchy description must be copied by editing the dimension and filling in the "copy from" field on the "Hierarchy" tab:
[Screenshot: "copy from" field on the "Hierarchy" tab]

The published cube will include the hierarchy information (for example the links between cantons and municipalities), and this can be useful for tools such as Visualize to organize the data.

Check correctness of the Cube

The second purpose of the Cube Designer is to manually inspect the correctness of the imported data.

You can manually check the following:

  • The cube is complete (all data was converted).
    It is possible to page through all generated observations. For a quick check whether all data was converted, compare the total number of observations with the number of lines of the CSV file used for the Cube Table in the CSV Mapping step. The total number of observations is displayed under the table, next to the paging options.
  • The links to other tables are correctly transformed. Each of those values can be clicked to display the details.
  • All languages are present. This can be done by changing the language on the top right and verifying all concepts.
  • All metadata is added and translated into all languages for the Dataset/Cube and for each Dimension.
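The completeness check can also be done outside the interface by counting the data rows of the source CSV and comparing that number with the total shown under the table. The following is a minimal sketch, assuming the CSV has one header line and one observation per remaining row. The function names and the sample data are hypothetical.

```python
import csv
import io


def count_csv_observations(csv_text, has_header=True):
    """Count the data rows of a CSV document, ignoring blank lines."""
    reader = csv.reader(io.StringIO(csv_text, newline=""))
    rows = [row for row in reader if row]
    if has_header and rows:
        return len(rows) - 1
    return len(rows)


def is_complete(csv_text, total_observations, has_header=True):
    """Compare the CSV row count with the total shown in the Cube Designer."""
    return count_csv_observations(csv_text, has_header) == total_observations


# Hypothetical sample standing in for the file used for the Cube Table.
sample = "canton,year,population\nZurich,2020,100\nBern,2020,80\n"
print(count_csv_observations(sample))
print(is_complete(sample, total_observations=2))
```

Using `csv.reader` instead of counting raw lines keeps the count correct when a quoted field contains a line break.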