From f896f5d1f2561f11966fad37277de2368ef08bfa Mon Sep 17 00:00:00 2001
From: Remi Gau
Date: Sat, 23 Mar 2024 10:31:46 -0400
Subject: [PATCH 01/18] split physio metadata tables

---
 ...ological-and-other-continuous-recordings.md | 18 ++++++++++++++++--
 src/schema/rules/sidecars/continuous.yaml      |  2 +-
 2 files changed, 17 insertions(+), 3 deletions(-)

diff --git a/src/modality-specific-files/physiological-and-other-continuous-recordings.md b/src/modality-specific-files/physiological-and-other-continuous-recordings.md
index 38f9842ae3..6be9665321 100644
--- a/src/modality-specific-files/physiological-and-other-continuous-recordings.md
+++ b/src/modality-specific-files/physiological-and-other-continuous-recordings.md
@@ -1,5 +1,7 @@
 # Physiological and other continuous recordings
 
+## Physiological recordings
+
 Physiological recordings such as cardiac and respiratory signals and other
 continuous measures (such as parameters of a film or audio stimuli) MAY be
 specified using two files:
@@ -51,7 +53,19 @@ measurements in a different sampling frequency.
 Physiological recordings (including eyetracking) SHOULD use the `_physio`
 suffix, and signals related to the stimulus SHOULD use `_stim` suffix.
 
-The following table specifies metadata fields for the `*_.json` file.
+The following tables specify metadata fields for the `*_.json` file.
+
+
+{{ MACROS___make_sidecar_table(["continuous.Continuous"]) }}
+
+## Hardware information
-{{ MACROS___make_sidecar_table(["continuous.Continuous", "continuous.Physio"]) }}
+{{ MACROS___make_sidecar_table(["continuous.PhysioHardware"]) }}
 
 Additional metadata may be included as in
 [any TSV file](../common-principles.md#tabular-files) to specify, for
diff --git a/src/schema/rules/sidecars/continuous.yaml b/src/schema/rules/sidecars/continuous.yaml
index 2e74912f4f..3023427138 100644
--- a/src/schema/rules/sidecars/continuous.yaml
+++ b/src/schema/rules/sidecars/continuous.yaml
@@ -15,7 +15,7 @@ Continuous:
     Columns: required
 
 # Other recommended metadata for physiological data
-Physio:
+PhysioHardware:
   selectors:
     - suffix == "physio"
   fields:

From ee4966b48240102d00738a2b57c0fe47b76c2900 Mon Sep 17 00:00:00 2001
From: Oscar Esteban
Date: Mon, 25 Mar 2024 07:34:35 +0100
Subject: [PATCH 02/18] enh: add compressed TSV files to the common principles

Their description is hedged within the physiological recordings so
upcasting them to the common principles seems important.
---
 src/common-principles.md                      | 26 +++++++++++++++++++
 ...logical-and-other-continuous-recordings.md | 24 +++++------------
 2 files changed, 33 insertions(+), 17 deletions(-)

diff --git a/src/common-principles.md b/src/common-principles.md
index 1b8c09a407..df007b12d7 100644
--- a/src/common-principles.md
+++ b/src/common-principles.md
@@ -542,6 +542,32 @@ like in the example below.
 }
 ```
 
+### Compressed tabular files
+
+Large tabular information such as physiological recordings MAY be stored with
+[compressed tab-delineated (TSVGZ) files](glossary.md#tsvgz-extensions).
+Rules for formatting plain-text tabular files apply to TSVGZ files with three exceptions:
+
+1. The contents of TSVGZ files SHOULD be compressed with
+   [gzip](https://datatracker.ietf.org/doc/html/rfc1952).
+1. Compressed tabular files SHOULD NOT contain a header in the first row
+   indicating the column names.
+1. TSVGZ files SHOULD be accompanied by a JSON file with the same name as their
+   corresponding tabular file but with a `.json` extension.
+
+???+ warning "Columns of TSVGZ files MUST be defined in the corresponding JSON sidecar and the tabular content MUST NOT include a header line."
+
+    In contrast to plain-text TSV files, compressed tabular files
+    MUST NOT include a header line.
+    Column names MUST be specified in the JSON file following the
+    [`Columns` metadata](glossary.md#columns-metadata) specifications.
+    As plain-text tabular data, column names MUST NOT be blank (that is, an empty string),
+    and MUST NOT be duplicated within a single JSON file describing a TSVGZ file.
+
+    TSVGZ are header-less to improve compatibility with existing software
+    (for example, FSL, or PNM), and to facilitate the support for other file formats
+    in the future.
+
 ### Key-value files (dictionaries)
 
 JavaScript Object Notation (JSON) files MUST be used for storing key-value
diff --git a/src/modality-specific-files/physiological-and-other-continuous-recordings.md b/src/modality-specific-files/physiological-and-other-continuous-recordings.md
index 38f9842ae3..f3bc4f27e2 100644
--- a/src/modality-specific-files/physiological-and-other-continuous-recordings.md
+++ b/src/modality-specific-files/physiological-and-other-continuous-recordings.md
@@ -2,12 +2,9 @@
 
 Physiological recordings such as cardiac and respiratory signals and other
 continuous measures (such as parameters of a film or audio stimuli) MAY be
-specified using two files:
-
-1. a [gzip](https://datatracker.ietf.org/doc/html/rfc1952)
-   compressed TSV file with data (without header line)
-
-1. a JSON file for storing metadata fields (see below)
+specified using a [compressed tabular file](../common-principles.md#compressed-tabular-files)
+([TSVGZ file](../glossary.md#tsvgz-extensions)) and a corresponding
+JSON file for storing metadata fields (see below).
 
 !!! example "Example datasets"
 
@@ -38,8 +35,10 @@ before the suffix.
 For example for the file `sub-control01_task-nback_run-1_bold.nii.gz`,
 `` would correspond to `sub-control01_task-nback_run-1`.
 
-Note that when supplying a `*_.tsv.gz` file, an accompanying
-`*_.json` MUST be supplied as well.
+!!! warning "TSVGZ files SHOULD NOT include a header line (as established by the [common-principles](../common-principles.md#compressed-tabular-files))"
+
+    As a result, when supplying a `*_.tsv.gz` file, an accompanying
+    `*_.json` MUST be supplied as well.
 
 The [`recording-