Add stimuli specifications to BIDS #2

Merged
merged 10 commits into from
Dec 22, 2024
138 changes: 138 additions & 0 deletions src/modality-specific-files/stimuli.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,138 @@
# Stimuli

Stimulus files should be stored in the `/stimuli` directory under the root directory of the dataset. The `/stimuli` directory can contain subdirectories to organize the stimulus files. Stimulus files SHOULD follow the BIDS naming conventions and SHOULD be referenced in the `events.tsv` file using the `stim_id` column. The `stim_id` column represents unique identifiers for the stimuli. Additional information about the stimuli and their annotations can be provided in the `stimuli.tsv`, `stimuli.json`, `annotations.tsv`, `annotations.json`, and `stim-<label>.json` files.

Standardizing stimulus files and their annotations within the BIDS specifications offers several advantages:

1. **Consistency**: Ensures that stimulus files are stored and referenced consistently across different datasets.
2. **Reusability**: Facilitates the reuse of stimulus files and annotations in other studies by providing a standardized structure.
3. **Efficiency**: Reduces redundancy by avoiding the need to replicate annotations across subjects, modalities, tasks, and runs.
4. **Flexibility**: Allows for easy modification of annotations by updating a single file, enabling the reuse of datasets with alternative annotations.

To preserve backward compatibility with existing datasets (see the Legacy section below), the use of these specifications for the `/stimuli` directory and the `stim_id` column in the `events.tsv` files is RECOMMENDED but not required. Following these guidelines helps ensure that stimulus files and their annotations are stored and referenced consistently across datasets, which facilitates data sharing, reuse, and reproducibility.

<!-- This block generates a file tree.
A guide for using macros can be found at
https://github.com/bids-standard/bids-specification/blob/master/macros_doc.md
-->
{{ MACROS___make_filetree_example({
    "stimuli": {
        "stimuli.tsv": "",
        "stimuli.json": "",
        "[stim-<label>[_part-<label>]_<suffix>.<extension>]": "",
        "[stim-<label>[_part-<label>]_<suffix>.json]": "",
        "[[stim-<label>_]annotations.tsv]": "",
        "[[stim-<label>_]annotations.json]": "",
        "[stim-<label>[_part-<label>]_annot-<label>_events.tsv]": "",
        "[stim-<label>[_part-<label>]_annot-<label>_events.json]": ""
    }
}) }}

Note: The presence of a `stimuli.tsv` file indicates that the content of the `/stimuli` directory follows this BIDS specification for stimulus organization. This structure is planned to become mandatory in BIDS 2.0.

## Referencing Stimulus Identifiers in `events.tsv`

To reference stimulus identifiers in the `events.tsv` file, use the `stim_id` column. Each value in the `stim_id` column should be the unique identifier of a stimulus in the `/stimuli` directory. A `stim_id` refers to all files (both stimulus and annotation files) that share the same stimulus ID.

Example `events.tsv` file:

| onset | duration | trial_type | response_time | stim_id |
|-------|----------|------------|---------------|---------|
| 1.23 | 0.65 | start | 1.435 | stim-\<label\> |
| 5.65 | 0.65 | stop | 1.739 | stim-\<label\> |
| 12.1 | 2.35 | n/a | n/a | stim-\<label\> |

In the accompanying JSON sidecar, the `stim_id` column might be described as follows:

```JSON
{
    "stim_id": {
        "LongName": "Stimulus identifier",
        "Description": "Represents a unique identifier for the stimulus presented at the given onset time."
    }
}
```

## The stimulus file and JSON sidecar

Stimulus files can be of various types including audio, image, video, and combined audiovideo formats. Each stimulus file SHOULD have an accompanying JSON sidecar file containing metadata about the stimulus. The structure of the stimulus file name and the JSON sidecar file name SHOULD follow the BIDS naming conventions as below:

- Stimulus file: `stim-<label>[_part-<label>]_<suffix>.<extension>`
- JSON sidecar file: `stim-<label>[_part-<label>]_<suffix>.json`

The stimulus file names MUST start with `stim-` followed by a unique label, and an optional part label if the stimulus is divided into parts. `stim-` is the standard entity for stimulus files; it indicates that the files do not belong to a specific subject or participant but are likely to be used across subjects throughout the experiment. The suffix SHOULD describe the type of stimulus (such as audio, image, video, audiovideo) and the extension MUST indicate the file format. The JSON sidecar file MUST have the same name as the stimulus file but with a `.json` extension. Here are the allowed suffixes and extensions for the stimulus files:

| Suffix       | Extensions              | Description                          |
|--------------|-------------------------|--------------------------------------|
| `audio`      | .wav, .mp3, .aac, .ogg  | Audio-only stimulus files            |
| `image`      | .jpg, .png, .svg        | Static visual stimulus files         |
| `video`      | .mp4, .avi, .mkv, .webm | Video-only stimulus files            |
| `audioVideo` | .mp4, .avi, .mkv, .webm | Combined audio-visual stimulus files |
<!-- | Tactile | .ros | Robot Operating System program files for tactile stimulation | -->
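As a rough illustration of the naming rule above, the following Python sketch parses a stimulus file name and checks its suffix and extension against the table. The helper name `parse_stimulus_filename` is hypothetical, and the regular expression assumes that labels are alphanumeric, as is usual for BIDS entity labels; it is not part of the specification or of any validator.

```python
import re

# Suffix-to-extension pairs copied from the table above.
ALLOWED_EXTENSIONS = {
    "audio": {".wav", ".mp3", ".aac", ".ogg"},
    "image": {".jpg", ".png", ".svg"},
    "video": {".mp4", ".avi", ".mkv", ".webm"},
    "audioVideo": {".mp4", ".avi", ".mkv", ".webm"},
}

# Pattern for stim-<label>[_part-<label>]_<suffix>.<extension>
# Assumption: labels contain only letters and digits.
STIM_NAME = re.compile(
    r"^stim-(?P<stim>[A-Za-z0-9]+)"
    r"(?:_part-(?P<part>[A-Za-z0-9]+))?"
    r"_(?P<suffix>[A-Za-z]+)"
    r"(?P<extension>\.[A-Za-z0-9]+)$"
)


def parse_stimulus_filename(name):
    """Return the entities of a stimulus file name, or None if it is not valid."""
    match = STIM_NAME.match(name)
    if match is None:
        return None
    parts = match.groupdict()
    if parts["suffix"] not in ALLOWED_EXTENSIONS:
        return None
    # The JSON sidecar shares the stimulus file's name, so .json is accepted too.
    allowed = ALLOWED_EXTENSIONS[parts["suffix"]] | {".json"}
    if parts["extension"] not in allowed:
        return None
    return parts


print(parse_stimulus_filename("stim-book_part-chapter1_audio.wav"))
print(parse_stimulus_filename("stim-face01_image.mp4"))  # extension does not fit the suffix
```

The function returns `None` rather than raising, so it can be used to filter a directory listing.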

If distribution restrictions prevent including the actual stimulus file, the JSON sidecar SHOULD still be present with appropriate metadata describing the stimulus.
For stimuli that can be described in a table format (such as image datasets), the `stimuli.tsv` and `stimuli.json` files can be used to provide information about the stimuli based on their `stim_id`; in that case the presence of the stimulus file and JSON sidecar is OPTIONAL.

The following table describes the REQUIRED, RECOMMENDED, and OPTIONAL fields for the `stim-<label>.json` file:

<!-- This block generates a metadata table.
These tables are defined in
src/schema/rules/sidecars
The definitions of the fields specified in these tables may be found in
src/schema/objects/metadata.yaml
A guide for using macros can be found at
https://github.com/bids-standard/bids-specification/blob/master/macros_doc.md
-->
{{ MACROS___make_sidecar_table("stimulus.Stimulus") }}

## Stimuli.tsv and Stimuli.json

The `stimuli.tsv` and `stimuli.json` files are used to provide information about the stimuli based on their `stim_id`. These files are similar in usage to `participants.tsv`, `scans.tsv` and `sessions.tsv`, which describe subjects, scans and sessions, respectively. The `stimuli.tsv` and `stimuli.json` files should be placed in the `/stimuli` directory.

### Stimuli.tsv

The `stimuli.tsv` file contains information about each stimulus, including stimulus ID, type, URL, and other relevant details. The following table describes the REQUIRED, RECOMMENDED, and OPTIONAL columns for the `stimuli.tsv` file:

<!-- This block generates a columns table.
The definitions of these fields can be found in
src/schema/rules/tabular_data/*.yaml
and a guide for using macros can be found at
https://github.com/bids-standard/bids-specification/blob/master/macros_doc.md
-->
{{ MACROS___make_columns_table("stimuli.Stimuli") }}

This is an example of the `stimuli.tsv` file describing three images from the Natural Scenes Dataset (NSD):

| stimulus_id | type | description | HED | NSD_id | COCO_id |
|--------------|-------|-------------|-----|--------|---------|
| stim-nsd02951 | image | an open market full of people and piles of vegetables | ((Item-count, High), Ingestible-object), (Background-view, ((Human, Body, Agent-trait/Adult), Outdoors, Furnishing, Natural-feature/Sky, Urban, Man-made-object)) | 2951 | 262145 |
| stim-nsd02991 | image | a couple of people are cooking in a room | (Foreground-view, ((Item-count/1, (Human, Body, Agent-trait/Adult)), (Item-count/1, (Human, Body, (Face, Away-from), Male, Agent-trait/Adult)), ((Item-count, High), Furnishing))), (Background-view, (Ingestible-object, Furnishing, Room, Indoors, Man-made-object, Assistive-device)) | 2991 | 262239 |
| stim-nsd03050 | image | a person standing on a surfboard riding a wave | (Foreground-view, ((Item-count/1, ((Human, Human-agent), Body, Male, Agent-trait/Adolescent)), (Play, (Item-count/1, Man-made-object)))), (Background-view, (Outdoors, Natural-feature/Ocean)) | 3050 | 262414 |
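To illustrate how such a table might be used, here is a minimal Python sketch that attaches the `stimuli.tsv` row to each event through its stimulus ID. The function names and the event rows are hypothetical, the HED column is left out for brevity, and the sketch assumes that the `stim_id` values in `events.tsv` match the `stimulus_id` values shown in the example table.

```python
import csv
import io


def read_tsv(text):
    """Parse tab-separated text into a list of row dictionaries."""
    return list(csv.DictReader(io.StringIO(text), delimiter="\t"))


def attach_stimulus_info(events, stimuli, key="stimulus_id"):
    """Add the matching stimuli.tsv columns to every event row."""
    by_id = {row[key]: row for row in stimuli}
    merged = []
    for event in events:
        info = by_id.get(event["stim_id"], {})
        extra = {name: value for name, value in info.items() if name != key}
        merged.append({**event, **extra})
    return merged


# Two rows taken from the example table above (HED column omitted).
STIMULI_TSV = (
    "stimulus_id\ttype\tdescription\tNSD_id\tCOCO_id\n"
    "stim-nsd02951\timage\tan open market full of people and piles of vegetables\t2951\t262145\n"
    "stim-nsd03050\timage\ta person standing on a surfboard riding a wave\t3050\t262414\n"
)
# Hypothetical events; onsets and durations are made up for illustration.
EVENTS_TSV = (
    "onset\tduration\tstim_id\n"
    "1.23\t0.65\tstim-nsd03050\n"
    "5.65\t0.65\tstim-nsd02951\n"
)

merged = attach_stimulus_info(read_tsv(EVENTS_TSV), read_tsv(STIMULI_TSV))
for row in merged:
    print(row["onset"], row["stim_id"], row["description"])
```

Because the stimulus descriptions live in one table, changing a description in `stimuli.tsv` updates it for every event that references that stimulus.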

### Stimuli.json

The `stimuli.json` file provides detailed descriptions of the columns in the `stimuli.tsv` file. There can be extra entries in the `stimuli.json` in addition to the columns in the `stimuli.tsv` to provide more details about the stimulus.

## Annotations.tsv and Annotations.json

The `annotations.tsv` and `annotations.json` files are used to provide additional information about the annotations associated with the stimuli. These files should be placed in the `/stimuli` directory.

### Annotations.tsv

The `annotations.tsv` file contains information about each annotation, including the annotation ID and description. The following table describes the REQUIRED, RECOMMENDED, and OPTIONAL columns for the `annotations.tsv` file:

<!-- This block generates a columns table.
The definitions of these fields can be found in
src/schema/rules/tabular_data/*.yaml
and a guide for using macros can be found at
https://github.com/bids-standard/bids-specification/blob/master/macros_doc.md
-->
{{ MACROS___make_columns_table("annotations.Annotations") }}

### Annotations.json

The `annotations.json` file provides detailed descriptions of the columns in the `annotations.tsv` file.

There can be a single `annotations.tsv` in the `/stimuli` directory. Alternatively, each stimulus (with a unique stimulus ID) can have a separate `stim-<label>_annotations.tsv` to describe the annotations for that specific stimulus.
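The two layouts can be handled with a small lookup, sketched below in Python. The function name `annotation_table_for` is hypothetical. Preferring the stimulus-specific file when both files exist is an assumption made for this sketch; the text above does not define a precedence.

```python
def annotation_table_for(stim_id, filenames):
    """Pick the annotations table that describes the given stimulus ID.

    Assumption: a stimulus-specific table takes precedence over the
    directory-wide annotations.tsv when both are present.
    """
    specific = f"{stim_id}_annotations.tsv"
    if specific in filenames:
        return specific
    if "annotations.tsv" in filenames:
        return "annotations.tsv"
    return None


# Hypothetical content of a /stimuli directory.
listing = ["stimuli.tsv", "annotations.tsv", "stim-face01_annotations.tsv"]
print(annotation_table_for("stim-face01", listing))
print(annotation_table_for("stim-face02", listing))
```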
115 changes: 109 additions & 6 deletions src/modality-specific-files/task-events.md
@@ -269,15 +269,15 @@ and a guide for using macros can be found at

The operating system description SHOULD include the following attributes:

- type (for example, Windows, macOS, Linux)
- distribution (if applicable, for example, Ubuntu, Debian, CentOS)
- the version number (for example, 18.04.5)

Examples:

- Windows 10, Version 2004
- macOS 10.15.6
- Linux Ubuntu 18.04.5

The amount of information supplied for the `OperatingSystem` SHOULD be sufficient
to re-run the code under maximally similar conditions.
@@ -400,3 +400,106 @@ A guide for using macros can be found at
Additional metadata may be included as in
[any TSV file](../common-principles.md#tabular-files) to specify, for
example, the units of the recorded time series for each column.

## Standardization of Stimulus Files and Annotations

To ensure consistency and facilitate reuse, the BIDS specifications provide guidelines for standardizing stimulus files and their annotations. This section outlines the recommended practices for storing and referencing stimulus files within a BIDS dataset.

### Storing Stimulus Files

Stimulus files should be stored in the `/stimuli` directory under the root directory of the dataset. The `/stimuli` directory can contain subdirectories to organize the stimulus files. There are no restrictions on the file formats of the stimulus files.

Example directory structure:

<!-- This block generates a file tree.
A guide for using macros can be found at
https://github.com/bids-standard/bids-specification/blob/master/macros_doc.md
-->
{{ MACROS___make_filetree_example({
    "stimuli": {
        "images": {
            "cat01.jpg": "",
            "cat02.jpg": "",
        },
        "videos": {
            "movie01.mp4": "",
            "movie02.mp4": "",
        },
    },
}) }}

### Referencing Stimulus Files in `events.tsv`

To reference stimulus files in the `events.tsv` file, use the `stim_file` column. The values in the `stim_file` column should represent the relative path to the stimulus file within the `/stimuli` directory.

Example `events.tsv` file:

```Text
onset duration trial_type response_time stim_file
1.23 0.65 start 1.435 images/cat01.jpg
5.65 0.65 stop 1.739 images/cat02.jpg
12.1 2.35 n/a n/a videos/movie01.mp4
```

In the accompanying JSON sidecar, the `stim_file` column might be described as follows:

```JSON
{
    "stim_file": {
        "LongName": "Stimulus file",
        "Description": "Represents the location of the stimulus file (such as an image, video, or audio file) presented at the given onset time. The values correspond to a path relative to the /stimuli directory."
    }
}
```
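The path resolution described above can be sketched in a few lines of Python. The function name `resolve_stim_file` and the dataset root `/data/ds001` are hypothetical; the only rule taken from the text is that `stim_file` values are relative to the `/stimuli` directory.

```python
from pathlib import PurePosixPath


def resolve_stim_file(stim_file, dataset_root="/"):
    """Turn a stim_file value into a path under <dataset_root>/stimuli."""
    if stim_file == "n/a":
        # No stimulus file is associated with this event.
        return None
    return PurePosixPath(dataset_root) / "stimuli" / stim_file


print(resolve_stim_file("images/cat01.jpg"))
print(resolve_stim_file("videos/movie01.mp4", dataset_root="/data/ds001"))
```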

### Referencing Stimulus Identifiers in `events.tsv`

To reference stimulus identifiers in the `events.tsv` file, use the `stim_id` column. The `stim_id` corresponds to the unique identifier of the stimulus files and their annotations stored under the `/stimuli` directory. This allows linking each event to multiple related files associated with that stimulus.

Example `events.tsv` file:

```Text
onset duration trial_type response_time stim_id
1.23 0.65 start 1.435 stim-face01
5.65 0.65 stop 1.739 stim-face02
12.1 2.35 n/a n/a stim-video01
```

In the accompanying JSON sidecar, the `stim_id` column might be described as follows:

```JSON
{
    "stim_id": {
        "LongName": "Stimulus identifier",
        "Description": "Represents a unique identifier for the stimulus presented at the given onset time. Links to files and annotations in the /stimuli directory."
    }
}
```

The `stim_id` in the events file links to corresponding files:

```Text
stimuli/
├── stim-face01_image.jpg
├── stim-face01_image.json
├── stim-face01_annotations.tsv
├── stim-face02_image.jpg
├── stim-face02_image.json
├── stim-face02_annotations.tsv
├── stimuli.tsv
└── stimuli.json
```

By using `stim_id`, multiple annotations and stimulus files associated with the same identifier can be efficiently linked to events in the `events.tsv` file. The `stim_id` is a unique identifier for the stimulus that can be used to reference the stimulus files, annotations, and metadata stored in `stimuli.tsv` and `stimuli.json`. For more information on the structure of stimulus files and annotations, refer to the [Stimuli](./stimuli.md) BIDS specifications.
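As a sketch of this linking, the Python snippet below collects every file in a `/stimuli` listing that belongs to a given `stim_id`. The function name `files_for_stimulus` is hypothetical, and matching on the prefix `<stim_id>_` is an assumption based on the file naming pattern; it is not a rule stated by the specification.

```python
def files_for_stimulus(stim_id, filenames):
    """Return the files whose names start with '<stim_id>_', in sorted order."""
    prefix = stim_id + "_"
    return sorted(name for name in filenames if name.startswith(prefix))


# Hypothetical listing of a /stimuli directory.
LISTING = [
    "stim-face01_image.jpg",
    "stim-face01_image.json",
    "stim-face01_annotations.tsv",
    "stim-face02_image.jpg",
    "stim-face02_image.json",
    "stim-face02_annotations.tsv",
    "stimuli.tsv",
    "stimuli.json",
]

print(files_for_stimulus("stim-face01", LISTING))
```

Including the trailing underscore in the prefix keeps an identifier such as `stim-face0` from matching files that belong to `stim-face01`.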

### Advantages of Standardization

Standardizing stimulus files and their annotations within the BIDS specifications offers several advantages:

1. **Consistency**: Ensures that stimulus files are stored and referenced in a consistent manner across different datasets.
2. **Reusability**: Facilitates the reuse of stimulus files and annotations in other studies by providing a standardized structure.
3. **Efficiency**: Reduces redundancy by avoiding the need to replicate annotations across subjects, modalities, tasks, and runs.
4. **Flexibility**: Allows for easy modification of annotations by updating a single file, enabling the reuse of datasets with alternative annotations.

By following these guidelines, researchers can enhance the interoperability and reproducibility of their studies, making it easier to share and reuse data within the scientific community.
9 changes: 9 additions & 0 deletions src/schema/objects/columns.yaml
@@ -585,6 +585,15 @@ stim_file:
    For example `images/cat03.jpg` will be translated to `/stimuli/images/cat03.jpg`.
  type: string
  format: stimuli_relative
stim_id:
  name: stim_id
  display_name: Stimulus identifier
  description: |
    Represents a unique identifier for the stimulus presented at the given onset
    time. The `stim_id` is inclusive of the stimulus file(s), annotations
    related to the stimulus, and the information about the stimulus present in
    the `stimuli.tsv` file.
  type: string
strain:
  name: strain
  display_name: Strain
43 changes: 29 additions & 14 deletions src/schema/objects/entities.yaml
@@ -194,20 +194,12 @@ part:
name: part
display_name: Part
description: |
This entity is used to indicate which component of the complex
representation of the MRI signal is represented in voxel data.
The `part-<label>` entity is associated with the DICOM Tag
`0008, 9208`.
Allowed label values for this entity are `phase`, `mag`, `real` and `imag`,
which are typically used in `part-mag`/`part-phase` or
`part-real`/`part-imag` pairs of files.

Phase images MAY be in radians or in arbitrary units.
The sidecar JSON file MUST include the `"Units"` of the `phase` image.
The possible options are `"rad"` or `"arbitrary"`.

When there is only a magnitude image of a given type, the `part` entity MAY be
omitted.
This entity is used to indicate which component of a complex
representation is being stored. For MRI data, it indicates which component
of the complex signal is represented in voxel data. For stimulus files, it can
be used to distinguish different parts of a single stimulus, such as chapters
in an audiobook or segments of a long movie (for example, `part-1`, `part-2`,
`part-epilog`, `part-chapter1`).
type: string
format: label
enum:
@@ -373,6 +365,19 @@ stain:
    and/or `"SampleSecondaryAntibodies"` metadata fields, as appropriate.
  type: string
  format: label
stimulus:
  name: stim
  display_name: Stimulus
  description: |
    The `stim-<label>` entity can be used to distinguish different stimulus files
    or annotations. The label is a unique identifier for the stimulus or annotation.

    This entity represents the `"Stimulus"` metadata field and requires corresponding
    entries in the `stimuli.tsv` file. Therefore, if the `stim-<label>` entity is
    present in a filename, `"Stimulus"` MUST be defined in the associated metadata,
    and a matching entry MUST exist in the `stimuli.tsv` file.
  type: string
  format: label
subject:
  name: sub
  display_name: Subject
@@ -443,3 +448,13 @@ tracksys:
    may be longer and more human readable.
  type: string
  format: label
annotation:
  name: annot
  display_name: Annotation
  description: |
    The `annot-<label>` entity accommodates multiple annotations for a single
    (usually, but not necessarily, time-varying) stimulus ID. As with `stimuli.tsv`,
    annotations are listed by `annotation_id`, either in a single `annotations.tsv`
    file covering all annotations in the directory or in separate
    `stim-<label>_annotations.tsv` files, one per stimulus.
  type: string
  format: label