Added docs about tags and composite table
bbollen23 committed Oct 4, 2024
1 parent feebe1c commit 1d3a5a7
Showing 1 changed file with 85 additions and 19 deletions.
104 changes: 85 additions & 19 deletions apps/docs-website/docs/loon-for-scientists/data.md
Example Content:

```
...
}
```

## Experiment Metadata File

Each experiment metadata file is stored as a JSON file. It defines metadata for the experiment and points to the other data files.

At the top level it expects the following attributes:

| Attribute | Definition |
| ---------------------- | ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| `name` | Name of the experiment as it should appear in the Loon dashboard. |
| `headers` | The list of column names in the CSV feature tables. The order should match the CSV files. |
| `headerTransforms` | Defines the names of certain special columns (`time`, `frame`, `id`, `parent`, `mass`, `x`, `y`). An entry is optional if a column name in `headers` already matches the special name exactly. See [the table below](#header-transforms) for information about these special columns. |
| `locationMetadataList` | A list of imaging location metadata. Each imaging location will include an `id`, `tabularDataFilename`, `imageDataFilename`, and `segmentationsFolder`. See [the table below](#location-metadata-list) for more information on each of these. |
| `compositeTabularDataFilename` | Specifies the path to the combined tabular data file. See the [section below](#composite-tabular-data-file) for more information. |

### Header Transforms

| Attribute | Definition |
| --------- | ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| ... | ... |
| `x` | The X coordinate for the cell's center position in pixel space. (It does not matter what definition of center is used.) |
| `y` | Same, but for the Y coordinate. |

### Location Metadata List

| Attribute | Definition |
| --------- | ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| `id` | A unique name for this location. Can be anything, but will be displayed in the interface, so a more descriptive name is better. |
| `tabularDataFilename` | The location of the CSV file feature table for this experiment. |
| `imageDataFilename` | The location of the OME TIFF image file. This should be a `*.companion.ome` file. |
| `segmentationsFolder` | This folder contains all of the segmentation files for a given location. See the [section on segmentations](#segmentations-folder) for more details. |
| `tags` | A JSON object containing key-value pairs that capture metadata about the particular location. See the [tags section](#tags) for more information and some examples. |

So, altogether a single experiment metadata file should look something like the following:

```
{
  "name": "ExperimentOne",
  "headers": [
    "Frame",
    "Tracking ID",
    ...
  ],
  "headerTransforms": {
    ...
    "x": "Pixel Position X (pixels)",
    "y": "Pixel Position Y (pixels)"
  },
  "compositeTabularDataFilename": "experiment1/composite_tabular_data_file.parquet",
  "locationMetadataList": [
    {
      "id": "Condition A",
      "tabularDataFilename": "experiment1/location_A/Table_A.csv",
      "imageDataFilename": "experiment1/location_A/images_A.companion.ome",
      "segmentationsFolder": "experiment1/location_A/segmentations_A",
      "tags": {
        "drug": "drug_1",
        "concentration": "0.1"
      }
    },
    {
      "id": "Condition B",
      "tabularDataFilename": "experiment1/location_B/Table_B.csv",
      "imageDataFilename": "experiment1/location_B/images_B.companion.ome",
      "segmentationsFolder": "experiment1/location_B/segmentations_B",
      "tags": {}
    },
    {
      "id": "Condition C",
      "tabularDataFilename": "experiment1/location_C/Table_C.csv",
      "imageDataFilename": "experiment1/location_C/images_C.companion.ome",
      "segmentationsFolder": "experiment1/location_C/segmentations_C",
      "tags": {
        "drug": "drug_2"
      }
    }
  ]
}
```
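
Before loading an experiment, it can help to sanity-check the metadata file. The following is a minimal sketch in Python, not part of Loon itself: the function name `check_experiment_metadata` is hypothetical, and the choice of which attributes to treat as required (`name`, `headers`, `locationMetadataList`, plus the four per-location attributes) is an assumption based on the tables above. `headerTransforms`, `compositeTabularDataFilename`, and `tags` are treated as optional here.

```python
import json

# Assumption: these attributes are treated as required for this sketch.
REQUIRED_TOP_LEVEL = ("name", "headers", "locationMetadataList")
REQUIRED_LOCATION = (
    "id",
    "tabularDataFilename",
    "imageDataFilename",
    "segmentationsFolder",
)


def check_experiment_metadata(metadata):
    """Return a list of problems found; an empty list means none were found."""
    problems = []
    for key in REQUIRED_TOP_LEVEL:
        if key not in metadata:
            problems.append(f"missing top-level attribute: {key}")

    seen_ids = set()
    for index, location in enumerate(metadata.get("locationMetadataList", [])):
        for key in REQUIRED_LOCATION:
            if key not in location:
                problems.append(f"location {index}: missing attribute: {key}")
        location_id = location.get("id")
        if location_id is not None:
            # Location ids are described as unique names.
            if location_id in seen_ids:
                problems.append(f"location {index}: duplicate id: {location_id}")
            seen_ids.add(location_id)
        # Tags, when present, must be a JSON object (a dict after parsing).
        if not isinstance(location.get("tags", {}), dict):
            problems.append(f"location {index}: tags must be a JSON object")
    return problems


if __name__ == "__main__":
    # json.loads raises json.JSONDecodeError on syntax errors such as a
    # missing or trailing comma, so parsing is itself a useful first check.
    text = '{"name": "ExperimentOne", "headers": ["Frame"], "locationMetadataList": []}'
    print(check_experiment_metadata(json.loads(text)))
```

The function only checks structure. It does not verify that the referenced files exist or that `headers` matches the CSV files.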

### Tags

Tags define metadata about an individual location. The Loon UI uses them to identify the conditions that correspond to each location. The tags object has no restrictions: locations may have completely different sets of tags, locations may have no tags, and locations can share one or more tags.

For example, with three locations, the tags may look like this:

```json
[
  {
    "id": "location_1",
    ...
    "tags": {
      "drug": "drug_1",
      "concentration": "0.1"
    }
  },
  {
    "id": "location_2",
    ...
    "tags": {}
  },
  {
    "id": "location_3",
    ...
    "tags": {
      "drug": "drug_3"
    }
  }
]
```
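
Because tags are free-form, code that reads them has to tolerate missing keys and empty objects. As an illustration only (this is not how the Loon UI is implemented, and `locations_with_tag` is a hypothetical helper), the following Python sketch selects the locations that carry a given tag:

```python
def locations_with_tag(locations, key, value=None):
    """Return ids of locations whose tags contain `key` (and `value`, if given)."""
    matches = []
    for location in locations:
        # A location may omit tags or carry an empty object.
        tags = location.get("tags", {})
        if key in tags and (value is None or tags[key] == value):
            matches.append(location["id"])
    return matches


# The three locations from the example above, reduced to id and tags.
locations = [
    {"id": "location_1", "tags": {"drug": "drug_1", "concentration": "0.1"}},
    {"id": "location_2", "tags": {}},
    {"id": "location_3", "tags": {"drug": "drug_3"}},
]

print(locations_with_tag(locations, "drug"))
print(locations_with_tag(locations, "drug", "drug_3"))
```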

### Composite Tabular Data File

The `compositeTabularDataFilename` key specifies the location (relative to the root of the current experiment directory) of a combined metadata table stored as a Parquet file.

This table must be the union of the individual location CSV feature tables, with an additional `location` column and one additional column for each tag key that appears in any location.

For example, using the example from the [tags section](#tags), a sample of six rows of the Parquet file would look something like this:

| location | \{rest_of_headers\} | drug | concentration |
|----------|-------------------|------|----------------|
| location_1 | . . . | drug_1 | 0.1 |
| location_1 | . . . | drug_1 | 0.1 |
| location_2 | . . . | | |
| location_2 | . . . | | |
| location_3 | . . . | drug_3 | |
| location_3 | . . . | drug_3 | |

Here, the empty cells denote empty strings.
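
The construction described above can be sketched in plain Python. This is an illustrative sketch under stated assumptions, not Loon's own tooling: `build_composite_rows` is a hypothetical helper, and the per-location feature tables are assumed to be already loaded as lists of dictionaries (for example with `csv.DictReader`).

```python
def build_composite_rows(locations, tables):
    """Combine per-location rows into one table with location and tag columns.

    locations: entries shaped like `locationMetadataList` items.
    tables: dict mapping a location id to its feature-table rows (list of dicts).
    """
    # Union of tag keys across all locations, in first-seen order.
    tag_keys = []
    for location in locations:
        for key in location.get("tags", {}):
            if key not in tag_keys:
                tag_keys.append(key)

    rows = []
    for location in locations:
        tags = location.get("tags", {})
        for row in tables[location["id"]]:
            combined = {"location": location["id"], **row}
            # Missing tags become empty strings. Note that a tag key equal to
            # an existing header name would overwrite that column here.
            for key in tag_keys:
                combined[key] = tags.get(key, "")
            rows.append(combined)
    return rows


locations = [
    {"id": "location_1", "tags": {"drug": "drug_1", "concentration": "0.1"}},
    {"id": "location_2", "tags": {}},
    {"id": "location_3", "tags": {"drug": "drug_3"}},
]
# Hypothetical feature-table rows with a single "Frame" column.
tables = {
    "location_1": [{"Frame": "1"}, {"Frame": "2"}],
    "location_2": [{"Frame": "1"}],
    "location_3": [{"Frame": "1"}],
}

for row in build_composite_rows(locations, tables):
    print(row)
```

The sketch stops at the list of rows. One possible way to write them to Parquet is `pandas.DataFrame(rows).to_parquet(path)`, which requires a Parquet engine such as pyarrow to be installed.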



## `Segmentations Folder`

Each imaging location should have a corresponding folder that contains all of the segmentation files. The names of the files must correspond to the imaging frame: `1.json` contains all of the cell segmentations for the first frame, `2.json` contains those for the second frame, and so on. Each JSON file must follow the GeoJSON specification. In addition to the standard geometry attribute, the `bbox` attribute must be defined. To link the segmentations with the corresponding metadata, the cell `id` defined in the feature table must be included as an `ID` in the GeoJSON properties object.
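
As a rough illustration of these requirements, the Python sketch below builds the contents of one frame file. Several details are assumptions, since the text above does not fix them: the file is modeled as a GeoJSON `FeatureCollection`, each cell is a `Polygon` feature, and `bbox` is placed on each feature. The helper names `polygon_bbox` and `make_frame_segmentations` are hypothetical.

```python
import json


def polygon_bbox(ring):
    """Return [min_x, min_y, max_x, max_y] for a list of [x, y] points."""
    xs = [point[0] for point in ring]
    ys = [point[1] for point in ring]
    return [min(xs), min(ys), max(xs), max(ys)]


def make_frame_segmentations(cells):
    """Build a GeoJSON FeatureCollection from {cell_id: outline} pairs."""
    features = []
    for cell_id, ring in cells.items():
        # GeoJSON linear rings must be closed: first and last positions equal.
        closed = ring if ring[0] == ring[-1] else ring + [ring[0]]
        features.append(
            {
                "type": "Feature",
                "bbox": polygon_bbox(closed),
                # The cell id from the feature table, stored under "ID".
                "properties": {"ID": cell_id},
                "geometry": {"type": "Polygon", "coordinates": [closed]},
            }
        )
    return {"type": "FeatureCollection", "features": features}


# Hypothetical cell outline in pixel coordinates.
frame = make_frame_segmentations({"cell_7": [[0, 0], [4, 0], [4, 3], [0, 3]]})
# The serialized text would be saved as, e.g., 1.json for the first frame.
print(json.dumps(frame, indent=2))
```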