
Commit

release v0.1.0
quentinblampey committed Sep 11, 2024
1 parent 70e5b06 commit a9a2464
Showing 9 changed files with 113 additions and 62 deletions.
4 changes: 2 additions & 2 deletions CHANGELOG.md
@@ -1,3 +1,3 @@
## [0.1.0] - tbd
## [0.1.0] - 2024-09-11

tbd
First official `novae` release. Preprint coming soon.
15 changes: 11 additions & 4 deletions data/README.md
@@ -1,8 +1,15 @@
# Public datasets

We detail below how to download public spatial transcriptomics datasets. The data will be saved in this directory, and will be used to train `novae`.
We detail below how to download public spatial transcriptomics datasets.

## Download
## Option 1: Hugging Face Hub

We store our dataset on [Hugging Face Hub](https://huggingface.co/datasets/MICS-Lab/novae).
To automatically download these slides, you can use the [`novae.utils.load_dataset`](https://mics-lab.github.io/novae/api/novae.utils/#novae.utils.load_dataset) function.

NB: not all slides are uploaded on Hugging Face yet, but we are progressively adding new slides. To get the full dataset right now, use the "Option 2" below.

## Option 2: Download

For consistency, all the scripts below need to be executed at the root of the `data` directory (i.e., `novae/data`).

Expand Down Expand Up @@ -50,15 +57,15 @@ All above datasets can be downloaded using a single command line. Make sure you
sh _scripts/1_download_all.sh
```

## Preprocess and prepare for training
### Preprocess and prepare for training

The script below will copy all `adata.h5ad` files into a single directory, compute UMAPs, and apply minor preprocessing. See the `argparse` helper of this script for more details.

```sh
python _scripts/2_prepare.py
```
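
As a rough illustration of the first step described above (gathering every `adata.h5ad` file into one directory), here is a minimal sketch. It is not the content of `_scripts/2_prepare.py`: the function name `gather_h5ad_files`, the output naming scheme, and the example folder names are all assumptions made for this example, and the UMAP and preprocessing steps are left out.

```python
import shutil
import tempfile
from pathlib import Path


def gather_h5ad_files(data_dir: Path, output_dir: Path, filename: str = "adata.h5ad") -> list[Path]:
    """Copy every `filename` found under `data_dir` into `output_dir` (hypothetical helper)."""
    output_dir.mkdir(parents=True, exist_ok=True)
    copied = []
    # sorted() lists all matches before any file is copied
    for src in sorted(data_dir.rglob(filename)):
        if output_dir in src.parents:
            continue  # skip files that already live in the output directory
        # Assumed naming scheme: join the relative folder names with underscores
        relative_parts = src.parent.relative_to(data_dir).parts
        name = "_".join(relative_parts) or "adata"
        dst = output_dir / f"{name}.h5ad"
        shutil.copyfile(src, dst)
        copied.append(dst)
    return copied


# Small demo on empty placeholder files (the folder names are made up)
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp) / "data"
    for sub in ("xenium/lung", "merscope/colon"):
        (root / sub).mkdir(parents=True)
        (root / sub / "adata.h5ad").write_bytes(b"")
    copied_names = [p.name for p in gather_h5ad_files(root, Path(tmp) / "all")]
```

Deriving the output name from the relative path keeps files from different datasets from overwriting each other, since they all share the same `adata.h5ad` filename.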

## Usage
### Usage

These datasets can be used during training (see the `scripts` directory at the root of the `novae` repository).

2 changes: 2 additions & 0 deletions docs/api/novae.plot.md
@@ -5,3 +5,5 @@
::: novae.plot.pathway_scores

::: novae.plot.paga

::: novae.plot.spatially_variable_genes
123 changes: 71 additions & 52 deletions docs/tutorials/main_usage.ipynb

Large diffs are not rendered by default.

2 changes: 1 addition & 1 deletion novae/plot/_bar.py
@@ -10,7 +10,7 @@
from ._utils import get_categorical_color_palette


def domains_proportions(adata: AnnData | list[AnnData], obs_key: str | None, figsize: tuple[int, int] = (2, 5)):
def domains_proportions(adata: AnnData | list[AnnData], obs_key: str | None = None, figsize: tuple[int, int] = (2, 5)):
    """Show the proportion of each domain in the slide(s).

    Args:
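
With `obs_key` now defaulting to `None`, callers of `domains_proportions` can omit it. The other plotting docstrings in this commit say that the last available domain key is then used. The sketch below illustrates that kind of fallback on a plain list of column names. It is not the novae implementation: the helper name `resolve_domains_key` and the `novae_domains_` prefix are assumptions made for this example.

```python
from typing import Optional


def resolve_domains_key(
    obs_columns: list[str],
    obs_key: Optional[str] = None,
    prefix: str = "novae_domains_",  # assumed naming convention for domain columns
) -> str:
    """Return `obs_key` if provided, else the last column that looks like a domain key."""
    if obs_key is not None:
        if obs_key not in obs_columns:
            raise KeyError(f"Key '{obs_key}' not found in the available columns")
        return obs_key

    # Fallback: keep the column order and pick the last matching column
    candidates = [column for column in obs_columns if column.startswith(prefix)]
    if not candidates:
        raise ValueError("No domain key found; assign domains before plotting")
    return candidates[-1]


columns = ["cell_type", "novae_domains_7", "novae_domains_10"]
default_key = resolve_domains_key(columns)
explicit_key = resolve_domains_key(columns, obs_key="novae_domains_7")
```

Resolving the key in one helper lets every plotting function share the same default behavior while still accepting an explicit key.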
3 changes: 3 additions & 0 deletions novae/plot/_graph.py
@@ -82,6 +82,9 @@ def _domains_hierarchy(
def paga(adata: AnnData, obs_key: str | None = None, **paga_plot_kwargs: int):
    """Plot a PAGA graph.

    Info:
        Currently, this function only supports one slide per call.

    Args:
        adata: An AnnData object.
        obs_key: Name of the key from `adata.obs` containing the Novae domains. By default, the last available domain key is shown.
3 changes: 3 additions & 0 deletions novae/plot/_heatmap.py
@@ -71,6 +71,9 @@ def pathway_scores(
) -> pd.DataFrame | None:
    """Show a heatmap of pathway scores for each domain.

    Info:
        Currently, this function only supports one slide per call.

    Args:
        adata: An `AnnData` object.
        pathways: Either a dictionary of pathways (keys are pathway names, values are lists of gene names), or a path to a [GSEA](https://www.gsea-msigdb.org/gsea/msigdb/index.jsp) JSON file.
21 changes: 19 additions & 2 deletions novae/plot/_spatial.py
@@ -34,7 +34,6 @@ def domains(
    Info:
        Make sure your Novae domains are already assigned to the `AnnData` object. You can use `model.assign_domains(...)` to do so.

    Args:
        adata: An `AnnData` object, or a list of `AnnData` objects.
        obs_key: Name of the key from `adata.obs` containing the Novae domains. By default, the last available domain key is shown.
@@ -113,7 +112,25 @@ def spatially_variable_genes(
    min_positive_ratio: float = 0.05,
    return_list: bool = False,
    **kwargs: int,
) -> list[str]:
) -> None | list[str]:
    """Plot the most spatially variable genes (SVG) for a given `AnnData` object.

    !!! info
        Currently, this function only supports one slide per call.

    Args:
        adata: An `AnnData` object corresponding to one slide.
        obs_key: Key in `adata.obs` that contains the domains. By default, it will use the last available Novae domain key.
        top_k: Number of SVG to be shown.
        show: Whether to show the plot.
        cell_size: Size of the cells or spots (`spot_size` argument of `sc.pl.spatial`).
        min_positive_ratio: Genes whose "ratio of cells expressing it" is lower than this threshold are not considered.
        return_list: Whether to return the list of SVG instead of plotting them.
        **kwargs: Additional arguments for `sc.pl.spatial`.

    Returns:
        A list of SVG names if `return_list` is `True`.
    """
    assert isinstance(adata, AnnData), f"Received adata of type {type(adata)}. Currently only AnnData is supported."

    obs_key = utils.check_available_domains_key([adata], obs_key)
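
The `min_positive_ratio` argument documented above excludes genes that are expressed in too few cells. The sketch below shows one way such a filter could work, using plain Python lists in place of an `AnnData` count matrix. It is only an illustration under stated assumptions: the helper `filter_by_positive_ratio` is hypothetical, "expressing" is taken to mean a count strictly greater than zero, and none of this is taken from the novae source.

```python
def filter_by_positive_ratio(
    counts: list[list[float]],
    gene_names: list[str],
    min_positive_ratio: float = 0.05,
) -> list[str]:
    """Keep the genes expressed in at least `min_positive_ratio` of the cells.

    `counts` has one row per cell and one column per gene.
    """
    n_cells = len(counts)
    kept = []
    for gene_index, gene in enumerate(gene_names):
        # Number of cells with a strictly positive count for this gene
        n_positive = sum(1 for row in counts if row[gene_index] > 0)
        # Genes whose ratio is lower than the threshold are not considered
        if n_positive / n_cells >= min_positive_ratio:
            kept.append(gene)
    return kept


# Toy matrix: 4 cells x 3 genes (the gene names are placeholders)
counts = [
    [0, 3, 1],
    [0, 0, 2],
    [0, 1, 0],
    [1, 0, 4],
]
genes = ["GeneA", "GeneB", "GeneC"]
kept_genes = filter_by_positive_ratio(counts, genes, min_positive_ratio=0.5)
```

Here `GeneA` is expressed in one cell out of four (ratio 0.25), so it falls under the 0.5 threshold, while `GeneB` (0.5) and `GeneC` (0.75) are kept.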
2 changes: 1 addition & 1 deletion pyproject.toml
@@ -1,6 +1,6 @@
[tool.poetry]
name = "novae"
version = "0.0.5"
version = "0.1.0"
description = "Graph-based foundation model for spatial transcriptomics data"
documentation = "https://mics-lab.github.io/novae/"
homepage = "https://mics-lab.github.io/novae/"
