Ivirshup/census builder spatial #1245

Draft
wants to merge 90 commits into
base: main
90 commits
56bd44b
Add setup instructions to integrate soma spatial
prathapsridharan May 28, 2024
76c2a8a
Install `git` in census builder docker container
prathapsridharan May 28, 2024
a38c215
Add back pyarrow pin on builder for testing
prathapsridharan May 28, 2024
8ae84af
Modify builder pins to test builder unit tests on GH
prathapsridharan May 28, 2024
b330183
Remove cellxgene_census package dependency for testing
prathapsridharan May 29, 2024
6655196
Pin to tiledbsoma commit to test builder unit tests
prathapsridharan May 29, 2024
e91b2fa
Pin to tiledbsoma git commit for 1.9.5 to test
prathapsridharan May 29, 2024
4115601
Unpin pyarrow in builder to test builder unit tests
prathapsridharan May 29, 2024
1c7f985
Pin to tiledbsoma git commit for 1.10.2 to test
prathapsridharan May 29, 2024
4aa689e
Pin to tiledbsoma git commit for 1.11.1 to test
prathapsridharan May 29, 2024
327cee6
Pin tiledbsoma to 16f481f - head of spatial branch
prathapsridharan May 30, 2024
97ef790
Pin pyarrow back to 15.0.2
prathapsridharan May 30, 2024
3567dc7
Pin tiledbsoma to fc5f8e7 to fix census builder tests
prathapsridharan Jun 3, 2024
ba5ac8f
Add comments to notebook
prathapsridharan Jun 4, 2024
b448027
Create notebook to demo census object creation
prathapsridharan Jun 4, 2024
0f1f20a
Use absolute file path for contents of manifest file
prathapsridharan Jun 5, 2024
5c81aaa
Add "EFO:0010961" to the list of allowed assays
prathapsridharan Jun 5, 2024
ce4b525
Add comments for clarity in pyproject.toml
prathapsridharan Jun 5, 2024
1f654b6
Fix filepaths in notebook
prathapsridharan Jun 5, 2024
4712e95
Make census builder run without errors on spatial datasets
prathapsridharan Jun 5, 2024
e7e6b07
Add census_data and census_spatial collections
prathapsridharan Jun 7, 2024
44c1109
Update notebook
prathapsridharan Jun 7, 2024
1854f41
Pin tiledbsoma to commit 5069714 for latest spatial
prathapsridharan Jun 12, 2024
14d682e
Update tiledbsoma spatial notebook
prathapsridharan Jun 12, 2024
bf0fdea
Update tiledbsoma spatial notebook
prathapsridharan Jun 12, 2024
ebbe1b1
Update tiledbsoma spatial notebook
prathapsridharan Jun 12, 2024
ff73498
Pin tiledbsoma to commit 69d699e for latest spatial
prathapsridharan Jun 12, 2024
a5c21a1
Pin tiledbsoma to commit 9eb540f for latest spatial
prathapsridharan Jun 13, 2024
2cef2bb
Update tiledbsoma spatial notebook
prathapsridharan Jun 14, 2024
03afeb4
Use latest commit from tiledbsoma spatial branch
ivirshup Jul 12, 2024
f3b699b
get census spatial builder notebook running (i think)
ivirshup Jul 12, 2024
aca4bac
Update ingestion notebook
ivirshup Jul 17, 2024
8b08ac8
Merge branch 'main' into ivirshup/census-builder-spatial
ivirshup Jul 17, 2024
c10ac78
Fix typo
ivirshup Jul 17, 2024
441fb64
Update source datasets to v5.1.0 versions
ivirshup Jul 17, 2024
62e3ae0
Update ingest notebook
ivirshup Jul 17, 2024
e4d2bf2
First attempt at writing images, current in progress
ivirshup Jul 17, 2024
229379c
Fix up writing of images
ivirshup Jul 18, 2024
953b710
Start test suite
ivirshup Jul 24, 2024
61b901e
Write locations
ivirshup Jul 24, 2024
833260d
Write obs_scene
ivirshup Jul 24, 2024
48cd6e3
silence mypy
ivirshup Jul 24, 2024
8499603
sorta update test
ivirshup Jul 24, 2024
8f8adcd
update notebooks
ivirshup Jul 24, 2024
3e504de
Fix builds on CI by using hatch as build tool
ivirshup Jul 24, 2024
cf9be22
Get build running for visium
ivirshup Aug 1, 2024
0a257f9
Add tests for images and obs_scene
ivirshup Aug 3, 2024
a6989f0
add notebook so I can change branches
ivirshup Aug 3, 2024
cb891e8
Allow manifests to contain s3 uris, add manifest folder
ivirshup Aug 6, 2024
f39300b
Add manifest for slideseq
ivirshup Aug 7, 2024
41c7c18
Merge branch 'main' into ivirshup/census-builder-spatial
ivirshup Aug 7, 2024
d33c7dd
Add slide-seqv2 to allowed assays
ivirshup Aug 8, 2024
9ca4327
Branch to allow for datasets that don't have associated images
ivirshup Aug 9, 2024
07e6e30
Use dask to write images + pin tiledbsoma to working commit
ivirshup Aug 16, 2024
d6a1ab6
Merge branch 'main' into ivirshup/census-builder-spatial
ivirshup Aug 16, 2024
cd96482
Add support for scene class
ivirshup Aug 16, 2024
4a4316d
Bump to newer commit on tiledbsoma branch
ivirshup Aug 28, 2024
6c86a75
Update pinned tiledbsoma version (not quite working)
ivirshup Sep 4, 2024
5b9f70a
Merge branch 'main' into ivirshup/census-builder-spatial
ivirshup Sep 13, 2024
8e6fa76
Merge branch 'main' into ivirshup/census-builder-spatial, plus update…
ivirshup Oct 22, 2024
1058956
Merge branch 'main' into ivirshup/census-builder-spatial
ivirshup Oct 28, 2024
f9f72b8
Fix write_obs/var_dataframe for cases where that dataframe is None
ivirshup Oct 28, 2024
ef720da
Working again! (mostly)
ivirshup Oct 29, 2024
4241c9d
Add tiledb dep back
ivirshup Oct 29, 2024
a4b34c1
update manifests
ivirshup Oct 29, 2024
5cf87f0
Merge branch 'main' into ivirshup/census-builder-spatial
ivirshup Nov 18, 2024
b06361f
Bump tiledbsoma version. Temporarily turn off validation for testing.
ivirshup Nov 22, 2024
4fb3198
Bump to tiledbsoma==1.15.0rc4
ivirshup Nov 23, 2024
a75e510
Update manifests
ivirshup Dec 6, 2024
39969f5
Merge branch 'main' into ivirshup/census-builder-spatial
ivirshup Dec 13, 2024
6a7c21d
Relax requirements
ivirshup Dec 13, 2024
bbe7dbb
Account for differences in obs_spatial_presence
ivirshup Dec 13, 2024
c2c2967
Get builds up to date (transforms are still broken)
ivirshup Dec 16, 2024
3a6229d
Bump tiledbsoma dep to include latest spatialdata exporter
ivirshup Jan 9, 2025
196f11a
Fix transformations
ivirshup Jan 10, 2025
d4db232
Bump version of tiledbsoma used to 1.15.3
ivirshup Jan 11, 2025
f5efa68
Merge branch 'main' into ivirshup/census-builder-spatial
ivirshup Jan 11, 2025
8b54aea
Update required version of tiledbsoma for client
ivirshup Jan 11, 2025
cac4f04
Fix consolidation (#1329)
ebezzi Jan 13, 2025
a37b74e
Merge branch 'main' into ivirshup/census-builder-spatial
ivirshup Jan 13, 2025
966d2e9
fix validation key (#1332)
ebezzi Jan 13, 2025
4f97fb2
Merge branch 'main' into ivirshup/census-builder-spatial
ivirshup Jan 13, 2025
3b1d7ec
Only check files listed in manifest
ivirshup Jan 15, 2025
c8cc589
Add test for normalized matrix on spatial
ivirshup Jan 17, 2025
e34ee99
Merge branch 'ivirshup/census-builder-improved-test-cases' into ivirs…
ivirshup Jan 21, 2025
b32f14a
Set name of visium/ slide-seq to census_spatial_sequencing
ivirshup Jan 21, 2025
04e4de4
Don't write a normalized layer for spatial sequencing experiments
ivirshup Jan 21, 2025
a1b4969
add pooch to deps (used in testing)
ivirshup Jan 21, 2025
add4977
Add pytest-xdist to builder test deps
ivirshup Jan 21, 2025
e53bcf8
Remove alpha level from images
ivirshup Jan 21, 2025
2 changes: 1 addition & 1 deletion api/python/cellxgene_census/pyproject.toml
@@ -30,7 +30,7 @@ dependencies= [
     # NOTE: the tiledbsoma version must be >= to the version used in the Census builder, to
     # ensure that the assets are readable (tiledbsoma supports backward compatible reading).
     # Make sure this version does not fall behind the builder's tiledbsoma version.
-    "tiledbsoma>=1.12.3,!=1.14.1,<1.15",
+    "tiledbsoma>=1.15.3",
     "anndata",
     "numpy>=1.23,<2.0",
     "requests",
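
The NOTE in this hunk amounts to a simple invariant: the client's minimum tiledbsoma version must not fall below the version the builder pins. A minimal sketch of that check, assuming plain `X.Y.Z` release strings (pre-releases such as `1.15.0rc4` are not handled) and hypothetical helper names that do not exist in the repository:

```python
def parse_release(version: str) -> tuple[int, ...]:
    """Turn a plain 'X.Y.Z' release string into a tuple of ints for comparison."""
    return tuple(int(part) for part in version.split("."))


def client_floor_covers_builder(client_floor: str, builder_pin: str) -> bool:
    """True when the client's minimum version is at least the builder's pin."""
    return parse_release(client_floor) >= parse_release(builder_pin)


# Values taken from this PR: client floor ">=1.15.3", builder pin "==1.15.3".
print(client_floor_covers_builder("1.15.3", "1.15.3"))
# The previous client floor (1.12.3) would have fallen behind the new builder pin.
print(client_floor_covers_builder("1.12.3", "1.15.3"))
```

Comparing tuples of integers avoids the string-comparison trap where `"1.9.5"` sorts after `"1.15.3"`.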
3 changes: 1 addition & 2 deletions api/python/cellxgene_census/tests/test_open.py
@@ -9,7 +9,6 @@
 import numpy as np
 import pytest
 import requests_mock as rm
-import tiledb
 import tiledbsoma as soma
 
 import cellxgene_census
@@ -442,7 +441,7 @@ def test_opening_census_without_anon_access_fails_with_bogus_creds() -> None:
     os.environ["AWS_SECRET_ACCESS_KEY"] = "fake_key"
     # Passing an empty context
     with pytest.raises(
-        (tiledb.TileDBError, soma.DoesNotExistError),
+        soma.DoesNotExistError,
         match=r"does not exist",
     ):
         cellxgene_census.open_soma(census_version="latest", context=soma.SOMATileDBContext())
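
The test change above narrows the accepted exception from a tuple of two types to `soma.DoesNotExistError` alone. A minimal sketch of what that narrowing means, using stand-in exception classes and a hypothetical `raises_matching` helper rather than the real `pytest.raises` and tiledbsoma types. It relies only on the fact that `pytest.raises(..., match=...)` applies `re.search` to the string form of the exception; the URI in the messages is made up:

```python
import re


class DoesNotExistError(Exception):
    """Stand-in for soma.DoesNotExistError (assumption, not the real class)."""


class TileDBError(Exception):
    """Stand-in for tiledb.TileDBError (assumption, not the real class)."""


def raises_matching(expected, pattern, fn) -> bool:
    """Return True if fn() raises `expected` with a message matching `pattern`.

    Exceptions of any other type propagate to the caller, and a call that
    does not raise at all returns False.
    """
    try:
        fn()
    except expected as exc:
        return re.search(pattern, str(exc)) is not None
    return False


def open_missing():
    # Hypothetical failure mode of the new tiledbsoma version.
    raise DoesNotExistError("s3://example-bucket/census does not exist")


def open_with_tiledb_error():
    # Hypothetical failure mode that the old tuple also tolerated.
    raise TileDBError("s3://example-bucket/census does not exist")


# Old expectation: either type is accepted.
print(raises_matching((TileDBError, DoesNotExistError), r"does not exist", open_with_tiledb_error))
# New expectation: only DoesNotExistError is accepted; a TileDBError would propagate.
print(raises_matching(DoesNotExistError, r"does not exist", open_missing))
```

With the narrower expectation, a `TileDBError` is no longer swallowed by the check and would surface as a test error instead.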
2 changes: 2 additions & 0 deletions tools/cellxgene_census_builder/Dockerfile
@@ -6,12 +6,14 @@ ARG COMMIT_SHA
 ENV COMMIT_SHA=${COMMIT_SHA}
 
 # Ubuntu 22 contains only the python3.11 RC as of 2023-12-21, so use deadsnakes
+# TODO (spatial): `git` is added to this dockerfile to be able to install python packages from github. Remove when it is not needed.
 RUN apt update && \
     apt install -y software-properties-common && \
     add-apt-repository -y ppa:deadsnakes/ppa && \
     apt update && \
     apt -y full-upgrade && \
     apt -y install python3.11 python3.11-venv python3-pip awscli && \
+    apt -y install git && \
     apt-get clean
 
 # set python3.11 as default
9 changes: 9 additions & 0 deletions tools/cellxgene_census_builder/SPATIAL-README.md
@@ -0,0 +1,9 @@
+## Development Environment Setup and Run
+
+- `pip install -e tools/cellxgene_census_builder`
+
+**NOTE:** When running the builder on MacOS, unpin `pyarrow` in [census builder pyproject.toml](./pyproject.toml)
+
+- `pip install -e api/python/cellxgene_census`
+
+- [Dev tools for spatial](./spatial_dev_tools/) contains scripts and notebooks to aid development and testing
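
After the two editable installs listed in the README above, it can be useful to confirm what actually landed in the environment. A minimal sketch using only the standard library; the distribution names are taken from the `pyproject.toml` files in this PR, and the `installed_versions` helper is hypothetical:

```python
from importlib import metadata


def installed_versions(distributions):
    """Map each distribution name to its installed version, or None if missing."""
    versions = {}
    for name in distributions:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


if __name__ == "__main__":
    names = ["cellxgene-census-builder", "cellxgene-census", "tiledbsoma", "pyarrow"]
    for name, version in installed_versions(names).items():
        print(f"{name}: {version or 'not installed'}")
```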
238 changes: 238 additions & 0 deletions tools/cellxgene_census_builder/manifests/slideseq_manifest.csv

Large diffs are not rendered by default.

193 changes: 193 additions & 0 deletions tools/cellxgene_census_builder/manifests/visium_manifest.csv

Large diffs are not rendered by default.

283 changes: 283 additions & 0 deletions tools/cellxgene_census_builder/manifests/visium_manifest_s3.csv

Large diffs are not rendered by default.

72 changes: 46 additions & 26 deletions tools/cellxgene_census_builder/pyproject.toml
@@ -1,9 +1,15 @@
 [build-system]
-requires = ["setuptools>=45", "setuptools_scm[toml]>=6.2"]
-build-backend = "setuptools.build_meta"
+# TODO: I've updated the build system so I can install the local version of cellxgene_census using hatch's context formatting._format_
+# This is a little half-assed, and should be either reverted or improved as we approach a release.
+# Right now the wheels/ sdists being produces may include a different set of files since there are no include/ exclude configuration values set
+# https://hatch.pypa.io/dev/config/context/
+requires = ["hatchling", "hatch-vcs"]
+build-backend = "hatchling.build"
+# requires = ["setuptools>=45", "setuptools_scm[toml]>=6.2"]
+# build-backend = "setuptools.build_meta"
 
 [project]
-name = "cellxgene_census_builder"
+name = "cellxgene-census-builder"
 dynamic = ["version"]
 description = "Build Cell Census"
 authors = [
@@ -24,40 +30,54 @@ classifiers = [
     "Programming Language :: Python :: 3.11",
 ]
 dependencies= [
-    "typing_extensions==4.10.0",
-    "pyarrow==15.0.2",
-    "pandas[performance]==2.2.1",
-    "anndata==0.10.6",
-    "numpy==1.26.4",
+    "typing_extensions>=4.10.0",
+    "pyarrow>=15.0.2",
+    "pandas[performance]>=2.2.1",
+    "anndata",
+    "numpy>=1.26.4",
     # IMPORTANT: consider TileDB format compat before advancing this version. It is important that
     # the tiledbsoma _format_ version lag that used in cellxgene-census package, ensuring that
     # recent cellxgene-census _readers_ are able to read the results of a Census build (writer).
     # The compatibility matrix is defined here:
     # https://github.com/TileDB-Inc/TileDB/blob/dev/format_spec/FORMAT_SPEC.md
-    "tiledbsoma==1.11.4",
-    "cellxgene-census==1.15.0",
-    "cellxgene-ontology-guide==1.2.0",
-    "scipy==1.12.0",
-    "fsspec[http]==2024.3.1",
-    "s3fs==2024.3.1",
-    "requests==2.32.0",
-    "aiohttp==3.10.2",
-    "Cython", # required by owlready2
-    "wheel", # required by owlready2
-    "owlready2==0.44",
-    "gitpython==3.1.42",
-    "attrs==23.2.0",
-    "psutil==5.9.8",
-    "pyyaml==6.0.1",
-    "numba==0.59.1",
-    "dask==2024.3.1",
-    "distributed==2024.3.1",
+    # TODO (spatial): tiledbsoma pin to a PyPI release is temporarily commented out in favor git commit pin
+    "tiledbsoma==1.15.3",
+    # TODO (spatial): Pin tiledbsoma dependency to an actual released version after tiledbsoma spatial code has been released
+    # "tiledbsoma @ git+https://github.com/single-cell-data/TileDB-SOMA.git@16467fa7405d59ab1763f103081318b839f87610#egg=tiledbsoma&subdirectory=apis/python/",
+    # TODO (spatial): Deal with the following line before release
+    "cellxgene-census @ {root:parent:parent:uri}/api/python/cellxgene_census",
+    "cellxgene-ontology-guide>=1.2",
+    "scipy>=1.12.0",
+    "fsspec[http]",
+    "s3fs",
+    "requests>=2.32.0",
+    "aiohttp",
+    "gitpython>=3.1.42",
+    "attrs>=23.2.0",
+    "psutil>=5.9.8",
+    "pyyaml>=6.0.1",
+    "numba>=0.59.1",
+    "dask",
+    "distributed",
+    "tiledb",
+    # Spatial testing
+    "filelock",
+    "spatialdata",
+    "pooch",
+    "pytest-xdist",
 ]
 
 [project.urls]
 homepage = "https://github.com/chanzuckerberg/cellxgene-census"
 repository = "https://github.com/chanzuckerberg/cellxgene-census"
 
+[tool.hatch.metadata]
+allow-direct-references = true # This allows us to pin dependencies to specific commits during development, e.g. tiledbsoma @ git+...
+
+[tool.hatch.version]
+source = "vcs"
+raw-options = { root = "../.." }
+
 [tool.setuptools.packages.find]
 where = ["src"]
 include = ["cellxgene_census_builder*"] # package names should match these glob patterns (["*"] by default)
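
The `{root:parent:parent:uri}` field in the `cellxgene-census` dependency uses Hatch's context formatting: `root` is the project directory, each `parent` modifier moves up one directory, and `uri` renders the result as a `file:` URI. A rough sketch of the resulting string, assuming a hypothetical checkout at `/work/cellxgene-census` on a POSIX system. This imitates the expansion and is not Hatch's implementation; the function name is made up for illustration:

```python
from pathlib import PurePosixPath
from urllib.parse import quote


def expand_root_parent_parent_uri(project_root: str) -> str:
    """Imitate Hatch's `{root:parent:parent:uri}` for a POSIX project root."""
    grandparent = PurePosixPath(project_root).parent.parent
    # Percent-encode characters that are not safe in a URI path.
    return "file://" + quote(grandparent.as_posix())


# Hypothetical checkout location; the builder lives two levels below the repo root.
root = "/work/cellxgene-census/tools/cellxgene_census_builder"
reference = f"cellxgene-census @ {expand_root_parent_parent_uri(root)}/api/python/cellxgene_census"
print(reference)
```

A direct reference of this form is only accepted because `[tool.hatch.metadata]` sets `allow-direct-references = true` in the same file.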
Empty file.