Wilms Tumor Dataset Annotation (SCPCP000006) _ clustering #2

maud-p · 2024-07-31T16:17:07Z

If you are filing this issue based on a specific GitHub Discussion, please link to the relevant Discussion.

Describe the goals of the changes to the analysis module.

The main addition to the module is an R Markdown report for one Wilms tumor sample (dataset SCPCP000006, sample SCPCS000169). The aim is to discuss the report and possible improvements before adapting it and rendering it for all samples in the dataset.

The analysis will proceed as follows:

[0] We build a Seurat object from the count data and run the standard Seurat workflow [normalization --> reduction --> clustering].

[1] We perform quality checks to assess whether any clustering is driven by QC metrics (nFeature, nCount, percent.mito).

[2] We add cell cycle information because, in some cell cycle phases, the transcriptional program is mostly or exclusively related to cell cycle genes and the identity of the cells is difficult to determine. We expect these cells to group together in a cluster of proliferating cells.

[3] We run DElegate::FindAllMarkers2 to find markers of the different clusters and manually check whether they make sense. DElegate::FindAllMarkers2 is an improved version of Seurat::FindAllMarkers based on a pseudobulk differential expression method.

[4] We look at specific marker genes that we reported in the table marker.sets/CellType_metadata.csv to check the relevance of the clustering.

[5] We plot the PCA/UMAP reductions, grouping cells by the annotations available from the DataLab (singler_, cellassign_). We expect at least the immune cells to be correctly labeled and to fall into a small set of clusters.

[6] We run label transfer (Azimuth) to transfer annotations from the human fetal kidney atlas reference. We plot the PCA/UMAP reductions, grouping cells by the transferred labels. We expect this annotation to be the most representative of the cell types in the sample.
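
To make the check in step [1] concrete, here is a minimal, language-agnostic sketch written in plain Python rather than with the R/Seurat code the report itself uses. It compares per-cluster medians of the QC metrics against the global median and flags clusters that deviate strongly. The per-cell table, the function names, and the 2-fold threshold are all hypothetical and only serve as an illustration; they are not taken from the sample or from any library.

```python
from collections import defaultdict
from statistics import median

# Hypothetical per-cell QC table: (cluster, nFeature, nCount, percent_mito).
# The values are made up for illustration, not taken from SCPCS000169.
cells = [
    ("0", 2500, 8000, 3.1),
    ("0", 2700, 9100, 2.8),
    ("0", 2400, 7600, 3.5),
    ("1", 2600, 8500, 2.9),
    ("1", 2300, 7900, 3.3),
    ("1", 2800, 9400, 3.0),
    ("2", 600, 1200, 18.0),
    ("2", 550, 1100, 21.5),
    ("2", 700, 1500, 16.2),
]

METRICS = ("nFeature", "nCount", "percent_mito")


def per_cluster_medians(cells):
    """Return {cluster: {metric: median value}} for every QC metric."""
    grouped = defaultdict(lambda: {m: [] for m in METRICS})
    for cluster, *values in cells:
        for metric, value in zip(METRICS, values):
            grouped[cluster][metric].append(value)
    return {
        cluster: {m: median(v) for m, v in metrics.items()}
        for cluster, metrics in grouped.items()
    }


def flag_qc_driven(cells, fold=2.0):
    """Flag clusters whose median differs from the global median by more than `fold`.

    The 2-fold default is an arbitrary assumption chosen for this sketch.
    """
    global_med = {
        m: median(row[i + 1] for row in cells) for i, m in enumerate(METRICS)
    }
    flagged = {}
    for cluster, meds in per_cluster_medians(cells).items():
        off = [
            m
            for m in METRICS
            if meds[m] > fold * global_med[m] or meds[m] < global_med[m] / fold
        ]
        if off:
            flagged[cluster] = off
    return flagged


if __name__ == "__main__":
    # Clusters listed here would deserve a closer look before annotation.
    print(flag_qc_driven(cells))
```

In this toy table only cluster "2" is flagged, because it combines low feature and count numbers with a high mitochondrial fraction. In the report, the equivalent check is done visually on the clusters.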

What will your pull request contain?

The pull request will contain the R Markdown file 01-clustering_SCPCS000169.Rmd in the cell-type-wilms-tumor-06 folder and the HTML report in the notebook folder.
The Dockerfile required to build the Docker image and start RStudio will be updated with the required packages.
We will add clinical data from the dataset SCPCP000006_metadata.tsv to better track and understand the sample if needed.

Will you require additional software beyond what is already in the analysis module?

We will continue working with RStudio and keep the Dockerfile updated with any additional packages.

Will you require different computational resources beyond what the analysis module already uses?

I work on my own machine. I am not sure how to answer that question, but here is a screenshot of the memory usage report of my R session, in case it helps.

If known, when do you expect to file the pull request?

~02/08/2024
