Wilms Tumor Dataset Annotation (SCPCP000006) _ clustering #2

Open
maud-p opened this issue Jul 31, 2024 · 0 comments


maud-p commented Jul 31, 2024

If you are filing this issue based on a specific GitHub Discussion, please link to the relevant Discussion.

AlexsLemonade#635

Describe the goals of the changes to the analysis module.

The main addition to the module is an R Markdown report for one Wilms tumor sample (dataset SCPCP000006, sample SCPCS000169). The aim is to discuss the report and possible improvements before adapting it and rendering it for all samples in the dataset.

The analysis will proceed as follows:

[0] We build a Seurat object from the count data and run the standard Seurat workflow [normalization --> reduction --> clustering].

[1] We perform quality checks to assess any QC-induced clustering (nFeature, nCount, percent.mito).

[2] We add cell cycle information. In some cell cycle phases, the transcriptional program is mostly or exclusively made up of cell cycle genes, which makes the identity of the cells difficult to determine. We expect these cells to group together in a cluster of proliferating cells.

[3] We run DElegate::FindAllMarkers2 to find markers of the different clusters and manually check whether they make sense. DElegate::FindAllMarkers2 is an improved version of Seurat::FindAllMarkers based on a pseudobulk differential expression method.

[4] We look at the specific marker genes reported in the table marker.sets/CellType_metadata.csv to check the relevance of the clustering.

[5] We plot the PCA/UMAP reductions, grouping cells by the annotations available from the DataLab (singler_, cellassign_). We expect at least the immune cells to be correctly labeled and to fall into a small set of clusters.

[6] We run label transfer (Azimuth) to transfer annotations from the human fetal kidney atlas reference. We plot the PCA/UMAP reductions grouped by these new labels. We expect them to be the most representative of the cell types in the sample.
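
To make steps [0] to [3] concrete, here is a minimal sketch of what the workflow could look like. It is not the code of the actual report. It assumes that `counts` is a gene-by-cell count matrix whose row names are gene symbols (needed for the `^MT-` pattern and for Seurat's built-in cell cycle gene lists); the function names `run_seurat_workflow` and `find_cluster_markers`, and the values chosen for `n_pcs` and `resolution`, are hypothetical. The `group_column` argument of `DElegate::FindAllMarkers2` is taken from the DElegate README as I understand it and should be checked against the installed version.

```r
library(Seurat)

# Steps [0] to [2]: build the object, add QC and cell cycle metadata, cluster.
# Assumption: rownames(counts) are gene symbols, not Ensembl IDs.
run_seurat_workflow <- function(counts, n_pcs = 30, resolution = 0.8) {
  srat <- CreateSeuratObject(counts = counts)

  # QC covariate; nFeature_RNA and nCount_RNA are added by CreateSeuratObject()
  srat[["percent.mito"]] <- PercentageFeatureSet(srat, pattern = "^MT-")

  # normalization --> reduction --> clustering
  srat <- NormalizeData(srat)
  srat <- FindVariableFeatures(srat)
  srat <- ScaleData(srat)
  srat <- RunPCA(srat, npcs = n_pcs, verbose = FALSE)
  srat <- FindNeighbors(srat, dims = 1:n_pcs)
  srat <- FindClusters(srat, resolution = resolution)
  srat <- RunUMAP(srat, dims = 1:n_pcs)

  # Cell cycle scores and a "Phase" column, using the gene lists shipped with Seurat
  srat <- CellCycleScoring(
    srat,
    s.features = cc.genes.updated.2019$s.genes,
    g2m.features = cc.genes.updated.2019$g2m.genes
  )
  srat
}

# Step [3]: pseudobulk cluster markers (requires the DElegate package).
find_cluster_markers <- function(srat) {
  DElegate::FindAllMarkers2(srat, group_column = "seurat_clusters")
}
```

With an object built this way, the checks in steps [1] and [2] could be plotted with, for example, `VlnPlot(srat, features = c("nFeature_RNA", "nCount_RNA", "percent.mito"), group.by = "seurat_clusters", pt.size = 0)` and `DimPlot(srat, reduction = "umap", group.by = "Phase")`.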

What will your pull request contain?

  • The pull request will contain the Rmd file 01-clustering_SCPCS000169.Rmd in the cell-type-wilms-tumor-06 folder and the HTML report in the notebook folder.

  • The Dockerfile required to build the Docker image and start RStudio will be updated with the required packages.

  • We will add the clinical data for the dataset (SCPCP000006_metadata.tsv) to better track and interpret the sample, if needed.

Will you require additional software beyond what is already in the analysis module?

We will continue working with RStudio and will keep the Dockerfile updated with any additional packages.

Will you require different computational resources beyond what the analysis module already uses?

I work on my own machine. I am not sure how to answer this question, but here is a screenshot of the memory usage report from my R session, in case it helps.
[image: memory usage report of the R session]

If known, when do you expect to file the pull request?

~02/08/2024
