Start wilms analysis #681
Changes from 12 commits
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,8 @@ | ||
# Ignore everything by default | ||
* | ||
|
||
# Include specific files in the docker environment | ||
!/renv.lock | ||
!/requirements.txt | ||
!/environment.yml | ||
!/conda-lock.yml |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,7 @@ | ||
# Results should not be committed | ||
/results/* | ||
!/results/README.md | ||
|
||
# Ignore the scratch directory (but keep it present) | ||
/scratch/* | ||
!/scratch/.gitkeep |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,30 @@ | ||
# pull base image | ||
FROM bioconductor/tidyverse:3.19 | ||
I had a problem building this locally, but it seems that the base image might be the issue. I will need to investigate a little more.

I changed it to this:
# Base image on the Bioconductor 3.19 image
FROM bioconductor/r-ver:3.19
and built locally like this:
podman buildx build . -t openscpca/cell-type-wilms-tumor-06:latest --platform linux/amd64
and it works :)

I am building locally to make some tested recommendations regarding what to do with

Thank you, that would be great! I am initiating a renv environment from the current R session to then simplify the Dockerfile using the renv.lock (https://openscpca.readthedocs.io/en/latest/ensuring-repro/docker/docker-images/). renv::init() is taking a while... I'll continue tomorrow :)

I had trouble accessing RStudio on the image I built locally. I'm attempting to isolate the problem... but that unfortunately requires rebuilding things 😅 ⏳

Thank you for being on it :) I am having trouble implementing the renv environment and changing to the bioconductor/r-ver:3.19 base image... So far I can only open RStudio using the following Dockerfile (just a bit "cleaned" compared to the PR).
# Base image on the Bioconductor 3.19 image
FROM bioconductor/tidyverse:3.19
# Set global R options
RUN echo "options(repos = 'https://cloud.r-project.org')" > $(R --no-echo --no-save -e "cat(Sys.getenv('R_HOME'))")/etc/Rprofile.site
ENV RETICULATE_MINICONDA_ENABLED=FALSE
RUN R --no-echo --no-restore --no-save -e "install.packages('remotes')"
RUN R -e "devtools::install_github('enblacar/SCpubr')"
RUN R -e "remotes::install_github('satijalab/seurat', 'seurat5', quiet = TRUE)" # this also installs patchwork (and others)
RUN R -e "remotes::install_github('satijalab/azimuth', quiet = TRUE)" # this also installs SingleCellExperiment, DT (and others)
RUN R -e "remotes::install_github('cancerbits/DElegate')"
RUN R -e "install.packages('viridis')"
RUN R -e "install.packages('ggplotify')"
RUN R -e "BiocManager::install('edgeR')"
# make sure all R related binaries are in PATH in case we want to call them directly
ENV PATH ${R_HOME}/bin:$PATH

@jaclyn-taroni FYI, I removed from the run.sh and config files the unnecessary volume and path to data that are specific to our group. I realized they prevent the execution of the Docker image if not defined! Just in case it can help.

Last comment of the week ;) I added the renv.lock file from the RStudio session of the Docker image I am running on my machine. I am now trying to use it to build the image as described here: (https://openscpca.readthedocs.io/en/latest/ensuring-repro/docker/docker-images/), but the image build is full of errors because the BiocManager version does not match the Bioconductor version (as described in rstudio/renv#517). It will take some time of fine-tuning ⏳ But I think it's worth trying to have the Docker image in this format, as it might then be easier to share/reproduce? What do you think @jaclyn-taroni?

I agree that our end goal should be to use I am specifically struggling with what base image should be used right now. I expect folks working on this project (including all of our staff at the Data Lab) to be on Macs with Apple Silicon. My understanding is that you want to be able to develop using the RStudio Server IDE from within the running container. This is often part of my workflow, and I expect many other project participants might have this use case! (That is to say, I think we are bumping into a problem that will come up again and again in the project...) However, I'm not sure we can use So, we might instead want to use the
And until we implement
Then I believe we'd have an image we can build and push to Elastic Container Registry using GitHub Actions that can also be built and used locally on ARM machines. This compatibility seems like a good goal to me. I would have liked to test this all conclusively, but installing Azimuth is taking a very long time for me 😅 Appreciate your patience, @maud-p! |
||
|
||
|
||
# Change the default CRAN mirror | ||
RUN echo "options(repos = c(CRAN = 'https://mran.microsoft.com/snapshot/2022-02-01'), download.file.method = 'libcurl')" >> ${R_HOME}/etc/Rprofile.site | ||
maud-p marked this conversation as resolved.
|
||
|
||
|
||
# Set global R options | ||
RUN echo "options(repos = 'https://cloud.r-project.org')" > $(R --no-echo --no-save -e "cat(Sys.getenv('R_HOME'))")/etc/Rprofile.site | ||
ENV RETICULATE_MINICONDA_ENABLED=FALSE | ||
|
||
|
||
RUN R -e "devtools::install_github('enblacar/SCpubr')" | ||
RUN R --no-echo --no-restore --no-save -e "install.packages('remotes')" | ||
RUN R -e "remotes::install_github('satijalab/seurat', 'seurat5', quiet = TRUE)" # this also install patchwork (and others) | ||
RUN R -e "remotes::install_github('satijalab/azimuth', quiet = TRUE)" # this also install SingleCellExperiment, DT (and others) | ||
maud-p marked this conversation as resolved.
|
||
|
||
|
||
#RUN chmod -R a+rw ${R_HOME}/site-library # so that everyone can dynamically install more libraries within container | ||
#RUN chmod -R a+rw ${R_HOME}/library | ||
|
||
# add custom options for rstudio sessions | ||
# make sure sessions stay alive forever | ||
RUN echo "session-timeout-minutes=0" >> /etc/rstudio/rsession.conf | ||
# make sure we get rstudio server logs in the container | ||
# RUN echo $'[*]\nlog-level=warn\nlogger-type=file\n' > /etc/rstudio/logging.conf | ||
|
||
# make sure all R related binaries are in PATH in case we want to call them directly | ||
ENV PATH ${R_HOME}/bin:$PATH |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,116 @@ | ||
# Wilms Tumor Dataset Annotation (SCPCP000006) | ||
|
||
Wilms tumor (WT) is the most common pediatric kidney cancer and is characterized by pronounced intra- and inter-tumor heterogeneity. The genetic landscape of WT is very diverse within each histological component. The Children's Oncology Group (COG) classifies WT patients into two groups: favorable histology and diffuse anaplasia. Each group is composed of blastemal, epithelial, and stromal populations of cancer cells in different proportions, as well as cells from the normal kidney, mostly kidney epithelial cells, endothelial cells, immune cells, and normal stromal cells (fibroblasts). | ||
maud-p marked this conversation as resolved.
|
||
|
||
## Description | ||
|
||
Here, we first aim to annotate the Wilms Tumor snRNA-seq samples in the SCPCP000006 (n=40) dataset. To do so we will: | ||
|
||
• Provide annotations of normal cells composing the kidney, including normal kidney epithelium, endothelium, stroma and immune cells | ||
• Provide annotations of tumor cell populations that may be present in the WT samples, including blastemal, epithelial, and stromal populations of cancer cells | ||
Based on the provided annotation, we would like to additionally provide a reference of marker genes for the three cancer cell populations, which is so far lacking for the WT community. | ||
🎉 |
||
|
||
The analysis is (or will be) divided into the following steps: | ||
|
||
[x] Metadata file: compilation of a metadata file of marker genes for expected cell types that will be used for validation at a later step | ||
[ ] Script: clustering of cells across a set of parameters for a few samples | ||
[ ] Script: label transfer from the fetal kidney atlas reference using runAzimuth | ||
[ ] Script: run InferCNV | ||
[ ] Notebook: explore results from steps 2 to 4 for about 5 to 10 samples | ||
[ ] Script: compile scripts 2 to 4 in an RMarkdown file with the required adjustments and render it across all samples | ||
[ ] Notebook: explore results from step 6, integrate all samples together and annotate the dataset using (i) the metadata file, (ii) CNV information, (iii) label transfer information | ||
maud-p marked this conversation as resolved.
|
||
|
||
## Usage | ||
From RStudio, run the Rmd reports or render the R scripts (see the RStudio session setup below). Before running a script, make sure that the paths are correct. | ||
You can also simply look at the HTML reports in the notebook folder. There is no need to run anything; we try to guide you through the analysis. To see the code, use the unhide code button at the top right of each chunk! | ||
maud-p marked this conversation as resolved.
|
||
|
||
## Input files | ||
|
||
In this module, we start with the processed `SingleCellExperiment` objects from the ScPCA Portal. | ||
maud-p marked this conversation as resolved.
|
||
Data have been downloaded locally and are found in mnt_data. The mnt_data folder has to be defined in the config.yaml file or changed in the notebook accordingly. | ||
|
||
```{r paths} | ||
path_to_data <- "~/mnt_data/Wilms ALSF/SCPCP000006_2024-06-25" | ||
``` | ||
I think using the From this directory, one could run:
(This would use the defaults = download the And then your future notebooks could develop against the path (relative to the root of the repository) This requires AWS CLI setup to run as intended, so let us know if you have any questions!

AWS CLI setup and download into data worked well 👍
../../download-data.py --projects SCPCP000006
I'll change the README file, config.yaml and (future) scripts! |
||
|
||
## Output files | ||
|
||
## Marker sets | ||
|
||
This folder is a resource for later validation of the annotated cell types. | ||
|
||
### The table CellType_metadata.csv contains the following columns and information: | ||
- gene_symbol contains the symbol of the described gene, using the HUGO Gene Nomenclature | ||
- ENSEMBL_ID contains the stable identifier from the Ensembl database | ||
- cell_class is either "malignant" for marker genes specific to the malignant population, "non-malignant" for marker genes specific to non-malignant tissue, or "both" for marker genes that can be found in malignant as well as non-malignant tissue but are still informative with respect to the cell type | ||
- cell_type contains the list of cell types attributed to the marker gene | ||
- DOI contains the list of main publication identifiers supporting the choice of the marker gene | ||
- comment can be empty or contain any additional information | ||
|
||
|gene_symbol|ENSEMBL_ID|cell_class|cell_type|DOI|comment| | ||
|---|---|---|---|---|---| | ||
|WT1|ENSG00000184937|malignant|cancer_cell|10.1242/dev.153163|Tumor_suppressor_WT1_is_lost_in_some_WT_cells| | ||
I think of sc/snRNA-seq as better suited to picking up overexpression than loss. I know you've worked on these data before, so I'm just curious if you expect or have observed differences in WT1 expression in the cancer cells. Although, I see that you put:
in #635, so maybe we don't know yet!

I added the (?) for the reason you mentioned: as we are looking for loss of function, I am not sure that we can really use it for annotation. However, about 20% of Wilms tumors would have impairment of WT1, and I would expect the WT1-mutated Wilms tumors to have a specific transcriptional program. At the final step of integration of the 40 samples together, I would expect a cluster negative for WT1. Also, the normal kidney should be WT1 positive. So in this last step I think looking at WT1 would make sense. |
||
|IGF2|ENSG00000167244|malignant|cancer_cell|10.1038/ng1293-408|NA| | ||
|TP53|ENSG00000141510|malignant|anaplastic|10.1158/1078-0432.CCR-16-0985|Might_also_be_in_small_non_anaplastic_subset| | ||
From the abstract of this publication, I wonder if looking at TP53 loss/activation at a pathway level would be interesting 🤔

This is a great idea, I will implement this in the next PR: |
||
|MYCN|ENSG00000134323|malignant|anaplastic|10.18632/oncotarget.3377|Also_in_non_anaplastic_poor_outcome| | ||
|MAX|ENSG00000125952|malignant|anaplastic|10.1016/j.ccell.2015.01.002|Also_in_non_anaplastic_poor_outcome| | ||
|SIX1|ENSG00000126778|malignant|blastema|10.1016/j.ccell.2015.01.002|NA| | ||
|SIX2|ENSG00000170577|malignant|blastema|10.1016/j.ccell.2015.01.002|NA| | ||
Comment on lines +89 to +90
Similar to my TP53 comment – from a quick look at this publication, I wonder if looking at the altered expression patterns rather than the individual genes could be helpful.

In the MSigDB C3/MIR gene sets, we have gene sets for DICER and DROSHA; I could also try running enrichment for this dataset. |
||
|CITED1|ENSG00000125931|malignant|blastema|10.1593/neo.07358|Also_in_embryonic_kidney| | ||
|PTPRC|ENSG00000081237|immune|NA|10.1101/gr.273300.120|NA| | ||
|CD68|ENSG00000129226|immune|myeloid|10.1186/1746-1596-7-12|NA| | ||
|CD163|ENSG00000177575|immune|macrophage|10.1186/1746-1596-7-12|NA| | ||
|VWF|ENSG00000110799|endothelium|endothelium|10.1134/S1990747819030140|NA| | ||
|CD3E|ENSG00000198851|immune|T_cell|10.1101/gr.273300.120|NA| | ||
|MS4A1|ENSG00000156738|immune|B_cell|10.1101/gr.273300.120|NA| | ||
|FOXP3|ENSG00000049768|immune|T_cell|10.1101/gr.273300.120|Treg| | ||
|CD4|ENSG00000010610|immune|T_cell|10.1101/gr.273300.120|NA| | ||
|CD8A|ENSG00000153563|immune|T_cell|10.1101/gr.273300.120|NA| | ||
|EPCAM|ENSG00000119888|NA|epithelial|10.1016/j.stemcr.2014.05.013|epithelial_malignant_and_non_malignant| | ||
|NCAM1|ENSG00000149294|malignant|blastema|10.1016/j.stemcr.2014.05.013|might_also_be_expressed_in_non_malignant| | ||
|PODXL|ENSG00000128567|non-malignant|podocyte|10.1016/j.stem.2019.06.009|NA| | ||
|COL6A3|ENSG00000163359|malignant|mesenchymal|10.2147/OTT.S256654|might_also_be_expressed_in_non_malignant_stroma| | ||
|THY1|ENSG00000154096|malignant|mesenchymal|10.1093/hmg/ddq042|might_also_be_expressed_in_non_malignant_stroma| | ||
I'm curious why this gene differs from some of the ones outlined in the abstract and is included. I suppose I would not expect the ones outlined in the abstract to be specific to malignant cells necessarily.

Unfortunately, I don't know of a mesenchymal gene specific to mesenchymal Wilms tumor cells... For some colleagues who wanted to characterize CAF, the best approach I found was:
This is not perfect I think, but at least we should have clusters enriched in the target population. Stromal cells are really easily identified based on either a few markers or label transfer from the fetal kidney reference. They often form one single cluster. This is the reason why I didn't spend too much time adding marker genes for them, but I can add a few more mesenchymal markers and references for correctness :)

I'd defer to you since you've spent more time thinking about this problem 😄 You don't need to add them – we'll have a record of this conversation! |
||
|
||
|
||
### The table GeneticAlterations_metadata.csv contains the following columns and information: | ||
- alteration contains the number and portion of the affected chromosome | ||
- gain_loss indicates whether the genetic alteration is a gain or a loss | ||
- cell_class is "malignant" | ||
- cell_type contains the list of malignant cell types attributed to the genetic alteration (blastemal, stromal, or epithelial), or NA if none of the three histologies is more prone to the described genetic alteration | ||
- DOI contains the list of main publication identifiers supporting the choice of the genetic alteration | ||
- PMID contains the PubMed identifier of the supporting publication, or NA | ||
- comment can be empty or contain any additional information | ||
|
||
|alteration|gain_loss|cell_class|cell_type|DOI|PMID|comment| | ||
|---|---|---|---|---|---|---| | ||
|11p13|loss|malignant|NA|10.1242/dev.153163|NA|NA| | ||
|11p15|loss|malignant|NA|10.1128/mcb.9.4.1799|NA|NA| | ||
I get a 404 at: https://doi.org/10.1128/mcb.9.4.1799 – perhaps a typo in the DOI?

Oh sorry, the complete DOI is https://doi.org/10.1128/mcb.9.4.1799-1803.1989

It should be corrected in the README and CSV file :) |
||
|16q|loss|malignant|NA|NA|1317258|Associated_with_relapse| | ||
|1p|loss|malignant|NA|NA|8162576|Associated_with_relapse| | ||
|1q|gain|malignant|NA|10.1016/S0002-9440(10)63982-X|NA|Associated_with_relapse| | ||
|
||
|
||
|
||
## Software requirements | ||
|
||
To perform the analysis, run the RMarkdown script in R (version 4.4.1). | ||
The main packages used are: | ||
- Seurat version 5 | ||
- Azimuth version 5 | ||
- inferCNV | ||
- SCpubr for visualization | ||
- DT for table visualization | ||
- DElegate for differential expression analysis | ||
|
||
For complete reproducibility of the results, you can build and run the Docker image using the Dockerfile. This will allow you to work in RStudio (R version 4.4.1) from the base image bioconductor/tidyverse:3.19. | ||
|
||
In the config.yaml file, define your system-specific parameters and paths (e.g., to the data). | ||
Execute the run.sh file and open RStudio in your browser (http://localhost:8080/). | ||
By default, username = rstudio, password = wordpass. | ||
|
||
|
||
Can you add an H4 section here on Halbritter lab internal development please? I'd expect that would include how to run the script (this is taking into account some review feedback):
This hopefully helps you with your own development if, for example, you go on vacation for two weeks and come back to this! 😄 But it also helps others understand that this bash script isn't for their use.

I'll have to ask for some help in the Halbritter lab; I am adding it to my to-do list ;) |
||
|
||
|
||
## Computational resources | ||
|
||
No need to run any analysis, just open the metadata table! |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,17 @@ | ||
project_name: maudp_ScPCA | ||
project_docker: cancerbits/dockr:maudp_ScPCAOpen_podman | ||
|
||
|
||
# SYSTEM-SPECIFIC PARAMETERS (please change to match your local setup): | ||
project_root_host: $CODEBASE/OpenScPCA-analysis/ | ||
data_root_host: $DATA | ||
out_root_host: $OUT | ||
resource_root_host: $RESOURCES | ||
|
||
# PATHS AS VISIBLE WITHIN RSTUDIO (should not be changed): | ||
project_root: /home/rstudio | ||
data_root: /home/rstudio/mnt_data | ||
out_root: /home/rstudio/mnt_out | ||
resource_root: /home/rstudio/mnt_resources/ | ||
tmp_root: /home/rstudio/mnt_tmp/ | ||
|
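The config above is a flat list of `key: value` pairs that run.sh turns into `CONF_`-prefixed variables. As a rough illustration only, the same idea can be sketched in Python with the standard library. The function `parse_flat_config` is hypothetical and is not a port of `bash/parse_yaml.sh`, whose contents are not shown in this PR. Expanding `$VAR` references with `os.path.expandvars` is likewise an assumption about how the host paths are meant to be resolved.

```python
import os


def parse_flat_config(text, prefix="CONF_"):
    """Parse flat `key: value` lines into a dict of prefixed names.

    Hypothetical helper: handles only top-level scalar keys, skips blank
    lines and full-line comments, and does not strip inline comments.
    """
    config = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#"):
            continue
        # Split on the first colon only, so values such as image tags
        # ("cancerbits/dockr:tag") keep their own colons.
        key, separator, value = line.partition(":")
        if not separator:
            continue
        # Assumption: $VAR references should be filled from the environment.
        config[prefix + key.strip()] = os.path.expandvars(value.strip())
    return config


# A few lines copied from the config.yaml in this PR.
EXAMPLE = """\
project_name: maudp_ScPCA
project_docker: cancerbits/dockr:maudp_ScPCAOpen_podman
# SYSTEM-SPECIFIC PARAMETERS
data_root_host: $DATA
data_root: /home/rstudio/mnt_data
"""

if __name__ == "__main__":
    os.environ.setdefault("DATA", "/path/to/data")  # placeholder value
    for name, value in parse_flat_config(EXAMPLE).items():
        print(f"{name}={value}")
```

Variables that are not set in the environment are left as written (for example `$DATA`), which makes a missing host path easy to spot before starting the container.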
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,23 @@ | ||
gene_symbol,ENSEMBL_ID,cell_class,cell_type,DOI,comment | ||
WT1,ENSG00000184937,malignant,cancer_cell,10.1242/dev.153163,Tumor_suppressor_WT1_is_lost_in_some_WT_cells | ||
IGF2,ENSG00000167244,malignant,cancer_cell,10.1038/ng1293-408,NA | ||
TP53,ENSG00000141510,malignant,anaplastic,10.1158/1078-0432.CCR-16-0985,Might_also_be_in_small_non_anaplastic_subset | ||
MYCN,ENSG00000134323,malignant,anaplastic,10.18632/oncotarget.3377,Also_in_non_anaplastic_poor_outcome | ||
MAX,ENSG00000125952,malignant,anaplastic,10.1016/j.ccell.2015.01.002,Also_in_non_anaplastic_poor_outcome | ||
SIX1,ENSG00000126778,malignant,blastema,10.1016/j.ccell.2015.01.002,NA | ||
SIX2,ENSG00000170577,malignant,blastema,10.1016/j.ccell.2015.01.002,NA | ||
CITED1,ENSG00000125931,malignant,blastema,10.1593/neo.07358,Also_in_embryonic_kidney | ||
PTPRC,ENSG00000081237,immune,NA,10.1101/gr.273300.120,NA | ||
CD68,ENSG00000129226,immune,myeloid,10.1186/1746-1596-7-12,NA | ||
CD163,ENSG00000177575,immune,macrophage,10.1186/1746-1596-7-12,NA | ||
VWF,ENSG00000110799,endothelium,endothelium,10.1134/S1990747819030140,NA | ||
CD3E,ENSG00000198851,immune,T_cell,10.1101/gr.273300.120,NA | ||
MS4A1,ENSG00000156738,immune,B_cell,10.1101/gr.273300.120,NA | ||
FOXP3,ENSG00000049768,immune,T_cell,10.1101/gr.273300.120,Treg | ||
CD4,ENSG00000010610,immune,T_cell,10.1101/gr.273300.120,NA | ||
CD8A,ENSG00000153563,immune,T_cell,10.1101/gr.273300.120,NA | ||
EPCAM,ENSG00000119888,NA,epithelial,10.1016/j.stemcr.2014.05.013,epithelial_malignant_and_non_malignant | ||
NCAM1,ENSG00000149294,malignant,blastema,10.1016/j.stemcr.2014.05.013,might_also_be_expressed_in_non_malignant | ||
PODXL,ENSG00000128567,non-malignant,podocyte,10.1016/j.stem.2019.06.009,NA | ||
COL6A3,ENSG00000163359,malignant,mesenchymal,10.2147/OTT.S256654,might_also_be_expressed_in_non_malignant_stroma | ||
THY1,ENSG00000154096,malignant,mesenchymal,10.1093/hmg/ddq042,might_also_be_expressed_in_non_malignant_stroma |
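Since this table is meant for later validation of annotated cell types, it may help to check its header and group the markers by cell type before use. The following is a minimal sketch using only the Python standard library. The function name `markers_by_cell_type` is an assumption, and the inline sample holds three rows copied from the file above so that the block runs without the repository.

```python
import csv
import io
from collections import defaultdict

# Column names as they appear in the CSV header of this PR.
EXPECTED_COLUMNS = [
    "gene_symbol", "ENSEMBL_ID", "cell_class", "cell_type", "DOI", "comment",
]

# Three rows copied from CellType_metadata.csv, for illustration only.
SAMPLE = """\
gene_symbol,ENSEMBL_ID,cell_class,cell_type,DOI,comment
SIX1,ENSG00000126778,malignant,blastema,10.1016/j.ccell.2015.01.002,NA
SIX2,ENSG00000170577,malignant,blastema,10.1016/j.ccell.2015.01.002,NA
PODXL,ENSG00000128567,non-malignant,podocyte,10.1016/j.stem.2019.06.009,NA
"""


def markers_by_cell_type(handle):
    """Return {cell_type: [gene_symbol, ...]} after basic sanity checks."""
    reader = csv.DictReader(handle)
    if reader.fieldnames != EXPECTED_COLUMNS:
        raise ValueError(f"unexpected header: {reader.fieldnames}")
    groups = defaultdict(list)
    for row in reader:
        # Human Ensembl gene identifiers start with "ENSG".
        if not row["ENSEMBL_ID"].startswith("ENSG"):
            raise ValueError(f"bad Ensembl ID for {row['gene_symbol']}")
        groups[row["cell_type"]].append(row["gene_symbol"])
    return dict(groups)


if __name__ == "__main__":
    print(markers_by_cell_type(io.StringIO(SAMPLE)))
```

To run it against the real file, pass an open file handle instead of `io.StringIO(SAMPLE)`; the path to the CSV inside the module is not assumed here.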
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,6 @@ | ||
alteration,gain_loss,cell_class,cell_type,DOI,PMID,comment | ||
11p13,loss,malignant,NA,10.1242/dev.153163,NA,NA | ||
11p15,loss,malignant,NA,10.1128/mcb.9.4.1799,NA,NA | ||
16q,loss,malignant,NA,NA,1317258,Associated_with_relapse | ||
1p,loss,malignant,NA,NA,8162576,Associated_with_relapse | ||
1q,gain,malignant,NA,10.1016/S0002-9440(10)63982-X,NA,Associated_with_relapse |
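Each alteration in this table is supported by either a DOI or a PMID, so a small check can flag rows where both are `NA`. This is a hedged sketch, not part of the PR. The function name `rows_without_reference` is hypothetical, and the inline sample reuses three rows from the file above.

```python
import csv
import io

# Three rows copied from GeneticAlterations_metadata.csv, for illustration.
SAMPLE = """\
alteration,gain_loss,cell_class,cell_type,DOI,PMID,comment
11p13,loss,malignant,NA,10.1242/dev.153163,NA,NA
16q,loss,malignant,NA,NA,1317258,Associated_with_relapse
1q,gain,malignant,NA,10.1016/S0002-9440(10)63982-X,NA,Associated_with_relapse
"""


def rows_without_reference(handle):
    """Return the alterations that have neither a DOI nor a PMID."""
    missing = []
    for row in csv.DictReader(handle):
        # gain_loss is expected to be one of the two values used in the table.
        if row["gain_loss"] not in {"gain", "loss"}:
            raise ValueError(f"unexpected gain_loss for {row['alteration']}")
        if row["DOI"] == "NA" and row["PMID"] == "NA":
            missing.append(row["alteration"])
    return missing


if __name__ == "__main__":
    print(rows_without_reference(io.StringIO(SAMPLE)))
```

A check like this only confirms that a reference is present. It would not catch a truncated DOI such as the one discussed in the review of the 11p15 row.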
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,5 @@ | ||
# Results directory instructions | ||
|
||
Files in the results directory should not be directly committed to the repository. | ||
|
||
Instead, copy results files to an S3 bucket and add a link to the S3 location in this README file. |
Can we rename this to Did you need to add this file with force, i.e., Generally, you want to avoid using

File renamed :) Actually, I think it does not need to be in this GitHub folder. I might have been a bit too enthusiastic to have it run and quickly saved the few lines of code in a run.sh file. But I could save it somewhere else for myself only. To be honest, I quickly added this file in the browser interface, using "Add file". I didn't know about the |
Original file line number | Diff line number | Diff line change | ||||
---|---|---|---|---|---|---|
@@ -0,0 +1,29 @@ | ||||||
#!/bin/bash | ||||||
|
||||||
|
||||||
# parse config parameters: | ||||||
source bash/parse_yaml.sh | ||||||
eval $(parse_yaml config.yaml CONF_) | ||||||
I'm not sure if we expect everyone who might want to run this module to have access to

... This is starting to be a bit too high-level for me, to be honest... I'll ask him once he is back from vacation! Would this replace the run.sh and config.yaml files in the end? |
||||||
|
||||||
# ids defined in image for the rstudio user | ||||||
uid=1000 | ||||||
gid=1000 | ||||||
# subid ranges on host | ||||||
subuidSize=$(( $(podman info --format "{{ range .Host.IDMappings.UIDMap }}+{{.Size }}{{end }}" ) - 1 )) | ||||||
subgidSize=$(( $(podman info --format "{{ range .Host.IDMappings.GIDMap }}+{{.Size }}{{end }}" ) - 1 )) | ||||||
|
||||||
|
||||||
podman run -d --rm \ | ||||||
--name ${CONF_project_name}_${USER} \ | ||||||
-e RUNROOTLESS=false \ | ||||||
--uidmap $uid:0:1 --uidmap 0:1:$uid --uidmap $(($uid+1)):$(($uid+1)):$(($subuidSize-$uid)) \ | ||||||
--gidmap $gid:0:1 --gidmap 0:1:$gid --gidmap $(($gid+1)):$(($gid+1)):$(($subgidSize-$gid)) \ | ||||||
--group-add=keep-groups \ | ||||||
-p 8080:8787 \ | ||||||
-e PASSWORD=wordpass \ | ||||||
|
||||||
-e TZ=Europe/Vienna \ | ||||||
--volume=$(realpath ${CONF_project_root_host%%*##*( )}):${CONF_project_root} \ | ||||||
--volume=$(realpath ${CONF_resource_root_host%%*##*( )}):${CONF_resource_root}:ro \ | ||||||
--volume=$(realpath ${CONF_out_root_host%%*##*( )}):${CONF_out_root} \ | ||||||
--volume=$(realpath ${CONF_data_root_host%%*##*( )}):${CONF_data_root} \ | ||||||
${CONF_project_docker} |
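The three `--uidmap` (and `--gidmap`) arguments in the script split the ID range so that container ID 1000, the `rstudio` user, lands on intermediate ID 0. To my understanding of rootless Podman, that intermediate ID corresponds to the invoking host user. The arithmetic can be sketched in Python as follows. The helpers `uid_map_triplets` and `format_flags` are hypothetical, they only mirror the shell arithmetic shown above, and they do not call podman. The value 65536 is an assumed result of the script's "sum of map sizes minus 1" on a host with map sizes of 1 and 65536.

```python
def uid_map_triplets(uid, subid_size):
    """Return (container_start, intermediate_start, length) triplets.

    Mirrors the three mappings in run.sh:
      uid:0:1                    container `uid` maps to intermediate ID 0
      0:1:uid                    container 0..uid-1 shift up by one
      uid+1:uid+1:subid_size-uid remaining IDs map one-to-one
    """
    return [
        (uid, 0, 1),
        (0, 1, uid),
        (uid + 1, uid + 1, subid_size - uid),
    ]


def format_flags(option, triplets):
    """Render triplets as command-line flags, e.g. '--uidmap 1000:0:1'."""
    return [f"--{option} {c}:{i}:{n}" for c, i, n in triplets]


if __name__ == "__main__":
    # Assumed host values, not taken from the PR.
    for flag in format_flags("uidmap", uid_map_triplets(1000, 65536)):
        print(flag)
```

With these assumed values the three ranges have lengths 1, 1000, and 64536, which together cover 65537 IDs without overlap. That total is the sum of map sizes the script reads from `podman info` before subtracting 1.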
The workflow added with .github/workflows/docker_cell-type-wilms-tumor-06.yml passed: https://github.com/AlexsLemonade/OpenScPCA-analysis/actions/runs/10238728443/job/28323567327
So, now we can be confident that this builds successfully 🥳

This is great news, thank you!!!
Should we wait for a more complete Docker image before pushing it?
I would like to adapt the notebook following your comment (enrichment analysis of some gene sets) before filing the next pull request, with a Docker image (renv.lock) that should be more complete.

There's no need to wait – incremental changes are okay and expected! As you add R packages you need, I expect the renv.lock file to change, which means the Docker image will get rebuilt with the new R packages installed and pushed to ECR. That is to say that the workflow is set up to accommodate incremental changes!

great !