Unable to link peaks to genes #1855

Open
zgb963 opened this issue Nov 26, 2024 · 3 comments
Labels
bug Something isn't working

Comments


zgb963 commented Nov 26, 2024

Hello,

I've been following the "Joint RNA and ATAC analysis: 10x multiomic" tutorial on the Stuart Lab website to process output from the 10x Genomics Cell Ranger ARC pipeline. I'm trying to run the steps in the "Link peaks to genes" section on one of my sample Seurat objects for one or two genes, so that I can make coverage plots, but I keep getting errors and I'm not sure how to fix them. Any insight on how to fix this would be appreciated, thanks.

# change from RNAseq assay to ATACseq assay
DefaultAssay(liftoff_1_MI5_V1_SO) <- "ATAC"

# first compute the GC content for each peak (worked! Got a warning though)
liftoff_1_MI5_V1_SO <- RegionStats(liftoff_1_MI5_V1_SO, genome = BSgenome.Mfascicularis.NCBI.5.0)
# Warning: Not all seqlevels present in supplied genome

liftoff_1_MI5_V1_SO <- LinkPeaks(
  object = liftoff_1_MI5_V1_SO,
  peak.assay = "ATAC",
  expression.assay = "SCT",
  genes.use = c("OSTN", "FOS")
)

Processed 35548 groups out of 35548. 100% done. Time elapsed: 3s. ETA: 0s.
Testing 1 genes and 210517 peaks
  |                                                  | 0 % ~calculating  Warning: Requested more features than present in supplied data.
            Returning 0 featuresError in density.default(x = query.feature[[featmatch]], kernel = "gaussian",  :
argument 'x' must be numeric

# > liftoff_1_MI5_V1_SO
# An object of class Seurat 
# 271812 features across 8787 samples within 3 assays 
# Active assay: ATAC (210519 features, 210519 variable features)
#  2 layers present: counts, data
#  2 other assays present: RNA, SCT
#  5 dimensional reductions calculated: pca, lsi, umap, umap.atac, wnn.umap


# check if genes present in SCT expression assay
# > all(c("OSTN", "FOS") %in% rownames(liftoff_1_MI5_V1_SO[["SCT"]]))
# [1] TRUE


# Check the peak identifiers in the ATAC assay
# > head(rownames(liftoff_1_MI5_V1_SO[["ATAC"]]))
# [1] "chr1-30919-31783" "chr1-32491-33425" "chr1-36538-37474" "chr1-54567-55405" "chr1-57748-58645" "chr1-61338-62189"

I then tried linking peaks for just one gene (to make a coverage plot), but I got the same error.

liftoff_1_MI5_V1_SO <- LinkPeaks(
  object = liftoff_1_MI5_V1_SO,
  peak.assay = "ATAC",
  expression.assay = "SCT",
  genes.use = c("OSTN")
)

Error in density.default(x = query.feature[[featmatch]], kernel = "gaussian",  : 
  argument 'x' must be numeric

When I set genes.use to NULL, so that the genes are taken from the expression assay, I get a similar error:

liftoff_1_MI5_V1_SO <- LinkPeaks(
  object = liftoff_1_MI5_V1_SO,
  peak.assay = "ATAC",
  expression.assay = "SCT",
  genes.use = NULL
)

# Testing 23041 genes and 210517 peaks
#   |                                                  | 0 % ~calculating  Warning: Requested more features than present in supplied data.
#             Returning 0 featuresError in density.default(x = query.feature[[featmatch]], kernel = "gaussian",  : 
#   argument 'x' must be numeric

zgb963 added the bug label Nov 26, 2024
nh-codem commented

Have you encountered an issue where peaks and their linked genes are located on different chromosomes? @zgb963


zgb963 commented Dec 9, 2024

> Have you encountered an issue where peaks and their linked genes are located on different chromosomes? @zgb963

Hi @nh-codem, I'm not sure. How can I check for this? Have you run into this issue before?
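One possible way to check this, sketched on toy data: take the chromosome of each peak from its name and compare it with the chromosome of each gene of interest. Everything below is illustrative. The peak names, the placeholder genes `GENE_A` and `GENE_B`, and their chromosome assignments are made up; on a real object the peak names could come from `rownames(liftoff_1_MI5_V1_SO[["ATAC"]])`, and the gene chromosomes would have to be looked up in the annotation stored in the assay.

```r
# Toy inputs (made up). The second gene uses a non-UCSC-style chromosome
# name to show what a naming mismatch looks like in the summary.
peaks <- c("chr1-30919-31783", "chr1-32491-33425",
           "chr2-1000-1500", "chr3-200-900")
gene_chrom <- c(GENE_A = "chr2", GENE_B = "7")

# Chromosome of each peak: everything before the first "-".
peak_chrom <- sub("-.*$", "", peaks)

# For each gene, count the peaks on the same chromosome and on other ones.
summary_tbl <- data.frame(
  gene = names(gene_chrom),
  chrom = unname(gene_chrom),
  same_chrom_peaks = vapply(gene_chrom, function(ch) sum(peak_chrom == ch),
                            integer(1), USE.NAMES = FALSE),
  other_chrom_peaks = vapply(gene_chrom, function(ch) sum(peak_chrom != ch),
                             integer(1), USE.NAMES = FALSE)
)
print(summary_tbl)
```

A gene with zero peaks on its own chromosome, like `GENE_B` here, would suggest that the gene annotation and the peak names use different chromosome naming conventions.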


zgb963 commented Dec 13, 2024

Hello @timoast, I was able to fix the warning from the RegionStats function by changing the genome's seqlevel (chromosome) naming convention to UCSC style, so that it matches the UCSC-style annotation stored in the sample Seurat object.

# try this and then rerun RegionStats
seqlevelsStyle(BSgenome.Mfascicularis.NCBI.5.0) <- "UCSC"


# check seqlevels again
seqlevels(BSgenome.Mfascicularis.NCBI.5.0)

#     [1] "chr1"                      "chr2"                      "chr3"                      "chr4"                      "chr5"                      "chr6
#   [997] "chrUn_AQIA01074376"        "chrUn_AQIA01074377"        "chrUn_AQIA01074378"        "chrUn_AQIA01074379"       
#   [ reached getOption("max.print") -- omitted 6601 entries ]


# change default assay in sample 1 to ATAC
DefaultAssay(liftoff_1_MI5_V1_SO) <- "ATAC"

# rerun this with the modified genome object (didn't get warning)
liftoff_1_MI5_V1_SO <- RegionStats(liftoff_1_MI5_V1_SO, genome = BSgenome.Mfascicularis.NCBI.5.0)
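To confirm that the renaming covers every peak, and not only that the warning has gone away, the chromosome names in the peak set can be compared directly with the genome's sequence names. This is a minimal sketch on toy vectors. The "before" names in `genome_before` are invented to stand in for a non-UCSC style, and in practice the genome names would come from `seqlevels(BSgenome.Mfascicularis.NCBI.5.0)` as in the snippet above.

```r
# Toy stand-ins (made up): peak names from the ATAC assay, and the genome's
# sequence names before and after switching to UCSC style.
peaks <- c("chr1-30919-31783", "chr2-500-900", "chrUn_AQIA01074376-10-400")
genome_before <- c("1", "2", "AQIA01074376")
genome_after <- c("chr1", "chr2", "chrUn_AQIA01074376")

# Unique chromosome/contig names used by the peaks.
peak_chrom <- unique(sub("-.*$", "", peaks))

# Sequences that have peaks but are absent from the genome object.
missing_before <- setdiff(peak_chrom, genome_before)
missing_after <- setdiff(peak_chrom, genome_after)

print(missing_before)
print(missing_after)
```

An empty `missing_after` would mean every peak sits on a sequence the genome object knows about. Any names left over would be the peaks for which GC content cannot be computed.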

However, I'm still unable to run the `LinkPeaks` function without getting the same error.

# check that ATAC is the active assay in sample 1
liftoff_1_MI5_V1_SO


# An object of class Seurat 
# 271812 features across 8211 samples within 3 assays 
# Active assay: ATAC (210519 features, 210519 variable features)
#  2 layers present: counts, data
#  2 other assays present: RNA, SCT
#  5 dimensional reductions calculated: pca, lsi, umap, umap.atac, wnn.umap
# 


liftoff_1_MI5_V1_SO <- LinkPeaks(
  object = liftoff_1_MI5_V1_SO,
  peak.assay = "ATAC",
  expression.assay = "SCT",
  genes.use = c("OSTN")
)

# Testing 1 genes and 210505 peaks
#   |                                                  | 0 % ~calculating  Warning: Requested more features than present in supplied data.
#             Returning 0 featuresError in density.default(x = query.feature[[featmatch]], kernel = "gaussian",  : 
#   argument 'x' must be numeric


# testing 210,505 peaks but there are 210,519 variable features in object?


# try link peaks again without specifying a gene?



# > liftoff_1_MI5_V1_SO <- LinkPeaks(
# +     object = liftoff_1_MI5_V1_SO,
# +     peak.assay = "ATAC",
# +     expression.assay = "SCT")
# Testing 22897 genes and 210505 peaks
#   |                                                  | 0 % ~calculating  Error in density.default(x = query.feature[[featmatch]], kernel = "gaussian",  : 
#   argument 'x' must be numeric
# In addition: Warning message:
# In MatchRegionStats(meta.feature = meta.use, query.feature = pk.use[x,  :
#   Requested more features than present in supplied data.
#             Returning 0 features
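On the question of 210,505 tested peaks versus 210,519 features in the object: one possible explanation, not confirmed in this thread, is that `LinkPeaks` only tests peaks detected in more than `min.cells` cells, so a handful of rarely detected peaks are dropped before testing. That would be consistent with the tested count changing from 210,517 to 210,505 as the object went from 8,787 to 8,211 cells. The sketch below shows that kind of filter on a made-up matrix. The threshold of 2 is purely illustrative, and the strict `>` comparison is an assumption.

```r
# Toy peak-by-cell count matrix (made up): 4 peaks x 6 cells.
counts <- matrix(
  c(1, 0, 2, 1, 0, 3,   # peak1: detected in 4 cells
    0, 0, 0, 1, 0, 0,   # peak2: detected in 1 cell
    0, 0, 0, 0, 0, 0,   # peak3: detected in 0 cells
    2, 1, 1, 1, 4, 1),  # peak4: detected in 6 cells
  nrow = 4, byrow = TRUE,
  dimnames = list(paste0("peak", 1:4), paste0("cell", 1:6))
)

min_cells <- 2  # illustrative threshold, not the function's default

# Number of cells in which each peak has at least one count.
cells_per_peak <- rowSums(counts > 0)

# Peaks that would survive a "detected in more than min_cells cells" filter.
peaks_kept <- names(cells_per_peak)[cells_per_peak > min_cells]
print(peaks_kept)
```

Running the same count on the real ATAC count matrix and comparing `length(peaks_kept)` with the number reported by `LinkPeaks` would show whether this filter accounts for the difference.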
