error reading arrow file #3

Open
pchiang5 opened this issue Oct 13, 2022 · 4 comments · May be fixed by #4

Comments

@pchiang5

Hello,

I encountered an error when reading an arrow file with the following call:

adata = ArchR_h5ad.read_arrow('S5.arrow', use_matrix="TileMatrix")

Could this be caused by the large number of variables with 0 counts in the TileMatrix? Thank you.

Reading ArchR TileMatrix to AnnData
Chromosomes: 0%| | 0/23 [00:00<?, ?it/s]

Traceback (most recent call last):
  File "", line 1, in
  File "/home/pc/miniconda3/envs/snap/lib/python3.10/site-packages/ArchR_h5ad/_main/_read_arrow_to_adata.py", line 66, in _read_arrow_to_adata
    arrow.to_adata(use_matrix=use_matrix, write_h5ad=write_h5ad)
  File "/home/pc/miniconda3/envs/snap/lib/python3.10/site-packages/ArchR_h5ad/_main/_Arrow.py", line 52, in to_adata
    self._adata = _compose_anndata(DataDict=self._DataDict,
  File "/home/pc/miniconda3/envs/snap/lib/python3.10/site-packages/ArchR_h5ad/_compose_adata/_compose_anndata.py", line 27, in _compose_anndata
    adata = _add_obs_var(adata, metadata, feature_df)
  File "/home/pc/miniconda3/envs/snap/lib/python3.10/site-packages/ArchR_h5ad/_compose_adata/_add_obs_var.py", line 49, in _add_obs_var
    adata.var = _add_var(feature_df)
  File "/home/pc/miniconda3/envs/snap/lib/python3.10/site-packages/anndata/_core/anndata.py", line 882, in var
    self._set_dim_df(value, "var")
  File "/home/pc/miniconda3/envs/snap/lib/python3.10/site-packages/anndata/_core/anndata.py", line 798, in _set_dim_df
    value_idx = self._prep_dim_index(value.index, attr)
  File "/home/pc/miniconda3/envs/snap/lib/python3.10/site-packages/anndata/_core/anndata.py", line 810, in _prep_dim_index
    raise ValueError(
ValueError: Length of passed value for var_names is 11627605, but this AnnData has shape: (1590, 6062095)
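
The final line of the traceback says the feature table handed to `adata.var` has more rows than the matrix has columns. The following is a minimal sketch of that kind of consistency check, written as a simplified stand-in and not copied from anndata; `check_var_length` is a hypothetical helper introduced only for illustration.

```python
def check_var_length(n_var_names, shape):
    """Raise if the number of feature names differs from the matrix width.

    Hypothetical helper: a simplified stand-in for the length check that
    produced the error above, not anndata's actual implementation.
    """
    n_obs, n_vars = shape
    if n_var_names != n_vars:
        raise ValueError(
            f"Length of passed value for var_names is {n_var_names}, "
            f"but this AnnData has shape: {shape}"
        )


# Numbers taken from the TileMatrix error message in this issue.
try:
    check_var_length(11627605, (1590, 6062095))
    message = None
except ValueError as err:
    message = str(err)

print(message)
```

Under this reading, the cell count (1590) is not in question; only the feature axis disagrees.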

@mvinyard mvinyard self-assigned this Oct 14, 2022
@mvinyard
Owner

@pchiang5 thanks for your message. It's hard for me to say without more information. Can you tell me how many cells and features you expect, based on the TileMatrix as seen through the ArchR API? Does the same function succeed for the GeneScoreMatrix? For example:

adata = ArchR_h5ad.read_arrow('S5.arrow', use_matrix="GeneScoreMatrix")

@pchiang5
Author

Hi @mvinyard

Using GeneScoreMatrix produced a similar error (please see below). The arrow file was created with custom genome/annotation files covering both human (chr) and mouse (mchr) sources. I wonder whether the unconventional mchr naming is the culprit: the number of chromosomes to read should be greater than 23, and the AnnData object holds only about half as many tile/gene-score features as the var_names read from the arrow file.

adata = ArchR_h5ad.read_arrow('S5.arrow', use_matrix="TileMatrix")

Reading ArchR TileMatrix to AnnData
Chromosomes: 0%| | 0/23 [00:00<?, ?it/s]
Traceback (most recent call last):
  File "", line 1, in
  File "/home/pc/miniconda3/envs/snap/lib/python3.10/site-packages/ArchR_h5ad/_main/_read_arrow_to_adata.py", line 66, in _read_arrow_to_adata
    arrow.to_adata(use_matrix=use_matrix, write_h5ad=write_h5ad)
  File "/home/pc/miniconda3/envs/snap/lib/python3.10/site-packages/ArchR_h5ad/_main/_Arrow.py", line 52, in to_adata
    self._adata = _compose_anndata(DataDict=self._DataDict,
  File "/home/pc/miniconda3/envs/snap/lib/python3.10/site-packages/ArchR_h5ad/_compose_adata/_compose_anndata.py", line 27, in _compose_anndata
    adata = _add_obs_var(adata, metadata, feature_df)
  File "/home/pc/miniconda3/envs/snap/lib/python3.10/site-packages/ArchR_h5ad/_compose_adata/_add_obs_var.py", line 49, in _add_obs_var
    adata.var = _add_var(feature_df)
  File "/home/pc/miniconda3/envs/snap/lib/python3.10/site-packages/anndata/_core/anndata.py", line 882, in var
    self._set_dim_df(value, "var")
  File "/home/pc/miniconda3/envs/snap/lib/python3.10/site-packages/anndata/_core/anndata.py", line 798, in _set_dim_df
    value_idx = self._prep_dim_index(value.index, attr)
  File "/home/pc/miniconda3/envs/snap/lib/python3.10/site-packages/anndata/_core/anndata.py", line 810, in _prep_dim_index
    raise ValueError(
ValueError: Length of passed value for var_names is 11627605, but this AnnData has shape: (1590, 6062095)

adata = ArchR_h5ad.read_arrow('S5.arrow', use_matrix="GeneScoreMatrix")

Reading ArchR GeneScoreMatrix to AnnData
Chromosomes: 0%| | 0/23 [00:00<?, ?it/s]
Traceback (most recent call last):
  File "", line 1, in
  File "/home/pc/miniconda3/envs/snap/lib/python3.10/site-packages/ArchR_h5ad/_main/_read_arrow_to_adata.py", line 66, in _read_arrow_to_adata
    arrow.to_adata(use_matrix=use_matrix, write_h5ad=write_h5ad)
  File "/home/pc/miniconda3/envs/snap/lib/python3.10/site-packages/ArchR_h5ad/_main/_Arrow.py", line 52, in to_adata
    self._adata = _compose_anndata(DataDict=self._DataDict,
  File "/home/pc/miniconda3/envs/snap/lib/python3.10/site-packages/ArchR_h5ad/_compose_adata/_compose_anndata.py", line 27, in _compose_anndata
    adata = _add_obs_var(adata, metadata, feature_df)
  File "/home/pc/miniconda3/envs/snap/lib/python3.10/site-packages/ArchR_h5ad/_compose_adata/_add_obs_var.py", line 49, in _add_obs_var
    adata.var = _add_var(feature_df)
  File "/home/pc/miniconda3/envs/snap/lib/python3.10/site-packages/anndata/_core/anndata.py", line 882, in var
    self._set_dim_df(value, "var")
  File "/home/pc/miniconda3/envs/snap/lib/python3.10/site-packages/anndata/_core/anndata.py", line 798, in _set_dim_df
    value_idx = self._prep_dim_index(value.index, attr)
  File "/home/pc/miniconda3/envs/snap/lib/python3.10/site-packages/anndata/_core/anndata.py", line 810, in _prep_dim_index
    raise ValueError(
ValueError: Length of passed value for var_names is 115928, but this AnnData has shape: (1590, 60058)
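
The "around half" observation can be checked directly from the two error messages. The sketch below only does arithmetic on the numbers reported in this thread; the dictionary layout and the `column_fraction` helper are illustrative assumptions, not part of ArchR_h5ad.

```python
# (length of var_names, number of matrix columns), copied from the
# two ValueError messages in this issue.
reported = {
    "TileMatrix": (11627605, 6062095),
    "GeneScoreMatrix": (115928, 60058),
}


def column_fraction(n_var_names, n_matrix_cols):
    """Fraction of the feature names that have a matching matrix column."""
    return n_matrix_cols / n_var_names


fractions = {
    name: column_fraction(n_names, n_cols)
    for name, (n_names, n_cols) in reported.items()
}

for name, fraction in fractions.items():
    print(f"{name}: {fraction:.3f} of var_names have a matrix column")
```

Both fractions come out at roughly 0.52. That is consistent with the matrix being built from about half of the features listed in the arrow file, though it does not by itself show which chromosomes were skipped.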

@mvinyard
Owner

mvinyard commented Oct 14, 2022

@pchiang5 Thanks for this additional info. I think you are right; the unconventional concatenated human/mouse organization is most likely causing the issue. I see two options:

  1. Separate the objects in ArchR before conversion.
  2. If you can supply an example dataset that reproduces the error, I could look into extending the tool to handle this case. Without further investigation, I can't say whether that is trivial or more involved. Let me know what you think.
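
To illustrate what separating by species involves, here is a minimal sketch that partitions chromosome labels by prefix. It is not ArchR or ArchR_h5ad code: the `split_by_species` helper and the example labels are hypothetical, based only on the chr/mchr naming mentioned in this thread, and the actual split in option 1 would be done in ArchR itself.

```python
from collections import defaultdict


def split_by_species(seqnames, human_prefix="chr", mouse_prefix="mchr"):
    """Group chromosome labels into human, mouse, and other.

    Hypothetical helper. The mouse prefix is tested first so the
    grouping does not depend on how the two prefixes relate.
    """
    groups = defaultdict(list)
    for name in seqnames:
        if name.startswith(mouse_prefix):
            groups["mouse"].append(name)
        elif name.startswith(human_prefix):
            groups["human"].append(name)
        else:
            groups["other"].append(name)
    return dict(groups)


# Made-up labels following the chr/mchr convention described above.
seqnames = ["chr1", "chr2", "chrX", "mchr1", "mchr2", "mchrX"]
groups = split_by_species(seqnames)
print(groups)
```

A reader that only recognises labels starting with chr would see the human half here and miss the mchr entries. That would fit the roughly halved feature counts, but it remains a hypothesis until checked against the tool's code.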

@pchiang5
Author

@mvinyard

Thank you for your willingness to help. I sent you the dataset in my previous reply. Please let me know if you did not receive the link.

@mvinyard mvinyard linked a pull request Oct 16, 2022 that will close this issue