DeepMased - stalled out on finding features #8
Comments
Hi Teal. Can you post the command that you're running? |
Hi! I ran this command: python3 -m DeepMAsED features bam_fasta_table.tab |
Maybe it's an issue with tensorflow freezing/stalling? I did have some issues with that for certain versions. Which version are you running? Does running another DL model with that version of TF work with your setup? Check out this issue for some potential solutions: tensorflow/tensorflow#32017 |
When I run [ python3 -c 'import tensorflow as tf; print(tf.__version__)' ]
I get: 2.3.1
Our system admin had to do some voodoo to get deepmased to work:
comics
python3 -m venv venv --system-site-packages --prompt deepmased
. venv/bin/activate
pip install --upgrade pip
pip install tensorflow-cpu
That last command outputs:
(deepmased) (omics) tealfurn@alpena:~/MUSCATO/URDB_PAPER/SAMPLE_53600$ pip install tensorflow-cpu
Requirement already satisfied: tensorflow-cpu in ./venv/lib/python3.7/site-packages (2.3.1)
Requirement already satisfied: google-pasta>=0.1.8 in ./venv/lib/python3.7/site-packages (from tensorflow-cpu) (0.2.0)
Requirement already satisfied: opt-einsum>=2.3.2 in ./venv/lib/python3.7/site-packages (from tensorflow-cpu) (3.3.0)
Requirement already satisfied: absl-py>=0.7.0 in ./venv/lib/python3.7/site-packages (from tensorflow-cpu) (0.11.0)
Requirement already satisfied: wrapt>=1.11.1 in ./venv/lib/python3.7/site-packages (from tensorflow-cpu) (1.12.1)
Requirement already satisfied: tensorboard<3,>=2.3.0 in ./venv/lib/python3.7/site-packages (from tensorflow-cpu) (2.3.0)
Requirement already satisfied: grpcio>=1.8.6 in ./venv/lib/python3.7/site-packages (from tensorflow-cpu) (1.33.2)
Requirement already satisfied: wheel>=0.26 in /usr/lib/python3/dist-packages (from tensorflow-cpu) (0.32.3)
Requirement already satisfied: astunparse==1.6.3 in ./venv/lib/python3.7/site-packages (from tensorflow-cpu) (1.6.3)
Requirement already satisfied: h5py<2.11.0,>=2.10.0 in ./venv/lib/python3.7/site-packages (from tensorflow-cpu) (2.10.0)
Requirement already satisfied: gast==0.3.3 in ./venv/lib/python3.7/site-packages (from tensorflow-cpu) (0.3.3)
Requirement already satisfied: termcolor>=1.1.0 in ./venv/lib/python3.7/site-packages (from tensorflow-cpu) (1.1.0)
Requirement already satisfied: tensorflow-estimator<2.4.0,>=2.3.0 in ./venv/lib/python3.7/site-packages (from tensorflow-cpu) (2.3.0)
Requirement already satisfied: numpy<1.19.0,>=1.16.0 in /usr/lib/python3/dist-packages (from tensorflow-cpu) (1.16.2)
Requirement already satisfied: keras-preprocessing<1.2,>=1.1.1 in ./venv/lib/python3.7/site-packages (from tensorflow-cpu) (1.1.2)
Requirement already satisfied: protobuf>=3.9.2 in ./venv/lib/python3.7/site-packages (from tensorflow-cpu) (3.13.0)
Requirement already satisfied: six>=1.12.0 in /usr/lib/python3/dist-packages (from tensorflow-cpu) (1.12.0)
Requirement already satisfied: google-auth-oauthlib<0.5,>=0.4.1 in ./venv/lib/python3.7/site-packages (from tensorboard<3,>=2.3.0->tensorflow-cpu) (0.4.2)
Requirement already satisfied: google-auth<2,>=1.6.3 in ./venv/lib/python3.7/site-packages (from tensorboard<3,>=2.3.0->tensorflow-cpu) (1.23.0)
Requirement already satisfied: markdown>=2.6.8 in ./venv/lib/python3.7/site-packages (from tensorboard<3,>=2.3.0->tensorflow-cpu) (3.3.3)
Requirement already satisfied: requests<3,>=2.21.0 in /usr/lib/python3/dist-packages (from tensorboard<3,>=2.3.0->tensorflow-cpu) (2.21.0)
Requirement already satisfied: setuptools>=41.0.0 in ./venv/lib/python3.7/site-packages (from tensorboard<3,>=2.3.0->tensorflow-cpu) (50.3.2)
Requirement already satisfied: werkzeug>=0.11.15 in /usr/lib/python3/dist-packages (from tensorboard<3,>=2.3.0->tensorflow-cpu) (0.14.1)
Requirement already satisfied: tensorboard-plugin-wit>=1.6.0 in ./venv/lib/python3.7/site-packages (from tensorboard<3,>=2.3.0->tensorflow-cpu) (1.7.0)
Requirement already satisfied: requests-oauthlib>=0.7.0 in ./venv/lib/python3.7/site-packages (from google-auth-oauthlib<0.5,>=0.4.1->tensorboard<3,>=2.3.0->tensorflow-cpu) (1.3.0)
Requirement already satisfied: cachetools<5.0,>=2.0.0 in ./venv/lib/python3.7/site-packages (from google-auth<2,>=1.6.3->tensorboard<3,>=2.3.0->tensorflow-cpu) (4.1.1)
Requirement already satisfied: pyasn1-modules>=0.2.1 in ./venv/lib/python3.7/site-packages (from google-auth<2,>=1.6.3->tensorboard<3,>=2.3.0->tensorflow-cpu) (0.2.8)
Requirement already satisfied: rsa<5,>=3.1.4; python_version >= "3.5" in ./venv/lib/python3.7/site-packages (from google-auth<2,>=1.6.3->tensorboard<3,>=2.3.0->tensorflow-cpu) (4.6)
Requirement already satisfied: importlib-metadata; python_version < "3.8" in ./venv/lib/python3.7/site-packages (from markdown>=2.6.8->tensorboard<3,>=2.3.0->tensorflow-cpu) (2.0.0)
Requirement already satisfied: oauthlib>=3.0.0 in ./venv/lib/python3.7/site-packages (from requests-oauthlib>=0.7.0->google-auth-oauthlib<0.5,>=0.4.1->tensorboard<3,>=2.3.0->tensorflow-cpu) (3.1.0)
Collecting pyasn1<0.5.0,>=0.4.6
Using cached pyasn1-0.4.8-py2.py3-none-any.whl (77 kB)
Requirement already satisfied: zipp>=0.5 in ./venv/lib/python3.7/site-packages (from importlib-metadata; python_version < "3.8"->markdown>=2.6.8->tensorboard<3,>=2.3.0->tensorflow-cpu) (3.4.0)
Installing collected packages: pyasn1
Attempting uninstall: pyasn1
Found existing installation: pyasn1 0.4.2
Not uninstalling pyasn1 at /usr/lib/python3/dist-packages, outside environment /geomicro/data2/tealfurn/MUSCATO/URDB_PAPER/SAMPLE_53600/venv
Can't uninstall 'pyasn1'. No files were found to uninstall.
Successfully installed pyasn1-0.4.8
|
You were not able to install DeepMAsED via bioconda? |
Hi Robert,
I'm talking here with Nick Youngblut, who created DeepMased, about our
installation.
I could not get it to work on the first step - find features and figured it
was related to all those warning flags we get that you said to ignore.
I know you said you had some trouble installing, hence the virtual
environment, and maybe you could explain better why bioconda didn't work?
Best
Teal
|
Greetings!
*Robert set it up for me to run through anaconda3*
module load Anaconda/3
conda create -c bioconda -n deepmased deepmased
source activate deepmased
*It still stalls out in the find features.*
(deepmased) tfurn@ymannx:~/PIPELINE/SAMPLE_42896/ROBERT$ python3 -m DeepMAsED features bam_fasta_table.tab
2020-11-29 20:09:46,647 - Indexing file: MERGED_CONTIGS_NOSPLIT_XDD.fa
2020-11-29 20:09:53,984 - Processing: MERGED_CONTIGS_NOSPLIT_XDD_sorted.bam
2020-11-29 20:09:56,166 - Number of contigs in the bam file: 2238133
*I also notice it is running on only one core. I'm not sure exactly what the
script is running during find features (gene prediction? MetaQuast?), but it
seems inefficient considering how much compute power we have on hand.*
top - 12:04:16 up 40 days, 1:36, 9 users, load average: 2.68, 2.76, 2.73
Tasks: 820 total, 3 running, 488 sleeping, 0 stopped, 0 zombie
%Cpu(s): 2.0 us, 0.6 sy, 0.0 ni, 97.4 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
KiB Mem : 79124640+total, 22944363+free, 5286204 used, 55651654+buff/cache
KiB Swap: 46858236 total, 46772732 free, 85504 used. 78042592+avail Mem
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
25374 tfurn 20 0 5226716 3.995g 174120 R 100.0 0.5 953:24.88 python3 -m DeepMAsED features bam_fasta_table.tab
*Let me know what I should do.*
Best,
Teal
|
Your command doesn't include --procs, so DeepMAsED features defaults to 1 core.
It may be an issue with pysam. |
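To act on both suggestions, a minimal sketch like the following may help: it reports which pysam version the active environment actually imports and assembles the features call with an explicit --procs value. The helper names `module_version` and `features_command` are hypothetical and not part of DeepMAsED; only the `--procs` flag and the `python3 -m DeepMAsED features <table>` form come from this thread.

```python
import importlib
import os


def module_version(name):
    """Return the module's __version__ string, or None if it cannot be imported.

    Hypothetical helper: uses only the standard library, so it also works on
    the Python 3.6 environment shown in this thread.
    """
    try:
        module = importlib.import_module(name)
    except ImportError:
        return None
    return getattr(module, "__version__", None)


def features_command(table, procs=None):
    """Build the `DeepMAsED features` call with an explicit --procs value.

    Defaulting to every core reported by the OS is an assumption; on a shared
    machine a smaller number may be more polite.
    """
    if procs is None:
        procs = os.cpu_count() or 1
    return ["python3", "-m", "DeepMAsED", "features", table, "--procs", str(procs)]


# Report the pysam version seen by this interpreter (None if not installed).
print("pysam version:", module_version("pysam"))
# Show the command that would be run with 20 processes.
print(" ".join(features_command("bam_fasta_table.tab", procs=20)))
```

Running this inside the same conda environment as DeepMAsED matters, because a system-wide pysam may differ from the one the environment imports.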
Hi Nick,
The option --procs is not on the github and not in the deepmased -h command
line output - you may want to update that.
I will try increasing the procs and see if I can get it to finish.
I'll check the version of pysam.
BTW: while I'm trying to get things to work. When we map our reads to
generate the bam file, how should we treat multimapping/ambiguous reads
(usually between 30-80% of total reads are multimapping for meta-NGS data
due to strains/related species) -> allow full multimap? keep 1 random?
toss?
Best,
Teal
…On Mon, Nov 30, 2020 at 12:55 PM Nick Youngblut ***@***.***> wrote:
not sure exactly what the script is running during find features, gene
prediction? MetaQuast?
DeepMAsED features generates feature tables from the bam files via pysam.
bam files are processed in parallel if --procs is >1.
I also notice it is running only one core
Your command doesn't include --procs, so DeepMAsED features defaults to 1
core.
*It still stalls out in the find features.*
It may be an issue with pysam. DeepMAsED features is really just pysam to
extract features from the bam files. It's therefore almost definitely
something to do with pysam. You could try installing a different version of
pysam or send me some example files so that I can try and reproduce the
issue.
|
The UI is a standard command-subcommand format (like
We used default bowtie2 params for mapping. You can assess different mapping approaches on accuracy if you'd like by running the |
Hi Nick,
*I created a much smaller 50K contig and 50M reads test data set: *
72 Dec 1 15:57 bam_fasta_table.tab
80 Dec 1 17:41 feature_file_table.tsv
6708944839 Dec 1 15:32 test_DM_50K_contigs_AMBRAND_sorted.bam
3523736 Dec 1 15:34 test_DM_50K_contigs_AMBRAND_sorted.bam.bai
12548326952 Dec 1 17:41 test_DM_50K_contigs_AMBRAND_sorted_feats.tsv
154772928 Dec 1 14:52 test_DM_50K_contigs.fa
*And ran the following commands:*
python3 -m DeepMAsED features bam_fasta_table.tab --procs 20  # took 2.5 hrs to complete
DeepMAsED predict feature_file_table.tsv
*The features command did work after a couple hours, so I guess that means
pysam does work BUT... *
*On the predict command - it throws the following error:*
2020-12-01 22:42:32,578 - Loading model...
2020-12-01 22:42:32,579 - Loading mstd...
2020-12-01 22:42:32,579 - Loading h5...
Traceback (most recent call last):
  File "/geomicro/data2/tealfurn/.conda/envs/deepmased/bin/DeepMAsED", line 10, in <module>
    sys.exit(main())
  File "/geomicro/data2/tealfurn/.conda/envs/deepmased/lib/python3.6/site-packages/DeepMAsED/__main__.py", line 57, in main
    args.func(args)
  File "/geomicro/data2/tealfurn/.conda/envs/deepmased/lib/python3.6/site-packages/DeepMAsED/Commands/Predict.py", line 76, in main
    Predict.main(args)
  File "/geomicro/data2/tealfurn/.conda/envs/deepmased/lib/python3.6/site-packages/DeepMAsED/Predict.py", line 50, in main
    model = load_model(F, custom_objects=custom_obj)
  File "/geomicro/data2/tealfurn/.conda/envs/deepmased/lib/python3.6/site-packages/tensorflow/python/keras/saving/save.py", line 184, in load_model
    return hdf5_format.load_model_from_hdf5(filepath, custom_objects, compile)
  File "/geomicro/data2/tealfurn/.conda/envs/deepmased/lib/python3.6/site-packages/tensorflow/python/keras/saving/hdf5_format.py", line 166, in load_model_from_hdf5
    f = h5py.File(filepath, mode='r')
  File "/geomicro/data2/tealfurn/.conda/envs/deepmased/lib/python3.6/site-packages/h5py/_hl/files.py", line 408, in __init__
    swmr=swmr)
  File "/geomicro/data2/tealfurn/.conda/envs/deepmased/lib/python3.6/site-packages/h5py/_hl/files.py", line 173, in make_fid
    fid = h5f.open(name, flags, fapl=fapl)
  File "h5py/_objects.pyx", line 54, in h5py._objects.with_phil.wrapper
  File "h5py/_objects.pyx", line 55, in h5py._objects.with_phil.wrapper
  File "h5py/h5f.pyx", line 88, in h5py.h5f.open
OSError: Unable to open file (unable to lock file, errno = 37, error message = 'No locks available')
*I can try to send you the data if you have some way to send such large
files, Globus maybe?*
*Best,*
*Teal*
|
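The numbers in this thread allow a rough back-of-the-envelope estimate of why the single-core runs looked stalled. The sketch below assumes run time scales linearly with contig count and inversely with --procs, which is a simplification: real cost also depends on read depth and contig length, and the read count of the full data set is not given here. The function names are hypothetical.

```python
def contigs_per_proc_hour(contigs, hours, procs):
    """Throughput observed on a finished run, normalised per process."""
    return contigs / (hours * procs)


def estimated_hours(contigs, rate, procs):
    """Projected wall-clock hours, assuming perfectly linear scaling."""
    return contigs / (rate * procs)


# Test run reported above: 50K contigs, --procs 20, about 2.5 hours.
rate = contigs_per_proc_hour(50_000, 2.5, 20)

# Earlier run reported above: 2,238,133 contigs with the default of 1 core.
single_core = estimated_hours(2_238_133, rate, procs=1)
twenty_cores = estimated_hours(2_238_133, rate, procs=20)

print("contigs per process-hour: {:.0f}".format(rate))
print("estimated hours on 1 core: {:.0f}".format(single_core))
print("estimated hours on 20 cores: {:.0f}".format(twenty_cores))
```

Under these assumptions the full data set would need on the order of 2,200 hours on one core, so a run that shows no output after a day or two may simply be slow rather than hung.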
I'm glad that you got
This appears to be an open issue with h5py: h5py/h5py#1101 |
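A workaround that is often suggested for this kind of "No locks available" error, typically seen on network file systems, is to disable HDF5 file locking through the `HDF5_USE_FILE_LOCKING` environment variable before the HDF5 library is loaded. The sketch below launches a command with that variable set. The helper names are hypothetical, the variable is only honoured by sufficiently recent HDF5 builds, and nothing in this thread confirms that it resolves this particular case.

```python
import os
import subprocess


def env_without_hdf5_locking(base=None):
    """Copy an environment mapping and switch off HDF5 file locking in the copy."""
    env = dict(os.environ if base is None else base)
    env["HDF5_USE_FILE_LOCKING"] = "FALSE"
    return env


def run_without_hdf5_locking(cmd):
    """Run `cmd` (a list of arguments) with HDF5 file locking disabled.

    The variable must be set before h5py is imported, which is why it is
    passed through the child process environment rather than set afterwards.
    """
    return subprocess.run(cmd, env=env_without_hdf5_locking(), check=False)


# Example call, left commented out because it needs DeepMAsED on the PATH:
# run_without_hdf5_locking(["DeepMAsED", "predict", "feature_file_table.tsv"])
```

Disabling locking is only reasonable when a single process reads the model file, as is the case for loading a pre-trained model.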
Greetings,
I am trying to see if my merge-overlap.pl script creates chimeras when I combine output from different assemblers.
As you can see, nothing has happened in days:
Using TensorFlow backend.
2020-11-10 21:33:21,376 - Indexing file: MERGED_CONTIGS_NOSPLIT_VDD.fa
2020-11-10 21:33:25,745 - Cannot find MERGED_CONTIGS_NOSPLIT_VDD_sorted.bam.bai; creating...
2020-11-10 21:35:08,541 - Processing: MERGED_CONTIGS_NOSPLIT_VDD_sorted.bam
2020-11-10 21:35:10,378 - Number of contigs in the bam file: 1682485
Checking "top" - it is still running, though I wonder why it is only using 1 core (100%).
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
40639 txxf 20 0 11.542g 9.143g 145724 R 100.0 1.2 2262:04 python3
There's nothing in the output either.
-rw-r--r-- 1 txxf gmb 0 Nov 10 21:35 MERGED_CONTIGS_NOSPLIT_VDD_sorted_feats.tsv
Any ideas why it is not working?
How long should this normally take?
Best,
Teal
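One way to tell a slow run from a hung one is to sample the size of the output feature table over time, as a rough sketch like the one below does. The function names and the polling interval are hypothetical. A size that stays at 0 bytes is not conclusive on its own, since a program may buffer its output and only write near the end; whether DeepMAsED does so is not stated in this thread.

```python
import os
import time


def size_samples(path, interval_s=60.0, samples=5):
    """Return the size of `path` in bytes at each of `samples` polls."""
    sizes = []
    for i in range(samples):
        sizes.append(os.path.getsize(path))
        if i < samples - 1:
            time.sleep(interval_s)  # wait before the next poll
    return sizes


def is_growing(sizes):
    """True if the last sampled size is larger than the first."""
    return len(sizes) >= 2 and sizes[-1] > sizes[0]


# Example call, left commented out because it blocks for several minutes:
# sizes = size_samples("MERGED_CONTIGS_NOSPLIT_VDD_sorted_feats.tsv")
# print(sizes, "growing" if is_growing(sizes) else "not growing")
```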