Add deeparg #6646
base: main
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,13 @@ | ||
name: data_manager_deeparg | ||
owner: iuc | ||
description: "Download deepARG database" | ||
homepage_url: "https://github.com/gaarangoa/deeparg" | ||
long_description: | | ||
Data manager to download the database needed for deepARG to run (datasets, databases, models) | ||
remote_repository_url: "https://github.com/galaxyproject/tools-iuc/tree/master/data_managers/data_manager_deeparg" | ||
type: unrestricted | ||
categories: | ||
- Data Managers | ||
suite: | ||
name: "suite_deeparg" | ||
description: "DeepARG is a deep learning based approach to predict Antibiotic Resistance Genes (ARGs) from metagenomes with short or long sequences" |
Original file line number | Diff line number | Diff line change | ||||||
---|---|---|---|---|---|---|---|---|
@@ -0,0 +1,65 @@ | ||||||||
<tool id="data_manager_deeparg" name="Download data for DeepARG" version="@TOOL_VERSION@+galaxy@VERSION_SUFFIX@" tool_type="manage_data" profile="22.05"> | ||||||||
<description></description> | ||||||||
<macros> | ||||||||
<token name="@TOOL_VERSION@">1.0.4</token> | ||||||||
<token name="@VERSION_SUFFIX@">0</token> | ||||||||
</macros> | ||||||||
<requirements> | ||||||||
<requirement type="package" version="@TOOL_VERSION@">deeparg</requirement> | ||||||||
</requirements> | ||||||||
<stdio> | ||||||||
<exit_code range=":-1" level="fatal" description="Error: Cannot open file"/> | ||||||||
<exit_code range="1:" level="fatal" description="Error"/> | ||||||||
</stdio> | ||||||||
<command><![CDATA[ | ||||||||
mkdir -p '$out_file.extra_files_path' && | ||||||||
deeparg download_data -o 'deeparg_$version' && | ||||||||
mv 'deeparg_$version' '$out_file.extra_files_path' && | ||||||||
Comment on lines +16 to +17
Just:
Suggested change
cp '$dmjson' '$out_file' | ||||||||
]]></command> | ||||||||
<configfiles> | ||||||||
<configfile name="dmjson"><![CDATA[ | ||||||||
#from datetime import date | ||||||||
{ | ||||||||
"data_tables":{ | ||||||||
"deeparg_database_versioned":[ | ||||||||
{ | ||||||||
"value": "deeparg_$version-#echo date.today().strftime('%d%m%Y')#", | ||||||||
"name": "Files needed for running deepARG v-$version-#echo date.today().strftime('%d%m%Y')#", | ||||||||
"path": "deeparg_$version", | ||||||||
"db_version": "$version" | ||||||||
} | ||||||||
] | ||||||||
} | ||||||||
}]]> | ||||||||
</configfile> | ||||||||
</configfiles> | ||||||||
<inputs> | ||||||||
<param name="version" type="select" label="DB version"> | ||||||||
<option value="1.0.4" selected="true">Data needed for running DeepARG v1.0.4</option> | ||||||||
Can we use Do you know if the tool can also download other versions? Do you know if future versions can use older downloads?
</param> | ||||||||
</inputs> | ||||||||
<outputs> | ||||||||
<data name="out_file" format="data_manager_json" label="${tool.name}"/> | ||||||||
</outputs> | ||||||||
<tests> | ||||||||
<test expect_num_outputs="1"> | ||||||||
<param name="version" value="1.0.4"/> | ||||||||
<output name="out_file"> | ||||||||
<assert_contents> | ||||||||
<has_text text='"deeparg_database_versioned":'/> | ||||||||
<has_text text='"db_version": "1.0.4"'/> | ||||||||
<has_text_matching expression='"value": "deeparg_1.0.4-[0-9]{8}"'/> | ||||||||
<has_text_matching expression='"name": "Files needed for running deepARG v-1.0.4-[0-9]{8}"'/> | ||||||||
<has_text text='"path": "deeparg_1.0.4"'/> | ||||||||
</assert_contents> | ||||||||
</output> | ||||||||
</test> | ||||||||
</tests> | ||||||||
<help><![CDATA[ | ||||||||
DeepARG is a tool to predict antibiotic resistance genes (ARGs) in metagenomic samples. | ||||||||
]]></help> | ||||||||
<citations> | ||||||||
<citation type="doi">10.1186/s40168-018-0401-z</citation> | ||||||||
</citations> | ||||||||
</tool> |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,21 @@ | ||
<?xml version="1.0"?> | ||
<data_managers> | ||
<data_manager tool_file="data_manager/data_manager_deeparg.xml" id="data_manager_deeparg"> | ||
<data_table name="deeparg_database_versioned"> <!-- Defines a Data Table to be modified. --> | ||
We could just go with
<output> <!-- Handle the output of the Data Manager Tool --> | ||
<column name="value"/> <!-- columns that are going to be specified by the Data Manager Tool --> | ||
<column name="name"/> <!-- columns that are going to be specified by the Data Manager Tool --> | ||
<column name="path" output_ref="out_file"> | ||
<move type="directory"> | ||
<source>${path}</source> | ||
<target base="${GALAXY_DATA_MANAGER_DATA_PATH}">deeparg_db/${path}</target> | ||
This should contain
I would use the datatable name instead of
</move> | ||
<value_translation>${GALAXY_DATA_MANAGER_DATA_PATH}/deeparg_db/${path}</value_translation> | ||
<value_translation type="function">abspath</value_translation> | ||
</column> | ||
<column name="db_version"/> <!-- columns that are going to be specified by the Data Manager Tool --> | ||
</output> | ||
</data_table> | ||
</data_manager> | ||
</data_managers> | ||
|
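One possible reading of the review comment about using the data table name for the target directory is sketched below. This is only an assumption about what the reviewer meant: the directory name `deeparg_database_versioned` is taken from the data table declared in this diff and replaces `deeparg_db`. If the table itself is renamed, the directory would follow the new name.

```xml
<column name="path" output_ref="out_file">
    <move type="directory">
        <source>${path}</source>
        <!-- Assumption: the directory is named after the data table instead of deeparg_db -->
        <target base="${GALAXY_DATA_MANAGER_DATA_PATH}">deeparg_database_versioned/${path}</target>
    </move>
    <!-- The translated value has to point at the same directory as the move target -->
    <value_translation>${GALAXY_DATA_MANAGER_DATA_PATH}/deeparg_database_versioned/${path}</value_translation>
    <value_translation type="function">abspath</value_translation>
</column>
```

Keeping the target and the first `value_translation` in sync matters here, since the path written to the data table is derived from the latter.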
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,5 @@ | ||
#This is a sample file distributed with Galaxy that enables tools | ||
#to use a directory of metagenomics files. | ||
#file has this format (white space characters are TAB characters) | ||
#deeparg_1.0.4-10102024	Files needed for running deepARG v-1.0.4-10102024	/path/to/data	1.0.4 | ||
deeparg_1.0.4-19122024 Files needed for running deepARG v-1.0.4-19122024 /tmp/tmpizmxs2l_/galaxy-dev/tool-data/deeparg_db/deeparg_1.0.4 1.0.4 |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,4 @@ | ||
#This is a sample file distributed with Galaxy that enables tools | ||
#to use a directory of metagenomics files. | ||
#file has this format (white space characters are TAB characters) | ||
#deeparg_1.0.4-10102024	Files needed for running deepARG v-1.0.4-10102024	/path/to/data	1.0.4 |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,6 @@ | ||
<tables> | ||
<table name="deeparg_database_versioned" comment_char="#"> | ||
<columns>value, name, path, db_version</columns> | ||
<file path="tool-data/deeparg_database_versioned.loc.sample"/> | ||
</table> | ||
</tables> |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,6 @@ | ||
<tables> | ||
<table name="deeparg_database_versioned" comment_char="#"> | ||
<columns>value, name, path, db_version</columns> | ||
<file path="${__HERE__}/test-data/deeparg_database_versioned.loc.test"/> | ||
</table> | ||
</tables> |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,15 @@ | ||
name: deeparg | ||
owner: iuc | ||
long_description: | | ||
A deep learning based approach to predict Antibiotic Resistance Genes (ARGs) from metagenomes | ||
categories: | ||
- Sequence Analysis | ||
remote_repository_url: https://github.com/galaxyproject/tools-iuc/tree/master/tools/deeparg | ||
homepage_url: https://github.com/gaarangoa/deeparg | ||
type: unrestricted | ||
auto_tool_repositories: | ||
name_template: "{{ tool_id }}" | ||
description_template: "Wrapper for the DeepARG tool suite: {{ tool_name }}" | ||
suite: | ||
name: "suite_deeparg" | ||
description: "DeepARG is a deep learning based approach to predict Antibiotic Resistance Genes (ARGs) from metagenomes with short or long sequences" |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,129 @@ | ||
<tool id="deeparg_predict" name="DeepARG predict" version="@TOOL_VERSION@+galaxy@VERSION_SUFFIX@" profile="@PROFILE@"> | ||
<description>Antibiotic Resistance Genes (ARGs) from metagenomes</description> | ||
<macros> | ||
<import>macros.xml</import> | ||
</macros> | ||
<expand macro="xrefs"/> | ||
<expand macro="requirements"/> | ||
<command detect_errors="exit_code"><![CDATA[ | ||
##Used only for test | ||
#if str($hide_db_build) == 'true': | ||
deeparg download_data -o deeparg_1.0.4 && | ||
#end if | ||
## | ||
mkdir -p deeparg_predict_output && | ||
deeparg predict | ||
--model '$model' | ||
-i '$input' | ||
-o 'deeparg_predict_output/deeparg_predict' | ||
-d '$deeparg_db.fields.path' | ||
--type '$type' | ||
--min-prob $min_prob | ||
--arg-alignment-identity $arg_alignment_identity | ||
--arg-alignment-evalue $arg_alignment_evalue | ||
--arg-alignment-overlap $arg_alignment_overlap | ||
--arg-num-alignments-per-entry $arg_num_alignments_per_entry | ||
]]></command> | ||
<inputs> | ||
<!-- used only for tests, as the deeparg database contains large files that cannot be deleted or reduced. --> | ||
<param name="hide_db_build" type="hidden" value=""/> | ||
hugolefeuvre marked this conversation as resolved.
<!-- --> | ||
<param name="input" type="data" format="fasta" label="Input file"/> | ||
<param name="deeparg_db" type="select" label="DeepARG database"> | ||
<options from_data_table="deeparg_database_versioned"> | ||
<validator message="No deeparg database is available" type="no_options"/> | ||
Should we filter for the version here?
</options> | ||
</param> | ||
<param argument="--model" type="select" label="Select model to use"> | ||
<option value="SS" selected="true">SS (short sequences for reads)</option> | ||
<option value="LS">LS (long sequences for genes)</option> | ||
</param> | ||
<param argument="--type" type="select" label="Molecular data type"> | ||
<option value="nucl" selected="true">Nucleotide (default)</option> | ||
<option value="prot">Protein</option> | ||
</param> | ||
<param argument="--min-prob" type="float" min="0" max="1" value="0.8" label="Minimum probability cutoff [Default: 0.8]"/> | ||
<param argument="--arg-alignment-identity" type="integer" min="0" value="50" label="Identity cutoff for sequence alignment [Default: 50]"/> | ||
<param argument="--arg-alignment-evalue" type="float" min="0" value="1e-10" label="Evalue cutoff [Default: 1e-10]"/> | ||
<param argument="--arg-alignment-overlap" type="float" min="0" max="1" value="0.8" label="Alignment read overlap [Default: 0.8]"/> | ||
<param argument="--arg-num-alignments-per-entry" type="integer" min="0" value="1000" label="Diamond, minimum number of alignments per entry [Default: 1000]"/> | ||
<section name="output_files" title="Selection of the output files"> | ||
<param name="output_selection" type="select" label="Output files selection" display="checkboxes" multiple="true"> | ||
<option value="file_ARG_tsv" selected="true">ARGs detected with probability greater than or equal to --min-prob (TSV)</option> | ||
<option value="file_potential_ARG_tsv" selected="true">ARGs detected with probability below --min-prob (TSV)</option> | ||
<option value="file_all_hits_tsv" selected="false">All hits detected in TSV</option> | ||
</param> | ||
</section> | ||
</inputs> | ||
<outputs> | ||
<data name="output_mapping_ARG" format="tabular" from_work_dir="deeparg_predict_output/deeparg_predict.mapping.ARG" label="${tool.name} on ${on_string} : ARG detected (probability greater than or equal to --min-prob)"> | ||
<filter>output_files['output_selection'] and "file_ARG_tsv" in output_files['output_selection']</filter> | ||
</data> | ||
<data name="output_mapping_potential_ARG" format="tabular" from_work_dir="deeparg_predict_output/deeparg_predict.mapping.potential.ARG" label="${tool.name} on ${on_string} : Potential ARG (probability below --min-prob)"> | ||
<filter>output_files['output_selection'] and "file_potential_ARG_tsv" in output_files['output_selection']</filter> | ||
</data> | ||
<data name="output_all_hits" format="tabular" from_work_dir="deeparg_predict_output/deeparg_predict.align.daa.tsv" label="${tool.name} on ${on_string} : all hits detected"> | ||
<filter>output_files['output_selection'] and "file_all_hits_tsv" in output_files['output_selection']</filter> | ||
</data> | ||
</outputs> | ||
<tests> | ||
<!-- Test 1 --> | ||
<test expect_num_outputs="3"> | ||
<param name="hide_db_build" value="true"/> | ||
<param name="input" value="ORFs.fa" ftype="fasta"/> | ||
<param name="deeparg_db" value="deeparg_1.0.4-19122024"/> | ||
<param name="model" value="SS"/> | ||
<param name="type" value="nucl"/> | ||
<section name="output_files"> | ||
<param name="output_selection" value="file_ARG_tsv,file_potential_ARG_tsv,file_all_hits_tsv"/> | ||
</section> | ||
<output name="output_mapping_ARG" ftype="tabular"> | ||
<assert_contents> | ||
<has_text text="YP_003283625.1|FEATURES|tet(K)|tetracycline|tet(K)"/> | ||
<has_text text="RPOB2"/> | ||
</assert_contents> | ||
</output> | ||
<output name="output_mapping_potential_ARG" ftype="tabular"> | ||
<assert_contents> | ||
<has_text text="gi:545254650:ref:WP_021551023.1:|FEATURES|mdtB|multidrug|mdtB"/> | ||
<has_text text="MUXB"/> | ||
</assert_contents> | ||
</output> | ||
<output name="output_all_hits" ftype="tabular"> | ||
<assert_contents> | ||
<has_size value="226000" delta="10000"/> | ||
<has_text text="ADV91011.1|FEATURES|RbpA|rifamycin|RbpA"/> | ||
</assert_contents> | ||
</output> | ||
</test> | ||
</tests> | ||
<help> | ||
DeepARG Predict is a computational tool designed to classify and annotate antibiotic resistance genes (ARGs) from nucleotide or protein sequences. | ||
|
||
It takes as input a **fasta nucleotide or protein file** containing short (SS model) or long (LS model) sequences. | ||
|
||
DeepARG output | ||
--------------- | ||
|
||
DeepARG generates two main files: .ARG, which contains the sequences with a probability greater than or equal to --min-prob (0.8 by default), and .potential.ARG, which contains the sequences with a probability below --min-prob. The .potential.ARG file can still contain ARG-like sequences; however, it is necessary to inspect its sequences. | ||
|
||
The output format for both files consists of the following fields: | ||
|
||
* ARG_NAME | ||
* QUERY_START | ||
* QUERY_END | ||
* QUERY_ID | ||
* PREDICTED_ARG_CLASS | ||
* BEST_HIT_FROM_DATABASE | ||
* PREDICTION_PROBABILITY | ||
* ALIGNMENT_BESTHIT_IDENTITY (%) | ||
* ALIGNMENT_BESTHIT_LENGTH | ||
* ALIGNMENT_BESTHIT_BITSCORE | ||
* ALIGNMENT_BESTHIT_EVALUE | ||
* COUNTS | ||
|
||
If you want to annotate paired-end short-read sequencing data, use the DeepARG Short Reads tool. | ||
|
||
</help> | ||
<expand macro="citations"/> | ||
</tool> |
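On the reviewer's question about filtering the `deeparg_db` select by version, a minimal sketch could look like the following. It assumes that `@TOOL_VERSION@` is defined as a token in `macros.xml` (which is not shown in this diff) and that databases are only usable with the matching tool version. Neither assumption is confirmed in the discussion.

```xml
<param name="deeparg_db" type="select" label="DeepARG database">
    <options from_data_table="deeparg_database_versioned">
        <!-- Assumption: keep only rows whose db_version column equals the wrapped tool version -->
        <filter type="static_value" column="db_version" value="@TOOL_VERSION@"/>
        <validator message="No deeparg database is available" type="no_options"/>
    </options>
</param>
```

With such a filter, a database downloaded for another version would no longer be offered. Whether that is desirable depends on the open question of whether future versions can reuse older downloads.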
Use detect_errors="exit_code" in the command instead of the stdio block.
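A minimal sketch of how this suggestion might be applied to the data manager wrapper, assuming the `<stdio>` block is removed entirely and the command body stays as submitted in this diff:

```xml
<!-- Sketch only: no <stdio> block; Galaxy treats any non-zero exit code
     of the command as a failure. -->
<command detect_errors="exit_code"><![CDATA[
    mkdir -p '$out_file.extra_files_path' &&
    deeparg download_data -o 'deeparg_$version' &&
    mv 'deeparg_$version' '$out_file.extra_files_path' &&
    cp '$dmjson' '$out_file'
]]></command>
```

This matches the `detect_errors="exit_code"` form already used by the `deeparg_predict` tool in the same pull request.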