AMBER (Assessment of Metagenome BinnERs) is an evaluation package for the comparative assessment of genome reconstructions from metagenome benchmark datasets. It provides performance metrics, results rankings, and comparative visualizations for assessing multiple programs or parameter effects. The provided metrics were used in the first community benchmarking challenge of the initiative for the Critical Assessment of Metagenomic Interpretation.
As input, AMBER's main tool, evaluate.py, takes three kinds of files:
- a gold standard mapping of contig or read IDs to genomes in the CAMI binning Bioboxes format, for example test/gsa_mapping.binning as used in the examples below (note: only the columns SEQUENCEID and BINID are required)
- one or more files with bin assignments for the sequences, also in the CAMI binning Bioboxes format, with each file containing all the bin assignments from one run of a binning program. A tool for converting FASTA files, in which each file represents a bin, to this format is available (see utils/convert_fasta_bins_to_biobox_format.py below)
- a FASTA or FASTQ file with the sequences, used to obtain their lengths. Optionally, the lengths may be added to the gold standard mapping file with the tool utils/add_length_column.py (see below), in which case evaluate.py no longer requires a FASTA or FASTQ file
Additional parameters may be specified - see below.
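To make the input format concrete, here is a minimal sketch of reading a mapping in the CAMI binning Bioboxes format into a dictionary. It assumes tab-separated columns, metadata lines starting with `@`, and a column definition line starting with `@@`, as in the add_length_column.py output shown further below. The function name `parse_binning` is hypothetical and not part of AMBER.

```python
def parse_binning(lines):
    """Return {sequence_id: bin_id} from lines in CAMI binning Bioboxes format.

    Assumes tab-separated fields; hypothetical helper, not AMBER's own parser.
    """
    columns = None
    mapping = {}
    for raw in lines:
        line = raw.rstrip("\n")
        if not line.strip():
            continue  # skip blank lines
        if line.startswith("@@"):
            # Column definition line, e.g. "@@SEQUENCEID<TAB>BINID<TAB>TAXID"
            columns = line[2:].split("\t")
            continue
        if line.startswith("@"):
            continue  # metadata such as @Version or @SampleID
        if columns is None:
            raise ValueError("column definition line (@@...) is missing")
        fields = dict(zip(columns, line.split("\t")))
        mapping[fields["SEQUENCEID"]] = fields["BINID"]
    return mapping


example = [
    "@Version:0.9.1",
    "@SampleID:gsa",
    "@@SEQUENCEID\tBINID\tTAXID",
    "RL|S1|C10817\tSample18_57\t45202",
    "RL|S1|C11497\tSample22_57\t10239",
]
print(parse_binning(example))
```

Because the columns are looked up by name, extra columns such as TAXID are simply ignored, which matches the note that only SEQUENCEID and BINID are required.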
Requirements:
- python ≥ 3.5
- python-tk
- numpy ≥ v1.13.0
- biopython ≥ v1.69.0
- matplotlib ≥ v2.0.2
- tox (only for automatic tests)
- The examples below require a gold standard assembly. Please download it to the test directory.
Metrics reported by evaluate.py:
- avg_precision: precision averaged over genome bins
- std_dev_precision: standard deviation of the per-genome-bin precision values
- sem_precision: standard error of the mean of the per-genome-bin precision values
- avg_recall: recall averaged over genome bins
- std_dev_recall: standard deviation of the per-genome-bin recall values
- sem_recall: standard error of the mean of the per-genome-bin recall values
- precision: precision weighted by base pairs
- recall: recall weighted by base pairs
- rand_index_by_bp: Rand index weighted by base pairs
- rand_index_by_seq: Rand index weighted by sequence counts
- a_rand_index_by_bp: adjusted Rand index weighted by base pairs
- a_rand_index_by_seq: adjusted Rand index weighted by sequence counts
- percent_assigned_bps: percentage of base pairs that were assigned to bins
- >0.5compl<0.1cont: number of bins with more than 50% completeness and less than 10% contamination
- >0.7compl<0.1cont: number of bins with more than 70% completeness and less than 10% contamination
- >0.9compl<0.1cont: number of bins with more than 90% completeness and less than 10% contamination
- >0.5compl<0.05cont: number of bins with more than 50% completeness and less than 5% contamination
- >0.7compl<0.05cont: number of bins with more than 70% completeness and less than 5% contamination
- >0.9compl<0.05cont: number of bins with more than 90% completeness and less than 5% contamination
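The completeness and contamination counts at the end of this list can be sketched as follows. The sketch assumes that completeness corresponds to a bin's recall and contamination to 1 minus its precision, which is consistent with genome_recovery.py taking the per-genome precision and recall table as input, but is not spelled out in this document. The function name `count_recovered_bins` is hypothetical.

```python
def count_recovered_bins(bins, min_completeness, max_contamination):
    """Count bins above a completeness and below a contamination threshold.

    bins: iterable of (precision, recall) pairs, one per genome bin.
    Assumes completeness = recall and contamination = 1 - precision.
    """
    return sum(
        1
        for precision, recall in bins
        if recall > min_completeness and (1.0 - precision) < max_contamination
    )


# (precision, recall) of the four bins shown in the per-genome example output
bins = [
    (0.998458014811, 1.0),
    (1.0, 1.0),
    (0.409809009246, 1.0),
    (0.995223021915, 1.0),
]
for max_cont in (0.1, 0.05):
    for min_compl in (0.5, 0.7, 0.9):
        name = ">{}compl<{}cont".format(min_compl, max_cont)
        print(name, count_recovered_bins(bins, min_compl, max_cont))
```

With only these four bins, three of them pass every threshold combination; the bin with precision of about 0.41 is rejected as too contaminated.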
usage: evaluate.py [-h] -g GOLD_STANDARD_FILE [-f FASTA_FILE] [-l LABELS]
[-p FILTER] [-r GENOMES_FILE] [-k KEYWORD] -o OUTPUT_DIR
bin_files [bin_files ...]
Compute all metrics and figures for one or more binning files; output summary
to screen and results per binning file to chosen directory
positional arguments:
bin_files Binning files
optional arguments:
-h, --help show this help message and exit
-g GOLD_STANDARD_FILE, --gold_standard_file GOLD_STANDARD_FILE
Gold standard - ground truth - file
-f FASTA_FILE, --fasta_file FASTA_FILE
FASTA or FASTQ file with sequences of gold standard
(required if gold standard file misses column _LENGTH)
-l LABELS, --labels LABELS
Comma-separated binning names
-p FILTER, --filter FILTER
Filter out [FILTER]% smallest bins (default: 0)
-r GENOMES_FILE, --genomes_file GENOMES_FILE
File with list of genomes to be removed
-k KEYWORD, --keyword KEYWORD
Keyword in second column of input for bins to be
removed (no keyword=remove all in list)
-o OUTPUT_DIR, --output_dir OUTPUT_DIR
Directory to write the results to
Example:
./evaluate.py -g test/gsa_mapping.binning \
-f test/CAMI_low_RL_S001__insert_270_GoldStandardAssembly.fasta.gz \
-l "MaxBin 2.0, CONCOCT, MetaBAT" \
-p 1 \
-r test/unique_common.tsv \
-k "circular element" \
test/naughty_carson_2 \
test/goofy_hypatia_2 \
test/elated_franklin_0 \
-o output_dir/
Output:
tool avg_precision std_dev_precision sem_precision avg_recall std_dev_recall sem_recall precision recall rand_index_by_bp rand_index_by_seq a_rand_index_by_bp a_rand_index_by_seq percent_assigned_bps >0.5compl<0.1cont >0.7compl<0.1cont >0.9compl<0.1cont >0.5compl<0.05cont >0.7compl<0.05cont >0.9compl<0.05cont
MaxBin 2.0 0.948 0.095 0.016 0.799 0.364 0.058 0.934 0.838 0.995 0.951 0.917 0.782 0.864 28 28 24 23 23 21
CONCOCT 0.837 0.266 0.052 0.517 0.476 0.069 0.684 0.936 0.972 0.946 0.644 0.751 0.967 18 17 15 16 16 14
MetaBAT 0.822 0.256 0.047 0.57 0.428 0.065 0.724 0.825 0.976 0.965 0.674 0.860 0.917 17 16 12 17 16 12
Additionally, directory output_dir will contain figures avg_precision_recall.png + .pdf (average precision vs. average recall) and ari_vs_assigned_bps.png + .pdf (adjusted Rand index vs. percentage of assigned base pairs), and rankings.txt (binnings sorted by average precision, average recall, and average precision + recall). In the same directory, subdirectories naughty_carson_2, goofy_hypatia_2, and elated_franklin_0 will be created with the following files:
- rand_index.tsv: contains the value of the (adjusted) Rand index and the percentage of assigned/binned bases. The Rand index is given both weighted and unweighted by base pairs
- precision_recall.tsv: contains precision and recall per genome bin
- precision_recall_avg.tsv: contains precision and recall averaged over genome bins. Includes standard deviation and standard error of the mean
- precision_recall_by_bpcount.tsv: contains precision and recall weighted by base pairs
- genomes_sorted_by_precision.png + .pdf: figure of precision and recall per genome with genomes sorted by precision
- genomes_sorted_by_recall.png + .pdf: figure of precision and recall per genome with genomes sorted by recall
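The three orderings described for rankings.txt can be illustrated with the averaged values from the example output above. This is only a sketch of the sorting idea; the helper `rank` is hypothetical and the actual layout of rankings.txt is not reproduced here.

```python
# (avg_precision, avg_recall) per tool, taken from the example output above
results = {
    "MaxBin 2.0": (0.948, 0.799),
    "CONCOCT": (0.837, 0.517),
    "MetaBAT": (0.822, 0.570),
}


def rank(results, score):
    """Return tool names sorted from best to worst by score(precision, recall)."""
    return sorted(results, key=lambda tool: score(*results[tool]), reverse=True)


by_precision = rank(results, lambda precision, recall: precision)
by_recall = rank(results, lambda precision, recall: recall)
by_sum = rank(results, lambda precision, recall: precision + recall)

print("by average precision:", by_precision)
print("by average recall:", by_recall)
print("by average precision + recall:", by_sum)
```

For these values, CONCOCT ranks second by precision while MetaBAT ranks second by recall and by the sum, which shows why the rankings are listed separately.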
usage: precision_recall.py [-h] -g GOLD_STANDARD_FILE [-f FASTA_FILE]
[-l LABELS] [-p FILTER] [-r GENOMES_FILE]
[-k KEYWORD]
bin_files [bin_files ...]
Compute precision and recall, including standard deviation and standard error
of the mean, for binning files
positional arguments:
bin_files Binning files
optional arguments:
-h, --help show this help message and exit
-g GOLD_STANDARD_FILE, --gold_standard_file GOLD_STANDARD_FILE
Gold standard - ground truth - file
-f FASTA_FILE, --fasta_file FASTA_FILE
FASTA or FASTQ file with sequences of gold standard
(required if gold standard file misses column _LENGTH)
-l LABELS, --labels LABELS
Comma-separated binning names
-p FILTER, --filter FILTER
Filter out [FILTER]% smallest bins (default: 0)
-r GENOMES_FILE, --genomes_file GENOMES_FILE
File with list of genomes to be removed
-k KEYWORD, --keyword KEYWORD
Keyword in second column of input for bins to be
removed (no keyword=remove all in list)
Example:
./precision_recall.py -g test/gsa_mapping.binning \
-f test/CAMI_low_RL_S001__insert_270_GoldStandardAssembly.fasta.gz \
-r test/unique_common.tsv -k "circular element" \
-p 1 \
-l "MaxBin 2.0, CONCOCT, MetaBAT" \
test/naughty_carson_2 test/goofy_hypatia_2 test/elated_franklin_0
Output:
tool precision std_dev_precision sem_precision recall std_dev_recall sem_recall
MaxBin 2.0 0.948 0.095 0.016 0.799 0.364 0.058
CONCOCT 0.837 0.266 0.052 0.517 0.476 0.069
MetaBAT 0.822 0.256 0.047 0.570 0.428 0.065
usage: precision_recall_per_genome.py [-h] -g GOLD_STANDARD_FILE
[-f FASTA_FILE]
bin_file
Compute table of precision and recall per genome bin
positional arguments:
bin_file Binning file
optional arguments:
-h, --help show this help message and exit
-g GOLD_STANDARD_FILE, --gold_standard_file GOLD_STANDARD_FILE
Gold standard - ground truth - file
-f FASTA_FILE, --fasta_file FASTA_FILE
FASTA or FASTQ file with sequences of gold standard
(required if gold standard file misses column _LENGTH)
Example:
./precision_recall_per_genome.py -g test/gsa_mapping.binning \
-f test/CAMI_low_RL_S001__insert_270_GoldStandardAssembly.fasta.gz \
test/naughty_carson_2
Output:
genome precision recall predicted_size correctly_predicted real_size
1049005 0.998458014811 1.0 4114177 4107833 4107833
evo_1035930.032 1.0 1.0 2294831 2294831 2294831
Sample18_8 0.409809009246 1.0 121786 49909 49909
evo_1035930.029 0.995223021915 1.0 2423708 2412130 2412130
...
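The columns of this table are related in a simple way, which the following sketch reproduces for two of the rows above: precision is correctly_predicted divided by predicted_size, and recall is correctly_predicted divided by real_size. The function name `precision_recall` is hypothetical; how AMBER assigns a predicted bin to a genome is not covered here.

```python
def precision_recall(predicted_size, correctly_predicted, real_size):
    """Compute precision and recall of one genome bin from base pair counts."""
    precision = correctly_predicted / predicted_size
    recall = correctly_predicted / real_size
    return precision, recall


# (genome, predicted_size, correctly_predicted, real_size) from the table above
rows = [
    ("1049005", 4114177, 4107833, 4107833),
    ("Sample18_8", 121786, 49909, 49909),
]
for genome, predicted_size, correctly_predicted, real_size in rows:
    precision, recall = precision_recall(predicted_size, correctly_predicted, real_size)
    print(genome, round(precision, 6), recall)
```

For Sample18_8, the bin contains 121786 bp of which only 49909 bp belong to the genome, hence the low precision despite a recall of 1.0.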
usage: exclude_genomes.py [-h] -r GENOMES_FILE [-k KEYWORD] [file]
Exclude genome bins from table of precision and recall per genome. The table
can be provided as file or via the standard input
positional arguments:
file File containing precision and recall for each genome
optional arguments:
-h, --help show this help message and exit
-r GENOMES_FILE, --genomes_file GENOMES_FILE
File with list of genomes to be removed
-k KEYWORD, --keyword KEYWORD
Keyword in second column of input for bins to be
removed (no keyword=remove all in list)
Example:
The example computes the table of precision and recall and pipes it to utils/exclude_genomes.py.
./precision_recall_per_genome.py -g test/gsa_mapping.binning \
-f test/CAMI_low_RL_S001__insert_270_GoldStandardAssembly.fasta.gz \
test/naughty_carson_2 | \
./utils/exclude_genomes.py -r test/unique_common.tsv -k "circular element"
Output:
The output is the table from precision_recall_per_genome.py without the excluded genomes.
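The filtering logic described by the -r and -k options can be sketched as below. The sketch assumes the genomes file is tab-separated with the genome ID in the first column and a category in the second; the genome IDs and categories in the example are made up for illustration, and the function names are hypothetical.

```python
def load_exclusions(lines, keyword=None):
    """Return the set of genome IDs to remove.

    lines: rows of the genomes file, assumed "genome_id<TAB>category".
    keyword: if given, only genomes whose second column equals it are removed;
             if None, every genome in the list is removed.
    """
    excluded = set()
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        if not fields[0]:
            continue  # skip empty lines
        if keyword is None or (len(fields) > 1 and fields[1] == keyword):
            excluded.add(fields[0])
    return excluded


def exclude_rows(table_rows, excluded):
    """Drop rows of the per-genome table whose first field is an excluded genome."""
    return [row for row in table_rows if row[0] not in excluded]


# Hypothetical genomes file and per-genome table (IDs are illustrative only)
genomes_file = ["genome_A\tcircular element", "genome_B\tunique genome"]
table = [("genome_A", 0.9, 1.0), ("genome_B", 0.8, 0.7), ("genome_C", 1.0, 0.5)]

kept = exclude_rows(table, load_exclusions(genomes_file, "circular element"))
print([row[0] for row in kept])
```

Calling `load_exclusions(genomes_file)` without a keyword would remove both listed genomes, mirroring "no keyword=remove all in list".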
usage: precision_recall_average.py [-h] [-p FILTER] [-l LABEL] [file]
Compute precision and recall, including standard deviation and standard error
of the mean, from table of precision and recall per genome. The table can be
provided as file or via the standard input
positional arguments:
file File containing precision and recall for each genome
optional arguments:
-h, --help show this help message and exit
-p FILTER, --filter FILTER
Filter out [FILTER]% smallest bins (default: 0)
-l LABEL, --label LABEL
Binning name
Example:
./precision_recall_per_genome.py -g test/gsa_mapping.binning \
-f test/CAMI_low_RL_S001__insert_270_GoldStandardAssembly.fasta.gz \
test/naughty_carson_2 | \
./utils/exclude_genomes.py -r test/unique_common.tsv -k "circular element" | \
./precision_recall_average.py -p 1 -l "MaxBin 2.0"
Output:
tool precision std_dev_precision sem_precision recall std_dev_recall sem_recall
MaxBin 2.0 0.948 0.095 0.016 0.799 0.364 0.058
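The averaged columns can be understood with a short sketch that summarizes a list of per-genome values. It uses the sample standard deviation (n - 1 in the denominator) and the standard error as the standard deviation divided by the square root of n; whether AMBER uses exactly these conventions is an assumption, and the -p filter is not modeled. The function name `summarize` is hypothetical.

```python
import math


def summarize(values):
    """Return (mean, standard deviation, standard error of the mean)."""
    n = len(values)
    mean = sum(values) / n
    if n > 1:
        # Sample standard deviation; an assumed convention
        variance = sum((v - mean) ** 2 for v in values) / (n - 1)
        std_dev = math.sqrt(variance)
    else:
        std_dev = 0.0
    sem = std_dev / math.sqrt(n)
    return mean, std_dev, sem


# Precision of the four bins shown in the per-genome example output
precisions = [0.998458014811, 1.0, 0.409809009246, 0.995223021915]
mean, std_dev, sem = summarize(precisions)
print(round(mean, 3), round(std_dev, 3), round(sem, 3))
```

These four rows are only an excerpt of the full table, so the numbers printed here are not expected to match the MaxBin 2.0 line above.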
usage: precision_recall_by_bpcount.py [-h] -g GOLD_STANDARD_FILE
[-f FASTA_FILE]
bin_file
Compute precision and recall weighed by base pair counts (not averaged over
genome bins) from binning file
positional arguments:
bin_file Binning file
optional arguments:
-h, --help show this help message and exit
-g GOLD_STANDARD_FILE, --gold_standard_file GOLD_STANDARD_FILE
Gold standard - ground truth - file
-f FASTA_FILE, --fasta_file FASTA_FILE
FASTA or FASTQ file with sequences of gold standard
(required if gold standard file misses column _LENGTH)
Example:
./precision_recall_by_bpcount.py -g test/gsa_mapping.binning \
-f test/CAMI_low_RL_S001__insert_270_GoldStandardAssembly.fasta.gz \
test/naughty_carson_2
Output:
precision recall
0.934 0.838
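One plausible way to pool precision and recall over base pairs instead of averaging over bins is sketched below. It assumes that each predicted bin is credited with the base pairs of the genome it overlaps most, and that each genome is credited with the largest number of its base pairs found in a single bin. This is an illustrative definition under those assumptions, not necessarily the exact one implemented in precision_recall_by_bpcount.py, and all names and data are hypothetical.

```python
from collections import defaultdict


def precision_recall_by_bp(gold, binning, lengths):
    """Pool precision and recall over base pairs.

    gold: sequence ID -> genome; binning: sequence ID -> bin;
    lengths: sequence ID -> length in bp. Unbinned sequences lower recall only.
    """
    overlap = defaultdict(lambda: defaultdict(int))  # bin -> genome -> bp
    for seq, bin_id in binning.items():
        overlap[bin_id][gold[seq]] += lengths[seq]

    binned_bp = sum(lengths[seq] for seq in binning)
    total_bp = sum(lengths[seq] for seq in gold)

    # Precision: bp that belong to the dominant genome of their bin
    true_positive_bp = sum(max(genomes.values()) for genomes in overlap.values())

    # Recall: for every genome, the largest share recovered in a single bin
    best_per_genome = defaultdict(int)
    for genomes in overlap.values():
        for genome, bp in genomes.items():
            best_per_genome[genome] = max(best_per_genome[genome], bp)

    precision = true_positive_bp / binned_bp
    recall = sum(best_per_genome.values()) / total_bp
    return precision, recall


gold = {"s1": "A", "s2": "A", "s3": "B", "s4": "B"}
lengths = {"s1": 100, "s2": 50, "s3": 200, "s4": 50}
binning = {"s1": "bin1", "s3": "bin1", "s2": "bin2"}  # s4 is left unbinned
precision, recall = precision_recall_by_bp(gold, binning, lengths)
print(round(precision, 3), round(recall, 3))
```

Long sequences dominate these pooled values, whereas the averaged metrics give every genome bin the same weight regardless of its size.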
usage: rand_index.py [-h] -g GOLD_STANDARD_FILE [-f FASTA_FILE] bin_file
Compute (adjusted) Rand index from binning file, unweighed and weighed by base
pairs, and percentage of binned base pairs
positional arguments:
bin_file Binning file
optional arguments:
-h, --help show this help message and exit
-g GOLD_STANDARD_FILE, --gold_standard_file GOLD_STANDARD_FILE
Gold standard - ground truth - file
-f FASTA_FILE, --fasta_file FASTA_FILE
FASTA or FASTQ file with sequences of gold standard
(required if gold standard file misses column _LENGTH)
Example:
./rand_index.py -g test/gsa_mapping.binning \
-f test/CAMI_low_RL_S001__insert_270_GoldStandardAssembly.fasta.gz \
test/naughty_carson_2
Output:
rand_index_by_bp rand_index_by_seq a_rand_index_by_bp a_rand_index_by_seq percent_assigned_bps
0.995 0.951 0.917 0.782 0.864
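The Rand index and adjusted Rand index can be computed from the contingency table of genomes versus bins. The sketch below uses the standard pair-counting formulas and, as an assumption about what "by base pairs" means, lets each sequence contribute its length in bp instead of a count of 1. It only considers binned sequences and needs at least two items; the function names are hypothetical.

```python
from collections import Counter


def pairs(n):
    """Number of unordered pairs among n items."""
    return n * (n - 1) / 2


def rand_indices(gold, binning, weights=None):
    """Return (Rand index, adjusted Rand index) for the binned sequences.

    gold: sequence ID -> genome; binning: sequence ID -> bin.
    weights: optional sequence ID -> length in bp; if given, counts are in bp.
    """
    contingency = Counter()
    for seq, bin_id in binning.items():
        contingency[(gold[seq], bin_id)] += 1 if weights is None else weights[seq]

    per_genome, per_bin = Counter(), Counter()
    for (genome, bin_id), count in contingency.items():
        per_genome[genome] += count
        per_bin[bin_id] += count

    total = pairs(sum(contingency.values()))
    same_both = sum(pairs(c) for c in contingency.values())
    same_genome = sum(pairs(c) for c in per_genome.values())
    same_bin = sum(pairs(c) for c in per_bin.values())

    rand = (total + 2 * same_both - same_genome - same_bin) / total
    expected = same_genome * same_bin / total
    maximum = (same_genome + same_bin) / 2
    adjusted = 1.0 if maximum == expected else (same_both - expected) / (maximum - expected)
    return rand, adjusted


gold = {"a": "G1", "b": "G1", "c": "G2", "d": "G2"}
mixed = {"a": "x", "b": "y", "c": "x", "d": "y"}  # every bin mixes both genomes
print(rand_indices(gold, mixed))
print(rand_indices(gold, gold, weights={"a": 10, "b": 20, "c": 30, "d": 40}))
```

The adjusted index corrects for agreement expected by chance, so a binning that mixes genomes evenly can score below zero even though its plain Rand index is positive.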
usage: genome_recovery.py [-h] [-p FILTER] [-l LABEL] [file]
Compute number of genomes in ranges of completeness and contamination
positional arguments:
file File containing precision and recall for each genome
optional arguments:
-h, --help show this help message and exit
-p FILTER, --filter FILTER
Filter out [FILTER]% smallest bins (default: 0)
-l LABEL, --label LABEL
Binning name
Example:
./precision_recall_per_genome.py -g test/gsa_mapping.binning \
-f test/CAMI_low_RL_S001__insert_270_GoldStandardAssembly.fasta.gz \
test/naughty_carson_2 | \
./genome_recovery.py -l "MaxBin 2.0" -p 1
Output:
MaxBin 2.0          >50% complete  >70% complete  >90% complete
<10% contamination  28             28             24
<5% contamination   23             23             21
usage: plot_by_genome.py [-h] [-s {recall,precision}] [-o OUT_FILE] [file]
Plot precision and recall per genome. Genomes can be sorted by recall
(default) or precision
positional arguments:
file File containing precision and recall for each genome
optional arguments:
-h, --help show this help message and exit
-s {recall,precision}, --sort_by {recall,precision}
Sort by either precision or recall (default: recall)
-o OUT_FILE, --out_file OUT_FILE
Path to store image (default: only show image)
Example:
./precision_recall_per_genome.py -g test/gsa_mapping.binning \
-f test/CAMI_low_RL_S001__insert_270_GoldStandardAssembly.fasta.gz \
test/naughty_carson_2 | \
./plot_by_genome.py
Output: Figure is shown on screen.
usage: convert_fasta_bins_to_biobox_format.py [-h] [-o OUTPUT_FILE]
paths [paths ...]
Convert bins in FASTA files to CAMI tsv format
positional arguments:
paths FASTA files including full paths
optional arguments:
-h, --help show this help message and exit
-o OUTPUT_FILE, --output_file OUTPUT_FILE
Output file
Example:
./utils/convert_fasta_bins_to_biobox_format.py \
/path/to/file/maxbin.out.001.fasta \
/path/to/file/maxbin.out.002.fasta \
/path/to/file/maxbin.out.003.fasta \
/path/to/file/maxbin.out.004.fasta \
/path/to/file/maxbin.out.005.fasta \
-o bins.tsv
Alternatively:
./utils/convert_fasta_bins_to_biobox_format.py /path/to/file/maxbin.out.0* -o bins.tsv
Output: File bins.tsv is created in the working directory.
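The conversion idea can be sketched in a few lines: every sequence ID found in a bin's FASTA content is written with that bin's identifier. The sketch works on in-memory lines to stay self-contained, uses a plain-text FASTA parser instead of Biopython, and takes the bin identifier from the dictionary key. How the real script derives BINID from the file path is an assumption not covered here, and the function names and sample ID are hypothetical.

```python
def fasta_ids(lines):
    """Yield the sequence ID (first word after '>') of every FASTA record."""
    for line in lines:
        if line.startswith(">"):
            parts = line[1:].split()
            if parts:
                yield parts[0]


def bins_to_biobox(bins, sample_id="sample"):
    """Build a CAMI binning Bioboxes table from {bin_id: FASTA lines}."""
    out = ["@Version:0.9.1", "@SampleID:" + sample_id, "@@SEQUENCEID\tBINID"]
    for bin_id in sorted(bins):
        for seq_id in fasta_ids(bins[bin_id]):
            out.append(seq_id + "\t" + bin_id)
    return "\n".join(out) + "\n"


# Hypothetical content of two bin files
bins = {
    "maxbin.out.001": [">contig_1 length=8", "ACGTACGT", ">contig_2", "GGCC"],
    "maxbin.out.002": [">contig_3", "TTTT"],
}
print(bins_to_biobox(bins), end="")
```

To apply this to real files, each value in `bins` could be an open file handle, for example `open(path)` keyed by the file's base name.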
usage: add_length_column.py [-h] -g GOLD_STANDARD_FILE -f FASTA_FILE
Add length column _LENGTH to gold standard mapping and print mapping on the
standard output
optional arguments:
-h, --help show this help message and exit
-g GOLD_STANDARD_FILE, --gold_standard_file GOLD_STANDARD_FILE
Gold standard - ground truth - file
-f FASTA_FILE, --fasta_file FASTA_FILE
FASTA or FASTQ file with sequences of gold standard
Example:
./utils/add_length_column.py -g test/gsa_mapping.binning \
-f test/CAMI_low_RL_S001__insert_270_GoldStandardAssembly.fasta.gz
Output:
@Version:0.9.1
@SampleID:gsa
@@SEQUENCEID BINID TAXID _contig_id _number_reads _LENGTH
RL|S1|C10817 Sample18_57 45202 Sample18_57_from_2_to_20519_total_20518 44394 20518
RL|S1|C11497 Sample22_57 10239 Sample22_57_from_4_to_37675_total_37672 18432 37672
RL|S1|C6571 evo_1286_AP.033 1385 contig_1_4_from_3_to_69916_total_69914 30978 69914
RL|S1|C10560 evo_1286_AP.033 1385 contig_1_4_from_69981_to_1065637_total_995657 443334 995657
...
Note that only columns SEQUENCEID and BINID are required in a gold standard mapping file. The added optional column _LENGTH, however, eliminates the need for a FASTA or FASTQ file when evaluating binnings.
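What the tool does can be sketched as two steps: measure every sequence in the FASTA file, then append the length to the matching row of the mapping. The sketch handles uncompressed FASTA lines only (no FASTQ, no gzip), assumes tab-separated mapping columns, and uses hypothetical function names and toy data.

```python
def sequence_lengths(fasta_lines):
    """Return {sequence_id: length in bp} for FASTA records given as lines."""
    lengths = {}
    current = None
    for line in fasta_lines:
        line = line.strip()
        if line.startswith(">"):
            current = line[1:].split()[0]
            lengths[current] = 0
        elif current is not None:
            lengths[current] += len(line)
    return lengths


def add_length_column(mapping_lines, lengths):
    """Append a _LENGTH column to a gold standard mapping."""
    out = []
    id_index = 0  # fallback if no column definition line is present
    for line in mapping_lines:
        line = line.rstrip("\n")
        if line.startswith("@@"):
            id_index = line[2:].split("\t").index("SEQUENCEID")
            out.append(line + "\t_LENGTH")
        elif line.startswith("@") or not line:
            out.append(line)  # metadata and blank lines are copied unchanged
        else:
            seq_id = line.split("\t")[id_index]
            out.append(line + "\t" + str(lengths[seq_id]))
    return out


fasta = [">c1 some description", "ACGT", "AC", ">c2", "GGG"]
mapping = ["@Version:0.9.1", "@@SEQUENCEID\tBINID", "c1\tg1", "c2\tg2"]
for row in add_length_column(mapping, sequence_lengths(fasta)):
    print(row)
```

Storing the lengths in the mapping once avoids re-reading a large assembly file on every evaluation run.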
We are using tox for project automation.
If you want to run tests, just type tox in the project's root directory:
tox