A Perl script to parse KOBAS 'annotate' output into tabular format.
KOBAS is a well-known gene set enrichment tool that can annotate provided sequences or IDs as well as perform enrichment analysis. Annotating transcripts or genes by homology comparison with, for example, Arabidopsis thaliana is a valuable way to obtain GO terms or pathway IDs, among others, for sequences that previously had no annotation.
However, the annotation results are not structured as a table, which makes them inconvenient to work with. The output of the annotate tool therefore has to be parsed before it can be used in downstream analysis.
'annotate_parser' is a Perl script that parses the results of the 'Annotate' section into tabular format. With the default settings on the KOBAS site, the columns of the formatted output table are, in order: Query name, Gene id, Gene name, Entrez id, Pathway, GO and GO slim.
perl annotate_parser.pl -i inputfile -o outputfile
Parameter | Description | Comment |
---|---|---|
-i | /path/to/input_file | comma separated if multiple |
-o | /path/to/output_file | if not specified, output is sent to the console |
--tsv | output format | default: tab delimited |
--csv | output format | default: tab delimited |
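
For downstream analysis, the parsed table can be loaded with a few lines of code. The following Python sketch is not part of 'annotate_parser'. It assumes the tab-delimited output, the default column order listed above, and no header line in the file (set the hypothetical `has_header` argument to `True` if your output has one). The function name `read_annotation_table` and the sample row are placeholders for illustration only.

```python
import csv
import io

# Column order as listed above for the KOBAS default settings.
COLUMNS = ["Query name", "Gene id", "Gene name", "Entrez id",
           "Pathway", "GO", "GO slim"]


def read_annotation_table(handle, delimiter="\t", has_header=False):
    """Read the parsed table into a list of dicts keyed by column name.

    Assumption: one record per line, fields in the COLUMNS order.
    Use delimiter="," for output written with --csv.
    """
    reader = csv.reader(handle, delimiter=delimiter)
    if has_header:
        next(reader, None)  # skip a header line, if the file has one
    rows = []
    for fields in reader:
        if not fields:
            continue  # ignore blank lines
        # Pad short rows so every column name has a value.
        fields = fields + [""] * (len(COLUMNS) - len(fields))
        rows.append(dict(zip(COLUMNS, fields)))
    return rows


# Placeholder row, not real KOBAS output.
sample = "query1\tgene1\tname1\tentrez1\tpathway1\tgo1\tslim1\n"
rows = read_annotation_table(io.StringIO(sample))
print(rows[0]["Gene id"])
```

With a real file, pass an open file handle instead of the `io.StringIO` object, for example `open("outputfile", newline="")`.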
The 'test' folder contains a template file, generated with the 'annotate' tool from KOBAS using their provided test IDs.