Information request regarding file upload #2

Open
stefcamp opened this issue Sep 24, 2018 · 1 comment

Comments

@stefcamp

Dear Andreas,
I am trying to use MetQy following the manual provided, but I am a little lost because I am not very experienced with R. The software is properly installed on my computer; I followed the examples and they produce results (except that some figures, e.g. the sunburst plot, are missing their text).
Basically, I have some genomes annotated using KEGG, and I would simply like to run “query_genomes_to_modules” with the “user-specified gene sets”.
My input file has a header and is organized as you suggest in the example:

ID ORG_ID ORGANISM KOs ECs
T09999 aaa A K00013;K00014;K00018;… “empty field”

Tabs separate the different fields (ID ORG_ID ORGANISM KOs ECs) in the header and also in the first line. Is this correct? I do not have EC numbers (empty field), only KEGG IDs for genes.
Could you please provide some minimal command lines to do the following:
1- Import the file into R so that it can be used by your software;
2- Calculate the module completion fraction (mcf) for all the modules;
3- Export the mcf values obtained for all the pathways to a text file.
Thanks a lot in advance for your help.
Sincerely

@asmvernon
Collaborator

Hi Stephano,
Thank you for your interest in MetQy and your patience while waiting for my reply.

Below I've included some code that will hopefully be useful.
Note that when reading the data, you must specify stringsAsFactors = FALSE. Otherwise, the character variables are imported as factors (which have a class attribute) and are not compatible with the MetQy function.
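As a quick illustration of the point about factors, here is a minimal base-R sketch. The two-column table is made up for the demonstration and is not MetQy data; it only shows how the same column comes back as a factor or as a character vector depending on stringsAsFactors.

```r
# Tiny in-memory table standing in for a genome file (values are made up)
csv_text <- "ID,KOs\nT09999,K00013;K00014;K00018"

# Same text read twice, once with and once without factor conversion
as_factor <- read.csv(text = csv_text, stringsAsFactors = TRUE)
as_char   <- read.csv(text = csv_text, stringsAsFactors = FALSE)

class(as_factor$KOs)  # "factor"
class(as_char$KOs)    # "character"

# Splitting the KO string requires character data;
# strsplit() on the factor version would raise an error
kos <- strsplit(as_char$KOs, ";", fixed = TRUE)[[1]]
print(kos)
```

From R 4.0.0 onwards the default is already stringsAsFactors = FALSE, but setting it explicitly keeps the behaviour the same on older installations.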

Do let me know if you have any more questions!

All the best,
Andrea

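## LOAD THE PACKAGE AND USE THE EXAMPLE DATA
library(MetQy)
data(data_example_multi_ECs_KOs)
## row.names = FALSE avoids an extra row-number column when the file is read back
write.csv(data_example_multi_ECs_KOs,file = "data_example_multi_ECs_KOs.csv",row.names = FALSE)

## IMPORT DATA (keep character columns as characters, not factors)
myData <- read.csv("data_example_multi_ECs_KOs.csv",header = TRUE,stringsAsFactors = FALSE)

## CALCULATE THE MCF (MODULE_ID here selects only modules M00001 to M00005)
OUT_myData <- query_genomes_to_modules(myData,GENOME_ID_COL = "ID",
                                       GENES_COL = "KOs",
                                       MODULE_ID = paste("M0000",1:5,sep=""),
                                       META_OUT=TRUE,ADD_OUT=TRUE)

## WRITE THE DATA AS A .csv
write.csv(OUT_myData$MATRIX,file = "mcf_matrix.csv")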
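For the tab-separated layout described in the question, a sketch along the following lines may be a useful starting point. It is an assumption-laden adaptation, not a tested recipe: the temporary file, its single row, and the shortened KO list are illustrative only, and the output file name mcf_matrix.txt is hypothetical. The MetQy call reuses only the arguments shown in the reply above (so it still covers just M00001 to M00005) and is skipped if the package is not installed.

```r
# Build a small tab-separated file shaped like the one in the question.
# The KO list is a shortened, illustrative sample; the ECs field is left empty.
tsv_path <- tempfile(fileext = ".tsv")
writeLines(c("ID\tORG_ID\tORGANISM\tKOs\tECs",
             "T09999\taaa\tA\tK00013;K00014;K00018\t"),
           tsv_path)

# read.delim() defaults to sep = "\t" and header = TRUE.
# colClasses = "character" keeps every column as text, so the empty
# ECs field is read as "" rather than NA and nothing becomes a factor.
myData <- read.delim(tsv_path, colClasses = "character")
str(myData)

# Run the query only if MetQy is available (same arguments as in the reply)
if (requireNamespace("MetQy", quietly = TRUE)) {
  library(MetQy)
  OUT_myData <- query_genomes_to_modules(myData,
                                         GENOME_ID_COL = "ID",
                                         GENES_COL = "KOs",
                                         MODULE_ID = paste("M0000", 1:5, sep = ""),
                                         META_OUT = TRUE, ADD_OUT = TRUE)
  # Tab-separated text output; col.names = NA leaves a blank header
  # cell above the row names so the columns stay aligned
  write.table(OUT_myData$MATRIX, file = "mcf_matrix.txt",
              sep = "\t", quote = FALSE, col.names = NA)
}
```

With a real file, replace tsv_path with the path to your own table and drop the writeLines() step.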
