geneontology association table and uniprot table entry duplication #12

Open
sinanshi opened this issue Nov 23, 2015 · 0 comments
@sinanshi

http://geneontology.org/gene-associations/gene_association.goa_uniprot_noiea.gz
Running the script with both the old and the new file fails with duplicate-key errors such as:

Can't execute: Duplicate entry 'UniProtKB-P04637-GO:0005737-IDA-PMID:16131611-' for key 'PRIMARY'
DBD::mysql::st execute failed: Duplicate entry 'UniProtKB-P04637-GO:0005737-IDA-PMID:16131611-' for key 'PRIMARY' at perl/yogy_add_go_assocs.pl line 94, <GO_TERMS> line 154471.

One example in gene_association.dictyBase:

dictyBase       DDB_G0268004    snrp70          GO:0005685      GO_REF:0000024  ISS     UniProtKB:Q00916        C       U1 small nuclear ribonucleoprotein 70 kDa protein               gene    taxon:44689     20060120        dictyBase
dictyBase       DDB_G0268004    snrp70          GO:0005685      GO_REF:0000024  ISS     UniProtKB:P08621        C       U1 small nuclear ribonucleoprotein 70 kDa protein               gene    taxon:44689     20131213        UniProt

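Apparently these two entries are identical except for the With/From value (UniProtKB:Q00916 vs. UniProtKB:P08621) and the last two columns (date and assigned-by), none of which appear to be part of the primary key used in the SQL insert. So I think the best fix is to remove the duplicated rows before inserting. Please let me know if you have any other concerns.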
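
A minimal sketch of what dropping the duplicated rows could look like, written in Python for illustration only (the actual loader is perl/yogy_add_go_assocs.pl). The function name `dedupe_gaf_rows` and the `KEY_COLUMNS` tuple are hypothetical; the key columns are an assumption inferred from the shape of the key in the error message (DB, object ID, GO ID, evidence code, reference, and a trailing field that is empty, taken here to be the qualifier), not read from the table definition.

```python
# Assumed primary-key columns, as 0-based indices into a tab-separated
# association line: DB, DB object ID, GO ID, evidence code, reference,
# qualifier. This ordering is a guess based on the error message.
KEY_COLUMNS = (0, 1, 4, 6, 5, 3)


def dedupe_gaf_rows(lines, key_columns=KEY_COLUMNS):
    """Yield the fields of each association line whose key has not been seen yet."""
    seen = set()
    for line in lines:
        if line.startswith("!"):
            continue  # header/comment line in association files
        fields = line.rstrip("\n").split("\t")
        if len(fields) <= max(key_columns):
            continue  # blank or malformed line: not enough columns for a key
        key = tuple(fields[i] for i in key_columns)
        if key in seen:
            continue  # same key as an earlier row: would violate PRIMARY
        seen.add(key)
        yield fields
```

With this approach the first occurrence of each key is kept and later ones are silently dropped, so for the dictyBase pair above the 20060120 row would survive. If the newer annotation is preferred instead, the rows would need to be sorted by date first, or the duplicates handled on the SQL side.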

@sinanshi sinanshi added this to the yogy update milestone Nov 23, 2015