Skip to content

chunhuicai/TDI

 
 

Repository files navigation

This class served as a set of common TDI queries.

(Intermediate) Function 1: findGeneId(geneName)
Description: Given gene name, return TDI database id for this gene
Input: gene name
Output: TDI database id
Example: findGeneId(“PIK3CA”)

(Intermediate) Function 2: findGeneName(geneID)
Description: Given TDI database gene_id, return gene name for the gene
Input: TDI database id
Output: gene name
Example: findGeneName(11375)

(Intermediate) Function 3: findSGAUnitGroupId(SGA)
Description: Given a SGA unit/group name, return TDI database id for this SGA unit/group
Input: SGA unit/group name
Output: TDI database id of input SGA
Example: findSGAUnitGroupId(“SGA.unit.10”)

(Intermediate) Function 4: findSGAUnitGroupName(SGA_id)
Description: Given a TDI database SGA unit/group id, return its SGA unit/group name
Input: TDI database id of a SGA unit/group
Output: SGA unit/group name
Example: findSGAUnitGroupName(1)

(Intermediate) Function 5: findPatientName(patient_id) 
Description: Given a TDI database patient id, return is tumor name
Input: TDI database patient id
Output: tumor name

(Intermediate) Function 6: findPatientId(patient_name)
Description: Given tumor name, return its TDI database patient_id
Input: tumor name
OutPut: TDI database patient_id

Function 7: findDriverInAGivenTumor(tumor_name)
Description: Given a tumor, find drivers including SGA and SGA unit/group that are drivers in this tumor. Drivers are defined as regulate no less that 5 DEGs with posterior equal or larger than the threshold for a specific SGA gene and SGA unit/group
Input: tumor name
Output: a cvs file with two columns, first column is driver name, second column is numbers of DEGs this driver regulated in given tumor

Function 8: findDriversForListOfDEGsInAGivenTumor(tumor_name, degList)
Description: Given a tumor and a list of DEGs, find drivers in this tumor that regulate at least one gene in given DEGs list
Input: a tumor name, a list of DEGs
Output: a csv file with two columns, first column is driver name, second column is numbers of DEGs this driver regulated in given tumor also existed in given DEG list
Example: findDriversForListOfDEGsInAGivenTumor(tumor_name, degList)

(Intermediate) Function 9: findTumorsInWhichAGivenSGAIsDriver(SGA)
Description: Given a SGA(it could be a gene name or SGA unit/group name), find in which tumors this Given SGA is a driver
Input: SGA name(gene name or SGA unit group name) 
Output: a dictionary, key is a tumor, value is a list of DEGs in the tumor that are regulated by a given SGA
Example: findTumorsInWhichAGivenSGAIsDriver(SGA)

(Intermediate) Function 10: findTumorsHasSMForAGivenSGA(SGA)
Description: Given a SGA gene, find in which tumors it has somatic mutations
Input: SGA gene name
Output: a list a tumors in which the given SGA has somatic mutations
Example: findTumorsHasSMForAGivenSGA(“PIK3CA”)

(Intermediate) Function 11: findTumorsHasSMForAGivenSGAInGivenHopspot(SGA, hotspot)
Description: Given a SGA gene, find in which tumors it has somatic mutations at given hotspot
Input: SGA gene name
Output: a list a tumors in which the given SGA has somatic mutations at Given hotspot
Example: findTumorsHasSMForAGivenSGAInGivenHopspot(“PIK3CA”, 1047)

(Intermediate) Function 12: findTumorsHasSCNAForAGivenSGA(SGA)
Description: Given a SGA gene, find in which tumors it has copy number alternation
Input: SGA gene name
Output: a list a tumors in which the given SGA has copy number alternation at Given hotspot
Example: findTumorsHasSCNAForAGivenSGA(“PIK3CA”)

(Intermediate) Function 13: queryPatientsAndDEGsForAGivenSGAandTumorset(SGA, tumors)
Description: Given a SGA name(it could be a SGA gene name or a SGA unit/group name) and a list of tumors, find DEGs that regulated by this SGA in each tumor
Input: SGA name, a list of tumors
Output: a dictionary, key is DEG that regulated by given SGA, value is a list of tumors, it means in these tumors given SGA called key DEG as a target

Function 14: findDEGforAGivenSGA(SGA, cutoff, tumors = None)
Description: tumors parameter is an optional parameter here. If no tumors parameter, it means given a SGA (a SGA unit/group name or a SGA gene name) and a cutoff value, find the DEGs regulated by the given SGA in all tumors with call rate no less than given cutoff, otherwise it means given a SGA (a SGA unit/group name or a SGA gene name), and a cutoff value, find the DEGs regulated by the given SGA in given tumor sets with call rate no less than given cutoff.
Input: a SGA unit/group name or a SGA gene name, cutoff value, a list of tumors
Output: a cvs file with 4 columns, column1 is DEG, column2 is numbers of tumors that called given SGA as a driver, column3 is numbers of tumors the DEG is called target by given SGA, column4 is call rate(value in column2 divided by value is column3)

Function 15: findDEGofSCNAForAGivenSGA(SGA, cutoff)
Description: given a SGA gene name, find targeted DEG of SCNA regulated by this SGA with call rate no less than given cutoff
Input: a SGA gene, a cutoff
Output: a cvs file with 4 columns, column1 is DEG, column2 is numbers of tumors that called given SGA as a driver, column3 is numbers of tumors the DEG is called target by given SGA, column4 is call rate(value in column2 divided by value is column3)

Function 16: findDEGsOfSMForAGivenSGA(SGA, cutoff)
Description: given a SGA gene name, find targeted DEG of Somatic Mutation regulated by this SGA with call rate no less than given cutoff
Input: a SGA gene, a cutoff
Output: a cvs file with 4 columns, column1 is DEG, column2 is numbers of tumors that called given SGA as a driver, column3 is numbers of tumors the DEG is called target by given SGA, column4 is call rate(value in column2 divided by value is column3)

Function 17: findDEGsInHotspotOfAGivenSGA(SGA, hotspot, cutoff)
Description: Given a SGA gene name, a mutation hotspot, and a cutoff value, find target DEGs regulated by given SGA that has mutation on given hotspot with call rate no less than given cutoff value
Input: a SGA gene name, a mutation hotspot, a cutoff value
Output: a cvs file with 4 columns, column1 is DEG, column2 is numbers of tumors that called given SGA as a driver, column3 is numbers of tumors the DEG is called target by given SGA, column4 is call rate(value in column2 divided by value is column3)

Function 18: findDEGsInHotspotOfAGivenSGAForTwoHotspot(SGA, hotspot1, hotspot2, cutoff)
Description: Given a SGA gene name, two mutation hotspots, and a cutoff value, find target DEGs regulated by given SGA that has mutations on given hotspots with call rate no less than given cutoff value
Input: a SGA gene name, a mutation hotspot, a cutoff value
Output: a cvs file with 4 columns, column1 is DEG, column2 is numbers of tumors that called given SGA as a driver, column3 is numbers of tumors the DEG is called target by given SGA, column4 is call rate(value in column2 divided by value is column3)

Function 19: findTumorsInWhichASGAIsADriverOutputBySMAndSCNA(SGA)
Description: Given a SGA gene name, find tumors called given SGA as a driver
Input: a SGA gene name
Output: a csv file with 7 columns, column1 is tumor name. Column2 is alternation type. Column3 is gistic_score if column2 is SCNA, otherwise column 3 is null. Column4 is mutate type if column2 is SM otherwise it is null. Column5 is aa_norm value if column2 is SM otherwise it is null, column 6 is aa_loc value if column2 is SM otherwise it is null, column7 is aa_mut value if if column2 is SM otherwise it is null.

Function 20: findCommonDEGListForTwoSGA(SGA1, SGA2)
Description: given two SGAs (the SGA could be a SGA gene name or a SGA unit/group name), find the common DEG list for these two SGA
Input: two SGAs
Output: a csv file with 1 column of DEG

Function 21: findSGARegulateAPairOfDEGs(DEG1, DEG2)
Description: given a pair of DEGs, find SGA drivers regulate given pair of DEGs in a tumor
Input: two DEG gene name
Output: a csv file with 1 column of SGA

Function 22: findTumorsCalledAGivenSGADriver(SGA)
Description: given a SGA unit or a SGA gene, find regualted deg list in driven tumors group by tumor type
Input: SGA, it can be a SGA unit or a SGA gene
Output: a csv file containning 4 columns. Column1 is cancer type. Column2 is number of tumors which means total numbers of tumors has SGA events for given SGA. column3 is number of tumors call driver, which means the number of tumors that called given SGA as a driver. Column4 is is call rate, it comes from the value in column2 divided by column3

Function 23: findAllSGACallRate()
Description: count number of SGA events and number of driver calls and calculate driver call rate for each SGA
Input: SGA, it can be a SGA unit or a SGA gene
Output: a csv file containning 4 columns. Column1 is SGA. Column2 is number of tumors which means total numbers of tumors has SGA events for given SGA. column3 is number of tumors call driver, which means the number of tumors that called given SGA as a driver. Column4 is is call rate, it comes from the value in column2 divided by column3



About

No description, website, or topics provided.

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Languages

  • Python 100.0%