Integrative Python library for RNA Secondary Structure Analysis

lipan6461188/IPyRSSA

IPyRSSA

IPyRSSA (Integrative Python library for RNA Secondary Structure Analysis) is a Python library for analyzing RNA secondary structure and SHAPE data.

New

Python 3 is now supported. Python 2 is no longer supported.

Update your local library

git pull origin

General module

`import General`

| Function name | Usage |
| --- | --- |
| load_fasta | Read a FASTA file |
| write_fasta | Write a FASTA file |
| load_dot | Read a dot-bracket file |
| write_dot | Write a dot-bracket file |
| load_shape | Read a SHAPE .out file |
| load_SHAPEMap | Read a SHAPEmap file |
| load_ct | Read a .ct file |
| write_ct | Write a .ct file |
| init_pd_rect | Build a dataframe |
| init_list_rect | Build a list matrix |
| find_all_match | Find all regions matching a regex |
| bi_search | Binary search |
| calc_shape_gini | Calculate the SHAPE Gini index |
| calc_shape_structure_ROC | Calculate ROC points from a structure and SHAPE scores |
| calc_AUC | Calculate AUC from ROC points |
| calc_AUC_v2 | Calculate AUC from dot and shape_list |
| seq_entropy | Calculate the entropy of the sequence |
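
To make the Gini index mentioned for `calc_shape_gini` concrete, here is a minimal standalone sketch of one common Gini definition applied to a list of reactivity scores. It does not call IPyRSSA and is not taken from its source: the name `gini_index` is hypothetical, and skipping `None` entries as missing values is an assumption.

```python
def gini_index(values):
    """Gini index of non-negative scores (hypothetical helper, not the IPyRSSA API).

    Returns 0.0 when all scores are equal and approaches 1.0 when a
    single position carries almost all of the signal.
    """
    # Assumption: missing reactivities are represented as None and skipped.
    data = sorted(v for v in values if v is not None)
    n = len(data)
    total = sum(data)
    if n == 0 or total == 0:
        return 0.0
    # Formula on sorted data: G = 2 * sum(i * x_i) / (n * sum(x)) - (n + 1) / n
    weighted = sum(rank * x for rank, x in enumerate(data, start=1))
    return 2.0 * weighted / (n * total) - (n + 1.0) / n


if __name__ == "__main__":
    print(gini_index([0.1, 0.1, 0.1, 0.1]))   # evenly distributed scores
    print(gini_index([0.0, 0.0, 0.0, 1.0]))   # signal concentrated at one base
```

A higher value means the reactivity is concentrated on fewer positions; the library's own function may differ in how it filters or windows the scores.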

Colors module

`import Colors`

| Function name | Usage |
| --- | --- |
| format or f | Format colorful text |
| color_SHAPE | Convert a SHAPE list to colorful blocks |
| color_Seq_SHAPE | Convert a sequence to a colorful sequence |
| browse_shape | Print and compare single or multiple SHAPE scores |
| browse_multi_shape | Align multiple sequences and print SHAPE scores |

Cluster module

`import Cluster`

Warning: This module can only be used on loginviewxx/mgtxx

| Function name | Usage |
| --- | --- |
| new_job | Get a job handle |
| handle.set_job_depends | Execute the job only after the jobs given as parameters are done |
| handle.submit | Submit the job to the queue |
| handle.has_finish | Return True if the job has finished |
| handle.job_status | Return one of Not_Found, DONE, RUN, PEND, EXIT |
| handle.wait | Wait for the job to finish |
| handle.kill | Kill the job |

Seq module

`import Seq`

Prerequisites: pyliftover, pysam

| Function name | Usage |
| --- | --- |
| reverse_comp | Get the reverse complement of a raw sequence |
| flat_seq | Break a long sequence into a multi-line sequence |
| format_gene_type | Classify the raw gene type in an annotation into a common gene type |
| Class: seqClass | A class to fetch sequences from a large genome |
| lift_genome | Convert the genome version (hg19 => hg38) |
| search_subseq_from_genome | Search for a pattern in a genome region |
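
As an illustration of what a reverse complement is, the sketch below computes one for a DNA sequence in plain Python. It is independent of `Seq.reverse_comp`: the name `reverse_complement` is hypothetical, and the alphabet (A, C, G, T, N, upper and lower case) is an assumption.

```python
# Translation table for the DNA alphabet; other characters pass through unchanged.
_DNA_COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def reverse_complement(sequence):
    """Return the reverse complement of a DNA string (hypothetical helper)."""
    # Complement every base, then reverse the string.
    return sequence.translate(_DNA_COMPLEMENT)[::-1]


if __name__ == "__main__":
    print(reverse_complement("AACG"))
```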

Structure module

`import Structure`

| Function name | Usage |
| --- | --- |
| predict_structure | Predict secondary structure, with or without SHAPE data |
| bi_fold | Predict RNA interaction |
| search_TT_cross_linking | Search for TT cross-linking sites in a structure |
| dyalign | Predict a common secondary structure for two sequences |
| multialign | Predict a common secondary structure for multiple sequences |
| estimate_energy | Calculate the folding free energy change of a structure |
| partition | Calculate the partition function |
| maxExpect | Calculate the max-expect structure |
| evaluate_dot | Evaluate the sensitivity and PPV of a predicted structure relative to a target structure |
| calc_structure_similarity | Calculate the structure similarity, distance |
| dot2ct | Dot-bracket to list |
| dot2bpmap | Dot-bracket to dictionary |
| parse_pseudoknot | Parse pseudoknots with ctList |
| ct2dot | ctList to dot-bracket |
| write_ctFn | Save a dot-bracket structure to a .ct file |
| dot2align | Convert secondary structure to aligned sequence |
| dot_from_ctFile | Read a dot-bracket from a .ct file |
| trim_stem | Trim a stem loop |
| find_stem_loop | Find stem loops in a secondary structure |
| find_bulge_interiorLoop | Find bulges and interior loops in a secondary structure |
| calcSHAPEStructureScore | Calculate the structure-SHAPE agreement score for a stem loop |
| sliding_score_stemloop | Find stem loops in RNA with a sliding window |
| multi_alignment | Multiple sequence alignment with muscle |
| kalign_alignment | Multiple sequence alignment with kalign |
| global_search | Globally align short sequences to multiple long sequences |
| align_find | Find the unaligned sequence region in an aligned sequence |
| locate_homoseq | Locate homologous regions in multiple sequences |
| dot_to_alignDot | Dot-bracket to aligned dot-bracket |
| shape_to_alignSHAPE | SHAPE list to aligned SHAPE list |
| annotate_covariation | Annotate a raw sequence as a colorful sequence by highlighting the covariation sites |
| dot_F1 | Compare a predicted structure with the true structure and calculate the F1 score |
| parse_structure | Given a dot-bracket structure, parse it into the different kinds of single-stranded and paired bases |
| refine_structure_interior | Check interior loops and make some canonical base pairs paired |
| refine_structure_stackingclosing | Check stacking ends and make some canonical base pairs paired |
| refine_structure_hairpinclosing | Check hairpins and make some canonical base pairs paired |
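
Several of the entries above (dot-bracket conversion, sensitivity/PPV, F1) rest on the same idea: extract the set of base pairs from each dot-bracket string and compare the sets. The sketch below shows that idea in plain Python. It is not the library's implementation: `dot_to_pairs` and `compare_structures` are hypothetical names, positions are assumed to be 1-based, and only `(` and `)` are treated as pairing characters, so pseudoknot brackets are ignored.

```python
def dot_to_pairs(dot):
    """Return the set of (i, j) base pairs in a dot-bracket string (1-based, i < j)."""
    stack = []
    pairs = set()
    for pos, char in enumerate(dot, start=1):
        if char == "(":
            stack.append(pos)
        elif char == ")":
            if not stack:
                raise ValueError("unbalanced ')' at position %d" % pos)
            pairs.add((stack.pop(), pos))
    if stack:
        raise ValueError("unbalanced '(' at position %d" % stack[-1])
    return pairs


def compare_structures(pred_dot, true_dot):
    """Return (sensitivity, PPV, F1) of predicted pairs against reference pairs."""
    pred = dot_to_pairs(pred_dot)
    true = dot_to_pairs(true_dot)
    shared = len(pred & true)
    sensitivity = shared / len(true) if true else 0.0
    ppv = shared / len(pred) if pred else 0.0
    denom = sensitivity + ppv
    f1 = 2 * sensitivity * ppv / denom if denom else 0.0
    return sensitivity, ppv, f1


if __name__ == "__main__":
    print(sorted(dot_to_pairs("((..))")))
    print(compare_structures("((..))", "(....)"))
```

Sensitivity is the fraction of reference pairs that were predicted, and PPV is the fraction of predicted pairs that are in the reference; F1 is their harmonic mean.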

Visual module

`import Visual`

Prerequisites: java, VARNA (http://varna.lri.fr)

| Function name | Usage |
| --- | --- |
| Plot_RNAStructure_Shape | Plot the RNA structure combined with SHAPE scores |
| Plot_RNAStructure_Base | Plot the RNA structure with a different color for each base (A, T, C, G) |
| Plot_RNAStructure_highlight | Plot the RNA structure and highlight some regions |
| Map_rRNA_Shape | Output the rRNA structure in PostScript format |
| get_rRNA_refseq | Return the reference rRNA sequence |

Rosetta module

`from D3 import Rosetta`

Prerequisites: ROSETTA. It can only be run on a cluster.

| Function name | Usage |
| --- | --- |
| pred_3D_rosetta | Predict RNA 3D structure with ROSETTA |

MCSym module

`from D3 import MCSym`

| Function name | Usage |
| --- | --- |
| upload_MCSym_job | Upload an MCSym RNA 3D structure prediction job |
| get_MCSym_status | Get the status of the job |
| minimize_MCSym_newThread | Minimize the PDBs |
| score_MCSym_newThread | Score and rank the PDBs |
| fetch_top_MCSym_pdb_newThread | Download the top-scored PDBs |

HDOCK module

`from D3 import HDOCK`

| Function name | Usage |
| --- | --- |
| upload_HDOCK_job | Upload an HDOCK RNA-protein docking job |
| get_HDOCK_status | Get the status of the job |
| guess_HDOCK_time_left | Guess the time left for the job |
| fetch_HDOCK_results | Download all results |
| fetch_HDOCK_top10_results | Download the top 10 results |

Figures module

`import Figures`

| Function name | Usage |
| --- | --- |
| stackedBarPlot | Plot a stacked bar figure |
| violinPlot | Plot a violin figure |
| piePlot | Plot a pie figure |
| boxPlot | Plot a box figure |
| cdf | Plot a CDF curve |

GPU module

`import GPU`

| Function name | Usage |
| --- | --- |
| get_gpu_processes | Get handles of the processes running on the GPU |
| get_gpu_list | Get a list of available GPUs |
| get_free_gpus | Get a list of IDs of GPUs with no process running on them |

Alignment

`import Alignment`

| Function name | Usage |
| --- | --- |
| blast_seq | Use blastn to search a sequence |
| annotate_seq | Given a sequence and a blastdb, search and annotate the sequence |

Covariation

`import Covariation`

| Function name | Usage |
| --- | --- |
| dot2sto | Convert dot to a Stockholm file |
| cmbuild | Create a .cm file from a Stockholm alignment |
| cmcalibrate | Calibrate a .cm file |
| cmsearch | Call the cmsearch program to search aligned sequences against a CM model |
| R_scape | Run R-scape to call covariation base pairs |
| read_RScape_result | Read the R-scape result |
| get_alignedPos2cleanPos_dict | Get a dictionary {align_pos: raw_pos} |
| call_covariation | Given a sequence and dot, run the covariation pipeline |
| calc_MI | Calculate the mutual information for aligned sequences |
| calc_RNAalignfold | Calculate the RNAalifold covariation score for aligned sequences |
| calc_RNAalignfold_stack | Calculate the RNAalifold covariation score (considering stacking) for aligned sequences |
| collect_columns | Given a multiple alignment, return the alignment columns |
| calc_covBP_from_sto | Given a multiple alignment, return the covariation score for each column pair |
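
For readers unfamiliar with mutual information as a covariation measure, the sketch below computes it for two alignment columns using the textbook definition. It is a standalone illustration and not the code behind `calc_MI`: the name `column_mutual_information` is hypothetical, and dropping any row with a gap (`-`) in either column is an assumption.

```python
import math
from collections import Counter


def column_mutual_information(col_i, col_j):
    """Mutual information (in bits) between two alignment columns.

    Each column is a string or list with one character per aligned sequence.
    """
    if len(col_i) != len(col_j):
        raise ValueError("columns must have the same number of rows")
    # Assumption: rows with a gap in either column are ignored.
    pairs = [(a, b) for a, b in zip(col_i, col_j) if a != "-" and b != "-"]
    n = len(pairs)
    if n == 0:
        return 0.0
    joint = Counter(pairs)
    left = Counter(a for a, _ in pairs)
    right = Counter(b for _, b in pairs)
    mi = 0.0
    for (a, b), count in joint.items():
        p_ab = count / n
        # MI = sum over observed pairs of p(a,b) * log2(p(a,b) / (p(a) * p(b)))
        mi += p_ab * math.log2(p_ab / ((left[a] / n) * (right[b] / n)))
    return mi


if __name__ == "__main__":
    # G-C in half the sequences and C-G in the other half: the columns covary.
    print(column_mutual_information("GGCC", "CCGG"))
    # Fully conserved columns carry no covariation signal.
    print(column_mutual_information("AAAA", "UUUU"))
```

Two columns that always change together score high, while conserved or independent columns score near zero, which is why the measure is used to support predicted base pairs.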
