forked from Weeks-UNC/Superfold
SuperFold is a pipeline that uses output data from ShapeMapper to model RNA secondary structures, including pseudoknots; identify de novo regions with well-defined and stable structures; and visualize most probable and alternative helices.
dougalII/Superfold
###################################################################################
SuperFold installation, execution, and troubleshooting.
Gregg Rice 2014
[email protected]
###################################################################################

Requirements:
===================================================================================
python 2.7
===================================================================================
RNAstructure - https://rna.urmc.rochester.edu/RNAstructure.html
    The Fold and partition executables are needed to predict
    secondary structure and base pairing probabilities.

    Download the command-line applications for your platform.
    Extract to your home directory.
    Build the binaries using the 'make all' command in the RNAstructure directory.
    Add the following 2 lines to ~/.bash_profile:
        export PATH=$PATH:$HOME/RNAstructure/exe
        export DATAPATH=$HOME/RNAstructure/data_tables
===================================================================================
matplotlib (python module required for .pdf figure rendering) -
    Download source.
    Extract to any directory.
    cd to the extracted directory.
    Run the command "python setup.py install --user"
===================================================================================
httplib2 (python module only required if rendering structures) -
    Download httplib2-0.7.6.tar.gz (or later version).
    Extract to any directory.
    cd to the httplib2 directory.
    Run the command "python setup.py install --user"
===================================================================================

###################################################################################

###################################################################################
Execution instructions:

SuperFold can be run using one command:
    python SuperFold.py RNA.map

All other flags are optional. Use the --help flag for explanations of the
command line options:
    python SuperFold.py --help

File Setup:
The only required file is a .map file.
This file is generated automatically by the ShapeMapper pipeline.
The .map file consists of the nucleotide #, SHAPE reactivity, error, and
nucleotide sequence. T nucleotides are automatically converted to U by SuperFold.

---myFavoriteRNA.map---
1 0.002512 0.053798 G
2 -0.034906 0.143529 T
3 -0.077852 0.257623 T
4 -0.068123 0.122385 T

Differential SHAPEMap file:
The differential file consists of the nucleotide #, differential SHAPE reactivity,
std error, nucleotide sequence, and Z-factor of the difference, calculated by
    1 - 3(1m6_err + nmia_err)/abs(shape1 - shape2)

--myRNAnmia-1m6.mapd--
1 -999.0 -999.0 G -999.0
2 -0.0124 0.2673 U -74.2440186566
3 0.0951 0.0833 U -2.34887508212
4 0.0409 0.0929 U -7.96984706503

A differential SHAPEMap file is created by running the utility
differenceByWindowSHAPEMAP.py. This program has the following usage:
    Usage: <nmia.txt.map> <1m6.txt.map> <difference.dif.mapd> <i>

Create your .mapd file using the following command:
    python differenceByWindowSHAPEMAP.py nmia.map 1m6.map nmia-1m6.mapd 25

where nmia.map and 1m6.map are the names of the NMIA and 1M6 map files.
The new file "nmia-1m6.mapd" is the differential map file to pass to the
--differentialFile flag of SuperFold.

Single Strand Constraints:
List here any nucleotides that other evidence indicates are single stranded
and should therefore not be considered for folding.
ex:
---ssConstraints.txt--- < this part is just the name, not in the file
34
35
36
78
77
76

PK constraints:
In a second file, list the PKs in pairs. SuperFold uses this paired PK file to
reassemble your pk'd nucleotides in the final step.
ex:
---ListofPKs_ds.txt---
34 78
35 77
36 76

ShapeMapper 2.2+ and --dms:
If the data was generated with ShapeMapper 2.2+ in DMS mode (--dms), SuperFold
should be run with the --DMS flag. This modifies the submitted Fold and partition
commands in a manner compatible with DMS SM 2.2+ data.
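
The .map layout and the Z-factor formula described above can be sketched in a
few lines of Python. This is only an illustration under stated assumptions, not
code from SuperFold or differenceByWindowSHAPEMAP.py: the function names
parse_map_line, read_map, and z_factor are hypothetical, columns are assumed to
be whitespace separated, and the formula is applied to a single pair of values
with no windowing.

```python
def parse_map_line(line):
    """Parse one .map line: nucleotide #, SHAPE reactivity, error, nucleotide."""
    fields = line.split()
    number = int(fields[0])
    reactivity = float(fields[1])
    error = float(fields[2])
    # Mirror the documented behavior: T nucleotides are converted to U.
    nucleotide = fields[3].upper().replace("T", "U")
    return number, reactivity, error, nucleotide


def read_map(lines):
    """Parse every non-blank line of a .map file (hypothetical helper)."""
    return [parse_map_line(line) for line in lines if line.strip()]


def z_factor(shape1, shape2, err1, err2):
    """Z-factor as written in the README: 1 - 3(err1 + err2)/abs(shape1 - shape2).

    Returns None when the two reactivities are equal, since the formula is
    undefined there (an assumption of this sketch, not documented behavior).
    """
    difference = abs(shape1 - shape2)
    if difference == 0:
        return None
    return 1 - 3 * (err1 + err2) / difference


# The example rows from myFavoriteRNA.map above.
example = """\
1 0.002512 0.053798 G
2 -0.034906 0.143529 T
3 -0.077852 0.257623 T
4 -0.068123 0.122385 T
"""
rows = read_map(example.splitlines())
for row in rows:
    print(row)
```

A Z-factor close to 1 means the difference is large relative to the combined
errors; strongly negative values, as in the example .mapd rows, mean the errors
dominate the difference.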
###################################################################################
Output description and troubleshooting:

Occasionally (depending on the RNA and SHAPE constraints) a smaller window size
is required for partition and for Fold in order to obtain base pairs in the
output. The window sizes are set with the flags:
    --partitionWindowSize
    --foldWindowSize

1000 is a good size to select for the partition window.
For window sizes less than 1000, set --trimInterior to 200 nucleotides in order to
obtain an output for interior windows.
Smaller window sizes bias the result toward shorter range interactions.

Outputs are listed in the order of execution:

SuperFold automatically creates folders to store the output. To prevent
collisions between file names, a cryptographic hash of the input values is
appended to the folder and file names.

A log file detailing the run is in the results folder.

Intermediate partition function calculations are in the partition folder.

Intermediate fold calculations are in the fold folder.

Merged partition function and minimum free energy structures are in the results
folder and begin with the title merged.

Likely base pairs from the partition function are plotted as arcs in the arcs file.
The following is the key:
    green  > 80%
    blue   > 30%
    yellow > 10%
    gray   > 3%

The Shannon entropy and SHAPE analysis is plotted in the ShannonSHAPE pdf file.

Region cutsites are written to the log file.

Individual region structure files and plots are written to the regions folder
with the region range in the filename.
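
The arc color key above amounts to a threshold lookup on the pairing
probability. The following is a minimal sketch of that lookup, assuming the
thresholds are strict ("greater than") and that pairs at or below 3% are not
drawn. The names ARC_COLOR_KEY and arc_color are hypothetical and are not taken
from the SuperFold source.

```python
# (threshold, color) pairs from the key above, highest threshold first.
ARC_COLOR_KEY = [
    (0.80, "green"),
    (0.30, "blue"),
    (0.10, "yellow"),
    (0.03, "gray"),
]


def arc_color(probability):
    """Return the key color for a pairing probability in [0, 1].

    Returns None when the probability does not exceed the lowest threshold
    (assumed here to mean that no arc is drawn).
    """
    for threshold, color in ARC_COLOR_KEY:
        if probability > threshold:
            return color
    return None


for p in (0.95, 0.50, 0.20, 0.05, 0.01):
    print(p, arc_color(p))
```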