forked from aaronshehan/phylogenetic-tree-generator
Aaron Shehan
Het Patel
Our source for lcs.py and lcs function calls in our other files is:
https://www.geeksforgeeks.org/longest-common-substring-dp-29/
Our source for the generateLcsMatrix.py file is:
https://github.com/SupriyaL/Hierarchical-Clustering/blob/master/Preprocess.py
Our source for the generateTree.py file is:
https://github.com/SupriyaL/Hierarchical-Clustering/blob/master/code_1.py
Our sources for extraCredit.txt are:
https://www.ncbi.nlm.nih.gov/gene/7289
https://www.sciencedirect.com/science/article/pii/S0960982219300831
1) The LCS code is provided in a file called lcs.py
2) To test lcs.py, pass the name of the input file as a command line argument.
To execute lcs.py:
python lcs.py <filename>
3) Two programs are used to generate the phylogenetic tree: GenerateTree.py and GenerateLcsMatrix.py. A Python script called script.py is provided to run both of them. The name of the file containing the protein sequences must be passed in as a command line argument.
To execute script.py:
python script.py <filename>
The generated phylogenetic tree is saved in a file called "PyhlogeneticTreePicture.png".
4) Our description of part 3 is located in "Part3DescriptionOfOurAlgorithmAndSoftware.txt".
5) Our comparison for part 4, between our tree (PyhlogeneticTreePicture.png) and the tree produced by https://www.ebi.ac.uk/Tools/msa/clustalo/, is located in "Part4ComparingOurDataWithTheWebsite.txt".
A comparison of the two trees can also be found in this file: 4110_assignment3_trees.pdf
6) Our description of the Tulp3 gene for extra credit is located in "extraCredit.txt".
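
For readers who want a feel for the LCS step before opening the code, here is a minimal sketch. It is not the contents of lcs.py or GenerateLcsMatrix.py. It assumes "LCS" means longest common substring, as in the GeeksforGeeks source cited above, and it assumes the matrix holds the pairwise LCS length for every pair of sequences. The names lcs_length and lcs_matrix and the toy sequences are made up for illustration.

```python
def lcs_length(a: str, b: str) -> int:
    """Length of the longest common substring of a and b (dynamic programming)."""
    best = 0
    # prev[j] = length of the common suffix of a[:i-1] and b[:j]
    prev = [0] * (len(b) + 1)
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            if a[i - 1] == b[j - 1]:
                # Extend the common run that ended at the previous characters.
                curr[j] = prev[j - 1] + 1
                best = max(best, curr[j])
        prev = curr
    return best


def lcs_matrix(seqs):
    """Symmetric matrix of pairwise LCS lengths (hypothetical helper)."""
    n = len(seqs)
    return [[lcs_length(seqs[i], seqs[j]) for j in range(n)] for i in range(n)]


if __name__ == "__main__":
    # Made-up toy sequences, not real protein data.
    sequences = ["MKTAYIAK", "MKTAHIAK", "GGTAYIGG"]
    for row in lcs_matrix(sequences):
        print(row)
```

A larger LCS length suggests two sequences are more similar. How the real programs turn these values into distances for clustering is described in "Part3DescriptionOfOurAlgorithmAndSoftware.txt", not here.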