Scripts for Population Genetic Diversity of Sclerotinia sclerotiorum, Causal Agent of White Mold Disease of Dry Bean, and Implications for Fungicide Resistance / Disease Management.
This directory contains the scripts, written by Anthony Pannullo, to generate the figures and tables used in this manuscript. These scripts were modified by Zhian Kamvar in the winter of 2017/2018 to ensure reproducibility of the results.
These data and analyses are archived on the Open Science Framework. If you use them, please use the following citation in your work:
Pannullo A, Kamvar ZN, Miorini TJJ, Steadman J, Everhart SE (2018) Data and Analysis for Genetic variation and structure of Sclerotinia sclerotiorum populations from soybean in Brazil. https://osf.io/u6xsm.
All of the results are controlled via the Makefile. A diagram of how the makefile proceeds can be found in workflow.pdf:
- `00-install.R` is run, which installs/updates the packages needed for these analyses. This creates `bootstrap.txt` and the folders "figs", "results", and "tables".
- All the R scripts are run, creating their `results/*.Rout` files and the figures and tables needed for the manuscript.
- rsync will copy all of the results (ignoring `.git/` and `.Rproj.user/`) to the box drive (assuming OSX).
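
As a rough illustration of the folder-creation part of the first step, the sketch below creates the three output folders under a project root. The helper name `make_output_dirs` and the temporary root directory are assumptions made for this example; this is not the code in `00-install.R`.

```r
# Hypothetical helper (not taken from 00-install.R): create the output
# folders named in this README under a given project root.
make_output_dirs <- function(root = ".") {
  dirs <- file.path(root, c("figs", "results", "tables"))
  for (d in dirs) {
    # showWarnings = FALSE keeps repeated runs quiet if a folder already exists
    dir.create(d, showWarnings = FALSE, recursive = TRUE)
  }
  invisible(dirs)
}

# Demonstrate in a throwaway directory so nothing in the project is touched
root <- tempfile("project-")
created <- make_output_dirs(root)
print(basename(created))
```

Running the helper a second time on the same root is harmless, which is the property a Makefile-driven workflow needs from a setup step.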
There are two data sets. In both data sets, missing data are represented with `NA` values.
This data set contains genotypes and location names with the following columns:
Column | Value |
---|---|
Sample | Sample name |
Continent_Country_Population | population factor separated by underscores |
5-2 | locus 1 |
6-2 | locus 2 |
7-2 | locus 3 |
8-3 | locus 4 |
9-2 | locus 5 |
12-2 | locus 6 |
20-3 | locus 7 |
55-4 | locus 8 |
110-4 | locus 9 |
114-4 | locus 10 |
17-3 | locus 11 |
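
A minimal sketch of reading a table with this layout in base R is shown below. The two rows, the population codes, and the allele values are invented for illustration and only two of the eleven loci are included. Setting `check.names = FALSE` keeps locus names such as `5-2` intact, and the underscore-separated population factor is split into its three parts.

```r
# Made-up rows in the layout described above (illustrative values only)
csv_text <- "Sample,Continent_Country_Population,5-2,6-2
iso1,ContA_CountryA_PopA,320,NA
iso2,ContA_CountryB_PopB,NA,483"

# na.strings = "NA" marks missing data; check.names = FALSE preserves "5-2"
dat <- read.csv(text = csv_text, na.strings = "NA", check.names = FALSE,
                stringsAsFactors = FALSE)

# Split the population factor on underscores into three columns
parts <- strsplit(dat$Continent_Country_Population, "_", fixed = TRUE)
pop <- as.data.frame(do.call(rbind, parts), stringsAsFactors = FALSE)
names(pop) <- c("Continent", "Country", "Population")

# Reassemble: sample name, the three location columns, then the loci
result <- cbind(dat["Sample"], pop, dat[-(1:2)])
print(result)
```

This assumes every population factor has exactly three underscore-separated fields; a value with more or fewer fields would need to be handled before the `rbind` step.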
Metadata of information from the JR Steadman lab database. These data were compiled by SEE on 2018-02-14 and checked by ZNK.
Column | Value |
---|---|
MCG | Mycelial Compatibility Grouping as recorded by TJM |
AP-GenoID | Isolate ID used by AP |
InventoryID | ID used in JRS lab collection |
in-JRS-collection | Logical indicator for presence of isolate in JRS collection |
AP-Continent_Country_Population | population factor used by AP (encoding changed) |
JRS-Isolate # | Identical to InventoryID, was linked to JRS database for confirmation |
JRS-Collection Date | Collection year or date in D/M/Y format |
JRS-Source (Host) | Host Name |
JRS-Geographical Location | Location in various formats. Usually City, State. For BR isolates: City/State, Country |
JRS-Notes | Notes from the JRS database |
Combined metadata and haplotype data after cleaning, merging, and checking for accuracy. This is generated by 01-CleanData.R. These data are in ISO-8859-1 encoding.
Column | Value |
---|---|
GenoID | Sample name |
MCG | Mycelial Compatibility Grouping as recorded by TJM |
Year | Collection year |
Continent | Continent of origin |
Country | Country of origin |
Population | State/Region of origin |
Subpop | Locality/County/City of origin |
5-2 | locus 1 |
6-2 | locus 2 |
7-2 | locus 3 |
8-3 | locus 4 |
9-2 | locus 5 |
12-2 | locus 6 |
20-3 | locus 7 |
55-4 | locus 8 |
110-4 | locus 9 |
114-4 | locus 10 |
17-3 | locus 11 |
Missing values are recorded as `NA`, except for location, where the highest hierarchical level is recorded. For example, we do not have a specific locality recorded for the Argentina samples, so the data are recorded like so:
Continent | Country | Population | Subpop |
---|---|---|---|
Argentina | Argentina | Argentina | Argentina |
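
One way to picture this rule is the sketch below, which fills each missing location level from the level directly above it. The helper `fill_location` and the example values are hypothetical, and this is an assumption about how such a fill could be done rather than the code used in 01-CleanData.R.

```r
# Hypothetical helper: fill a missing level with the value one level up,
# working from the broadest level (left) to the most specific (right).
fill_location <- function(df,
                          levels = c("Continent", "Country", "Population", "Subpop")) {
  for (i in seq_along(levels)[-1]) {
    missing <- is.na(df[[levels[i]]])
    df[[levels[i]]][missing] <- df[[levels[i - 1]]][missing]
  }
  df
}

# Made-up example: the second sample has no state/region or locality recorded
loc <- data.frame(
  Continent  = c("ContA", "ContA"),
  Country    = c("CountryA", "CountryB"),
  Population = c("PopA", NA),
  Subpop     = c("TownA", NA),
  stringsAsFactors = FALSE
)

filled <- fill_location(loc)
print(filled)
```

Because the loop runs from left to right, a value filled at one level is carried down to the more specific levels below it, so the second row ends up with the country name in both `Population` and `Subpop`.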
The identifiers in data.csv were based on the original isolate names from Brazil. However, because of name clashes with the JR Steadman lab database, these names were changed as soon as the isolates were entered into the database. These two files are harmonized in 01-CleanData.R before being used in any other script to ensure consistent use of region names.
On 2017-08-24, Zhian Kamvar created the Rproj file as an anchor for this project and replaced all instances of specific file paths with invocations of the package "here".
DO NOT OPEN THIS Rproj FILE FROM THE BOX FOLDER. Instead, create a new folder where you want to work (e.g. a new folder in your documents called "Thesis Project") and copy all the files within the Box folder "Scripts" there. Once you have copied these files, you can open the RStudio project and select "Build > Build All" from the dropdown menu. This will update your packages and compile all the figures and tables.