Releases: stevemussmann/admixturePipeline
AdmixPipe v3.2.1
Added new options
- -H option in admixturePipeline.py: allows user to perform haploid data analysis in Admixture
- -s option in distructRerun.py: allows user to sort by q-value within sample groups when plotting Admixture outputs.
AdmixPipe v3.2
Updates added for v3.2 release and pending publication:
- Filtering options now enabled for PLINK-formatted files
- Documentation and example files updated.
- Tutorial expanded and updated (see example_files folder)
AdmixPipe v3.1
New features:
- Direct input of PLINK .ped format (PLINK option --recode12) and .bed format is now allowed. See documentation for available filtering options.
- CLUMPAK is now integrated into the Docker container, allowing it to be run locally. The submitClumpak.py module can now use the local installation and can still submit jobs to the CLUMPAK web server. Note that some command line options have changed in this module.
Bug fixes:
- Fixed a major bug in which all Admixture outputs for each K value were overwritten when the pipeline was run from several directories above the input files and given a relative path to the input.
AdmixPipe v3.0.3
Same as v3.0.2 but with updated example files.
AdmixPipe v3.0.2
New features:
- cvSum.py now calculates summary statistics and plots for loglikelihood values. This is done for both major and minor clusters identified by CLUMPAK.
Bug fixes:
- Changed import order of pandas and matplotlib in graphics.py. The previous order was causing cvSum.py to crash on some systems.
- Fixed a bug in distructRerun.py that could potentially cause duplication of CV values if distructRerun.py was run on the same dataset multiple times.
Other changes:
- Removed some legacy code from admixturePipeline.py that was no longer being used (i.e., system calls to grep commands that were printing CV and loglikelihood values to files).
AdmixPipe v3.0.1
Update List:
- Beginning in AdmixPipe v3.0, distructRerun.py stores the paths to the .q files output by CLUMPAK in a JSON file so that runEvalAdmix.py does not have to parse the CLUMPAK output a second time. When averaging the values in the .q files to plot the EvalAdmix output for major/minor clusters, the code failed to throw an error when it found no files to average. I inserted a block of code that tests for this condition and reports a more informative error message telling the user what might have happened (i.e., they moved their files after running distructRerun.py, or they somehow managed to run distructRerun.py outside of the Docker container).
- Code for parsing popmap files was updated so that 1) the code called from admixturePipeline.py and runEvalAdmix.py is now the same, and 2) it now prints a more informative error message when >2 columns are present on any line of the popmap file.
- Docker container was updated to install the programs wget, less, and vim so that users can view, edit, and pull files from the web more easily from within the container.
- Docker container also now defaults to the data directory upon launch (/app/data) so that users do not have to immediately change directories upon launching the Docker container.
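The first two items above both describe failing early with an informative message. The sketch below illustrates that idea only; it is not the code shipped in AdmixPipe. The function names `parse_popmap` and `average_q_files` are hypothetical, the popmap is assumed to be a whitespace-delimited file with a sample column and a population column, and the .q files are assumed to hold one whitespace-delimited row of ancestry proportions per individual.

```python
from pathlib import Path


def parse_popmap(path):
    """Return {sample: population} from a two-column popmap (hypothetical helper)."""
    popmap = {}
    with open(path) as handle:
        for line_number, line in enumerate(handle, start=1):
            fields = line.split()
            if not fields:
                continue  # skip blank lines
            if len(fields) != 2:
                # Assumption: any line without exactly two columns is an error.
                raise ValueError(
                    f"{path}, line {line_number}: expected 2 columns "
                    f"(sample, population) but found {len(fields)}"
                )
            sample, population = fields
            popmap[sample] = population
    return popmap


def average_q_files(q_paths):
    """Average ancestry proportions element-wise across .q files (hypothetical helper)."""
    existing = [Path(p) for p in q_paths if Path(p).is_file()]
    if not existing:
        # Fail loudly instead of silently averaging nothing.
        raise FileNotFoundError(
            "No .q files found to average. Were the files moved after "
            "distructRerun.py was run, or was it run outside the Docker container?"
        )
    matrices = []
    for path in existing:
        with open(path) as handle:
            rows = [[float(x) for x in line.split()] for line in handle if line.strip()]
        matrices.append(rows)
    count = len(matrices)
    first = matrices[0]
    # Assumes every file has the same number of rows and columns as the first.
    return [
        [sum(m[i][j] for m in matrices) / count for j in range(len(first[i]))]
        for i in range(len(first))
    ]
```

In this sketch the list of paths would come from the JSON file written by distructRerun.py; how that file is structured is not described here, so loading it is left out.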
AdmixPipe v3.0
v3.0 has several "behind the scenes" changes, bug fixes, and enhancements to existing modules. Two new modules have also been developed for the following purposes:
- automatic submission of admixturePipeline.py output to the CLUMPAK website.
- assessment of the best K using the evalAdmix package (http://www.popgen.dk/software/index.php/EvalAdmix).
Other important notes for v3.0:
- Some outputs from AdmixPipe v2.0 are incompatible with v3.0 because I now use JSON files to track data and file names from early parts of the pipeline that are needed in some of the later modules.
- Whereas AdmixPipe v2.0 was backward compatible with Python 2.7.x, v3.0+ requires Python 3.
- Some command line options have changed slightly, especially the long-form options. You can get the current list of options by running any module with the --help option.
- There is now a Docker container to streamline the installation process.
- The submitClumpak.py module is completely optional. You can accomplish the same results by manually submitting your admixturePipeline.py module outputs to the CLUMPAK website.
- The submitClumpak.py module will not function in the Docker container. If you wish to use it, this module must be set up on your own computer. It requires selenium and currently works only with Firefox.
- The data processing and plotting functions of the cvSum.py module underwent a complete rewrite for v3.0.
AdmixPipe v2.0.2
Publication release.
Changed default setting for biallelic filter from off to on.
Admixture Pipeline v2.0.1
Bug fix - drawparams files created for Distruct were not finding input files when absolute paths were being used.
Admixture Pipeline v2.0
Pre-release for new admixturePipeline version.
v2.0 combines admixturePipeline with my admixture_cv_sum and distruct-rerun repositories.
New features include:
- Python 3 compatibility
- Added distructRerun.py and cvSum.py
- distructRerun.py can now execute distruct on drawparams files
- New options and flexibility are included for selecting a color palette in distructRerun.py
- distructRerun.py now performs functions that previously required additional bash commands to be run by the user
Bug fixes:
- The correct input population list for CLUMPAK is now generated when using the blacklist option or excluding individuals due to high amounts of missing data in admixturePipeline.
- Fixed a condition in which admixturePipeline would erroneously include .Q files from subdirectories.
- Fixed parsing of boolean command line options.
- Various minor bug fixes.
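The .Q file fix listed above is about collecting outputs from one directory only. The sketch below shows one way to do that in Python. It assumes nothing about how admixturePipeline actually gathers its files, and the function name `collect_q_files` is hypothetical.

```python
from pathlib import Path


def collect_q_files(directory):
    """Return the .Q files directly inside `directory`, ignoring subdirectories."""
    # Path.glob("*.Q") matches only the top level of `directory`;
    # Path.rglob("*.Q") would also descend into subdirectories.
    return sorted(p for p in Path(directory).glob("*.Q") if p.is_file())
```

Calling `collect_q_files(".")` from the run directory would then leave untouched any .Q files from older runs kept in nested folders.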
Experimental options:
- distructRerun.py now has an option that may allow you to re-run distruct on STRUCTURE data that were analyzed in CLUMPAK. This option has not been robustly tested.