version 2.0.0 release (#10)
Version 2.0.0, reimplemented in Rust
cmatKhan authored Dec 4, 2024
1 parent 5409b8f commit 29fba2f
Showing 58 changed files with 5,965 additions and 26,922 deletions.
39 changes: 39 additions & 0 deletions .github/workflows/linting.yml
@@ -0,0 +1,39 @@
name: Rust Linting and Formatting

on:
  push:
    branches:
      - main
      - dev
  pull_request:
    branches:
      - main
      - dev
jobs:
  lint:
    runs-on: ubuntu-latest

    steps:
      # Step 1: Check out the code from the repository
      - name: Check out code
        uses: actions/checkout@v3

      # Step 2: Install OpenMPI
      - name: install openmpi
        run: sudo apt-get update && sudo apt-get install -y libopenmpi-dev openmpi-bin

      # Step 3: Install Rust using rustup
      - name: Install Rust
        uses: actions-rs/toolchain@v1
        with:
          toolchain: stable
          components: rustfmt, clippy
          override: true

      # Step 4: Run cargo fmt to check formatting
      - name: Run cargo fmt
        run: cargo fmt --all -- --check

      # Step 5: Run cargo clippy to check for lints
      - name: Run cargo clippy
        run: cargo clippy --all-targets --all-features -- -D warnings
112 changes: 112 additions & 0 deletions .github/workflows/release.yml
@@ -0,0 +1,112 @@
name: Rust Release

on:
  push:
    branches:
      - main

env:
  GH_TOKEN: ${{ github.token }}

jobs:
  build:
    runs-on: ${{ matrix.os }}
    strategy:
      matrix:
        os: [ubuntu-latest, macos-latest, windows-latest]
        feature: ["default", mpi]

    steps:
      # Step 1: Checkout the code
      - name: Checkout
        uses: actions/checkout@v3

      # Step 2: Install OpenMPI (for mpi feature only)
      - name: Install OpenMPI
        if: matrix.feature == 'mpi' && runner.os == 'Linux'
        run: |
          sudo apt-get update \
          && sudo apt-get install -y libopenmpi-dev openmpi-bin
      # Step 3: Cache Rust dependencies
      - name: Cache Rust dependencies
        uses: actions/cache@v3
        with:
          path: target
          key: ${{ runner.OS }}-${{ matrix.feature }}-build-${{ hashFiles('**/Cargo.lock') }}
          restore-keys: |
            ${{ runner.OS }}-${{ matrix.feature }}-build-
      # Step 4: Install Rust toolchain
      - name: Install Rust toolchain
        uses: actions-rs/toolchain@v1
        with:
          toolchain: beta
          default: true
          override: true
          target: ${{ matrix.os == 'macos-latest' && 'x86_64-apple-darwin' || matrix.os == 'windows-latest' && 'x86_64-pc-windows-msvc' || '' }}

      # Step 5: Build the binary with features
      - name: Build
        shell: bash
        run: |
          if [ "${{ matrix.feature }}" == "mpi" ] && [ "${{ matrix.os }}" != "ubuntu-latest" ]; then
            echo "Skipping mpi build for non-Linux systems"
            exit 0
          fi
          cargo build --release ${{ matrix.feature == 'mpi' && '--features mpi' || '' }}
          mv target/release/dual_threshold_optimization \
            target/release/dual_threshold_optimization-${{ matrix.os }}-${{ matrix.feature }}${{ runner.os == 'Windows' && '.exe' || '' }}
      # Step 6: Upload artifact
      - name: Upload artifact
        uses: actions/upload-artifact@v3
        if: success()
        with:
          name: dual_threshold_optimization-${{ matrix.os }}-${{ matrix.feature }}
          path: target/release/dual_threshold_optimization-${{ matrix.os }}-${{ matrix.feature }}${{ runner.os == 'Windows' && '.exe' || '' }}

  release:
    runs-on: ubuntu-latest
    needs: [build]

    steps:
      # Step 1: Checkout the code
      - name: Checkout
        uses: actions/checkout@v3

      # Step 2: Extract version from Cargo.toml
      - name: Get version from Cargo.toml
        id: version
        run: |
          VERSION=$(grep '^version =' Cargo.toml | head -n 1 | cut -d '"' -f 2)
          echo "VERSION=$VERSION"
          echo "version=$VERSION" >> $GITHUB_ENV
      # Step 3: Create a release
      - name: Create GitHub release
        uses: actions/create-release@v1
        env:
          GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
        with:
          tag_name: "v${{ env.version }}"
          release_name: "${{ env.version }}"
          draft: false
          prerelease: false

      # Step 4: Download artifacts from build jobs
      - name: Download artifacts
        uses: actions/download-artifact@v3
        with:
          path: binaries

      # Step 5: Upload binaries to the release
      - name: Upload binaries to release
        run: |
          for file in binaries/*/*; do
            if [ -f "$file" ]; then
              echo "Uploading $file"
              gh release upload "v${{ env.version }}" "$file" --clobber
            fi
          done
47 changes: 47 additions & 0 deletions .github/workflows/tests.yml
@@ -0,0 +1,47 @@
name: Rust Tests

on:
  push:
    branches:
      - main
      - dev
  pull_request:
    branches:
      - main
      - dev

jobs:
  test:
    name: Run Cargo Tests
    runs-on: ubuntu-latest

    steps:
      # Checkout the code from the repository
      - name: Checkout Code
        uses: actions/checkout@v3

      # Install OpenMPI
      - name: Install OpenMPI
        run: sudo apt-get update && sudo apt-get install -y libopenmpi-dev openmpi-bin

      # Set up Rust
      - name: Set up Rust
        uses: actions-rs/toolchain@v1
        with:
          toolchain: stable
          override: true

      # Cache Cargo dependencies
      - name: Cache Cargo Dependencies
        uses: actions/cache@v3
        with:
          path: |
            ~/.cargo/registry
            ~/.cargo/git
          key: ${{ runner.os }}-cargo-${{ hashFiles('**/Cargo.lock') }}
          restore-keys: |
            ${{ runner.os }}-cargo-
      # Run cargo test
      - name: Run Tests
        run: cargo test --all --verbose
42 changes: 28 additions & 14 deletions .gitignore
@@ -1,14 +1,28 @@
Code/*.pyc
Code/tmp/
Job_scripts
ResourcesDataframes
ExtraFiles/
Notebooks/
Output/
PaperFigures/
PaperSupplementals/
Patel*/
archive/
venv/
.ipynb_checkpoints/
__pycache__
# vscode
.vscode/

# ignore local files
/tmp

# Generated by Cargo
# will have compiled files and executables
debug/
target/
target-debug/

# Remove Cargo.lock from gitignore if creating an executable, leave it for libraries
# More information here https://doc.rust-lang.org/cargo/guide/cargo-toml-vs-cargo-lock.html
# Cargo.lock

# These are backup files generated by rustfmt
**/*.rs.bk

# MSVC Windows builds of rustc generate these, which store debugging information
*.pdb

# RustRover
# JetBrains specific template is maintained in a separate JetBrains.gitignore that can
# be found at https://github.com/github/gitignore/blob/main/Global/JetBrains.gitignore
# and can be added to the global gitignore or merged into this file. For a more nuclear
# option (not recommended) you can uncomment the following to ignore the entire idea folder.
#.idea/
17 changes: 17 additions & 0 deletions .pre-commit-config.yaml
@@ -0,0 +1,17 @@
repos:
  - repo: local
    hooks:
      - id: rust-linting
        name: Rust linting
        description: Run cargo fmt on files included in the commit. rustfmt should be installed beforehand.
        entry: cargo fmt --all --
        pass_filenames: true
        types: [file, rust]
        language: system
      - id: rust-clippy
        name: Rust clippy
        description: Run cargo clippy on files included in the commit. clippy should be installed beforehand.
        entry: cargo clippy --all-targets --all-features -- -Dclippy::all
        pass_filenames: false
        types: [file, rust]
        language: system
46 changes: 46 additions & 0 deletions CHANGELOG.md
@@ -0,0 +1,46 @@
# Changelog

All notable changes to this project will be documented in this file.

The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.1.0/),
and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).

## 2.0.0

This is a complete rewrite in Rust. In addition to changing the language, the
following modifications to the algorithm have been made:

1. The heart of this algorithm is $O(n^2)$. In order to approximate the null distribution,
   this $O(n^2)$ operation is performed n times. These permutations are now parallelized and
   will scale linearly down to the time it takes to run a single search through the
   threshold space. Run time on human data with ~16k genes on 30 CPUs is now less than an
   hour, and the run uses less than 3G of space.
1. The command-line input has been greatly simplified. This additionally represents a
   significant change to the protocol:
   - Previously, while the documentation described the operation as 'ranking',
     it would be more appropriate to call it sorting or ordering. Ties were not
     addressed. In version 2.0.0, we now expect that the input is a ranked list where
     the first column is the feature identifier, and the second column is the rank. It
     is up to the user to appropriately rank their data. Examples and recommendations
     are provided in the documentation.
1. Messages are printed to stderr when more than one set of thresholds
   has the same minimum p-value and the same intersect size. This occurs due to the
   nature of the hypergeometric p-value.
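
Because version 2.0.0 leaves ranking (and tie handling) to the user, the following is a minimal sketch of one way to turn raw scores into a two-column ranked list. It is not taken from this package: the function name `average_ranks`, the choice to give larger scores the smaller rank, the use of average ranks for ties, and the tab-separated output are all assumptions made for illustration. Consult the package documentation for the rank direction and file format it actually expects.

```rust
/// Hypothetical helper (not part of this package): assign 1-based ranks to
/// `scores`, where a larger score receives a smaller rank. Tied scores share
/// the average of the ranks they would otherwise occupy.
fn average_ranks(scores: &[f64]) -> Vec<f64> {
    // Sort indices by descending score; total_cmp gives a total order on f64.
    let mut order: Vec<usize> = (0..scores.len()).collect();
    order.sort_by(|&a, &b| scores[b].total_cmp(&scores[a]));

    let mut ranks = vec![0.0; scores.len()];
    let mut start = 0;
    while start < order.len() {
        // Extend `end` over the run of scores tied with the one at `start`.
        let mut end = start + 1;
        while end < order.len() && scores[order[end]] == scores[order[start]] {
            end += 1;
        }
        // Sorted positions start..end correspond to ranks start+1..=end;
        // every member of the tie gets the mean of those ranks.
        let avg = (start + 1 + end) as f64 / 2.0;
        for &idx in &order[start..end] {
            ranks[idx] = avg;
        }
        start = end;
    }
    ranks
}

fn main() {
    // Made-up feature identifiers and scores, for illustration only.
    let features = ["geneA", "geneB", "geneC", "geneD"];
    let scores = [9.5, 3.2, 9.5, 1.0];

    // First column: feature identifier; second column: rank.
    // The tab delimiter is an assumption.
    for (feature, rank) in features.iter().zip(average_ranks(&scores)) {
        println!("{feature}\t{rank}");
    }
}
```

Average ranks are only one convention. Minimum, maximum, or dense ranks are equally easy to produce by changing the value assigned inside the tie loop, and the right choice depends on the data being ranked.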

### Added

1. `profiling/` stores runtime and memory usage information from
   [hyperfine](https://github.com/sharkdp/hyperfine) and
   [heaptrack](https://github.com/KDE/heaptrack), respectively
1. GitHub Actions CI has been added to run tests on pushes to `dev` and `main`
1. Semantic versioning and GitHub releases have been added
1. The package is distributed through crates.io and bioconda
1. Docstrings with examples and module-level documentation
1. Tests
1. An MPI implementation to parallelize across multiple machines

## 1.0.0 -- Initial release

This version was written by [Yiming Kang](https://github.com/yiming-kang) and is the
version which was used to produce the results in
[Dual threshold optimization and network inference reveal convergent evidence from TF binding locations and TF perturbation responses](https://doi.org/10.1101/gr.259655.119).