Merge pull request #6 from genotoul-bioinfo/dev
Add doc and other improvements
JeanMainguy authored Jan 22, 2024
2 parents a70f2c6 + 7cc6653 commit 6e9f448
Showing 24 changed files with 913 additions and 59 deletions.
2 changes: 1 addition & 1 deletion .github/workflows/binette_ci.yml
@@ -1,7 +1,7 @@
 # This workflow will install Python dependencies, run tests and lint with a variety of Python versions
 # For more information see: https://docs.github.com/en/actions/automating-builds-and-tests/building-and-testing-python

-name: Test Binette
+name: CI

 on:
   pull_request:
24 changes: 24 additions & 0 deletions .github/workflows/build_draft_pdf.yml
@@ -0,0 +1,24 @@
name: build draft paper pdf
on: [push]

jobs:
  paper:
    runs-on: ubuntu-latest
    name: Paper Draft
    steps:
      - name: Checkout
        uses: actions/checkout@v4
      - name: Build draft PDF
        uses: openjournals/openjournals-draft-action@master
        with:
          journal: joss
          # This should be the path to the paper within your repo.
          paper-path: paper/paper.md
      - name: Upload
        uses: actions/upload-artifact@v1
        with:
          name: paper
          # This is the output path where Pandoc will write the compiled
          # PDF. Note, this should be the same directory as the input
          # paper.md
          path: paper/paper.pdf
48 changes: 48 additions & 0 deletions .github/workflows/release.yml
@@ -0,0 +1,48 @@
# This workflow will upload a Python Package using Twine when a release is created
# For more information see: https://docs.github.com/en/actions/automating-builds-and-tests/building-and-testing-python#publishing-to-package-registries

# This workflow uses actions that are not certified by GitHub.
# They are provided by a third-party and are governed by
# separate terms of service, privacy policy, and support
# documentation.

name: Upload Python Package

on:
  release:
    types: [published]

# on: [push]

permissions:
  contents: read
  id-token: write

jobs:
  deploy:

    runs-on: ubuntu-latest

    steps:
    - uses: actions/checkout@v3
    - name: Set up Python
      uses: actions/setup-python@v3
      with:
        python-version: '3.x'
    - name: Install dependencies
      run: |
        python -m pip install --upgrade pip
        pip install build
    - name: Build package
      run: python -m build

    - name: Publish package distributions to TestPyPI
      uses: pypa/gh-action-pypi-publish@release/v1
      with:
        user: __token__
        password: ${{ secrets.PYPI_API_TOKEN }}
        # repository-url: https://test.pypi.org/legacy/




39 changes: 39 additions & 0 deletions .readthedocs.yaml
@@ -0,0 +1,39 @@
# .readthedocs.yaml
# Read the Docs configuration file
# See https://docs.readthedocs.io/en/stable/config-file/v2.html for details

# Required
version: 2
python:
  install:
    - method: pip
      path: .
      extra_requirements:
        - doc
        - main_deps



# Set the OS, Python version and other tools you might need
build:
  os: ubuntu-22.04
  tools:
    python: "3.8"



# Build documentation in the "docs/" directory with Sphinx
sphinx:
  configuration: docs/conf.py

# Optionally build your docs in additional formats such as PDF and ePub
# formats:
#   - pdf
#   - epub

# Optional but recommended, declare the Python requirements required
# to build your documentation
# See https://docs.readthedocs.io/en/stable/guides/reproducible-builds.html
# python:
#   install:
#     - requirements: docs/requirements.txt
9 changes: 6 additions & 3 deletions README.md
@@ -1,4 +1,7 @@
-[![install with bioconda](https://img.shields.io/badge/install%20with-bioconda-brightgreen.svg?style=flat)](http://bioconda.github.io/recipes/binette/README.html) [![Anaconda-Server Badge](https://anaconda.org/bioconda/binette/badges/downloads.svg)](https://anaconda.org/bioconda/binette) [![Test Coverage](https://genotoul-bioinfo.github.io/Binette/coverage-badge.svg)](https://genotoul-bioinfo.github.io/Binette/)
+[![install with bioconda](https://img.shields.io/badge/install%20with-bioconda-brightgreen.svg?style=flat)](http://bioconda.github.io/recipes/binette/README.html) [![Anaconda-Server Badge](https://anaconda.org/bioconda/binette/badges/downloads.svg)](https://anaconda.org/bioconda/binette) [![Anaconda-Server Badge](https://anaconda.org/bioconda/binette/badges/license.svg)](https://anaconda.org/bioconda/binette) [![Anaconda-Server Badge](https://anaconda.org/bioconda/binette/badges/version.svg)](https://anaconda.org/bioconda/binette)
+
+[![Test Coverage](https://genotoul-bioinfo.github.io/Binette/coverage-badge.svg)](https://genotoul-bioinfo.github.io/Binette/) [![CI Status](https://github.com/genotoul-bioinfo/Binette/actions/workflows/binette_ci.yml/badge.svg)](https://github.com/genotoul-bioinfo/Binette/actions/workflows) [![Documentation Status](https://readthedocs.org/projects/binette/badge/?version=latest)](https://binette.readthedocs.io/en/latest/?badge=latest)
+

 # Binette

@@ -9,10 +12,10 @@ From the input bin sets, Binette constructs new hybrid bins. A bin can be seen a
 - Difference bin: This bin contains the contigs that are exclusively found in one bin and not present in the others.
 - Union bin: The union bin includes all the contigs contained within the overlapping bins

-It then uses checkm2 to assess bins quality to finally select the best bins possible.
+It then uses CheckM2 to assess bins quality to finally select the best bins possible.

 Binette is inspired from the metaWRAP bin-refinement tool but it effectively solves all the problems from that very tool.
-- Enhanced Speed: Binette significantly improves the speed of the refinement process. It achieves this by launching the initial steps of checkm2, such as prodigal and diamond runs, only once on all contigs. These intermediate results are then utilized to assess the quality of any given bin, eliminating redundant computations and accelerating the refinement process.
+- Enhanced Speed: Binette significantly improves the speed of the refinement process. It achieves this by launching the initial steps of CheckM2, such as Prodigal and Diamond runs, only once on all contigs. These intermediate results are then utilized to assess the quality of any given bin, eliminating redundant computations and accelerating the refinement process.
 - No Limit on Input Bin Sets: Unlike its predecessor, Binette is not constrained by the number of input bin sets. It can handle and process multiple bin sets simultaneously.
 <!-- - Bin selection have been improved. It selects the best bins in a more accurate and elegant manner.
 - It is easier to use. -->
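
The hybrid bin types named in this README hunk map naturally onto set operations over contig names. The sketch below is illustrative only and is not Binette's implementation: `make_hybrid_bins` and the contig names are hypothetical, and the intersection bin (cut off in the truncated hunk header) is assumed to hold the contigs shared by every overlapping bin.

```python
from typing import Dict, FrozenSet, Set


def make_hybrid_bins(bins: Dict[str, Set[str]]) -> Dict[str, FrozenSet[str]]:
    """Build hybrid bins from a group of overlapping bins (hypothetical helper)."""
    contig_sets = list(bins.values())
    hybrids = {
        # Union bin: every contig contained in any of the overlapping bins.
        "union": frozenset(set.union(*contig_sets)),
        # Intersection bin (assumed definition): contigs shared by all bins.
        "intersection": frozenset(set.intersection(*contig_sets)),
    }
    for name, contigs in bins.items():
        # Difference bin: contigs found in this bin and in none of the others.
        others = set().union(*(c for n, c in bins.items() if n != name))
        hybrids[f"difference_{name}"] = frozenset(contigs - others)
    return hybrids


overlapping = {
    "binA": {"c1", "c2", "c3"},
    "binB": {"c2", "c3", "c4"},
}
hybrids = make_hybrid_bins(overlapping)
for name, contigs in sorted(hybrids.items()):
    print(name, sorted(contigs))
```

Each hybrid bin would then be a candidate for quality assessment alongside the original bins.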
1 change: 1 addition & 0 deletions binette/__init__.py
@@ -0,0 +1 @@
__version__ = '0.1.6'
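
A minimal sketch of how a package-level `__version__` can feed an `argparse` version flag, which is the pattern the `binette/main.py` changes in this commit adopt in place of `pkg_resources`. `PACKAGE_VERSION` and `build_parser` are stand-ins introduced for illustration so the example runs without Binette installed.

```python
from argparse import ArgumentParser

# Stand-in for `binette.__version__` (assumption: hard-coded for this sketch).
PACKAGE_VERSION = "0.1.6"


def build_parser() -> ArgumentParser:
    """Create a parser whose description and --version flag share one version string."""
    parser = ArgumentParser(description=f"Binette version={PACKAGE_VERSION}")
    # The built-in "version" action prints the string and exits with status 0.
    parser.add_argument("--version", action="version", version=PACKAGE_VERSION)
    return parser


parser = build_parser()
print(parser.description)
```

Reading the version from a module attribute avoids a runtime lookup of the installed distribution metadata.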
59 changes: 30 additions & 29 deletions binette/binette.py → binette/main.py
@@ -13,8 +13,8 @@
 import sys
 import logging
 import os
-import pkg_resources

+import binette
 from binette import contig_manager, cds, diamond, bin_quality, bin_manager, io_manager as io
 from typing import List, Dict, Set, Tuple

@@ -40,83 +40,84 @@ def init_logging(verbose, debug):
     )


+
 def parse_arguments(args):
     """Parse script arguments."""
-    program_version = pkg_resources.get_distribution("Binette").version

     parser = ArgumentParser(
-        description=f"Binette version={program_version}",
+        description=f"Binette version={binette.__version__}",
         formatter_class=ArgumentDefaultsHelpFormatter,
     )
-    # TODO add catagory to better visualize the required and the optional args
-    input_arg = parser.add_mutually_exclusive_group(required=True)
+
+    # Input arguments category
+    input_group = parser.add_argument_group('Input Arguments')
+    input_arg = input_group.add_mutually_exclusive_group(required=True)

     input_arg.add_argument(
         "-d",
         "--bin_dirs",
         nargs="+",
-        help="list of bin folders containing each bin in a fasta file.",
+        help="List of bin folders containing each bin in a fasta file.",
     )

     input_arg.add_argument(
         "-b",
         "--contig2bin_tables",
         nargs="+",
-        help="list of contig2bin table with two columns separated\
+        help="List of contig2bin table with two columns separated\
             with a tabulation: contig, bin",
     )

-    parser.add_argument("-c", "--contigs", required=True, help="Contigs in fasta format.")
+    input_group.add_argument("-c", "--contigs", required=True, help="Contigs in fasta format.")

-    parser.add_argument(
+    # Other parameters category
+    other_group = parser.add_argument_group('Other Arguments')
+
+    other_group.add_argument(
         "-m",
         "--min_completeness",
-        default=10,
+        default=40,
         type=int,
         help="Minimum completeness required for final bin selections.",
     )

-    parser.add_argument("-t", "--threads", default=1, type=int, help="Number of threads.")
+    other_group.add_argument("-t", "--threads", default=1, type=int, help="Number of threads to use.")

-    parser.add_argument("-o", "--outdir", default="results", help="Output directory.")
+    other_group.add_argument("-o", "--outdir", default="results", help="Output directory.")

-    parser.add_argument(
+    other_group.add_argument(
         "-w",
         "--contamination_weight",
-        default=5,
+        default=2,
         type=float,
         help="Bin are scored as follow: completeness - weight * contamination. "
         "A low contamination_weight favor complete bins over low contaminated bins.",
     )

-    parser.add_argument(
-        "-e",
-        "--extension",
-        default="fasta",
-        help="Extension of fasta files in bin folders "
-        "(necessary when --bin_dirs is used).",
-    )
-
-    parser.add_argument(
+    other_group.add_argument(
         "--checkm2_db",
         help="Provide a path for the CheckM2 diamond database. "
         "By default the database set via <checkm2 database> is used.",
     )

-    parser.add_argument("-v", "--verbose", help="increase output verbosity", action="store_true")
+    other_group.add_argument("--low_mem", help="Use low mem mode when running diamond", action="store_true")
+
+    other_group.add_argument("-v", "--verbose", help="increase output verbosity", action="store_true")

-    parser.add_argument("--debug", help="active debug mode", action="store_true")
+    other_group.add_argument("--debug", help="Activate debug mode", action="store_true")

-    parser.add_argument("--resume", help="active resume mode", action="store_true")
+    other_group.add_argument("--resume",
+        action="store_true",
+        help="Activate resume mode. Binette will examine the 'temporary_files' directory "
+        "within the output directory and reuse any existing files if possible."
+    )

-    parser.add_argument("--low_mem", help="low mem mode", action="store_true")
-
-    parser.add_argument("--version", action="version", version=program_version)
+    other_group.add_argument("--version", action="version", version=binette.__version__)

     args = parser.parse_args(args)
     return args


 def parse_input_files(bin_dirs: List[str], contig2bin_tables: List[str], contigs_fasta: str) -> Tuple[Dict[str, List], List, Dict[str, List], Dict[str, int]]:
     """
     Parses input files to retrieve information related to bins and contigs.
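
The `--contamination_weight` help text in this hunk gives the scoring rule as completeness - weight * contamination, and `--min_completeness` sets a floor for the final selection. The sketch below shows how those two options could interact. It is not Binette's selection code: `BinQuality`, `score` and `rank_bins` are hypothetical names, the example values are invented, and treating the completeness threshold as inclusive is an assumption.

```python
from typing import List, NamedTuple


class BinQuality(NamedTuple):
    name: str
    completeness: float   # percent
    contamination: float  # percent


def score(bin_quality: BinQuality, contamination_weight: float = 2.0) -> float:
    """Score a bin as completeness - weight * contamination."""
    return bin_quality.completeness - contamination_weight * bin_quality.contamination


def rank_bins(
    bins: List[BinQuality],
    min_completeness: float = 40,
    contamination_weight: float = 2.0,
) -> List[BinQuality]:
    """Drop bins below the completeness floor, then sort by descending score."""
    kept = [b for b in bins if b.completeness >= min_completeness]  # inclusive: assumption
    return sorted(kept, key=lambda b: score(b, contamination_weight), reverse=True)


candidates = [
    BinQuality("bin_1", 95.0, 10.0),  # weight 2: 95 - 20 = 75; weight 1: 85
    BinQuality("bin_2", 80.0, 1.0),   # weight 2: 80 - 2 = 78; weight 1: 79
    BinQuality("bin_3", 30.0, 0.0),   # filtered out: below min_completeness
]
ranked = rank_bins(candidates)
print([b.name for b in ranked])
```

With the new default weight of 2, the cleaner `bin_2` ranks first. Lowering the weight to 1 puts the more complete but more contaminated `bin_1` ahead, which is the trade-off the help text describes.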
20 changes: 20 additions & 0 deletions docs/Makefile
@@ -0,0 +1,20 @@
# Minimal makefile for Sphinx documentation
#

# You can set these variables from the command line, and also
# from the environment for the first two.
SPHINXOPTS ?=
SPHINXBUILD ?= sphinx-build
SOURCEDIR = .
BUILDDIR = _build

# Put it first so that "make" without argument is like "make help".
help:
	@$(SPHINXBUILD) -M help "$(SOURCEDIR)" "$(BUILDDIR)" $(SPHINXOPTS) $(O)

.PHONY: help Makefile

# Catch-all target: route all unknown targets to Sphinx using the new
# "make mode" option. $(O) is meant as a shortcut for $(SPHINXOPTS).
%: Makefile
	@$(SPHINXBUILD) -M $@ "$(SOURCEDIR)" "$(BUILDDIR)" $(SPHINXOPTS) $(O)
8 changes: 8 additions & 0 deletions docs/api/api_ref.md
@@ -0,0 +1,8 @@
# API Reference

```{toctree}
:maxdepth: 2
binette
indice_and_table
```

75 changes: 75 additions & 0 deletions docs/api/binette.md
@@ -0,0 +1,75 @@
# binette package

## Submodules

## binette.bin_manager module

```{eval-rst}
.. automodule:: binette.bin_manager
   :members:
   :undoc-members:
   :show-inheritance:
```

## binette.bin_quality module

```{eval-rst}
.. automodule:: binette.bin_quality
   :members:
   :undoc-members:
   :show-inheritance:
```

## binette.main module

```{eval-rst}
.. automodule:: binette.main
   :members:
   :undoc-members:
   :show-inheritance:
```

## binette.cds module

```{eval-rst}
.. automodule:: binette.cds
   :members:
   :undoc-members:
   :show-inheritance:
```

## binette.contig_manager module

```{eval-rst}
.. automodule:: binette.contig_manager
   :members:
   :undoc-members:
   :show-inheritance:
```

## binette.diamond module

```{eval-rst}
.. automodule:: binette.diamond
   :members:
   :undoc-members:
   :show-inheritance:
```

## binette.io_manager module

```{eval-rst}
.. automodule:: binette.io_manager
   :members:
   :undoc-members:
   :show-inheritance:
```

## Module contents

```{eval-rst}
.. automodule:: binette
   :members:
   :undoc-members:
   :show-inheritance:
```
7 changes: 7 additions & 0 deletions docs/api/modules.md
@@ -0,0 +1,7 @@
# binette

```{toctree}
:maxdepth: 4
binette
```