Commit

init
CaoTianze committed May 5, 2023
1 parent 07cb2d7 commit 0383d09
Showing 107 changed files with 93,286 additions and 0 deletions.
21 changes: 21 additions & 0 deletions LICENSE
@@ -0,0 +1,21 @@
MIT License

Copyright (c) 2023 CaoTianze

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.
10 changes: 10 additions & 0 deletions README.md
@@ -0,0 +1,10 @@
# plotnineSeqSuite: a Python package for visualizing sequence data using plotnine, a ggplot2-style plotting library
## Installation
`pip install plotnineseqsuite`
## Getting Started
See [plotnineSeqSuite Documentation](https://caotianze.github.io/plotnineseqsuite/)
## Development environment
PyCharm 2022.1 (Community Edition) and Spyder version: 5.4.1 (conda)
## Dependencies
Python version: 3.10.9 64-bit
plotnine: 0.10.1
Binary file added docs/accepted_input_formats.png
Binary file added docs/bar_accepted_input_formats.png
Binary file added docs/bar_no_sequence_letter.png
29 changes: 29 additions & 0 deletions docs/col_schemes.md
@@ -0,0 +1,29 @@
# color schemes
Default color schemes: chemistry, chemistry2, hydrophobicity, nucleotide, nucleotide2, base_pairing, clustalx, taylor.
## *function* make_col_scheme(name: str = 'Custom Scheme', chars: Optional[list[str]] = None, groups: Optional[list[int]] = None, cols: Optional[list[int]] = None, values: Optional[list[int]] = None) -> dict
```python
from plotnineseqsuite.col_schemes import make_col_scheme
cs1 = make_col_scheme(chars=['A', 'T', 'C', 'G'], groups=['gr1', 'gr1', 'gr2', 'gr2'],cols=['purple', 'purple', 'blue', 'blue'])
cs2 = make_col_scheme(chars=['A', 'T', 'C', 'G'], values=[1,2,3,4])
```
Creates a custom color scheme.
### name
Name of the custom scheme. It is displayed in the legend.
### chars
The letters to plot.
### groups
Used in a custom discrete color scheme. It groups letters.
### cols
Used in a custom discrete color scheme. The color of each letter's group, given as a color name (as in the example above) or an RGB value.
### values
Used in a custom continuous color scheme. It represents the numeric value of the corresponding letter.
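The parameters above imply that a scheme is either discrete (`groups` plus `cols`) or continuous (`values`). The sketch below mirrors that rule with a plain dictionary. `sketch_col_scheme` and the keys it returns are hypothetical names chosen for illustration; this is not the implementation of `make_col_scheme`, and the defaulting of `groups` to the letters themselves is an assumption.

```python
from typing import Optional


def sketch_col_scheme(chars: list[str],
                      groups: Optional[list[str]] = None,
                      cols: Optional[list[str]] = None,
                      values: Optional[list[float]] = None) -> dict:
    """Hypothetical stand-in showing how the arguments relate to each other."""
    if values is not None:
        # Continuous scheme: one numeric value per letter.
        if len(values) != len(chars):
            raise ValueError("values must have one entry per letter")
        return {"kind": "continuous", "letter_value": dict(zip(chars, values))}
    if cols is None:
        raise ValueError("provide either cols (discrete) or values (continuous)")
    if len(cols) != len(chars):
        raise ValueError("cols must have one entry per letter")
    # Discrete scheme: assume each letter forms its own group when none is given.
    groups = groups if groups is not None else list(chars)
    if len(groups) != len(chars):
        raise ValueError("groups must have one entry per letter")
    return {"kind": "discrete",
            "letter_group": dict(zip(chars, groups)),
            "letter_col": dict(zip(chars, cols))}


discrete = sketch_col_scheme(['A', 'T', 'C', 'G'],
                             groups=['gr1', 'gr1', 'gr2', 'gr2'],
                             cols=['purple', 'purple', 'blue', 'blue'])
continuous = sketch_col_scheme(['A', 'T', 'C', 'G'], values=[1, 2, 3, 4])
```

Keeping the two kinds of scheme separate makes it clear which legend applies: a discrete scheme maps groups to fixed colors, while a continuous one maps values onto a gradient.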
## *function* get_col_scheme(col_scheme: str, seq_type: str = 'AUTO') -> dict
Gets a built-in color scheme for the given sequence type.
```python
from plotnineseqsuite.col_schemes import get_col_scheme
col_df = get_col_scheme(col_scheme='chemistry')
```
### col_scheme
One of the default color schemes.
### seq_type
One of AA, DNA, or RNA. The default is AUTO.
Binary file added docs/combining_plots.png
Binary file added docs/custom_alphabet.png
Binary file added docs/custom_alphabet_2.png
Binary file added docs/custom_color_schemes1.png
Binary file added docs/custom_color_schemes2.png
Binary file added docs/custom_height_logos.png
12 changes: 12 additions & 0 deletions docs/font.md
@@ -0,0 +1,12 @@
# font
Default fonts: times_new_roman, arial, courier_new, akrobat_bold, xkcd_regular, akrobat_regular, helvetica_bold, helvetica_light, helvetica_regular, roboto_bold, roboto_medium, roboto_regular, roboto_slab_bold, roboto_slab_light, roboto_slab_regular.
## *function* list_fonts()
Lists all available fonts.
## *function* get_font(font_name: str) -> DataFrame
```python
from plotnineseqsuite import get_font
f_df = get_font(font_name='times_new_roman')
```
Gets the specified font.
### font_name
Name of one of the default fonts.
Binary file added docs/fonts.png
52 changes: 52 additions & 0 deletions docs/geom_alignedSeq.md
@@ -0,0 +1,52 @@
# geom_alignedSeq
A class that represents the sequence alignment diagram
## *class* geom_alignedSeq(self,data: Union[list[str], dict, None] = None,seq_names: Optional[list[str]] = None,seq_type: str = 'AUTO',namespace: Optional[list[str]] = None,font: str = 'roboto_medium',stack_width: float = 0.75,font_col: str = '#FFFFFF',bg_col_scheme: Union[DataFrame, str] = 'AUTO',bg_low_col: str = 'black',bg_high_col: str = 'yellow',bg_na_col: str = '#333333',**kwargs: Any)
```python
from plotnine import ggplot, coord_fixed
from plotnineseqsuite import geom_alignedSeq, theme_seq
from plotnineseqsuite.data import seqs_dna
ggplot() + geom_alignedSeq(seqs_dna['MA0013.1']) + theme_seq() + coord_fixed()
```
- data
Sequence data or corresponding dict.
- seq_names
The name corresponding to the sequence data.
- seq_type
OTHER, AA, DNA, RNA
- namespace
The letter corresponding to the data.
- font
Name of the font used to draw the letters, for example one of the default fonts.
- stack_width
The ratio of the size of letters to the standard unit width.
- font_col
The color of the font.
- bg_col_scheme
Color scheme of the backgrounds.
- bg_low_col
Background color for the lowest value. Applies when a continuous color scheme is used.
- bg_high_col
Background color for the highest value. Applies when a continuous color scheme is used.
- bg_na_col
Used when a background in the corresponding namespace does not have a matching color value defined.
- kwargs
Other arguments passed on to layer().
## *properties*
- bg_data
DataFrame.
- bg_layer
A geom_tile layer. Data come from the property bg_data.
- letter_data
DataFrame.
- letter_layer
A geom_polygon layer. Data come from the property letter_data.
- scale_x_continuous
A custom scale_x_continuous.
- scale_y_continuous
A custom scale_y_continuous.
- xlab
A custom xlab.
- ylab
A custom ylab.
- colscale_opts
A custom scale_fill_gradient or custom scale_fill_manual.
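The `bg_layer` property is described as a `geom_tile` layer, which suggests one tile per letter of the alignment. The sketch below shows one way such long-format records could be derived from aligned sequences. `alignment_tiles` and the field names `x`, `y`, `seq_name`, and `letter` are hypothetical; the actual layout of `bg_data` is not documented here.

```python
from typing import Optional


def alignment_tiles(seqs: list[str],
                    seq_names: Optional[list[str]] = None) -> list[dict]:
    """Return one record per alignment cell (hypothetical layout)."""
    # Fall back to 1-based row labels when no names are supplied.
    names = seq_names if seq_names is not None else [str(i + 1) for i in range(len(seqs))]
    records = []
    for row, (name, seq) in enumerate(zip(names, seqs), start=1):
        for col, letter in enumerate(seq, start=1):
            # x is the alignment column, y is the sequence row.
            records.append({"x": col, "y": row, "seq_name": name, "letter": letter})
    return records


tiles = alignment_tiles(["ACG", "A-G"], seq_names=["s1", "s2"])
```

Each record corresponds to one background tile, and the same coordinates could position the letter drawn on top of it.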
46 changes: 46 additions & 0 deletions docs/geom_logo.md
@@ -0,0 +1,46 @@
# geom_logo
A class that represents the sequence logo
## *class* geom_logo(self,data: Union[list[str], ndarray, dict, None] = None,method: str = 'bits',seq_type: str = 'AUTO',namespace: Optional[list[str]] = None,font: str = 'roboto_medium',stack_width: float = 0.95,rev_stack_order: bool = False,col_scheme: Union[DataFrame, str] = 'AUTO',low_col: str = 'black',high_col: str = 'yellow',na_col: str = '#333333',**kwargs: Any)
```python
from plotnine import ggplot
from plotnineseqsuite import geom_logo, theme_seq
from plotnineseqsuite.data import seqs_dna
ggplot() + geom_logo(seqs_dna['MA0001.1']) + theme_seq()
```
- data
Sequence data or PFM or corresponding dict.
- method
bits, probability, custom
- seq_type
OTHER, AA, DNA, RNA
- namespace
The letter corresponding to the data. If the type of data is ndarray, the order of the namespaces must correspond to that of ndarray.
- font
Name of the font used to draw the letters, for example one of the default fonts.
- stack_width
The ratio of the size of letters to the standard unit width.
- rev_stack_order
Order of letter stack is reversed.
- col_scheme
Color scheme of the letters.
- low_col
Color for the lowest value. Applies when a continuous color scheme is used.
- high_col
Color for the highest value. Applies when a continuous color scheme is used.
- na_col
Used when the letters in the corresponding namespace do not have a color matching value defined.
- kwargs
Other arguments passed on to layer().
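To make the `method` options concrete, the sketch below computes letter heights from a position frequency matrix using the standard Shannon information content for `bits` and plain relative frequencies for `probability`. It is an illustration under assumptions, not the library's code: `letter_heights` is a hypothetical helper, the alphabet size is taken from the matrix itself, and no small-sample correction is applied.

```python
import math


def letter_heights(pfm: dict[str, list[float]],
                   method: str = "bits") -> list[dict[str, float]]:
    """Return, for each position, the height of every letter.

    pfm maps each letter to its count at every position.
    """
    letters = list(pfm)
    n_positions = len(next(iter(pfm.values())))
    max_bits = math.log2(len(letters))  # 2 bits for a 4-letter alphabet
    heights = []
    for pos in range(n_positions):
        total = sum(pfm[letter][pos] for letter in letters)
        probs = {letter: pfm[letter][pos] / total for letter in letters}
        if method == "probability":
            heights.append(probs)
            continue
        # Shannon entropy of the column; terms with p == 0 contribute nothing.
        entropy = -sum(p * math.log2(p) for p in probs.values() if p > 0)
        info = max_bits - entropy
        # Each letter gets a share of the column's information content.
        heights.append({letter: p * info for letter, p in probs.items()})
    return heights


pfm = {"A": [10, 5], "C": [0, 5], "G": [0, 0], "T": [0, 0]}
bits = letter_heights(pfm, method="bits")
prob = letter_heights(pfm, method="probability")
```

In this example the first position is fully conserved, so `A` takes the whole 2 bits, while the second position is split evenly between `A` and `C` and carries only 1 bit in total.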
## *properties*
- data
DataFrame.
- layer
A geom_polygon layer. Data come from the property data.
- scale_x_continuous
A custom scale_x_continuous.
- xlab
A custom xlab.
- ylab
A custom ylab.
- colscale_opts
A custom scale_fill_gradient or custom scale_fill_manual.
50 changes: 50 additions & 0 deletions docs/geom_seqBar.md
@@ -0,0 +1,50 @@
# geom_seqBar
A class that represents the sequence histogram
## *class* geom_seqBar(self,data: Union[list[str], ndarray, dict, None] = None,seq_type: str = 'AUTO',namespace: Optional[list[str]] = None,font: str = 'roboto_medium',stack_width: float = 0.75,bar_col_scheme: Union[DataFrame, str] = 'AUTO',font_col: str = '#808080',low_col: str = 'black',high_col: str = 'yellow',na_col: str = '#333333',**kwargs: Any)
```python
from plotnine import ggplot
from plotnineseqsuite import geom_seqBar, theme_seq
from plotnineseqsuite.data import seqs_dna
ggplot() + geom_seqBar(seqs_dna['MA0013.1']) + theme_seq()
```
- data
Sequence data or PFM or corresponding dict.
- seq_type
OTHER, AA, DNA, RNA
- namespace
The letter corresponding to the data. If the type of data is ndarray, the order of the namespaces must correspond to that of ndarray.
- font
Name of the font used to draw the letters, for example one of the default fonts.
- stack_width
The ratio of the size of letters and the width of bars to the standard unit width.
- bar_col_scheme
Color scheme of the bars.
- font_col
The color of the font.
- low_col
Color for the lowest value. Applies when a continuous color scheme is used.
- high_col
Color for the highest value. Applies when a continuous color scheme is used.
- na_col
Used when the letters in the corresponding namespace do not have a color matching value defined.
- kwargs
Other arguments passed on to layer().
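The note on `namespace` says that, for matrix input, the order of the namespace must match the order of the matrix. The sketch below builds a position frequency matrix from equal-length sequences so that the correspondence is explicit. `position_frequency_matrix` is a hypothetical helper, and the orientation used here (one row per letter, one column per position) is an assumption rather than something stated in this documentation.

```python
def position_frequency_matrix(seqs: list[str],
                              namespace: list[str]) -> list[list[int]]:
    """Count letters per position; rows follow `namespace`, columns follow positions."""
    length = len(seqs[0])
    if any(len(seq) != length for seq in seqs):
        raise ValueError("all sequences must have the same length")
    row_of = {letter: i for i, letter in enumerate(namespace)}
    matrix = [[0] * length for _ in namespace]
    for seq in seqs:
        for pos, letter in enumerate(seq):
            # Letters outside the namespace are ignored in this sketch.
            if letter in row_of:
                matrix[row_of[letter]][pos] += 1
    return matrix


pfm = position_frequency_matrix(["ACGT", "ACGA", "TCGT"],
                                namespace=["A", "C", "G", "T"])
```

Reordering `namespace` reorders the rows, which is why a matrix and its namespace need to be passed in matching order.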
## *properties*
- bar_data
DataFrame.
- bar_layer
A geom_tile layer. Data come from the property bar_data.
- letter_data
DataFrame.
- letter_layer
A geom_polygon layer. Data come from the property letter_data.
- scale_x_continuous
A custom scale_x_continuous.
- scale_y_continuous
A custom scale_y_continuous.
- xlab
A custom xlab.
- ylab
A custom ylab.
- colscale_opts
A custom scale_fill_gradient or custom scale_fill_manual.