Annotated bases flag for maptide.query API #5

Merged: 2 commits, Oct 11, 2023
2 changes: 1 addition & 1 deletion Cargo.lock

Generated file; diff not rendered.

2 changes: 1 addition & 1 deletion Cargo.toml
@@ -1,6 +1,6 @@
[package]
name = "maptide"
-version = "0.2.1"
+version = "0.3.0"
edition = "2021"

# See more keys and their definitions at https://doc.rust-lang.org/cargo/reference/manifest.html
31 changes: 26 additions & 5 deletions README.md
@@ -5,13 +5,12 @@
```
$ pip install maptide
```
Depending on your operating system, the Rust compiler may need to be installed.

Installation instructions for the Rust compiler can be found here: https://www.rust-lang.org/tools/install

#### Build from source
Building from source requires the Rust compiler.

Installation instructions for the Rust compiler can be found here: https://www.rust-lang.org/tools/install

Once the Rust compiler is installed:
```
$ git clone https://github.com/CLIMB-COVID/maptide.git
@@ -46,12 +45,12 @@ options:
Number of decimal places to display (default: 3)
```

-#### Frequencies over all positions:
+#### Frequencies over all positions
```
$ maptide /path/to/file.bam
```

-#### Frequencies over a region:
+#### Frequencies over a region
```
$ maptide /path/to/file.bam --region chrom:start-end
```
@@ -63,3 +62,25 @@ Index files that do not follow the naming convention `/path/to/file.bam.bai` can
```
$ maptide /path/to/file.bam --region chrom:start-end --index /path/to/index.bai
```

#### Example in Python
`maptide` can be used within Python scripts:

```python
import maptide

data = maptide.query(
    "path/to/file.bam",
    region="MN908947.3:100-200",  # Obtain frequencies only in this region
    annotated=True,  # Annotate frequencies with their bases A,C,G,T,DS,N
)

chrom = "MN908947.3" # Chosen reference/chromosome in the BAM to access
position = (100, 0) # Position 100, insert 0 (i.e. not an insertion)

# With annotated = True, frequencies are annotated with their bases:
frequencies = data[chrom][position]
print(frequencies) # {'A': 1, 'C': 122, 'G': 0, 'T': 1, 'DS': 13, 'N': 0}

# If annotated = False, frequencies would be a list i.e. [1, 122, 0, 1, 13, 0]
```
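
The README example stops at printing the annotated counts. As a rough sketch of how that output might be consumed downstream, the snippet below uses a hand-written dictionary in place of a real `maptide.query(..., annotated=True)` result, so it runs without a BAM file. The helper `to_percentages` is hypothetical and not part of maptide, and treating the sum of the six counts as the total is an assumption about how coverage is defined.

```python
from typing import Dict, Tuple

# Hand-written stand-in for the output of maptide.query(..., annotated=True).
# The counts are copied from the README example; no BAM file is read here.
data: Dict[str, Dict[Tuple[int, int], Dict[str, int]]] = {
    "MN908947.3": {
        (100, 0): {"A": 1, "C": 122, "G": 0, "T": 1, "DS": 13, "N": 0},
    }
}


def to_percentages(frequencies: Dict[str, int]) -> Dict[str, float]:
    """Hypothetical helper: express each count as a percentage of the total.

    Assumes the total is the sum of all six counts.
    """
    total = sum(frequencies.values())
    if total == 0:
        # Avoid division by zero at positions with no reads.
        return {base: 0.0 for base in frequencies}
    return {base: 100 * count / total for base, count in frequencies.items()}


percentages = to_percentages(data["MN908947.3"][(100, 0)])
most_common = max(percentages, key=percentages.get)
print(most_common, round(percentages[most_common], 3))
```

With `annotated=False` the same calculation would need to know the base order by index, which is the bookkeeping the new flag is meant to remove.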
21 changes: 17 additions & 4 deletions python/maptide/api.py
@@ -1,15 +1,19 @@
 import os
-from typing import Dict, Tuple, List, Optional
+from typing import Dict, Tuple, Optional, Any
 from . import maptide  # type: ignore


+BASES = ["A", "C", "G", "T", "DS", "N"]
+
+
 def query(
     bam: str,
     region: Optional[str] = None,
     bai: Optional[str] = None,
     mapping_quality: int = 0,
     base_quality: int = 0,
-) -> Dict[str, Dict[Tuple[int, int], List[int]]]:
+    annotated: bool = False,
+) -> Dict[str, Dict[Tuple[int, int], Any]]:
     """Performs a pileup over a region, obtaining per-position base frequencies for the provided BAM file.

     Parameters
@@ -24,6 +28,8 @@ def query(
         Minimum mapping quality for a read to be included in the pileup (default: 0)
     base_quality : int, optional
         Minimum base quality for a base within a read to be included in the pileup (default: 0)
+    annotated : bool, optional
+        Return frequencies annotated with their bases, as a `dict[str, int]`. Default is to return frequencies only, as a `list[int]` (default: False)

     Returns
     -------
@@ -34,9 +40,16 @@ def query(
     if region:
         if not bai and os.path.isfile(bam + ".bai"):
             bai = bam + ".bai"
-        return maptide.query(bam, bai, region, mapping_quality, base_quality)
+        data = maptide.query(bam, bai, region, mapping_quality, base_quality)
     else:
-        return maptide.all(bam, mapping_quality, base_quality)
+        data = maptide.all(bam, mapping_quality, base_quality)
+
+    if annotated:
+        for _, positions in data.items():
+            for position, frequencies in positions.items():
+                positions[position] = dict(zip(BASES, frequencies))
+
+    return data


 def parse_region(region: str) -> Tuple[str, int, int]:
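
The annotation step in the hunk above can be tried in isolation. The following is a minimal sketch that mirrors the `dict(zip(BASES, frequencies))` loop on a hand-written input; the wrapper function `annotate` and the sample data are illustrative only and do not exist in maptide.

```python
from typing import Any, Dict, Tuple

BASES = ["A", "C", "G", "T", "DS", "N"]


def annotate(
    data: Dict[str, Dict[Tuple[int, int], Any]]
) -> Dict[str, Dict[Tuple[int, int], Any]]:
    """Illustrative wrapper: replace each list of counts with a base -> count dict.

    The input is modified in place and also returned.
    """
    for positions in data.values():
        # Reassigning values for existing keys is safe while iterating,
        # because the dictionary's size does not change.
        for position, frequencies in positions.items():
            positions[position] = dict(zip(BASES, frequencies))
    return data


# Hand-written stand-in for the unannotated result (one list of six counts).
raw = {"MN908947.3": {(100, 0): [1, 122, 0, 1, 13, 0]}}
annotated = annotate(raw)
print(annotated["MN908947.3"][(100, 0)])
```

Two properties are worth noting when reviewing this approach: the conversion mutates the returned structure rather than copying it, and `zip` silently truncates to the shorter input, so a frequency list with a length other than six would produce a dictionary with missing keys rather than an error.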
17 changes: 3 additions & 14 deletions python/maptide/cli.py
@@ -108,23 +108,12 @@ def run():
         "pos",
         "ins",
         "cov",
-        "a",
-        "c",
-        "g",
-        "t",
-        "ds",
-        "n",
-    ]
+    ] + [base.lower() for base in api.BASES]

     if args.stats:
         columns.extend(
-            [
-                "pc_a",
-                "pc_c",
-                "pc_g",
-                "pc_t",
-                "pc_ds",
-                "pc_n",
+            [f"pc_{base.lower()}" for base in api.BASES]
+            + [
                 "entropy",
                 "secondary_entropy",
             ]
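
The effect of deriving the CLI column names from `BASES` can be checked with a small standalone sketch. The function `build_columns` is hypothetical, and the column list starts at `"pos"` because any earlier columns fall outside the hunk shown above.

```python
from typing import List

BASES = ["A", "C", "G", "T", "DS", "N"]


def build_columns(stats: bool) -> List[str]:
    """Hypothetical helper mirroring the column construction in the hunk.

    Only the columns visible in the diff are included.
    """
    columns = ["pos", "ins", "cov"] + [base.lower() for base in BASES]
    if stats:
        columns.extend(
            [f"pc_{base.lower()}" for base in BASES]
            + ["entropy", "secondary_entropy"]
        )
    return columns


print(build_columns(stats=False))
print(build_columns(stats=True))
```

Because both the count columns and the `pc_` columns are generated from the same list, they stay in the same order as the keys returned by `query(..., annotated=True)`.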