README.Rmd

---
title: "cdsrmodels"
output:
  github_document:
  html_notebook:
    theme: united
  html_document:
    theme: united
editor_options: 
  chunk_output_type: inline
---

```{r setup, include=FALSE}
knitr::opts_chunk$set(warning = FALSE, message = FALSE, fig.width = 7, fig.height = 7, cache = T)

library(tidyverse)
library(useful)
library(taigr)
library(cdsrbiomarker)
```

cdsrmodels contains modeling function created by the cancer data science team.

## Install

```{r, eval = FALSE}
library(devtools)
devtools::install_github("broadinstitute/cdsr_models")
```

The package can then be loaded by calling

```{r, eval=FALSE}
library(cdsrmodels)
```

## Modeling functions

### discrete_test

Compares binary features, such as lineage and mutation, running a t-test on the difference in mean response between cell lines with the feature and without it. Run on response vector `y` and feature matrix `X`

```{r, eval=FALSE}
cdsrmodels::discrete_test(X, y)
```

### lin_associations

Compares continuous features, such as gene expression, calculating correlations between response and each feature. Run on feature matrix `A`, response vector `y`, and an optional matrix of confounders `W`. Other parameters can also be tuned and are explained in the function documentation.

```{r, eval=FALSE}
cdsrmodels::lin_associations(A, y, W=NULL)
```

### random_forest

Fits a random forest to a feature matrix `X` and a response vector `y` returning estimates of variable importance for each feature, as well as model level statistics such as R-squared. Other parameters can also be tuned and are explained in the function documentation.

```{r, eval=FALSE}
cdsrmodels::random_forest(X, y)
```