fishersapi

A package for applying a fast implementation of Fisher's exact test to observations in a pandas DataFrame.

Contingency tables are computed for all pairs of columns in cols and all pairs of unique values within those columns. The results are tested against scipy.stats.fisher_exact, and the package falls back on SciPy if numba is not available.
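To make the pairing concrete, here is a minimal plain-Python sketch of one reading of the paragraph above: for every pair of columns, and every pair of values drawn one from each column, build a 2x2 table and run a two-sided Fisher's exact test on it. It is not the package's implementation. The helper names (`table_2x2`, `fisher_two_sided`, `all_pair_tests`) and the toy `rows` data are hypothetical, and the p-value is computed directly from hypergeometric probabilities rather than with numba or SciPy.

```python
from itertools import combinations
from math import comb


def table_2x2(rows, col1, val1, col2, val2):
    """Count rows into a 2x2 table for (col1 == val1) versus (col2 == val2)."""
    a = b = c = d = 0
    for row in rows:
        x, y = row[col1] == val1, row[col2] == val2
        if x and y:
            a += 1  # both values present
        elif x:
            b += 1  # val1 without val2
        elif y:
            c += 1  # val2 without val1
        else:
            d += 1  # neither value
    return a, b, c, d


def fisher_two_sided(a, b, c, d):
    """Two-sided Fisher's exact p-value from hypergeometric probabilities."""
    row1, col1, n = a + b, a + c, a + b + c + d
    denom = comb(n, col1)

    def prob(k):
        # Probability that the top-left cell equals k with all margins fixed.
        return comb(row1, k) * comb(n - row1, col1 - k) / denom

    p_obs = prob(a)
    lo, hi = max(0, col1 - (n - row1)), min(row1, col1)
    # Sum over every table that is no more probable than the observed one.
    return sum(p for p in map(prob, range(lo, hi + 1)) if p <= p_obs * (1 + 1e-7))


def all_pair_tests(rows, cols):
    """Test every value pair for every pair of columns (hypothetical helper)."""
    results = []
    for col1, col2 in combinations(cols, 2):
        for val1 in sorted({r[col1] for r in rows}):
            for val2 in sorted({r[col2] for r in rows}):
                counts = table_2x2(rows, col1, val1, col2, val2)
                results.append((col1, val1, col2, val2, counts, fisher_two_sided(*counts)))
    return results


# Toy data (made up for illustration): 8 observations of two categorical columns.
rows = (
    [{'VA': 'TRAV14', 'JA': 'TRAJ4'}] * 3
    + [{'VA': 'TRAV14', 'JA': 'TRAJ2'}]
    + [{'VA': 'TRAV12', 'JA': 'TRAJ4'}]
    + [{'VA': 'TRAV12', 'JA': 'TRAJ2'}] * 3
)

for result in all_pair_tests(rows, ['VA', 'JA']):
    print(result)
```

With two columns of two unique values each, this produces four tests. The TRAV14/TRAJ4 table is (3, 1, 1, 3), with a two-sided p-value of 34/70, about 0.486. The number of tests grows with the product of the unique value counts for every column pair, which is why a fast implementation matters.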

Installation

The package is compatible with Python 3 and can be installed from PyPI or cloned and installed directly.

pip install fishersapi

Example

import numpy as np
import pandas as pd
import fishersapi

n = 50

# Vectorized test: a, b, c, d are arrays of cell counts, one 2x2 table per element
a = np.random.randint(1, 50, size=n)
b = np.random.randint(1, 50, size=n)
c = np.random.randint(1, 100, size=n)
d = np.random.randint(1, 100, size=n)

ORs, pvalues = fishersapi.fishers_vec(a, b, c, d, alternative='two-sided')

# DataFrame test: one row per observation, one categorical column per variable
df = pd.DataFrame({'VA':np.random.choice(['TRAV14', 'TRAV12', 'TRAV3', 'TRAV23', 'TRAV11', 'TRAV6'], n),
                   'JA':np.random.choice(['TRAJ4', 'TRAJ2', 'TRAJ3','TRAJ5', 'TRAJ21', 'TRAJ13'], n),
                   'VB':np.random.choice(['TRBV14', 'TRBV12', 'TRBV3', 'TRBV23', 'TRBV11', 'TRBV6'], n),
                   'JB':np.random.choice(['TRBJ4', 'TRBJ2', 'TRBJ3','TRBJ5', 'TRBJ21', 'TRBJ13'], n)})
df = df.assign(Count=1)
# .loc slicing is label-based and inclusive, so rows 0 through 10 get a count of 15
df.loc[:10, 'Count'] = 15

# count_col=None, so the Count column is not passed to the test in this call
res = fishersapi.fishers_frame(df, ['VA', 'JA', 'VB', 'JB'], count_col=None, alternative='two-sided')
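Since the results are described as tested against SciPy, a reference loop like the one below can be handy for spot-checking the vectorized output on your own data. This is a sketch, not part of the package: `fisher_exact_loop` is a hypothetical helper, and it assumes that the i-th elements of a, b, c and d correspond to the table [[a, b], [c, d]].

```python
import numpy as np
from scipy import stats


def fisher_exact_loop(a, b, c, d, alternative='two-sided'):
    """Reference results: one scipy.stats.fisher_exact call per 2x2 table."""
    ors = np.empty(len(a))
    pvalues = np.empty(len(a))
    for i, (ai, bi, ci, di) in enumerate(zip(a, b, c, d)):
        # Assumed layout: [[a, b], [c, d]] (not confirmed by the README).
        ors[i], pvalues[i] = stats.fisher_exact([[ai, bi], [ci, di]],
                                                alternative=alternative)
    return ors, pvalues


# Two small tables with made-up counts.
ref_ORs, ref_pvalues = fisher_exact_loop([3, 10], [1, 2], [1, 3], [3, 15])
print(ref_ORs, ref_pvalues)
```

If that layout assumption holds, `np.allclose(pvalues, ref_pvalues)` on the same inputs passed to `fishersapi.fishers_vec` should return True. Odds ratios may need a separate look, because implementations can differ in how they handle zero cells.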