[MRG] Improve test coverage #253
base: master
Conversation
Codecov Report
@@            Coverage Diff             @@
##           master     #253      +/-   ##
==========================================
- Coverage   75.48%   74.82%   -0.66%
==========================================
  Files           4        5       +1
  Lines         673      719      +46
  Branches      148      130      -18
==========================================
+ Hits          508      538      +30
- Misses        128      140      +12
- Partials       37       41       +4
Continue to review full report at Codecov.
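Not part of the Codecov bot output: a quick way to inspect the same numbers locally before pushing. A minimal sketch, assuming pytest-cov is installed; the project's Travis setup may invoke coverage differently.

```python
# Run the test suite under coverage from Python (sketch only).
# Assumes pytest and the pytest-cov plugin are installed.
import pytest

exit_code = pytest.main([
    "--cov=pyglmnet",             # measure coverage of the pyglmnet package
    "--cov-report=term-missing",  # print uncovered line numbers in the terminal
    "tests/test_pyglmnet.py",
])
```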
hmm, looks like it didn't budge. merge or close?
neither. Wait a bit. I'll give another try in a day or two :)
force-pushed from d22a3b1 to eff469b
@pavanramkumar looks like the dataset fetchers don't work on python 3.5. Do you want to iterate on top of my pull request here? You can push directly to my branch if you want.
@pavanramkumar let's merge this? It will improve coverage a little ...
@jasmainak it's really strange that the dataset fetcher doesn't work with travis. when i run it on my local py35 environment, it works fine. perhaps a miniconda dependency issue?
That's weird, I am able to reproduce the failure locally:

(py35) mainak@mainak-ThinkPad-W540 ~/Desktop/projects/github_repos/pyglmnet $ pytest tests/test_pyglmnet.py -k 'test_fetch_datasets'
============================================================== test session starts ===============================================================
platform linux -- Python 3.5.3, pytest-3.0.7, py-1.4.33, pluggy-0.4.0
rootdir: /home/mainak/Desktop/projects/github_repos/pyglmnet, inifile:
collected 9 items
tests/test_pyglmnet.py F
==================================================================== FAILURES ====================================================================
______________________________________________________________ test_fetch_datasets _______________________________________________________________
    def test_fetch_datasets():
        """Test fetching datasets."""
        datasets.fetch_community_crime_data('/tmp/glm-tools')
>       datasets.fetch_group_lasso_datasets()

tests/test_pyglmnet.py:348:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

    def fetch_group_lasso_datasets():
        """
        Downloads and formats data needed for the group lasso example.

        Returns:
        --------
        design_matrix: pandas.DataFrame
            pandas dataframe with formatted data and labels

        groups: list
            list of group indicies, the value of the ith position in the list
            is the group number for the ith regression coefficient
        """
        try:
            import pandas as pd
        except ImportError:
            raise ImportError('The pandas module is required for the '
                              'group lasso dataset')

        # helper functions
        def find_interaction_index(seq, subseq,
                                   alphabet="ATGC",
                                   all_possible_len_n_interactions=None):
            n = len(subseq)
            alphabet_interactions = \
                [set(p) for
                 p in list(itertools.combinations_with_replacement(alphabet, n))]

            num_interactions = len(alphabet_interactions)
            if all_possible_len_n_interactions is None:
                all_possible_len_n_interactions = \
                    [set(interaction) for
                     interaction in
                     list(itertools.combinations_with_replacement(seq, n))]

            subseq = set(subseq)

            group_index = num_interactions * \
                all_possible_len_n_interactions.index(subseq)
            value_index = alphabet_interactions.index(subseq)

            final_index = group_index + value_index
            return final_index

        def create_group_indicies_list(seqlength=7,
                                       alphabet="ATGC",
                                       interactions=[1, 2, 3],
                                       include_extra=True):
            alphabet_length = len(alphabet)
            index_groups = []
            if include_extra:
                index_groups.append(0)
            group_count = 1
            for inter in interactions:
                n_interactions = comb(seqlength, inter)
                n_alphabet_combos = comb(alphabet_length,
                                         inter,
                                         repetition=True)

                for x1 in range(int(n_interactions)):
                    for x2 in range(int(n_alphabet_combos)):
                        index_groups.append(int(group_count))
                group_count += 1
            return index_groups

        def create_feature_vector_for_sequence(seq,
                                               alphabet="ATGC",
                                               interactions=[1, 2, 3]):
            feature_vector_length = \
                sum([comb(len(seq), inter) *
                     comb(len(alphabet), inter, repetition=True)
                     for inter in interactions]) + 1

            feature_vector = np.zeros(int(feature_vector_length))
            feature_vector[0] = 1.0
            for inter in interactions:
                # interactions at the current level
                cur_interactions = \
                    [set(p) for p in list(itertools.combinations(seq, inter))]
                interaction_idxs = \
                    [find_interaction_index(
                        seq, cur_inter,
                        all_possible_len_n_interactions=cur_interactions) + 1
                     for cur_inter in cur_interactions]
                feature_vector[interaction_idxs] = 1.0

            return feature_vector

        positive_url = \
            "http://genes.mit.edu/burgelab/maxent/ssdata/MEMset/train5_hs"
        negative_url = \
            "http://genes.mit.edu/burgelab/maxent/ssdata/MEMset/train0_5_hs"

>       pos_file = tempfile.NamedTemporaryFile(bufsize=0)
E       TypeError: NamedTemporaryFile() got an unexpected keyword argument 'bufsize'

pyglmnet/datasets.py:203: TypeError
-------------------------------------------------------------- Captured stdout call --------------------------------------------------------------
...99%, 1 MB
...100%, 1 MB
=============================================================== 8 tests deselected ===============================================================
===================================================== 1 failed, 8 deselected in 2.85 seconds =====================================================
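For reference, the failing call is a Python 2 leftover: `tempfile.NamedTemporaryFile` accepted a `bufsize` keyword on Python 2, but on Python 3 the equivalent keyword is `buffering`. A minimal sketch of a version-agnostic workaround, not the actual patch in this PR (the helper name is hypothetical):

```python
import sys
import tempfile


def unbuffered_named_tempfile():
    """Open an unbuffered NamedTemporaryFile on both Python 2 and 3.

    Hypothetical helper (not part of pyglmnet): Python 2 spells the
    keyword `bufsize`, Python 3 spells it `buffering`. The default
    mode 'w+b' is binary, so buffering=0 is valid on Python 3.
    """
    if sys.version_info[0] >= 3:
        return tempfile.NamedTemporaryFile(buffering=0)
    return tempfile.NamedTemporaryFile(bufsize=0)
```

Simply dropping the argument and accepting the default buffering would also avoid the TypeError on both versions.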
arfff ... now the website hosting the community crime data seems to be down. So Travis won't pass ... |
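One possible way to keep Travis green when an external data host is down (a sketch, not something this PR does) is to skip the network-dependent test when the server cannot be reached. The host below is the genes.mit.edu one from the traceback above; the community-crime host would need the same kind of check.

```python
import socket

import pytest

from pyglmnet import datasets


def _host_reachable(host, port=80, timeout=5.0):
    """Return True if a TCP connection to host:port succeeds."""
    try:
        socket.create_connection((host, port), timeout=timeout).close()
        return True
    except OSError:
        return False


# Skip the fetcher test instead of failing the whole build when the
# site hosting the example data cannot be reached.
@pytest.mark.skipif(not _host_reachable("genes.mit.edu"),
                    reason="dataset host unreachable")
def test_fetch_datasets():
    """Test fetching datasets."""
    datasets.fetch_community_crime_data('/tmp/glm-tools')
    datasets.fetch_group_lasso_datasets()
```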
TST improve coverage FIX installation in Makefile FIX for python3
Let's see if this helps with coverage ...