[ENH] Venn diagram on sparse data #2334

Merged: 5 commits, Jun 2, 2017


@mstrazar mstrazar commented May 22, 2017

Issue

Fixes #2164.

Description of changes

The Venn diagram now works on sparse data.

The methods reshape_wide() and varying_between() have been rewritten to
work on sparse data by iterating over the non-zero values only, rather than over all values.
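For readers unfamiliar with the idea, here is a minimal sketch of iterating over stored non-zero values only. It is not the code from this PR: the function name `varying_columns` and the plain-list CSR layout (`data`, `indices`, `indptr`, the same three arrays a SciPy CSR matrix exposes) are assumptions made for illustration, and it assumes no duplicate entries per cell. It reports which columns take more than one value across rows without ever visiting an implicit zero.

```python
def varying_columns(data, indices, indptr, n_cols):
    """Return the column indices whose values differ between rows.

    The matrix is given in CSR form (hypothetical helper, not Orange API):
    data[k] is a stored value, indices[k] its column, and row r owns the
    slice indptr[r]:indptr[r + 1].
    """
    n_rows = len(indptr) - 1
    nonzero_count = [0] * n_cols              # stored non-zeros per column
    distinct = [set() for _ in range(n_cols)]  # distinct non-zero values

    # Visit only the stored entries; implicit zeros are never touched.
    for row in range(n_rows):
        for k in range(indptr[row], indptr[row + 1]):
            value = data[k]
            if value == 0:
                continue  # skip explicitly stored zeros
            col = indices[k]
            nonzero_count[col] += 1
            distinct[col].add(value)

    varying = []
    for col in range(n_cols):
        if not distinct[col]:
            continue  # column is all zeros, hence constant
        # Fewer non-zeros than rows means at least one implicit zero.
        has_implicit_zero = nonzero_count[col] < n_rows
        if len(distinct[col]) > 1 or has_implicit_zero:
            varying.append(col)
    return varying


# 3 x 4 example matrix:
# [[1, 0, 2, 0],
#  [1, 0, 0, 0],
#  [1, 0, 3, 0]]
data = [1, 2, 1, 1, 3]
indices = [0, 2, 0, 0, 2]
indptr = [0, 2, 3, 5]
result = varying_columns(data, indices, indptr, 4)  # only column 2 varies
```

The work done is proportional to the number of stored entries plus the number of columns, rather than rows times columns, which is where the saving on very sparse tables comes from.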

Remaining issues:

  1. The remaining computational bottleneck is the Table.transform(domain) call in the reshape_wide() method, which iterates over all columns.

Current timings:
  • ~10 sec on a 200 x 10000 dataset.
  • ~2 min on a 1200 x 24000 dataset.
  • Too long on the 61000 x 24000 dataset (friends-transcripts; takes 1.2 GB of memory).

Includes
  • Code changes
  • Tests
  • Documentation

@codecov-io

codecov-io commented May 26, 2017

Codecov Report

Merging #2334 into master will decrease coverage by 0.03%.
The diff coverage is 98.4%.

@@            Coverage Diff             @@
##           master    #2334      +/-   ##
==========================================
- Coverage   73.39%   73.35%   -0.04%     
==========================================
  Files         317      317              
  Lines       55619    55595      -24     
==========================================
- Hits        40819    40782      -37     
- Misses      14800    14813      +13

@mstrazar
Contributor Author

Added a test for varying_between().

@nikicc nikicc added this to the 3.4.3 milestone Jun 2, 2017
@lanzagar lanzagar force-pushed the venn_sparse branch 2 times, most recently from 35a07a9 to 329f8c4 Compare June 2, 2017 09:11
@kernc kernc assigned kernc and unassigned lanzagar Jun 2, 2017
@kernc kernc changed the title Venn diagram on sparse data [ENH] Venn diagram on sparse data Jun 2, 2017
@kernc kernc merged commit 38f42d8 into biolab:master Jun 2, 2017
@mstrazar mstrazar deleted the venn_sparse branch June 2, 2017 13:55
Successfully merging this pull request may close these issues.

Venn Diagram should work on sparse
5 participants