ValueError: perplexity must be less than n_samples #112

Closed
velten-lab opened this issue Jan 22, 2024 · 5 comments

@velten-lab

This might be related to issue 111, not sure. I first ran into that issue, and now, after upgrading to v0.5.16.3, I get an error as below, also at the t-SNE step.

Essentially I am following the example notebook TF_MoDISco_TAL_GATA.ipynb, with the difference that I have contribution scores for more sequences, computed by ISM.

The complete output is attached. Thanks for looking into this! tfmodisco_output.txt

@AvantiShri
Collaborator

Thanks for reporting; it gives incentive to fix these things. This should be fixed in https://github.com/kundajelab/tfmodisco/releases/tag/v0.5.16.4, and I uploaded the release to PyPI. Please let me know if it doesn't work.

@velten-lab
Author

velten-lab commented Jan 27, 2024

Thank you for looking into this! Unfortunately I still get the same error. The only real difference in the trace is here:

```
File ~/miniforge3/envs/tfmodisco/lib/python3.12/site-packages/modisco/core.py:780, in AggregatedSeqlet.compute_subclusters_and_embedding(self, pattern_comparison_settings, perplexity, n_jobs, verbose, compute_embedding)
    774 distmat_sp.sort_indices()
    775 if (compute_embedding):
    776     twod_embedding = sklearn.manifold.TSNE(
    777         perplexity=min(perplexity, distmat_sp.shape[0]),
    778         metric='precomputed',
    779         init="random",
--> 780         verbose=3, random_state=1234).fit_transform(distmat_sp)
    781     self.twod_embedding = twod_embedding
    783 #do density adaptation
```

@giulic3

giulic3 commented Jan 27, 2024

Following the thread, as I'm getting the exact same error(s), before and after the update. Thanks for helping!

@AvantiShri
Collaborator

AvantiShri commented Jan 28, 2024

Sorry about that, I should have done min(perplexity, distmat_sp.shape[0]-1) rather than min(perplexity, distmat_sp.shape[0]). Can you try v0.5.16.4.1 (just pushed) and let me know if that works? It seems that I don't hit this error on my dataset even when I set the min cluster size to 30, so I'm relying on you to test it.
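
For readers following along, here is a minimal sketch of why the first fix still failed and the second should not. It assumes only what the error message states, namely that the perplexity must be strictly less than the number of samples. `clamp_perplexity` is a hypothetical helper written for illustration and is not part of tfmodisco; the sample counts are made-up values.

```python
def clamp_perplexity(perplexity, n_samples):
    """Return a perplexity strictly below n_samples.

    Hypothetical helper (not a tfmodisco function); it mirrors the
    min(perplexity, n_samples - 1) expression from the comment above.
    """
    return min(perplexity, n_samples - 1)


# Illustrative values: a cluster smaller than the requested perplexity.
n_samples = 20
requested = 30

# First attempted fix: min(perplexity, n_samples) can equal n_samples,
# so the "less than n_samples" requirement is still violated.
first_attempt = min(requested, n_samples)

# Corrected fix: subtracting one keeps the value strictly below n_samples.
corrected = clamp_perplexity(requested, n_samples)

print(first_attempt < n_samples)  # False
print(corrected < n_samples)      # True
```

When the cluster is larger than the requested perplexity, the helper returns the requested value unchanged, so the clamp only takes effect for small clusters.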

@giulic3

giulic3 commented Jan 29, 2024

I have just tested it, and it seems to be working for me now: it runs to the end with no errors and I can extract some patterns. Thanks @AvantiShri!
