AssertionError when run tf-modisco #117

Open
Jie-Lii opened this issue Jul 19, 2024 · 1 comment


Jie-Lii commented Jul 19, 2024

Hi, I am trying to use tf-modisco to find motifs, but I encountered the following error during execution:

MEMORY 3.652018176
On task task0
Computing windowed sums on original
Generating null dist
peak(mu)= 0.0071754540205001835
Computing threshold
Subsampling!
For increasing = True , the minimum IR precision was 0.40944084378429907 occurring at 0.0 implying a frac_neg of 0.6933104659793835
To be conservative, adjusted frac neg is 0.95
Thresholds from null dist were -inf  and  8.0625 with frac passing 2e-06
Passing windows frac was 2e-06 , which is below  0.03 ; adjusting
Final raw thresholds are -3.90625  and  3.90625
Final transformed thresholds are -0.9697025572005383  and  0.9697025572005383
saving plot to figures/scoredist_0.png
Got 9863 coords
After resolving overlaps, got 9863 seqlets
Across all tasks, the weakest transformed threshold used was: 0.9696025572005383
MEMORY 3.787350016
9863 identified in total
Traceback (most recent call last):
  File "C:\Users\11435\Desktop\code\2024-07\2024-07-19\tf-modisco玉米5套数据测试\Code\utils.py", line 160, in <module>
    run_modisco(onehot_data=dna_arr[:2000], gradient_data=gradient_arr[:2000])
  File "C:\Users\11435\Desktop\code\2024-07\2024-07-19\tf-modisco玉米5套数据测试\Code\utils.py", line 112, in run_modisco
    tfmodisco_results = modisco.tfmodisco_workflow.workflow.TfModiscoWorkflow(
  File "C:\Users\11435\miniconda3\envs\tf24\lib\site-packages\modisco\tfmodisco_workflow\workflow.py", line 335, in __call__
    metaclustering_results = metaclusterer.fit_transform(seqlets)
  File "C:\Users\11435\miniconda3\envs\tf24\lib\site-packages\modisco\metaclusterers.py", line 100, in fit_transform
    self.fit(seqlets)
  File "C:\Users\11435\miniconda3\envs\tf24\lib\site-packages\modisco\metaclusterers.py", line 107, in fit
    self._fit(attribute_vectors)
  File "C:\Users\11435\miniconda3\envs\tf24\lib\site-packages\modisco\metaclusterers.py", line 306, in _fit
    vector_activity_pattern = self.vector_to_pattern(vector)
  File "C:\Users\11435\miniconda3\envs\tf24\lib\site-packages\modisco\metaclusterers.py", line 149, in vector_to_pattern
    assert False
AssertionError
[0.9666605]
[0]

I don't know what caused this issue. Here is the code I ran.

contrib_scores = {"task0": onehot_data * gradient_data}
hypothetical_contribs_scores = {"task0": gradient_data}
onehot_data = onehot_data

null_per_pos_scores = modisco.coordproducers.LaplaceNullDist(num_to_samp=1000)
tfmodisco_results = modisco.tfmodisco_workflow.workflow.TfModiscoWorkflow(
    # Slight modifications from the default settings
    sliding_window_size=15,
    flank_size=5,
    target_seqlet_fdr=0.15,
    seqlets_to_patterns_factory=
    modisco.tfmodisco_workflow.seqlets_to_patterns.TfModiscoSeqletsToPatternsFactory(
        # Note: as of version 0.5.6.0, it's possible to use the results of a motif discovery
        # software like MEME to improve the TF-MoDISco clustering. To use the meme-based
        # initialization, you would specify the initclusterer_factory as shown in the
        # commented-out code below:
        # initclusterer_factory=modisco.clusterinit.memeinit.MemeInitClustererFactory(
        #    meme_command="meme", base_outdir="meme_out",
        #    max_num_seqlets_to_use=10000, nmotifs=10, n_jobs=1),
        trim_to_window_size=15,
        initial_flank_to_add=5,
        final_flank_to_add=5,
        final_min_cluster_size=20,
        # use_pynnd=True can be used for faster nn comp at coarse grained step
        # (it will use pynndescent), but note that pynndescent may crash
        # use_pynnd=True,
        n_cores=10)
)(
    task_names=["task0"],
    contrib_scores=contrib_scores,
    hypothetical_contribs=hypothetical_contribs_scores,
    one_hot=onehot_data,
    null_per_pos_scores=null_per_pos_scores)

I look forward to your response.
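
For anyone trying to reproduce this, here is a minimal sketch of how the three inputs in the snippet relate to each other, using random toy arrays in place of the real data. The array layout (sequences x positions x 4 bases), the toy sizes, and the helper name `build_inputs` are assumptions made for illustration and are not taken from the original script. The sketch only needs NumPy and does not call tf-modisco itself.

```python
import numpy as np


def build_inputs(onehot_data, gradient_data, task_name="task0"):
    """Hypothetical helper: build the two score dicts used in the snippet."""
    onehot_data = np.asarray(onehot_data, dtype=float)
    gradient_data = np.asarray(gradient_data, dtype=float)
    if onehot_data.shape != gradient_data.shape:
        raise ValueError(
            f"shape mismatch: {onehot_data.shape} vs {gradient_data.shape}")
    # Actual contributions: the gradient is kept only at the observed base.
    contrib_scores = {task_name: onehot_data * gradient_data}
    # Hypothetical contributions: the gradient at every possible base.
    hypothetical_contribs = {task_name: gradient_data}
    return contrib_scores, hypothetical_contribs


# Toy data (assumed layout: sequences x positions x 4 bases).
rng = np.random.default_rng(0)
n_seqs, seq_len = 8, 50
base_idx = rng.integers(0, 4, size=(n_seqs, seq_len))
onehot = np.eye(4)[base_idx]                    # one-hot encoded sequences
grads = rng.normal(size=(n_seqs, seq_len, 4))   # stand-in for model gradients

contrib, hyp = build_inputs(onehot, grads)
print(contrib["task0"].shape)
# At most one base per position can carry a nonzero actual contribution.
print(np.count_nonzero(contrib["task0"], axis=-1).max())
```

Checking shapes and nonzero counts like this before calling the workflow can help rule out input-format problems. It does not by itself explain the AssertionError.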

Collaborator

AvantiShri commented Jul 28, 2024 via email
