AssertionError when run tf-modisco #117

Jie-Lii · 2024-07-19T08:33:35Z

Hi, I am trying to use tf-modisco to find motifs, but I encountered the following error during execution:

MEMORY 3.652018176
On task task0
Computing windowed sums on original
Generating null dist
peak(mu)= 0.0071754540205001835
Computing threshold
Subsampling!
For increasing = True , the minimum IR precision was 0.40944084378429907 occurring at 0.0 implying a frac_neg of 0.6933104659793835
To be conservative, adjusted frac neg is 0.95
Thresholds from null dist were -inf  and  8.0625 with frac passing 2e-06
Passing windows frac was 2e-06 , which is below  0.03 ; adjusting
Final raw thresholds are -3.90625  and  3.90625
Final transformed thresholds are -0.9697025572005383  and  0.9697025572005383
saving plot to figures/scoredist_0.png
Got 9863 coords
After resolving overlaps, got 9863 seqlets
Across all tasks, the weakest transformed threshold used was: 0.9696025572005383
MEMORY 3.787350016
9863 identified in total
Traceback (most recent call last):
  File "C:\Users\11435\Desktop\code\2024-07\2024-07-19\tf-modisco玉米5套数据测试\Code\utils.py", line 160, in <module>
    run_modisco(onehot_data=dna_arr[:2000], gradient_data=gradient_arr[:2000])
  File "C:\Users\11435\Desktop\code\2024-07\2024-07-19\tf-modisco玉米5套数据测试\Code\utils.py", line 112, in run_modisco
    tfmodisco_results = modisco.tfmodisco_workflow.workflow.TfModiscoWorkflow(
  File "C:\Users\11435\miniconda3\envs\tf24\lib\site-packages\modisco\tfmodisco_workflow\workflow.py", line 335, in __call__
    metaclustering_results = metaclusterer.fit_transform(seqlets)
  File "C:\Users\11435\miniconda3\envs\tf24\lib\site-packages\modisco\metaclusterers.py", line 100, in fit_transform
    self.fit(seqlets)
  File "C:\Users\11435\miniconda3\envs\tf24\lib\site-packages\modisco\metaclusterers.py", line 107, in fit
    self._fit(attribute_vectors)
  File "C:\Users\11435\miniconda3\envs\tf24\lib\site-packages\modisco\metaclusterers.py", line 306, in _fit
    vector_activity_pattern = self.vector_to_pattern(vector)
  File "C:\Users\11435\miniconda3\envs\tf24\lib\site-packages\modisco\metaclusterers.py", line 149, in vector_to_pattern
    assert False
AssertionError
[0.9666605]
[0]

I don't know what caused this issue. Here is the code I ran.

contrib_scores = {"task0": onehot_data * gradient_data}
hypothetical_contribs_scores = {"task0": gradient_data}
onehot_data = onehot_data

null_per_pos_scores = modisco.coordproducers.LaplaceNullDist(num_to_samp=1000)
tfmodisco_results = modisco.tfmodisco_workflow.workflow.TfModiscoWorkflow(
        # Slight modifications from the default settings
        sliding_window_size=15,
        flank_size=5,
        target_seqlet_fdr=0.15,
        seqlets_to_patterns_factory=
        modisco.tfmodisco_workflow.seqlets_to_patterns.TfModiscoSeqletsToPatternsFactory(
            # Note: as of version 0.5.6.0, it's possible to use the results of a motif discovery
            # software like MEME to improve the TF-MoDISco clustering. To use the meme-based
            # initialization, you would specify the initclusterer_factory as shown in the
            # commented-out code below:
            # initclusterer_factory=modisco.clusterinit.memeinit.MemeInitClustererFactory(
            #    meme_command="meme", base_outdir="meme_out",
            #    max_num_seqlets_to_use=10000, nmotifs=10, n_jobs=1),
            trim_to_window_size=15,
            initial_flank_to_add=5,
            final_flank_to_add=5,
            final_min_cluster_size=20,
            # use_pynnd=True can be used for faster nn comp at coarse grained step
            # (it will use pynndescent), but note that pynndescent may crash
            # use_pynnd=True,
            n_cores=10)
    )(
        task_names=["task0"],
        contrib_scores=contrib_scores,
        hypothetical_contribs=hypothetical_contribs_scores,
        one_hot=onehot_data,
        null_per_pos_scores=null_per_pos_scores)

I look forward to your response.
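Before sharing or re-running the inputs, a quick sanity check on the two arrays can rule out simple data problems. The sketch below is only an illustration: `check_inputs` is a hypothetical helper (not part of tf-modisco), and it assumes both inputs have the shape (n_seqs, length, 4) and that every position is strictly one-hot. It works on nested lists so that it has no dependencies.

```python
import math


def check_inputs(one_hot, grads):
    """Return a list of problems found in two (n_seqs, length, 4) nested lists.

    Hypothetical helper for illustration; not a tf-modisco function.
    """
    problems = []
    if len(one_hot) != len(grads):
        problems.append("one_hot and grads hold different numbers of sequences")
    for i, (seq, grad_seq) in enumerate(zip(one_hot, grads)):
        if len(seq) != len(grad_seq):
            problems.append(f"sequence {i}: length mismatch")
            continue
        for j, (base, grad) in enumerate(zip(seq, grad_seq)):
            if len(base) != 4 or len(grad) != 4:
                problems.append(f"sequence {i}, position {j}: expected 4 channels")
            elif sorted(base) != [0, 0, 0, 1]:
                # Assumption: exactly one channel is 1 and the rest are 0.
                problems.append(f"sequence {i}, position {j}: not one-hot")
            elif not all(math.isfinite(x) for x in grad):
                problems.append(f"sequence {i}, position {j}: non-finite gradient")
    return problems


# Tiny made-up example: one sequence of two positions, with a NaN gradient.
one_hot = [[[1, 0, 0, 0], [0, 0, 1, 0]]]
grads = [[[0.2, -0.1, 0.0, 0.3], [float("nan"), 0.1, 0.4, -0.2]]]
print(check_inputs(one_hot, grads))
```

With NumPy arrays such as the ones in the post, the call would be `check_inputs(onehot_data.tolist(), gradient_data.tolist())`. An empty list means none of these particular problems were found; it does not rule out other causes.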


AvantiShri · 2024-07-28T08:07:52Z

Hi Jie, thanks for bringing this to my attention. I have left the field but will try to make some time to look into this. Unfortunately, it would be hard for me to debug this error without access to the input data. Is there any chance you can provide it? By the way, do you still get the error with tfmodisco-lite (linked from the README page)? That version of tfmodisco is more actively maintained. If you get the error with tfmodisco-lite as well, I will prioritize looking into it. In the worst case, we can bypass the metaclustering altogether, since it is legacy functionality from when tfmodisco was being run on data from multiple tasks.
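A note on reading the log in the original post: the two values printed after the traceback, [0.9666605] and [0], look like a per-task score vector and the sign pattern derived from it. That is an interpretation of the output, not something confirmed from the library source. Under that assumption, the score 0.9666605 falls just below the weakest transformed threshold reported in the log (0.9696025572005383), so with a single task the resulting pattern would be all zeros. The sketch below only reproduces that comparison; `sign_pattern` is a hypothetical helper, not a tf-modisco function.

```python
def sign_pattern(vector, threshold):
    """Map each score to +1, -1, or 0 depending on whether its magnitude
    reaches the threshold. Hypothetical helper for illustration only."""
    return [0 if abs(v) < threshold else (1 if v > 0 else -1) for v in vector]


# Values copied from the log in the original post.
weakest_threshold = 0.9696025572005383  # "weakest transformed threshold used"
printed_vector = [0.9666605]            # first value printed after the traceback

pattern = sign_pattern(printed_vector, weakest_threshold)
print(pattern)                       # [0]
print(any(p != 0 for p in pattern))  # False: no task is marked active
```

If this reading is right, the assertion would be triggered by a seqlet whose score sits slightly under the threshold used for metaclustering, which would be consistent with the suggestion to bypass metaclustering for single-task data.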

