Question: does BPNet code support any seq length? #1

Hi @jmschrei,
does the BPNet code support arbitrary sequence lengths?
https://github.com/jmschrei/bpnet-lite/blob/master/bpnetlite/bpnet.py

Comments
The BPNet model itself should work on any length. Pay attention to the trimming parameter, though.
Thanks for clarifying! What are the requirements for the input sequence length? Can I change trimming to use 1000 input and 1000 output?
If you set trimming to 0 you should be able to use the same input and output length. However, you'll likely get worse predictions on the flanks because they can't see their full contexts.
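For intuition, here is a minimal PyTorch sketch (an illustration only, not bpnet-lite's actual code) of why a BPNet-style stack of 'same'-padded dilated convolutions works on any input length, and how trimming maps an input of length L to an output of length L - 2*trimming (so trimming=0 gives matching lengths):

```python
import torch
import torch.nn as nn

class TinyBPNetBody(nn.Module):
    """Illustrative BPNet-style body: every conv uses 'same' padding, so any length works."""
    def __init__(self, n_filters=64, n_layers=4, trimming=0):
        super().__init__()
        self.trimming = trimming
        self.iconv = nn.Conv1d(4, n_filters, kernel_size=21, padding=10)
        self.rconvs = nn.ModuleList([
            nn.Conv1d(n_filters, n_filters, kernel_size=3, dilation=2 ** i, padding=2 ** i)
            for i in range(1, n_layers + 1)
        ])
        self.fconv = nn.Conv1d(n_filters, 1, kernel_size=75, padding=37)

    def forward(self, X):
        h = torch.relu(self.iconv(X))
        for conv in self.rconvs:
            h = h + torch.relu(conv(h))        # dilated residual layers, length preserved
        y = self.fconv(h)                      # still length L
        if self.trimming:                      # drop flanks that lack full context
            y = y[:, :, self.trimming:-self.trimming]
        return y

X = torch.randn(1, 4, 1000)                    # one-hot-style input, length 1000
print(TinyBPNetBody(trimming=0)(X).shape)      # torch.Size([1, 1, 1000]): output == input length
print(TinyBPNetBody(trimming=100)(X).shape)    # torch.Size([1, 1, 800]): 2*trimming removed
```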
I would like to use this as a bias model - how much context does the bias model need to see?
Usually, the bias model is given far less context so that it does not inadvertently learn complex rules. We usually use a model with four layers, so the residual layers aggregate 2**4 nucleotides after the first layer.
Thanks for clarifying! Trimming 16/2 nucleotides probably doesn't matter too much. What are the bias model's first-layer filter size and FCLayers dimensionality?
Look through this example specification: https://github.com/jmschrei/bpnet-lite/blob/master/example_jsons/chrombpnet_pipeline_example.json |
I am not sure where I should look for a full list of bias model parameters. I see this https://github.com/jmschrei/bpnet-lite/blob/master/example_jsons/chrombpnet_pipeline_example.json#L34-L41 but it only mentions the number of layers.
Also interesting to see that BPNet doesn't use any normalisation layers (e.g. LayerNorm/BatchNorm). I wonder if not using those normalisation layers is a prerequisite for learning TF concentration-dependent effects.
The bias model is just a BPNet model, so you can use any of the parameters in https://github.com/jmschrei/bpnet-lite/blob/master/example_jsons/bpnet_fit_example.json. I'm not sure how normalization would relate to learning TF concentration-dependent effects. Presumably, that's static in each experiment.
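For concreteness, a hedged sketch of that point (assuming the BPNet class in bpnetlite/bpnet.py takes n_filters and n_layers arguments matching the keys in the example JSONs; check the class signature for the full parameter list):

```python
from bpnetlite.bpnet import BPNet  # module path from the link above; argument names assumed

# Full model: the usual deeper BPNet.
full_model = BPNet(n_filters=64, n_layers=8)

# Bias model: the same class, just shallower (e.g. four dilated residual layers),
# so it sees far less context and cannot pick up complex motif syntax.
bias_model = BPNet(n_filters=64, n_layers=4)
```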
This makes sense. How do you motivate not using normalisation?
Presumably, the motivation is that adding it in didn't help, empirically, and keeping the model conceptually simpler can help with interpretation.
Makes sense! Looking at the file with options:
Randomly RCing encourages the model to learn motifs in both directions but there's no guarantee that it learns the same effect in both directions, even though biologically that's plausible. There have been RARE cases where it learns a motif in one direction and not in the other, but this is usually only when there are not a lot of binding sites. I'm not sure I understand how you're training your biological model. Are you training it jointly with a frozen pre-trained bias model? I guess my feeling is that both should be trained the same way. If you end up doing parameter sharing with your model, you should train the bias model with parameter sharing. Otherwise, it might learn some weird effects. Unfortunately, parameter sharing is not implemented in bpnet-lite.
I see. The model failing to learn a motif in both directions makes sense. Also makes sense to train both models with parameter sharing. The bpnet-lite code helps a lot with understanding the architecture. I will try implementing parameter sharing but I am not sure I fully understand how to do that for layer 2+. My biological architecture uses just one CNN layer: it scans the forward DNA with the FW filter, then with the RC filter (simply swap complementary nucleotides and reverse the filter).
If you want to do parameter sharing I'd recommend having a wrapper that takes in a normal model, runs your sequence through it, then runs the RC'd sequence through it, and then flips the output from the RC'd sequence and averages it. No need to modify the underlying model.
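A minimal sketch of that wrapper, assuming one-hot input of shape (batch, 4, length) with channel order A, C, G, T and a wrapped model that returns a single per-position tensor (a model with multiple heads, e.g. profile and counts, would need each output handled the same way):

```python
import torch
import torch.nn as nn

class RCAveragingWrapper(nn.Module):
    """Run a model on a sequence and its reverse complement, then average.

    Assumes X is one-hot with shape (batch, 4, length) and channel order A, C, G, T,
    so the reverse complement is a flip of both the channel and length axes.
    """
    def __init__(self, model):
        super().__init__()
        self.model = model

    def forward(self, X):
        y_fwd = self.model(X)
        X_rc = torch.flip(X, dims=[1, 2])    # complement (channel flip) + reverse (length flip)
        y_rc = self.model(X_rc)
        y_rc = torch.flip(y_rc, dims=[2])    # put the RC prediction back into forward coordinates
        return (y_fwd + y_rc) / 2
```

Because the sharing lives entirely in the wrapper, the underlying model stays untouched, as suggested above.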
Thanks for this suggestion @jmschrei! This certainly simplifies the implementation. Didn't Anshul's group previously show that this kind of conjoining during training is worse than both conjoining during evaluation and a model that correctly accounts for the symmetries in every layer? |
Actually, this won't work, because for forward and RC sequences to give the same results the model needs to represent both the forward and RC versions of each TF motif, which is fine if you train a CNN de novo but not fine if you use TF motifs as the first-layer CNN weight priors.
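To make that concrete, here is a hedged sketch (a hypothetical helper, not part of bpnet-lite) of what representing both orientations would require if the first-layer filters are seeded from PWMs: every motif prior also contributes its reverse complement as a separate filter (channel order A, C, G, T assumed):

```python
import torch
import torch.nn as nn

def seed_first_layer_with_motifs(pwms):
    """Build a Conv1d whose filters are each PWM plus its reverse complement.

    pwms: tensor of shape (n_motifs, 4, width) with channel order A, C, G, T.
    Hypothetical helper for illustration only.
    """
    rc_pwms = torch.flip(pwms, dims=[1, 2])        # complement each base and reverse the motif
    filters = torch.cat([pwms, rc_pwms], dim=0)    # both orientations as separate filters
    conv = nn.Conv1d(4, filters.shape[0], kernel_size=filters.shape[2],
                     padding=filters.shape[2] // 2, bias=False)
    with torch.no_grad():
        conv.weight.copy_(filters)
    return conv
```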