Question: does BPNet code support any seq length? #1

Hi @jmschrei,
does the BPNet code support arbitrary sequence lengths?
https://github.com/jmschrei/bpnet-lite/blob/master/bpnetlite/bpnet.py

Comments
The BPNet model itself should work on any length. Pay attention to the trimming parameter, though.
Thanks for clarifying! What are the requirements for the input sequence length? Can I change trimming to use 1000 input and 1000 output?
If you set trimming to 0 you should be able to use the same input and output length. However, you'll likely get worse predictions on the flanks because they can't see their full contexts.
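For intuition, here is a minimal PyTorch sketch (an illustration only, not bpnet-lite's actual code) of why a BPNet-style stack of 'same'-padded dilated convolutions works on any input length, and how trimming maps an input of length L to an output of length L - 2*trimming (so trimming=0 gives matching lengths):

```python
import torch
import torch.nn as nn

class TinyBPNetBody(nn.Module):
    """Illustrative BPNet-style body: every conv uses 'same' padding, so any length works."""
    def __init__(self, n_filters=64, n_layers=4, trimming=0):
        super().__init__()
        self.trimming = trimming
        self.iconv = nn.Conv1d(4, n_filters, kernel_size=21, padding=10)
        self.rconvs = nn.ModuleList([
            nn.Conv1d(n_filters, n_filters, kernel_size=3, dilation=2 ** i, padding=2 ** i)
            for i in range(1, n_layers + 1)
        ])
        self.fconv = nn.Conv1d(n_filters, 1, kernel_size=75, padding=37)

    def forward(self, X):
        h = torch.relu(self.iconv(X))
        for conv in self.rconvs:
            h = h + torch.relu(conv(h))        # dilated residual layers, length preserved
        y = self.fconv(h)                      # still length L
        if self.trimming:                      # drop flanks that lack full context
            y = y[:, :, self.trimming:-self.trimming]
        return y

X = torch.randn(1, 4, 1000)                    # one-hot-style input, length 1000
print(TinyBPNetBody(trimming=0)(X).shape)      # torch.Size([1, 1, 1000]): output == input length
print(TinyBPNetBody(trimming=100)(X).shape)    # torch.Size([1, 1, 800]): 2*trimming removed
```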
I would like to use this as a bias model - how much context does the bias model need to see?
Usually, the bias model is given far less context so that it does not inadvertently learn complex rules. We usually use a model with four layers, so the residual layers aggregate 2**4 nucleotides after the first layer.
Thanks for clarifying! Trimming 16/2 nucleotides probably doesn't matter too much. What are the bias model's first-layer filter size and FCLayers dimensionality?
Look through this example specification: https://github.com/jmschrei/bpnet-lite/blob/master/example_jsons/chrombpnet_pipeline_example.json |
I am not sure where I should look for a full list of bias model parameters. I see this https://github.com/jmschrei/bpnet-lite/blob/master/example_jsons/chrombpnet_pipeline_example.json#L34-L41 but it only mentions the number of layers.
Also interesting to see that BPNet doesn't use any normalisation layers (e.g. LayerNorm/BatchNorm). I wonder if not using those normalisation layers is a prerequisite for learning TF concentration-dependent effects.
The bias model is just a BPNet model, so you can use any of the parameters in https://github.com/jmschrei/bpnet-lite/blob/master/example_jsons/bpnet_fit_example.json. I'm not sure how normalization would relate to learning TF concentration-dependent effects. Presumably, that's static in each experiment.
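For concreteness, a hedged sketch of that point (assuming the BPNet class in bpnetlite/bpnet.py takes n_filters and n_layers arguments matching the keys in the example JSONs; check the class signature for the full parameter list):

```python
from bpnetlite.bpnet import BPNet  # module path from the link above; argument names assumed

# Full model: the usual deeper BPNet.
full_model = BPNet(n_filters=64, n_layers=8)

# Bias model: the same class, just shallower (e.g. four dilated residual layers),
# so it sees far less context and cannot pick up complex motif syntax.
bias_model = BPNet(n_filters=64, n_layers=4)
```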
This makes sense. How do you motivate not using normalisation?
Presumably, the motivation is that adding it in didn't help, empirically, and keeping the model conceptually simpler can help with interpretation.
Makes sense! Looking at the file with options:
Randomly RCing encourages the model to learn motifs in both directions but there's no guarantee that it learns the same effect in both directions, even though biologically that's plausible. There have been RARE cases where it learns a motif in one direction and not in the other, but this is usually only when there are not a lot of binding sites. I'm not sure I understand how you're training your biological model. Are you training it jointly with a frozen pre-trained bias model? I guess my feeling is that both should be trained the same way. If you end up doing parameter sharing with your model, you should train the bias model with parameter sharing. Otherwise, it might learn some weird effects. Unfortunately, parameter sharing is not implemented in bpnet-lite.
I see. The model failing to learn a motif in both directions makes sense. Also makes sense to train both models with parameter sharing. The bpnet-lite code helps a lot with understanding the architecture. I will try implementing parameter sharing but I am not sure I fully understand how to do that for layer 2+. My biological architecture uses just one CNN layer: it scans the forward DNA with the FW filter, then with the RC filter (simply swap complementary nucleotides and reverse the filter).
If you want to do parameter sharing I'd recommend having a wrapper that takes in a normal model, runs your sequence through it, then runs the RC'd sequence through it, and then flips the output from the RC'd sequence and averages it. No need to modify the underlying model.
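A minimal sketch of that wrapper, assuming one-hot input of shape (batch, 4, length) with channel order A, C, G, T and a wrapped model that returns a single per-position tensor (a model with multiple heads, e.g. profile and counts, would need each output handled the same way):

```python
import torch
import torch.nn as nn

class RCAveragingWrapper(nn.Module):
    """Run a model on a sequence and its reverse complement, then average.

    Assumes X is one-hot with shape (batch, 4, length) and channel order A, C, G, T,
    so the reverse complement is a flip of both the channel and length axes.
    """
    def __init__(self, model):
        super().__init__()
        self.model = model

    def forward(self, X):
        y_fwd = self.model(X)
        X_rc = torch.flip(X, dims=[1, 2])    # complement (channel flip) + reverse (length flip)
        y_rc = self.model(X_rc)
        y_rc = torch.flip(y_rc, dims=[2])    # put the RC prediction back into forward coordinates
        return (y_fwd + y_rc) / 2
```

Because the sharing lives entirely in the wrapper, the underlying model stays untouched, as suggested above.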
Thanks for this suggestion @jmschrei! This certainly simplifies the implementation. Didn't Anshul's group previously show that this kind of conjoining during training is worse than both conjoining during evaluation and a model that correctly accounts for the symmetries in every layer? |
Actually, this won't work, because for forward and RC sequences to give the same results the model needs to represent both the forward and RC versions of each TF motif, which is fine if you train a CNN de novo but not fine if you use TF motifs as the first-layer CNN weight priors.
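To make that concrete, here is a hedged sketch (a hypothetical helper, not part of bpnet-lite) of what representing both orientations would require if the first-layer filters are seeded from PWMs: every motif prior also contributes its reverse complement as a separate filter (channel order A, C, G, T assumed):

```python
import torch
import torch.nn as nn

def seed_first_layer_with_motifs(pwms):
    """Build a Conv1d whose filters are each PWM plus its reverse complement.

    pwms: tensor of shape (n_motifs, 4, width) with channel order A, C, G, T.
    Hypothetical helper for illustration only.
    """
    rc_pwms = torch.flip(pwms, dims=[1, 2])        # complement each base and reverse the motif
    filters = torch.cat([pwms, rc_pwms], dim=0)    # both orientations as separate filters
    conv = nn.Conv1d(4, filters.shape[0], kernel_size=filters.shape[2],
                     padding=filters.shape[2] // 2, bias=False)
    with torch.no_grad():
        conv.weight.copy_(filters)
    return conv
```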