nobias model "uncorrected", but high pearsonr in peaks. Still OK to use just for counts (not profiles)? #217

mepster · 2024-12-11T22:11:14Z

Hi and thank you for chrombpnet!

With the Fraser lab I'm doing some work on evolution of chromatin accessibility, across many tissue types from different datasets.

One dataset from the Pritchard lab gets worse bias models than the others (after sweeping the "-bias" parameter). The nobias models are worse too (see example report, attached). The model is (badly) "uncorrected", and one Tn5 motif pops up as a strong match in the nobias modisco motifs.

However, the pearsonr in peaks (measuring the counts) is quite high, 0.80. If we are just interested in counts (not profiles), can we still use this "uncorrected" model?

Thanks for all your work, chrombpnet is amazing.

Mem_B_no_treament_peaks_combined-sc2_overall_reportB.pdf

panushri25 · 2024-12-11T22:18:16Z

What bias model are you using here?

mepster · 2024-12-11T23:24:44Z

Thanks Anusri!

The bias model is trained from a large set of Pritchard lab data, with all the tissue types merged together. (Then, we train multiple nobias models on one tissue type at a time, using the merged bias model.)

It's bulk ATAC-seq from 25 pure populations of immune cells, both stimulated and unstimulated. https://www.nature.com/articles/s41588-019-0505-9

Merged together, it's 1,000,408,867 reads and 405,875 peaks (from 1,016,570 peaks initially, filtered for qval 4.0). This yields 403,782 positives and 720,528 negatives. The best "-bias" parameter was 0.2.

The bias model report is attached.

Thank you for your help!

All.downsampled_bias-b0.2-s1_overall_report.pdf

linzyzhao2002 · 2024-12-18T16:09:09Z

> Hi and thank you for chrombpnet!
>
> With the Fraser lab I'm doing some work on evolution of chromatin accessibility, across many tissue types from different datasets.
>
> One dataset from the Pritchard lab gets worse bias models than the others (after sweeping the "-bias" parameter). The nobias models are worse too (see example report, attached). The model is (badly) "uncorrected", and one Tn5 motif pops up as a strong match in the nobias modisco motifs.
>
> However, the pearsonr in peaks (measuring the counts) is quite high, 0.80. If we are just interested in counts (not profiles), can we still use this "uncorrected" model?
>
> Thanks for all your work, chrombpnet is amazing.
>
> Mem_B_no_treament_peaks_combined-sc2_overall_reportB.pdf

Hi, I am facing a similar problem here. The model seems "uncorrected", and pearsonr in peaks for the bias model was 0.3 (normally it should be above -0.3, but still negative, I assume?). I used a pre-trained bias model as well, and tried hyper-parameter values ranging from 0.2 to 0.5, but it still didn't work. Have you found a solution to your problem yet? Thank you so much!!!

panushri25 · 2024-12-18T18:04:02Z

@mepster Ah, I think the problem is that you merged the peaks from all your cell types, and the resulting negatives exclude all of these peaks, so you are probably excluding a large part of the GC-rich genome as well. Your candidate negative regions may therefore have deviated compositionally from your peak regions, making it hard for the bias model to generalize.

Instead of merging the tissues, pick the tissue / cell type that is the deepest and train a bias model only on that tissue (i.e., using bigwigs, peaks and negatives generated for that tissue alone). This bias model should generalize across tissues (assuming they follow a similar ATAC-seq protocol).
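One rough way to check whether the negatives have drifted compositionally from the peaks is to compare mean GC fraction between the two sets. The snippet below is only a minimal sketch of that idea: the helper names (`gc_fraction`, `mean_gc`) and the toy sequences are made up for illustration and are not part of chrombpnet. In practice the sequences would be extracted from the genome at the peak and negative coordinates.

```python
from statistics import mean


def gc_fraction(seq: str) -> float:
    """Fraction of G/C bases among unambiguous (A/C/G/T) bases."""
    seq = seq.upper()
    acgt = sum(seq.count(base) for base in "ACGT")
    if acgt == 0:
        return 0.0  # sequence is all N or empty
    return (seq.count("G") + seq.count("C")) / acgt


def mean_gc(seqs) -> float:
    """Mean GC fraction over a collection of sequences."""
    return mean(gc_fraction(s) for s in seqs)


# Toy sequences for illustration only (hypothetical, not real data).
peaks = ["GCGCGCAT", "GGCCGGTA"]
negatives = ["ATATATGC", "AATTAATT"]

# A large positive gap suggests the negatives are GC-poor relative to peaks.
gap = mean_gc(peaks) - mean_gc(negatives)
print(f"peaks GC={mean_gc(peaks):.3f} negatives GC={mean_gc(negatives):.3f} gap={gap:.3f}")
```

A large gap between the two means would be consistent with the explanation above, although comparing the full GC distributions is more informative than comparing means alone.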

panushri25 · 2024-12-18T18:07:06Z

@linzyzhao2002 the pearsonr for bias in peaks can be positive (anything greater than -0.3 is good).
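For reference, the check described here can be sketched as below. This is an illustrative example only: `pearsonr` is a plain-Python Pearson correlation, `bias_peak_r_ok` is a hypothetical helper, and the -0.3 cutoff is simply the value quoted in this thread. How the report itself derives the number (for example, which count transform it uses) is not shown here.

```python
import math

# Threshold quoted in this thread: bias-model pearsonr in peaks should exceed -0.3.
BIAS_PEAK_R_MIN = -0.3


def pearsonr(x, y) -> float:
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(vx * vy)


def bias_peak_r_ok(r: float) -> bool:
    """True when r is above the cutoff; positive values pass as well."""
    return r > BIAS_PEAK_R_MIN


# Toy values for illustration only (hypothetical, not real data).
observed = [2.0, 3.5, 5.0, 4.0]
predicted = [2.5, 3.0, 4.5, 4.5]
r = pearsonr(observed, predicted)
print(f"r={r:.3f} ok={bias_peak_r_ok(r)}")
```

Under this reading, a value of 0.3 passes the check, since the criterion is a lower bound rather than a requirement that the correlation be negative.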

Are you following a similar setup as above? If yes please refer to my above message, otherwise can you provide more details on your setup?

linzyzhao2002 · 2024-12-18T18:11:40Z

> @linzyzhao2002 the pearsonr for bias in peaks can be positive (anything greater than -0.3 is good).
>
> Are you following a similar setup as above? If yes please refer to my above message, otherwise can you provide more details on your setup?

Hi Anusri, thank you so much for your reply!!! Could you please refer to issue #214?

I put my bias model and chrombpnet output there. Many thanks for your help, and any suggestions would be extremely helpful!! I've been stuck on this for 2 weeks.
