Experimenting with CRISPR calculations #77

cansavvy · 2024-12-20T21:17:49Z

Description

From a Basecamp conversation, we realized normalization might not be happening the way we thought.

@ahberger thought we were calculating CRISPRs using:

logFC adjusted = (log2FC - log2FC_negctls) / |log2FC_posctls|

But the original code has this as the calculation:

https://github.com/FredHutch/GI_mapping/blob/e117710977fd4c92b62ff3f552254a6a3076a6d4/workflow/scripts/03-filter_and_calculate_LFC.Rmd#L450

d.lfc_annot_adj <- d.lfc_annot %>%
  group_by(rep) %>%
  mutate(lfc_adj1 = lfc_plasmid_vs_late - median(lfc_plasmid_vs_late[norm_ctrl_flag == "negative_control"]),
         lfc_adj2 = lfc_adj1 / (median(lfc_adj1[norm_ctrl_flag == "negative_control"]) -
                                  median(lfc_adj1[norm_ctrl_flag == "positive_control"]))) 
...

And then one more median subtraction later.

...
  group_by(rep) %>%
  mutate(lfc_adj3 = lfc_adj2 - median(lfc_adj2[unexpressed_ctrl_flag == TRUE]))
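
For reference, here is the algebra implied by the two quoted snippets, as a sketch only. It assumes nothing in the elided `...` parts changes these columns. The symbols are mine, not from the code: $x$ is `lfc_plasmid_vs_late`, $m_-$ and $m_+$ are the per-`rep` medians of $x$ for negative and positive controls, and $m_u$ is the per-`rep` median of `lfc_adj2` for the unexpressed controls. Subtracting $m_-$ makes the negative-control median of `lfc_adj1` exactly 0, so the denominator reduces to $m_- - m_+$:

```latex
\begin{aligned}
\text{lfc\_adj1} &= x - m_- \\
\text{lfc\_adj2} &= \frac{x - m_-}{0 - (m_+ - m_-)} = \frac{x - m_-}{m_- - m_+} \\
\text{lfc\_adj3} &= \frac{x - m_-}{m_- - m_+} - m_u
\end{aligned}
```

Assuming $m_- > m_+$, the control medians of `lfc_adj2` are 0 and -1. After the last subtraction they become $-m_u$ and $-1 - m_u$, so `lfc_adj3` sits exactly at 0 and -1 only when $m_u = 0$. This may or may not be what the plots are showing.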

This is what we've been basing CRISPR calculations on, and we have gotten results very similar to those in the results folder on the cluster: grp/bergerlab_shared/Projects/paralog_pgRNA/pgPEN_library/GI_mapping/results

By all indicators (https://github.com/FredHutch/GI_mapping/blob/e117710977fd4c92b62ff3f552254a6a3076a6d4/workflow/scripts/03-filter_and_calculate_LFC.Rmd#L8), the results found here are from the code we have.

But when I plot these data, they don't adhere to the expected negative controls = 0 and positive controls = -1:

The code on this branch attempts to better meet these expectations by calculating the CRISPR score with the following, instead of the original code:

logFC adjusted = (log2FC - log2FC_negctls) / |log2FC_posctls|

Note, however, that this version of the code does not yield exactly -1 for positive controls:

  rep              norm_ctrl_flag   median_crispr
   <chr>            <fct>                    <dbl>
 1 Day05_RepA_early negative_control         0    
 2 Day05_RepA_early positive_control         3.25 
 3 Day05_RepA_early single_targeting         2.86 
 4 Day05_RepA_early double_targeting         4.47 
 5 Day22_RepA_late  negative_control         0    
 6 Day22_RepA_late  positive_control        -2.18 
 7 Day22_RepA_late  single_targeting        -0.826
 8 Day22_RepA_late  double_targeting        -1.86 
 9 Day22_RepB_late  negative_control         0    
10 Day22_RepB_late  positive_control        -2.07 
11 Day22_RepB_late  single_targeting        -0.793
12 Day22_RepB_late  double_targeting        -1.66 
13 Day22_RepC_late  negative_control         0    
14 Day22_RepC_late  positive_control        -2.13 
15 Day22_RepC_late  single_targeting        -0.785
16 Day22_RepC_late  double_targeting        -1.75
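
A toy illustration of why dividing by |log2FC_posctls| need not put positive controls at -1. This is a sketch under assumptions: the numbers are made up (not pgPEN data), the variable names are hypothetical, and it assumes the denominator is the median raw log2FC of the positive controls, which may not match how the branch implements it. Under that reading, the positive-control median becomes (pos - neg) / |pos|, which equals -1 only when the negative-control median is 0.

```r
# Toy log2 fold changes for a single replicate (made-up values).
lfc  <- c(0.4, 0.5, 0.6, -1.4, -1.5, -1.6, -0.2, -0.9)
flag <- c(rep("negative_control", 3),
          rep("positive_control", 3),
          rep("single_targeting", 2))

neg_med <- median(lfc[flag == "negative_control"])
pos_med <- median(lfc[flag == "positive_control"])

# Assumed reading of: (log2FC - log2FC_negctls) / |log2FC_posctls|
score_abs <- (lfc - neg_med) / abs(pos_med)

# Median score per group, as a plain named numeric vector.
abs_medians <- sapply(split(score_abs, flag), median)
print(abs_medians)

# What the algebra predicts for the positive-control median.
expected_pos <- (pos_med - neg_med) / abs(pos_med)
```

Here the negative controls land on 0, but the positive controls land on `expected_pos` rather than -1 because the negative-control median is not 0 before the subtraction.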

cansavvy · 2024-12-20T21:18:11Z

Overall readability score: 44.82 (🟢 +0.12)

File	Readability
README.md	60.48 (🟢 +0.47)

View detailed metrics

🟢 - Shows an increase in readability
🔴 - Shows a decrease in readability

File	Readability	FRE	GF	ARI	CLI	DCRS
README.md	60.48	50.57	10.65	13.3	11.66	6.39
	🟢 +0.47	🟢 +0.31	🟢 +0.12	🟢 +0.1	🟢 +0	🟢 +0.02

Averages:

	Readability	FRE	GF	ARI	CLI	DCRS
Average	44.82	34.48	11.87	14.18	14.21	8.27
	🟢 +0.12	🟢 +0.08	🟢 +0.03	🟢 +0.02	🟢 +0	🟢 +0

View metric targets

Metric	Range	Ideal score
Flesch Reading Ease	100 (very easy read) to 0 (extremely difficult read)	60
Gunning Fog	6 (very easy read) to 17 (extremely difficult read)	8 or less
Auto. Read. Index	6 (very easy read) to 14 (extremely difficult read)	8 or less
Coleman Liau Index	6 (very easy read) to 17 (extremely difficult read)	8 or less
Dale-Chall Readability	4.9 (very easy read) to 9.9 (extremely difficult read)	6.9 or less

cansavvy · 2024-12-21T00:29:58Z

Following an older version of the code I did:

crispr_score = (lfc - negative_control) / (negative_control - positive_control)

And now negative controls are 0 and positive controls are -1, as expected. I will interrogate this more later, but I think we're more on track. I also have a function to do the plotting and will add it as part of unit testing.
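
A minimal base R sketch of that calculation with the kind of check a unit test could make. The function name `calc_crispr_score`, the toy data frame, and its column names are hypothetical and not the package's actual API. It takes `negative_control` and `positive_control` in the formula to be the per-replicate medians of the control log fold changes.

```r
# Hypothetical helper: scale LFCs so negative controls have median 0
# and positive controls have median -1 within one replicate.
calc_crispr_score <- function(lfc, ctrl_flag) {
  neg_med <- median(lfc[ctrl_flag == "negative_control"], na.rm = TRUE)
  pos_med <- median(lfc[ctrl_flag == "positive_control"], na.rm = TRUE)
  (lfc - neg_med) / (neg_med - pos_med)
}

# Made-up data: two replicates, two constructs per group.
toy <- data.frame(
  rep = rep(c("RepA", "RepB"), each = 6),
  norm_ctrl_flag = rep(c("negative_control", "positive_control",
                         "single_targeting"), times = 4),
  lfc = c(0.5, -1.5, -0.2, 0.4, -1.6, -0.9,
          0.1, -2.0, -0.5, 0.3, -2.4, -1.1)
)

# Apply the helper within each replicate, then restore the original row order.
toy$crispr_score <- unsplit(
  lapply(split(toy, toy$rep),
         function(d) calc_crispr_score(d$lfc, d$norm_ctrl_flag)),
  toy$rep
)

# Median score per replicate and control group.
ctrl_medians <- aggregate(crispr_score ~ rep + norm_ctrl_flag,
                          data = toy, FUN = median)
neg <- ctrl_medians$crispr_score[ctrl_medians$norm_ctrl_flag == "negative_control"]
pos <- ctrl_medians$crispr_score[ctrl_medians$norm_ctrl_flag == "positive_control"]

# The expectations from this thread, within floating point tolerance.
stopifnot(all(abs(neg) < 1e-8), all(abs(pos + 1) < 1e-8))
```

The sketch does not guard against a replicate where the two control medians are equal, which would make the denominator 0. A real test would probably want to cover that case too.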

With the new calculations we are getting closer. It doesn't look like the paper, but at least our normalization is actually in the right range now.

cansavvy added 6 commits December 19, 2024 16:19

Adding tests

4495653

Add positive and negative control median calc

3486257

Update docs

22818f8

Forgot )

c27c28b

Update docs

c4dc378

logFC adjusted = (log2FC - log2FC_negctls) / |log2FC_posctls|

5979627

cansavvy added 2 commits December 20, 2024 16:23

Add dummy set_knitr_image_path

730cace

This works

18edab2

cansavvy added 3 commits December 20, 2024 19:32

Update docs

62cb31c

Update README

32ff9b8

Update vignette

c7214d2
