Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Adding lispmds-style merging to mergeMaps function #148

Open
drserajames opened this issue Apr 5, 2023 · 10 comments
Open

Adding lispmds-style merging to mergeMaps function #148

drserajames opened this issue Apr 5, 2023 · 10 comments

Comments

@drserajames
Copy link
Member

Racmacs and lispmds have different merging criteria at present. When there's a mix of numeric and < titres, Racmacs converts these to numeric and takes the GMT whereas lispmds converts them all to < and takes the maximum. Additionally Racmacs uses the sample SD and lispmds uses the population SD. Lispmds does not support > titres, these are handled using a preprocessing step that converts them to numeric or *.

Proposal (using the population SD)

  1. If all titres are <, keep the minimum (Racmacs & lispmds)
  2. If all titres are >, keep the maximum (Racmacs)
  3. Convert > and < to numeric, calculate the population SD. If greater than a threshold (default 1), set to DONTCARE (Racmacs & lispmds plus user-defined SD threshold)
  4. If no < or >, calculate GMT (Racmacs & lispmds)
  5. If a mix of numeric and >, calculate the GMT (Racmacs & lispmds)
  6. If a mix of < and numeric, convert all to < and take the maximum (lispmds)
  7. If a mix of <, numeric and >, convert > to numeric, then convert all to < and take the maximum (lispmds)

I looked in the file test-merging.R and these are my proposed changes (# is the lines I've edited, ## is new lines):

titer_merge_tests <- tibble::tribble(
  ~expected, ~sum,
  "*",    c("*"),
  "10",   c("10"),
  "<10",  c("<10"),
  ">80",  c(">80"),
  ".",    c("."),
  "10",   c(10, 10),
  "10",   c("*", "10", "10", "*"),
  "<10",  c("<10", "<40", "<20", "*"),
  ">40",  c(">10", ">40", ">20", "*"),
  "*",    c("<10", "20", "80", "*"),
  "*",    c("<10", "40", "*", "*"),    #
  "*",    c("<10", "40", "*", "."),    #
  "<40",   c("<10", "20", "10", "*"),  #
  "<40",   c("<10", "20", "10", "*"),  # duplicate row?
  "<40",   c("<10", "20", "10", "."),  #
  "*",    c("*", "*", "*", "*"),
  ".",    c(".", "."),
  "*",    c(".", "*")
  "<80",   c("<20", "20", ">20", "*")  ##
  "<80",   c("<20", "20", ">20", ".")  ##
)

And changing the sd limit function to allow a 4 fold (bottom two are the new tests)

test_that("sd_lim working", {
  
  expect_equal(
    "<40",
    ac_merge_titers(c("<10", "20"), options = RacMerge.options(sd_limit = NA))
  )
  
  expect_equal(
    "*",
    ac_merge_titers(c("<10", "20"), options = RacMerge.options(sd_limit = 0.9))
  )
  
  expect_equal(
    "<40",
    ac_merge_titers(c("<10", "20"), options = RacMerge.options(sd_limit = 1))
  )
  
  expect_equal(
    "*",
    ac_merge_titers(c(rep("<10",50), rep("20",50), "40"), options = RacMerge.options(sd_limit = 1))
  )
  
})
@drserajames
Copy link
Member Author

In light of further discussion, the default behaviour will be merging all titres (no exclusions). There will be an option to have a sample SD limit when merging, with a default of 1.5. There will the options for a user defined function, plus an option for lispmds-like merging to allow backwards compatibility.

I've modified the titer_merge_tests and commented what I think each tests (# is comment, ## is lines I've edited, ### is new lines)

titer_merge_tests <- tibble::tribble(
  ~expected, ~sum,
  "*",    c("*"), 			# * unchanged
  "10",   c("10"),			# numeric unchanged
  "<10",  c("<10"),			# less than unchanged
  ">80",  c(">80"),			# greater than unchanged
  ".",    c("."),			# . unchanged
  "10",   c(10, 10),			# two same numeric, merges same
  "10",   c("*", "10", "10", "*"),	# two same numeric, two *,  merges same
  "<10",  c("<10", "<40", "<20", "*"),	# all less than go to min
  ">40",  c(">10", ">40", ">20", "*"),	# all greater than fo to max
  "160",  c("10", "160", "2560", "*"),	## very variable not set to . (default sd_limit=NA), including *
  "<40",  c("<10", "20", "*", "*"),	## very variable not set to . , including less than & *
  "<40",  c("<10", "20", "*", "."),	## very variable not set to . , including less than, * & .
  "<40",  c("<10", "20", "10", "*"),	## similar to previous, suggest delete
  "<40",  c("<10", "20", "10", "*"),	## duplicate of row above, suggest delete
  "10",   c("<10", "20", "10", "."),	## similar to previous, suggest delete
  "*",    c("*", "*", "*", "*"),	# all * merge to *
  ".",    c(".", "."),			# all . merge to .
  "*",    c(".", "*")			# mix of . & * merge to .
  "<80",  c("<20", "20", ">20", "*")  	### mix of numeric, less than, greater than & *, convert greater than and merge to < (highest log titre +1)
  "<80",  c("<20", "40", ">10", ".")  	### as previous line with numeric as the highest instead of greater than & * instead of .
)

I'm don't think "sd_lim working" needs changing any more. Instead I propose a test of the lispmds-like merging.

test_that("lispmds working", {
  
  # test n=2 allows 4-fold, but not 8-fold
  expect_equal(
    "<40",
    ac_merge_titers(c("<10", "20"), options = RacMerge.options(merge_function = "lispmds"))
  )
  
  expect_equal(
    "*",
    ac_merge_titers(c("<10", "80"), options = RacMerge.options(merge_function = "lispmds"))
  )

  # test n=102 to check the sd limit is tighter at high n
  expect_equal(
    "*",
    ac_merge_titers(c("10",rep("40", 50), rep("160",50) "20"), options = RacMerge.options(merge_function = "lispmds"))
  )

})

I don't know whether you want a variable sd_limit with lispmds merging, I can add more tests if so. Otherwise, I think there should be a warning if the user sets sd_limit and merge_function.

I can't write a test for user defined function because I still haven't got my head around what it would look like.

@shwilks
Copy link
Member

shwilks commented Apr 21, 2023

How about this for the tests? Testing the 3 options we will have, with "conservative" being the default:

titer_merge_tests <- tibble::tribble(
  ~titers,                      ~conservative,  ~likelihood,  ~lisp,
  c("*"),                       "*",            "*",          "10",   # * unchanged
  c("10"),                      "10",           "10",         "10",   # numeric unchanged
  c("<10"),                     "<10",          "<10",        "<10",  # less than unchanged
  c(">80"),                     ">80",          ">80",        ">80",  # greater than unchanged
  c("."),                       ".",            ".",          ".",    # . unchanged
  c(10, 10),                    "10",           "10",         "10",   # two same numeric, merges same
  c("*", "10", "10", "*"),      "10",           "10",         "10",   # two same numeric, two *,  merges same
  c("<10", "<40", "<20", "*"),  "<10",          "<10",        "<10",  # all less than go to min
  c(">10", ">40", ">20", "*"),  ">40",          ">40",        ">40",  # all greater than to to max
  c("10", "160", "2560", "*"),  "160",          "160",        "*",    # very variable not set to . (default sd_limit=NA), including *
  c("<10", "20", "*", "*"),	    "<40",          "10",         "<40",  # very variable not set to . , including less than & *
  c("<10", "20", "*", "."),	    "<40",          "10",         "<40",  # very variable not set to . , including less than & *
  c("*", "*", "*", "*"),        "*",            "*",          "*",    # all * merge to *
  c(".", "."),                  ".",            ".",          ".",    # all . merge to .
  c(".", "*")                   "*",            "*",          "*",    # mix of . & * merge to *
  c("<20", "20", ">20", "*")    "<80",          "20",         "<80",  # mix of numeric, less than, greater than & *, convert greater than and merge to < (highest log titre +1)
  c("<20", "40", ">10", ".")    "<80",          "20",         "<80",  # as previous line with numeric as the highest instead of greater than & * instead of .
)

Am not sure what to do with the > values in the "lispmds" merge option though?

@drserajames
Copy link
Member Author

A couple of thoughts

  • Should be a "*" for lisp on that first row
  • Should c("10", "160", "2560", "*") merge to "." not "*" for lisp? Similarly c(".", "*") should merge to ".". Or perhaps I'm missing something?
  • I wouldn't say c("<10", "20", "*", "*") & c("<10", "20", "*", ".") is very variable. They are tests of mix of numeric & less than

I think for lisp, the > should be converted to numeric because that's the most likely pre-processing I would do with lisp.

c("10", ">20", "*", "."),	    "20",          "20",         "20",

@shwilks
Copy link
Member

shwilks commented Apr 24, 2023

Thanks, for the last 2 cases we used to replace with "*" if there was a mix of < and >, but did we simply decide not to do that any more? I guess so?

@drserajames
Copy link
Member Author

I don't remember the discussion.

I thought Racmacs converted all to numeric took the GMT if SD<1 (that's what I put in the summary document so I'll need to correct it). lispmds would have pre-processed the > to numeric and so would treat it as a case of < & numerics. I think either is fine because practically speaking for flu it's likely to be <10 and >1280 so the SD limit would make it * anyway.

@shwilks
Copy link
Member

shwilks commented Apr 24, 2023

In the current version Racmacs has a special rule to set any mix of < and > to *, which mirrors what acmacs did. We agreed the default sd would be NA so without that rule <10 and >1280 would become <2560 for the new default "conservative" method and around 113 for the "likelihood" method.

I suggest we keep the old Racmacs / acmacs rule and they become * with a warning.

@drserajames
Copy link
Member Author

I agree to with merging < & > to *. I can't remember what Derek said.

<10 and >1280 would actually merge to <5120.

@shwilks
Copy link
Member

shwilks commented Apr 24, 2023

ok, and yes true re <5120

@shwilks
Copy link
Member

shwilks commented Apr 24, 2023

I just pushed version 1.2.0 to shwilks/Racmacs which implements these merge changes, although no merge messages or warnings about it yet.

@drserajames
Copy link
Member Author

Thanks!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants