Port Jingjing's backcasting preprocessing utils #114

Draft: wants to merge 19 commits into base: dev

brookslogan (Contributor) commented Jun 21, 2022

Resolves #88. Might involve #49, #90, #106, #109.

Progress so far:

  • Ported over some of Jingjing's preprocessing functions and tests.
  • Got the tests running in a package environment.

Some TODOs:

  • Get these working with epiprocess classes (taking care, since the current functions work on a single geo/epigroup at a time).
  • Work on either the implied lag (`version - time_value`) or an explicit lag/lag-like column.
  • Check that we add NAs (or change things so that we do, either always or optionally) when getting/filling lag training data if it looks like we missed recording a version. (It's okay if there is no update for a particular observation in a target version, but it's a problem if there are no updates for any observations in a target version.) Unless there are bugs from assuming that versions are evenly spaced, this is a separate convenience function/arg to think about independently.
  • Change `fill_rows` and `fill_missing_updates` to use a mix of last-version-carried-forward and NA/0/customizable fill-in, determined by the archive's [check out `$fill_through_version`; may or may not be useful]
  • Add examples
  • Use `Abort`, etc., rather than `stop`
  • Think about interaction with `epix_slide`. The straightforward approach is probably to combine this with #49 ("Should slide() for epi_archive be given access to less than the most up-to-date snapshots?"). We might also think about turning the time- and version-lag covariate dfs into a custom type of epi_archive (the base epi_archive would be pretty inefficient though; a lazy merge might be one way to improve this). Just do it the straightforward way to begin with, then profile.
  • [Use alternative to zoo::rollmeanr. Maybe data.table::frollmean?]
  • [Check compatibility of "shift" terminology with tidymodels, which is used by epipredict. (Might be incompatible with some of our production COVID-19 hospitalization forecasters, but that might be fine.)]
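
To make the fill-in TODOs above concrete, here is a rough, language-neutral sketch in Python (not the R code in this PR). It assumes daily, evenly spaced versions, which is exactly the assumption flagged above. The names `locf_fill_versions` and `versions_without_updates` are hypothetical and do not correspond to functions in the package. The first carries the last reported value forward across versions for one observation; the second lists versions in which no observation received any update, which is the case that suggests a missed recording.

```python
from datetime import date, timedelta


def locf_fill_versions(updates, first_version, last_version, fill=None):
    """Last-version-carried-forward fill for ONE observation.

    `updates` maps version date -> reported value. Versions before the
    first report get `fill` (None stands in for NA here; this default is
    an assumption, and 0 or another value could be passed instead).
    Assumes one version per day.
    """
    filled = {}
    last_seen = fill
    version = first_version
    while version <= last_version:
        if version in updates:
            last_seen = updates[version]
        filled[version] = last_seen
        version += timedelta(days=1)
    return filled


def versions_without_updates(updates_by_obs, first_version, last_version):
    """Versions in which NO observation at all was updated.

    `updates_by_obs` maps an observation key to its `updates` dict.
    """
    seen = set()
    for updates in updates_by_obs.values():
        seen.update(updates)
    missing = []
    version = first_version
    while version <= last_version:
        if version not in seen:
            missing.append(version)
        version += timedelta(days=1)
    return missing
```

A version with no update for one observation is simply carried forward; a version returned by `versions_without_updates` is a candidate for NA fill-in instead.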

jingjtang and others added 6 commits June 14, 2022 10:26
Get lag-completion/target-lag function tests working in package setting. Before
this change, `ref_lag` was a global set in a not-yet-package file and also the
test file, which, when converted into a package, did not pass tests, as the
tests' `ref_lag` global doesn't overwrite the pre-existing value from the
package.
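
The scoping problem described in this commit message can be illustrated with a loose Python analogue (R package namespaces differ in detail, so treat this only as a sketch). The module `pkg`, the function names, and the values 60 and 7 are all hypothetical. A function defined inside a package resolves `ref_lag` in the package's own globals, so a same-named global in the test file is never seen; passing the value as an argument avoids the issue.

```python
import types

# Stand-in for the package: a module with its own global `ref_lag`.
pkg = types.ModuleType("pkg")
exec(
    "ref_lag = 60\n"
    "def is_finalized(lag):\n"
    "    # Looks up `ref_lag` in the package's globals.\n"
    "    return lag >= ref_lag\n"
    "def is_finalized_explicit(lag, ref_lag):\n"
    "    # Takes `ref_lag` as an argument instead of a global.\n"
    "    return lag >= ref_lag\n",
    pkg.__dict__,
)

# Stand-in for the test file: this creates a separate global and does
# not overwrite the package's `ref_lag`.
ref_lag = 7

result_global = pkg.is_finalized(10)                      # compares against 60
result_explicit = pkg.is_finalized_explicit(10, ref_lag)  # compares against 7
```

Here `result_global` is `False` while `result_explicit` is `True`, which mirrors why tests written against a test-file global can fail once the code lives in a package.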
@brookslogan brookslogan self-assigned this Jun 21, 2022
@brookslogan brookslogan changed the title Lcb/port jingjing backcasting preprocessing utils Port Jingjing's backcasting preprocessing utils Jun 21, 2022
@dsweber2 dsweber2 added this to the Epiprocess Issue Triage milestone Dec 6, 2023
Development

Successfully merging this pull request may close these issues.

Add functions that will enable backcasters to be built on top
5 participants