
returning warning and message or NA instead of "less than 10 years of data...try again" #35

Open
ryanpeek opened this issue Jan 15, 2020 · 7 comments

Comments

@ryanpeek
Contributor

evaluate_gage_alteration() (and associated functions) currently returns an error with the message "Less than 10 years of data...try again" if the gage has insufficient data. This is helpful, but it can also be a problem for batch processing: because it is a stop error, the user has to wrap the call in some sort of error handler (I'm using possibly(get_percentiles, NA_real_), but there are others too, like safely() and try()). If possible, it would be preferable to print the warning message in the user's console but return an actual NA value to the user's environment. Not sure of the best way to do this, but I can look.
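For reference, a minimal sketch of the batch workaround described above, using purrr::possibly() so one failing gage doesn't halt the whole loop (the gage IDs are illustrative, and get_percentiles() stands in for whichever ffcAPIClient call raises the stop error):

```r
library(purrr)

# Wrap the call so insufficient-data errors return NA_real_ instead of stopping.
safe_get_percentiles <- possibly(get_percentiles, otherwise = NA_real_)

gage_ids <- c("11427000", "11473900")  # illustrative gage IDs
results <- map(gage_ids, safe_get_percentiles)
```

safely() would be similar but returns a list with $result and $error per gage, which is handy when you also want to log which gages failed.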

@nickrsan
Member

I think my preferred response to this would be a parameter that:

  1. defaults to a stop error (or an equivalent error level), but
  2. can be switched to a warning + fill NA, or even a warning + run the data anyway

That way, people new to the package/process won't send data through and either think it's fine or just broken, but others can easily switch the behavior. Would something like this work for the workflow you're using? If not, I'm fine to adjust it, just wanted to note what I'm thinking here.
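One way that parameter could look (a sketch only — the parameter name and the 10-year check are hypothetical, not the actual ffcAPIClient code):

```r
# Hypothetical: on_insufficient_data is NOT a real parameter yet.
# "error" (default) keeps the current stop(); "warn_na" warns and returns NA;
# "warn_run" warns but processes the data anyway.
check_sufficient_data <- function(flow_data,
                                  on_insufficient_data = c("error", "warn_na", "warn_run")) {
  on_insufficient_data <- match.arg(on_insufficient_data)
  if (length(unique(flow_data$water_year)) < 10) {
    msg <- "Less than 10 years of data...try again"
    if (on_insufficient_data == "error") stop(msg)
    warning(msg)
    if (on_insufficient_data == "warn_na") return(NA)
  }
  flow_data  # sufficient data (or warn_run): proceed with processing
}
```

match.arg() makes "error" the default while letting batch users opt into the softer behaviors explicitly.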

@nickrsan
Member

Also, I'd be fine with a PR that switches the current behavior to warn + fill as the default for now, then we could leave this issue open until I'm able to get the other behaviors in place (and I'd still use the PR code for warn and fill). Just wanted to not stand in your way if this is a big thing limiting your ability to use evaluate_gage_alteration.

@ryanpeek
Contributor Author

I don't know...I also like defaulting to a stop error, since it's important to know about and should be the default. That's partly why I'm not sure it's as big a deal to fix this function in particular, if I can get batch processing to work with the other functions that feed into this one. Ultimately, I'd rather have the flexibility of batching a bunch of gages or comids and pulling just the percentiles, or just the metrics, as nice dataframes, than have everything lumped into one object that then needs to be split back out again (from a batch perspective). Let's talk some more about this.

@nickrsan
Member

OK, I think the FFCProcessor object could help a lot with this because it will save state, so it will know what has been run/what hasn't and be able to run only what's needed to provide results, even if you go to just retrieve a specific item. If you instantiated an FFCProcessor object for each gage, you'd then be able to call something like get_percentiles, and it will run everything it needs to and just return the percentiles to you. Agree on discussing a bit more on what's useful. Thanks!
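A rough sketch of what that per-gage pattern might look like — method names here are illustrative placeholders, not the actual FFCProcessor API:

```r
# Hypothetical usage: one FFCProcessor object per gage, which caches state
# so later calls only run the steps that haven't been run yet.
processor <- FFCProcessor$new()
processor$set_up(gage_id = 11427000, token = ffc_token)  # hypothetical setup call

# Would run any missing upstream steps, then return just the percentiles.
percentiles <- processor$get_percentiles()
```

The appeal is that batch code stays one object + one call per gage, with the dependency bookkeeping hidden inside the object.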

@ryanpeek
Contributor Author

@nickrsan Sorry to squish this in here...it's a bit broader and maybe worth a longer conversation:

In some senses it would be nice to be able to use these three functions and get back a dataframe from each (some of these already do this):

  • get_ffc_results_as_df(): would be nice to run this from the start (i.e., feed it the gage/comid instead of the big results list).
  • get_ffc_percentiles(): could be nice to have this work with a gage/comid OR a dataframe, so if you only wanted percentiles you could call just this one function and not worry about stepping through things.
  • evaluate_alteration(): would return, for a given gage/comid (and maybe year?), the little table showing + / - / indeterminate, as well as the plots if folks want to see them.

Currently we have to return the list, then convert to a dataframe, then use that to calc percentiles. From a batch processing standpoint, that puts more onus on the user. Which is fine if we show them the code to do these things.
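To make the batch angle concrete: if get_ffc_percentiles() accepted a gage ID directly (proposed above, not current behavior), batching would collapse to a single map call. A sketch under that assumption, with illustrative gage IDs:

```r
library(purrr)

gage_ids <- c("11427000", "11473900")  # illustrative

# Hypothetical: get_ffc_percentiles() taking a gage ID and returning a dataframe.
# possibly() drops failing gages (NULL rows are skipped by map_dfr).
percentiles_df <- map_dfr(gage_ids, function(g) {
  out <- possibly(get_ffc_percentiles, otherwise = NULL)(g)
  if (!is.null(out)) out$gage_id <- g  # tag rows with their source gage
  out
})
```

That's the list -> dataframe -> percentiles pipeline reduced to one step from the user's side.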

Secondary functions are nice but may not be primary to analysis:

  • get_drh : this is nice but I don't necessarily foresee needing it specifically when folks are analyzing data, aside from visualizing the hydrograph. This is already done.
  • get_flow_data: getting the flow data is nice as a standalone. This is already done.

@nickrsan
Member

This feels fine for where to discuss! Sounds like you're suggesting a set of new shortcuts that handle specific portions of the workflow, but give you spots to intervene before moving to the next? That'd be fine by me - maybe we make a new shortcuts.R file that contains code that supports workflows rather than doing data manipulation itself?

Currently we have to return the list, then convert to a dataframe, then use that to calc percentiles. From a batch processing standpoint, that puts more onus on the user. Which is fine if we show them the code to do these things.

Just to confirm, you're only referring to this workflow in a batch processing context, right? Because I think we already have better methods for people not batching. I agree that evaluate_alteration should probably change functionality from handling everything to specifically saying whether something is altered or not, and we'd provide another function that does the shortcut that handles everything for people.

@ryanpeek
Contributor Author

ryanpeek commented Jan 16, 2020 via email
