
returning warning and message or NA instead of "less than 10 years of data...try again" #35

Open
ryanpeek opened this issue Jan 15, 2020 · 7 comments

Comments

@ryanpeek
Contributor

evaluate_gage_alteration() (and associated functions) currently returns an error with the message "Less than 10 years of data...try again" if the gage has insufficient data. This is helpful, but it can also be a problem for batch processing: because it is a stop error, the user has to wrap the call in some sort of error handler (I'm using possibly(get_percentiles, NA_real_), but there are others too, like safely() and try()). If possible, it would be preferable to print the warning message in the user's console but return an actual NA value to the user's environment. Not sure of the best way to do this, but I can look.
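For reference, a minimal sketch of the batch workaround described above, using purrr::possibly() so one failing gage doesn't halt the whole loop (the gage IDs are illustrative, and get_percentiles() stands in for whichever ffcAPIClient call raises the stop error):

```r
library(purrr)

# Wrap the call so insufficient-data errors return NA_real_ instead of stopping.
safe_get_percentiles <- possibly(get_percentiles, otherwise = NA_real_)

gage_ids <- c("11427000", "11473900")  # illustrative gage IDs
results <- map(gage_ids, safe_get_percentiles)
```

safely() would be similar but returns a list with $result and $error per gage, which is handy when you also want to log which gages failed.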

@nickrsan
Member

I think my preferred response to this would be a parameter that:

  1. defaults to a stop error (or an equivalent error level), but
  2. can be switched to a warning + fill NA, or even a warning + run the data anyway

That way, people new to the package/process won't send data through and either think it's fine or just broken, but others can easily switch the behavior. Would something like this work for the workflow you're using? If not, I'm fine to adjust it, just wanted to note what I'm thinking here.
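One way that parameter could look (a sketch only — the parameter name and the 10-year check are hypothetical, not the actual ffcAPIClient code):

```r
# Hypothetical: on_insufficient_data is NOT a real parameter yet.
# "error" (default) keeps the current stop(); "warn_na" warns and returns NA;
# "warn_run" warns but processes the data anyway.
check_sufficient_data <- function(flow_data,
                                  on_insufficient_data = c("error", "warn_na", "warn_run")) {
  on_insufficient_data <- match.arg(on_insufficient_data)
  if (length(unique(flow_data$water_year)) < 10) {
    msg <- "Less than 10 years of data...try again"
    if (on_insufficient_data == "error") stop(msg)
    warning(msg)
    if (on_insufficient_data == "warn_na") return(NA)
  }
  flow_data  # sufficient data (or warn_run): proceed with processing
}
```

match.arg() makes "error" the default while letting batch users opt into the softer behaviors explicitly.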

@nickrsan
Member

Also, I'd be fine with a PR that switches the current behavior to warn + fill as the default for now, then we could leave this issue open until I'm able to get the other behaviors in place (and I'd still use the PR code for warn and fill). Just wanted to not stand in your way if this is a big thing limiting your ability to use evaluate_gage_alteration.

@ryanpeek
Contributor Author

I don't know...I also like defaulting to a stop error, since it's important to know about and should be the default. That's partly why I'm not sure it's as big a deal to fix this function in particular, if I can get batch processing to work with the other functions that feed into this one. Ultimately, I'd rather have the flexibility of batching a bunch of gages or comids and pulling just the percentiles, or just the metrics, as nice dataframes, than have everything lumped into one object that then needs to be split back out again (from a batch perspective). Let's talk some more about this.

@nickrsan
Member

OK, I think the FFCProcessor object could help a lot with this because it will save state, so it will know what has been run/what hasn't and be able to run only what's needed to provide results, even if you go to just retrieve a specific item. If you instantiated an FFCProcessor object for each gage, you'd then be able to call something like get_percentiles, and it will run everything it needs to and just return the percentiles to you. Agree on discussing a bit more on what's useful. Thanks!
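A rough sketch of what that per-gage pattern might look like — method names here are illustrative placeholders, not the actual FFCProcessor API:

```r
# Hypothetical usage: one FFCProcessor object per gage, which caches state
# so later calls only run the steps that haven't been run yet.
processor <- FFCProcessor$new()
processor$set_up(gage_id = 11427000, token = ffc_token)  # hypothetical setup call

# Would run any missing upstream steps, then return just the percentiles.
percentiles <- processor$get_percentiles()
```

The appeal is that batch code stays one object + one call per gage, with the dependency bookkeeping hidden inside the object.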

@ryanpeek
Contributor Author

@nickrsan Sorry to squish this in here...it's a bit broader and maybe worth a longer conversation:

In some senses it would be nice to be able to use these three functions and get back a dataframe from each (some of these already do this):

  • get_ffc_results_as_df(): would be nice to run this from the start (i.e., feed it the gage/comid instead of the big results list).
  • get_ffc_percentiles(): could be nice to have this work with a gage/comid OR a dataframe, so if you only wanted percentiles you could call just this one function and not worry about stepping through things.
  • evaluate_alteration(): would return, for a given gage/comid (and maybe year?), the little table showing + / - / indeterminate, as well as the plots if folks want to see them.

Currently we have to return the list, then convert to a dataframe, then use that to calc percentiles. From a batch processing standpoint, that puts more onus on the user. Which is fine if we show them the code to do these things.
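To make the batch angle concrete: if get_ffc_percentiles() accepted a gage ID directly (proposed above, not current behavior), batching would collapse to a single map call. A sketch under that assumption, with illustrative gage IDs:

```r
library(purrr)

gage_ids <- c("11427000", "11473900")  # illustrative

# Hypothetical: get_ffc_percentiles() taking a gage ID and returning a dataframe.
# possibly() drops failing gages (NULL rows are skipped by map_dfr).
percentiles_df <- map_dfr(gage_ids, function(g) {
  out <- possibly(get_ffc_percentiles, otherwise = NULL)(g)
  if (!is.null(out)) out$gage_id <- g  # tag rows with their source gage
  out
})
```

That's the list -> dataframe -> percentiles pipeline reduced to one step from the user's side.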

Secondary functions are nice but may not be primary to analysis:

  • get_drh : this is nice but I don't necessarily foresee needing it specifically when folks are analyzing data, aside from visualizing the hydrograph. This is already done.
  • get_flow_data: getting the flow data is nice as a standalone. This is already done.

@nickrsan
Member

This feels fine for where to discuss! Sounds like you're suggesting a set of new shortcuts that handle specific portions of the workflow, but give you spots to intervene before moving to the next? That'd be fine by me - maybe we make a new shortcuts.R file that contains code that supports workflows rather than doing data manipulation itself?

Currently we have to return the list, then convert to a dataframe, then use that to calc percentiles. From a batch processing standpoint, that puts more onus on the user. Which is fine if we show them the code to do these things.

Just to confirm, you're only referring to this workflow in a batch processing context, right? Because I think we already have better methods for people not batching. I agree that evaluate_alteration should probably change functionality from handling everything to specifically saying whether something is altered or not, and we'd provide another function that does the shortcut that handles everything for people.

@ryanpeek
Contributor Author

ryanpeek commented Jan 16, 2020 via email
