thin plate spline bias solving #9

Open
tylerni7 opened this issue Sep 10, 2021 · 4 comments
Labels
enhancement New feature or request

Comments

@tylerni7
Owner

Bias solving right now is trash. The thin-plate-spline model from mgnute was great and we should add that back in.

@MGNute MGNute added the enhancement New feature or request label Jul 13, 2023
@MGNute
Collaborator

MGNute commented Jul 13, 2023

In truth, the bias should really be stored and retrieved when the VTECs are calculated, with a separate process that periodically analyzes the biases for consistency and updates them based on some assumed amount of drift. In reality they don't drift much. For the GPS satellites at least, the NOAA ionosphere maps provide bias values that were pretty close to ours (last I looked, anyway). One option would be to store these values in something like a MySQL database (maybe in the lookup_tables subfolder) and create a separate routine to estimate them and QC the estimates periodically. Bias estimation tends to be far more computationally intensive than any other step, so this would reduce the compute time substantially. Thoughts?

@tylerni7
Owner Author

I'm a fan of something along these lines in some cases.
I think it would probably be a separate process which runs and produces a bias file/database/whatever. If it exists, it gets used. If not, the expensive operation is done to calculate the biases. I don't think this makes sense to maintain for every (station, day) and (satellite, day). But maybe something that keeps less than a few days' worth of data is reasonable.

The NOAA biases are nice, but I don't think they have the stations, and we need the biases of both. I'm definitely not opposed to using published values when we can. I also don't know how often they are published (we want to be able to get values that are current, not "last week's" biases or whatever).

MySQL doesn't really make sense here to me. Even a lighter thing like SQLite is probably overkill. I don't think it is something that we would want to commit to git, because it's ever growing and a lot of data. It might make more sense to stick with a native format in the already shared cache type folder. Maybe like h5 or something. It's a little less flexible, but we're already using things like that so no extra dependencies and things.
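The "if it exists, it gets used, otherwise compute it" flow described above could be sketched roughly as follows. This is only an illustration: the function name, the file name, and the bias values are hypothetical, and JSON is used purely to keep the sketch dependency-free (the thread leans toward an h5 file in the shared cache folder).

```python
import json
import tempfile
from pathlib import Path
from typing import Callable, Dict

# Hypothetical type: receiver or satellite id -> bias estimate.
Biases = Dict[str, float]


def load_or_compute_biases(cache_file: Path, compute: Callable[[], Biases]) -> Biases:
    """Return cached biases if the file exists, else run the expensive solve."""
    if cache_file.exists():
        return json.loads(cache_file.read_text())
    biases = compute()
    cache_file.parent.mkdir(parents=True, exist_ok=True)
    cache_file.write_text(json.dumps(biases))
    return biases


calls = []


def expensive_solve() -> Biases:
    # Stand-in for the real bias estimation; the ids and values are made up.
    calls.append(1)
    return {"G01": -3.2, "station_a": 11.7}


with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "cache" / "biases_2023-07-14.json"
    first = load_or_compute_biases(path, expensive_solve)   # computes and stores
    second = load_or_compute_biases(path, expensive_solve)  # served from the cache
```

The second call never reaches `expensive_solve`, which is the compute saving being discussed. A separate estimation process would only need to write the same file.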

@MGNute
Collaborator

MGNute commented Jul 14, 2023

> I'm a fan of something along these lines in some cases. I think it would probably be a separate process which runs and produces a bias file/database/whatever. If it exists, it gets used. If not, the expensive operation is done to calculate the biases. I don't think this makes sense to maintain for every (station, day) and (satellite, day). But maybe something that keeps less than a few days' worth of data is reasonable.

Ya, I was thinking something along these lines. I do actually think there is some value in storing more historical data on the bias estimates, which could be used for optimizing and for QC, though that's a separate point. It could be pretty easy to create a command to clear, say, all but the most recent estimates. But yes, this structure makes sense.
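A command to clear all but the most recent estimates might look something like this minimal sketch. It assumes one file per day with an ISO date in the name; the `biases_*.json` naming and the `prune_old_estimates` helper are hypothetical, not part of the project.

```python
import tempfile
from pathlib import Path
from typing import List


def prune_old_estimates(cache_dir: Path, keep: int = 3) -> List[Path]:
    """Delete all but the `keep` most recent bias files; return what was removed."""
    # ISO dates in the (hypothetical) file names sort chronologically as strings.
    files = sorted(cache_dir.glob("biases_*.json"))
    stale = files[:-keep] if keep > 0 else files
    for f in stale:
        f.unlink()
    return stale


with tempfile.TemporaryDirectory() as tmp:
    cache_dir = Path(tmp)
    for day in ("2023-07-10", "2023-07-11", "2023-07-12", "2023-07-13", "2023-07-14"):
        (cache_dir / f"biases_{day}.json").write_text("{}")
    removed = [f.name for f in prune_old_estimates(cache_dir, keep=3)]
    remaining = sorted(f.name for f in cache_dir.glob("biases_*.json"))
```

Keeping `keep` as a parameter leaves room for retaining a longer history for QC.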

> The NOAA biases are nice, but I don't think they have the stations, and we need the biases of both. I'm definitely not opposed to using published values when we can. I also don't know how often they are published (we want to be able to get values that are current, not "last week's" biases or whatever).

That's true, they don't have stations. And yes, we wouldn't want to use those; I only mentioned them offhand because the bias values are relatively stable over time...

> MySQL doesn't really make sense here to me. Even a lighter thing like SQLite is probably overkill. I don't think it is something that we would want to commit to git, because it's ever growing and a lot of data. It might make more sense to stick with a native format in the already shared cache type folder. Maybe like h5 or something. It's a little less flexible, but we're already using things like that so no extra dependencies and things.

My bad, I meant SQLite, although I'm fine with using h5 since that's what we're already using and keeping it in the cache folder. But if we're good on those details I can create a branch and start thinking about how that would work.

@tylerni7
Owner Author

The NOAA thing might still be useful. I kind of imagine some sort of Kalman filter-esque thing where we can get potentially slow but highly reliable data from NOAA or a more expensive operation like a full Thin Plate Spline model, and then use simpler things to model short term drift. That might be overkill depending on how the biases actually change over time though!

But yeah overall something like this seems useful to me
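To make the "Kalman filter-esque" idea a bit more concrete, here is a minimal scalar sketch. It assumes a single bias modelled as a slow random walk, with an occasional low-noise measurement (standing in for NOAA values or a full Thin Plate Spline solve) and more frequent, noisier cheap estimates. The class name and every number are illustrative assumptions, not values from the project.

```python
from dataclasses import dataclass


@dataclass
class BiasFilter:
    """Scalar Kalman filter for one bias, modelled as a slow random walk."""
    estimate: float           # current bias estimate
    variance: float           # uncertainty of that estimate
    drift_var_per_day: float  # assumed process noise (variance added per day)

    def predict(self, days: float) -> None:
        # Random-walk model: the estimate stays put, the uncertainty grows.
        self.variance += self.drift_var_per_day * days

    def update(self, measurement: float, meas_var: float) -> None:
        # Standard scalar Kalman update: weight the measurement by its reliability.
        gain = self.variance / (self.variance + meas_var)
        self.estimate += gain * (measurement - self.estimate)
        self.variance *= (1.0 - gain)


f = BiasFilter(estimate=0.0, variance=100.0, drift_var_per_day=0.01)
f.update(10.0, meas_var=0.25)  # slow but reliable source
after_reliable = f.estimate
f.predict(days=1.0)            # a day passes, uncertainty grows slightly
f.update(12.0, meas_var=4.0)   # cheap, noisier short-term estimate
```

Because the noisy measurement carries a large `meas_var`, it only nudges the estimate, while the reliable source dominates. If the biases barely drift, `drift_var_per_day` would be tiny and this reduces to reusing the stored value, which matches the "might be overkill" caveat.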
