-
Notifications
You must be signed in to change notification settings - Fork 30
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Optimization #92
base: master
Are you sure you want to change the base?
Optimization #92
Conversation
@twallema feel free to test it out (maybe checkout to my branch to try); @jorisvandenbossche, there are still a lot of shortcuts and hardcoded elements, which requires further abstraction, so any input is welcome. We could actually make the |
src/covid19model/data/parameters.py
Outdated
@@ -54,27 +54,30 @@ def get_COVID19_SEIRD_parameters(): | |||
----------- | |||
parameters = get_COVID19_SEIRD_parameters() | |||
""" | |||
abs_dir = os.path.dirname(__file__) | |||
rel_dir = os.path.join(abs_dir, "../../../data/raw/model_parameters") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you do a merge of master, as I think this already changed
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, I did update this morning and merged with the adjustments Jenna did earlier.
|
||
# run model | ||
time = len(data.time) + lag_time # at least this length | ||
output = model.sim(time) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could also run with [-lag_time, time]
? (or does the model not work with negative time?)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The solve_ivp
from SciPy requires the user to define the times at which the model output should be returned, they can be negative without any problem. So assigning the time as [ -lag_time, time]
in the model output xarray, and assigning the time as [0 1 2 ... 7]
in the data xarray should allow for pairing without using values as suggested below. However, this does not really abstract the lag_time
. I'm also wondering if this approach will work if the times do not match, f.i. when using a discrete-time model with step-length 0.235, some form of extrapolation will be necessary to match outputs. I've seen xarray can be linked to SciPy interpolate: see http://xarray.pydata.org/en/stable/generated/xarray.DataArray.interp.html
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The default will be to only align those values where the time values match exactly, but you can indeed make this more flexible (eg http://xarray.pydata.org/en/stable/generated/xarray.DataArray.reindex.html also allows a simple "nearest" or a certain tolerance that should be allowed)
|
||
# define new parameters in model | ||
for i, parameter in enumerate(model_parameters): | ||
model.parameters.update({parameter: theta[i]}) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am wondering if we should introduce a concept of "cloning" a model, to avoid changing the original one when setting parameters like this for an optimization run
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I propose to define this in a separate issue/PR to provide a more generic solution.
output_to_compare = subset_output.sel(time=slice(lag_time, lag_time - 1 + len(data))) | ||
|
||
# calculate sse -> we could maybe enable more function options on this level? | ||
sse = np.sum((data.values - output_to_compare.values)**2) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we need to check that the "time" actually matched? (with the provided data) So if you adjust the time of the model output with lag_time, this calculation could be done with the xarray object (so without .values
), and then this alignment on the time dimension will be done automatically
data/README.md
fileThis PR is a blueprint and first attempt to provide an optimization functionality taking into account the
extra_time
orlagtime
concept requested by @twallema:mcmc.py
file to optimize, as this has not yet any mcmc functionality as functionslag_time
), but it is a first attempt to make the same sse function useable with fixed and optimized lag_time.See the example notebook
0.2-stijnvanhoey-refactor-optimization.ipynb
for an example of the usage, which I tried to keep close to the currentfit_pso
function.@twallema we should NOT have both
sse
andSSE
there and neither remain withfit
andfit_pso
but as current test and work in progress, I kept them both for the time being.Note I only worked on the sse optimiziation as such and not on the MCMC part