ELEX-3469 save aggregate predictions to s3 #100
…have them written out to s3, with command line argument
…use it won't work if get_estimands() hasn't been called yet
I'm not opposed to moving the call for self.model.get_national_summary_estimates(None, None, 0, 0.99) to within get_estimates. At the moment it is executed outside the model, which then calls get_national_summary_votes_estimates in the client. But if we want to make this change we need to:
- Talk to the live team, because this changes how they call the aggregate model
- Make corresponding changes in the testbed
- Make sure we pass through the relevant arguments.
Yes! This is great, I was wondering about all of this 🎉 I moved it because |
Also I just realized this is PR 100 😄 🎉 💯 |
Alright so the feedback from Jen is please don't change anything 😂 So I'll add the method back in and I have some ideas for how to accomplish this... 🤔 |
…e making sure the results get written where they need to be written
@lennybronner alright I just pushed some changes that I think should accomplish this and preserve the current |
I'm not sure I understand why so much change was necessary? Why can't we just write the data to s3 from get_national_summary_votes_estimates?
We could, but,
I think those were my main thoughts on this. I'll take a look again to see what can be simplified, but, 🤔 |
I don't really like re-running the national summary estimates, I'm not sure we've thought about whether something changes. I understand why you just return the national summary estimates as they exist now, but that only runs the aggregate model once per instantiation -- which we do not want. Maybe we add a parameter to |
Ok, I just pushed a possible solution for this based on this comment and the one you have on the code 🙌🏻 The thinking is this:
The other option is to re-compute the national summary votes every time |
Although either way, I just realized if you call that method and you're calling |
… options to write output data in get_national_summary_votes_estimates(), don't write out all the data all over again if we don't have to
… own version of that
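For readers following the thread, here is a minimal sketch of the pattern discussed above: recompute the national summary on every call, and only write it out when asked. The class name, the `write_output` parameter, the `writer` callable, and the placeholder values are all hypothetical; this is not the code in the PR, and the real method takes arguments not shown here.

```python
class AggregateModelSketch:
    """Hypothetical stand-in for the model client; not the real implementation."""

    def __init__(self, writer=None):
        # `writer` is an assumed callable that persists results (e.g. to s3).
        self.writer = writer
        self.compute_calls = 0

    def _compute(self):
        # Placeholder for running the aggregate model; values are made up.
        self.compute_calls += 1
        return {"margin": [0.0, -1.0, 1.0]}

    def get_national_summary_votes_estimates(self, write_output=False):
        # Recompute on every call rather than caching once per instantiation.
        estimates = self._compute()
        # Only persist when the caller opts in, so data isn't rewritten needlessly.
        if write_output and self.writer is not None:
            self.writer(estimates)
        return estimates


written = []
model = AggregateModelSketch(writer=written.append)
model.get_national_summary_votes_estimates()                   # no write
model.get_national_summary_votes_estimates(write_output=True)  # one write
```

Keeping the write behind an explicit flag leaves the return value unchanged for existing callers.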
Ok @lennybronner take a look at the changes I just pushed. It borrows some logic / variables from |
This generally looks good now, thanks so much for figuring this out. I left one small question. This will necessitate making changes to the model testbed, so please make those (and then run the 2020 election through the testbed to make sure it all still works as expected). Also, the live team will need to make small changes to how they interact with the national summary data, so as long as they are aware of this, this seems fine. Where does re-calling |
Thanks! 🎉
|
(1) and (2) The structure of the response of (3) that's what we want. |
(3) Great! 🎉 (1) and (2), no, {"margin" : [agg_pred, agg_lower, agg_upper]} I made sure to preserve that. When writing to s3, I'm currently converting that to a CSV but that can easily be a JSON or whatever is best 🤔 |
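As a rough illustration of the conversion mentioned here, the sketch below turns a response shaped like {"margin" : [agg_pred, agg_lower, agg_upper]} into CSV text using only the standard library. The function name, the column headers, and the example numbers are assumptions for illustration; the PR's actual CSV layout and the s3 upload step are not shown in this thread.

```python
import csv
import io


def national_summary_to_csv(summary):
    """Serialize {estimand: [pred, lower, upper]} to CSV text.

    Column names are hypothetical; the thread does not specify them.
    """
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["estimand", "agg_pred", "agg_lower", "agg_upper"])
    for estimand, (pred, lower, upper) in summary.items():
        writer.writerow([estimand, pred, lower, upper])
    return buffer.getvalue()


# Made-up numbers, only to show the shape of the data.
example = {"margin": [0.042, 0.011, 0.073]}
csv_text = national_summary_to_csv(example)
```

Switching to JSON instead would be a single `json.dumps(summary)` call, since the structure is already a plain dict of lists.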
amazing, thank you! And just to confirm, you've run this from the testbed (including writing to s3) and it works? |
Thanks!! I will test that now! 🤔 |
@lennybronner ok, so this testbed command works as we'd expect it to:
🎉 but for writing to s3 from the testbed, there's no way to do that via the CLI currently, although I could add the options. And if I did, it would just write through |
Yes, this is why I meant this needs a corresponding PR in the testbed. It's ok to overwrite the predictions produced in dev in s3. |
Oh sorry, I misunderstood! Ok, I'll work on that now 😅 👍🏻 |
@lennybronner this is ready for review 🎉 washpost/elex-live-model-testbed#25 |
…et_national_vote_summary_estimates() and running pre-commit
Great, and now just to confirm, you ran this with your testbed branch?
Yup! 🎉 |
Description
Hi! The changes in this PR allow us to save aggregate model output (national summary estimates) to s3. It also provides a new CLI argument, --national_summary, which will produce aggregate model output via CLI 🎉 Hopefully I did this right 😬 Thanks!
Jira Ticket
ELEX-3469
Test Steps
Set APP_ENV=dev and DATA_ENV=dev, then run any command with --national_summary, e.g.: