ENH: add projections on the level of individuals #9

dkopasker · 2022-11-23T15:54:21Z

Here is a data sample for the individual-level dataset.

heed_mortality data sample_23nov22dk.xlsx

vkhodygo · 2022-11-23T17:41:46Z

vkhodygo · 2022-11-24T17:05:40Z

@dkopasker I'd like to ask you a favour: please do not distribute data in *.xlsx files unless the file contains some specific formatting you can't do without. It's a relatively minor inconvenience, but still.

I also think we could drop some data. We don't actually need the prob_death column, since it has too many empty entries. In addition, if this column is the negation of survive, some of its values should be present but are missing, and the two columns would be correlated with each other anyway. Reducing the number of dimensions is always a good thing.

Other than that, we group the data by hh_id and count the number of entries in each group. That gives the number of people per household, and the sum of these counts gives the population/sample size. What are my next steps?
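The grouping by hh_id could be sketched as follows. This is a minimal illustration using only the Python standard library; the hh_id values are made up, and it assumes the dataset has exactly one row per individual.

```python
from collections import Counter

# Hypothetical sample: one hh_id per individual (one row per person).
hh_ids = ["h1", "h1", "h2", "h3", "h3", "h3"]

# Number of people in each household.
household_sizes = Counter(hh_ids)

# Summing the household sizes recovers the population/sample size.
sample_size = sum(household_sizes.values())

# Distribution of household sizes: size -> number of households.
size_distribution = Counter(household_sizes.values())
```

The size distribution is the quantity that would most naturally be compared with published household counts.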

dkopasker · 2022-11-28T10:11:53Z

The prob_death column is intended to be populated by estimates from the mortality data you have formed. I think it is worth keeping to error check the code. This can also be checked against counts from the survive column. Having ways to check the data and code is more important, at least at this stage, than reducing dimensions.
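One way such a cross-check might look is sketched below. It rests on assumptions not stated in the thread: that survive is coded 1 for a survivor and 0 otherwise, and that rows with an empty prob_death are skipped. The sample rows and the `check_mortality` helper are hypothetical.

```python
# Hypothetical rows: (prob_death, survive). survive is ASSUMED to be 1 if the
# person survived the period and 0 otherwise; None marks an empty prob_death.
rows = [(0.10, 1), (0.20, 1), (0.30, 0), (0.20, 1), (0.20, 0), (None, 1)]

def check_mortality(rows, tolerance):
    """Compare expected deaths (sum of prob_death) with observed deaths."""
    # Keep only rows where prob_death has been populated.
    complete = [(p, s) for p, s in rows if p is not None]
    expected_deaths = sum(p for p, _ in complete)
    observed_deaths = sum(1 - s for _, s in complete)
    within = abs(expected_deaths - observed_deaths) <= tolerance
    return expected_deaths, observed_deaths, within

expected, observed, ok = check_mortality(rows, tolerance=1.5)
```

The tolerance here is arbitrary; on a real sample it would need to reflect the expected sampling variation.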

The next steps are to assign hh_id to your mortality model such that the relevant sums equal estimates from external data.

vkhodygo · 2022-11-28T14:37:12Z

> The next steps are to assign hh_id to your mortality model such that the relevant sums equal estimates from external data.

Can I get this data first? At least for Wales/NI as they are relatively small.

dkopasker · 2022-11-28T15:32:04Z

Various relevant datasets are available here:
https://www.ons.gov.uk/peoplepopulationandcommunity/birthsdeathsandmarriages/families/bulletins/familiesandhouseholds/2020

vkhodygo · 2022-11-28T17:28:46Z

Those are relevant, but far from complete. I can make some guesses and use those aggregates, but I have no knowledge of the actual household composition. We can easily assume that a one-person household consists of a single person aged 18 or above. For two people it becomes considerably more difficult:

a couple;
a parent and a kid;
a parent and an adult kid.

This is clearly a case of combinatorial explosion, which requires some external constraints to be introduced.
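To give a rough sense of the growth, the sketch below counts unordered household compositions as multisets of person types. The 36 person types (2 sexes by 18 five-year age bands) are an assumption made purely for illustration, and `count_compositions` is a hypothetical helper.

```python
from math import comb

def count_compositions(household_size: int, n_categories: int) -> int:
    """Count multisets of the given size drawn from n_categories person types."""
    return comb(household_size + n_categories - 1, household_size)

# ASSUMED for illustration: 2 sexes x 18 five-year age bands = 36 person types.
n_categories = 36

# Number of possible compositions for households of 1 to 5 people.
counts = {size: count_compositions(size, n_categories) for size in range(1, 6)}
```

The count grows quickly with household size even before implausible combinations are ruled out, which is where external constraints would come in.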

What I meant was something similar to what you had provided originally.

dkopasker · 2022-11-29T09:31:27Z

"Households by type of household and family, regions of England and GB constituent countries" gives more detail on household composition. Beyond this you can make and document assumptions.

ONS data is usually the best quality for population-level statistics, but you could look for other data sources.

vkhodygo changed the title ~~Individual-level dataset~~ ENH: add projections on the level of individuals Nov 23, 2022

vkhodygo self-assigned this Nov 23, 2022

vkhodygo added the enhancement and question labels Nov 23, 2022

vkhodygo assigned rachelmthomson and dkopasker Nov 24, 2022
