BUGFIX avg garden size #85
…e to properties with 'unknown' MSOA property type
makes more sense! The code ran fine for me with
epc_df["msoa_avg_outdoor_space_property_type"].unique()
shape: (3,)
Series: 'msoa_avg_outdoor_space_property_type' [str]
[
"Houses"
"unknown"
"Flats"
]
…'house' removed from property_type strings
@lizgzil this is ready for another review. I had to change the way the function identifies houses. This function is used in the I have tested the
hey @crispy-wonton - I just ran:
and got
That's the input EPC data file. To generate
Results:
@crispy-wonton thanks! This looks good! I got your results. My previous comment was due to some confusion after having not pulled the branch properly!!
Fixes #84
Please could you double check the reasoning in the issue description and that the bugfix is appropriate.
I have tested and it works. If you want to test it run:
python -i asf_heat_pump_suitability/pipeline/run_scripts/run_add_features.py --epc_path s3://asf-heat-pump-suitability/outputs/2023Q4/20240824_2023_Q4_EPC_weighted.parquet -y 2023 -q 4
I would advise commenting out everything from line 106 onwards in run_add_features.py before running, then checking the output of epc_df in the terminal. This will be much quicker.
Outputs from my test:
We can see the categories of this feature (msoa_avg_outdoor_space_property_type) now match between the supplementary dataset and the epc_df EPC dataset.
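For anyone wanting to repeat the category check without eyeballing the output, here is a minimal, library-free sketch. It is not the pipeline code: the helper name unmatched_categories and the supplementary labels are assumptions for illustration, and the real check would read the unique values from the two DataFrames rather than from hard-coded lists.

```python
def unmatched_categories(epc_values, supplementary_values):
    """Return the category labels that appear in only one of the two datasets."""
    return set(epc_values) ^ set(supplementary_values)


# Values reported by epc_df["msoa_avg_outdoor_space_property_type"].unique()
epc_values = ["Houses", "unknown", "Flats"]

# Assumed labels for the supplementary dataset (illustrative only)
supplementary_values = ["Flats", "Houses", "unknown"]

# An empty set means every label in one dataset has a counterpart in the other
mismatches = unmatched_categories(epc_values, supplementary_values)
assert not mismatches, f"Category mismatch: {sorted(mismatches)}"
```

If a label such as a lower-case 'house' were still present in only one dataset, it would show up in the returned set and the assertion would fail with the offending labels listed.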