Feature/onspd header checks and db flag #33
55a1ada to 05992f7
def __init__(
    self, stdout=None, stderr=None, no_color=False, force_color=False
):
    super().__init__(stdout, stderr, no_color, force_color)
    self.derived_fields = ["location"]
Given we don't do anything with stdout, stderr, no_color or force_color in this function, you could make this:

def __init__(self, *args, **kwargs):
    super().__init__(*args, **kwargs)
    self.derived_fields = ["location"]

...and then if the function in the base class changes signature, we don't need to account for it here.
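A minimal sketch of why the forwarding version is more robust. `FakeBaseCommand` is a hypothetical stand-in written for this example (it only mirrors the four keyword arguments shown in the diff above); it is not the real base class.

```python
class FakeBaseCommand:
    """Hypothetical stand-in for the base class; only mirrors its signature."""

    def __init__(self, stdout=None, stderr=None, no_color=False, force_color=False):
        self.stdout = stdout
        self.stderr = stderr
        self.no_color = no_color
        self.force_color = force_color


class Command(FakeBaseCommand):
    def __init__(self, *args, **kwargs):
        # Forward everything untouched; only add what this subclass needs.
        super().__init__(*args, **kwargs)
        self.derived_fields = ["location"]


# Keyword arguments still reach the base class without being named here.
cmd = Command(no_color=True)
```

If the base class later gains or renames a parameter, `Command.__init__` does not need to change.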
for field in sorted(unexpected_fields):
    error_msg.append(f" + {field}")
error_msg.append(
    "This probably means ONSPD has changed they're csv format and we need to update our import command."
they're --> their
Also, if this happens then really the thing we need to update is our model, isn't it?
cmd.stdout = StringIO()  # Suppress output

# Import data to default database
opts = {"data_path": csv_path, "database": DEFAULT_DB_ALIAS}
It would be nice if we could test that this is the default behaviour if we don't explicitly supply the database param at all. I guess calling the command like this we have to specify every param. This might be a situation where testing via call_command() might be a good plan.
https://docs.djangoproject.com/en/4.2/ref/django-admin/#running-management-commands-from-your-code
Using call_command() does prevent us from doing something like mocking out one of the internal functions of our management command class, but if all we want to do is suppress/capture stdout/stderr, call_command() allows us to do this.
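The reason this matters can be sketched with the standard library alone. Passing an options dict straight to the command bypasses the argument parser, so defaults are never exercised; going through the parser (which is what call_command() does) fills them in. The sketch below uses plain `argparse` as a stand-in for the command's parser, and `build_parser`, `run` and the `/tmp/onspd` path are hypothetical names made up for illustration.

```python
import argparse
from io import StringIO

DEFAULT_DB_ALIAS = "default"  # same value as django.db.DEFAULT_DB_ALIAS


def build_parser():
    # Hypothetical stand-in for the parser the import command builds.
    parser = argparse.ArgumentParser(prog="import_onspd")
    parser.add_argument("--data-path", required=True)
    parser.add_argument("--database", default=DEFAULT_DB_ALIAS)
    return parser


def run(argv, stdout):
    # Going through the parser fills in defaults for omitted options.
    opts = vars(build_parser().parse_args(argv))
    stdout.write(f"importing into {opts['database']}\n")
    return opts


out = StringIO()  # capture output instead of printing it
opts = run(["--data-path", "/tmp/onspd"], stdout=out)
```

With Django itself, the equivalent test would be roughly `call_command("import_onspd", data_path=csv_path, stdout=StringIO())`, assuming the option names used in the test above.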
def add_arguments(self, parser):
    super().add_arguments(parser)

    parser.add_argument(
        "--header",
        help="Specify which header the csv has",
        default="aug2022",
        choices=["may2018", "aug2022"],
    )
exactly!
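For reference, a small sketch of how the `--header` argument above behaves, using plain `argparse` in place of the command's parser. The value `nov2019` is a made-up example of an unsupported header, not a real option.

```python
import argparse

parser = argparse.ArgumentParser(prog="import_onspd")
parser.add_argument(
    "--header",
    help="Specify which header the csv has",
    default="aug2022",
    choices=["may2018", "aug2022"],
)

# Omitting the flag falls back to the default.
default_header = parser.parse_args([]).header
# An explicit, supported value is accepted as given.
explicit_header = parser.parse_args(["--header", "may2018"]).header

try:
    parser.parse_args(["--header", "nov2019"])  # hypothetical unsupported value
    rejected = False
except SystemExit:
    # argparse reports the invalid choice on stderr and exits.
    rejected = True
```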
There are a few comments that need follow up but in general this is looking really good 👍
Useful comments, Thanks. I think the fixups address them.
In the two places where we switched to using
Yep that would make sense wouldn't it :)
This commit also brings in checks to let us know if the header is correct. If it's not, we get a list of missing/extra fields. This means it should be obvious next time they change it. It will also help us catch non-erroring changes like nuts -> itl, which we had missed.
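A minimal sketch of the kind of check described here, assuming the comparison is a set difference between the expected field names and the CSV header. `check_header`, the message wording and the shortened field lists are assumptions made for illustration; the actual implementation in the commit may differ.

```python
def check_header(expected_fields, csv_header):
    """Return a list of error lines; empty when the header matches."""
    expected = set(expected_fields)
    actual = set(csv_header)
    missing_fields = expected - actual
    unexpected_fields = actual - expected

    error_msg = []
    if missing_fields:
        error_msg.append("Missing fields:")
        for field in sorted(missing_fields):
            error_msg.append(f" - {field}")
    if unexpected_fields:
        error_msg.append("Unexpected fields:")
        for field in sorted(unexpected_fields):
            error_msg.append(f" + {field}")
    return error_msg


# Hypothetical, shortened field lists for illustration only.
expected = ["pcds", "lat", "long", "nuts"]
header = ["pcds", "lat", "long", "itl"]
errors = check_header(expected, header)
```

A rename such as nuts -> itl shows up as one missing and one unexpected field, even though the column count is unchanged.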
5b3eb95 to 6f79e36
First commit sorts out the ONSPD header issue and does some nice checks. When it doesn't match we will get output that looks something like:
The second commit adds a --database flag to the BaseImporter, meaning that classes that inherit can target specific databases. I've added some tests for this, because it's easy not to use the BaseImporter's cursor (i.e. self.cursor).
To test locally you can:
pip install -e /path/to/uk-geo-utils/
createdb -T every_election every_election_20241111_test
DATABASES setting:
python manage.py import_onspd --data-path /home/will/Downloads/ONSPD_AUG_2024/Data/
python manage.py migrate. Now the above command should succeed.
python manage.py import_onspd --data-path /home/will/Downloads/ONSPD_AUG_2024/Data/ --database other
python manage.py migrate --database other
Now it should succeed.
Between the above commands you can check where the data is being imported with select count(*) from uk_geo_utils_onspd; in the relevant dbs.
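The point about always going through self.cursor can be sketched without Django. Below, two in-memory SQLite databases stand in for the "default" and "other" aliases; `SketchImporter`, the single-column table and the placeholder postcodes are hypothetical and only illustrate the idea, not the real BaseImporter.

```python
import sqlite3

# Two in-memory databases standing in for the "default" and "other" aliases.
connections = {
    "default": sqlite3.connect(":memory:"),
    "other": sqlite3.connect(":memory:"),
}
for conn in connections.values():
    conn.execute("CREATE TABLE uk_geo_utils_onspd (pcds TEXT)")


class SketchImporter:
    """Hypothetical importer; not the real BaseImporter."""

    def __init__(self, database="default"):
        # Every query must go through this cursor, otherwise the
        # database choice is silently ignored.
        self.cursor = connections[database].cursor()

    def import_rows(self, rows):
        self.cursor.executemany(
            "INSERT INTO uk_geo_utils_onspd (pcds) VALUES (?)",
            [(row,) for row in rows],
        )


def count(alias):
    query = "SELECT count(*) FROM uk_geo_utils_onspd"
    return connections[alias].execute(query).fetchone()[0]


# Import into "other" only; "default" should stay empty.
SketchImporter(database="other").import_rows(["AA1 1AA", "AA1 1AB"])
```

Counting rows per alias afterwards plays the same role as running the select count(*) query in each database.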