Update Location feature for better accuracy #288

Merged 26 commits on Mar 27, 2024
Commits
9dcea07
fix: update celery config to resolve remaining task issues
ajbarnes Feb 9, 2024
02526c3
feat: update locations model, ingest ops, and associated VA location …
ajbarnes Mar 4, 2024
567a12b
updated docs to reflect location changes
mboyas-mitre Mar 7, 2024
fc15a53
update to admin training - correct expected values for active facilit…
mboyas-mitre Mar 8, 2024
73687cb
refactor: correct ruff B905: zip() without an explicit strict
ajbarnes Mar 15, 2024
6f2e087
fix: correct test_views of va_export to account for new location prop…
ajbarnes Mar 15, 2024
ab5e082
BUG FIX: load_locations.py
mboyas-mitre Mar 8, 2024
b8692bc
BUG FIX: making dashboard work with new location tree structure
mboyas-mitre Mar 8, 2024
aaca7f3
FIX: change how we flag to the users what Errors and Warnings occur d…
mboyas-mitre Mar 8, 2024
fae9ce5
ADD: new management command to refresh locations for existing VA data
mboyas-mitre Mar 8, 2024
66b5bc5
FIX: when a row is deleted from the locations CSV and it already exis…
mboyas-mitre Mar 8, 2024
779e125
added management command to export the current list of locations to a…
mboyas-mitre Mar 8, 2024
9925222
FIX: on loading locations, only mark one inactive if it is not alread…
mboyas-mitre Mar 9, 2024
9652ed6
bug fix in refresh locations management command
mboyas-mitre Mar 20, 2024
9b65bbc
updated user guides to reflect new location management commands
mboyas-mitre Mar 20, 2024
f6bb525
added yes/no catch to uploading a new location file with delete_previ…
mboyas-mitre Mar 20, 2024
4ec3c40
fix lint errors
mboyas-mitre Mar 20, 2024
a65a86c
fix lint errors - one more
mboyas-mitre Mar 20, 2024
d6c1694
fixed black formatting errors
mboyas-mitre Mar 20, 2024
2fe0944
fix black mismatch in views.py
mboyas-mitre Mar 20, 2024
53af11e
admin training guides formatting error
mboyas-mitre Mar 20, 2024
4fdc76c
remove blank lines for linting
mboyas-mitre Mar 20, 2024
e690dbe
re-attempt for location admin guides in docs
mboyas-mitre Mar 20, 2024
f6a6c86
re-attempt for location admin guides in docs - redo
mboyas-mitre Mar 20, 2024
f7b0e47
rename North-Western to North_Western in static geojson to force map …
mboyas-mitre Mar 20, 2024
a1337ee
refactor: address pr code-review comments
ajbarnes Mar 27, 2024
2 changes: 1 addition & 1 deletion compose/django/celery/flower/start
Expand Up @@ -5,4 +5,4 @@ set -o nounset

export DJANGO_SETTINGS_MODULE=config.settings.production

celery -A config.celery_app.app flower --basic_auth="${CELERY_FLOWER_USER}:${CELERY_FLOWER_PASSWORD}" -l INFO
celery -A config.celery_app.app flower --url-prefix="/celery" --basic_auth="${CELERY_FLOWER_USER}:${CELERY_FLOWER_PASSWORD}" -l INFO
16 changes: 13 additions & 3 deletions config/settings/base.py
@@ -1,6 +1,7 @@
"""
Base settings to build other settings files upon.
"""

import os
from pathlib import Path

Expand Down Expand Up @@ -277,17 +278,26 @@
}

# Celery

if USE_TZ:
    CELERY_TIMEZONE = TIME_ZONE
CELERY_BROKER_URL = env("CELERY_BROKER_URL", default="redis://redis:6379/0")
CELERY_RESULT_BACKEND = CELERY_BROKER_URL
CELERY_RESULT_EXTENDED = True
CELERY_RESULT_BACKEND_ALWAYS_RETRY = True
CELERY_RESULT_BACKEND_MAX_RETRIES = 5
CELERY_ACCEPT_CONTENT = ["json"]
CELERY_TASK_SERIALIZER = "json"
CELERY_RESULT_SERIALIZER = "json"
CELERY_TASK_TIME_LIMIT = 60 * 60
CELERY_TASK_SOFT_TIME_LIMIT = 500

# 2700 = 45min before worker task exception + potential cleanup
# 3000 = 50min before forced termination of worker task
# Needed for long-running import & coding jobs
CELERY_TASK_TIME_LIMIT = 3000
CELERY_TASK_SOFT_TIME_LIMIT = 2700

CELERY_BEAT_SCHEDULER = "django_celery_beat.schedulers:DatabaseScheduler"
CELERY_WORKER_SEND_TASK_EVENTS = True
CELERY_TASK_SEND_SENT_EVENT = True

# Allauth
ACCOUNT_ALLOW_REGISTRATION = env.bool("DJANGO_ACCOUNT_ALLOW_REGISTRATION", False)
Expand Down
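The Celery time limits changed in this hunk work as a pair: in Celery, reaching the soft limit raises `SoftTimeLimitExceeded` inside the task so it can clean up, while reaching the hard limit terminates the worker process running the task. The sketch below only checks the arithmetic stated in the new comments (45 and 50 minutes); `describe_limits` is a hypothetical helper for illustration and is not part of the PR.

```python
# Values copied from the diff; describe_limits is a hypothetical helper.
CELERY_TASK_TIME_LIMIT = 3000
CELERY_TASK_SOFT_TIME_LIMIT = 2700


def describe_limits(soft_seconds, hard_seconds):
    """Summarize a soft/hard time limit pair in minutes."""
    if soft_seconds >= hard_seconds:
        raise ValueError("soft limit must be shorter than the hard limit")
    return {
        "soft_minutes": soft_seconds / 60,
        "hard_minutes": hard_seconds / 60,
        # Time a task has to clean up between the two limits
        "cleanup_seconds": hard_seconds - soft_seconds,
    }


if __name__ == "__main__":
    print(describe_limits(CELERY_TASK_SOFT_TIME_LIMIT, CELERY_TASK_TIME_LIMIT))
```

With the new values, a task has 300 seconds between the soft-limit exception and forced termination.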
1 change: 1 addition & 0 deletions config/wsgi.py
Expand Up @@ -13,6 +13,7 @@
framework.

"""

import os

from django.core.wsgi import get_wsgi_application
Expand Down
81 changes: 60 additions & 21 deletions docs/training/admin_guides.md
Expand Up @@ -12,10 +12,12 @@ as individuals described below.
To set up VA Explorer for the Geographic Access mentioned in that section, you
must load location data into the system.

Locations in VA Explorer follow a hierarchical structure. A specific geographic
region or jurisdiction has a name ("Name"), a type ("Type"), and a parent
("Parent"). By specifying the Parent field, you can achieve the arbitrary level
of nesting required to make a tree.
Locations in VA Explorer follow an assumed three-level hierarchical structure
by which each facility or hospital maps to an associated Level 2 ("District")
and Level 1 ("Province") hierarchy. Each facility also has a corresponding
`key`, which represents the XML option used in the dropdown list within ODK or
Kobo, and a `status` which indicates if the Facility is actively producing VAs
or not.

The table below shows an example location hierarchy for States, Counties, and
Cities in the United States. In this example, we have one state (California), two
Expand All @@ -24,33 +26,50 @@ Los Angeles).

```{csv-table} An example geographic hierarchy in tabular format
:header-rows: 1
Name,Type,Parent
California,State,
Marin County,County,California
Sausalito,City,Marin County
San Rafael,City,Marin County
Los Angeles County,County,California
Los Angeles,City,Los Angeles County
Province,District,Name,Key,Status
California,Marin County,Sausalito Hospital, sausalito_hospital, Active
California,Marin County,San Rafael Clinic, san_rafael_clinic, Inactive
California,Los Angeles County,Los Angeles Hospital, los_angeles_hospital, Active
```

```{figure} ../_static/img/geo_hierarchy.png
:width: 75%
The input is similarly structured to support any number of geographic hierarchies
for VA Explorer users. With a {term}`CSV` file in hand, you can now supplement
your initial system set up with the `load_locations` management command. Full
usage details for this are provided in [Management Commands](#management-commands).
The specification of the input CSV file is as follows:

<small> A tree data structure showing the example geographic hierarchy from the previous table</small>
```{csv-table} Expected columns for the location file
:header-rows: 1
Column Name, Description, Specifics
Province,Level 1 Administrative Boundary Name,One of the `label::English` values as defined in the VA XLSForm
District,Level 2 Administrative Boundary Name,One of the `label::English` values as defined in the VA XLSForm
Name, Facility or Hospital Name, One of the `label::English` values as defined in the VA XLSForm
Key, Facility or Hospital XML Value, The choice name associated with the `label::English` defined in the previous column
Status, Whether the facility is still actively producing VAs, One of: 'Active' or 'Inactive'
```

The input is similarly structured to support any number of geographic hierarchies
for VA Explorer users. With a {term}`CSV` file in hand, you can now supplement your initial
system set up with the `load_locations` management command. Full usage details
for this are provided in [Management Commands](#management-commands).

Following this command, VA Explorer should support geographic restrictions to any
area or facility you’ve provided, making them available during user creation and
editing. Note that access to a geography higher up in the given tree equates to
access for that geographic area as well as all of its child geographies. For
example, in the above tree a user with access to California also has access to
Marin County, Los Angeles County, Sausalito, San Rafael, and Los Angeles.

#### Updating locations in VA Explorer

When VAs are imported into VA Explorer, their location fields are matched
exactly against the locations loaded into the system in this step. If a VA does
not have a valid location field, VA Explorer tracks that mismatch as an error
that must be corrected either in the VA Explorer locations file or in the
underlying VA data. To add a location to VA Explorer, re-upload a revised
location file using the `load_locations` management command. If a row is
deleted from the locations file, the corresponding location is kept in
VA Explorer and marked inactive. To permanently delete locations in
VA Explorer, re-upload a revised location file using the `load_locations`
management command with `--delete_previous`. Warning: doing so may delete all
VAs in the database, so make sure to back up the system first.
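
Because matching is exact, it can help to check a revised locations file before loading it. The following is a minimal sketch, not part of VA Explorer: `validate_locations_csv` is a hypothetical helper, and it assumes the header uses the column names exactly as documented above (case-sensitive) and that surrounding whitespace is insignificant.

```python
import csv
import io

# Column names and status values taken from the documentation above; treating
# the header as case-sensitive is an assumption of this sketch.
EXPECTED_COLUMNS = ["Province", "District", "Name", "Key", "Status"]
VALID_STATUSES = {"Active", "Inactive"}


def validate_locations_csv(text):
    """Return a list of human-readable problems found in a locations CSV."""
    reader = csv.DictReader(io.StringIO(text), skipinitialspace=True)
    header = [name.strip() for name in (reader.fieldnames or [])]
    missing = [col for col in EXPECTED_COLUMNS if col not in header]
    if missing:
        return [f"missing columns: {', '.join(missing)}"]

    problems = []
    for row in reader:
        line_no = reader.line_num  # line of the row just read
        values = {
            key.strip(): (value or "").strip()
            for key, value in row.items()
            if key is not None  # ignore surplus unnamed fields
        }
        for col in EXPECTED_COLUMNS:
            if not values[col]:
                problems.append(f"line {line_no}: {col} is empty")
        status = values["Status"]
        if status and status not in VALID_STATUSES:
            problems.append(
                f"line {line_no}: Status must be 'Active' or 'Inactive', "
                f"got '{status}'"
            )
    return problems
```

An empty result means the file passed these basic checks; it does not guarantee that `load_locations` will accept the file.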

### Creating & Editing Users

Click "Users" in the navigation bar to visit the Users page. Click the "Create
Expand Down Expand Up @@ -174,7 +193,7 @@ generally useful to admins. An even fuller list of these can be found under

* - :rspan:`1` ``load_locations``
- ``--csv_file`` (*)
- :rspan:`1` Used to load initial location date data needed to support
- :rspan:`1` Used to load initial location data needed to support
Geographic access. ``csv_file`` is a filename in the local folder or
``unix:path`` format location of the file. Can be used with
``delete_previous`` to delete existing location data and start fresh with
Expand All @@ -183,9 +202,29 @@ generally useful to admins. An even fuller list of these can be found under

* - ``--delete_previous``

* - ``refresh_locations``
- None
- Used to refresh the locations assigned to all of the VAs in
the database if a new location file is loaded into the system using the
``load_locations`` management command. This command does not add or
delete any VAs from the database; it simply remaps the existing VAs
to the new locations.

* - ``export_locations``
- ``--output_file``
- Utility to obtain the current list of locations in the VA
Explorer system in the CSV format with header fields
corresponding to fields expected by the system. The intended use case
for this utility is when administrators need to update the location file
by first downloading the existing locations, making any necessary updates,
and re-uploading a revised version using ``load_locations``. ``output_file``
is a filename or ``unix:path`` format location to save template to. Default is
``locations_[[date]].csv`` where [[date]] is the date and time of export.


* - :rspan:`1` ``run_coding_algorithms``
- ``--overwrite``
- :rspan:`1` Used to call supported algorithms for assignment of cause of
- Used to call supported algorithms for assignment of cause of
death to all uncoded verbal autopsies. ``overwrite`` allows this command
to clear (and save) all existing CoD assignments before running on
every verbal autopsy regardless of whether it's coded or not. ``True`` or
Expand Down
1 change: 1 addition & 0 deletions requirements/base.txt
Expand Up @@ -7,6 +7,7 @@ pandas==1.5.1
fuzzywuzzy==0.18.0
python-Levenshtein==0.20.7
tqdm==4.65.0
anytree==2.12.1

# Django
django==4.1.2
Expand Down
18 changes: 9 additions & 9 deletions va_explorer/home/va_trends.py
Expand Up @@ -2,7 +2,7 @@

import pandas as pd
from django.db.models import F
from pandas._libs.tslibs.offsets import relativedelta
from pandas.tseries.offsets import DateOffset

from va_explorer.va_data_management.constants import REDACTED_STRING
from va_explorer.va_data_management.utils.date_parsing import (
Expand Down Expand Up @@ -40,7 +40,7 @@
VA_TABLE_COLUMNS = ["24", "1 week", "1 month", "Overall"]
VA_GRAPH_TYPES = VA_TABLE_ROWS

MONTHS = [START_MONTH + relativedelta(months=i) for i in range(12)]
MONTHS = [START_MONTH + DateOffset(months=i) for i in range(12)]
VA_GRAPH_Y_DATA = 12 * [0.0]
VA_GRAPH_X_DATA = [month.strftime("%Y-%m") for month in MONTHS]

Expand Down Expand Up @@ -77,14 +77,14 @@ def get_context_for_va_table(va_list, user):
"id": va.id,
"deceased": f"{va.Id10017} {va.Id10018}",
"interviewer": va.Id10010,
"interviewed": parse_date(va.Id10012)
if (va.Id10012 != "dk")
else "Unknown",
"interviewed": (
parse_date(va.Id10012) if (va.Id10012 != "dk") else "Unknown"
),
"dod": parse_date(va.Id10023) if (va.Id10023 != "dk") else "Unknown",
"facility": va.location.name if va.location else "Not Provided",
"cause": va.causes.all()[0].cause
if len(va.causes.all()) > 0
else "Not Coded",
"cause": (
va.causes.all()[0].cause if len(va.causes.all()) > 0 else "Not Coded"
),
"warnings": len(
[
issue
Expand Down Expand Up @@ -145,7 +145,7 @@ def get_trends_data(user):
# Load the VAs that are collected over various periods of time
vas_24_hours = va_df[va_df["date"] == TODAY].index
vas_1_week = va_df[va_df["date"] >= (TODAY - timedelta(days=7))].index
vas_1_month = va_df[va_df["date"] >= (TODAY - relativedelta(months=1))].index
vas_1_month = va_df[va_df["date"] >= (TODAY - DateOffset(months=1))].index
vas_overall = va_df.sort_values(by="id").index

# Graphs of the past 12 months, not including this month
Expand Down
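The change in this file replaces `relativedelta`, imported from a private pandas module, with the public `DateOffset`. `DateOffset(months=n)` shifts a timestamp by calendar months and clips the day to the end of shorter months. A small sketch with illustrative dates (the module computes its own `START_MONTH` and `TODAY`):

```python
import pandas as pd
from pandas.tseries.offsets import DateOffset

# Illustrative dates only; not the values used by va_trends.py.
start_month = pd.Timestamp("2023-04-01")
months = [start_month + DateOffset(months=i) for i in range(12)]
labels = [month.strftime("%Y-%m") for month in months]

today = pd.Timestamp("2024-03-31")
one_month_ago = today - DateOffset(months=1)

print(labels[0], labels[-1])  # first and last month labels
print(one_month_ago.date())   # day is clipped to the end of February
```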
2 changes: 1 addition & 1 deletion va_explorer/static/data/zambia_geojson.json

Large diffs are not rendered by default.

1 change: 1 addition & 0 deletions va_explorer/static/js/dashboard.js
Expand Up @@ -244,6 +244,7 @@ const dashboard = new Vue({
geojson.features = this.geojson.features.filter(feature => {
return borders.includes(feature.properties.area_level_label)
})

this.layer = L.geoJson(geojson, {
style: function (feature) {
if (feature.properties.area_level_label !== 'Country') {
Expand Down
9 changes: 5 additions & 4 deletions va_explorer/va_analytics/utils/loading.py
Expand Up @@ -85,7 +85,8 @@ def load_va_data(
user_vas_filtered.annotate(
district_name=Subquery(
Location.objects.values("name").filter(
Q(path=Substr(OuterRef("location__path"), 1, 8)), Q(depth=2)
Q(path=Substr(OuterRef("location__path"), 1, 12)),
Q(depth=3),
)[:1]
)
)
Expand All @@ -98,7 +99,7 @@ def load_va_data(
user_vas_filtered.annotate(
province_name=Subquery(
Location.objects.values("name").filter(
Q(path=Substr(OuterRef("location__path"), 1, 4)), Q(depth=1)
Q(path=Substr(OuterRef("location__path"), 1, 8)), Q(depth=2)
)[:1]
)
)
Expand Down Expand Up @@ -211,7 +212,7 @@ def load_va_data(
user_vas_filtered.annotate(
province_name=Subquery(
Location.objects.values("name").filter(
Q(path=Substr(OuterRef("location__path"), 1, 4)), Q(depth=1)
Q(path=Substr(OuterRef("location__path"), 1, 8)), Q(depth=2)
)[:1]
)
)
Expand All @@ -225,7 +226,7 @@ def load_va_data(
user_vas_filtered.annotate(
district_name=Subquery(
Location.objects.values("name").filter(
Q(path=Substr(OuterRef("location__path"), 1, 8)), Q(depth=2)
Q(path=Substr(OuterRef("location__path"), 1, 12)), Q(depth=3)
)[:1]
)
)
Expand Down
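The substring lengths in these queries (4, 8, 12) suggest a materialized-path tree in which each level contributes four characters to `path`, so the ancestor at a given depth is the first `depth * 4` characters. With the new location tree, provinces move from depth 1 to depth 2 and districts from depth 2 to depth 3, presumably because a country node now sits at the root. The sketch below restates that slicing in plain Python; the step length and the sample path are assumptions, and `ancestor_path` is a hypothetical helper.

```python
STEP_LENGTH = 4  # assumed characters per tree level, inferred from 4/8/12


def ancestor_path(path, depth, step_length=STEP_LENGTH):
    """Return the materialized-path prefix of the ancestor at `depth`."""
    if depth < 1 or depth * step_length > len(path):
        raise ValueError("depth is outside the node's path")
    # Substr(path, 1, n) in SQL is 1-indexed, so it equals path[:n] here
    return path[: depth * step_length]


# Hypothetical facility path: country / province / district / facility
facility_path = "0001000200030004"
print(ancestor_path(facility_path, 2))  # province prefix (first 8 characters)
print(ancestor_path(facility_path, 3))  # district prefix (first 12 characters)
```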
8 changes: 5 additions & 3 deletions va_explorer/va_data_cleanup/views.py
Expand Up @@ -54,9 +54,11 @@ def get_context_data(self, **kwargs):
"dod": parse_date(va.Id10023) if (va.Id10023 != "dk") else "Unknown",
"facility": va.location.name if va.location else "Not Provided",
"deceased": va.deceased,
"cause": va.causes.all()[0].cause
if len(va.causes.all()) > 0
else "Not Coded",
"cause": (
va.causes.all()[0].cause
if len(va.causes.all()) > 0
else "Not Coded"
),
"warnings": len(
[
issue
Expand Down
Original file line number Diff line number Diff line change
@@ -0,0 +1,61 @@
from datetime import datetime

import pandas as pd
from django.core.management.base import BaseCommand

from va_explorer.va_data_management.models import Location


class Command(BaseCommand):
    help = "Exports current facility list as a CSV"

    def add_arguments(self, parser):
        parser.add_argument(
            "--output_file",
            type=str,
            nargs="?",
            default="locations_"
            + datetime.now().strftime("%Y_%m_%d-%I_%M_%S_%p")
            + ".csv",
        )

    def handle(self, *args, **options):
        locations = list(Location.objects.filter(location_type="facility").all())

        keys = [location.path_string for location in locations]
        loc_df = pd.DataFrame({"path_string": keys})

        # Split each path_string on "/" into its levels; the first element
        # ("null") and the country are dropped further down
        loc_df[["null", "country", "province", "district", "name"]] = loc_df[
            "path_string"
        ].str.split(r"\/", expand=True)

        loc_df["key"] = ""
        loc_df["status"] = ""

        # Write through loc_df.at: assigning to the row yielded by iterrows()
        # is not guaranteed to update the DataFrame
        for idx, row in loc_df.iterrows():
            loc = Location.objects.filter(path_string=row["path_string"])
            loc_df.at[idx, "key"] = [location.key for location in loc][0]

            status_b = [location.is_active for location in loc][0]
            loc_df.at[idx, "status"] = "Active" if status_b else "Inactive"

        loc_df = loc_df.drop(columns=["path_string", "null", "country"])

        # Keep only rows that have all three hierarchy levels
        loc_df = loc_df.loc[
            (loc_df["name"].notnull())
            & (loc_df["province"].notnull())
            & (loc_df["district"].notnull())
        ]

        loc_df = loc_df[["province", "district", "name", "key", "status"]]

        fname = options["output_file"]

        try:
            loc_df.to_csv(fname, index=False)
            print(f"Exported current location file to {fname}")
        except Exception as err:
            print(
                f"Error {err} occurred while exporting location file. Check "
                "provided filename is a valid path."
            )
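
The export command relies on each facility's `path_string` splitting into five parts: an empty leading element, then country, province, district, and facility name. The sketch below shows that split on a single string in plain Python. The sample path and the `split_path_string` helper are hypothetical, and the leading-slash format is inferred from the column names used in the command.

```python
def split_path_string(path_string):
    """Split '/country/province/district/name' into the exported columns.

    Returns None when the string does not have exactly four levels after
    the leading slash.
    """
    parts = path_string.split("/")
    if len(parts) != 5 or parts[0] != "":
        return None
    _, _country, province, district, name = parts
    return {"province": province, "district": district, "name": name}


# Hypothetical path, reusing the example hierarchy from the admin guide
example = "/United States/California/Marin County/Sausalito Hospital"
print(split_path_string(example))
```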