Translate ALT key measure to ehrQL #24
analysis/measure_definition.py
--output output/alt_tests/measures.arrow
--
--codelist codelists/opensafely-alanine-aminotransferase-alt-tests.csv
I didn't think of this when we were pairing (you probably did), but the easiest way to pass a codelist as a parameter to the measure definition is, well, to pass a codelist as a parameter!
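A minimal sketch of how the measure definition might read that parameter, assuming the arguments after `--` in the action reach the script as ordinary command-line arguments. `parse_args` is a hypothetical helper for illustration, not code from this PR.

```python
import argparse
from pathlib import Path


def parse_args(argv=None):
    """Parse the user-defined arguments that follow `--` in the action."""
    parser = argparse.ArgumentParser()
    # --codelist mirrors the flag used in the action above.
    parser.add_argument(
        "--codelist",
        type=Path,
        required=True,
        help="Path to a codelist CSV file",
    )
    return parser.parse_args(argv)


# Passing an explicit list keeps the example independent of sys.argv.
args = parse_args(
    ["--codelist", "codelists/opensafely-alanine-aminotransferase-alt-tests.csv"]
)
print(args.codelist)
```

In the real script, `parse_args()` would be called without arguments so that it reads `sys.argv`, and the resulting path would be handed to whatever loads the codelist.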
run: >
  ehrql:v1 generate-measures
  analysis/measure_definition.py
  --output output/alt_tests/measures.arrow
The name of the sub-directory need not match the name of the codelist, because the action tells the reader which sub-directory is associated with which codelist. If a downstream action needs to associate a sub-directory with a codelist, then we may revisit this design decision.
outputs:
  highly_sensitive:
    dataset: output/dataset.csv.gz
    measures: output/alt_tests/measures.arrow
We can read an Arrow (Feather) file into a Pandas data frame with:
import pandas
pandas.read_feather("output/alt_tests/measures.arrow")
If we don't have Pandas installed on our system, then we can start JupyterLab in a container that does have Pandas installed with:
opensafely jupyter
The measure definition is parametrized by a codelist, so that it can be applied to any number of codelists. The measures file is written to a sub-directory of the `output` directory, because we anticipate a downstream transformation action reading the measures file and writing a transformed file suitable for releasing and publishing.

Co-authored-by: Alice Wong <[email protected]>
bce6e72 to d9dd78f
def num_months(from_, to_):
    """Returns the number of months between the given dates."""
    # Take the absolute value of the signed difference in months, so that a
    # year boundary (e.g. December to January) counts as one month.
    return abs((to_.year - from_.year) * 12 + (to_.month - from_.month))


# same year, same month
assert num_months(date(2025, 1, 1), date(2025, 1, 31)) == 0
# different year, different month
assert num_months(date(2024, 1, 1), date(2025, 2, 1)) == 13
# to_ before from_
assert num_months(date(2025, 2, 1), date(2024, 1, 1)) == 13
I realised that there was a bug in `num_months`. I've fixed it, and have added some lightweight tests: namely, the three `assert`s here. I'd like to highlight that they're lightweight tests. One day, we may choose to move them to test functions, which we may choose to call with, for example, pytest. Not today, however 🙂
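For reference, a sketch of what moving the assertions into test functions might look like. It assumes a variant of `num_months` that takes the absolute value of the signed month difference. The function bodies, the test names, and the year-boundary case are illustrative and not part of this PR.

```python
from datetime import date


def num_months(from_, to_):
    """Return the number of calendar months between two dates."""
    # Assumed variant: signed difference in months, then absolute value.
    return abs((to_.year - from_.year) * 12 + (to_.month - from_.month))


def test_same_year_same_month():
    assert num_months(date(2025, 1, 1), date(2025, 1, 31)) == 0


def test_different_year_different_month():
    assert num_months(date(2024, 1, 1), date(2025, 2, 1)) == 13


def test_to_before_from():
    assert num_months(date(2025, 2, 1), date(2024, 1, 1)) == 13


def test_across_year_boundary():
    # Extra case, not in the PR: December to the following January.
    assert num_months(date(2024, 12, 1), date(2025, 1, 1)) == 1
```

pytest collects functions whose names start with `test_`, so a file like this could be run with `pytest` without further configuration.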