Added task criteria for abnormal lab prediction tasks #25
Walkthrough
The pull request introduces new predicates and task definitions for evaluating laboratory tests and vital signs in the MIMIC-IV dataset. It includes predicates for various laboratory measurements such as creatinine, sodium, bicarbonate, hemoglobin, white blood cells, platelets, and mean arterial pressure, each with defined conditions for abnormal values. Additionally, new YAML files define tasks for predicting conditions like elevated creatinine, hyponatremia, metabolic acidosis, anemia, leukocytosis, thrombocytopenia, and hypotension within the first 24 hours of hospital admission.
Actionable comments posted: 8
🧹 Outside diff range and nitpick comments (13)
src/MEDS_DEV/tasks/criteria/abnormal_lab/in_icu/vital/hypotension/first_24h.yaml (2)
1-5: Consider adding the specific threshold for low MAP values in the description.
The description provides a clear overview of the task and inclusion criteria. However, it would be beneficial to include the specific threshold for what constitutes "low MAP values" (hypotension) in the description. This addition would provide immediate context for users reviewing the task definition.
48-48: Add a new line character at the end of the file.
To comply with YAML best practices and improve consistency, please add a new line character at the end of the file.
🧰 Tools
🪛 yamllint
[error] 48-48: no new line character at the end of file
(new-line-at-end-of-file)
src/MEDS_DEV/tasks/criteria/abnormal_lab/in_icu/cbc/leukocytosis/first_24h.yaml (1)
48-48: Add a newline character at the end of the file.
To comply with YAML best practices and resolve the yamllint error, please add a newline character at the end of the file.
Apply this change to the end of the file:
 label: abnormally_high_wbc
+
🧰 Tools
🪛 yamllint
[error] 48-48: no new line character at the end of file
(new-line-at-end-of-file)
src/MEDS_DEV/tasks/criteria/abnormal_lab/in_icu/blood_chemistry/hyponatremia/first_24h.yaml (2)
1-5: Consider adding the specific threshold for hyponatremia in the description.
The description provides a clear overview of the task. However, to align with the PR objectives and improve clarity, consider including the specific threshold for hyponatremia (sodium levels below 135 mEq/L) in the description.
Here's a suggested addition to the description:
 This task predicts whether the patient will have abnormal _sodium_ values within the next 24 hours, given the first 24 hours of their admission to the ICU. To be included, patients must have at least one measurement for _sodium_ within the next 24 hours after their admission to ICU,
-and their length of stay in ICU should be at least 24 hours.
+and their length of stay in ICU should be at least 24 hours. Hyponatremia is defined as sodium
+levels below 135 mEq/L.
48-48: Add a newline character at the end of the file.
To comply with YAML best practices and resolve the yamllint error, add a newline character at the end of the file.
Apply this change:
 has:
   sodium: (1, None)
 label: abnormally_low_sodium
+
🧰 Tools
🪛 yamllint
[error] 48-48: no new line character at the end of file
(new-line-at-end-of-file)
src/MEDS_DEV/tasks/criteria/abnormal_lab/in_icu/blood_chemistry/elevated_creatinine/first_24h.yaml (3)
1-5: Consider adding the specific threshold for elevated creatinine.
The description clearly states the task's purpose and inclusion criteria. However, it would be beneficial to include the specific threshold for what constitutes "elevated" creatinine levels (e.g., > 1.3 mg/dL) to align with the PR objectives and provide more clarity.
25-48: Consider the following improvements to the windows section:
- Input window: The start time is set to null. Consider setting it explicitly to the trigger time for clarity:
  start: trigger
- No_discharge window: The start time is exclusive (False), which might allow for immediate discharges. Consider making it inclusive:
  start_inclusive: True
- Target window: The label "abnormally_high_creatinine" is correct, but could be more specific. Consider renaming it to "elevated_creatinine" to align with the task name:
  label: elevated_creatinine
These changes will improve the clarity and consistency of the task definition.
🧰 Tools
🪛 yamllint
[error] 48-48: no new line character at the end of file
(new-line-at-end-of-file)
48-48: Add a new line character at the end of the file.
To comply with YAML best practices and improve consistency, please add a new line character at the end of the file.
🧰 Tools
🪛 yamllint
[error] 48-48: no new line character at the end of file
(new-line-at-end-of-file)
src/MEDS_DEV/tasks/criteria/abnormal_lab/in_icu/cbc/anemia/first_24h.yaml (2)
1-5: Consider adding the specific threshold for anemia in the description.
The description clearly outlines the task's purpose and inclusion criteria. However, it would be beneficial to include the specific threshold for "low hemoglobin values" (anemia) in the description. This would provide immediate context for users reviewing the task definition.
Consider updating the description to include the threshold, for example:
"This task predicts whether the patient will have low hemoglobin values (anemia, defined as hemoglobin levels below 13 g/dL) within the next 24 hours, ..."
25-48: Consider adding a comment to explain the "trigger" reference.
The windows are correctly defined and align with the task description. However, it would be helpful to add a comment explaining that "trigger" refers to the "icu_admission" event defined in the trigger section. This would improve clarity for readers who might not be familiar with the YAML structure.
Consider adding a comment like this at the beginning of the windows section:
windows:
  # Note: "trigger" refers to the "icu_admission" event defined in the trigger section
  input:
    ...
🧰 Tools
🪛 yamllint
[error] 48-48: no new line character at the end of file
(new-line-at-end-of-file)
src/MEDS_DEV/tasks/criteria/abnormal_lab/in_icu/cbc/thrombocytopenia/first_24h.yaml (1)
25-48: LGTM: Well-defined windows with a minor formatting issue.
The windows (input, no_discharge, and target) are correctly defined and align with the task description. They appropriately capture the required timeframes for predicting thrombocytopenia within 24 hours of ICU admission.
However, there's a minor formatting issue:
Add a newline character at the end of the file to comply with YAML best practices and resolve the yamllint error.
🧰 Tools
🪛 yamllint
[error] 48-48: no new line character at the end of file
(new-line-at-end-of-file)
src/MEDS_DEV/tasks/criteria/abnormal_lab/in_icu/blood_chemistry/metabolic_acidosis/first_24h.yaml (2)
26-31: Clarify the 'input' window's start time.
The 'input' window's start time is set to null. While this might be intentional to include all available data up to the trigger point, it could benefit from a brief comment explaining this choice. Consider adding a comment to clarify the reasoning behind setting the start time to null.
Example:
input:
  start: null  # Include all available data up to the trigger point
  end: trigger + 24h
  # ... rest of the input window configuration
48-48: Add a newline character at the end of the file.
The file is missing a newline character at the end. While this doesn't affect functionality, it's a common convention to end files with a newline character. This can help with text processing and version control systems.
Please add a blank line at the end of the file.
🧰 Tools
🪛 yamllint
[error] 48-48: no new line character at the end of file
(new-line-at-end-of-file)
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
📒 Files selected for processing (7)
- src/MEDS_DEV/tasks/criteria/abnormal_lab/in_icu/blood_chemistry/elevated_creatinine/first_24h.yaml (1 hunks)
- src/MEDS_DEV/tasks/criteria/abnormal_lab/in_icu/blood_chemistry/hyponatremia/first_24h.yaml (1 hunks)
- src/MEDS_DEV/tasks/criteria/abnormal_lab/in_icu/blood_chemistry/metabolic_acidosis/first_24h.yaml (1 hunks)
- src/MEDS_DEV/tasks/criteria/abnormal_lab/in_icu/cbc/anemia/first_24h.yaml (1 hunks)
- src/MEDS_DEV/tasks/criteria/abnormal_lab/in_icu/cbc/leukocytosis/first_24h.yaml (1 hunks)
- src/MEDS_DEV/tasks/criteria/abnormal_lab/in_icu/cbc/thrombocytopenia/first_24h.yaml (1 hunks)
- src/MEDS_DEV/tasks/criteria/abnormal_lab/in_icu/vital/hypotension/first_24h.yaml (1 hunks)
🧰 Additional context used
🪛 yamllint
src/MEDS_DEV/tasks/criteria/abnormal_lab/in_icu/blood_chemistry/elevated_creatinine/first_24h.yaml
[error] 48-48: no new line character at the end of file
(new-line-at-end-of-file)
src/MEDS_DEV/tasks/criteria/abnormal_lab/in_icu/blood_chemistry/hyponatremia/first_24h.yaml
[error] 48-48: no new line character at the end of file
(new-line-at-end-of-file)
src/MEDS_DEV/tasks/criteria/abnormal_lab/in_icu/blood_chemistry/metabolic_acidosis/first_24h.yaml
[error] 48-48: no new line character at the end of file
(new-line-at-end-of-file)
src/MEDS_DEV/tasks/criteria/abnormal_lab/in_icu/cbc/anemia/first_24h.yaml
[error] 48-48: no new line character at the end of file
(new-line-at-end-of-file)
src/MEDS_DEV/tasks/criteria/abnormal_lab/in_icu/cbc/leukocytosis/first_24h.yaml
[error] 48-48: no new line character at the end of file
(new-line-at-end-of-file)
src/MEDS_DEV/tasks/criteria/abnormal_lab/in_icu/cbc/thrombocytopenia/first_24h.yaml
[error] 48-48: no new line character at the end of file
(new-line-at-end-of-file)
src/MEDS_DEV/tasks/criteria/abnormal_lab/in_icu/vital/hypotension/first_24h.yaml
[error] 48-48: no new line character at the end of file
(new-line-at-end-of-file)
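The yamllint finding is the same in all seven files: the last line is not terminated by a newline. A minimal Python sketch of the fix follows; `ensure_trailing_newline` is a hypothetical helper written for illustration and is not part of MEDS-DEV or yamllint.

```python
def ensure_trailing_newline(text: str) -> str:
    """Return `text` with a single newline appended if it does not already end with one.

    Empty input is returned unchanged, since an empty file has no last line to terminate.
    """
    if text and not text.endswith("\n"):
        return text + "\n"
    return text


# Example: the flagged last line of a task file, before and after the fix.
fixed = ensure_trailing_newline("  label: abnormally_high_wbc")
```

In practice an editor setting or a pre-commit hook would usually apply the same rule on every commit.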
🔇 Additional comments (15)
src/MEDS_DEV/tasks/criteria/abnormal_lab/in_icu/vital/hypotension/first_24h.yaml (2)
23-23: LGTM: Trigger correctly set to ICU admission.
The trigger is appropriately set to "icu_admission", which aligns with the task description of predicting hypotension within 24 hours of ICU admission.
26-40: LGTM: Input and no_discharge windows are correctly defined.
The input and no_discharge windows are well-defined and align with the task description:
- The input window correctly covers the first 24 hours after ICU admission.
- The no_discharge window ensures that no discharge events occur within the first 24 hours of ICU admission.
src/MEDS_DEV/tasks/criteria/abnormal_lab/in_icu/cbc/leukocytosis/first_24h.yaml (3)
1-5: LGTM: Clear and comprehensive description.
The description accurately outlines the task's purpose, prediction window, and inclusion criteria, aligning well with the PR objectives.
23-23: LGTM: Appropriate trigger defined.
The trigger is correctly set to "icu_admission", which aligns with the task description and PR objectives.
25-48: LGTM: Well-defined windows section.
The windows section accurately defines the input, no_discharge, and target windows, aligning with the task description and PR objectives. The constraints and labeling criteria are appropriately set.
🧰 Tools
🪛 yamllint
[error] 48-48: no new line character at the end of file
(new-line-at-end-of-file)
src/MEDS_DEV/tasks/criteria/abnormal_lab/in_icu/blood_chemistry/hyponatremia/first_24h.yaml (2)
23-23: LGTM: Trigger is correctly set.
The trigger is appropriately set to icu_admission, which aligns with the task description and PR objectives.
25-48: LGTM: Windows section is well-defined.
The windows section is correctly defined and aligns with the task description and PR objectives:
- The input window covers the first 24 hours after ICU admission.
- The no_discharge window ensures no discharge occurs within the first 24 hours.
- The target window is set to the 24 hours following the input window.
This structure accurately captures the requirements for predicting hyponatremia within the specified timeframe.
🧰 Tools
🪛 yamllint
[error] 48-48: no new line character at the end of file
(new-line-at-end-of-file)
src/MEDS_DEV/tasks/criteria/abnormal_lab/in_icu/blood_chemistry/elevated_creatinine/first_24h.yaml (3)
23-23: LGTM: Trigger correctly set to ICU admission.
The trigger is appropriately set to "icu_admission", which aligns with the task's purpose of predicting elevated creatinine levels after ICU admission.
1-48: Overall assessment: Well-structured task definition with minor improvements needed.
The task definition for predicting elevated creatinine levels in ICU patients is well-structured and aligns with the PR objectives. The suggested improvements, including adding specific thresholds, implementing undefined predicates, and refining the windows section, will enhance the clarity and consistency of the task definition. Once these minor issues are addressed, the file will provide a robust foundation for the abnormal lab prediction task.
🧰 Tools
🪛 yamllint
[error] 48-48: no new line character at the end of file
(new-line-at-end-of-file)
7-21: ⚠️ Potential issue
Implement or provide references for undefined predicates.
Several predicates (icu_admission, icu_discharge, hospital_discharge, death, creatinine, abnormally_high_creatinine) are currently undefined (marked with ???). To ensure the task functions correctly:
- Implement these predicates directly in this file, or
- Provide references to where these predicates are defined if they are implemented elsewhere in the codebase.
Additionally, for the abnormally_high_creatinine predicate, consider setting the value_min to 1.3 mg/dL to align with the PR objectives.
To verify the existence of predicate definitions elsewhere in the codebase, you can run the following script:
✅ Verification successful
Undefined predicates are defined in predicates.yaml.
All predicates (icu_admission, icu_discharge, hospital_discharge, death, creatinine, abnormally_high_creatinine) are defined in src/MEDS_DEV/datasets/MIMIC-IV/predicates.yaml. Ensure that first_24h.yaml properly references these predicates from predicates.yaml to maintain clarity and functionality.
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash
# Description: Search for predicate definitions in the codebase
echo "Searching for predicate definitions..."
rg --type yaml -g '!src/MEDS_DEV/tasks/criteria/abnormal_lab/in_icu/blood_chemistry/elevated_creatinine/first_24h.yaml' \
  -e 'icu_admission:' -e 'icu_discharge:' -e 'hospital_discharge:' -e 'death:' -e 'creatinine:' -e 'abnormally_high_creatinine:'
echo "If no results are found, the predicates may need to be implemented in this file."
Length of output: 6008
src/MEDS_DEV/tasks/criteria/abnormal_lab/in_icu/cbc/anemia/first_24h.yaml (1)
23-23: LGTM: Trigger correctly set.
The trigger is appropriately set to "icu_admission", which aligns with the task description and PR objectives.
src/MEDS_DEV/tasks/criteria/abnormal_lab/in_icu/cbc/thrombocytopenia/first_24h.yaml (2)
1-5: LGTM: Clear and comprehensive task description.
The description accurately defines the task for predicting thrombocytopenia, including the timeframe and inclusion criteria. It aligns well with the PR objectives.
23-23: LGTM: Appropriate trigger defined.
The trigger is correctly set to 'icu_admission', which aligns with the task description and PR objectives.
src/MEDS_DEV/tasks/criteria/abnormal_lab/in_icu/blood_chemistry/metabolic_acidosis/first_24h.yaml (2)
1-5: LGTM: Clear and comprehensive task description.
The description effectively outlines the task's purpose, prediction target, time frames, and inclusion criteria. It provides a solid foundation for understanding the task's objectives and constraints.
23-23: LGTM: Appropriate trigger defined.
The trigger is correctly set to 'icu_admission', which aligns with the task description and ensures that the prediction process starts at the right moment: when a patient is admitted to the ICU.
Can the ranges not be set in the task configs for particular units?
Also, do you want to add the predicates you've defined here to the MIMIC predicates.yaml file too?
Also, should we have something different than cbc and blood_chemistry? blood_chemistry seems very broad relative to cbc, but maybe that is wrong?
I feel like we should have the ICU stay be at least 48 hours, right? We want the lab test to happen while they are still in the ICU, not after they were discharged, don't we?
This applies here and throughout.
I think so; we need to modify the no_discharge window to cover 48h.
Correct, we would expect the labs to be recorded during the patient's ICU stay.
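The reasoning behind the 48-hour requirement can be sketched in a few lines of Python: the target window only falls inside the ICU stay if discharge happens no earlier than the end of the input window plus the target window. This is an illustrative check, not ACES code; the function name and the inclusive boundary are assumptions.

```python
from datetime import datetime, timedelta

INPUT_WINDOW = timedelta(hours=24)   # first 24h after ICU admission (model input)
TARGET_WINDOW = timedelta(hours=24)  # following 24h (where the label is observed)


def target_within_icu_stay(icu_admission: datetime, icu_discharge: datetime) -> bool:
    """True if the whole target window ends at or before ICU discharge.

    Treating a discharge exactly at the 48h mark as acceptable is an assumption.
    """
    target_end = icu_admission + INPUT_WINDOW + TARGET_WINDOW
    return icu_discharge >= target_end


admit = datetime(2024, 1, 1, 8, 0)
# A 30-hour stay passes a 24h no_discharge check, but its target window
# extends 18 hours past discharge.
short_stay_ok = target_within_icu_stay(admit, admit + timedelta(hours=30))
long_stay_ok = target_within_icu_stay(admit, admit + timedelta(hours=72))
```

With a 24-hour no_discharge window, the 30-hour stay would still be included, which is the case the reviewers want to exclude.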
Do we want to predict abnormal creatinine levels on all patients? Or only on patients who have not yet had an abnormal creatinine in the ICU? And/or who haven't had an abnormal creatinine in this hospital stay? Or something else guided by clinical ideas? @Gallifantjack, any suggestions?
This applies here and throughout.
I just defined it as the first case where we predict on all patients, but I agree that this needs an opinion from clinicians to figure out which scenario has the best clinical utility.
As @Jwoo5 mentioned, we were considering detecting any event of elevated creatinine, independent of previous instances of creatinine elevation. What you mentioned, @mmcdermott, concerns the problem of selecting the cohort we want to use (to condition the task on), which should be orthogonal to the task definition.
Actually @mandiehyewon, I think that task cohort is very much a key part of task definition -- at least as I'm using the term. We'll care about how models perform differently on different cohorts even for the same "label", and I think our goal in specifying these ACES configs should not just be to say "here is the label" but also to say "this task is interested in this label over patients who satisfy these criteria".
So if we think it would be relevant to get more specific here, we should, in my opinion. Does that make sense?
All good points; I agree with Matthew's suggestion to make it more restricted. A nice start might be to predict abnormal creatinine levels in all patients who haven't previously shown elevated levels, without restricting to ICU patients.
It makes sense @mmcdermott as we are using ACES format and defining cohort w.r.t. each task. About your point @Gallifantjack, just to clarify, you meant those who haven't previously shown elevated levels during current admission, correct? I think it makes sense to detect the first event in each admission but don't think it's necessary to extend to historic admissions.
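To make the "first event in each admission" cohort concrete, here is a small Python sketch of the exclusion rule discussed above: a patient is eligible only if no creatinine value at or above the cutoff was recorded between the start of the current admission and the prediction time. The function name, the tuple layout, and the use of `>=` with the 1.3 mg/dL cutoff are assumptions for illustration; this is not how ACES expresses the constraint.

```python
from datetime import datetime, timedelta

ELEVATED_CREATININE_MG_DL = 1.3  # cutoff mentioned in this thread; `>=` is an assumption


def eligible_for_first_event_task(measurements, admission_start, prediction_time,
                                  cutoff=ELEVATED_CREATININE_MG_DL):
    """Return True if no elevated creatinine was seen in [admission_start, prediction_time].

    `measurements` is a list of (timestamp, value_mg_dl) tuples. Values recorded before
    the current admission are ignored, matching the suggestion not to look at
    historic admissions.
    """
    for timestamp, value in measurements:
        if admission_start <= timestamp <= prediction_time and value >= cutoff:
            return False
    return True


admit = datetime(2024, 3, 1, 10, 0)
predict_at = admit + timedelta(hours=24)
history = [
    (admit - timedelta(days=200), 1.8),  # previous admission: ignored
    (admit + timedelta(hours=3), 0.9),   # normal value in the current admission
]
eligible = eligible_for_first_event_task(history, admit, predict_at)
```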
    code: ???
  abnormally_high_creatinine:
    code: ???
    value_min: ???
Is this reference range different for men vs. women, or children vs. adults? Do we need more qualifiers and/or different ranges for different patients here? Tagging @Gallifantjack for the clinical questions here.
All the reference ranges were based on @mandiehyewon's suggestions, and only the anemia prediction task, which is based on hemoglobin, has a different range for men vs. women. I'm also not sure whether ACES has the capacity to define predicates conditioned on other information (e.g., age, gender, pregnancy, etc.).
Good point about this @mmcdermott. While we can define cutoffs for elevated SCr, these values can be somewhat arbitrary and may vary between institutions. It's important to refer to established literature for guidance. According to this source, the normal SCr levels for individuals with normal kidney function are:
- Men: 0.6 to 1.2 mg/dL
- Women: 0.5 to 1.1 mg/dL
The source also provides approximate measures of glomerular filtration rate (GFR) based on creatinine levels:
- Creatinine 1 mg/dL: Baseline for a patient with normal GFR
- Creatinine 2 mg/dL: Indicates about a 50% reduction in GFR
- Creatinine 4 mg/dL: Indicates about a 70% to 85% reduction in GFR
- Creatinine 8 mg/dL: Indicates about a 90% to 95% reduction in GFR
From this information, we can define multiple cutoffs for elevated creatinine levels. For instance, we might use 1.3 mg/dL—the higher end of the normal range for both men and women—as a threshold. The choice of cutoff depends on the specific condition we aim to detect:
- Critical Conditions (e.g., Creatinine ≥ 2.0 mg/dL): Levels that may require immediate medical intervention due to significant kidney function reduction.
- Mild Elevations (e.g., Creatinine ≥ 1.3 mg/dL): Levels that warrant further evaluation but may not indicate critical kidney failure.
Ultimately, the selected cutoff should align with the clinical context and the goals of the evaluation. Do we want to identify patients needing urgent care, or are we focusing on early detection of potential kidney issues that require monitoring?
For pediatric patients, different reference ranges are necessary for all laboratory values. Therefore, it's best to focus on adult patients (age > 18) for the tasks defined here.
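As a rough illustration of the two candidate cutoffs above, the following Python sketch maps a serum creatinine value to a category. The category names and the function are hypothetical; the 1.3 and 2.0 mg/dL thresholds are the example values from this comment, not settled task parameters.

```python
def creatinine_category(value_mg_dl: float,
                        mild_cutoff: float = 1.3,
                        critical_cutoff: float = 2.0) -> str:
    """Classify a serum creatinine value (mg/dL) using the example cutoffs from the thread."""
    if value_mg_dl >= critical_cutoff:
        return "critical"
    if value_mg_dl >= mild_cutoff:
        return "mild"
    return "not_elevated"


categories = [creatinine_category(v) for v in (1.0, 1.3, 1.9, 2.0, 4.0)]
```

Whichever cutoff is chosen would become the `value_min` of the corresponding predicate, so the choice directly changes label prevalence.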
This is an example of a patient with a mild SCr increase (using the 1.3 mg/dL cutoff).
Thanks @mandiehyewon -- this is super helpful. @justin13601, what is the status of our ability to define different reference ranges for different patients based on, e.g., static data? I think it is currently not possible; is that right?
I think it is currently unsupported, but possibly doable by allowing users to create derived predicates that join normal plain predicates (i.e., reference ranges) with static predicates (i.e., sex, which are also plain predicates under the hood)? Perhaps you would specify both male and female in patient_demographics so both are included, but then join each with different plain predicates. I'm not sure if this works, but I can look into it if this is a priority.
@justin13601, can users currently create derived predicates with normal plain and static? Or would we need to add that? It seems passably simple to do so, if it isn't supported already.
> @justin13601, can users currently create derived predicates with normal plain and static? Or would we need to add that? It seems passably simple to do so, if it isn't supported already.

I just tested, seems like ACES can parse if users define derived predicates between static and normal plain predicates, but doesn't yet evaluate them properly. Our current derived predicate logic simply looks at whether all of the referenced predicates are realized in a given row, but static predicates are defined in a separate row with a null timestamp. I suppose we can forward fill the value within the subject_id (since they're currently all 0 anyway). Or update the logic such that in cases when a static variable is involved in a derived predicate, always look to the row with the null timestamp?
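The second option described here (look up the static value from the subject's null-timestamp row) can be sketched in plain Python. This is not the ACES implementation; the row layout, the function name, and the predicate names `male` and `low_hemoglobin` are hypothetical stand-ins for illustration.

```python
def evaluate_static_and_plain(rows, static_name, plain_name):
    """Evaluate an AND between a static predicate and a plain predicate.

    `rows` is a list of dicts with keys `subject_id`, `timestamp` (None for the
    static row), and predicate counts. The static predicate is read from each
    subject's null-timestamp row and applied to every timestamped row.
    """
    # Pass 1: collect subjects whose static row satisfies the static predicate.
    static_subjects = {
        row["subject_id"]
        for row in rows
        if row["timestamp"] is None and row.get(static_name, 0) > 0
    }
    # Pass 2: combine the static flag with the plain predicate on event rows.
    results = []
    for row in rows:
        if row["timestamp"] is None:
            continue
        holds = row["subject_id"] in static_subjects and row.get(plain_name, 0) > 0
        results.append((row["subject_id"], row["timestamp"], int(holds)))
    return results


rows = [
    {"subject_id": 1, "timestamp": None, "male": 1},
    {"subject_id": 1, "timestamp": "2024-01-01T08:00", "low_hemoglobin": 1},
    {"subject_id": 2, "timestamp": None, "male": 0},
    {"subject_id": 2, "timestamp": "2024-01-02T09:00", "low_hemoglobin": 1},
]
derived = evaluate_static_and_plain(rows, "male", "low_hemoglobin")
```

A row-wise check alone would return 0 for both event rows here, because the static flag never shares a row with the lab value; that is the evaluation gap described above.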
Nope, you're right. Can you file two GitHub issues on ACES, 1) to disable the ability to create such derived predicates until it works properly and 2) to actually make it work properly, and link them here? Then, if we can implement them in ACES fast enough, we can use them for any labs where it is appropriate here.
  abnormally_low_sodium:
    code: ???
    value_min: null
    value_max: ???
Is this reference range different for men vs. women, or children vs. adults? Do we need more qualifiers and/or different ranges for different patients here? Tagging @Gallifantjack for the clinical questions here.
Same with the above.
Hyponatremia has a standardized cutoff (< 135 mEq/L) that applies equally to men, women, and neonatal/pediatric patients. As mentioned above, it's best to focus on adult patients (age > 18) for the tasks defined here for now.
@mandiehyewon, I agree with the focus on adult patients -- but we should therefore encode that specifically in the inclusion criteria here, right? @justin13601, do we have any ability to filter by age or by duration of time of a window?
How would age be represented in MEDS? Would it simply be subtracting the timestamp at a given event by the DOB? I wonder if one can also create a window in the task requiring an event as an adult (ie. 18 years after DOB or something like that)?
At least for the MEDS_Extract_v0.0.7 version, one's birth date is represented as an event whose code is "MEDS_BIRTH" with a timestamp of the corresponding DOB, so we should calculate age dynamically for every triggered sample and filter it out if its age is less than 18 years by a window or something as @justin13601 mentioned.
I think we can just do a time-derived window of the target age as @justin13601 suggests, and we can make it more elegant later. @Jwoo5, do you have a good sense of how to do that? We can just do a "no_children" window that ends at trigger and starts at end - 18y and has no birth events, which are defined using the MEDS_BIRTH code. That should work, right?
I just tested it for in-icu mortality task, but it turns out that there is no difference in the cohort numbers:
no_children:
start: end - 6570d # 18 years = 365 * 18 days
end: trigger
start_inclusive: True
end_inclusive: True
has:
birth: (None, 0)
I also checked that all the subjects in the in-icu mortality (based on ICU admission) and long-los (based on hospital admission) cohorts are older than 18.
It seems like MEDS_Extract_v0.0.7 has already filtered out pediatric patients. Could you confirm, @mmcdermott?
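One detail of the window above is worth a quick check: 6570 days is 18 × 365 and ignores leap days, so it is a few days shorter than 18 calendar years. The Python sketch below computes age from a birth date directly and shows the gap. The helper names are hypothetical, and treating "adult" as age ≥ 18 at the trigger is an assumption.

```python
from datetime import date


def age_in_years(birth: date, at: date) -> int:
    """Completed years of age on the date `at`."""
    years = at.year - birth.year
    # Subtract one if the birthday has not yet occurred in the year of `at`.
    if (at.month, at.day) < (birth.month, birth.day):
        years -= 1
    return years


def is_adult(birth: date, at: date, min_age: int = 18) -> bool:
    return age_in_years(birth, at) >= min_age


birth = date(2000, 1, 1)
trigger = date(2017, 12, 28)
days_since_birth = (trigger - birth).days       # falls outside a 6570-day lookback
passes_6570d_window = days_since_birth > 6570   # no birth event inside the window
adult_at_trigger = is_adult(birth, trigger)     # patient is still 17
```

For this cohort the difference is unlikely to matter if the extraction already excludes minors, but it is a reason to prefer a calendar-aware age check if the filter becomes part of the task definition.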
Sorry for the delayed confirmation, @mmcdermott. That's correct, we would like to specify the cohort to be adults if we want to use this cutoff!
Also @Jwoo5, can you get the PR check to pass? If you want, I can help you set things up locally to run those and auto-fix issues on every commit.
For your questions on the head:
…OS is at least 48 hours.
Actionable comments posted: 6
🧹 Outside diff range and nitpick comments (6)
src/MEDS_DEV/tasks/criteria/abnormal_lab/in_icu/vital/hypotension/first_24h.yaml (1)
1-5: Add the specific threshold for hypotension in the description.
Consider adding the specific MAP threshold (< 65 mmHg) in the description to align with standard medical definitions of hypotension. This would make the task criteria immediately clear to readers.
src/MEDS_DEV/tasks/criteria/abnormal_lab/in_icu/cbc/leukocytosis/first_24h.yaml (1)
1-6: Enhance description with specific thresholds and units.
The description should include the specific threshold (>11,000/uL) that defines leukocytosis, as mentioned in the PR objectives. This helps readers understand the exact criteria without having to look elsewhere.
description: >-
This task predicts whether the patient will have high _WBC_ values (leukocytosis) within the next
24 hours, given the first 24 hours of their admission to the ICU. To be included, patients must
have at least one measurement for _WBC_ within the next 24 hours after their admission to ICU,
- and their length of stay in ICU should be at least 24 hours.
+ and their length of stay in ICU should be at least 24 hours. Leukocytosis is defined as
+ WBC count exceeding 11,000/uL.
src/MEDS_DEV/tasks/criteria/abnormal_lab/in_icu/cbc/anemia/first_24h.yaml (1)
1-6: Enhance description with specific threshold value.
Consider adding the specific hemoglobin threshold (13 g/dL) to the description for immediate clarity about what constitutes "low hemoglobin values".
description: >-
This task predicts whether the patient will have low _hemoglobin_ values (anemia) within the next
24 hours, given the first 24 hours of their admission to the ICU. To be included, patients must
have at least one measurement for _hemoglobin_ within the next 24 hours after their admission to ICU,
- and their length of stay in ICU should be at least 24 hours.
+ and their length of stay in ICU should be at least 24 hours. Anemia is defined as hemoglobin
+ levels below 13 g/dL.
src/MEDS_DEV/tasks/criteria/abnormal_lab/in_icu/cbc/thrombocytopenia/first_24h.yaml (2)
1-5: Add the specific threshold for thrombocytopenia in the description.
The description should include the specific threshold value (platelet counts below 150,000/uL) to make it immediately clear what constitutes "low platelets values".
description: >-
This task predicts whether the patient will have low _platelets_ values (thrombocytopenia) within
the next 24 hours, given the first 24 hours of their admission to the ICU. To be included, patients
must have at least one measurement for _platelets_ within the next 24 hours after their
- admission to ICU, and their length of stay in ICU should be at least 24 hours.
+ admission to ICU, and their length of stay in ICU should be at least 24 hours. Thrombocytopenia
+ is defined as platelet counts below 150,000/uL.
33-40: Clarify the no_discharge window end time calculation.
The end time calculation uses `start + 48h`, which could be ambiguous. For clarity and consistency with other window definitions, consider using `trigger + 48h`.
 no_discharge:
   start: trigger
-  end: start + 48h
+  end: trigger + 48h
   start_inclusive: False
   end_inclusive: True
   has:
     icu_admission: (None, 0)
     discharge_or_death: (None, 0)
src/MEDS_DEV/tasks/criteria/abnormal_lab/in_icu/blood_chemistry/metabolic_acidosis/first_24h.yaml (1)
1-5: Add the specific threshold value in the description.
Consider adding the specific threshold value (22 mEq/L) that defines metabolic acidosis to make the task criteria immediately clear to readers.
description: >-
This task predicts whether the patient will have low _bicarbonate_ values (metabolic acidosis)
within the next 24 hours, given the first 24 hours of their admission to the ICU. To be included,
patients must have at least one measurement for _bicarbonate_ within the next 24 hours after
- their admission to ICU, and their length of stay in ICU should be at least 24 hours.
+ their admission to ICU, and their length of stay in ICU should be at least 24 hours. Metabolic
+ acidosis is defined as bicarbonate levels below 22 mEq/L.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
📒 Files selected for processing (7)
- src/MEDS_DEV/tasks/criteria/abnormal_lab/in_icu/blood_chemistry/elevated_creatinine/first_24h.yaml (1 hunks)
- src/MEDS_DEV/tasks/criteria/abnormal_lab/in_icu/blood_chemistry/hyponatremia/first_24h.yaml (1 hunks)
- src/MEDS_DEV/tasks/criteria/abnormal_lab/in_icu/blood_chemistry/metabolic_acidosis/first_24h.yaml (1 hunks)
- src/MEDS_DEV/tasks/criteria/abnormal_lab/in_icu/cbc/anemia/first_24h.yaml (1 hunks)
- src/MEDS_DEV/tasks/criteria/abnormal_lab/in_icu/cbc/leukocytosis/first_24h.yaml (1 hunks)
- src/MEDS_DEV/tasks/criteria/abnormal_lab/in_icu/cbc/thrombocytopenia/first_24h.yaml (1 hunks)
- src/MEDS_DEV/tasks/criteria/abnormal_lab/in_icu/vital/hypotension/first_24h.yaml (1 hunks)
🧰 Additional context used
🔇 Additional comments (17)
src/MEDS_DEV/tasks/criteria/abnormal_lab/in_icu/vital/hypotension/first_24h.yaml (2)
33-40: Verify the no_discharge window constraints.
The current configuration prevents any ICU admission or discharge/death events during the entire 48-hour period. This might be too restrictive for patients who:
- Are transferred between ICUs
- Have temporary ICU stays recorded during the period
Consider whether these scenarios should be handled differently.
Let's check for similar constraints in other tasks:
✅ Verification successful
Let me fix the regex pattern and try again to search for similar window definitions.
Based on the search results, I can now provide a final response:
The no_discharge window constraints are consistently implemented across similar tasks
The search reveals that this exact window configuration is used consistently across multiple ICU-related tasks in the codebase, including:
- Anemia detection
- Metabolic acidosis
- Thrombocytopenia
- Hypotension
- Elevated creatinine
- Hyponatremia
- Leukocytosis
All these tasks use identical window constraints:
- 48-hour period
- No ICU admission events: `icu_admission: (None, 0)`
- No discharge/death events: `discharge_or_death: (None, 0)`
This consistent pattern across multiple critical care scenarios suggests this is an intentional design choice for first 24-hour assessments, likely to ensure clean observation periods without administrative transitions that could affect data quality.
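The `(None, 0)` bounds above read as "no minimum, at most zero events". A minimal Python sketch of that reading follows. It assumes both bounds are inclusive and that `None` means unbounded; `satisfies` is a hypothetical helper written for this illustration, not part of any library.

```python
from typing import Optional, Tuple

# (min_count, max_count); None means that side is unbounded (assumption).
Bounds = Tuple[Optional[int], Optional[int]]


def satisfies(count: int, bounds: Bounds) -> bool:
    """Check whether an event count falls within inclusive (min, max) bounds."""
    lower, upper = bounds
    if lower is not None and count < lower:
        return False
    if upper is not None and count > upper:
        return False
    return True


# No discharge events allowed in the window: (None, 0).
no_discharge_ok = satisfies(0, (None, 0))
# A single discharge event violates the constraint.
one_discharge_ok = satisfies(1, (None, 0))
# At least one measurement required: (1, None).
has_measurement = satisfies(3, (1, None))
```

Under this reading, a subject with any ICU admission or discharge/death event inside the 48-hour window is excluded from the cohort.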
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash
# Search for no_discharge window definitions in other tasks
echo "Searching for no_discharge window definitions..."
rg -t yaml "no_discharge:.*(?:\n.*)+?(icu_admission|discharge_or_death):"
Length of output: 515
Script:
#!/bin/bash
# Search for no_discharge window definitions in other tasks
echo "Searching for no_discharge window definitions..."
rg -U "no_discharge:\s*\n.*(?:\n.*)*?(?:icu_admission|discharge_or_death):" -g "*.yaml"
# Also search for other ICU-related window definitions to understand the patterns
echo -e "\nSearching for other ICU-related window definitions..."
rg -U "window:.*\n.*(?:\n.*)*?icu_admission:" -g "*.yaml"
Length of output: 6524
7-21: Consider using shared predicate definitions.
The predicates like `icu_admission`, `icu_discharge`, etc., are likely to be reused across multiple tasks. Consider moving these common predicates to a shared configuration file to maintain consistency and reduce duplication.
Let's check if these predicates are defined elsewhere:
src/MEDS_DEV/tasks/criteria/abnormal_lab/in_icu/blood_chemistry/hyponatremia/first_24h.yaml (4)
1-6: LGTM! Clear and comprehensive description.
The description effectively communicates the task's purpose, prediction target, and inclusion criteria.
23-23: LGTM! Appropriate trigger definition.
The trigger correctly uses ICU admission as the starting point for prediction.
25-48: LGTM! Well-structured window definitions.
The windows configuration effectively implements:
- 24-hour input window from ICU admission
- Discharge prevention during the entire prediction period
- Appropriate target window with required measurements
15-21: Awaiting clinical input for sodium reference ranges.
The threshold for hyponatremia needs clinical validation:
- Potential variations based on patient demographics (age, gender)
- Consideration of different clinical contexts
- Validation of the 135 mEq/L threshold mentioned in PR objectives
Let's check if there are any existing reference range definitions in the codebase:
src/MEDS_DEV/tasks/criteria/abnormal_lab/in_icu/cbc/leukocytosis/first_24h.yaml (2)
12-13: LGTM! Well-defined composite predicate.
The `discharge_or_death` predicate correctly combines all relevant discharge and death events.
41-48: LGTM! Well-structured target window configuration.
The target window is correctly configured with:
- Proper timing (24h after input window)
- Required WBC measurement constraint
- Appropriate labeling using the abnormally_high_wbc predicate
src/MEDS_DEV/tasks/criteria/abnormal_lab/in_icu/blood_chemistry/elevated_creatinine/first_24h.yaml (4)
32-41: Verify window configuration for discharge edge cases.
The `no_discharge` window ends at 48 hours, but the target window extends up to 48-72 hours from admission. This creates a potential edge case where a patient could be discharged after the `no_discharge` window ends but before the target window completes, potentially affecting measurement availability.
Consider extending the `no_discharge` window to cover the entire target window period.
#!/bin/bash
# Check if other similar tasks handle this edge case differently
rg -g '*.yaml' -B 5 -A 5 'no_discharge:'
16-21: Consider gender and age-specific reference ranges for creatinine.
The current implementation uses a single threshold for abnormal creatinine levels. However, normal creatinine ranges can vary by:
- Gender (typically higher in males)
- Age (different ranges for children vs. adults)
- Other factors like muscle mass and diet
Consider implementing conditional thresholds based on these factors if the system supports such functionality.
#!/bin/bash
# Check if any other tasks implement gender/age-specific ranges
rg -g '*.yaml' 'value_min.*gender|value_min.*age'
41-48: Clarify handling of multiple creatinine measurements.
The target window configuration requires at least one creatinine measurement but doesn't specify how multiple measurements should be handled. Consider documenting:
- Whether the highest value should be used
- If multiple confirmatory readings are required
- How to handle conflicting measurements (normal vs. abnormal) within the window
#!/bin/bash
# Check how other tasks handle multiple measurements
rg -g '*.yaml' -B 5 -A 5 'label:.*creatinine'
7-14: ⚠️ Potential issue
Define the core predicates marked with ???.
The following critical predicates are undefined:
- icu_admission
- icu_discharge
- hospital_discharge
- death
These predicates are essential for the task's functionality and should be properly defined.
✅ Verification successful
Let me try another search strategy to verify if these predicates are defined elsewhere.
The previous search might have failed due to the specific pattern. Let's try a broader search to understand the predicate definitions in the codebase and particularly look in common/shared directories where such core predicates might be defined.
Let me try one final search to look for any predicate definitions in potential base or template files.
This should help us find if there are any base templates or actual implementations of these predicates in the codebase.
Predicates are defined in MIMIC-IV dataset configuration
The predicates marked with `???` in your task configuration are actually defined in `src/MEDS_DEV/datasets/MIMIC-IV/predicates.yaml` with the following implementations:
- `icu_admission`: `code: { regex: "^ICU_ADMISSION//.*" }`
- `icu_discharge`: `code: { regex: "^ICU_DISCHARGE//.*" }`
- `hospital_discharge`: `code: { regex: "^HOSPITAL_DISCHARGE//.*" }`
- `death`: `code: MEDS_DEATH`
These predicates are likely meant to be inherited from the dataset configuration, which explains why they appear as `???` in your task-specific YAML. This is a common pattern where core predicates are defined at the dataset level and reused across multiple tasks.
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash
# Check if these predicates are defined elsewhere in the codebase
rg -g '*.yaml' -A 5 'icu_admission:|icu_discharge:|hospital_discharge:|death:'
Length of output: 983
Script:
#!/bin/bash
# Search for any YAML files that might contain predicate definitions
fd -e yaml -x cat {} | grep -A 5 "predicates:"
# Also search specifically in the common/shared directories for predicate definitions
fd -e yaml -g '**/common/**' -g '**/shared/**' -x cat {}
Length of output: 1706
Script:
#!/bin/bash
# Search for base/template files that might contain predicate definitions
fd -e yaml -g '*base*' -g '*template*' -x cat {}
# Also search for files that define these specific predicates with their actual implementations
rg -g '*.yaml' 'code:.*HOSPITAL_|code:.*ICU_|code:.*DEATH' -A 2
Length of output: 1004
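To make the dataset-level predicate definitions quoted in this comment concrete, here is a small Python sketch of how regex-based and exact-match code predicates could be evaluated against an event code. This is an illustration only: `matching_predicates` is a hypothetical helper, and the sample event codes in the usage lines (such as `ICU_ADMISSION//MICU`) are made up rather than taken from the dataset.

```python
import re

# Regex predicates as quoted from the MIMIC-IV predicates file in this review.
REGEX_PREDICATES = {
    "icu_admission": r"^ICU_ADMISSION//.*",
    "icu_discharge": r"^ICU_DISCHARGE//.*",
    "hospital_discharge": r"^HOSPITAL_DISCHARGE//.*",
}
# Exact-match predicate, also as quoted in the review.
EXACT_PREDICATES = {"death": "MEDS_DEATH"}


def matching_predicates(code: str) -> list:
    """Return the names of all predicates that an event code satisfies."""
    names = [name for name, pattern in REGEX_PREDICATES.items() if re.match(pattern, code)]
    names += [name for name, exact in EXACT_PREDICATES.items() if code == exact]
    return names


# Hypothetical event codes, for illustration only.
admission = matching_predicates("ICU_ADMISSION//MICU")
death = matching_predicates("MEDS_DEATH")
unrelated = matching_predicates("LAB//SODIUM")
```

A composite predicate such as `discharge_or_death` would then hold for an event whenever any of its member predicates appears in the returned list.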
src/MEDS_DEV/tasks/criteria/abnormal_lab/in_icu/cbc/anemia/first_24h.yaml (2)
25-48: Verify window references and temporal constraints.
The window configurations look correct, but please ensure:
- The reference `input.end` in the target window is properly resolved
- The expression `start + 48h` in no_discharge window correctly spans both input and target windows
- The constraint `(None, 0)` for discharge events effectively prevents early discharges
✅ Verification successful
Based on the search results, I can now provide a final response since I have sufficient evidence to verify the window configurations:
Window configuration is correctly implemented and consistent with other ICU-based tasks
The window configuration follows the established pattern used across multiple ICU-based tasks in the codebase, specifically:
- The reference `input.end` in the target window is valid and consistently used across similar tasks
- The `start + 48h` duration in the no_discharge window is correct as it matches the pattern used in other ICU tasks (e.g., hyponatremia, hypotension, elevated_creatinine) to cover both input and target windows
- The constraint `(None, 0)` for discharge events is properly implemented and consistent with other ICU tasks to prevent early discharges
The implementation aligns with similar tasks in:
- src/MEDS_DEV/tasks/criteria/abnormal_lab/in_icu/blood_chemistry/hyponatremia/first_24h.yaml
- src/MEDS_DEV/tasks/criteria/abnormal_lab/in_icu/vital/hypotension/first_24h.yaml
- And other ICU-based tasks following the same pattern
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash
# Description: Check for similar window configurations in other tasks
# Test: Search for similar window configurations to validate consistency
echo "Searching for similar window configurations..."
rg -t yaml "windows:.*(?:\n.*)+?input:.*(?:\n.*)+?target:" --multiline
Length of output: 21356
7-22: Verify predicate definitions in the codebase.
The composite predicate `discharge_or_death` looks correct, but let's verify if the required predicates are defined elsewhere in the codebase.
✅ Verification successful
Based on the search results, I can now generate the final response:
Required predicates are defined in MIMIC-IV dataset configuration
The base predicates (`icu_admission`, `icu_discharge`, `hospital_discharge`, `death`) are defined in `src/MEDS_DEV/datasets/MIMIC-IV/predicates.yaml` with proper code regex patterns:
- `icu_admission`: `^ICU_ADMISSION//.*`
- `icu_discharge`: `^ICU_DISCHARGE//.*`
- `hospital_discharge`: `^HOSPITAL_DISCHARGE//.*`
- `death`: (defined but pattern not shown in search results)
The composite predicate `discharge_or_death` correctly uses these base predicates in its expression.
The lab-specific predicates (`hemoglobin`, `abnormally_low_hemoglobin`) are intentionally marked with `???` as they need to be configured per dataset, which is the expected pattern seen across other similar lab criteria files (e.g., platelets, wbc, sodium, etc.).
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash
# Description: Check if predicates are defined in other YAML files
# Test 1: Search for predicate definitions in YAML files
echo "Searching for predicate definitions..."
rg -t yaml "predicates:.*(?:\n.*)+?(icu_admission|icu_discharge|hospital_discharge|death|hemoglobin):" --multiline
# Test 2: Search for hemoglobin-related configurations
echo "Searching for hemoglobin configurations..."
rg -t yaml "hemoglobin:.*(?:\n.*)+?code:" --multiline
Length of output: 30564
src/MEDS_DEV/tasks/criteria/abnormal_lab/in_icu/cbc/thrombocytopenia/first_24h.yaml (1)
12-13: Verify the discharge_or_death predicate logic.
The composite predicate includes `hospital_discharge`, which might be redundant since an ICU discharge would typically precede or coincide with a hospital discharge. This could lead to duplicate event handling.
✅ Verification successful
Let me analyze the results and generate a final response.
The discharge_or_death predicate implementation is consistent across the codebase
The search results show that the same `discharge_or_death` predicate expression `or(icu_discharge, death, hospital_discharge)` is consistently used across multiple task criteria files in the ICU context. This consistent implementation across different medical conditions (thrombocytopenia, leukocytosis, metabolic acidosis, etc.) suggests this is an intentional design pattern rather than an oversight.
While it might seem redundant at first glance, including both `icu_discharge` and `hospital_discharge` likely serves specific use cases:
- Capturing direct transfers from ICU to other facilities
- Handling cases where hospital discharge data might be available even when ICU discharge data is missing
- Ensuring completeness in different data collection scenarios
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash
# Search for other discharge_or_death predicates to verify consistency
rg -l "discharge_or_death" | xargs rg "expr: or\(.*discharge.*\)"
Length of output: 1173
src/MEDS_DEV/tasks/criteria/abnormal_lab/in_icu/blood_chemistry/metabolic_acidosis/first_24h.yaml (2)
12-13: LGTM! Well-structured composite predicate.
The `discharge_or_death` predicate correctly combines all relevant discharge/death events.
26-32: LGTM! Well-structured input and target windows.
The window configurations correctly implement the task requirements:
- Input window captures first 24h of ICU stay
- Target window requires at least one bicarbonate measurement
- Proper inclusive/exclusive bounds are set
Also applies to: 41-48
I have left some comments with clinical insights.
Yes, but I meant more of, can we define the
sounds perfect
@mandiehyewon, what do you think? Should we group them by typical lab panel name? By organ system? By just the name of the test? I'm not sure what would be best.
These checks help ensure all the code is in a consistent style; if you install |
I am not sure if we can do it explicitly. Maybe we can provide value_min or value_max with a comment to indicate units like:
abnormally_low_sodium:
  code: ???
  value_min: null
  value_max: 135 # mEq/L
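To illustrate how a predicate with `value_min` and `value_max` would classify a single measurement, here is a minimal Python sketch. `in_range` is a hypothetical helper, and treating both bounds as exclusive by default is an assumption chosen to match a "below 135 mEq/L" definition; the real inclusivity would come from the task configuration.

```python
from typing import Optional


def in_range(
    value: float,
    value_min: Optional[float] = None,
    value_max: Optional[float] = None,
    min_inclusive: bool = False,
    max_inclusive: bool = False,
) -> bool:
    """Return True when value lies between the bounds; None means unbounded."""
    if value_min is not None:
        if value < value_min or (value == value_min and not min_inclusive):
            return False
    if value_max is not None:
        if value > value_max or (value == value_max and not max_inclusive):
            return False
    return True


# abnormally_low_sodium: value_min = null, value_max = 135 (mEq/L).
low_sodium = in_range(130.0, value_min=None, value_max=135.0)
normal_sodium = in_range(140.0, value_min=None, value_max=135.0)
boundary_sodium = in_range(135.0, value_min=None, value_max=135.0)
```

With the exclusive default, a sodium value of exactly 135 mEq/L is not flagged as abnormal; passing `max_inclusive=True` would flip that boundary case.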
@Jwoo5 and @mandiehyewon -- what is the status of this PR; do we want to make other changes to the ranges, etc.? Should we close this PR temporarily until those are made, or is it ready to go in now?
To-dos for this PR would be:
If this list looks fine, I'll update this PR soon.
@Jwoo5 If this is where different reference ranges are needed based on static variables (like sex etc.), it should be supported on this branch: https://github.com/justin13601/ACES/tree/25_medsdev_static_derived_predicates (see justin13601/ACES#145)
@justin13601 Thanks!
Actionable comments posted: 8
🧹 Outside diff range and nitpick comments (12)
src/MEDS_DEV/tasks/criteria/abnormal_lab/vital/hypotension/first_24h.yaml (2)
1-6: Consider enhancing the description with the MAP threshold.
The description clearly outlines the task purpose and inclusion criteria. Consider adding the specific MAP threshold (< 65 mmHg) to make the definition more explicit and self-contained.
description: >-
This task predicts whether the patient will have low _MAP_ values (hypotension) within the next
24 hours, given the first 24 hours of their admission to the hospital. To be included,
1) patients must have at least one measurement for _MAP_ within the next 24 hours after their
admission to the hospital, 2) patients must not have shown low _MAP_ values previously, and
- 3) their length of stay in the hospital must be at least 48 hours.
+ 3) their length of stay in the hospital must be at least 48 hours. Low _MAP_ is defined as
+ values below 65 mmHg.
47-53: Consider more precise age calculation.
The 18-year calculation using 6570 days (365 * 18) doesn't account for leap years. For more precision, consider using 6574 days (accounts for ~4.5 leap years in 18 years).
- start: end - 6570d # 18 years = 365 * 18 days
+ start: end - 6574d # 18 years including leap years
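The size of the leap-year gap can be checked with a few lines of Python. This is a sketch for illustration only: `days_in_18_years` is a hypothetical helper, the birth dates are made up, and a February 29 birthday is mapped to March 1 as an arbitrary convention.

```python
from datetime import date


def days_in_18_years(birth: date) -> int:
    """Count the calendar days between a birth date and the 18th birthday."""
    try:
        eighteenth = birth.replace(year=birth.year + 18)
    except ValueError:
        # Feb 29 birthday: the year 18 years later is never a leap year,
        # so fall back to March 1 (an arbitrary convention for this sketch).
        eighteenth = date(birth.year + 18, 3, 1)
    return (eighteenth - birth).days


fixed_window = 18 * 365                            # the 6570d used in the window
born_jan_2000 = days_in_18_years(date(2000, 1, 1))  # spans five Feb 29ths
born_mar_2001 = days_in_18_years(date(2001, 3, 1))  # spans four Feb 29ths
```

For these two dates the true span is 6575 and 6574 days respectively, so a fixed 6570-day window is about four to five days shorter than 18 calendar years. No single day count is exact for every birth date, which is why the suggested replacements differ by a day.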
src/MEDS_DEV/tasks/criteria/abnormal_lab/cbc/leukocytosis/first_24h.yaml (1)
1-6: Consider enhancing the description with specific threshold values.
The description could be more precise by including the actual threshold for high WBC values (>11 K/uL) that's defined in the predicates section.
description: >-
This task predicts whether the patient will have high _WBC_ values (leukocytosis) within the next
24 hours, given the first 24 hours of their admission to the _hospital_. To be included,
1) patients must have at least one measurement for _WBC_ within the next 24 hours after their
- admission to the hospital, 2) patients must not have shown high _WBC_ values previously, and
+ admission to the hospital, 2) patients must not have shown high _WBC_ values (>11 K/uL) previously, and
3) their length of stay in the hospital must be at least 48 hours.
src/MEDS_DEV/tasks/criteria/abnormal_lab/blood_chemistry/hyponatremia/first_24h.yaml (1)
1-6: Remove markdown-style emphasis from the description.
The description uses markdown-style emphasis with underscores (e.g., `_sodium_`, `_hospital_`), which is unnecessary in YAML strings and might not render as expected.
- This task predicts whether the patient will have low _sodium_ values (hyponatremia) within the
- next 24 hours, given the first 24 hours of their admission to the _hospital_. To be included,
- 1) patients must have at least one measurement for _sodium_ within the next 24 hours after their
- admission to the hospital, 2) patients must not have shown low _sodium_ values previously, and
+ This task predicts whether the patient will have low sodium values (hyponatremia) within the
+ next 24 hours, given the first 24 hours of their admission to the hospital. To be included,
+ 1) patients must have at least one measurement for sodium within the next 24 hours after their
+ admission to the hospital, 2) patients must not have shown low sodium values previously, and
src/MEDS_DEV/tasks/criteria/abnormal_lab/cbc/anemia/first_24h.yaml (2)
1-6: Enhance description clarity and consistency.
Consider these improvements to the description:
- Use consistent emphasis for medical terms (e.g., `_hospital_`, `_hemoglobin_`)
- Consider adding the specific medical threshold (13 g/dL) in the description for clarity
description: >-
This task predicts whether the patient will have low _hemoglobin_ values (anemia) within the next
- 24 hours, given the first 24 hours of their admission to the _hospital_. To be included,
+ 24 hours, given the first 24 hours of their admission to the _hospital_. Anemia is defined as
+ _hemoglobin_ levels below 13 g/dL. To be included,
1) patients must have at least one measurement for _hemoglobin_ within the next 24 hours after
their admission to the hospital, 2) patients must not have shown low _hemoglobin_ values
previously, and 3) their length of stay in the hospital must be at least 48 hours.
47-54: Review age calculation method.
The age calculation using `6570d` (18 years * 365 days) doesn't account for leap years, which could affect edge cases.
Consider using a more precise calculation or documenting the simplification choice:
no_children:
- start: end - 6570d # 18 years = 365 * 18 days
+ start: end - 6575d # 18 years accounting for leap years (365.25 * 18)
end: trigger
src/MEDS_DEV/tasks/criteria/abnormal_lab/cbc/thrombocytopenia/first_24h.yaml (2)
1-6: Enhance description with specific threshold values.
Consider adding the specific threshold value (150 K/uL) in the description for clarity and immediate understanding of what constitutes "low platelets values".
description: >-
This task predicts whether the patient will have low _platelets_ values (thrombocytopenia) within
the next 24 hours, given the first 24 hours of their admission to the hospital. To be included,
1) patients must have at least one measurement for _platelets_ within the next 24 hours after
- their admission to the hospital, 2) patients must not have shown low _platelets_ values
+ their admission to the hospital, 2) patients must not have shown low _platelets_ values (< 150 K/uL)
previously, and 3) their length of stay in the hospital must be at least 48 hours.
48-49: Verify age calculation for pediatric exclusion.
The calculation `6570d` for 18 years doesn't account for leap years. A closer approximation would be `6574d` (18 * 365.25 = 6574.5 days, rounded down).
- start: end - 6570d # 18 years = 365 * 18 days
+ start: end - 6574d # 18 years = 365.25 * 18 days
src/MEDS_DEV/tasks/criteria/abnormal_lab/blood_chemistry/elevated_creatinine/first_24h.yaml (1)
1-6: Consider standardizing markdown emphasis formatting.
While the description is clear and comprehensive, consider using consistent markdown formatting. The terms _creatinine_ and _hospital_ are emphasized, but this isn't maintained throughout (e.g., "admission to the _hospital_" vs "admission to the hospital").
src/MEDS_DEV/tasks/criteria/abnormal_lab/blood_chemistry/metabolic_acidosis/first_24h.yaml (2)
1-6: Consider enhancing the description with the threshold value.
The description clearly outlines the purpose and inclusion criteria. Consider adding the specific threshold value (22 mEq/L) in the description to make it immediately clear what constitutes "low bicarbonate values".
28-34: Clarify the input window start time.
The input window's start time is set to `null`. Consider explicitly setting this to `trigger` for better clarity, since the description indicates we're looking at the first 24 hours from admission.
src/MEDS_DEV/datasets/MIMIC-IV/predicates.yaml (1)
22-167: Consider adding upper bounds for critical values.
While the current ranges correctly implement the PR objectives, consider adding upper bounds for critical values that could indicate medical emergencies. For example:
- Creatinine > 3.0 mg/dL might indicate severe kidney injury
- Sodium < 120 mEq/L (severe hyponatremia) or > 155 mEq/L (severe hypernatremia)
- WBC > 30 K/uL might indicate severe infection or leukemia
Would you like me to propose additional predicates for these critical ranges?
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
📒 Files selected for processing (8)
- src/MEDS_DEV/datasets/MIMIC-IV/predicates.yaml (1 hunks)
- src/MEDS_DEV/tasks/criteria/abnormal_lab/blood_chemistry/elevated_creatinine/first_24h.yaml (1 hunks)
- src/MEDS_DEV/tasks/criteria/abnormal_lab/blood_chemistry/hyponatremia/first_24h.yaml (1 hunks)
- src/MEDS_DEV/tasks/criteria/abnormal_lab/blood_chemistry/metabolic_acidosis/first_24h.yaml (1 hunks)
- src/MEDS_DEV/tasks/criteria/abnormal_lab/cbc/anemia/first_24h.yaml (1 hunks)
- src/MEDS_DEV/tasks/criteria/abnormal_lab/cbc/leukocytosis/first_24h.yaml (1 hunks)
- src/MEDS_DEV/tasks/criteria/abnormal_lab/cbc/thrombocytopenia/first_24h.yaml (1 hunks)
- src/MEDS_DEV/tasks/criteria/abnormal_lab/vital/hypotension/first_24h.yaml (1 hunks)
🔇 Additional comments (25)
src/MEDS_DEV/tasks/criteria/abnormal_lab/vital/hypotension/first_24h.yaml (3)
25-25: LGTM!
The trigger is appropriately set to hospital_admission, which aligns with the task's objective of predicting outcomes within 24 hours of hospital admission.
8-24: Define missing predicates.
Several critical predicates are undefined (marked with "???"):
- hospital_admission
- hospital_discharge
- birth
- death
- map
These predicates need to be defined in the MIMIC predicates.yaml file to ensure proper task validation.
Let's verify if these predicates are defined in the MIMIC predicates file:
27-61: Verify window configurations.
The window configurations look well-structured with appropriate conditions. However, let's verify:
- The 48-hour minimum stay requirement is correctly enforced through the no_discharge window
- The target window properly aligns with the task description
Let's check for similar window configurations in other tasks:
✅ Verification successful
Window configurations are consistent and correctly structured
The verification confirms that:
- The 48-hour minimum stay requirement is consistently enforced across all similar tasks through the `no_discharge` window (trigger to trigger + 48h)
- The target window configuration properly aligns with other abnormal lab tasks, following the standard pattern:
  - Starts right after input window (input.end)
  - 24-hour duration (start + 24h)
  - Consistent inclusivity settings (start_inclusive: False, end_inclusive: True)
  - Appropriate label and measurement requirements
The window configurations in the hypotension task follow the established pattern used across all abnormal lab tasks in the codebase.
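The window layout verified above can be sketched as a small timeline calculation. This is an illustration only: `task_windows` is a hypothetical helper, the admission time is made up, the input window is assumed to end 24 hours after the trigger, and its start is left unbounded.

```python
from datetime import datetime, timedelta


def task_windows(trigger: datetime) -> dict:
    """Compute (start, end) boundaries for the three windows discussed above."""
    input_end = trigger + timedelta(hours=24)
    return {
        # Input: everything up to 24h after the trigger (start unbounded).
        "input": (None, input_end),
        # no_discharge: trigger to trigger + 48h, covering input and target.
        "no_discharge": (trigger, trigger + timedelta(hours=48)),
        # Target: starts at input.end and lasts 24h.
        "target": (input_end, input_end + timedelta(hours=24)),
    }


# Hypothetical hospital admission time.
windows = task_windows(datetime(2020, 1, 1, 8, 0))
target_end = windows["target"][1]
no_discharge_end = windows["no_discharge"][1]
```

Under these assumptions the target window and the `no_discharge` window end at the same instant, which is what lets the 48-hour constraint act as a minimum length-of-stay requirement for the whole prediction period.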
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash
# Description: Check window configurations in other similar tasks
# Look for similar window patterns to validate consistency
rg -A 10 "windows:" "src/MEDS_DEV/tasks/criteria/abnormal_lab/**/*.yaml"
Length of output: 163
Script:
#!/bin/bash
# Let's first find the correct path structure for abnormal lab tasks
fd "yaml$" --type f
Length of output: 1054
Script:
#!/bin/bash
# Now let's check window configurations in similar abnormal lab tasks
rg -A 15 "windows:" "src/MEDS_DEV/tasks/criteria/abnormal_lab/blood_chemistry/elevated_creatinine/first_24h.yaml" "src/MEDS_DEV/tasks/criteria/abnormal_lab/blood_chemistry/hyponatremia/first_24h.yaml" "src/MEDS_DEV/tasks/criteria/abnormal_lab/cbc/anemia/first_24h.yaml" "src/MEDS_DEV/tasks/criteria/abnormal_lab/vital/hypotension/first_24h.yaml"
Length of output: 7162
Script:
#!/bin/bash
# Let's check the target window configurations in these files
rg -A 10 " target:" "src/MEDS_DEV/tasks/criteria/abnormal_lab/blood_chemistry/elevated_creatinine/first_24h.yaml" "src/MEDS_DEV/tasks/criteria/abnormal_lab/blood_chemistry/hyponatremia/first_24h.yaml" "src/MEDS_DEV/tasks/criteria/abnormal_lab/cbc/anemia/first_24h.yaml" "src/MEDS_DEV/tasks/criteria/abnormal_lab/vital/hypotension/first_24h.yaml"
Length of output: 3580
src/MEDS_DEV/tasks/criteria/abnormal_lab/cbc/leukocytosis/first_24h.yaml (4)
16-24: Verify WBC threshold with clinical guidelines.
The WBC threshold (>11 K/uL) for leukocytosis is defined, but we should verify this matches standard clinical guidelines.
Would you like me to help create a README.md file documenting the clinical rationale for this threshold, as mentioned in the PR objectives?
54-61: LGTM: Target window configuration is well-defined.
The target window configuration correctly implements the requirements:
- 24-hour prediction window
- Requires at least one WBC measurement
- Uses abnormally_high_wbc for labeling
25-25: Ensure hospital_admission trigger is properly defined.
The task trigger depends on the undefined hospital_admission predicate. This needs to be addressed along with the other missing predicate definitions.
#!/bin/bash
# Check if the trigger predicate is used consistently
rg -t yaml "trigger:\s*hospital_admission" -A 5
8-15: Define missing core predicates.
The following predicates are undefined and marked with "???":
- hospital_admission
- hospital_discharge
- birth
- death
These are critical predicates that need to be defined for the task to function.
Let's check if these predicates are defined elsewhere:
src/MEDS_DEV/tasks/criteria/abnormal_lab/blood_chemistry/hyponatremia/first_24h.yaml (3)
27-61: Verify window sequence and constraints.
The window configuration appears logically sound but please verify:
- The 48-hour window in no_discharge correctly covers both input and target windows (24h + 24h)
- The sequence of window evaluations ensures proper patient filtering
- The requirement for at least one sodium measurement (sodium: (1, None)) is sufficient for reliable prediction
#!/bin/bash
# Search for similar window configurations in other lab prediction tasks
rg -t yaml "windows:.*input:.*target:" -A 20
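The (min, max) pairs that appear in the has clauses, such as sodium: (1, None) or discharge_or_death: (None, 0), can be read as count limits where None leaves one side unbounded. The following is a minimal sketch of that reading, assuming both limits are inclusive; the helper name count_in_bounds is hypothetical and does not come from the task definitions or any library.

```python
from typing import Optional, Tuple

# (min_count, max_count); None means "no limit on this side" (assumed semantics).
Bounds = Tuple[Optional[int], Optional[int]]


def count_in_bounds(count: int, bounds: Bounds) -> bool:
    """Return True if an event count satisfies the inclusive (min, max) bounds."""
    lower, upper = bounds
    if lower is not None and count < lower:
        return False
    if upper is not None and count > upper:
        return False
    return True


at_least_one = (1, None)   # e.g. sodium: (1, None)
none_allowed = (None, 0)   # e.g. discharge_or_death: (None, 0)

print(count_in_bounds(3, at_least_one))   # three sodium measurements
print(count_in_bounds(0, at_least_one))   # no sodium measurement
print(count_in_bounds(0, none_allowed))   # no discharge or death event
print(count_in_bounds(1, none_allowed))   # one discharge or death event
```

Under this reading, a window that requires sodium: (1, None) rejects any patient with zero sodium measurements in that window, however long the stay.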
16-23: Verify sodium threshold with medical standards.
The threshold for hyponatremia is set to < 135 mEq/L. Please verify this aligns with standard medical definitions and MIMIC-IV data distribution.
#!/bin/bash
# Search for other sodium-related definitions for consistency
rg -t yaml "sodium.*value_max:.*135" --no-filename
8-15: Define missing core predicates.
The following critical predicates are undefined:
- hospital_admission
- hospital_discharge
- birth
- death
These predicates are essential for the task's functionality and should be properly defined.
Let's check if these predicates are defined elsewhere:
src/MEDS_DEV/tasks/criteria/abnormal_lab/cbc/anemia/first_24h.yaml (4)
27-46: Verify discharge window configuration.
The no_discharge window configuration:
- Uses start + 48h but references its own start - should it reference trigger instead?
- Has hospital_admission count of 0 which might conflict with the trigger
Consider revising to:
no_discharge:
start: trigger
- end: start + 48h
+ end: trigger + 48h
start_inclusive: False
end_inclusive: True
has:
- hospital_admission: (None, 0)
discharge_or_death: (None, 0)
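If start in the end expression refers to the window's own start, and that start is set to the trigger, then start + 48h and trigger + 48h resolve to the same timestamp. The sketch below only illustrates that arithmetic under this assumed reading of the syntax; the function resolve_end and the example admission time are hypothetical.

```python
from datetime import datetime, timedelta


def resolve_end(anchor: datetime, hours: int) -> datetime:
    """Resolve an expression of the form '<anchor> + <hours>h' to a timestamp."""
    return anchor + timedelta(hours=hours)


# Hypothetical admission time used as the trigger.
trigger = datetime(2024, 1, 1, 8, 0)

# "start: trigger" means the window's own start equals the trigger.
window_start = trigger

end_from_start = resolve_end(window_start, 48)   # end: start + 48h
end_from_trigger = resolve_end(trigger, 48)      # end: trigger + 48h

print(end_from_start == end_from_trigger)
```

Under that assumption the suggested change affects readability only; it would matter if the window's start were ever moved away from the trigger.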
54-61: Validate target window configuration.
The target window configuration looks correct, but verify:
- The requirement for at least one hemoglobin measurement
- The use of abnormally_low_hemoglobin as the label
#!/bin/bash
# Check for similar window configurations in other lab prediction tasks
rg -A 5 "target:\s+start: input.end\s+end: start \+ 24h" "src/MEDS_DEV/tasks/criteria/"
16-24: Verify hemoglobin threshold and predicate definitions.
The abnormal hemoglobin threshold (13 g/dL) needs verification:
- The PR objectives mention this might need reevaluation due to high positive sample yield
- The hemoglobin and abnormally_low_hemoglobin predicates are undefined
#!/bin/bash
# Check if hemoglobin predicates are defined in MIMIC predicates
rg -l "hemoglobin:|abnormally_low_hemoglobin:" "src/MEDS_DEV/datasets/MIMIC-IV/predicates.yaml"
8-15: Define missing core predicates.
The following predicates are undefined:
- hospital_admission
- hospital_discharge
- birth
- death
These are critical for task functionality.
src/MEDS_DEV/tasks/criteria/abnormal_lab/cbc/thrombocytopenia/first_24h.yaml (3)
1-61: Overall task structure looks good.
The task definition aligns well with the PR objectives and follows the expected structure for abnormal lab prediction tasks. The thrombocytopenia threshold (150 K/uL) matches the specified criterion from the PR description.
9-12: Define missing predicates marked with "???".
The following predicates need to be defined:
- hospital_admission
- hospital_discharge
- birth
- death
- platelets.code
- abnormally_low_platelets.code
These predicates are crucial for the task's functionality. They should match the corresponding definitions in MIMIC-IV's predicates.yaml file.
Let's verify if these predicates are defined in the MIMIC predicates file:
#!/bin/bash
# Search for predicate definitions in MIMIC predicates file
rg -A 2 "hospital_admission:|hospital_discharge:|birth:|death:|platelets:" "src/MEDS_DEV/datasets/MIMIC-IV/predicates.yaml"
Also applies to: 18-18, 20-20
27-61: Consider edge cases in window definitions.
The current window configuration might need additional validation for edge cases:
- Input window: Consider validating if there are any platelet measurements in the first 24h
- Target window: Consider adding a maximum number of measurements requirement
Let's check if similar tasks have these validations:
src/MEDS_DEV/tasks/criteria/abnormal_lab/blood_chemistry/elevated_creatinine/first_24h.yaml (2)
16-24: Verify creatinine threshold value.
Please confirm that 1.3 mg/dL is the correct threshold for abnormally high creatinine. Additionally, the creatinine predicate's code is undefined (marked with ???).
Could you provide medical references or documentation supporting this threshold value?
27-61: Verify window boundary logic.
Please verify the following window configurations:
- Input window (0h to 24h): Ensures no previous elevated creatinine
- No discharge window (0h to 48h): Prevents including discharged patients
- Target window (24h to 48h): Requires at least one creatinine measurement
The windows appear to overlap correctly, but we should verify that this matches the intended clinical workflow.
Consider adding a diagram or documentation to visualize these time windows and their relationships.
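As a rough stand-in for such a diagram, the three windows listed above can be written as intervals relative to the trigger. This is a minimal sketch, assuming every window uses the start-exclusive, end-inclusive convention; the Window class, build_windows function, and example trigger time are hypothetical names introduced only for illustration.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Dict


@dataclass(frozen=True)
class Window:
    name: str
    start: datetime
    end: datetime

    def contains(self, t: datetime) -> bool:
        # Assumed convention: start exclusive, end inclusive.
        return self.start < t <= self.end


def build_windows(trigger: datetime) -> Dict[str, Window]:
    """Build the three windows as offsets (in hours) from the trigger."""
    def at(hours: int) -> datetime:
        return trigger + timedelta(hours=hours)

    return {
        "input": Window("input", at(0), at(24)),
        "no_discharge": Window("no_discharge", at(0), at(48)),
        "target": Window("target", at(24), at(48)),
    }


trigger = datetime(2024, 1, 1, 0, 0)  # hypothetical admission time
windows = build_windows(trigger)

# A measurement taken 30 hours after admission.
measurement_time = trigger + timedelta(hours=30)
for name, window in windows.items():
    print(name, window.contains(measurement_time))
```

A measurement at hour 30 falls in the no_discharge and target windows but not in the input window, which matches the intended split between observed data and the prediction period.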
src/MEDS_DEV/tasks/criteria/abnormal_lab/blood_chemistry/metabolic_acidosis/first_24h.yaml (4)
20-23: Verify the bicarbonate threshold value.
The threshold for abnormally low bicarbonate is set to < 22 mEq/L. Please verify this aligns with standard medical guidelines for metabolic acidosis diagnosis.
25-25: LGTM!
The trigger is appropriately set to hospital_admission, which aligns with the task's objective of prediction within the first 24 hours of admission.
54-61: LGTM!
The target window configuration correctly implements:
- 24-hour prediction window
- Required bicarbonate measurement
- Appropriate label definition
9-12: Critical predicates are undefined.
Several essential predicates are marked with "???". These need to be implemented:
- icu_admission
- icu_discharge
- hospital_discharge
- death
- bicarbonate
- abnormally_low_bicarbonate
Let's verify if these predicates are defined in the MIMIC predicates file:
Also applies to: 18-19
✅ Verification successful
All referenced predicates are properly defined in MIMIC-IV predicates file
The verification shows that all predicates marked with "???" in the YAML file are actually defined in src/MEDS_DEV/datasets/MIMIC-IV/predicates.yaml:
- icu_admission: defined with regex ^ICU_ADMISSION//.*
- icu_discharge: defined with regex ^ICU_DISCHARGE//.*
- hospital_discharge: defined with regex ^HOSPITAL_DISCHARGE//.*
- death: defined with code MEDS_DEATH
- bicarbonate: defined as an OR expression of bicarbonate_1 and bicarbonate_2
- abnormally_low_bicarbonate: defined as an OR expression of abnormally_low_bicarbonate_1 and abnormally_low_bicarbonate_2
The "???" markers appear to be placeholders in a template file that references these already-defined predicates.
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash
# Search for predicate definitions in MIMIC predicates file
rg -A 5 "icu_admission:|icu_discharge:|hospital_discharge:|death:|bicarbonate:" "src/MEDS_DEV/datasets/MIMIC-IV/predicates.yaml"
Length of output: 745
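The predicate kinds named in this verification (exact code, regex, and OR of sub-predicates) can be sketched as small composable functions over event codes. This is only an illustration of the idea, not how the task tooling evaluates predicates; the helper names are hypothetical, and the bicarbonate code strings and the ICU_ADMISSION//MICU example are assumptions modeled on the LAB//<id>//<unit> pattern.

```python
import re
from typing import Callable

Predicate = Callable[[str], bool]


def code_is(code: str) -> Predicate:
    """Predicate that matches one exact event code."""
    return lambda event_code: event_code == code


def code_matches(pattern: str) -> Predicate:
    """Predicate that matches event codes against a regular expression."""
    compiled = re.compile(pattern)
    return lambda event_code: compiled.match(event_code) is not None


def any_of(*predicates: Predicate) -> Predicate:
    """OR expression: true if any sub-predicate matches."""
    return lambda event_code: any(p(event_code) for p in predicates)


icu_admission = code_matches(r"^ICU_ADMISSION//.*")
death = code_is("MEDS_DEATH")

# Assumed code strings; the real definitions live in predicates.yaml.
bicarbonate_1 = code_is("LAB//50882//mEq/L")
bicarbonate_2 = code_is("LAB//227443//mEq/L")
bicarbonate = any_of(bicarbonate_1, bicarbonate_2)

print(icu_admission("ICU_ADMISSION//MICU"))  # hypothetical event code
print(bicarbonate("LAB//227443//mEq/L"))
print(bicarbonate("LAB//50912//mg/dL"))
```

Splitting a lab into numbered sub-predicates and joining them with an OR keeps each source code explicit while letting tasks refer to a single name.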
src/MEDS_DEV/datasets/MIMIC-IV/predicates.yaml (2)
17-21: LGTM: Birth predicate follows existing pattern.
The addition of the birth predicate is clean and consistent with the file's structure.
143-167: LGTM: MAP predicates are well-structured.
The mean arterial pressure predicates are correctly implemented with:
- Comprehensive coverage using three code variants
- Correct threshold of 65 mmHg
- Consistent naming and structure
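For reference, the cutoffs quoted across these review comments can be collected into one lookup. The sketch below assumes strict inequalities and uses hypothetical short keys for each lab; it summarizes the values discussed in the review and is not the predicates.yaml implementation.

```python
import operator

# lab key -> (comparison, threshold, unit), as quoted in the review comments.
# Strict inequalities and the key names are assumptions made for this sketch.
THRESHOLDS = {
    "creatinine": (operator.gt, 1.3, "mg/dL"),    # elevated creatinine
    "sodium": (operator.lt, 135, "mEq/L"),        # hyponatremia
    "bicarbonate": (operator.lt, 22, "mEq/L"),    # metabolic acidosis
    "hemoglobin": (operator.lt, 13, "g/dL"),      # anemia
    "wbc": (operator.gt, 11, "K/uL"),             # leukocytosis
    "platelets": (operator.lt, 150, "K/uL"),      # thrombocytopenia
    "map": (operator.lt, 65, "mmHg"),             # hypotension
}


def is_abnormal(lab: str, value: float) -> bool:
    """Return True if the value crosses the abnormal cutoff for the given lab."""
    compare, threshold, _unit = THRESHOLDS[lab]
    return compare(value, threshold)


print(is_abnormal("sodium", 132))      # below 135 mEq/L
print(is_abnormal("creatinine", 1.0))  # not above 1.3 mg/dL
print(is_abnormal("map", 65))          # exactly at the cutoff, not below it
```

Keeping direction and cutoff together makes it easier to spot a threshold, such as the hemoglobin one, that may need revisiting.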
Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>
Actionable comments posted: 0
🧹 Outside diff range and nitpick comments (1)
src/MEDS_DEV/datasets/MIMIC-IV/predicates.yaml (1)
22-167: Consider adding section comments
The file would benefit from section comments to clearly delineate between different measurement groups (Blood Chemistry, CBC, Vital Signs).
Add section comments like this:
+ # Blood Chemistry Measurements
creatinine_1:
code: LAB//50912//mg/dL
...
+ # Complete Blood Count (CBC) Measurements
hemoglobin_1:
code: LAB//220228//g/dl
...
+ # Vital Signs Measurements
map_1:
code: LAB//220052//mmHg
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
📒 Files selected for processing (1)
- src/MEDS_DEV/datasets/MIMIC-IV/predicates.yaml (1 hunks)
🔇 Additional comments (7)
src/MEDS_DEV/datasets/MIMIC-IV/predicates.yaml (7)
17-21: LGTM: Birth predicate addition
The birth predicate follows the established pattern and is correctly positioned.
22-40: LGTM: Creatinine predicates
The implementation correctly defines elevated creatinine (>1.3 mg/dL) using both lab codes (50912, 52546) with consistent units.
41-66: LGTM: Sodium predicates
The implementation correctly defines hyponatremia (<135 mEq/L) using all three lab codes (220645, 50983, 52623) with consistent units.
67-85: LGTM: Bicarbonate predicates
The implementation correctly defines metabolic acidosis (<22 mEq/L) using both lab codes (227443, 50882) with consistent units.
105-142: LGTM: WBC and platelets predicates
The implementation correctly defines:
- Leukocytosis (>11 K/uL) using both WBC codes (220546, 51300)
- Thrombocytopenia (<150 K/uL) using both platelets codes (227457, 51265)
All predicates use consistent units.
143-167: LGTM: MAP predicates
The implementation correctly defines hypotension (<65 mmHg) using all three MAP codes (220052, 220181, 225312) with consistent units.
86-104: Verify hemoglobin threshold impact
The PR objectives note that the current threshold (<13 g/dL) may need reevaluation due to high positive sample yield in MIMIC-IV.
Let's analyze the distribution of hemoglobin values:
After I updated the task definitions, the cohort numbers (positive / total) for each task on MIMIC-IV became:
It seems that excluding patients who previously showed abnormal values for the target lab has a critical effect on the number of positive samples for some tasks. Do these numbers look fine?
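To make the effect of that exclusion concrete, here is a minimal sketch of the labeling rule as described in this thread: drop patients with an abnormal value in the input window, require at least one measurement in the target window, and label by any abnormal target value. The function names are hypothetical, and the four patients and their hemoglobin values are made up purely for illustration; they are not MIMIC-IV data.

```python
from typing import Callable, List, Optional, Tuple

Measurement = Tuple[float, float]  # (hours since trigger, lab value)


def label_patient(
    measurements: List[Measurement],
    is_abnormal: Callable[[float], bool],
    exclude_prior_abnormal: bool = True,
) -> Optional[bool]:
    """Return the label, or None if the patient is excluded from the cohort."""
    input_values = [v for t, v in measurements if 0 <= t <= 24]
    target_values = [v for t, v in measurements if 24 < t <= 48]
    if not target_values:
        return None  # no measurement in the target window
    if exclude_prior_abnormal and any(is_abnormal(v) for v in input_values):
        return None  # already abnormal before the prediction time
    return any(is_abnormal(v) for v in target_values)


def cohort_counts(cohort, is_abnormal, exclude_prior_abnormal):
    """Return (positive, total) over the patients that remain in the cohort."""
    labels = [label_patient(m, is_abnormal, exclude_prior_abnormal) for m in cohort]
    kept = [label for label in labels if label is not None]
    return sum(kept), len(kept)


def low_hemoglobin(value: float) -> bool:
    return value < 13  # cutoff discussed in this PR


# Made-up hemoglobin series (g/dL) for four hypothetical patients.
toy_cohort = [
    [(2, 11.0), (30, 10.5)],  # abnormal in the input window
    [(3, 14.0), (28, 12.5)],  # becomes abnormal in the target window
    [(5, 14.5), (40, 14.2)],  # stays normal
    [(4, 13.5)],              # no target-window measurement
]

with_exclusion = cohort_counts(toy_cohort, low_hemoglobin, True)
without_exclusion = cohort_counts(toy_cohort, low_hemoglobin, False)
print(with_exclusion, without_exclusion)
```

In this toy cohort the exclusion removes one of the two positive patients, which is the kind of shift in positive / total counts raised above.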
LGTM
The age range and inclusion of all hospital admissions seem appropriate. I can also provide the normal ranges or abnormal cutoffs for other cohorts if needed.
Since we aim to account for certain medical abnormalities that may or may not be present across various conditions (e.g., hyponatremia could occur in cases like SIADH or diarrhea), as @mmcdermott suggested, it’s best to use task names rather than the approach I previously recommended (CBC, Blood Chemistry).
Each task predicts whether a patient will have an abnormal value for the target lab item in the 24 hours following the first 24 hours of their ICU admission. The cohort includes ICU patients (1) whose length of stay is at least 24 hours and (2) who have at least one measurement of the target lab item within the target window.
The added tasks are as follows:
There are two questions:
Summary by CodeRabbit
New Features
Bug Fixes