From 039b4210e92c979baac316694c4468d0c209c9b6 Mon Sep 17 00:00:00 2001
From: kamilest
Date: Fri, 11 Oct 2024 15:14:09 +0100
Subject: [PATCH] Add prediction schema description.

---
 README.md | 33 +++++++++++++++++++++++++++++++--
 1 file changed, 31 insertions(+), 2 deletions(-)

diff --git a/README.md b/README.md
index 0d9c209..f218b57 100644
--- a/README.md
+++ b/README.md
@@ -111,11 +111,40 @@ conforming to MEDS binary classification prediction schema:
 ./MEDS-DEV/src/MEDS_DEV/helpers/generate_predictions.sh $MEDS_ROOT_DIR $TASK_NAME
 ```
 
+In order to work with the evaluation package (see the next section),
+the model's outputs must conform to the _prediction schema_:
+
+```python
+prediction = pa.schema(
+    [
+        ("subject_id", pa.int64()),
+        ("prediction_time", pa.timestamp("us")),
+        ("boolean_value", pa.bool_()),
+        ("predicted_boolean_value", pa.bool_()),
+        ("predicted_boolean_probability", pa.float64()),
+    ]
+)
+
+Prediction = TypedDict(
+    "Prediction",
+    {
+        "subject_id": int,
+        "prediction_time": datetime.datetime,
+        "boolean_value": bool,
+        "predicted_boolean_value": bool,
+        "predicted_boolean_probability": bool,
+    },
+    total=False,
+)
+```
+
+TODO: make the predicted values/probabilities optional and evaluate metrics based on availability of these
+values
+
 ### Evaluate the model
 
 You can use the `meds-evaluation` package by running `meds-evaluation-cli` and providing the path to
-predictions
-dataframe as well as the output directory. For example,
+predictions dataframe as well as the output directory. For example,
 
 ```bash
 meds-evaluation-cli \