From 039b4210e92c979baac316694c4468d0c209c9b6 Mon Sep 17 00:00:00 2001
From: kamilest <stankeviciute.kamile@gmail.com>
Date: Fri, 11 Oct 2024 15:14:09 +0100
Subject: [PATCH] Add prediction schema description.

---
 README.md | 33 +++++++++++++++++++++++++++++++--
 1 file changed, 31 insertions(+), 2 deletions(-)

diff --git a/README.md b/README.md
index 0d9c209..f218b57 100644
--- a/README.md
+++ b/README.md
@@ -111,11 +111,40 @@ conforming to MEDS binary classification prediction schema:
 ./MEDS-DEV/src/MEDS_DEV/helpers/generate_predictions.sh $MEDS_ROOT_DIR $TASK_NAME
 ```
 
+In order to work with the evaluation package (see the next section),
+the model's outputs must conform to the _prediction schema_:
+
+```python
+prediction = pa.schema(
+    [
+        ("subject_id", pa.int64()),
+        ("prediction_time", pa.timestamp("us")),
+        ("boolean_value", pa.bool_()),
+        ("predicted_boolean_value", pa.bool_()),
+        ("predicted_boolean_probability", pa.float64()),
+    ]
+)
+
+Prediction = TypedDict(
+    "Prediction",
+    {
+        "subject_id": int,
+        "prediction_time": datetime.datetime,
+        "boolean_value": bool,
+        "predicted_boolean_value": bool,
+        "predicted_boolean_probability": float,
+    },
+    total=False,
+)
+```
+
+TODO: make the predicted values/probabilities optional and evaluate metrics based on availability of these
+values
+
 ### Evaluate the model
 
 You can use the `meds-evaluation` package by running `meds-evaluation-cli` and providing the path to
-predictions
-dataframe as well as the output directory. For example,
+the predictions dataframe as well as the output directory. For example,
 
 ```bash
 meds-evaluation-cli \