Follow-up on the run_metadata changes (#3193)

* Initial commit, nuking all metadata responses and seeing what breaks
* Removed last remnant of LazyLoader
* Reintroducing the lazy loaders.
* Add LazyRunMetadataResponse to EntrypointFunctionDefinition
* Test for lazy loaders works now
* Fixed tests, reformatted
* Use updated template
* Auto-update of Starter template
* Updated more templates
* Fixed failing test
* Fixed step run schemas
* Auto-update of E2E template
* Auto-update of NLP template
* Fixed tests, removed additional .value access
* Further fixing
* Fixed linting issues
* Reformatted
* Linted, formatted and tested again
* Typing
* Maybe fix everything
* Apply some feedback
* new operation
* new log_metadata function
* changes to the base filters
* new filters
* adding log_metadata to __all__
* checkpoint with float casting
* adding tests
* final touches and formatting
* formatting
* moved the utils
* modified log metadata function
* checkpoint
* deprecating the old functions
* linting and final fixes
* better error message
* fixing the client method
* better error message
* consistent creation
* adjusting tests
* linting
* changes for step metadata
* more test adjustments
* testing unit tests
* linting
* fixing more tests
* fixing more tests
* more test fixes
* fixing the test
* fixing per comments
* added validation, constant error message
* linting
* new changes
* second checkpoint
* fixing revisions
* adding overlap to remove warnings
* complete docs changes
* adding a parameter to control the related entity behaviour
* fixing the toc
* fixed the description
* docstring
* spellcheck
* metadata creation during artifact version creation
* allowing artifact metadata with name for external artifact
* update the template versions
* Auto-update of LLM Finetuning template
* Auto-update of Starter template
* Auto-update of E2E template
* Auto-update of NLP template
* fixing the migration script
* formatting
* redirects
* minor fixes
* working pipelines again
* small fix
* working checkpoint
* fixes, linting, docstrings
* fixing unit tests
* docs updates 1
* docs update 2
* fixing integration tests
* spellcheck
* formatting
* Auto-update of E2E template
* docs changes
* review comments
* added the batch rbac call
* added a validator to check the name of the keys
* small adjustments
* base schema added
* formatting
* new functionalities
* breaking circular imports
* spellchecker
* other minor fixes
* covering the uncovered case
* adjusting tests
* fixing the quickstart again
* minor change
* going back to publisher step id
* updating github refs
* Auto-update of LLM Finetuning template
* Auto-update of Starter template
* fixing tests
* updated docs
* Auto-update of E2E template
* Auto-update of NLP template
* formatting
* review comments
* adding some tests in
* review comments
* Update src/zenml/zen_stores/migrations/versions/cc269488e5a9_separate_run_metadata.py
  Co-authored-by: Michael Schuster <[email protected]>
* Update src/zenml/zen_stores/migrations/versions/cc269488e5a9_separate_run_metadata.py
  Co-authored-by: Michael Schuster <[email protected]>
* Update src/zenml/zen_stores/migrations/versions/cc269488e5a9_separate_run_metadata.py
  Co-authored-by: Michael Schuster <[email protected]>
* Update src/zenml/zen_stores/migrations/versions/cc269488e5a9_separate_run_metadata.py
  Co-authored-by: Michael Schuster <[email protected]>
* Update src/zenml/zen_stores/migrations/versions/cc269488e5a9_separate_run_metadata.py
  Co-authored-by: Michael Schuster <[email protected]>
* changed assert to value error
* fixed the alembic head
* changed the interaction with the models
* trimmed down
* small bugfix
* naming recommendations
* linting
* fixing the test

---------

Co-authored-by: AlexejPenner <[email protected]>
Co-authored-by: Andrei Vishniakov <[email protected]>
Co-authored-by: GitHub Actions <[email protected]>
Co-authored-by: Michael Schuster <[email protected]>
Co-authored-by: Michael Schuster <[email protected]>
zenml-io · Nov 29, 2024 · fbbfc29
1 parent 0ccb1fd
commit fbbfc29

Showing 57 changed files with 1,482 additions and 566 deletions.
diff --git a/.gitbook.yaml b/.gitbook.yaml
@@ -18,6 +18,7 @@ redirects:
   how-to/setting-up-a-project-repository/best-practices: how-to/project-setup-and-management/setting-up-a-project-repository/set-up-repository.md
   getting-started/zenml-pro/system-architectures: getting-started/system-architectures.md
   how-to/build-pipelines/name-your-pipeline-and-runs: how-to/pipeline-development/build-pipelines/name-your-pipeline-runs.md
+  how-to/model-management-metrics/track-metrics-metadata/attach-metadata-to-steps: how-to/model-management-metrics/track-metrics-metadata/attach-metadata-to-a-step.md
 
   # ZenML Pro
   getting-started/zenml-pro/user-management: getting-started/zenml-pro/core-concepts.md

diff --git a/.github/workflows/update-templates-to-examples.yml b/.github/workflows/update-templates-to-examples.yml
@@ -46,7 +46,7 @@ jobs:
           python-version: ${{ inputs.python-version }}
           stack-name: local
           ref-zenml: ${{ github.ref }}
-          ref-template: 2024.11.20  # Make sure it is aligned with ZENML_PROJECT_TEMPLATES from src/zenml/cli/base.py
+          ref-template: 2024.11.28  # Make sure it is aligned with ZENML_PROJECT_TEMPLATES from src/zenml/cli/base.py
       - name: Clean-up
         run: |
           rm -rf ./local_checkout
@@ -118,7 +118,7 @@ jobs:
           python-version: ${{ inputs.python-version }}
           stack-name: local
           ref-zenml: ${{ github.ref }}
-          ref-template: 2024.10.30  # Make sure it is aligned with ZENML_PROJECT_TEMPLATES from src/zenml/cli/base.py
+          ref-template: 2024.11.28  # Make sure it is aligned with ZENML_PROJECT_TEMPLATES from src/zenml/cli/base.py
       - name: Clean-up
         run: |
           rm -rf ./local_checkout
@@ -189,7 +189,7 @@ jobs:
           python-version: ${{ inputs.python-version }}
           stack-name: local
           ref-zenml: ${{ github.ref }}
-          ref-template: 2024.10.30  # Make sure it is aligned with ZENML_PROJECT_TEMPLATES from src/zenml/cli/base.py
+          ref-template: 2024.11.28  # Make sure it is aligned with ZENML_PROJECT_TEMPLATES from src/zenml/cli/base.py
       - name: Clean-up
         run: |
           rm -rf ./local_checkout
@@ -261,7 +261,7 @@ jobs:
         with:
           python-version: ${{ inputs.python-version }}
           ref-zenml: ${{ github.ref }}
-          ref-template: 2024.11.08  # Make sure it is aligned with ZENML_PROJECT_TEMPLATES from src/zenml/cli/base.py
+          ref-template: 2024.11.28  # Make sure it is aligned with ZENML_PROJECT_TEMPLATES from src/zenml/cli/base.py
       - name: Clean-up
         run: |
           rm -rf ./local_checkout

diff --git a/docs/book/how-to/model-management-metrics/track-metrics-metadata/README.md b/docs/book/how-to/model-management-metrics/track-metrics-metadata/README.md
@@ -5,10 +5,44 @@ description: Tracking metrics and metadata
 
 # Track metrics and metadata
 
-Logging metrics and metadata is standardized in ZenML. The most common pattern is to use the `log_xxx` methods, e.g.:
+ZenML provides a unified way to log and manage metrics and metadata through 
+the `log_metadata` function. This versatile function allows you to log 
+metadata across various entities like models, artifacts, steps, and runs 
+through a single interface. Additionally, you can choose whether to 
+automatically log the same metadata for the related entities.
 
-* Log metadata to a [model](attach-metadata-to-a-model.md): `log_model_metadata`
-* Log metadata to an [artifact](attach-metadata-to-an-artifact.md): `log_artifact_metadata`
-* Log metadata to a [step](attach-metadata-to-steps.md): `log_step_metadata`
+### The most basic use-case
+
+You can use the `log_metadata` function within a step:
+
+```python
+from zenml import step, log_metadata
+
+@step
+def my_step() -> ...:
+    log_metadata(metadata={"accuracy": 0.91})
+    ...
+```
+
+This will log the `accuracy` for the step, its pipeline run, and, if provided, 
+its model version.
+
+### Additional use-cases
+
+The `log_metadata` function also supports various use-cases by allowing you to 
+specify the target entity (e.g., model, artifact, step, or run) with flexible 
+parameters. You can learn more about these use-cases in the following pages:
+
+- [Log metadata to a step](attach-metadata-to-a-step.md)
+- [Log metadata to a run](attach-metadata-to-a-run.md)
+- [Log metadata to an artifact](attach-metadata-to-an-artifact.md)
+- [Log metadata to a model](attach-metadata-to-a-model.md)
+
+{% hint style="warning" %}
+The older methods for logging metadata to specific entities, such as 
+`log_model_metadata`, `log_artifact_metadata`, and `log_step_metadata`, are 
+now deprecated. It is recommended to use `log_metadata` for all future 
+implementations.
+{% endhint %}
 
 <figure><img src="https://static.scarf.sh/a.png?x-pxid=f0b4f458-0a54-4fcd-aa95-d5ee424815bc" alt="ZenML Scarf"><figcaption></figcaption></figure>
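
Reviewer note on the README change above: the new text says that a bare `log_metadata` call inside a step logs to the step, its pipeline run, and, if provided, its model version. The sketch below only illustrates that fan-out rule in plain Python. `StepContextSketch` and `related_targets` are hypothetical names invented for this note; they are not part of ZenML and do not reflect how ZenML implements it.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class StepContextSketch:
    """Hypothetical stand-in for the context a step runs in."""

    step_name: str
    run_name: str
    model_version: Optional[str] = None  # None when no model is configured


def related_targets(ctx: StepContextSketch) -> List[Tuple[str, str]]:
    """Return the (entity_type, identifier) pairs that would receive metadata."""
    # The step and its run are always targets.
    targets = [("step", ctx.step_name), ("run", ctx.run_name)]
    # The model version is only a target when one is provided.
    if ctx.model_version is not None:
        targets.append(("model_version", ctx.model_version))
    return targets
```

Calling `related_targets(StepContextSketch("my_step", "run_1"))` yields only the step and run entries, while passing a `model_version` adds a third entry for the model version.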
diff --git a/...o/model-management-metrics/track-metrics-metadata/attach-metadata-to-a-model.md b/...o/model-management-metrics/track-metrics-metadata/attach-metadata-to-a-model.md
@@ -1,62 +1,93 @@
 ---
-description: >-
-  Attach any metadata as key-value pairs to your models for future reference and
-  auditability.
+description: Learn how to attach metadata to a model.
 ---
 
 # Attach metadata to a model
 
+ZenML allows you to log metadata for models, which provides additional context
+that goes beyond individual artifact details. Model metadata can represent
+high-level insights, such as evaluation results, deployment information,
+or customer-specific details, making it easier to manage and interpret
+the model's usage and performance across different versions.
+
 ## Logging Metadata for Models
 
-While artifact metadata is specific to individual outputs of steps, model metadata encapsulates broader and more general information that spans across multiple artifacts. For example, evaluation results or the name of a customer for whom the model is intended could be logged with the model.
+To log metadata for a model, use the `log_metadata` function. This function
+lets you attach key-value metadata to a model, which can include metrics and
+other JSON-serializable values, including custom ZenML types such as `Uri`,
+`Path`, and `StorageSize`.
 
 Here's an example of logging metadata for a model:
 
 ```python
-from zenml import step, log_model_metadata, ArtifactConfig, get_step_context
 from typing import Annotated
+
 import pandas as pd
-from sklearn.ensemble import RandomForestClassifier
 from sklearn.base import ClassifierMixin
+from sklearn.ensemble import RandomForestClassifier
+
+from zenml import step, log_metadata, ArtifactConfig, get_step_context
+
 
 @step
-def train_model(dataset: pd.DataFrame) -> Annotated[ClassifierMixin, ArtifactConfig(name="sklearn_classifier")]:
-    """Train a model"""
-    # Fit the model and compute metrics
+def train_model(dataset: pd.DataFrame) -> Annotated[
+    ClassifierMixin, ArtifactConfig(name="sklearn_classifier")
+]:
+    """Train a model and log model metadata."""
     classifier = RandomForestClassifier().fit(dataset)
     accuracy, precision, recall = ...
-
-    # Log metadata for the model
-    # This associates the metadata with the ZenML model, not the artifact
-    log_model_metadata(
+
+    log_metadata(
         metadata={
             "evaluation_metrics": {
                 "accuracy": accuracy,
                 "precision": precision,
                 "recall": recall
             }
         },
-        # Omitted model_name will use the model in the current context
-        model_name="zenml_model_name",
-        # Omitted model_version will default to 'latest'
-        model_version="zenml_model_version",
+        infer_model=True,
     )
+
     return classifier
 ```
 
-In this example, the metadata is associated with the model rather than the specific classifier artifact. This is particularly useful when the metadata reflects an aggregation or summary of various steps and artifacts in the pipeline.
+In this example, the metadata is associated with the model rather than the
+specific classifier artifact. This is particularly useful when the metadata
+reflects an aggregation or summary of various steps and artifacts in the
+pipeline.
+
+
+### Selecting Models with `log_metadata`
+
+When using `log_metadata`, ZenML provides flexible options for attaching 
+metadata to model versions:
+
+1. **Using `infer_model`**: If used within a step, ZenML will use the step
+   context to infer the model it is using and attach the metadata to it.
+2. **Model Name and Version Provided**: If both a model name and version are
+   provided, ZenML will use these to identify and attach metadata to the
+   specific model version.
+3. **Model Version ID Provided**: If a model version ID is directly provided,
+   ZenML will use it to fetch and attach the metadata to that specific model
+   version.
 
 ## Fetching logged metadata
 
-Once metadata has been logged in an [artifact](attach-metadata-to-an-artifact.md), model, or [step](attach-metadata-to-steps.md), we can easily fetch the metadata with the ZenML Client:
+Once metadata has been attached to a model, it can be retrieved for inspection
+or analysis using the ZenML Client.
 
 ```python
 from zenml.client import Client
 
 client = Client()
 model = client.get_model_version("my_model", "my_version")
 
-print(model.run_metadata["metadata_key"].value)
+print(model.run_metadata["metadata_key"])
 ```
 
+{% hint style="info" %}
+When you are fetching metadata using a specific key, the returned value will 
+always reflect the latest entry.
+{% endhint %}
+
 <figure><img src="https://static.scarf.sh/a.png?x-pxid=f0b4f458-0a54-4fcd-aa95-d5ee424815bc" alt="ZenML Scarf"><figcaption></figcaption></figure>
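
Reviewer note on "Selecting Models with `log_metadata`" above: the three options can be sketched as a small resolver. This is a plain-Python illustration, not ZenML code. `describe_model_target` is a hypothetical helper, the `model_version_id` parameter name is an assumption (the doc only says "Model Version ID"), and the order in which the options are checked is also an assumption, since the doc does not state a precedence.

```python
from typing import Optional
from uuid import UUID


def describe_model_target(
    infer_model: bool = False,
    model_name: Optional[str] = None,
    model_version: Optional[str] = None,
    model_version_id: Optional[UUID] = None,  # assumed parameter name
) -> str:
    """Describe which model version the metadata would be attached to."""
    # Option 3 in the doc: a model version ID is given directly.
    if model_version_id is not None:
        return f"model version with ID {model_version_id}"
    # Option 2 in the doc: both a name and a version must be given.
    if model_name is not None and model_version is not None:
        return f"model '{model_name}', version '{model_version}'"
    # Option 1 in the doc: infer the model from the step context.
    if infer_model:
        return "model inferred from the step context"
    raise ValueError(
        "Provide infer_model=True, a model name and version, "
        "or a model version ID."
    )
```

A name without a version (or the reverse) matches none of the three options, so the sketch raises a `ValueError` rather than guessing.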
diff --git a/...-to/model-management-metrics/track-metrics-metadata/attach-metadata-to-a-run.md b/...-to/model-management-metrics/track-metrics-metadata/attach-metadata-to-a-run.md
@@ -0,0 +1,87 @@
+---
+description: Learn how to attach metadata to a run.
+---
+
+# Attach Metadata to a Run
+
+In ZenML, you can log metadata directly to a pipeline run, either during or 
+after execution, using the `log_metadata` function. This function allows you 
+to attach a dictionary of key-value pairs as metadata to a pipeline run, 
+with values that can be any JSON-serializable data type, including ZenML 
+custom types like `Uri`, `Path`, `DType`, and `StorageSize`.
+
+## Logging Metadata Within a Run
+
+If you are logging metadata from within a step that’s part of a pipeline run, 
+calling `log_metadata` will attach the specified metadata to the current 
+pipeline run, with each metadata key following the `step_name::metadata_key` 
+pattern. This allows you to use the same metadata key from different steps 
+while the run is still executing.
+
+```python
+from typing import Annotated
+
+import pandas as pd
+from sklearn.base import ClassifierMixin
+from sklearn.ensemble import RandomForestClassifier
+
+from zenml import step, log_metadata, ArtifactConfig
+
+
+@step
+def train_model(dataset: pd.DataFrame) -> Annotated[
+    ClassifierMixin,
+    ArtifactConfig(name="sklearn_classifier", is_model_artifact=True)
+]:
+    """Train a model and log run-level metadata."""
+    classifier = RandomForestClassifier().fit(dataset)
+    accuracy, precision, recall = ...
+
+    # Log metadata at the run level
+    log_metadata(
+        metadata={
+            "run_metrics": {
+                "accuracy": accuracy,
+                "precision": precision,
+                "recall": recall
+            }
+        }
+    )
+    return classifier
+```
+
+## Manually Logging Metadata to a Pipeline Run
+
+You can also attach metadata to a specific pipeline run without needing a step, 
+using identifiers like the run ID. This is useful when logging information or 
+metrics that were calculated post-execution.
+
+```python
+from zenml import log_metadata
+
+log_metadata(
+    metadata={"post_run_info": {"some_metric": 5.0}},
+    run_id_name_or_prefix="run_id_name_or_prefix"
+)
+```
+
+## Fetching Logged Metadata
+
+Once metadata has been logged in a pipeline run, you can retrieve it using 
+the ZenML Client:
+
+```python
+from zenml.client import Client
+
+client = Client()
+run = client.get_pipeline_run("run_id_name_or_prefix")
+
+print(run.run_metadata["metadata_key"])
+```
+
+{% hint style="info" %}
+When you are fetching metadata using a specific key, the returned value will 
+always reflect the latest entry.
+{% endhint %}
+
+<figure><img src="https://static.scarf.sh/a.png?x-pxid=f0b4f458-0a54-4fcd-aa95-d5ee424815bc" alt="ZenML Scarf"><figcaption></figcaption></figure>
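
Reviewer note on the new run page above: two behaviours it describes are the `step_name::metadata_key` key pattern for metadata logged from inside a step, and lookups by key returning the latest entry. The sketch below models both with an in-memory class. `RunMetadataSketch` is a hypothetical name used only for illustration and says nothing about how ZenML stores metadata.

```python
from typing import Any, Dict, List


class RunMetadataSketch:
    """Hypothetical in-memory model of the metadata attached to one run."""

    def __init__(self) -> None:
        # Each key keeps every logged value, in logging order.
        self._entries: Dict[str, List[Any]] = {}

    def log_from_step(self, step_name: str, metadata: Dict[str, Any]) -> None:
        """Store values under the `step_name::metadata_key` pattern."""
        for key, value in metadata.items():
            self._entries.setdefault(f"{step_name}::{key}", []).append(value)

    def __getitem__(self, key: str) -> Any:
        """Return the latest entry for a key (KeyError if it was never logged)."""
        return self._entries[key][-1]


run_metadata = RunMetadataSketch()
run_metadata.log_from_step("trainer", {"accuracy": 0.90})
run_metadata.log_from_step("evaluator", {"accuracy": 0.85})
run_metadata.log_from_step("trainer", {"accuracy": 0.93})
```

Here `run_metadata["trainer::accuracy"]` returns the most recent trainer value, and the evaluator's entry is untouched because the step-name prefix keeps the two keys apart.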