# Vertex AI Model Deployer

[Vertex AI](https://cloud.google.com/vertex-ai) provides managed infrastructure for deploying machine learning models at scale. The Vertex AI Model Deployer in ZenML allows you to deploy models to Vertex AI endpoints, providing a scalable and managed solution for model serving.

## When to use it?

You should use the Vertex AI Model Deployer when:

* You're already using Google Cloud Platform (GCP) and want to leverage its native ML infrastructure
* You need enterprise-grade model serving capabilities with autoscaling
* You want a fully managed solution for hosting ML models
* You need to handle high-throughput prediction requests
* You want to deploy models with GPU acceleration
* You need to monitor and track your model deployments

This is particularly useful in the following scenarios:

* Deploying models to production with high-availability requirements
* Serving models that need GPU acceleration
* Handling varying prediction workloads with autoscaling
* Integrating model serving with other GCP services

{% hint style="warning" %}
The Vertex AI Model Deployer requires a Vertex AI Model Registry to be present in your stack. Make sure you have configured both components properly.
{% endhint %}

## How to deploy it?

The Vertex AI Model Deployer is provided by the GCP ZenML integration. First, install the integration:

```shell
zenml integration install gcp -y
```

### Authentication and Service Connector Configuration

The Vertex AI Model Deployer requires proper GCP authentication. The recommended way to configure this is through ZenML's Service Connector functionality:

```shell
# Register the service connector with a service account key
zenml service-connector register vertex_deployer_connector \
    --type gcp \
    --auth-method=service-account \
    --project_id=<PROJECT_ID> \
    --service_account_json=@<PATH_TO_SERVICE_ACCOUNT_KEY_JSON> \
    --resource-type gcp-generic

# Register the model deployer
zenml model-deployer register vertex_deployer \
    --flavor=vertex \
    --location=us-central1

# Connect the model deployer to the service connector
zenml model-deployer connect vertex_deployer --connector vertex_deployer_connector
```

{% hint style="info" %}
The service account needs the following permissions:

- `Vertex AI User` role for deploying models
- `Vertex AI Service Agent` role for managing model endpoints
{% endhint %}
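
With the deployer registered and connected, it still needs to be part of your active stack together with a Vertex AI Model Registry. A minimal sketch follows; the stack name and the artifact store/orchestrator names are placeholders, and the `-r`/`-d` shorthands for the model registry and model deployer components are assumptions that may vary across ZenML versions:

```shell
# Register a stack containing both Vertex AI components
zenml stack register vertex_stack \
    -o <ORCHESTRATOR> \
    -a <GCS_ARTIFACT_STORE> \
    -r vertex_registry \
    -d vertex_deployer

# Make it the active stack
zenml stack set vertex_stack
```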

## How to use it

### Deploy a model in a pipeline

Here's an example of how to use the Vertex AI Model Deployer in a ZenML pipeline:

```python
from typing_extensions import Annotated
from zenml import ArtifactConfig, get_step_context, step
from zenml.client import Client
from zenml.integrations.gcp.services.vertex_deployment import (
    VertexAIDeploymentConfig,
    VertexDeploymentService,
)

@step(enable_cache=False)
def model_deployer(
    model_registry_uri: str,
) -> Annotated[
    VertexDeploymentService,
    ArtifactConfig(name="vertex_deployment", is_deployment_artifact=True),
]:
    """Model deployer step."""
    zenml_client = Client()
    current_model = get_step_context().model
    model_deployer = zenml_client.active_stack.model_deployer

    # Configure the deployment
    vertex_deployment_config = VertexAIDeploymentConfig(
        location="europe-west1",
        name="zenml-vertex-quickstart",
        model_name=current_model.name,
        description="Vertex AI model deployment example",
        model_id=model_registry_uri,
        machine_type="n1-standard-4",  # Optional: specify machine type
        min_replica_count=1,  # Optional: minimum number of replicas
        max_replica_count=3,  # Optional: maximum number of replicas
    )

    # Deploy the model
    service = model_deployer.deploy_model(
        config=vertex_deployment_config,
        service_type=VertexDeploymentService.SERVICE_TYPE,
    )

    return service
```
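
Because the step reads `get_step_context().model`, it must run inside a pipeline that has a ZenML Model attached. A minimal sketch of the wiring, assuming a recent ZenML version where `Model` is exported from the top-level package; `model_register` stands in for a registration step (such as the one shown in the Vertex AI Model Registry docs) that returns the registered model's URI:

```python
from zenml import Model, pipeline

@pipeline(model=Model(name="my_model"))
def deployment_pipeline():
    # Hypothetical registration step that returns the
    # Vertex AI Model Registry URI of the trained model
    model_registry_uri = model_register()
    model_deployer(model_registry_uri=model_registry_uri)

deployment_pipeline()
```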

### Configuration Options

The Vertex AI Model Deployer accepts a rich set of configuration options through `VertexAIDeploymentConfig`:

* Basic configuration:
  * `location`: GCP region for deployment (e.g., "us-central1")
  * `name`: Name for the deployment endpoint
  * `model_name`: Name of the model being deployed
  * `model_id`: Model ID from the Vertex AI Model Registry
* Infrastructure configuration:
  * `machine_type`: Type of machine to use (e.g., "n1-standard-4")
  * `accelerator_type`: GPU accelerator type, if needed
  * `accelerator_count`: Number of GPUs per replica
  * `min_replica_count`: Minimum number of serving replicas
  * `max_replica_count`: Maximum number of serving replicas
* Advanced configuration:
  * `service_account`: Custom service account for the deployment
  * `network`: VPC network configuration
  * `encryption_spec_key_name`: Customer-managed encryption key
  * `enable_access_logging`: Enable detailed access logging
  * `explanation_metadata`: Model explanation configuration
  * `autoscaling_target_cpu_utilization`: Target CPU utilization for autoscaling
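
For example, a GPU-backed deployment with autoscaling might look like the following sketch. It uses only the fields listed above; the accelerator type string follows Vertex AI's naming (`NVIDIA_TESLA_T4` is one of its GPU types), the model ID is a placeholder, and availability depends on region:

```python
from zenml.integrations.gcp.services.vertex_deployment import (
    VertexAIDeploymentConfig,
)

gpu_deployment_config = VertexAIDeploymentConfig(
    location="us-central1",
    name="zenml-vertex-gpu",
    model_name="my_model",
    model_id="<MODEL_REGISTRY_URI>",  # placeholder for the registered model
    machine_type="n1-standard-8",
    accelerator_type="NVIDIA_TESLA_T4",  # one GPU per replica
    accelerator_count=1,
    min_replica_count=1,
    max_replica_count=5,
    autoscaling_target_cpu_utilization=70,  # scale out at 70% CPU
)
```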

### Running Predictions

Once a model is deployed, you can run predictions using the service:

```python
from zenml.integrations.gcp.model_deployers import VertexModelDeployer

# Get the active model deployer from the stack
model_deployer = VertexModelDeployer.get_active_model_deployer()

# Find the service started by the deployment pipeline
services = model_deployer.find_model_server(
    pipeline_name="deployment_pipeline",
    pipeline_step_name="model_deployer",
    model_name="my_model",
)

if services:
    service = services[0]
    if service.is_running:
        # Run prediction
        prediction = service.predict(
            instances=[{"feature1": 1.0, "feature2": 2.0}]
        )
        print(f"Prediction: {prediction}")
```
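
Deployed services can also be inspected and cleaned up from the command line, which helps keep endpoint costs under control. A sketch using ZenML's generic model-deployer CLI; exact subcommands may vary by ZenML version:

```shell
# List model servers managed by the active model deployer
zenml model-deployer models list

# Delete a model server once it is no longer needed
zenml model-deployer models delete <UUID>
```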

### Limitations and Considerations

1. **Stack requirements**:
   - Requires a Vertex AI Model Registry in the stack
   - All stack components must be non-local
2. **Authentication**:
   - Requires proper GCP credentials with Vertex AI permissions
   - Best practice is to use service connectors for authentication
3. **Costs**:
   - Vertex AI endpoints incur costs based on machine type and uptime
   - Consider using autoscaling to optimize costs
4. **Region availability**:
   - Service availability depends on Vertex AI regional availability
   - The model and endpoint must be in the same region

Check out the [SDK docs](https://sdkdocs.zenml.io) for more detailed information about the implementation.

# Vertex AI Model Registry

[Vertex AI](https://cloud.google.com/vertex-ai) is Google Cloud's unified ML platform that helps you build, deploy, and scale ML models. The Vertex AI Model Registry is a centralized repository for managing your ML models throughout their lifecycle. ZenML's Vertex AI Model Registry integration allows you to register, version, and manage your models using Vertex AI's infrastructure.

## When would you want to use it?

You should consider using the Vertex AI Model Registry when:

* You're already using Google Cloud Platform (GCP) and want to leverage its native ML infrastructure
* You need enterprise-grade model management capabilities with fine-grained access control
* You want to track model lineage and metadata in a centralized location
* You're building ML pipelines that need to integrate with other Vertex AI services
* You need to manage model deployment across different GCP environments

This is particularly useful in the following scenarios:

* Building production ML pipelines that need to integrate with GCP services
* Managing multiple versions of models across development and production environments
* Tracking model artifacts and metadata in a centralized location
* Deploying models to Vertex AI endpoints for serving

{% hint style="warning" %}
Important: The Vertex AI Model Registry implementation only supports the model version interface, not the model interface. This means you cannot register, delete, or update models directly; you can only work with model versions. Operations like `register_model()`, `delete_model()`, and `update_model()` are not supported.
{% endhint %}

## How do you deploy it?

The Vertex AI Model Registry flavor is provided by the GCP ZenML integration. First, install the integration:

```shell
zenml integration install gcp -y
```

### Authentication and Service Connector Configuration

The Vertex AI Model Registry requires proper GCP authentication. The recommended way to configure this is through ZenML's Service Connector functionality. You have several options for authentication:

1. Using a GCP Service Connector with a dedicated service account (recommended):

```shell
# Register the service connector with a service account key
zenml service-connector register vertex_registry_connector \
    --type gcp \
    --auth-method=service-account \
    --project_id=<PROJECT_ID> \
    --service_account_json=@<PATH_TO_SERVICE_ACCOUNT_KEY_JSON> \
    --resource-type gcp-generic

# Register the model registry
zenml model-registry register vertex_registry \
    --flavor=vertex \
    --location=us-central1

# Connect the model registry to the service connector
zenml model-registry connect vertex_registry --connector vertex_registry_connector
```

2. Using local gcloud credentials:

```shell
# Register the model registry using local gcloud auth
zenml model-registry register vertex_registry \
    --flavor=vertex \
    --location=us-central1
```
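
When relying on local credentials, make sure Application Default Credentials are set up on the machine running ZenML, for example:

```shell
# Authenticate and create Application Default Credentials
gcloud auth application-default login

# Optionally pin the default project
gcloud config set project <PROJECT_ID>
```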

{% hint style="info" %}
The service account used needs the following permissions:

- `Vertex AI User` role for creating and managing model versions
- `Storage Object Viewer` role if accessing models stored in Google Cloud Storage
{% endhint %}
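
These roles can be granted with `gcloud`. A sketch, where `roles/aiplatform.user` and `roles/storage.objectViewer` are the standard identifiers for the roles above and the service account email is a placeholder:

```shell
gcloud projects add-iam-policy-binding <PROJECT_ID> \
    --member="serviceAccount:<SERVICE_ACCOUNT_EMAIL>" \
    --role="roles/aiplatform.user"

gcloud projects add-iam-policy-binding <PROJECT_ID> \
    --member="serviceAccount:<SERVICE_ACCOUNT_EMAIL>" \
    --role="roles/storage.objectViewer"
```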

## How do you use it?

### Register models inside a pipeline

Here's an example of how to use the Vertex AI Model Registry in your ZenML pipeline using the provided model registration step:

```python
from typing_extensions import Annotated
from zenml import ArtifactConfig, get_step_context, step
from zenml.client import Client
from zenml.logger import get_logger

logger = get_logger(__name__)

@step(enable_cache=False)
def model_register() -> Annotated[str, ArtifactConfig(name="model_registry_uri")]:
    """Model registration step."""
    # Get the current model from the step context
    current_model = get_step_context().model

    client = Client()
    model_registry = client.active_stack.model_registry
    model_version = model_registry.register_model_version(
        name=current_model.name,
        version=str(current_model.version),
        model_source_uri=current_model.get_model_artifact("sklearn_classifier").uri,
        description="ZenML model registered after promotion",
    )
    logger.info(
        f"Model version {model_version.version} registered in Model Registry"
    )

    return model_version.model_source_uri
```

### Configuration Options

The Vertex AI Model Registry accepts the following configuration options:

* `location`: The GCP region where the model registry will be created (e.g., "us-central1")
* `project_id`: (Optional) The GCP project ID; if not specified, the default project is used
* `credentials`: (Optional) GCP credentials configuration
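
These options can be supplied as flags when registering the component. A sketch, assuming the usual ZenML pattern of passing flavor configuration attributes on the command line:

```shell
zenml model-registry register vertex_registry \
    --flavor=vertex \
    --location=us-central1 \
    --project_id=<PROJECT_ID>
```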

### Working with Model Versions

Since the Vertex AI Model Registry only supports version-level operations, here's how to work with model versions:

```shell
# List all model versions
zenml model-registry models list-versions <model-name>

# Get details of a specific model version
zenml model-registry models get-version <model-name> -v <version>

# Delete a model version
zenml model-registry models delete-version <model-name> -v <version>
```
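
The same operations are available from Python through the registry's version interface. A sketch, assuming the standard ZenML `BaseModelRegistry` methods and a placeholder model name:

```python
from zenml.client import Client

model_registry = Client().active_stack.model_registry

# List all versions of a registered model
versions = model_registry.list_model_versions(name="my_model")

# Inspect a specific version
version = model_registry.get_model_version(name="my_model", version="1")
print(version.model_source_uri)

# Delete a version that is no longer needed
model_registry.delete_model_version(name="my_model", version="1")
```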

### Key Differences from MLflow Model Registry

Unlike the MLflow Model Registry, the Vertex AI implementation has some important differences:

1. **Version-only interface**: Vertex AI only supports model version operations. You cannot register, delete, or update models directly, only their versions.
2. **Authentication**: Uses GCP service connectors for authentication, like other Vertex AI services in ZenML.
3. **Staging levels**: Vertex AI doesn't have built-in staging levels (such as Production or Staging); these are handled through metadata.
4. **Default container images**: Vertex AI requires a serving container image URI, which defaults to the scikit-learn prediction container if not specified.
5. **Managed service**: As a fully managed service, you don't need to manage infrastructure, but you do need valid GCP credentials.

### Limitations

Based on the implementation, there are some limitations to be aware of:

1. The `register_model()`, `update_model()`, and `delete_model()` methods are not implemented, as Vertex AI only supports registering model versions
2. Model stage transitions (Production, Staging, etc.) are not natively supported
3. Models must have a serving container image URI specified, or the default scikit-learn image is used
4. All registered models are automatically labeled with `managed_by="zenml"` for tracking purposes

Check out the [SDK docs](https://sdkdocs.zenml.io/latest/integration_code_docs/integrations-gcp/#zenml.integrations.gcp.model_registry) to see more about the interface and implementation.