# Vertex AI Model Deployer

[Vertex AI](https://cloud.google.com/vertex-ai) provides managed infrastructure for deploying machine learning models at scale. The Vertex AI Model Deployer in ZenML allows you to deploy models to Vertex AI endpoints, providing a scalable and managed solution for model serving.

## When to use it?

You should use the Vertex AI Model Deployer when:

* You're already using Google Cloud Platform (GCP) and want to leverage its native ML infrastructure
* You need enterprise-grade model serving capabilities with autoscaling
* You want a fully managed solution for hosting ML models
* You need to handle high-throughput prediction requests
* You want to deploy models with GPU acceleration
* You need to monitor and track your model deployments

This is particularly useful in the following scenarios:

* Deploying models to production with high-availability requirements
* Serving models that need GPU acceleration
* Handling varying prediction workloads with autoscaling
* Integrating model serving with other GCP services

{% hint style="warning" %}
The Vertex AI Model Deployer requires a Vertex AI Model Registry to be present in your stack. Make sure you have configured both components properly.
{% endhint %}

## How to deploy it?

The Vertex AI Model Deployer is provided by the GCP ZenML integration. First, install the integration:

```shell
zenml integration install gcp -y
```

### Authentication and Service Connector Configuration

The Vertex AI Model Deployer requires proper GCP authentication. The recommended way to configure this is through ZenML's Service Connector functionality:

```shell
# Register the service connector with a service account key
zenml service-connector register vertex_deployer_connector \
    --type gcp \
    --auth-method=service-account \
    --project_id=<PROJECT_ID> \
    --service_account_json=@<PATH_TO_SERVICE_ACCOUNT_KEY_JSON> \
    --resource-type gcp-generic

# Register the model deployer
zenml model-deployer register vertex_deployer \
    --flavor=vertex \
    --location=us-central1

# Connect the model deployer to the service connector
zenml model-deployer connect vertex_deployer --connector vertex_deployer_connector
```

{% hint style="info" %}
The service account needs the following permissions:

- `Vertex AI User` role for deploying models
- `Vertex AI Service Agent` role for managing model endpoints
{% endhint %}
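
With the deployer registered and connected, it still needs to be part of your active stack together with a Vertex AI Model Registry. A minimal sketch follows; the stack name and the artifact store/orchestrator names are placeholders, and the `-r`/`-d` shorthands for the model registry and model deployer components are assumptions that may vary across ZenML versions:

```shell
# Register a stack containing both Vertex AI components
zenml stack register vertex_stack \
    -o <ORCHESTRATOR> \
    -a <GCS_ARTIFACT_STORE> \
    -r vertex_registry \
    -d vertex_deployer

# Make it the active stack
zenml stack set vertex_stack
```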

## How to use it

### Deploy a model in a pipeline

Here's an example of how to use the Vertex AI Model Deployer in a ZenML pipeline:

```python
from typing_extensions import Annotated
from zenml import ArtifactConfig, get_step_context, step
from zenml.client import Client
from zenml.integrations.gcp.services.vertex_deployment import (
    VertexAIDeploymentConfig,
    VertexDeploymentService,
)

@step(enable_cache=False)
def model_deployer(
    model_registry_uri: str,
) -> Annotated[
    VertexDeploymentService,
    ArtifactConfig(name="vertex_deployment", is_deployment_artifact=True),
]:
    """Model deployer step."""
    zenml_client = Client()
    current_model = get_step_context().model
    model_deployer = zenml_client.active_stack.model_deployer

    # Configure the deployment
    vertex_deployment_config = VertexAIDeploymentConfig(
        location="europe-west1",
        name="zenml-vertex-quickstart",
        model_name=current_model.name,
        description="Vertex AI model deployment example",
        model_id=model_registry_uri,
        machine_type="n1-standard-4",  # Optional: specify machine type
        min_replica_count=1,  # Optional: minimum number of replicas
        max_replica_count=3,  # Optional: maximum number of replicas
    )

    # Deploy the model
    service = model_deployer.deploy_model(
        config=vertex_deployment_config,
        service_type=VertexDeploymentService.SERVICE_TYPE,
    )

    return service
```
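
Because the step reads `get_step_context().model`, it must run inside a pipeline that has a ZenML Model attached. A minimal sketch of the wiring, assuming a recent ZenML version where `Model` is exported from the top-level package; `model_register` stands in for a registration step (such as the one shown in the Vertex AI Model Registry docs) that returns the registered model's URI:

```python
from zenml import Model, pipeline

@pipeline(model=Model(name="my_model"))
def deployment_pipeline():
    # Hypothetical registration step that returns the
    # Vertex AI Model Registry URI of the trained model
    model_registry_uri = model_register()
    model_deployer(model_registry_uri=model_registry_uri)

deployment_pipeline()
```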

### Configuration Options

The Vertex AI Model Deployer accepts a rich set of configuration options through `VertexAIDeploymentConfig`:

* Basic configuration:
  * `location`: GCP region for deployment (e.g., "us-central1")
  * `name`: Name for the deployment endpoint
  * `model_name`: Name of the model being deployed
  * `model_id`: Model ID from the Vertex AI Model Registry
* Infrastructure configuration:
  * `machine_type`: Type of machine to use (e.g., "n1-standard-4")
  * `accelerator_type`: GPU accelerator type, if needed
  * `accelerator_count`: Number of GPUs per replica
  * `min_replica_count`: Minimum number of serving replicas
  * `max_replica_count`: Maximum number of serving replicas
* Advanced configuration:
  * `service_account`: Custom service account for the deployment
  * `network`: VPC network configuration
  * `encryption_spec_key_name`: Customer-managed encryption key
  * `enable_access_logging`: Enable detailed access logging
  * `explanation_metadata`: Model explanation configuration
  * `autoscaling_target_cpu_utilization`: Target CPU utilization for autoscaling
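
For example, a GPU-backed deployment with autoscaling might look like the following sketch. It uses only the fields listed above; the accelerator type string follows Vertex AI's naming (`NVIDIA_TESLA_T4` is one of its GPU types), the model ID is a placeholder, and availability depends on region:

```python
from zenml.integrations.gcp.services.vertex_deployment import (
    VertexAIDeploymentConfig,
)

gpu_deployment_config = VertexAIDeploymentConfig(
    location="us-central1",
    name="zenml-vertex-gpu",
    model_name="my_model",
    model_id="<MODEL_REGISTRY_URI>",  # placeholder for the registered model
    machine_type="n1-standard-8",
    accelerator_type="NVIDIA_TESLA_T4",  # one GPU per replica
    accelerator_count=1,
    min_replica_count=1,
    max_replica_count=5,
    autoscaling_target_cpu_utilization=70,  # scale out at 70% CPU
)
```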

### Running Predictions

Once a model is deployed, you can run predictions using the service:

```python
from zenml.integrations.gcp.model_deployers import VertexModelDeployer

# Get the active model deployer from the stack
model_deployer = VertexModelDeployer.get_active_model_deployer()

# Find the service started by the deployment pipeline
services = model_deployer.find_model_server(
    pipeline_name="deployment_pipeline",
    pipeline_step_name="model_deployer",
    model_name="my_model",
)

if services:
    service = services[0]
    if service.is_running:
        # Run prediction
        prediction = service.predict(
            instances=[{"feature1": 1.0, "feature2": 2.0}]
        )
        print(f"Prediction: {prediction}")
```
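
Deployed services can also be inspected and cleaned up from the command line, which helps keep endpoint costs under control. A sketch using ZenML's generic model-deployer CLI; exact subcommands may vary by ZenML version:

```shell
# List model servers managed by the active model deployer
zenml model-deployer models list

# Delete a model server once it is no longer needed
zenml model-deployer models delete <UUID>
```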

### Limitations and Considerations

1. **Stack requirements**:
   - Requires a Vertex AI Model Registry in the stack
   - All stack components must be non-local
2. **Authentication**:
   - Requires proper GCP credentials with Vertex AI permissions
   - Best practice is to use service connectors for authentication
3. **Costs**:
   - Vertex AI endpoints incur costs based on machine type and uptime
   - Consider using autoscaling to optimize costs
4. **Region availability**:
   - Service availability depends on Vertex AI regional availability
   - The model and endpoint must be in the same region

Check out the [SDK docs](https://sdkdocs.zenml.io) for more detailed information about the implementation.

# Vertex AI Model Registry

[Vertex AI](https://cloud.google.com/vertex-ai) is Google Cloud's unified ML platform that helps you build, deploy, and scale ML models. The Vertex AI Model Registry is a centralized repository for managing your ML models throughout their lifecycle. ZenML's Vertex AI Model Registry integration allows you to register, version, and manage your models using Vertex AI's infrastructure.

## When would you want to use it?

You should consider using the Vertex AI Model Registry when:

* You're already using Google Cloud Platform (GCP) and want to leverage its native ML infrastructure
* You need enterprise-grade model management capabilities with fine-grained access control
* You want to track model lineage and metadata in a centralized location
* You're building ML pipelines that need to integrate with other Vertex AI services
* You need to manage model deployment across different GCP environments

This is particularly useful in the following scenarios:

* Building production ML pipelines that need to integrate with GCP services
* Managing multiple versions of models across development and production environments
* Tracking model artifacts and metadata in a centralized location
* Deploying models to Vertex AI endpoints for serving

{% hint style="warning" %}
Important: The Vertex AI Model Registry implementation only supports the model version interface, not the model interface. This means you cannot register, delete, or update models directly; you can only work with model versions. Operations like `register_model()`, `delete_model()`, and `update_model()` are not supported.
{% endhint %}

## How do you deploy it?

The Vertex AI Model Registry flavor is provided by the GCP ZenML integration. First, install the integration:

```shell
zenml integration install gcp -y
```

### Authentication and Service Connector Configuration

The Vertex AI Model Registry requires proper GCP authentication. The recommended way to configure this is through ZenML's Service Connector functionality. You have several options for authentication:

1. Using a GCP Service Connector with a dedicated service account (recommended):

```shell
# Register the service connector with a service account key
zenml service-connector register vertex_registry_connector \
    --type gcp \
    --auth-method=service-account \
    --project_id=<PROJECT_ID> \
    --service_account_json=@<PATH_TO_SERVICE_ACCOUNT_KEY_JSON> \
    --resource-type gcp-generic

# Register the model registry
zenml model-registry register vertex_registry \
    --flavor=vertex \
    --location=us-central1

# Connect the model registry to the service connector
zenml model-registry connect vertex_registry --connector vertex_registry_connector
```

2. Using local gcloud credentials:

```shell
# Register the model registry using local gcloud auth
zenml model-registry register vertex_registry \
    --flavor=vertex \
    --location=us-central1
```
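
When relying on local credentials, make sure Application Default Credentials are set up on the machine running ZenML, for example:

```shell
# Authenticate and create Application Default Credentials
gcloud auth application-default login

# Optionally pin the default project
gcloud config set project <PROJECT_ID>
```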

{% hint style="info" %}
The service account used needs the following permissions:

- `Vertex AI User` role for creating and managing model versions
- `Storage Object Viewer` role if accessing models stored in Google Cloud Storage
{% endhint %}
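
These roles can be granted with `gcloud`. A sketch, where `roles/aiplatform.user` and `roles/storage.objectViewer` are the standard identifiers for the roles above and the service account email is a placeholder:

```shell
gcloud projects add-iam-policy-binding <PROJECT_ID> \
    --member="serviceAccount:<SERVICE_ACCOUNT_EMAIL>" \
    --role="roles/aiplatform.user"

gcloud projects add-iam-policy-binding <PROJECT_ID> \
    --member="serviceAccount:<SERVICE_ACCOUNT_EMAIL>" \
    --role="roles/storage.objectViewer"
```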

## How do you use it?

### Register models inside a pipeline

Here's an example of how to use the Vertex AI Model Registry in your ZenML pipeline using the provided model registration step:

```python
from typing_extensions import Annotated
from zenml import ArtifactConfig, get_step_context, step
from zenml.client import Client
from zenml.logger import get_logger

logger = get_logger(__name__)

@step(enable_cache=False)
def model_register() -> Annotated[str, ArtifactConfig(name="model_registry_uri")]:
    """Model registration step."""
    # Get the current model from the step context
    current_model = get_step_context().model

    client = Client()
    model_registry = client.active_stack.model_registry
    model_version = model_registry.register_model_version(
        name=current_model.name,
        version=str(current_model.version),
        model_source_uri=current_model.get_model_artifact("sklearn_classifier").uri,
        description="ZenML model registered after promotion",
    )
    logger.info(
        f"Model version {model_version.version} registered in Model Registry"
    )

    return model_version.model_source_uri
```

### Configuration Options

The Vertex AI Model Registry accepts the following configuration options:

* `location`: The GCP region where the model registry will be created (e.g., "us-central1")
* `project_id`: (Optional) The GCP project ID; if not specified, the default project is used
* `credentials`: (Optional) GCP credentials configuration
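
These options can be supplied as flags when registering the component. A sketch, assuming the usual ZenML pattern of passing flavor configuration attributes on the command line:

```shell
zenml model-registry register vertex_registry \
    --flavor=vertex \
    --location=us-central1 \
    --project_id=<PROJECT_ID>
```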

### Working with Model Versions

Since the Vertex AI Model Registry only supports version-level operations, here's how to work with model versions:

```shell
# List all model versions
zenml model-registry models list-versions <model-name>

# Get details of a specific model version
zenml model-registry models get-version <model-name> -v <version>

# Delete a model version
zenml model-registry models delete-version <model-name> -v <version>
```
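
The same operations are available from Python through the registry's version interface. A sketch, assuming the standard ZenML `BaseModelRegistry` methods and a placeholder model name:

```python
from zenml.client import Client

model_registry = Client().active_stack.model_registry

# List all versions of a registered model
versions = model_registry.list_model_versions(name="my_model")

# Inspect a specific version
version = model_registry.get_model_version(name="my_model", version="1")
print(version.model_source_uri)

# Delete a version that is no longer needed
model_registry.delete_model_version(name="my_model", version="1")
```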

### Key Differences from MLflow Model Registry

Unlike the MLflow Model Registry, the Vertex AI implementation has some important differences:

1. **Version-only interface**: Vertex AI only supports model version operations. You cannot register, delete, or update models directly, only their versions.
2. **Authentication**: Uses GCP service connectors for authentication, like other Vertex AI services in ZenML.
3. **Staging levels**: Vertex AI doesn't have built-in staging levels (such as Production or Staging); these are handled through metadata.
4. **Default container images**: Vertex AI requires a serving container image URI, which defaults to the scikit-learn prediction container if not specified.
5. **Managed service**: As a fully managed service, you don't need to manage infrastructure, but you do need valid GCP credentials.

### Limitations

Based on the implementation, there are some limitations to be aware of:

1. The `register_model()`, `update_model()`, and `delete_model()` methods are not implemented, as Vertex AI only supports registering model versions
2. Model stage transitions (Production, Staging, etc.) are not natively supported
3. Models must have a serving container image URI specified, or the default scikit-learn image is used
4. All registered models are automatically labeled with `managed_by="zenml"` for tracking purposes

Check out the [SDK docs](https://sdkdocs.zenml.io/latest/integration_code_docs/integrations-gcp/#zenml.integrations.gcp.model_registry) to see more about the interface and implementation.