diff --git a/content/modules/ROOT/pages/34_using_s3_storage.adoc b/content/modules/ROOT/pages/34_using_s3_storage.adoc
index 0ef4d11..be2be61 100644
--- a/content/modules/ROOT/pages/34_using_s3_storage.adoc
+++ b/content/modules/ROOT/pages/34_using_s3_storage.adoc
@@ -134,7 +134,19 @@ We have previously used a custom workbench to explore how to train a model. Now

 . Create the `base` and `overlays` directories inside the `standard-workbench` directory.

-. Create a `kustomization.yaml` file inside the `standard-workbench/base` directory. This should have resources for the **workbench pvc, data connection, and workbench notebook** and should be in the `parasol-insurance` namespace.
+. Create a `kustomization.yaml` file inside the `standard-workbench/base` directory. This should have resources for the **workbench PVC, data connection, and workbench notebook**. Remember to add the `parasol-insurance` namespace.
+
++
+.tenants/parasol-insurance/standard-workbench/base/kustomization.yaml
+[source,yaml]
+----
+apiVersion: kustomize.config.k8s.io/v1beta1
+kind: Kustomization
+
+resources:
+  -
+----
+
+
 .Solution
@@ -180,6 +192,25 @@

-. Create a secret named `minio-data-connection.yaml` inside the `standard-workbench/base` directory. This shoud have the minio S3 connection details. This should show up in the Data Connection tab of the Data Science Project if set up correctly.
+. Create a secret named `minio-data-connection.yaml` inside the `standard-workbench/base` directory. This should have the MinIO S3 connection details. If set up correctly, it will show up in the Data Connections tab of the Data Science Project.

++
+.tenants/parasol-insurance/standard-workbench/base/minio-data-connection.yaml
+[source,yaml]
+----
+kind: Secret
+apiVersion: v1
+metadata:
+  name: minio-data-connection
+  labels:
+    ## add the RHOAI dashboard labels here
+  annotations:
+    opendatahub.io/connection-type: s3
+    openshift.io/display-name: minio-data-connection
+    argocd.argoproj.io/sync-wave: "-100"
+stringData:
+  ## add the MinIO connection details here
+type: Opaque
+----
+
+
 .Solution
 [%collapsible]
@@ -379,7 +410,29 @@ image::Workbench_env_vars.png[]
 !pip install boto3 ultralytics
 ----

-. Configure the connection to MinIO S3. Make sure to reference the S3 connection details.
+. In a new cell, add and configure the connection to MinIO S3. Make sure to reference the S3 connection details.
+
++
+[source, python]
+----
+## import boto3 here
+from botocore.client import Config
+
+# Configuration
+## set minio_url here
+## set access_key here
+## set secret_key here
+
+# Setting up the MinIO client
+s3 = boto3.client(
+    's3',
+    endpoint_url=minio_url,
+    aws_access_key_id=access_key,
+    aws_secret_access_key=secret_key,
+    config=Config(signature_version='s3v4'),
+)
+----
+
+
 .Solution
@@ -407,7 +460,19 @@ s3 = boto3.client(
 ----
 ====

-. Using the boto3.client from the previous step, define a function to list the current buckets. Name this function `get_minio_buckets`
+. In a new cell, using the `boto3.client` variable from the previous step, define a function to list the current buckets. Name this function `get_minio_buckets`.
+
++
+[source, python]
+----
+# Function to get MinIO server bucket info
+# Print the list of buckets in S3
+def get_minio_buckets():
+    # Retrieve the bucket list with the s3 client and print each bucket name
+
+
+get_minio_buckets()
+----
+

 .Solution
@@ -434,7 +499,19 @@ get_minio_buckets()

 We currently have no buckets in the S3 storage. We will create a bucket and upload a file to it.
 ====

-. Using the boto3.client, create a function to create a new bucket in S3 storage. Name it `create_minio_bucket` with `bucket_name` as an input parameter.
+. In a new cell, using the `boto3.client` variable, create a function to create a new bucket in S3 storage. Name it `create_minio_bucket` with `bucket_name` as an input parameter.
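+
++
+The skeleton below leaves the body of the `try` block for you to fill in. As a hint, the work is a single client call; a minimal, hypothetical sketch (reusing the `s3` client from the earlier cell):
+
++
+[source, python]
+----
+# Sketch only: create_bucket is the boto3 S3 client method that creates the bucket
+s3.create_bucket(Bucket=bucket_name)
+print(f"Bucket '{bucket_name}' created")
+----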
+
++
+[source, python]
+----
+# Function to create a bucket
+def create_minio_bucket(bucket_name):
+    try:
+        ## add the bucket-creation call here
+
+    except Exception as e:
+        print(f"Error creating bucket '{bucket_name}': {e}")
+----
+

 .Solution
@@ -452,7 +529,7 @@ def create_minio_bucket(bucket_name):
 ----
 ====

-. Use the fuctions that you just created to create 2 buckets: `models` and `pipelines`. Use the `get_minio_buckets` function you created to view the newly created buckets.
+. In a new cell, use the functions that you just created to create two buckets: `models` and `pipelines`. Use the `get_minio_buckets` function you created to view the newly created buckets.

 +
 .Solution
@@ -465,7 +542,8 @@ create_minio_bucket('pipelines')
 get_minio_buckets()
 ----
 ====
-. Using the boto3.client, create a function to upload a file to a bucket. This function should be named `upload_file` and should take 3 input parameters: `file_path`, `bucket_name`, and `object_name`.
+
+. Using the `boto3.client` variable, create a function to upload a file to a bucket. This function should be named `upload_file` and should take 3 input parameters: `file_path`, `bucket_name`, and `object_name`.

 +
 .Solution
@@ -497,7 +575,6 @@ model = YOLO("https://rhods-public.s3.amazonaws.com/demo-models/ic-models/accide
 .Solution
 [%collapsible]
 ====
-+
 [source, python]
 ----
 # Download the model
diff --git a/content/modules/ROOT/pages/36_deploy_model.adoc b/content/modules/ROOT/pages/36_deploy_model.adoc
index 7d24cd2..7ca97da 100644
--- a/content/modules/ROOT/pages/36_deploy_model.adoc
+++ b/content/modules/ROOT/pages/36_deploy_model.adoc
@@ -6,7 +6,7 @@
 To do this we would need a data connection to the S3 instance where our model is

 RHOAI supports the ability to add your own serving runtime. But it does not support the runtimes themselves. Therefore, it is up to you to configure, adjust and maintain your custom runtimes.

-In this tutorial we will setup the Triton Runtime (NVIDIA Triton Inference Server). Use the following steps to add the runtime:
+In this tutorial we will set up the Triton Runtime (NVIDIA Triton Inference Server) and serve a model using it.

 . In the `parasol-insurance` tenant, create a directory named `multi-model-serving`

@@ -14,9 +14,12 @@

 . Create a directory named `parasol-insurance-dev` under the `multi-model-serving/overlays` directory

-. Create a file named `kustomization.yaml` inside the `multi-model-serving/overlays/parasol-insurance-dev` directory with the following content:
+. Create a `kustomization.yaml` inside the `multi-model-serving/overlays/parasol-insurance-dev` directory and point it to the base folder of the `multi-model-serving` directory.

++
+.Solution
+[%collapsible]
+====
 .multi-model-serving/overlays/parasol-insurance-dev/kustomization.yaml
 [source,yaml]
 ----
@@ -26,8 +29,9 @@ kind: Kustomization
 resources:
   - ../../base
 ----
+====

-. Create a file named `kustomization.yaml` inside the `multi-model-serving/base` directory with the following content:
+. Create a `kustomization.yaml` inside the `multi-model-serving/base` directory. This should have the `parasol-insurance` namespace, as well as **data-connection, inference-service, and serving-runtime** as resources.
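+
++
+As a reminder of the general shape (an illustrative sketch with placeholder names, not the solution): a Kustomization lists its manifests under `resources:`, and a single `namespace:` entry stamps every listed resource with that namespace.
+
++
+[source,yaml]
+----
+# Illustrative only: a generic namespaced Kustomization
+apiVersion: kustomize.config.k8s.io/v1beta1
+kind: Kustomization
+
+namespace: example-namespace
+
+resources:
+  - example.yaml
+----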
+

 .multi-model-serving/base/kustomization.yaml
@@ -36,6 +40,20 @@ resources:
 apiVersion: kustomize.config.k8s.io/v1beta1
 kind: Kustomization

+resources:
+
+----
+
++
+.Solution
+[%collapsible]
+====
+.multi-model-serving/base/kustomization.yaml
+[source,yaml]
+----
+apiVersion: kustomize.config.k8s.io/v1beta1
+kind: Kustomization
+
 namespace: parasol-insurance

 resources:
@@ -43,10 +61,14 @@ resources:
   - inference-service.yaml
   - serving-runtime.yaml
 ----
+====

-. To create a data connection, create a file named 'data-connection.yaml' inside the `multi-model-serving/base` directory with the following content:
+. Create a data connection with the MinIO details: create a file named `data-connection.yaml` inside the `multi-model-serving/base` directory. Make sure to add the RHOAI labels so the connection shows up in the RHOAI dashboard.

++
+.Solution
+[%collapsible]
+====
 .multi-model-serving/base/data-connection.yaml
 [source,yaml]
 ----
@@ -69,6 +91,7 @@ stringData:
   AWS_DEFAULT_REGION: east-1
 type: Opaque
 ----
+====

 . To create the custom serving Triton runtime, create a file named 'serving-runtime.yaml' inside the `multi-model-serving/base` directory with the following content:

@@ -169,8 +192,10 @@ spec:
 ----

 ## Inference Service
+Once we have our serving runtime, we can use it as the runtime for our Inference Service.
+
+. To create the Inference Service, create a file named `inference-service.yaml` inside the `multi-model-serving/base` directory. Make sure to add the RHOAI labels so we can view it in the RHOAI dashboard.

-. To create the inference service, create a file named 'inference-service.yaml' inside the `multi-model-serving/base` directory with the following content:
+

 .multi-model-serving/base/inference-service.yaml
@@ -178,6 +203,37 @@ spec:
 ----
 apiVersion: serving.kserve.io/v1beta1
 kind: InferenceService
+metadata:
+  annotations:
+    openshift.io/display-name: accident-detect-model
+    serving.kserve.io/deploymentMode: ModelMesh
+  name: accident-detect-model
+  labels:
+
+spec:
+  predictor:
+    model:
+      modelFormat:
+        name:
+        version: '1'
+      name: ''
+      resources: {}
+      runtime:
+      storage:
+        key:
+        path:
+----
+
+
++
+.Solution
+[%collapsible]
+====
+.multi-model-serving/base/inference-service.yaml
+[source,yaml]
+----
+apiVersion: serving.kserve.io/v1beta1
+kind: InferenceService
 metadata:
   annotations:
     openshift.io/display-name: accident-detect-model
@@ -198,16 +254,21 @@ spec:
        key: accident-model-data-conn
        path: accident_model/accident_detect.onnx
 ----
+====

-. Push the changes to the ai-accelerator repository
+. Push the changes to your ai-accelerator fork.

-. Wait for the application to sync
+. Wait for the application to sync in Argo CD.

 . Navigate to RHOAI, and validate that there is a new model serving under the `Models` tab, and check that its status looks green.

 ## Test the served model

-To test if the served model is working as expected, go back to RHOAI `parasol-insurance` project and go to the _workbenches_ tab. Stop the `standard-workbench` and start the `custom-workbench`. Once the custom-workbench is running, navigate to `parasol-insurance/lab-materials/04`. Open the `04-05-model-serving` notebook. We need to change the RestURL/infer_url value. We can get it from the model that we just deployed.
+To test if the served model is working as expected, go back to the RHOAI `parasol-insurance` project and go to the _workbenches_ tab.
+
+Stop the `standard-workbench` and start the `custom-workbench`.
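+
+While the workbench starts, it may help to know the shape of the request the notebook will send. ModelMesh exposes the model over the KServe v2 REST protocol; the sketch below is hypothetical (the URL, tensor name, and input shape are placeholders, not the lab's actual values):
+
+[source, python]
+----
+import requests
+
+# Placeholder URL: take the real value from the deployed model's inference endpoint
+infer_url = "https://<model-route>/v2/models/accident-detect-model/infer"
+
+payload = {
+    "inputs": [
+        {
+            "name": "images",           # input tensor name (placeholder)
+            "shape": [1, 3, 640, 640],  # example shape for a YOLO ONNX model
+            "datatype": "FP32",
+            "data": [],                 # flattened image pixels go here
+        }
+    ]
+}
+
+response = requests.post(infer_url, json=payload)
+print(response.json())
+----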
+
+Once the `custom-workbench` is running, navigate to `parasol-insurance/lab-materials/04` and open the `04-05-model-serving` notebook. We need to change the RestURL/`infer_url` value, which we can get from the model that we just deployed.

 Make sure to change the values in the notebook when testing:

@@ -216,10 +277,6 @@ image::model_serving_notebook_changes.png[]

 After making these changes, run the notebook and we should see an output to the image that we pass to the model.

-[TIP]
-====
-Validate changes against https://github.com/redhat-ai-services/ai-accelerator-qa/pull/new/36_deploy_model[Deploy model branch]
-====

 [CAUTION]
 ====