From 02293eb08b1c33e7b8717ecf76a1746f84160f4b Mon Sep 17 00:00:00 2001
From: Tom Corcoran
Date: Wed, 19 Jun 2024 08:32:50 +1000
Subject: [PATCH] Progress on exercises (#9)

* Continued outline for scenario 3
* Updates to scenarios 3 and 4
* Updates to scenarios 3 and 4
* Code for scenario 3
* Code for scenario 3
* Code for scenario 3
* Updated docs for scenario 3
* Updated docs for scenario 3
* Updated docs for scenario 3
* Updated docs for scenario 3
* Fixing Notebook that downloads the contents of an Object store bucket (source) then pushes those contents to another Object store bucket (target)
---
 data/hackathon/scenario1.mdx               |   2 +-
 data/hackathon/scenario3.mdx               | 120 ++++++-
 data/hackathon/scenario4.mdx               |  17 +
 .../inference-api-body.json                |  32 ++
 .../minio_pull_from_and_push_to.ipynb      | 310 ++++++++++++++++++
 .../vllm-runtime-small-for-granite-7b.yaml |  31 ++
 6 files changed, 497 insertions(+), 15 deletions(-)
 create mode 100644 data/hackathon/scenario4.mdx
 create mode 100644 temp/scenario3_hybrid_cloud/inference-api-body.json
 create mode 100644 temp/scenario3_hybrid_cloud/minio_pull_from_and_push_to.ipynb
 create mode 100644 temp/scenario3_hybrid_cloud/vllm-runtime-small-for-granite-7b.yaml

diff --git a/data/hackathon/scenario1.mdx b/data/hackathon/scenario1.mdx
index 1835a68..6138153 100644
--- a/data/hackathon/scenario1.mdx
+++ b/data/hackathon/scenario1.mdx
@@ -37,7 +37,7 @@ For this challenge you'll be given two OpenShift clusters
 - a Single Node OpenShift cluster representing the ACME On Premises environment

 All challenge tasks must be performed on these clusters so your solutions can be graded successfully.
-Some challenges wil require the use of specific clusters
+Some challenges will require the use of specific clusters.

 You can and are encouraged to use any supporting documentation or other resources in order to tackle each of the challenge tasks.

diff --git a/data/hackathon/scenario3.mdx b/data/hackathon/scenario3.mdx
index ba5041d..225f749 100644
--- a/data/hackathon/scenario3.mdx
+++ b/data/hackathon/scenario3.mdx
@@ -8,39 +8,131 @@ authors: ['default']
 summary: "Let's deploy the first model across the hybrid cloud."
 ---

-As a sales team you've got an upcoming demo with the Acme Financial Services data science team, who have been training models on their laptops.
+As a sales team you've got an upcoming demo with the ACME Financial Services data science team, who have been training models on their laptops.

 The team have given you access to one of their models in the ACME Financial Services object storage and want to see how this could be deployed to a cluster running in the cloud.

-
-## 3.1 - Replicate Model to Cloud Storage
+## 3.1 - Examine your On-Premises and Cloud-based storage

 For this task, your team are required to use the `granite-7b-lab` model available in the object storage running in the ACME Financial Services on prem cluster which is based on Minio.

-After locating the model in on premises object storage, your team need to replicate this model to the ACME Financial Services cloud cluster object storage so that it could be served in future.
-
-Documentation you may find helpful is:
-- https://min.io/docs/minio/linux/index.html
-
+It's located under the `models` bucket.
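+
+You can examine each bucket through the MinIO console, or programmatically. Below is a minimal boto3 sketch, assuming MinIO's S3-compatible API - the endpoint and credentials are placeholders, so substitute the values for each of your clusters:
+
+```python
+# Sketch: list the model files under the models/granite-7b-lab/ prefix.
+import boto3
+
+s3 = boto3.client(
+    "s3",
+    aws_access_key_id="minio",                          # placeholder credentials
+    aws_secret_access_key="minio123",                   # placeholder credentials
+    endpoint_url="https://<your-minio-api-endpoint>",   # placeholder
+)
+
+paginator = s3.get_paginator("list_objects_v2")
+for page in paginator.paginate(Bucket="models", Prefix="granite-7b-lab/"):
+    for obj in page.get("Contents", []):
+        print(obj["Key"], obj["Size"])
+```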
+Examine both storage locations.
+
+After locating the model in on-premises object storage, your team will need to replicate it to the ACME Financial Services cloud cluster object storage (to a bucket and sub-folder under the same name) so that it can be served in future.

-## 3.2 - Install Openshift AI related operators
+## 3.2 - Install operators related to OpenShift AI

-Now that you've helped the ACME team replicate their chosen model to their cloud OpenShift Cluster, they want to serve the model ASAP.
+Now that you're aware of ACME's requirement to replicate their chosen model to their cloud OpenShift cluster, it's time to do that replication and then serve the model ASAP.

-For this challenge your team must demonstrate to ACME how to install OpenShift AI, and serve the existing model called `granite-7b-lab` via OpenShift AI.
+For the first part of this challenge your team must demonstrate to ACME how to install OpenShift AI, replicate the model to the cloud, and serve the existing model called `granite-7b-lab` via OpenShift AI.

 Install the following operators (do not install any custom resources):
 - OpenShift AI
 - OpenShift Service Mesh
 - OpenShift Serverless

-## 3.2 - Install Openshift AI
+## 3.3 - Install OpenShift AI
+Wait until the three operators specified in the previous section have fully provisioned before proceeding.
+You won't need any Custom Resources for OpenShift Service Mesh or OpenShift Serverless.
+
+You will need one for OpenShift AI. A valid strategy would be to open the YAML view and go with all the defaults - the only addition being this `knative-serving-cert` secret:
 ingressGateway:
   certificate:
-    secretName: knative-serving-cert
-
+    `secretName: knative-serving-cert`
 Documentation you may find helpful is:
 - https://access.redhat.com/documentation/en-us/red_hat_openshift_ai_self-managed/2.9/html/installing_and_uninstalling_openshift_ai_self-managed/index
+
+
+## 3.4 - Set up your OpenShift AI project and workbench
+
+An OpenShift AI project maps directly to an OpenShift project, enjoying the same RBAC and isolation capabilities.
+OpenShift AI workbenches are based on a container image encapsulating a particular toolset used by data scientists, e.g. PyTorch or TensorFlow.
+
+Now open OpenShift AI and do the following:
+- create a project
+- create a workbench that
+  - uses PyTorch as a basis
+  - uses a Persistent Volume of at least 60GB
+  - uses a Data Connection to your MinIO object storage
+  - uses a Medium-sized container without an accelerator
+
+
+Documentation you may find helpful is:
+
+- https://access.redhat.com/documentation/en-us/red_hat_openshift_ai_self-managed/2.9/html/getting_started_with_red_hat_openshift_ai_self-managed/creating-a-data-science-project_get-started
+
+- https://access.redhat.com/documentation/en-us/red_hat_openshift_ai_self-managed/2.9/html/getting_started_with_red_hat_openshift_ai_self-managed/creating-a-project-workbench_get-started
+
+
+## 3.5 - Replicate Model to Cloud Storage
+
+For this task, your team are required to replicate the `granite-7b-lab` model from the on-premises MinIO object storage to the object storage of the cloud cluster.
+
+How you do this replication is up to you. There are options using the MinIO CLI, and also hint 3.5 below; a minimal boto3 sketch of the same approach follows.
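+
+For reference, here is a minimal boto3 sketch of the pull-then-push approach that hint 3.5's notebook automates. The endpoints and credentials are placeholders, and the bucket/sub-folder names assume the layout described above:
+
+```python
+# Sketch: copy every object under models/granite-7b-lab/ from the on-prem
+# MinIO to the same bucket/sub-folder on the cloud MinIO.
+import os
+import boto3
+
+source = boto3.client(
+    "s3",
+    aws_access_key_id="minio",                              # placeholder credentials
+    aws_secret_access_key="minio123",                       # placeholder credentials
+    endpoint_url="https://<on-prem-minio-api-endpoint>",    # placeholder
+)
+target = boto3.client(
+    "s3",
+    aws_access_key_id="minio",                              # placeholder credentials
+    aws_secret_access_key="minio123",                       # placeholder credentials
+    endpoint_url="https://<cloud-minio-api-endpoint>",      # placeholder
+)
+
+paginator = source.get_paginator("list_objects_v2")
+for page in paginator.paginate(Bucket="models", Prefix="granite-7b-lab/"):
+    for obj in page.get("Contents", []):
+        key = obj["Key"]
+        local_path = os.path.basename(key)
+        source.download_file("models", key, local_path)   # pull from on-prem
+        target.upload_file(local_path, "models", key)     # push to cloud
+        os.remove(local_path)                             # keep local disk clean
+```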
+
+Documentation you may find helpful is:
+- https://min.io/docs/minio/linux/index.html
+
+
+## 3.6 - Use your cloud-based OpenShift AI to serve the model and make it easily consumable by intelligent applications
+Single Model Serving is the preferred mode for serving LLMs.
+
+### 3.6.1 - Import a vLLM Server and enable Single Model Serving
+vLLM is a popular model server whose API is compatible with the OpenAI (ChatGPT) API. This lends itself to easy migration of apps already using OpenAI over to OpenShift AI.
+
+You may use Hint 3.6.1 below.
+
+Documentation you may find helpful is:
+- https://access.redhat.com/documentation/en-us/red_hat_openshift_ai_self-managed/2.9/html-single/serving_models/index#adding-a-custom-model-serving-runtime-for-the-single-model-serving-platform_serving-large-models
+
+
+### 3.6.2 - Create a Single Model Server on your cloud-based OpenShift
+
+Documentation you may find helpful is:
+- https://access.redhat.com/documentation/en-us/red_hat_openshift_ai_self-managed/2.9/html-single/serving_models/index#deploying-models-on-the-single-model-serving-platform_serving-large-models
+
+### 3.6.3 - Make an inference call to the model
+
+Note that you should not need to use a token.
+
+You may use Hint 3.6.2 below, and the Python sketch that follows.
+
+Documentation you may find helpful is:
+- https://access.redhat.com/documentation/en-us/red_hat_openshift_ai_self-managed/2.9/html-single/serving_models/index#accessing-inference-endpoint-for-deployed-model_serving-large-models
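+
+As a rough sketch - assuming the model is served by the vLLM runtime above and the route does not require a token - an inference call from Python could look like this (the route URL is a placeholder; the body mirrors hint 3.6.2):
+
+```python
+# Minimal sketch of an inference call to a vLLM (OpenAI-compatible) /v1/completions endpoint.
+# Replace the placeholder URL with your model's inference endpoint from OpenShift AI.
+import json
+import requests
+
+url = "https://<your-model-inference-route>/v1/completions"  # placeholder
+
+body = {
+    "model": "/mnt/models/",  # matches the --model argument in the vLLM ServingRuntime
+    "prompt": ["Give me the history of Arbour hill prison in Dublin"],
+    "max_tokens": 1028,
+    "temperature": 1,
+}
+
+response = requests.post(url, json=body)
+response.raise_for_status()
+print(json.dumps(response.json(), indent=2))
+```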
+
+
+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
+
+HINTS
+[3.5] You can use this notebook to pull from local object storage and push to cloud object storage
+      https://github.com/tnscorcoran/rhods-finetunning-demo/blob/main/minio_pull_from_and_push_to.ipynb
+      TODO - Code the notebook and move it to the correct git repo
+      (note: in a production environment, this would likely be automated using GitOps)
+      TODO - confirm GitOps would be used
+
+[3.6.1] You can import this YAML to set up your vLLM server
+
+      TODO correct location
+      https://github.com/tnscorcoran/hackathon/blob/main/temp/scenario3_hybrid_cloud/vllm-runtime-small-for-granite-7b.yaml
+
+[3.6.2] You can use this JSON as the API body to make an inference call
+
+      TODO correct location
+      https://github.com/tnscorcoran/hackathon/blob/main/temp/scenario3_hybrid_cloud/inference-api-body.json
\ No newline at end of file
diff --git a/data/hackathon/scenario4.mdx b/data/hackathon/scenario4.mdx
new file mode 100644
index 0000000..c3266da
--- /dev/null
+++ b/data/hackathon/scenario4.mdx
@@ -0,0 +1,17 @@
+---
+title: Open Innovation - Integrating a new piece of tooling into OpenShift AI Workspace
+exercise: 4
+date: '2024-06-05'
+tags: ['openshift','ai','kubernetes']
+draft: false
+authors: ['default']
+summary: "Let's add some tooling in a standardised way, available to all data science users"
+---
+
+As a sales team you've got an upcoming demo with the ACME Financial Services data science team, who have been training models on their laptops.
+Their data scientists are given directives on what tools to use - but in reality, what each one uses has drifted away from what's been directed, resulting in inconsistencies in tooling across users and, more importantly, between development and production environments.
+You are required to show a solution for this problem on OpenShift AI.
+
+## 4.1 - Test a new library inside a Jupyter Notebook
+
+For this task, your team are required to
\ No newline at end of file
diff --git a/temp/scenario3_hybrid_cloud/inference-api-body.json b/temp/scenario3_hybrid_cloud/inference-api-body.json
new file mode 100644
index 0000000..06f9f40
--- /dev/null
+++ b/temp/scenario3_hybrid_cloud/inference-api-body.json
@@ -0,0 +1,32 @@
+{
+  "model": "/mnt/models/",
+  "prompt": [
+    "Give me the history of Arbour hill prison in Dublin"
+  ],
+  "max_tokens": 1028,
+  "temperature": 1,
+  "top_p": 1,
+  "n": 1,
+  "stream": false,
+  "logprobs": 0,
+  "echo": false,
+  "stop": [
+    "string"
+  ],
+  "presence_penalty": 0,
+  "frequency_penalty": 0,
+  "best_of": 2,
+  "user": "string",
+  "top_k": -1,
+  "ignore_eos": false,
+  "use_beam_search": false,
+  "stop_token_ids": [
+    0
+  ],
+  "skip_special_tokens": true,
+  "spaces_between_special_tokens": true,
+  "repetition_penalty": 1,
+  "min_p": 0,
+  "include_stop_str_in_output": false,
+  "length_penalty": 1
+}
diff --git a/temp/scenario3_hybrid_cloud/minio_pull_from_and_push_to.ipynb b/temp/scenario3_hybrid_cloud/minio_pull_from_and_push_to.ipynb
new file mode 100644
index 0000000..89536f3
--- /dev/null
+++ b/temp/scenario3_hybrid_cloud/minio_pull_from_and_push_to.ipynb
@@ -0,0 +1,310 @@
+{
+ "cells": [
+  {
+   "cell_type": "markdown",
+   "id": "e0a7a17e",
+   "metadata": {
+    "tags": []
+   },
+   "source": [
+    "# Use this notebook to\n",
+    " - download the contents of an Object store bucket (source) to a local folder within Jupyter Hub (download_to)\n",
+    " - then push those contents to another Object store bucket (target)\n",
+    "\n",
+    "This is *substantially faster* than downloading and uploading manually.\n",
+    "\n",
+    "### You will need to make substitutions in the cell marked **MAKE CHANGES HERE** below"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 1,
+   "id": "79714be6-696a-4e62-8c99-ae9bc2ef451a",
+   "metadata": {
+    "tags": []
+   },
+   "outputs": [],
+   "source": [
+    "import os\n",
+    "import boto3"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "0a20fc45-7adf-4853-a2f0-4d490d3c4e3f",
+   "metadata": {},
+   "source": [
+    "##### The next cell requires you to have created an OpenShift AI Data Connection to Object storage (MinIO)\n",
+    "##### inside your OpenShift AI Project. If you have not, manually override the values below (*source_region* should be 'none')."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 2,
+   "id": "37789ea0-8729-4bf5-8ae5-e976fddfc800",
+   "metadata": {
+    "tags": []
+   },
+   "outputs": [
+    {
+     "name": "stdout",
+     "output_type": "stream",
+     "text": [
+      "minio\n",
+      "minio123\n",
+      "none\n",
+      "https://minio-api-minio.apps.rosa-55nsv.lax9.p1.openshiftapps.com\n",
+      "models\n"
+     ]
+    }
+   ],
+   "source": [
+    "source_key_id = os.getenv(\"AWS_ACCESS_KEY_ID\")\n",
+    "source_secret_key = os.getenv(\"AWS_SECRET_ACCESS_KEY\")\n",
+    "source_region = os.getenv(\"AWS_DEFAULT_REGION\")\n",
+    "source_endpoint = os.getenv(\"AWS_S3_ENDPOINT\")\n",
+    "source_bucket_name = os.getenv(\"AWS_S3_BUCKET\")\n",
+    "\n",
+    "print (source_key_id)\n",
+    "print (source_secret_key)\n",
+    "print (source_region)\n",
+    "print (source_endpoint)\n",
+    "print (source_bucket_name)\n"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "f4a47044-7da0-4465-8d99-53f055e474a9",
+   "metadata": {
+    "tags": []
+   },
+   "source": [
+    "# **MAKE CHANGES HERE**\n",
+    "\n",
+    "Details of what you need to do:\n",
+    "- *download_to* is the name of the local directory that will be created here in Jupyter Hub to download to and upload from\n",
+    "- *source_bucket* and *source_subfolder* are the bucket and subfolder in the Object Storage you will pull from\n",
+    "- *target_bucket* and *target_subfolder* are the bucket and subfolder in the Object Storage you will push to\n",
+    "\n",
+    "## Note - before running the notebook, delete the **download_to** directory if it already exists"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 3,
+   "id": "f89ec490-f1c9-4f7a-a2a7-0244c988bb27",
+   "metadata": {
+    "tags": []
+   },
+   "outputs": [],
+   "source": [
+    "# Fill in your own values here\n",
+    "download_to=\"\"\n",
+    "\n",
+    "source_bucket = \"\"\n",
+    "source_subfolder = \"\"\n",
+    "\n",
+    "target_bucket = \"\"\n",
+    "target_subfolder = \"\"\n",
+    "target_url = \"\"\n",
+    "target_key_id = \"\"\n",
+    "target_secret_key = \"\"\n",
+    "target_endpoint = \"\"\n",
+    "\n",
+    "\n",
+    "# Example entries - delete or override these (the URLs will be invalid for you)\n",
+    "download_to=\"download_to\"\n",
+    "\n",
+    "source_bucket = \"models\"\n",
+    "source_subfolder = \"granite-7b-lab/\"\n",
+    "\n",
+    "target_bucket = \"models-target\"\n",
+    "target_subfolder = \"granite-7b-lab/\"\n",
+    "target_url = \"https://minio-api-minio.apps.rosa-55nsv.lax9.p1.openshiftapps.com\"\n",
+    "target_key_id = source_key_id\n",
+    "target_secret_key = source_secret_key\n",
+    "target_endpoint = target_url\n"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 4,
+   "id": "46070831-9839-4e82-b0ce-5738f133a9d7",
+   "metadata": {
+    "tags": []
+   },
+   "outputs": [],
+   "source": [
+    "s3 = boto3.client(\n",
+    "    \"s3\",\n",
+    "    aws_access_key_id=source_key_id,\n",
+    "    aws_secret_access_key=source_secret_key,\n",
+    "    endpoint_url=source_endpoint,\n",
+    "    verify=True)\n",
+    "\n"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 5,
+   "id": "63190c78-c675-4000-8b78-015b35850a16",
+   "metadata": {
+    "tags": []
+   },
+   "outputs": [],
+   "source": [
+    "if not os.path.exists(download_to):\n",
+    "    os.mkdir(download_to)"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "9732d9fa-1280-40bf-922d-f0c6d61ae75a",
+   "metadata": {},
+   "source": [
+    "## Download from source Object Storage"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 6,
+   "id": "2ba297df-bd90-4c4d-9bf6-70b05305a988",
+   "metadata": {
+    "tags": []
+   },
+   "outputs": [
+    {
+     "name": "stdout",
+     "output_type": "stream",
+     "text": [
+      "Downloading granite-7b-lab/.gitattributes to download_to/.gitattributes\n",
+      "Downloading granite-7b-lab/README.md to download_to/README.md\n",
+      "Downloading granite-7b-lab/added_tokens.json to download_to/added_tokens.json\n",
+      "Downloading granite-7b-lab/config.json to download_to/config.json\n",
+      "Downloading granite-7b-lab/generation_config.json to download_to/generation_config.json\n",
+      "Downloading granite-7b-lab/model-00001-of-00003.safetensors to download_to/model-00001-of-00003.safetensors\n",
+      "Downloading granite-7b-lab/model-00002-of-00003.safetensors to download_to/model-00002-of-00003.safetensors\n",
+      "Downloading granite-7b-lab/model-00003-of-00003.safetensors to download_to/model-00003-of-00003.safetensors\n",
+      "Downloading granite-7b-lab/model.safetensors.index.json to download_to/model.safetensors.index.json\n",
+      "Downloading granite-7b-lab/paper.pdf to download_to/paper.pdf\n",
+      "Downloading granite-7b-lab/special_tokens_map.json to download_to/special_tokens_map.json\n",
+      "Downloading granite-7b-lab/tokenizer.json to download_to/tokenizer.json\n",
+      "Downloading granite-7b-lab/tokenizer.model to download_to/tokenizer.model\n",
+      "Downloading granite-7b-lab/tokenizer_config.json to download_to/tokenizer_config.json\n"
+     ]
+    }
+   ],
+   "source": [
+    "import boto3\n",
+    "import os\n",
+    "\n",
+    "def download_s3_folder(bucket_name, folder_name, local_dir):\n",
+    "\n",
+    "    # List all objects in the specified folder\n",
+    "    paginator = s3.get_paginator('list_objects_v2')\n",
+    "    pages = paginator.paginate(Bucket=bucket_name, Prefix=folder_name)\n",
+    "\n",
+    "    for page in pages:\n",
+    "        if 'Contents' in page:\n",
+    "            for obj in page['Contents']:\n",
+    "                key = obj['Key']\n",
+    "                if not key.endswith('/'):\n",
+    "                    local_file_path = os.path.join(local_dir, key[len(folder_name):])\n",
+    "                    local_file_dir = os.path.dirname(local_file_path)\n",
+    "\n",
+    "                    if not os.path.exists(local_file_dir):\n",
+    "                        os.makedirs(local_file_dir)\n",
+    "\n",
+    "                    print(f\"Downloading {key} to {local_file_path}\")\n",
+    "                    s3.download_file(bucket_name, key, local_file_path)\n",
+    "\n",
+    "\n",
+    "\n",
+    "# Example usage\n",
+    "bucket_name = source_bucket\n",
+    "folder_name = source_subfolder  # Ensure it ends with a slash\n",
+    "local_dir = download_to\n",
+    "\n",
+    "download_s3_folder(bucket_name, folder_name, local_dir)\n"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "8fbb46b7-6117-4b53-b043-ab4249fed56d",
+   "metadata": {},
+   "source": [
+    "## Upload to target Object Storage"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "13950de4-9f96-4117-8254-8e37cc5fa48c",
+   "metadata": {
+    "tags": []
+   },
+   "outputs": [
+    {
+     "name": "stdout",
+     "output_type": "stream",
+     "text": [
+      "added_tokens.json\n",
+      "special_tokens_map.json\n",
+      "config.json\n",
+      "model.safetensors.index.json\n",
+      "model-00002-of-00003.safetensors\n"
+     ]
+    }
+   ],
+   "source": [
+    "target_s3 = boto3.client(\n",
+    "    \"s3\",\n",
+    "    aws_access_key_id=target_key_id,\n",
+    "    aws_secret_access_key=target_secret_key,\n",
+    "    endpoint_url=target_endpoint,\n",
+    "    verify=True)\n",
+    "\n",
+    "# Note: only files directly inside download_to are uploaded;\n",
+    "# nested sub-directories are not recursed\n",
+    "Direc = download_to\n",
+    "\n",
+    "files = os.listdir(Direc)\n",
+    "files = [f for f in files if os.path.isfile(Direc+'/'+f)]\n",
+    "\n",
+    "for filename in files:\n",
+    "    print(filename)\n",
+    "    target_s3.upload_file(download_to+\"/\"+filename, target_bucket, target_subfolder+filename)"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "139b96f1-50ca-4ebd-af7a-45098fb9d21b",
+   "metadata": {
+    "tags": []
+   },
"outputs": [], + "source": [ + "print (\"Done\")" + ] + } + ], + "metadata": { + "kernelspec": { + "display_name": "Python 3.9", + "language": "python", + "name": "python3" + }, + "language_info": { + "codemirror_mode": { + "name": "ipython", + "version": 3 + }, + "file_extension": ".py", + "mimetype": "text/x-python", + "name": "python", + "nbconvert_exporter": "python", + "pygments_lexer": "ipython3", + "version": "3.9.18" + } + }, + "nbformat": 4, + "nbformat_minor": 5 +} diff --git a/temp/scenario3_hybrid_cloud/vllm-runtime-small-for-granite-7b.yaml b/temp/scenario3_hybrid_cloud/vllm-runtime-small-for-granite-7b.yaml new file mode 100644 index 0000000..48b9a1a --- /dev/null +++ b/temp/scenario3_hybrid_cloud/vllm-runtime-small-for-granite-7b.yaml @@ -0,0 +1,31 @@ +apiVersion: serving.kserve.io/v1alpha1 +kind: ServingRuntime +labels: + opendatahub.io/dashboard: "true" +metadata: + annotations: + openshift.io/display-name: vLLM-SMALL + name: vllm-small +spec: + builtInAdapter: + modelLoadingTimeoutMillis: 90000 + containers: + - args: + - --model + - /mnt/models/ + - --download-dir + - /models-cache + - --port + - "8080" + - --max-model-len + - "2048" + image: quay.io/rh-aiservices-bu/vllm-openai-ubi9:0.4.2 + name: kserve-container + ports: + - containerPort: 8080 + name: http1 + protocol: TCP + multiModel: false + supportedModelFormats: + - autoSelect: true + name: pytorch