This README shows how to fine-tune a Stable Diffusion XL (SDXL) model and serve it in Red Hat OpenShift AI (RHOAI). The project takes the latest SDXL model and familiarizes it with Toy Jensen by fine-tuning on a few pictures of him, teaching it to generate new images that include him, which the base model could not do previously. Once the model is fine-tuned, we show the steps to deploy it in RHOAI and to access the deployed model to generate images.
Before you can fine-tune and serve a model in Red Hat OpenShift AI, you will need to install RHOAI and enable NVIDIA GPU support by following these links:
This project generates LoRA (Low-Rank Adaptation of Large Language Models) weights when the base model is fine-tuned. These weights can be uploaded to one of the following:

- MinIO
  - Install the `oc` client if using MinIO for model storage
- AWS S3
  - Set up an IAM user, credentials, and permissions to create the bucket as well as to upload the objects when using this approach
Open up Red Hat OpenShift AI by selecting it from the OpenShift Application Launcher. This will open Red Hat OpenShift AI in a new tab.

- Select **Data Science Projects** in the left navigation menu.
- Create a new data science project by clicking the **Create data science project** button.
- Provide the **Name** as well as the **Resource name** for the project, and click the **Create** button. This will create a new data science project for you.
- Select your newly created project by clicking on it.

Below is a gif showing the **Create data science project** dialogs:
The LoRA weights that are generated while fine-tuning the base model need to be uploaded to either AWS S3 or MinIO so that they are available when the model is deployed.
To set up MinIO for storing the LoRA weights, execute the following commands in a terminal/console:

```shell
# Login to OpenShift (if not already logged in)
oc login --token=<OCP_TOKEN>

# Install MinIO
MINIO_USER=<USERNAME> \
MINIO_PASSWORD="<PASSWORD>" \
envsubst < minio-setup.yml | \
  oc apply -f - -n <PROJECT_CREATED_IN_PREVIOUS_STEP>
```

- Set `<USERNAME>` and `<PASSWORD>` to valid values in the above command before executing it
Once MinIO is set up, you can access it within your project. The yaml that was applied above creates these two routes:

- `minio-ui` - for accessing the MinIO UI
- `minio-api` - for API access to MinIO

Take note of the `minio-api` route location as it will be needed in the next section.
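The `minio-api` route can be sanity-checked with the `minio` Python client that the notebooks install. Below is a minimal sketch, assuming the route URL and the credentials from the setup above; the helper names are illustrative:

```python
from urllib.parse import urlparse


def minio_host(route_url):
    """The minio Python client expects host[:port] without a scheme,
    so strip it from the route URL if present."""
    parsed = urlparse(route_url)
    return parsed.netloc or parsed.path


def check_minio(route_url, access_key, secret_key):
    """Sanity-check MinIO by listing buckets (minio imported lazily,
    as it is only available once the notebook dependencies are installed)."""
    from minio import Minio

    client = Minio(
        minio_host(route_url),
        access_key=access_key,
        secret_key=secret_key,
        secure=route_url.startswith("https"),
    )
    return [b.name for b in client.list_buckets()]
```

If `check_minio` returns without an error, the route and credentials are usable by the notebooks.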
To set up AWS S3 for storing the LoRA weights, do the following:

- Create an IAM user
- Add the following permissions to the user, with `Effect: "Allow"`:
  - `s3:ListBucket`
  - `s3:*Object`
  - `s3:ListAllMyBuckets`
  - `s3:CreateBucket`
    - This permission is ONLY needed if you want the bucket to be created by the notebook
- For the above permissions, set `Resource` to `arn:aws:s3:::*`
  - If an already existing bucket is used, then `Resource` can be set to the specific bucket, e.g. `arn:aws:s3:::<EXISTING_BUCKET_NAME>`
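Expressed as an IAM policy document, the permissions above would look roughly like this (a sketch; narrow `Resource` to a specific bucket ARN if the bucket already exists, and drop `s3:CreateBucket` if the notebook does not need to create it):

```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "s3:ListBucket",
        "s3:*Object",
        "s3:ListAllMyBuckets",
        "s3:CreateBucket"
      ],
      "Resource": "arn:aws:s3:::*"
    }
  ]
}
```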
To use RHOAI for this project, you need to create a workbench first. In the newly created data science project, create a new workbench by clicking the **Create workbench** button in the **Workbenches** tab.
When creating the workbench, add the following environment variables:

- `AWS_ACCESS_KEY_ID`
  - MinIO user name if using MinIO, else use AWS credentials
- `AWS_SECRET_ACCESS_KEY`
  - MinIO password if using MinIO, else use AWS credentials
- `AWS_S3_ENDPOINT`
  - `minio-api` route location if using MinIO, else use the AWS S3 endpoint, which is in the format `https://s3.<REGION>.amazonaws.com`
- `AWS_S3_BUCKET`
  - This bucket should either already exist or will be created by one of the Jupyter notebooks to upload the LoRA weights
  - If using AWS S3 and the bucket does not exist, make sure correct permissions are assigned to the IAM user to be able to create the bucket, as shown here
- `AWS_DEFAULT_REGION`
  - Set it to `us-east-1` if using MinIO, otherwise use the correct AWS region

The environment variables can be added one by one, or all together by uploading a secret yaml file.
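A secret yaml for uploading all the variables at once could look like the following sketch; the secret name, endpoint, and all values are placeholders to replace with your own:

```yaml
apiVersion: v1
kind: Secret
metadata:
  name: sdxl-workbench-secret   # hypothetical name
type: Opaque
stringData:
  AWS_ACCESS_KEY_ID: <USERNAME>
  AWS_SECRET_ACCESS_KEY: <PASSWORD>
  AWS_S3_ENDPOINT: <MINIO_API_ROUTE_OR_AWS_S3_ENDPOINT>
  AWS_S3_BUCKET: <BUCKET_NAME>
  AWS_DEFAULT_REGION: us-east-1
```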
Use the following values for other fields:
- Notebook image:
- Image selection: PyTorch
- Version selection: 2024.1
- Deployment size:
- Container size: Medium
- Accelerator: NVIDIA GPU
- Number of accelerators: 1
- Cluster storage: 50GB
Create the workbench with the above settings.

Below is a gif showing various sections of **Create Workbench**:
Create a new data connection that can be used by the init-container (`storage-initializer`) to fetch the LoRA weights, generated in the next step, when deploying the model.
To create a data connection, use the following steps:

- Click the **Add data connection** button in the **Data connections** tab in your newly created project
- Use the following values for this data connection:
  - Name: `minio`
  - Access key: value specified for the `AWS_ACCESS_KEY_ID` field in the *Create Workbench* section
  - Secret key: value specified for the `AWS_SECRET_ACCESS_KEY` field in the *Create Workbench* section
  - Endpoint: value specified for the `AWS_S3_ENDPOINT` field in the *Create Workbench* section
  - Region: value specified for the `AWS_DEFAULT_REGION` field in the *Create Workbench* section
  - Bucket: value specified for the `AWS_S3_BUCKET` field in the *Create Workbench* section
- Create the data connection by clicking the **Add data connection** button

Below is a gif showing the **Add data connection** dialog (the values shown are for MinIO):
You can either build the serving runtime from the igm-repo sub-directory by following the instructions provided there, or use the existing yaml to add the serving runtime for deploying the model generated in this project.

Follow these steps to use the existing yaml:

- Expand the **Settings** sidebar menu in RHOAI
- Click **Serving runtimes** in the expanded sidebar menu
- Click the **Add serving runtime** button
- Use the following values in the **Add serving runtime** page:
  - Select the model serving platforms this runtime supports: `Single-model serving platform`
  - Select the API protocol this runtime supports: `REST`
  - YAML: Drag & drop the Stable_Diffusion-ServingRuntime yaml file, or paste the contents of this file after selecting the **Start from scratch** option
- Click the **Create** button to create this new serving runtime

You can read more about Model serving here.

Below is a gif showing fields on the **Add serving runtime** page:
Now that the workbench is created and running, follow these steps to set up the project:

- Select your newly created project by clicking **Data Science Projects** in the sidebar menu
- Click the **Workbenches** tab and open the newly created workbench by clicking the **Open** link
  - The workbench will open up in a new tab
  - When the workbench is opened for the first time, you will be shown an **Authorize Access** page
    - Click the **Allow selected permissions** button on this page
- In the workbench, click the **Terminal** icon in the **Launcher** tab
- Clone this repository in the terminal by running the following command:

```shell
git clone https://github.com/sgahlot/workbench-example-sdxl-customization.git
```

Below is a gif showing **Open workbench** pages:
The notebook mentioned in this section takes the base model and fine-tunes it to generate LoRA weights, which are used later on to generate a Toy Jensen image.

- Once the repository is cloned, select the folder where you cloned the repository (in the sidebar), navigate to the `code/rhoai` directory, and open up FineTuning-SDXL.ipynb
- If AWS S3 is used to store the LoRA weights, modify the last cell by changing `XFER_LOCATION = 'MINIO'` to `XFER_LOCATION = 'AWS'`
- Run this notebook by selecting the **Run** -> **Run All Cells** menu item
- When the notebook runs successfully, your fine-tuned model should have been uploaded to AWS or MinIO in the bucket specified for `AWS_S3_BUCKET` in the *Create Workbench* section.
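To sanity-check that the weights actually landed in the bucket, a short script along these lines can list the objects under the prefix used when deploying the model. This is a sketch, not part of the notebooks; the helper names are illustrative, and it assumes the same environment variables set for the workbench:

```python
import os


def lora_prefix(path="model"):
    """Normalize the deploy Path value into an S3 prefix (hypothetical helper)."""
    return path.rstrip("/") + "/"


def list_lora_objects(bucket, prefix=lora_prefix()):
    """List object keys under the prefix using the workbench env vars.
    boto3 is imported lazily so the helper above stays stdlib-only."""
    import boto3

    client = boto3.client(
        "s3",
        endpoint_url=os.environ.get("AWS_S3_ENDPOINT"),  # only needed for MinIO
        aws_access_key_id=os.environ["AWS_ACCESS_KEY_ID"],
        aws_secret_access_key=os.environ["AWS_SECRET_ACCESS_KEY"],
        region_name=os.environ.get("AWS_DEFAULT_REGION", "us-east-1"),
    )
    resp = client.list_objects_v2(Bucket=bucket, Prefix=prefix)
    return [obj["Key"] for obj in resp.get("Contents", [])]
```

An empty result means the upload cell did not run, or the bucket/prefix does not match what was configured.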
Once the initial notebook has run successfully and the data connection is created, you can deploy the model by following these steps:

- In the RHOAI tab, select the **Models** tab (for your newly created project) and click the **Deploy model** button
- Fill in the following fields as described below:
  - Model name: <PROVIDE_a_name_for_the_model>
  - Serving runtime: `Stable Diffusion`
  - Model framework: `sdxl`
  - Model server size: `Small`
  - Accelerator: `NVIDIA GPU`
  - Model route:
    - If you want to access this model endpoint from outside the cluster, make sure to check the **Make deployed models available through an external route** checkbox. By default the model endpoint is only available as an internal service.
  - Model location: Select the **Existing data connection** option
    - Name: name of the data connection created in the previous step
    - Path: `model`
- Click **Deploy** to deploy this model
Copy the **inference endpoint** once the model is deployed successfully (it will take a few minutes to deploy the model).
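The deployed model is called over REST. The exact request format depends on the Stable Diffusion serving runtime; GenerateImageUsingModel.ipynb shows the actual call. As a rough sketch, assuming a KServe v2 style payload (the input name, shape, and response handling here are assumptions, not the runtime's documented API):

```python
import json
from urllib import request as urlrequest


def build_infer_payload(prompt):
    """KServe v2 style inference payload (input name is an assumption;
    see GenerateImageUsingModel.ipynb for the actual format)."""
    return {
        "inputs": [
            {"name": "prompt", "shape": [1], "datatype": "BYTES", "data": [prompt]}
        ]
    }


def generate_image(inference_endpoint, prompt):
    """POST the prompt to the deployed model's inference endpoint."""
    req = urlrequest.Request(
        inference_endpoint,
        data=json.dumps(build_infer_payload(prompt)).encode(),
        headers={"Content-Type": "application/json"},
    )
    with urlrequest.urlopen(req) as resp:
        return json.load(resp)
```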
A Toy Jensen image can now be generated using the deployed model. To generate and retrieve the image, use the following steps:

- Open up GenerateImageUsingModel.ipynb
- Set the value of the `inference_endpoint` variable correctly by pointing it to your model's inference endpoint
  - Your model's inference endpoint should have been copied in the previous section
- Run this notebook by selecting the **Run** -> **Run All Cells** menu item
- When the notebook runs successfully, you should see a Toy Jensen image generated in the last cell.
- Red Hat OpenShift AI: `2.10.0`, `2.13.0`, `2.14.0`
- GPU: 1x NVIDIA `A10G`
- Storage: 50GB
Even though the latest version is used for all the modules installed for this project, here are the versions used underneath (in case any version incompatibility occurs in the future):
- accelerate: `1.1.1`
- boto3: `1.34.111`
- botocore: `1.34.111`
- dataclass_wizard: `0.26.0`
- diffusers: `0.32.0.dev0`
- ipywidgets: `8.1.2`
- jupyterlab: `3.6.8`
- huggingface_hub: `0.26.2`
- minio: `7.2.9`
- peft: `0.13.2`
- transformers: `4.46.2`
- torch: `2.2.2+cu121`
- torchvision: `0.17.2+cu121`
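If an incompatibility does show up, the versions above can be pinned with a requirements file along these lines (a sketch; the `+cu121` torch builds come from the PyTorch CUDA wheel index, so a plain PyPI install may need the versions without the local suffix):

```
accelerate==1.1.1
boto3==1.34.111
botocore==1.34.111
dataclass_wizard==0.26.0
diffusers==0.32.0.dev0
ipywidgets==8.1.2
jupyterlab==3.6.8
huggingface_hub==0.26.2
minio==7.2.9
peft==0.13.2
transformers==4.46.2
torch==2.2.2+cu121
torchvision==0.17.2+cu121
```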
The following notebooks contain output to give you an idea of how the outputs will look when the notebooks are run: