This lab covers MLOps using Azure DevOps and GitHub, training and deployment of models, real time scoring using a REST API endpoint.
The high-level steps covered in the lab are:
- Explore dashboard of COVID-19 data
- Explore lab scenario
- Observe the execution of CI/CD pipeline all the way to the update on the REST API endpoint
- Explore the execution and artifacts of the existing model training Azure DevOps pipeline: AML workspace with notebooks, experiments, and data
- Explore the execution of the existing Azure DevOps release pipeline
- Using GitHub actions to trigger CI/CD pipeline by committing a simple code change
- Keep gates within Azure Pipeline to pre-empt the triggers before the pipeline moves stages
- Approve changes after validation to see it move stages
- Repeat previous steps with data drift triggers
Understanding the source datasets is very important in AI and ML. To help you expedite the process, we have created a Power BI dashboard you can use to explore them at the begining of each lab.
To get more details about the source datasets, check out the Data Overview section.
To explore the dashboard of COVID-19 data, open the Azure-AI-in-a-Day-Data-Overview.pbix
file located on the desktop of the virtual machine provided with your environment.
Managing a virtually non-stop flux of incoming research documents should be based on a fully automated and traceable process. Everything from data to code must be tracked and monitored. The complex processes of Machine Learning model training and operationalization require secure, end-to-end approaches that allow teams of developers and analysts to iterate through multiple versions of the solution.
Using GitHub and GitHub Actions, we will build an end-to-end Machine Learning process, where data and code act like inputs and actionable REST API endpoints are the result. Our pipelines will automate building and operationalizing the Machine Learning model that classifies research papers.
The following diagram highlights the portion of the general architecture covered by this lab.
-
Open the Azure DevOps portal and select the Sign in to Azure DevOps link.
-
To sign-in, use the Azure credentials provided by the lab environment.
-
In Azure DevOps, open the pre-created project named
odluserXXXXXX-project
and navigate to the Pipelines section and observe the two pipelines pre-created for you:Model-Train-Register-CI
andModel-Deploy-CD
. -
Select the
Model-Train-Register-CI
pipeline and then select the first run as illustrated in the picture bellow. -
On the run results page, check the Summary section to understand how the pipeline is linked to GitHub. Right click on the repository name on the Repository and version column as illustrated bellow and choose to open it in a new tab.
-
You will be prompted to sign in with the Git credentials provided for you in the lab environment.
-
You will be asked to verify your account, so you should open your user's mailbox on https://outlook.office365.com/ to be able to receive the verification codes for GitHub authentication. Use the same GitHub user account credentials to open Outlook.
-
In GitHub, you will see the
azure-ai-in-a-day-lab02
repository that was pre-generated under your GitHub account. -
If the repository is not automatically opened, please select it from the available repositories list.
-
Spend a few moments to browse the repository structure. Focus on the
.pipelines
folder which contains the YAML scripts used by the Azure DevOps pipelines discussed later in this exercise. -
Going back to the Azure DevOps portal browser tab, on the
Model-Train-Register-CI
pipeline run page observe the two stages of the CI pipeline. Select the Model CI stage to open the execution log for this pipeline run. -
Select the Publish Azure Machine Learning Pipeline task node under Model CI Pipeline. Scroll down on the execution log window and observe the publishing of the Azure Machine Learning pipeline.The AML pipeline involves three main steps: Train, Evaluate and Register model.
-
Going back to the pipeline stages page, expand the Train and evaluate model stage from the Expand stage button.
-
Also you should inspect the artifact of the training stage:
-
Going back to the pipeline stages page, select the Train and evaluate model stage to open the run execution log.
-
Expand the Train and evaluate model task nodes and observe the three main tasks involved in this stage: Get pipeline ID for execution, Trigger ML Training Pipeline and Publish artifact is new model was registered.
-
To be able to understand the artifacts published in your Azure Machine Learning Workspace, open a new browser tab and sign in to the Azure Portal with the Azure credentials provided in the lab. Open the available Resource Group, locate and select the Machine Learning workspace that was pre-created in the lab environment.
-
Select Launch studio to navigate to the Azure Machine Learning Studio.
-
In the Azure Machine Learning Studio, select Pipelines from the left navigation menu, go to Pipeline endpoints and check the published training pipeline in the
ai-in-a-day-XXXXXX
workspace. -
The build pipeline for training automatically triggers every time there's a change in the master branch. After the pipeline is finished, you'll see a new model in the ML Workspace. Navigate to the Models section in ML Studio, using the left navigation menu and check the registered model named
COVID19Articles_model.pkl
.
The release deployment and batch scoring pipelines have the following behaviors:
-
The pipeline will automatically trigger on completion of the Model-Train-Register-CI pipeline for the master branch.
-
The pipeline will default to using the latest successful build of the Model-Train-Register-CI pipeline. It will deploy the model produced by that build.
-
You can specify a Model-Train-Register-CI build ID when running the pipeline manually. You can find this in the url of the build, and the model registered from that build will also be tagged with the build ID. This is useful to skip model training and registration, and deploy/score a model successfully registered by a Model-Train-Register-CI build.
The pipeline has the following stages:
-
Deploy to ACI
-
Deploy the model to the QA environment in Azure Container Instances.
-
Smoke test: scoring sample step
-
-
-
In Azure DevOps, open your project and navigate to the Pipelines section. Select the
Model-Deploy-CD
pipeline and then select the first run as illustrated in the picture bellow. -
The first time when the CD pipeline runs, it will use the latest model created by the Model-Train-Register-CI pipeline. Given that the pre-created pipeline has already finished the first execution, check the displayed execution stages as illustrated bellow.
-
Select the first stage: Deploy to ACI to open the execution log. Expand the list of executed tasks under the stage node. Select the Parse Json for Model Name and Version task. You can see how the model name and version are downloaded from the
Model-Train-Register-CI
pipeline. -
Select the Deploy to ACI (CLI) node and scroll the execution log to see the performed steps inside the deployment task.
-
Select the Smoke test: The test sends a sample query to the scoring web service and verifies that it returns the expected response. Have a look at the smoke test code for more details on the steps involved there.
-
To be able to understand the artifacts published in your Azure Machine Learning Workspace, go back to the Azure Machine Learning Studio page already opened in your browser and move directly to step 8. If you don't find ML Studio opened from the previous task, open a new browser tab and sign in to the Azure Portal with the Azure credentials provided in the lab. Open the available Resource Group, locate and select the Machine Learning workspace that was pre-created in the lab environment.
-
Select Launch studio to navigate to the Azure Machine Learning Studio.
-
In the Azure Machine Learning Studio, select Endpoints (1) from the left navigation menu, go to Real-time endpoints (2) and select the published ACI endpoint named
mlops-aci
(3). -
When opening the ACI endpoint details page, you can identify the REST endpoint used for scoring in the Smoke test example on step 5 of this Task.
In Azure DevOps Pipelines, teams can also take advantage of the Approvals and Gates feature to control the workflow of the deployment pipeline. Each stage in a release pipeline can be configured with pre-deployment and post-deployment conditions that can include waiting for users to manually approve or reject deployments, and checking with other automated systems that specific conditions are met.
-
Open the Azure DevOps portal and if not already signed in, select the Sign in to Azure DevOps link. (To sign-in, use the Azure credentials provided by the lab environment.)
-
In Azure DevOps, open the pre-created project named
odluserXXXXXX-project
, navigate to the Pipelines section and observe the two pipelines pre-created for you:Model-Train-Register-CI
andModel-Deploy-CD
. Select the Model-Train-Register-CI pipeline. -
Select Edit to open the pipeline designer.
-
In the pipeline YAML definition, locate the end of the job
Model_CI_Pipeline
(should be line 46). Add a new, empty line and then insert the following snippet (make sure to replaceodl_user_XXXXXX
with the actual user name assigned to your environment):- job: "Validate_Code_Quality" dependsOn: "Model_CI_Pipeline" displayName: "Validate code quality" timeoutInMinutes: 15 pool: server steps: - task: ManualValidation@0 timeoutInMinutes: 10 inputs: notifyUsers: | [email protected] instructions: 'Please validate the code quality analysis results and resume.' onTimeout: reject
After performing the changes, your YAML code should look similar to this:
-
Save and Run the pipeline. Select the new pipeline Run from the Runs list and check the status of the
Model CI
stage. Wait until theModel CI Pipeline
job completes and then notice theReview
popup that appears when theValidate code quality
job starts executing. SelectReview
to perform the manual intervention. -
Notice the timeout indication at the top of the
Manual Validation
popup. SelectResume
to allow the pipeline execution to continue.
/OPTIONAL_ACTION
Github workflows are triggered based on events specified inside workflows. These events can be from inside the github repo like a push commit or can be from outside like a webhook.
We have created sample workflow file train_deploy.yml to train the model and deploy the ACI endpoint, similar to what we did in the previous exercise using DevOps Pipelines. This GitHub workflow trains the model in a first job and, on successful training completion, deploys the model inside the second job. You will activate this workflow by doing a commit to any file under the included path: COVID19Articles_GH/
folder.
-
Sign-in to GitHub with your GitHub account.
-
Navigate to the already generated repository for this lab, named
azure-ai-in-a-day-lab02
. If the repository is not automatically opened, please select it from the available repositories list. -
Navigate to the
.github/workflows
folder and open the train_deploy.yml (1) workflow definition file . Observe the two jobs defined for the GitHub Actions workflow: train-register and deploy (3). The workflow is triggered when a code change is committed inside the COVID19Articles_GH repository folder or themaster
branch (2). -
With the repository opened, select the Actions section from the top navigation menu. Observe the active workflow that was already initiated at the first code commit, when the
.github/workflows/train_deploy.yml
definition file was initially created in the repository. -
Going back to the repository files, open the COVID19Articles_GH/train folder and select the
train_aml.py
code file. -
Open the
train_aml.py
code file in edit mode. -
Go to line 172 and change the
max_depth
argument value from 5 to 4 (1). Commit your change (2). -
Navigate to the Actions section in GitHub and observe how your code change automatically triggered the GitHub Action train-deploy-workflow. Select the workflow run.
-
Observe the two jobs of the workflow run. Select each stage and check the job steps in the execution log.
-
While waiting for the workflow to execute, watch it how it moves from the first job to the second one and switch back to the Azure Machine Learning Studio page to see the generated artifacts of each step: register dataset, submit experiment, register model and deploy ACI endpoint.
-
Navigate to the Experiments section (1). You should be able to see the new experiment COVID19_Classification_GH (2) that was started by the GitHub Actions Workflow you triggered. Expand the experiment to see the created experiment Run.
-
Navigate to the Models (1) page. You should be able to see the model registered by the GitHub Actions workflow, named COVID19Articles_githubactions (2).
-
Check the deploy job execution log while it is executing. When the deploy stage completes, in ML Studio, navigate to the Endpoints (1) page. You should be able to see the ACI endpoint named
mlops-aci-githubactions
(2) published inside the GH Actionsdeploy
job.
In the Azure Portal, we prepared an Azure Function that triggers the above described GitHub workflow every time a new version of the COVID16Articles.csv file is uploaded in the datastore.
-
Open the Azure portal and sign-in using the Azure credentials provided by the lab environment.
-
From the left navigation menu, select the Storage accounts option and locate the storage named
mlstrgXXXXXX
as illustrated bellow. -
Select
mlstrgXXXXXX
storage account and select Containers from the account Overview page. -
Select the
azureml-blobstore-...
container. This is the storage container linked to the default Datastore used by Azure Machine Learning Workspace. -
Navigate to the
training-data\COVID19Articles.csv
file. Download it to your computer and change anything inside the file. -
Upload the new version of COVID19Articles.csv in the same
training-data
folder by using the Upload button from the top menu bar.
7.Select the option to overwrite the file since it already exists in the container.