End-to-end machine learning using multiple AWS accounts across multiple environments.
A few terms to get familiar with before we get started with lab provisioning:
Tools Account - An AWS account managed by a centralized IT team, who are responsible for deploying the ML models to production through the MLOps code pipeline.
DataScience Account - An AWS account used by Data Scientists to provision Amazon SageMaker notebook instances, train ML models, and submit the models once they are approved.
Stage Account - An AWS account where the code pipeline automatically deploys and validates the ML models.
Production Account - An AWS account where the production applications run.
Note: For this lab, we are only using the Tools, DataScience, and Stage accounts. The MLOps pipeline can, however, be extended to auto-deploy the models to the Production environment.
The figure below shows the architecture you will build in this lab.
Step-1: Prepare the Lab environment
- Configure Service Catalog Product/Portfolio in the Tools Account and share it with a Spoke account (DataScience Account for this lab).
- Configure a Service Catalog Product/Portfolio and other networking resources in the DataScience account and allow access to Data Scientists user/role.
- Configure the stage accounts [Steps]
- Configure MLOps Pipeline in the Tools Account [Steps]
Step-2: Data scientists request AWS resources
- Log in to the DataScience AWS account
- Go to AWS Service Catalog and launch the SageMaker notebook instance
- Use the Outputs from AWS Service Catalog and continue with the remaining work.
Step-3: Data scientists build/train the ML models and submit the final Model.
- Steps to start a notebook
- Steps to build/train the ML model
- Steps to submit the model to the S3 bucket in the Tools account
In this section, we will deploy the AWS Service Catalog portfolio in the Tools account, share it with the DataScience account, allow Data Scientists to launch Service Catalog resources, and set up the MLOps CodePipeline. For this lab, we will use CloudFormation to create all the required resources.
Make a note of the AWS account IDs for the Tools, DataScience, and Stage accounts provided to you. You will use these in the steps below.
PLEASE READ: Service Catalog is a regional service. Please make sure that you use the same region for all three accounts you work with. We will create all lab resources in the AWS region used by the console links below: us-east-2
- Clone or download the zip file of this repo.
git clone https://github.com/sirimuppala/cross-account-mlops.git
- If you downloaded the zip, unzip the file.
1.1.1 Log in to your assigned Tools Account using the credentials provided by your lab administrator.
1.2.1 Copy and paste the link below into the web browser where you are logged in to your Tools Account: https://us-east-2.console.aws.amazon.com/cloudformation#/stacks/new?stackName=LabSCToolsAccountSetup&templateURL=https://marketplace-sa-resources.s3.amazonaws.com/scmlops/prepare_tools_account.yaml
- In Create stack page, choose Next
- In Specify stack details page, type in your DataScience account ID for SpokeAccountID and choose Next
- In Configure stack options page, leave the defaults and choose Next
- Scroll down the Review LabSCToolsAccountSetup page to review the selections and choose Create stack
- Wait for the stack to finish deploying all resources (status CREATE_COMPLETE).
- Choose the Outputs tab and note down the values of MasterPortfolioId, SagemakerProductID, and ToolsAccountID. You will use this information in the next step.
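If you prefer working from the command line, the console steps above can be approximated with the boto3 sketch below. It assumes your credentials point at the Tools account, reuses the template URL and the SpokeAccountID parameter shown above, and reads back the three output values; treat it as an illustration rather than the official lab path.

```python
# Sketch: create the Tools account setup stack and read its outputs with boto3.
# Assumes Tools account credentials and the same region as the console links (us-east-2).
import boto3

cfn = boto3.client("cloudformation", region_name="us-east-2")

stack_name = "LabSCToolsAccountSetup"
cfn.create_stack(
    StackName=stack_name,
    TemplateURL="https://marketplace-sa-resources.s3.amazonaws.com/scmlops/prepare_tools_account.yaml",
    Parameters=[
        # Replace with your DataScience account ID.
        {"ParameterKey": "SpokeAccountID", "ParameterValue": "111122223333"},
    ],
    # Harmless if the template creates no IAM resources; required if it does.
    Capabilities=["CAPABILITY_IAM", "CAPABILITY_NAMED_IAM"],
)

# Wait until the stack finishes, then print the outputs you need for later steps.
cfn.get_waiter("stack_create_complete").wait(StackName=stack_name)
outputs = cfn.describe_stacks(StackName=stack_name)["Stacks"][0]["Outputs"]
for o in outputs:
    if o["OutputKey"] in ("MasterPortfolioId", "SagemakerProductID", "ToolsAccountID"):
        print(o["OutputKey"], "=", o["OutputValue"])
```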
1.2.2 Go to Service Catalog Console - https://us-east-2.console.aws.amazon.com/servicecatalog/
- Choose Portfolios, then choose Data Scientists - Sample Portfolio
- Choose Share (1) to list the accounts the portfolio is shared with. Note: no action is needed here; just verify that the portfolio is shared with the SpokeAccountID you provided.
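To double-check the share programmatically, a minimal boto3 sketch (run in the Tools account, using the MasterPortfolioId you noted earlier):

```python
# Sketch: confirm which accounts the portfolio is shared with (run in the Tools account).
import boto3

sc = boto3.client("servicecatalog", region_name="us-east-2")

portfolio_id = "port-xxxxxxxxxxxx"  # replace with your MasterPortfolioId
shared_accounts = sc.list_portfolio_access(PortfolioId=portfolio_id)["AccountIds"]
print("Portfolio shared with:", shared_accounts)  # should include the DataScience (Spoke) account ID
```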
1.3.1 Create a CloudFormation stack to prepare the Lambda functions to be used by the MLOps pipeline
- a. In Create stack page, choose Upload a template file and choose the file tools-account/pipeline/PrepPipeline.yml; click Next
- b. In Specify stack details page, type in MLOpsPipelinePrep for Stack Name.
- c. In Configure stack options page, leave the defaults and choose Next
- d. Scroll down the Review page to review the selections and choose Create stack
This step will create an S3 bucket with name "mlops-bia-lambda-functions-XXXXXXXXXXXX" where the X's represent the AWS Account ID.
1.3.2 From the S3 console, upload the Lambda zip files from the downloaded git repo to the S3 bucket created above ("mlops-bia-lambda-functions-XXXXXXXXXXXX"):
- a. Upload tools-account/lambda-code/MLOps-BIA-DeployModel.py.zip
- b. Upload tools-account/lambda-code/MLOps-BIA-GetStatus.py.zip
- c. Upload tools-account/lambda-code/MLOps-BIA-EvaluateModel.py.zip
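If you would rather script the upload, a boto3 sketch along these lines works from the root of the cloned repo; the bucket name follows the pattern noted in step 1.3.1, and the object keys are assumed to simply be the zip file names.

```python
# Sketch: upload the Lambda deployment packages to the bucket created by MLOpsPipelinePrep.
# Run from the root of the cloned repo, with credentials for the Tools account.
import boto3

TOOLS_ACCOUNT_ID = "111122223333"  # replace with your Tools account ID
bucket = f"mlops-bia-lambda-functions-{TOOLS_ACCOUNT_ID}"

s3 = boto3.client("s3", region_name="us-east-2")
for zip_file in (
    "MLOps-BIA-DeployModel.py.zip",
    "MLOps-BIA-GetStatus.py.zip",
    "MLOps-BIA-EvaluateModel.py.zip",
):
    s3.upload_file(f"tools-account/lambda-code/{zip_file}", bucket, zip_file)
    print("Uploaded", zip_file, "to", bucket)
```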
1.3.3 Create a CloudFormation stack to set up the MLOps CodePipeline
- a. In Create stack page, choose Upload a template file and choose the file tools-account/pipeline/BuildPipeline.yml; click Next
- b. In Specify stack details page, type in MLOpsPipeline for Stack Name.
- c. In Configure stack options page, type in the DataScienceAccountID, the StageAccountID, and a UniqueID, then click Next
- d. Scroll down the Review page to review the selections, select the checkbox to acknowledge that IAM resources will be created, and click Create stack
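An equivalent command-line sketch is shown below, assuming the parameter keys match the console prompts (DataScienceAccountID, StageAccountID, UniqueID); verify the exact names in the Parameters section of BuildPipeline.yml before running.

```python
# Sketch: create the MLOpsPipeline stack from the local template (run from the repo root).
# The parameter keys below are assumptions based on the console walkthrough; confirm them
# against tools-account/pipeline/BuildPipeline.yml.
import boto3

cfn = boto3.client("cloudformation", region_name="us-east-2")

with open("tools-account/pipeline/BuildPipeline.yml") as f:
    template_body = f.read()

cfn.create_stack(
    StackName="MLOpsPipeline",
    TemplateBody=template_body,
    Parameters=[
        {"ParameterKey": "DataScienceAccountID", "ParameterValue": "222233334444"},
        {"ParameterKey": "StageAccountID", "ParameterValue": "333344445555"},
        {"ParameterKey": "UniqueID", "ParameterValue": "mylab01"},
    ],
    # API equivalent of the console's "acknowledge that IAM resources will be created" checkbox.
    Capabilities=["CAPABILITY_IAM", "CAPABILITY_NAMED_IAM"],
)
cfn.get_waiter("stack_create_complete").wait(StackName="MLOpsPipeline")
```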
1.4 Log in to your assigned DataScience Account using the Lab Administrator credentials provided.
1.5.1 Copy and paste the link below into your web browser: https://us-east-2.console.aws.amazon.com/cloudformation#/stacks/new?stackName=LabDSAccountSCSetup&templateURL=https://marketplace-sa-resources.s3.amazonaws.com/scmlops/prepare_datascientist_account.yaml
- a. In Create stack page, choose Next
- b. Enter the MasterPortfolioId, SagemakerProductID and ToolsAccountID you noted in Step 1.2.1 and choose Next.
- c. In Configure stack options page, leave the defaults and choose Next
- d. Scroll down the Review LabDSAccountSCSetup page, select the I acknowledge that AWS CloudFormation might create IAM resources option, and choose Create stack
- e. Once the stack is created, check the Outputs tab and note down the SwitchRoleLink value. You will use this URL to switch to the DataScientist role in Step-2 below.
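If you later need the SwitchRoleLink value again, it can be pulled from the stack outputs with a couple of boto3 calls (a sketch; run with DataScience account credentials):

```python
# Sketch: read the SwitchRoleLink output of the LabDSAccountSCSetup stack.
import boto3

cfn = boto3.client("cloudformation", region_name="us-east-2")
outputs = cfn.describe_stacks(StackName="LabDSAccountSCSetup")["Stacks"][0]["Outputs"]
switch_role_link = next(o["OutputValue"] for o in outputs if o["OutputKey"] == "SwitchRoleLink")
print("DataScientist switch-role URL:", switch_role_link)
```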
1.6 Create DataScience resources
1.6.1 Create a CloudFormation stack to create an S3 bucket to hold all training data and model artifacts
- a. In Create stack page, choose "Upload a template file", Choose file : datascience-account/CreateDSResources.yml; Click Next
- b. In Specify stack details page, type in "DSResources" for Stack Name.
- c. In Configure stack options page, leave the defaults and choose Next
- d. Scroll down Review page to review the selections, select checkbox to acknowledge that IAM resources will be created. Click Create stack
1.7.1 Log in to your assigned Stage Account using the Lab Administrator credentials provided.
1.7.2 Create a CloudFormation stack
- a. In Create stack page, choose "Upload a template file", Choose file : stage-account/CreateResources.yml; Click Next
- b. In Specify stack details page, type in "StageResources" for Stack Name.
- c. In Configure stack options page, type in the Tools account ID for ToolsAccountID and choose Next
- d. Scroll down Review page to review the selections and click Create stack
In this section, you will log in as a Data Scientist and launch a secure SageMaker notebook from the self-service portal powered by AWS Service Catalog.
2.1. Log in to the DataScience account using the same Lab Administrator credentials you used in Step 1.4
2.2. Switch to the DataScientist role using the SwitchRoleLink URL you noted in Step 1.5.1 (e)
2.3. Under Find services, search for and choose Service Catalog
2.4. You will now see an "Amazon Secure Sagemaker" product in the Products list. Note: if you don't see the product, make sure you switched to the DataScientist role correctly and that you are in the correct region. You can check both in the top-right corner of the page.
- Click on the product, then click the LAUNCH PRODUCT button
- Under the Product version page, enter a name for your Service Catalog provisioned product and choose NEXT
- Select the SagemakerInstance notebook instance size (small for the purposes of this lab) and select a TeamName
- In the TagOptions page, select a Value from the drop-down for the cost-center tag and choose NEXT
- Leave the defaults in the Notifications page and choose NEXT
- Under the Review page, review all the selected options and choose LAUNCH
- On successful completion of the product launch, the Data Scientist can find the notebook access information on the Outputs page of the provisioned product (as shown below).
- Make note of the BucketName value in the outputs. You will use this in Step 3.
- Click SageMakerNoteBookURL to open the notebook interface in the console. Alternatively, click SageMakerNoteBookTerminalURL to open the Terminal.
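The console flow above is the intended path for this lab. For reference, the same launch can be scripted with boto3 as sketched below; the parameter keys (SagemakerInstance, TeamName) and their values are assumptions based on the console prompts, so list the real ones with describe_provisioning_parameters before launching.

```python
# Sketch: provision the shared "Amazon Secure Sagemaker" product with boto3 instead of the
# console. Run as the DataScientist role in the DataScience account.
import boto3

sc = boto3.client("servicecatalog", region_name="us-east-2")

# Find the product and pick a provisioning artifact (version) to launch.
product = sc.search_products(Filters={"FullTextSearch": ["Amazon Secure Sagemaker"]})["ProductViewSummaries"][0]
product_id = product["ProductId"]
artifact_id = sc.describe_product(Id=product_id)["ProvisioningArtifacts"][0]["Id"]

sc.provision_product(
    ProductId=product_id,
    ProvisioningArtifactId=artifact_id,
    ProvisionedProductName="my-secure-sagemaker-notebook",
    ProvisioningParameters=[
        {"Key": "SagemakerInstance", "Value": "ml.t3.medium"},  # assumed parameter key/value
        {"Key": "TeamName", "Value": "team-a"},                 # assumed parameter key/value
    ],
)
```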
You are now accessing the SageMaker notebook instance you self-provisioned as a Data Scientist.
Step-3: Data scientists build/train the ML models and, once ready, submit the ML model to kick off MLOps
In this step, you will build an XGBoost model in the SageMaker notebook instance provisioned in Step 2. Once the model is validated and ready to be handed over to IT, you will transfer the model along with test data to the Tools account.
3.1 Open "xgboost_abalone.ipynb" in Jupyter. The steps to follow are documented in the notebook itself. Please read through the narration and execute each cell.
In the last cell of "xgboost_abalone.ipynb" you transfer the ML model along with test data to the Tools account, which automatically kicks off the MLOps pipeline.
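For orientation, that hand-off boils down to a cross-account S3 upload roughly like the sketch below; the bucket name pattern and object keys here are illustrative assumptions, and the notebook contains the authoritative values.

```python
# Sketch: copy the trained model artifact and test data to the Tools account bucket that the
# CodePipeline Source stage watches. Bucket name pattern and keys are assumptions; the actual
# values are set in xgboost_abalone.ipynb. The bucket policy in the Tools account must allow
# the notebook role to put objects cross-account.
import boto3

TOOLS_ACCOUNT_ID = "111122223333"  # replace with your Tools account ID
tools_bucket = f"mlops-bia-data-model-{TOOLS_ACCOUNT_ID}"  # assumed bucket name pattern

s3 = boto3.client("s3")
s3.upload_file("model.tar.gz", tools_bucket, "model.tar.gz")
s3.upload_file("test_data.csv", tools_bucket, "test_data.csv")
print("Model and test data uploaded; the MLOps pipeline Source stage should trigger shortly.")
```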
3.2 Login to the Tools account.
3.3 Search for and navigate to "CodePipeline".
3.4 Click on the pipeline with name starting with "MLOpsPipeline-CodePipeline"
3.5 The pipeline shows multiple stages: Source, DeployModels-Tools, DeployModels-Stage.
3.6 The Source stage was triggered when you copied the model to the Tools S3 bucket.
3.7 The pipeline automatically deploys and validates the model in the Tools account, and then deploys and validates the model in the Stage account.
While this pipeline is limited to deploying and validating the model in two accounts, it can be extended to more accounts and environments, e.g., performance, non-prod, and production.
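You can also watch the stages from the command line; a boto3 sketch that looks up the pipeline by its name prefix and prints the latest status of each stage:

```python
# Sketch: check the pipeline's stage-by-stage status from the Tools account without the console.
import boto3

cp = boto3.client("codepipeline", region_name="us-east-2")

pipelines = cp.list_pipelines()["pipelines"]
pipeline_name = next(p["name"] for p in pipelines if p["name"].startswith("MLOpsPipeline-CodePipeline"))

state = cp.get_pipeline_state(name=pipeline_name)
for stage in state["stageStates"]:
    status = stage.get("latestExecution", {}).get("status", "Not started")
    print(stage["stageName"], "->", status)
```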
Once you are done with the lab, delete the resources created to avoid unnecessary costs. Please delete the resources in the order specified.
- Stage Account
  - Empty the S3 bucket with name starting with "mlops-bia-data-model"
  - Delete the CloudFormation Stack with name "StageResources"
  - Delete the Amazon SageMaker resources: Model, Endpoint Configuration, Endpoint.
- DataScience Account
  - If not already logged in with the "DataScientist" role, log in using the DataScientist role and terminate the provisioned SageMaker product.
  - Log in with the administrator credentials (originally provided by the lab administrator), then:
    - Empty the S3 bucket with name starting with "datascience-project"
    - Delete the CloudFormation stack with name "LabDSAccountSCSetup"
    - Delete the CloudFormation stack with name "DSResources"
- Tools Account
- Delete the CloudFormation Stack with name "LabSCToolsAccountSetup"
- Empty the S3 bucket with name starting with "mlops-bia-data-model-"
- Empty the S3 bucket with name stating with "mlops-bia-codepipeline-artifacts"
- Empty the S3 bucket with name stating with "mlops-bia-lambda-code-"
- Delete the CloudFormation Stack with name "MLOpsPipeline". Wait till the stack is deleted.
- Delete the CloudFormation Stack with name "MLOpsPipelinePrep"