ML Pipelines with AWS Glue and Amazon SageMaker using Jenkins

This repository walks through the implementation of a CI/CD ML pipeline that uses AWS Glue for data processing, Amazon SageMaker for training, versioning, and hosting real-time endpoints, and Jenkins pipelines for orchestrating the workflow. Using the AWS CLI commands for AWS Glue and Amazon SageMaker, it shows how to process data with AWS Glue, train ML models with Amazon SageMaker Training, and then either deploy them with Amazon SageMaker Hosting Services or run batch inference with Amazon SageMaker Batch Transform.
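As a rough illustration of the flow described above, the following Python sketch assembles one AWS CLI command per stage: data processing, training, and either real-time hosting or batch inference. It is not taken from this repository's Jenkinsfiles. The job name and the JSON request file names (`training-job.json`, `inference.json`) are hypothetical placeholders, and a real pipeline would also wait for each stage to finish, register the model, and create the model and endpoint configuration before creating an endpoint.

```python
import shlex
from typing import List


def glue_start_job_run(job_name: str) -> List[str]:
    """Build the AWS CLI command that starts an AWS Glue job run."""
    return ["aws", "glue", "start-job-run", "--job-name", job_name]


def sagemaker_call(operation: str, request_file: str) -> List[str]:
    """Build an AWS CLI SageMaker command that reads its request from a JSON file."""
    return ["aws", "sagemaker", operation,
            "--cli-input-json", f"file://{request_file}"]


def pipeline_commands(glue_job: str, batch_inference: bool = False) -> List[List[str]]:
    """Return one command per stage: process, train, then deploy or batch transform."""
    # Real-time hosting and batch inference are alternative final stages.
    inference_op = "create-transform-job" if batch_inference else "create-endpoint"
    return [
        glue_start_job_run(glue_job),
        # File names below are hypothetical placeholders for the request payloads.
        sagemaker_call("create-training-job", "training-job.json"),
        sagemaker_call(inference_op, "inference.json"),
    ]


if __name__ == "__main__":
    # Print the commands instead of running them, so nothing is created in AWS.
    for cmd in pipeline_commands("my-glue-job"):
        print(shlex.join(cmd))
```

Printing the commands rather than executing them keeps the sketch safe to run without AWS credentials; a Jenkins stage would run the equivalent commands in a shell step.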

Everything can be tested by using the following frameworks:

Reference Architecture

[Figure: reference architecture]

Training pipeline

[Figure: training pipeline]

Deployment pipeline

[Figure: deployment pipeline]

Environment Setup

Set up the ML environment by deploying the CloudFormation templates described below:

1. 00-ml-environment: This template deploys the resources that Amazon SageMaker needs: an AWS KMS key and alias, an Amazon S3 bucket for storing code and ML model artifacts, an Amazon SageMaker Model Registry for versioning ML models, and IAM policies and roles for Amazon SageMaker and for the Jenkins AWS profile. Parameters:

  • KMSAlias: Name of the KMS alias. Optional
  • ModelPackageGroupDescription: Description for the Amazon SageMaker Model Package Group. Optional
  • ModelPackageGroupName: Name for the Amazon SageMaker Model Package Group. Mandatory
  • S3BucketName: Amazon S3 Bucket name. Mandatory
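
The parameters above can be passed to the template with `aws cloudformation deploy`. The following Python sketch builds that command and enforces the two mandatory parameters. The stack name and template file path are hypothetical, and `CAPABILITY_NAMED_IAM` is an assumption based on the template creating IAM roles and policies; adjust all three to match the actual template.

```python
from typing import Dict, List, Optional


def build_deploy_command(
    stack_name: str,
    template_file: str,
    model_package_group_name: str,
    s3_bucket_name: str,
    kms_alias: Optional[str] = None,
    model_package_group_description: Optional[str] = None,
) -> List[str]:
    """Build an `aws cloudformation deploy` command for the ML environment template."""
    # ModelPackageGroupName and S3BucketName are mandatory.
    if not model_package_group_name or not s3_bucket_name:
        raise ValueError("ModelPackageGroupName and S3BucketName are mandatory")

    params: Dict[str, str] = {
        "ModelPackageGroupName": model_package_group_name,
        "S3BucketName": s3_bucket_name,
    }
    # Optional parameters are only sent when a value is provided.
    if kms_alias:
        params["KMSAlias"] = kms_alias
    if model_package_group_description:
        params["ModelPackageGroupDescription"] = model_package_group_description

    cmd = [
        "aws", "cloudformation", "deploy",
        "--stack-name", stack_name,
        "--template-file", template_file,
        # Assumption: needed because the template creates IAM roles and policies.
        "--capabilities", "CAPABILITY_NAMED_IAM",
        "--parameter-overrides",
    ]
    cmd += [f"{key}={value}" for key, value in params.items()]
    return cmd


if __name__ == "__main__":
    # Hypothetical stack name, template path, and parameter values.
    print(" ".join(build_deploy_command(
        "ml-environment", "00-ml-environment.yml",
        "my-model-group", "my-ml-bucket")))
```

Passing the returned list to `subprocess.run(cmd, check=True)` would execute the deployment, provided the AWS CLI is installed and configured with suitable credentials.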

Source Code

Build and Train ML models

Inference and Deploy ML models

Jenkins Environment

This section sets up a local Jenkins environment for testing the ML pipelines. Follow the README to run Jenkins in a container built from the provided Dockerfile.

Setup pipeline

To create the Jenkins pipeline:

Create Job

[Figure: create job]

Create Pipeline

[Figure: create pipeline]

Define Jenkinsfile

Create a Jenkins pipeline for the specific purpose by copying the content from

[Figure: define Jenkinsfile]

Define Jenkinsfile from Git repository

Create a Jenkins pipeline by pointing to a Jenkinsfile directly from the Git repository:

[Figure: define Jenkinsfile from Git repository]

Conclusion

This example showed how to implement end-to-end pipelines for machine learning workloads with Jenkins, using the AWS CLI to interact with AWS Glue for data processing and with Amazon SageMaker for training and versioning ML models, creating real-time endpoints, and running batch inference.

If you have any comments, please contact:

Bruno Pistone [email protected]

Security

See CONTRIBUTING for more information.

License

This library is licensed under the MIT-0 License. See the LICENSE file.