MLflow with basic auth
A dockerized MLflow Tracking Server with basic auth (username and password).
You can deploy the server in three ways: on AWS, on Heroku, or locally.
We provide a Terraform stack that can be easily used to deploy the MLflow Tracking Server.
NOTE: This project is not intended to be used for production deployments. It is intended to be used for testing and development.
The environment variables below are required to deploy this project.
Variable | Description | Default |
---|---|---|
PORT | Port for the MLflow server | 80 |
MLFLOW_ARTIFACT_URI | S3 Bucket URI for MLflow's artifact store | "./mlruns" |
MLFLOW_BACKEND_URI | SQLAlchemy database URI (if provided, the MLFLOW_DB_* variables are ignored) | |
DATABASE_URL | SQLAlchemy database URI used by the Heroku deployment; its value is copied to MLFLOW_BACKEND_URI | |
MLFLOW_DB_DIALECT | Database dialect (e.g. postgresql, mysql+pymysql, sqlite) | "postgresql" |
MLFLOW_DB_USERNAME | Backend store username | "mlflow" |
MLFLOW_DB_PASSWORD | Backend store password | "mlflow" |
MLFLOW_DB_HOST | Backend store host | |
MLFLOW_DB_PORT | Backend store port | 3306 |
MLFLOW_DB_DATABASE | Backend store database | "mlflow" |
MLFLOW_TRACKING_USERNAME | Username for MLflow UI and API | "mlflow" |
MLFLOW_TRACKING_PASSWORD | Password for MLflow UI and API | "mlflow" |
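To illustrate how the database variables in the table relate to each other, here is a minimal sketch of the lookup they imply. It is not the project's actual startup script: the function name `resolve_backend_uri` is hypothetical, and the choice to check MLFLOW_BACKEND_URI before DATABASE_URL is an assumption, since the table only states that an explicit MLFLOW_BACKEND_URI overrides the MLFLOW_DB_* parts.

```python
from urllib.parse import quote_plus


def resolve_backend_uri(env):
    """Return the backend store URI implied by the variables in the table."""
    # An explicit URI wins; the MLFLOW_DB_* parts are then ignored.
    if env.get("MLFLOW_BACKEND_URI"):
        return env["MLFLOW_BACKEND_URI"]
    # Assumed ordering: Heroku's DATABASE_URL is the next fallback.
    if env.get("DATABASE_URL"):
        return env["DATABASE_URL"]
    # Otherwise assemble the URI from the parts, using the table's defaults.
    dialect = env.get("MLFLOW_DB_DIALECT", "postgresql")
    username = env.get("MLFLOW_DB_USERNAME", "mlflow")
    password = env.get("MLFLOW_DB_PASSWORD", "mlflow")
    host = env["MLFLOW_DB_HOST"]  # the table lists no default for the host
    port = env.get("MLFLOW_DB_PORT", "3306")
    database = env.get("MLFLOW_DB_DATABASE", "mlflow")
    # Quote credentials so characters such as "@" or "/" do not break the URI.
    return (
        f"{dialect}://{quote_plus(username)}:{quote_plus(password)}"
        f"@{host}:{port}/{database}"
    )
```

Calling `resolve_backend_uri(os.environ)` would give the URI for the current environment. The sketch does not handle file-based dialects such as sqlite, which take no host or port.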
Amazon ECR
Amazon Elastic Container Registry (ECR) is a fully managed container registry that makes it easy to store, manage, share, and deploy your container images and artifacts anywhere.
App Runner
AWS App Runner is a fully managed service that makes it easy for developers to quickly deploy containerized web applications and APIs, at scale and with no prior infrastructure experience required. Start with your source code or a container image.
Amazon S3
Amazon Simple Storage Service (Amazon S3) is an object storage service that offers industry-leading scalability, data availability, security, and performance.
Amazon Aurora Serverless
Amazon Aurora Serverless is an on-demand, auto-scaling configuration for Amazon Aurora. It automatically starts up, shuts down, and scales capacity up or down based on your application's needs. You can run your database on AWS without managing database capacity.
To deploy MLflow, you'll need to:
- Create an AWS account if you don't already have one.
- Configure AWS CLI to use your AWS account.
- Clone this repository.
git clone https://github.com/DougTrajano/mlflow-server.git
- Open the mlflow-server/terraform folder.
cd mlflow-server/terraform
- Run the following command to create all the required resources:
terraform init
terraform apply -var mlflow_username="YOUR-USERNAME" -var mlflow_password="YOUR-PASSWORD"
See terraform/variables.tf for the full list of available variables.
- Type "yes" when prompted to continue.
Plan: 21 to add, 0 to change, 0 to destroy.
Changes to Outputs:
+ artifact_bucket_id = (known after apply)
+ mlflow_password = (sensitive value)
+ mlflow_username = "doug"
+ service_url = (known after apply)
+ status = (known after apply)
Do you want to perform these actions?
Terraform will perform the actions described above.
Only 'yes' will be accepted to approve.
Enter a value: yes
This will create the following resources:
- An S3 bucket to store MLflow artifacts.
- An IAM role and policy that allow MLflow to access the S3 bucket.
- An Aurora Serverless database (PostgreSQL) to store MLflow data.
- An App Runner service to run the MLflow Tracking Server.
- (Optional) A set of network resources such as a VPC, subnet, and security group.
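Once the apply finishes, the outputs shown in the plan (such as service_url and artifact_bucket_id) can be read back programmatically. The sketch below assumes the standard `terraform output -json` command, which prints each output as an object with a `value` field; the function names and the default working directory are illustrative choices, not part of this project.

```python
import json
import subprocess


def parse_terraform_outputs(raw_json):
    """Map each output name to its value from `terraform output -json` text."""
    outputs = json.loads(raw_json)
    return {name: entry["value"] for name, entry in outputs.items()}


def read_terraform_outputs(workdir="mlflow-server/terraform"):
    """Run `terraform output -json` in workdir and return the parsed values."""
    result = subprocess.run(
        ["terraform", "output", "-json"],
        cwd=workdir,
        check=True,           # raise if terraform exits with an error
        capture_output=True,
        text=True,
    )
    return parse_terraform_outputs(result.stdout)
```

For example, `read_terraform_outputs()["service_url"]` would return the App Runner address. Note that the JSON form prints sensitive outputs such as mlflow_password in plain text, so avoid logging the full result.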
- Heroku Account
- AWS Account
- The Heroku deployment uses an Amazon S3 bucket to store MLflow artifacts.
- AWS CLI
- Terraform CLI
- Create an AWS account if you don't already have one.
- Configure AWS CLI to use your AWS account.
- Clone this repository.
git clone https://github.com/DougTrajano/mlflow-server.git
- Open the mlflow-server/terraform folder.
cd mlflow-server/terraform
- Run the following command to create only the S3 bucket:
terraform init
terraform apply -var environment="heroku" -target="module.s3"
- Type "yes" when prompted to continue.
Plan: 5 to add, 0 to change, 0 to destroy.
Changes to Outputs:
+ artifact_bucket_id = (known after apply)
Do you want to perform these actions?
Terraform will perform the actions described above.
Only 'yes' will be accepted to approve.
Enter a value: yes
- Create an IAM Policy for the S3 bucket as follows:
IAM Policy example
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "s3:ListBucket"
      ],
      "Resource": "arn:aws:s3:::mlflow-heroku-20220723133820303500000001"
    },
    {
      "Effect": "Allow",
      "Action": [
        "s3:*",
        "s3-object-lambda:*"
      ],
      "Resource": "arn:aws:s3:::mlflow-heroku-20220723133820303500000001/*"
    }
  ]
}
- Create an IAM User and attach the IAM Policy previously created.
Take note of the IAM User access key and secret key; you'll need them in step 5.
- Click on the "Deploy to Heroku" button below.
- Follow the instructions on the new page to create an MLflow Tracking Server.
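The bucket name in the IAM policy example is specific to one deployment, so you will need to substitute your own. As a small sketch, the same policy document can be generated for any bucket name; `build_bucket_policy` and the placeholder bucket name are hypothetical and only mirror the example above.

```python
import json


def build_bucket_policy(bucket_name):
    """Return the IAM policy document from the example for one bucket."""
    bucket_arn = f"arn:aws:s3:::{bucket_name}"
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                # Listing applies to the bucket itself.
                "Effect": "Allow",
                "Action": ["s3:ListBucket"],
                "Resource": bucket_arn,
            },
            {
                # Object-level actions apply to the keys inside the bucket.
                "Effect": "Allow",
                "Action": ["s3:*", "s3-object-lambda:*"],
                "Resource": f"{bucket_arn}/*",
            },
        ],
    }


if __name__ == "__main__":
    # "my-mlflow-bucket" is a placeholder; use your artifact_bucket_id output.
    print(json.dumps(build_bucket_policy("my-mlflow-bucket"), indent=2))
```

The printed JSON can be pasted into the IAM console when creating the policy.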
- Docker and Docker Compose.
- Clone this repository.
git clone https://github.com/DougTrajano/mlflow-server.git
- Open the
mlflow-server
folder.
cd mlflow-server
- Run the following command to create all the required resources:
docker-compose up -d --build
The link that you will use to access the MLflow Tracking Server will depend on the deployment method you choose.
- For AWS, the link will be something like https://XXXXXXXXX.aws-region.awsapprunner.com/. You can find it in the AWS App Runner console.
- For Heroku, the link will be something like https://XXXXXXXXX.herokuapp.com/. You can find it in the Heroku dashboard.
- For local, the link will be something like http://localhost:80/.
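To confirm that the server is reachable and that your credentials are accepted, you can send an authenticated request to it. This is a minimal sketch using only the standard library; it assumes the container exposes MLflow's `/health` endpoint behind the basic auth layer, which may not hold for every setup, and the function names are illustrative.

```python
import base64
import urllib.request


def basic_auth_header(username, password):
    """Build the value of an HTTP Basic Authorization header."""
    token = base64.b64encode(f"{username}:{password}".encode("utf-8"))
    return "Basic " + token.decode("ascii")


def server_is_up(base_url, username, password, timeout=10):
    """Return True if GET <base_url>/health answers with HTTP 200."""
    request = urllib.request.Request(
        base_url.rstrip("/") + "/health",
        headers={"Authorization": basic_auth_header(username, password)},
    )
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status == 200
    except OSError:
        # URLError and HTTPError (e.g. 401 for bad credentials) subclass OSError.
        return False
```

For a local deployment with the default credentials, `server_is_up("http://localhost:80/", "mlflow", "mlflow")` should return True once the container has started.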
You can also track your experiments using the MLflow API.
import os
import mlflow
os.environ["MLFLOW_TRACKING_URI"] = "<<YOUR-MLFLOW-TRACKING-URI>>"
os.environ["MLFLOW_EXPERIMENT_NAME"] = "<<YOUR-EXPERIMENT-NAME>>"
os.environ["MLFLOW_TRACKING_USERNAME"] = "<<YOUR-MLFLOW-USERNAME>>"
os.environ["MLFLOW_TRACKING_PASSWORD"] = "<<YOUR-MLFLOW-PASSWORD>>"
# An AWS access key and secret key are required to upload artifacts to the S3 bucket
os.environ["AWS_ACCESS_KEY_ID"] = "<<AWS-ACCESS-KEY-ID>>"
os.environ["AWS_SECRET_ACCESS_KEY"] = "<<AWS-SECRET-ACCESS-KEY>>"
SEED = 1993
# The context manager ends the run even if logging raises an exception.
with mlflow.start_run():
    mlflow.log_param("seed", SEED)