Merge pull request #72 from mlinfra-io/add-mlflow-to-kubernetes
add-mlflow-to-kubernetes
Showing 11 changed files with 641 additions and 0 deletions.
@@ -0,0 +1,45 @@
name: aws-complete-k8s
provider:
  name: aws
  account_id: "793009824629"
  region: "eu-central-1"
deployment:
  type: kubernetes
  config:
    vpc:
      create_database_subnets: true
      enable_nat_gateway: true
      one_nat_gateway_per_az: false
    kubernetes:
      k8s_version: "1.28"
      cluster_endpoint_public_access: true
      spot_instance: false
      tags:
        data_versioning: "lakefs"
      node_groups:
        - name: k8s-node-group
          instance_types:
            - t3.medium
          desired_size: 1
          min_size: 1
          max_size: 3
          disk_size: 20
stack:
  - data_versioning:
      name: lakefs
      params:
        remote_tracking: true
        database_type: "postgres"
        tags:
          database_type: "postgres"
          data_versioning: "lakefs"
          remote_tracking: true
  - experiment_tracking:
      name: mlflow
      params:
        remote_tracking: true
        mlflow_data_bucket_name: "mlflow-bucket"
        tags:
          database_type: "postgres"
          experiment_tracking: "mlflow"
          remote_tracking: true
@@ -0,0 +1,12 @@
name: aws-complete-k8s
provider:
  name: aws
  account_id: "793009824629"
  region: "eu-central-1"
deployment:
  type: kubernetes
stack:
  - data_versioning:
      name: lakefs
  - experiment_tracking:
      name: mlflow
@@ -0,0 +1,2 @@
* The VPC needs to have a `NAT gateway configured <https://repost.aws/questions/QU8XmyDQZOQkq9SSHoIM3tJg/setting-up-an-eks-node-group-on-a-private-subnet>`_ so that the node groups can reach the EKS cluster.
* You can choose between creating a single NAT gateway or one NAT gateway per availability zone (AZ).
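As a rough illustration of the choice described in the note above, the sketch below maps the two VPC flags that appear in the example configs (`enable_nat_gateway` and `one_nat_gateway_per_az`) to a NAT layout. The helper name `nat_gateway_mode` and the reading of the flags (in particular, that `one_nat_gateway_per_az: false` implies a single shared gateway) are assumptions made for illustration, not mlinfra's actual logic.

```python
def nat_gateway_mode(vpc_config: dict) -> str:
    """Classify the NAT layout implied by a `vpc` config mapping.

    Hypothetical helper: the flag semantics are assumed, not taken from mlinfra.
    """
    if not vpc_config.get("enable_nat_gateway", False):
        # Per the note above, node groups need a NAT gateway to find the cluster.
        return "none"
    if vpc_config.get("one_nat_gateway_per_az", False):
        return "one_per_az"
    # Assumption: NAT enabled without the per-AZ flag means one shared gateway.
    return "single"


# Values copied from the `vpc` section of the example configs in this commit.
example_vpc = {
    "create_database_subnets": True,
    "enable_nat_gateway": True,
    "one_nat_gateway_per_az": False,
}
print(nat_gateway_mode(example_vpc))
```

A single gateway is typically cheaper, while one gateway per AZ avoids routing all outbound traffic through one zone; which trade-off applies here depends on how the underlying VPC module interprets these flags.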
@@ -0,0 +1,36 @@
name: aws-mlflow-k8s
provider:
  name: aws
  account_id: "793009824629"
  region: "eu-central-1"
deployment:
  type: kubernetes
  config:
    vpc:
      create_database_subnets: true
      enable_nat_gateway: true
      one_nat_gateway_per_az: false
    kubernetes:
      k8s_version: "1.28"
      cluster_endpoint_public_access: true
      spot_instance: false
      tags:
        experiment_tracking: "mlflow"
      node_groups:
        - name: mlflow-node-group
          instance_types:
            - t3.medium
          desired_size: 1
          min_size: 1
          max_size: 3
          disk_size: 20
stack:
  - experiment_tracking:
      name: mlflow
      params:
        remote_tracking: true
        mlflow_data_bucket_name: "mlflow-bucket"
        tags:
          database_type: "postgres"
          experiment_tracking: "mlflow"
          remote_tracking: true
@@ -0,0 +1,10 @@
name: aws-mlflow-k8s
provider:
  name: aws
  account_id: "793009824629"
  region: "eu-central-1"
deployment:
  type: kubernetes
stack:
  - experiment_tracking:
      name: mlflow
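The minimal configs in this commit share the same four top-level keys. The sketch below shows one way such a file could be sanity-checked once parsed into a dictionary. The set of required keys is inferred from these examples only, and both helper names (`missing_keys`, `stack_components`) are hypothetical rather than part of mlinfra.

```python
from __future__ import annotations

# Assumption: inferred from the minimal example configs, not from mlinfra's schema.
REQUIRED_TOP_LEVEL_KEYS = ("name", "provider", "deployment", "stack")


def missing_keys(config: dict) -> list[str]:
    """Return the top-level keys that the minimal examples use but `config` lacks."""
    return [key for key in REQUIRED_TOP_LEVEL_KEYS if key not in config]


def stack_components(config: dict) -> dict[str, str]:
    """Map each stack category (e.g. experiment_tracking) to the tool chosen for it."""
    components = {}
    for entry in config.get("stack", []):
        for category, settings in entry.items():
            components[category] = settings["name"]
    return components


# The minimal mlflow config above, written out as the dictionary a YAML parser would produce.
minimal_config = {
    "name": "aws-mlflow-k8s",
    "provider": {"name": "aws", "account_id": "793009824629", "region": "eu-central-1"},
    "deployment": {"type": "kubernetes"},
    "stack": [{"experiment_tracking": {"name": "mlflow"}}],
}

print(missing_keys(minimal_config))
print(stack_components(minimal_config))
```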
8 changes: 8 additions & 0 deletions
src/mlinfra/modules/applications/kubernetes/experiment_tracking/mlflow/README.md
@@ -0,0 +1,8 @@
MLflow does not provide an official Helm chart (see [here](https://github.com/mlflow/mlflow/issues/6118)).
The two candidates for deploying MLflow with a Helm chart are:
- [mlflow community chart](https://github.com/community-charts/helm-charts/tree/main/charts/mlflow)
- [bitnami helm chart](https://github.com/bitnami/charts/tree/main/bitnami/mlflow)

The mlflow community chart has not been maintained for over a year.
However, it has a better deployment API than the bitnami chart.
Deploying the bitnami chart was painful enough that I decided to go ahead with the community chart for now.
58 changes: 58 additions & 0 deletions
...mlinfra/modules/applications/kubernetes/experiment_tracking/mlflow/mlflow_kubernetes.yaml
@@ -0,0 +1,58 @@
inputs:
  - name: vpc_id
    user_facing: false
    description: VPC id
    value: module.vpc.vpc_id
    default: None
  - name: vpc_cidr_block
    user_facing: false
    description: VPC CIDR block required for SG of RDS
    value: module.vpc.vpc_cidr_block
    default: None
  - name: db_subnet_group_name
    user_facing: false
    description: DB Subnet group name
    value: module.vpc.database_subnet_group
    default: None
  - name: oidc_provider_arn
    user_facing: false
    description: The ARN of the OIDC provider to use for authentication
    value: module.eks.oidc_provider_arn
    default: None
  - name: oidc_provider
    user_facing: false
    description: The OIDC provider to use for authentication
    value: module.eks.oidc_provider
    default: None
  - name: remote_tracking
    user_facing: true
    description: Deploys an external Postgres RDS server as backend store and S3 as artifact store for mlflow.
    default: true
  - name: rds_instance_class
    user_facing: true
    description: RDS instance class to deploy mlflow backend on
    default: "db.t4g.medium"
  - name: mlflow_chart_version
    user_facing: true
    description: mlflow Chart version. See here for more details; https://artifacthub.io/packages/helm/mlflow/mlflow
    default: "1.0.8"
  - name: service_account_namespace
    user_facing: true
    description: The namespace where the service account would be installed
    default: mlflow
  - name: service_account_name
    user_facing: true
    description: The name of the service account to use for mlflow
    default: mlflow-sa
  - name: mlflow_data_bucket_name
    user_facing: true
    description: mlflow S3 data bucket name
    default: "mlflow-data-bucket"
  - name: tags
    user_facing: true
    description: Tags for mlflow module
    default:
      data_versioning: "mlflow"
outputs:
clouds:
  - aws
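The module definition above separates internal inputs (wired to another module through `value`) from user-facing ones (overridable through `params` in a stack entry, with a `default` otherwise). The sketch below shows one plausible way to resolve the two kinds. The function `resolve_inputs` and its validation behaviour are assumptions for illustration, and only a subset of the inputs is reproduced; mlinfra's real resolution logic may differ.

```python
from __future__ import annotations

# A subset of the inputs declared in the module definition above.
MODULE_INPUTS = [
    {"name": "vpc_id", "user_facing": False, "value": "module.vpc.vpc_id", "default": "None"},
    {"name": "remote_tracking", "user_facing": True, "default": True},
    {"name": "rds_instance_class", "user_facing": True, "default": "db.t4g.medium"},
    {"name": "mlflow_data_bucket_name", "user_facing": True, "default": "mlflow-data-bucket"},
]


def resolve_inputs(module_inputs: list[dict], params: dict) -> dict:
    """Merge user-supplied params over module defaults (hypothetical logic)."""
    user_facing = {item["name"] for item in module_inputs if item["user_facing"]}
    unknown = set(params) - user_facing
    if unknown:
        # Assumption: users may only override inputs marked `user_facing: true`.
        raise ValueError(f"not user-facing inputs: {sorted(unknown)}")
    resolved = {}
    for item in module_inputs:
        if item["user_facing"]:
            resolved[item["name"]] = params.get(item["name"], item["default"])
        else:
            # Internal inputs keep the reference to another module's output.
            resolved[item["name"]] = item["value"]
    return resolved


# Params taken from the advanced mlflow example config in this commit.
resolved = resolve_inputs(
    MODULE_INPUTS,
    {"remote_tracking": True, "mlflow_data_bucket_name": "mlflow-bucket"},
)
print(resolved)
```

Under this reading, the bucket name from the example config replaces the module default `mlflow-data-bucket`, while `rds_instance_class` falls back to `db.t4g.medium` because the example does not set it.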