Merge pull request #48 from platinfra/update-documentation-with-new-sections

Update documentation with new sections
mlinfra-io · Jan 29, 2024 · ea71973
2 parents 4a82ac6 + cfe7a86
commit ea71973
Showing 65 changed files with 590 additions and 162 deletions.
diff --git a/.github/ISSUE_TEMPLATE/feature_request.yaml b/.github/ISSUE_TEMPLATE/feature_request.yaml
@@ -0,0 +1,45 @@
+name: Feature Request
+description: Suggest an idea for this project
+title: "[Feature]: "
+labels: ["type/feature \U0001F4A1"]
+body:
+  - type: markdown
+    attributes:
+      value: "## Feature Request\nPlease fill in this form to describe the feature request in detail."
+
+  - type: textarea
+    attributes:
+      label: Describe the Feature
+      description: "Provide a detailed description of the feature you're proposing."
+      placeholder: "Explain the feature here..."
+    validations:
+      required: true
+
+  - type: textarea
+    attributes:
+      label: Importance of the Feature
+      description: "Explain why this feature is important. How will it benefit the project or its users?"
+      placeholder: "Describe the importance here..."
+    validations:
+      required: true
+
+  - type: textarea
+    attributes:
+      label: Additional context
+      description: Add any other additional context about the feature here.
+    validations:
+      required: false
+
+  - type: dropdown
+    attributes:
+      label: Feature Category
+      description: "Select the category that best describes the feature."
+      options:
+        - New Cloud Provider
+        - MLOps Stack Application
+    validations:
+      required: true
+
+  - type: markdown
+    attributes:
+      value: "### Additional Information\nFeel free to add any other context or screenshots about the feature request here."
diff --git a/.github/release.yaml b/.github/release.yaml
@@ -0,0 +1,21 @@
+# .github/release.yml
+
+changelog:
+  exclude:
+    labels:
+      - ignore-for-release
+    authors:
+      - octocat
+  categories:
+    - title: 🏕 Features
+      labels:
+        - "*"
+      exclude:
+        labels:
+          - dependencies
+    - title: 👒 Dependencies
+      labels:
+        - dependencies
+    - title: Documentation Updates
+      labels:
+        - docs
diff --git a/.github/workflows/on_pr.yml b/.github/workflows/on_pr.yml
@@ -0,0 +1,33 @@
+name: Bump version
+on:
+  push:
+    branches:
+      - master
+jobs:
+  build:
+    runs-on: ubuntu-latest
+    steps:
+      - uses: actions/checkout@v3
+      - name: Bump version and push tag
+        id: tag_version
+        uses: mathieudutour/[email protected]
+        with:
+          github_token: ${{ secrets.GITHUB_TOKEN }}
+          default_bump: patch
+          release_branches: main
+          tag_prefix: ""
+
+      - name: Update version in Python pyproject.toml
+        if: steps.tag_version.outputs.new_tag != ''
+        run: |
+          NEW_VERSION=${{ steps.tag_version.outputs.new_tag }}
+          echo "New version: $NEW_VERSION"
+
+          # Update the version in pyproject.toml
+          sed -i "s/^version = .*/version = \"${NEW_VERSION}\"/" pyproject.toml
+
+          git config --local user.email "[email protected]"
+          git config --local user.name "GitHub Action"
+          git add pyproject.toml
+          git commit -m "Update version to $NEW_VERSION"
+          git push
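
The version-update step above rewrites the `version = ...` line in `pyproject.toml`. As a minimal sketch of that substitution, assuming the intended result is a plain double-quoted TOML string such as `version = "0.1.2"`, the same edit can be expressed in Python. The helper name `set_version` is hypothetical and not part of the repository.

```python
import re


def set_version(pyproject_text: str, new_version: str) -> str:
    """Replace the first line starting with `version = ` by a double-quoted version."""
    return re.sub(
        r"^version = .*$",
        # A callable avoids any backslash interpretation in the replacement.
        lambda _match: f'version = "{new_version}"',
        pyproject_text,
        count=1,
        flags=re.MULTILINE,
    )


sample = '[tool.poetry]\nname = "platinfra"\nversion = "0.0.1"\n'
print(set_version(sample, "0.0.2"))
```

Only the first matching line is replaced, so a `version` key in a later table of the file is left alone.
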
diff --git a/.github/workflows/python-publish.yml b/.github/workflows/python-publish.yml
@@ -17,23 +17,21 @@ permissions:
 
 jobs:
   deploy:
-
     runs-on: ubuntu-latest
-
     steps:
-    - uses: actions/checkout@v3
-    - name: Set up Python
-      uses: actions/setup-python@v3
-      with:
-        python-version: '3.x'
-    - name: Install dependencies
-      run: |
-        python -m pip install --upgrade pip
-        pip install build
-    - name: Build package
-      run: python -m build
-    - name: Publish package
-      uses: pypa/gh-action-pypi-publish@27b31702a0e7fc50959f5ad993c78deac1bdfc29
-      with:
-        user: __token__
-        password: ${{ secrets.PYPI_API_TOKEN }}
+      - uses: actions/checkout@v3
+      - name: Set up Python
+        uses: actions/setup-python@v3
+        with:
+          python-version: "3.10"
+      - name: Install dependencies
+        run: |
+          python -m pip install --upgrade pip
+          pip install build
+      - name: Build package
+        run: python -m build
+      - name: Publish package
+        uses: pypa/gh-action-pypi-publish@27b31702a0e7fc50959f5ad993c78deac1bdfc29
+        with:
+          user: __token__
+          password: ${{ secrets.PYPI_API_TOKEN }}
diff --git a/CONTRIBUTING.md b/CONTRIBUTING.md
@@ -10,12 +10,22 @@ as well as any project-related communication through discussions.
 - To get started, first pat yourself!
 
 - platinfra is organised as follows:
-    - `platinfra` cli and all terraform modules are in the `platinfra` repo
+    - `platinfra` cli, all terraform modules and docs are in the `platinfra` repo
     - Generated docs are in `platinfra.github.io` repo but the source code is in `platinfra` repo.
 
 - Fork the platinfra repo and clone it to your local machine.
+- Create a python virtual environment and install the dependencies:
+```bash
+python -m venv venv
+source venv/bin/activate
+pip install -r requirements-dev.txt
+```
 - To get started with the website, fork both platinfra and platinfra.github.io repos and clone them to your local machine.
-- To get started with the documentation, go to `platinfra/docs` and run `mkdocs serve` to view the docs locally.
+- To get started with the documentation, first install the docs dependencies:
+```bash
+pip install -r requirements-docs.txt
+```
+- Then go to `platinfra/docs` and run `mkdocs serve` to view the docs locally.
 
 ## Types of Contributions
 
@@ -28,7 +38,6 @@ as well as any project-related communication through discussions.
     - Detailed steps to reproduce the bug.
     - Any details about your local setup that might be helpful in troubleshooting.
     - When posting Python stack traces, please quote them using [Markdown blocks](https://help.github.com/articles/creating-and-highlighting-code-blocks/).
-    - Label the issue with `bug`
 
 
 ### Submitting Ideas or Feature Requests
@@ -37,15 +46,14 @@ The best way is to file an issue on GitHub:
 
 - Explain in detail how it would work.
 - Keep the scope as narrow as possible, to make it easier to implement.
-- Label the issue with `feature-request`
 
 ### Improve Documentation
 
 platinfra could always use better documentation, so feel free to create an issue and discuss your changes.
 
 ## Bugfix resolution time expectations
 
-- We will respond to all new issues within 2-3 days
+- We will respond to all new issues as soon as possible
 - For any serious (production breaking) bug we will try to resolve ASAP and do a hotfix release
 - For other bugs we will try to resolve them within the next 2 releases (There is a release every 2 weeks).
 

diff --git a/README.md b/README.md
@@ -60,7 +60,7 @@ This project will be supporting the following providers:
 `platinfra` intends to support as many [MLOps tools](https://github.com/EthicalML/awesome-production-machine-learning/) deployable in a platform in their standalone as well as high availability across different layers of an MLOps stack:
 - data_versioning
 - experiment_tracker
-- pipelining / orchestrator
+- orchestrator
 - artifact_tracker / model_registry
 - model_serving / model_inference
 - monitoring
@@ -87,7 +87,7 @@ stack:
     - secrets_manager # can also be vault or any other
   experiment_tracker:
     - mlflow # can be weights and biases or determined, or neptune or clearml and so on...
-  pipelining:
+  orchestrator:
     - zenml # can also be argo, or luigi, or airflow, or dagster, or prefect or flyte or kubeflow and so on...
   orchestrator:
     - aws-batch # can also be aws step functions or aws-fargate or aws-eks or azure-aks and so on...
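
After the rename in this hunk, the README snippet contains two `orchestrator:` keys under `stack`. Common YAML loaders such as PyYAML keep only the last value for a repeated key without raising an error, so a quick check can help. The following is a minimal, standard-library-only sketch: the helper `duplicate_keys` is hypothetical, and it only inspects keys at one fixed indentation level rather than parsing YAML properly.

```python
from collections import Counter


def duplicate_keys(yaml_text: str, indent: int = 2) -> list[str]:
    """Return mapping keys that occur more than once at the given indentation."""
    keys = []
    for line in yaml_text.splitlines():
        stripped = line.lstrip(" ")
        # Skip blank lines, comments, and list items.
        if not stripped or stripped.startswith(("#", "-")):
            continue
        if len(line) - len(stripped) == indent and ":" in stripped:
            keys.append(stripped.split(":", 1)[0])
    counts = Counter(keys)
    return [key for key, n in counts.items() if n > 1]


snippet = (
    "stack:\n"
    "  experiment_tracker:\n"
    "    - mlflow\n"
    "  orchestrator:\n"
    "    - zenml\n"
    "  orchestrator:\n"
    "    - aws-batch\n"
)
print(duplicate_keys(snippet))
```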

diff --git a/docs/_images/stack-components-dark.png b/docs/_images/stack-components-dark.png
diff --git a/docs/_images/stack-components-light.png b/docs/_images/stack-components-light.png
diff --git a/docs/about_me.md b/docs/about_me.md
@@ -0,0 +1,7 @@
+# About Me
+
+I'm [Ali Abbas Jaffri](https://aliabbasjaffri.github.io/) and I've been working in the MLOps space since the fall of 2019. It started during my Master's thesis at TUM but soon grew into a passion, and I've been working on the deployment of ML infrastructure ever since.
+
+I started this project in mid-2023 after deliberating on the problem of quickly deploying ML infrastructure. Platinfra emerged from my personal need to address these challenges. As I shared my experiences with fellow platform enthusiasts, I realized I was not alone in this journey.
+
+I compiled my insights on deploying various MLOps stacks in different environments into this tool, aspiring to simplify the deployment process across all cloud providers and platform tools. This is an ongoing effort, and I really wish to cover more cloud providers and MLOps tools in the near future. I genuinely hope you find Platinfra as useful and transformative as I envisioned, and that you enjoy using it!
diff --git a/docs/Acknowledgements.md → docs/acknowledgements.md b/docs/Acknowledgements.md → docs/acknowledgements.md
@@ -1,6 +1,6 @@
 ### Tools
 
-- This project is heavily inspired by [opta](https://github.com/run-x/opta), a tool quite ahead of its time, and got discontinued way before its time. I would like to thank the team behind it for their work and providing me inspiration to condense the idea of deployment of MLOps tools into a single tool.
+- This project is inspired by [opta](https://github.com/run-x/opta), _a tool quite ahead of its time that was discontinued way before its time_. I would like to thank the team behind it for their work and for inspiring me to condense the deployment of MLOps tools into a single tool.
 - The project leans heavily on [terraform aws modules](https://github.com/terraform-aws-modules/) by [Anton Babenko](https://www.linkedin.com/in/antonbabenko/) for the AWS infrastructure.
 
 ### Resources
@@ -15,4 +15,4 @@ I would also like to thank the following individuals:
 
 - [Ghania Riaz](https://www.linkedin.com/in/ghaniariaz) for her patience in bearing me with a laptop on weekdays, holidays and our vacations!
 - [Nicholas Junge](https://www.linkedin.com/in/nicholas-junge/) for his valuable feedback and support in the early stages of this project.
-- [Hamza Tahir](https://www.linkedin.com/in/hamzatahirofficial); an old friend and an endless source of inspiration on my journey in the MLOps space.
+- [Hamza Tahir](https://www.linkedin.com/in/hamzatahirofficial): an old friend and an endless source of inspiration on my journey in the MLOps space.
diff --git a/docs/code/aws/cloud_infra.md b/docs/code/aws/cloud_infra.md
@@ -0,0 +1,75 @@
+`cloud_infra` deploys MLOps `stack` on top of Cloud provider VMs.
+
+
+## Complete example with all stacks
+
+=== "Simple Deployment Configuration"
+    ```yaml
+    --8<-- "docs/examples/cloud_infra/complete/aws-complete.yaml"
+    ```
+=== "Advanced Deployment Configuration"
+    ```yaml
+    --8<-- "docs/examples/cloud_infra/complete/aws-complete-advanced.yaml"
+    ```
+
+## data_versioning
+
+#### lakefs
+
+=== "Simple Deployment Configuration"
+    ```yaml
+    --8<-- "docs/examples/cloud_infra/lakefs/aws-lakefs.yaml"
+    ```
+=== "Advanced Deployment Configuration"
+    ```yaml
+    --8<-- "docs/examples/cloud_infra/lakefs/aws-lakefs-advanced.yaml"
+    ```
+
+## experiment_tracker
+
+#### mlflow
+
+=== "Simple Deployment Configuration"
+    ```yaml
+    --8<-- "docs/examples/cloud_infra/mlflow/aws-mlflow.yaml"
+    ```
+=== "Advanced Deployment Configuration"
+    ```yaml
+    --8<-- "docs/examples/cloud_infra/mlflow/aws-mlflow-advanced.yaml"
+    ```
+
+#### wandb
+
+=== "Simple Deployment Configuration"
+    ```yaml
+    --8<-- "docs/examples/cloud_infra/wandb/aws-wandb.yaml"
+    ```
+=== "Advanced Deployment Configuration"
+    ```yaml
+    --8<-- "docs/examples/cloud_infra/wandb/aws-wandb-advanced.yaml"
+    ```
+
+
+## orchestrator
+
+#### prefect
+
+=== "Simple Deployment Configuration"
+    ```yaml
+    --8<-- "docs/examples/cloud_infra/prefect/aws-prefect.yaml"
+    ```
+=== "Advanced Deployment Configuration"
+    ```yaml
+    --8<-- "docs/examples/cloud_infra/prefect/aws-prefect-advanced.yaml"
+    ```
+
+#### dagster
+
+=== "Simple Deployment Configuration"
+    ```yaml
+    --8<-- "docs/examples/cloud_infra/dagster/aws-dagster.yaml"
+    ```
+=== "Advanced Deployment Configuration"
+    ```yaml
+    --8<-- "docs/examples/cloud_infra/dagster/aws-dagster-advanced.yaml"
+    ```
diff --git a/docs/code/aws/kubernetes.md b/docs/code/aws/kubernetes.md
@@ -0,0 +1,13 @@
+`kubernetes` deploys the MLOps `stack` on top of the cloud provider's managed Kubernetes service. In the case of AWS, that is EKS.
+
+
+#### lakefs
+
+===+ "Simple Deployment Configuration"
+    ```yaml
+    --8<-- "docs/examples/kubernetes/lakefs/aws-lakefs.yaml"
+    ```
+=== "Advanced Deployment Configuration"
+    ```yaml
+    --8<-- "docs/examples/kubernetes/lakefs/aws-lakefs-advanced.yaml"
+    ```
diff --git a/docs/examples b/docs/examples
@@ -0,0 +1 @@
+../examples
diff --git a/docs/index.md b/docs/index.md
@@ -1,11 +1,27 @@
-# Welcome to platinfra
+## What is platinfra?
 
-`platinfra` came to be when i started exploring the different tools in mlops space and had the intention of deploying them on the cloud. The idea was to liberate the IaC logic for creating MLOps stacks and accelerate the deployment and decision making process for MLOps engineers so that they can focus where it matters; on choosing the right tooling for their workflows.
+`platinfra` is a Python package designed to streamline the deployment of MLOps tools within an MLOps stack, quickly and in line with best practices. Its core philosophy is to simplify and expedite the deployment of MLOps infrastructure, so that _ML Engineers_ and _ML Platform Engineers_ can concentrate on delivering business value instead of spending time and resources on deploying MLOps tools in the cloud.
 
-platinfra allows MLOps and DevOps to deploy different MLOps tools for different stages of the machine learning lifecycle. The magic of platinfra is hidden in the python layer, that reads through the deployment config and deploys all these tools using terraform modules and connects them using dynamically generated set of roles and permissions.
+## Why platinfra?
 
+`platinfra` was conceived from a personal challenge I faced while deploying various MLOps tools in cloud environments. The absence of a standardized method for deploying MLOps tools and infrastructure on the cloud was not only noticeable but also a source of frustration. As my discussions with industry peers expanded, it became evident that knowledge regarding the deployment of ML Infrastructure is fragmented across various tool-specific documentation sources. This fragmentation hinders rapid and best practice-compliant deployment of diverse MLOps tools within a stack, posing challenges for platform teams who aim to:
+
+- Experiment with different tools in MLOps stacks to customize solutions according to their specific needs.
+- Swiftly evaluate new tools without delving into extensive deployment documentation.
+- Deploy tools within the MLOps stack efficiently and in accordance with best practices, thereby circumventing the need for prolonged development cycles and complex planning.
+
+The fundamental concept behind `platinfra` is to establish a universal Infrastructure as Code (IaC) framework that expedites the creation and deployment of MLOps stacks. The package lets _MLOps Engineers_ and _Platform Engineers_ deploy a variety of MLOps tools across different stages of the machine learning lifecycle. The essence of `platinfra` lies in its Python layer, which interprets the deployment configuration and uses Terraform modules to deploy these tools. It also dynamically generates the inputs, roles, and permissions the tools need, which simplifies the deployment process.
+
+## How does it work?
 platinfra deploys infrastructure using a declarative approach. The minimal spec for aws cloud as infra with custom applications deployed is as follows:
 
+
+!!! note "The following is just a sample configuration."
+
+    The following sample YAML serves as a reference for the configuration of an MLOps stack
+    that can be deployed using `platinfra`. Some of the stacks and their tools are currently
+    under active development and may not be available to use right away.
+
 ```yaml
 name: aws-mlops-stack
 provider:
@@ -15,18 +31,13 @@ deployment:
   type: kubernetes
 stack:
   data_versioning:
-    - dvc # can also be pachyderm or lakefs or neptune and so on
+    - lakefs # can also be pachyderm or neptune and so on
   experiment_tracker:
     - mlflow # can be weights and biases or determined, or neptune or clearml and so on...
-  pipelining:
-    - zenml # can also be argo, or luigi, or airflow, or dagster, or prefect or flyte or kubeflow and so on...
   orchestrator:
-    - aws-batch # can also be aws step functions or aws-fargate or aws-eks or azure-aks and so on...
-  runtime_engine:
-    - ray # can also be horovod or apache spark
+    - zenml # can also be argo, or luigi, or aws-batch or airflow, or dagster, or prefect  or kubeflow or flyte
   artifact_tracker:
     - mlflow # can also be neptune or clearml or lakefs or pachyderm or determined or wandb and so on...
-  # model registry and serving are quite close, need to think about them...
   model_registry:
     - bentoml # can also be  mlflow or neptune or determined and so on...
   model_serving:
@@ -36,3 +47,11 @@ stack:
   alerting:
     - mlflow # can be mlflow or neptune or determined or weaveworks or prometheus or grafana and so on...
 ```
+
+The above configuration can be simply deployed using the following command:
+
+```bash
+platinfra terraform --action=apply --stack-config-path=aws-mlops-stack.yaml
+```
+
+For more details, refer to the [User Guide](./user_guide/index.md) section.
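
The "Why platinfra?" text above describes a Python layer that interprets the deployment configuration and maps it onto Terraform modules. A rough sketch of that idea follows, and it is not platinfra's actual implementation: the function `plan_modules` and the `<deployment type>/<layer>/<tool>` module path layout are assumptions made purely for illustration, and the config is given as an already-parsed dictionary rather than loaded from YAML.

```python
def plan_modules(config: dict) -> list[str]:
    """List one hypothetical module path per tool declared in the stack."""
    deployment_type = config["deployment"]["type"]
    plan = []
    for layer, tools in config["stack"].items():
        for tool in tools:
            # Hypothetical layout: <deployment type>/<stack layer>/<tool>
            plan.append(f"{deployment_type}/{layer}/{tool}")
    return plan


# A trimmed-down version of the sample stack, as a parsed dictionary.
config = {
    "name": "aws-mlops-stack",
    "deployment": {"type": "kubernetes"},
    "stack": {
        "data_versioning": ["lakefs"],
        "experiment_tracker": ["mlflow"],
        "orchestrator": ["zenml"],
    },
}

for module in plan_modules(config):
    print(module)
```

Keeping the plan as a flat list of layer and tool pairs reflects the declarative style of the stack file: each entry under `stack` is independent, so adding or swapping a tool changes one line of the plan.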