Finish ML deployment module (#280)
* move files

* delete local deployment module + fix readme

* fix numbering

* move exercise files

* fix cross link

* include nice table in intro

* correct PyTorch spelling in documentation and comments

* getting started on the bento stuff

* Pre-commit fixes

* update with some images

* first bento exercise finished

* exercise on adaptive batching

* more exercises

* finish module

* fix links

---------

Co-authored-by: SkafteNicki <[email protected]>
SkafteNicki authored Nov 9, 2024
1 parent 674c3ba commit 92b03d4
Showing 52 changed files with 1,133 additions and 741 deletions.
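The module this commit finishes is built around BentoML, and the new `figures/bentoml_adaptive_batching.png` belongs to an exercise on BentoML's adaptive batching, where the server transparently groups concurrent requests into one batched call. A minimal sketch of such a service — the service name, the toy predict logic, and the parameter values are illustrative, not code from this commit:

```python
import bentoml


@bentoml.service
class Classifier:  # hypothetical service, not from the course repo
    @bentoml.api(batchable=True, max_batch_size=32, max_latency_ms=1000)
    def predict(self, inputs: list[str]) -> list[int]:
        # With batchable=True, BentoML merges concurrent requests, so
        # `inputs` may contain items from several clients in a single call.
        return [len(text) % 2 for text in inputs]
```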
Binary file added figures/bentoml_adaptive_batching.png
Binary file modified figures/icons/bentoml.png
27 changes: 14 additions & 13 deletions mkdocs.yml
@@ -64,6 +64,7 @@ plugins:
   - search
   - glightbox
   - same-dir
+  - markdown-exec
   - git-revision-date-localized:
       enable_creation_date: true
   - exclude:
@@ -127,25 +128,25 @@ nav:
   - S7 - Deployment 📦:
     - s7_deployment/README.md
     - M22 - Requests and APIs: s7_deployment/apis.md
-    - M23 - Local Deployment: s7_deployment/local_deployment.md
-    - M24 - Cloud Deployment: s7_deployment/cloud_deployment.md
-    - M25 - API Testing: s7_deployment/testing_apis.md
+    - M23 - Cloud Deployment: s7_deployment/cloud_deployment.md
+    - M24 - API Testing: s7_deployment/testing_apis.md
+    - M25 - ML deployment: s7_deployment/ml_deployment.md
+    - M26 - Frontend: s7_deployment/frontend.md
   - S8 - Monitoring 📊:
     - s8_monitoring/README.md
-    - M26 - Data Drifting: s8_monitoring/data_drifting.md
-    - M27 - System Monitoring: s8_monitoring/monitoring.md
+    - M27 - Data Drifting: s8_monitoring/data_drifting.md
+    - M28 - System Monitoring: s8_monitoring/monitoring.md
   - S9 - Scalable applications ⚖️:
     - s9_scalable_applications/README.md
-    - M28 - Distributed Data Loading: s9_scalable_applications/data_loading.md
-    - M29 - Distributed Training: s9_scalable_applications/distributed_training.md
-    - M30 - Scalable Inference: s9_scalable_applications/inference.md
+    - M29 - Distributed Data Loading: s9_scalable_applications/data_loading.md
+    - M30 - Distributed Training: s9_scalable_applications/distributed_training.md
+    - M31 - Scalable Inference: s9_scalable_applications/inference.md
   - S10 - Extra 🔥:
     - s10_extra/README.md
-    - M31 - Documentation: s10_extra/documentation.md
-    - M32 - Hyperparameter optimization: s10_extra/hyperparameters.md
-    - M33 - High Performance Clusters: s10_extra/high_performance_clusters.md
-    - M34 - Frontend: s10_extra/frontend.md
-    - M35 - ML deployment: s10_extra/ml_deployment.md
+    - M32 - Documentation: s10_extra/documentation.md
+    - M33 - Hyperparameter optimization: s10_extra/hyperparameters.md
+    - M34 - High Performance Clusters: s10_extra/high_performance_clusters.md
+
   # - M35 - Designing Pipelines: s10_extra/design.md
   # - M37 - Workflow orchestration: s10_extra/orchestration.md
   # - M38 - Kubernetes: s10_extra/kubernetes.md
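For context on the new `markdown-exec` plugin (pinned in `requirements.txt` further down): it executes fenced code blocks at build time and injects their output into the rendered page. A minimal, hypothetical snippet of how a docs page can opt in — `exec="true"` is the plugin's documented switch, the print is illustrative:

```python exec="true"
# Run by markdown-exec during `mkdocs build`; stdout becomes page content.
print("Hello from the built documentation!")
```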
2 changes: 1 addition & 1 deletion pages/faq.md
@@ -25,7 +25,7 @@
 course [02450](https://kurser.dtu.dk/course/02450). The actual focus of the course is not on machine learning models,
 but we will be using these basic concepts throughout the exercises.

-Additionally, we recommend basic knowledge about deep learning and how to code in [Pytorch](https://pytorch.org/),
+Additionally, we recommend basic knowledge about deep learning and how to code in [PyTorch](https://pytorch.org/),
 corresponding to the curriculum covered in [02456](https://kurser.dtu.dk/course/02456). From prior experience, we know
 that not all students have gained knowledge about deep learning models before this course, and we will be covering the
 basics of how to code in PyTorch in one of the
4 changes: 2 additions & 2 deletions pages/overview.md
@@ -15,8 +15,8 @@

 | Framework | Description |
 |-----------|-------------|
-| ![Pytorch](../figures/icons/pytorch.png){ width="50" } | **Pytorch** is the backbone of our code, it provides the computational engine and the data structures that we need to define our data structures. |
-| ![Pytorch Lightning](../figures/icons/lightning.png){ width="50" } | **Pytorch lightning** is a framework that provides a high-level interface to Pytorch. It provides a lot of functionality that we need to train our models, such as logging, checkpointing, early stopping, etc. such that we do not have to implement it ourselves. It also allows us to scale our models to multiple GPUs and multiple nodes. |
+| ![PyTorch](../figures/icons/pytorch.png){ width="50" } | **PyTorch** is the backbone of our code, it provides the computational engine and the data structures that we need to define our data structures. |
+| ![PyTorch Lightning](../figures/icons/lightning.png){ width="50" } | **PyTorch lightning** is a framework that provides a high-level interface to PyTorch. It provides a lot of functionality that we need to train our models, such as logging, checkpointing, early stopping, etc. such that we do not have to implement it ourselves. It also allows us to scale our models to multiple GPUs and multiple nodes. |
 | ![Conda](../figures/icons/conda.png){ width="50" } | We control the dependencies and Python interpreter using **Conda** that enables us to construct reproducible virtual environments |
 | ![Hydra](../figures/icons/hydra.png){ width="50" } | For configuring our experiments we use **Hydra** that allows us to define a hierarchical configuration structure config files |
 | ![Wandb](../figures/icons/w&b.png){ width="50" } | Using **Weights and Bias** allows us to track and log any values and hyperparameters for our experiments |
14 changes: 7 additions & 7 deletions pages/projects.md
@@ -33,12 +33,12 @@
 We strive to keep the tools thought in this course as open-source as possible. The great thing about the open-source
 community is that whatever problem you are working on, there is probably some package out there that can get you
 at least 10% of the way. For the project, we want to enforce this point and you are required to include some third-party
-package, that is neither Pytorch or one of the tools already covered in the course, into your project.
+package, that is neither PyTorch or one of the tools already covered in the course, into your project.

-If you have no idea what framework to include, the [Pytorch ecosystem](https://pytorch.org/ecosystem/) is a great place
-for finding open-source frameworks that can help you accelerate your own projects where Pytorch is the backengine. All
-tools in the ecosystem should work greatly together with Pytorch. However, it is important to note that the ecosystem is
-not a complete list of all the awesome packages that exist to extend the functionality of Pytorch. If you are still
+If you have no idea what framework to include, the [PyTorch ecosystem](https://pytorch.org/ecosystem/) is a great place
+for finding open-source frameworks that can help you accelerate your own projects where PyTorch is the backengine. All
+tools in the ecosystem should work greatly together with PyTorch. However, it is important to note that the ecosystem is
+not a complete list of all the awesome packages that exist to extend the functionality of PyTorch. If you are still
 missing inspiration for frameworks to use, we highly recommend these three that have been used in previous years of the
 course:
@@ -51,7 +51,7 @@ course:
   texts such as classification, information extraction, question answering, summarization, translation, text generation,
   etc in 100+ languages. Its aim is to make cutting-edge NLP easier to use for everyone.

-* [Pytorch-Geometric](https://github.com/rusty1s/pytorch_geometric). PyTorch Geometric (PyG) is a geometric deep
+* [PyTorch-Geometric](https://github.com/rusty1s/pytorch_geometric). PyTorch Geometric (PyG) is a geometric deep
   learning. It consists of various methods for deep learning on graphs and other irregular structures, also known as
   geometric deep learning, from a variety of published papers.
@@ -188,7 +188,7 @@
   you can optimize your code
 * [ ] Use Weights & Biases to log training progress and other important metrics/artifacts in your code. Additionally,
   consider running a hyperparameter optimization sweep.
-* [ ] Use Pytorch-lightning (if applicable) to reduce the amount of boilerplate in your code
+* [ ] Use PyTorch-lightning (if applicable) to reduce the amount of boilerplate in your code

 ### Week 2
4 changes: 2 additions & 2 deletions pages/timeplan.md
@@ -30,7 +30,7 @@

 Date | Day | Presentation topic | Frameworks | Format
 -----|-----------|--------------------------------------------------------------------|--------------------------------------|-----------
-6/1/25 | Monday | [Deep learning software📝](../slides/DeepLearningSoftware.pdf) | Terminal, Conda, IDE, Pytorch | [Exercises](../s1_development_environment/README.md)
+6/1/25 | Monday | [Deep learning software📝](../slides/DeepLearningSoftware.pdf) | Terminal, Conda, IDE, PyTorch | [Exercises](../s1_development_environment/README.md)
 7/1/25 | Tuesday | [MLOps: what is it?📝](../slides/IntroToMLOps.pdf) | Git, CookieCutter, Pep8, DVC | [Exercises](../s2_organisation_and_version_control/README.md)
 8/1/25 | Wednesday | [Reproducibility📝](../slides/ReproducibilityAndSoftware.pdf) | Docker, Hydra | [Exercises](../s3_reproducibility/README.md)
 9/1/25 | Thursday | [Debugging📝](../slides/DebuggingML.pdf) | Debugger, Profiler, Wandb, Lightning | [Exercises](../s4_debugging_and_logging/README.md)
@@ -60,7 +60,7 @@
 Date | Day | Presentation topic | Frameworks | Format
 -----|-----------|--------------------------------------------------------------|------------------------------------------|----------
 20/1/25 | Monday | [Monitoring📝](../slides/Monitoring.pdf) | Evidently AI, Prometheus, GCP Monitoring | [Exercises](../s8_monitoring/README.md)
-21/1/25 | Tuesday | [Scalable applications📝](../slides/ScalingApplications.pdf) | Pytorch, Lightning | [Exercises](../s9_scalable_applications/README.md)
+21/1/25 | Tuesday | [Scalable applications📝](../slides/ScalingApplications.pdf) | PyTorch, Lightning | [Exercises](../s9_scalable_applications/README.md)
 22/1/25 | Wednesday | Company presentation (TBA) | - | [Projects](projects.md)
 23/1/25 | Thursday | No lecture | - | [Projects](projects.md)
 24/1/25 | Friday | No lecture | - | [Projects](projects.md)
2 changes: 2 additions & 0 deletions pyproject.toml
@@ -1,5 +1,6 @@
 [tool.ruff]
 line-length = 120
+target-version = "py311"

 [tool.ruff.lint]
 select = [
@@ -60,6 +61,7 @@ ignore = [
     "B905",
     "COM812",
     "ISC001",
+    "TCH003",
 ]
 exclude=["student_repos/"]
2 changes: 1 addition & 1 deletion reports/README.md
@@ -65,7 +65,7 @@
   you can optimize your code
 * [ ] Use Weights & Biases to log training progress and other important metrics/artifacts in your code. Additionally,
   consider running a hyperparameter optimization sweep.
-* [ ] Use Pytorch-lightning (if applicable) to reduce the amount of boilerplate in your code
+* [ ] Use PyTorch-lightning (if applicable) to reduce the amount of boilerplate in your code

 ### Week 2
1 change: 1 addition & 0 deletions requirements.txt
@@ -6,6 +6,7 @@ pymdown-extensions == 10.12
 mkdocs-same-dir == 0.1.2
 mkdocs-git-revision-date-localized-plugin == 1.3.0
 mkdocs-exclude == 1.0.2
+markdown-exec[ansi] == 1.9.3

 # Developer stuff
 ruff == 0.7.2
12 changes: 3 additions & 9 deletions s10_extra/README.md
@@ -9,24 +9,18 @@

     Learn how to setup a simple documentation system for your application

-    [:octicons-arrow-right-24: M31: Documentation](documentation.md)
+    [:octicons-arrow-right-24: M32: Documentation](documentation.md)

 - ![](../figures/icons/optuna.png){align=right : style="height:100px;width:100px"}

     Learn how to do hyperparameter optimization using Optuna

-    [:octicons-arrow-right-24: M32: Hyperparameter Optimization](hyperparameters.md)
+    [:octicons-arrow-right-24: M33: Hyperparameter Optimization](hyperparameters.md)

 - ![](../figures/icons/pbs.png){align=right : style="height:100px;width:100px"}

     Learn how to use HPC systems that use PBS to do job scheduling

-    [:octicons-arrow-right-24: M33: High Performance Clusters](high_performance_clusters.md)
-
-- ![](../figures/icons/streamlit.png){align=right : style="height:100px;width:100px"}
-
-    Learn how to create a frontend for your application using Streamlit
-
-    [:octicons-arrow-right-24: M34: Frontend](frontend.md)
+    [:octicons-arrow-right-24: M34: High Performance Clusters](high_performance_clusters.md)

 </div>
4 changes: 2 additions & 2 deletions s10_extra/documentation.md
@@ -268,7 +268,7 @@
           - main

     permissions:
-      contents: write # (1)
+      contents: write # (1)!

     jobs:
       deploy:
@@ -311,7 +311,7 @@
     * In the `Branch` setting choose the `gh-pages` branch and `/(root)` folder and save

     <figure markdown>
-    ![Image](../figures/github_pages.png){ width="700" }
+    ![Image](../figures/github_pages.png){ width="700" }
     </figure>

     This should then start deploying your site to `https://<your-username>.github.io/<your-reponame>/`. If it does not
2 changes: 1 addition & 1 deletion s10_extra/high_performance_clusters.md
@@ -144,7 +144,7 @@
 1. First we need to load the correct version of CUDA. A cluster system often contains multiple versions of specific
     software to suit the needs of all their users, and it is the users that are in charge of *loading* the correct
-    software during job submission. The only extra software that needs to be loaded for most Pytorch applications
+    software during job submission. The only extra software that needs to be loaded for most PyTorch applications
     are a CUDA module. You can check which modules are available on the cluster with

     ```bash
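    # The listing command itself sits below the fold of this diff view. On
    # clusters using environment modules it is typically the following
    # (illustrative guess, not verbatim from the course materials):
    module avail
    ```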