Add a plugin to fix md formatting issues in mkdocs

Signed-off-by: Fabrice Normandin <[email protected]>
mila-iqia · lebrice · Jul 3, 2024 · Jun 27, 2024
commit 6d0844fb6de20f792783f5d19c8d9ca9d9c94204
diff --git a/.pre-commit-config.yaml b/.pre-commit-config.yaml
@@ -72,16 +72,19 @@ repos:
 
   # md formatting
   - repo: https://github.com/executablebooks/mdformat
-    rev: 0.7.16
+    rev: 0.7.17
     hooks:
       - id: mdformat
         args: ["--number"]
         additional_dependencies:
           - mdformat-gfm
           - mdformat-tables
           - mdformat_frontmatter
-          # - mdformat-toc
-          # - mdformat-black
+          - mdformat-toc
+          - mdformat-config
+          - mdformat-black
+          # see https://github.com/KyleKing/mdformat-mkdocs
+          - mdformat-mkdocs[recommended]>=2.1.0
         require_serial: true
 
 

diff --git a/docs/contributing.md b/docs/contributing.md
@@ -0,0 +1,5 @@
+# Contributing
+
+TODOs:
+
+- [ ] Describe how to contribute to the project.
diff --git a/docs/examples.md b/docs/examples.md
@@ -1,5 +1,14 @@
 # Examples
 
+TODOs:
+
+- [ ] Show examples (that are also to be tested with doctest or similar) of how to add a new algo.
+- [ ] Show examples of how to add a new datamodule.
+- [ ] Add a link to the RL example once [#13](https://github.com/mila-iqia/ResearchTemplate/issues/13) is done.
+- [ ] Add a link to the NLP example once [#14](https://github.com/mila-iqia/ResearchTemplate/issues/14) is done.
+- [ ] Add an example of how to use Jax for the dataset/dataloading:
+    - Either through an RL example, or with `tfds` in [#18](https://github.com/mila-iqia/ResearchTemplate/issues/18)
+
 ## Simple run
 
 ```bash
@@ -11,3 +20,22 @@ python project/main.py algorithm=example_algo datamodule=mnist network=fcnet
 ```bash
 python project/main.py experiment=cluster_sweep_example
 ```
+
+## Using Jax
+
+You can use Jax for your dataloading, your network, or the learning algorithm, all while still benefiting from the features that come with PyTorch-Lightning.
+
+How does this work?
+Well, we use [torch-jax-interop](https://www.github.com/lebrice/torch_jax_interop), another package developed here at Mila, which allows easy interop between torch and jax code. See the readme on that repo for more details.
+
+### Example Algorithm that uses Jax
+
+You can use Jax for your training step, but not the entire training loop (since that is handled by Lightning).
+There are a few good reasons to let Lightning handle the training loop, most notably that it takes care of logging, checkpointing, and other features that you would lose if you swapped out the entire training framework for one based on Jax.
+
+In this [example Jax algorithm](https://www.github.com/mila-iqia/ResearchTemplate/tree/master/project/algorithms/jax_algo.py),
+a neural network written in Jax (using [flax](https://flax.readthedocs.io/en/latest/)) is wrapped using `torch_jax_interop.JaxFunction`, so that its parameters are learnable. The parameters are stored on the LightningModule as `nn.Parameter`s (which share the same underlying memory as the Jax arrays). In this example, the loss function is written in PyTorch, while the network forward and backward passes are written in Jax.
+
+### Example datamodule that uses Jax
+
+(todo)
diff --git a/docs/help.md b/docs/help.md
@@ -0,0 +1,5 @@
+# Help and Support
+
+## FAQ
+
+## How to get help
diff --git a/docs/index.md b/docs/index.md
@@ -1,9 +1,9 @@
 # Research Project Template
 
-<!-- For full documentation visit [mkdocs.org](https://www.mkdocs.org). -->
-
-![Build](https://github.com/mila-iqia/ResearchTemplate/workflows/build.yml/badge.svg)
+[![Build](https://github.com/mila-iqia/ResearchTemplate/actions/workflows/build.yml/badge.svg?branch=master)](https://github.com/mila-iqia/ResearchTemplate/actions/workflows/build.yml)
 [![codecov](https://codecov.io/gh/mila-iqia/ResearchTemplate/graph/badge.svg?token=I2DYLK8NTD)](https://codecov.io/gh/mila-iqia/ResearchTemplate)
+[![hydra](https://img.shields.io/badge/Config-Hydra_1.3-89b8cd)](https://hydra.cc/)
+[![license](https://img.shields.io/badge/License-MIT-green.svg?labelColor=gray)](https://github.com/mila-iqia/ResearchTemplate#license)
 
 Please note: This is a Work-in-Progress. The goal is to make a first release by the end of summer 2024.
 
@@ -17,8 +17,8 @@ This project makes use of the following libraries:
 
 - [Hydra](https://hydra.cc/) is used to configure the project. It allows you to define configuration files and override them from the command line.
 - [PyTorch Lightning](https://lightning.ai/docs/pytorch/stable/) is used to as the training framework. It provides a high-level interface to organize ML research code.
-  - Please note: This repo does not restrict you to use PyTorch. You can also use Jax, as is shown in the [Jax example](https://www.github.com/mila-iqia/ResearchTemplate/tree/master/project/algorithms/jax_algo.py)
-- [Weights & Biases](wandb.ai) is used to log metrics and visualize results.
+    - 🔥 Please note: You can also use [Jax](https://jax.readthedocs.io/en/latest/) with this repo, as is shown in the [Jax example](examples.md#using-jax) 🔥
+- [Weights & Biases](https://wandb.ai) is used to log metrics and visualize results.
 - [pytest](https://docs.pytest.org/en/stable/) is used for testing.
 
 ## Usage
@@ -29,7 +29,7 @@ To see all available options:
 python project/main.py --help
 ```
 
-todo
+For a detailed list of examples, see the [examples page](examples.md).
 
 <!-- * `mkdocs new [dir-name]` - Create a new project.
 * `mkdocs serve` - Start the live-reloading docs server.
@@ -49,12 +49,3 @@ project/
 docs/            # documentation
 conftest.py      # Test fixtures and utilities
 ```
-
-<!--
-## How does it work?
-
-todo  -->
-
-## Running tests
-
-todo -->
diff --git a/docs/intro.md b/docs/intro.md
@@ -0,0 +1,35 @@
+# Introduction
+
+## Why should you use this template?
+
+### Why should you use *a* template in the first place?
+
+For many good reasons, which are very well described [here in a similar project](https://cookiecutter-data-science.drivendata.org/why/)! 😊
+
+Other good reads:
+
+- [https://cookiecutter-data-science.drivendata.org/why/](https://cookiecutter-data-science.drivendata.org/why/)
+- [https://cookiecutter-data-science.drivendata.org/opinions/](https://cookiecutter-data-science.drivendata.org/opinions/)
+- [https://12factor.net/](https://12factor.net/)
+- [https://github.com/ashleve/lightning-hydra-template/tree/main?tab=readme-ov-file#main-ideas](https://github.com/ashleve/lightning-hydra-template/tree/main?tab=readme-ov-file#main-ideas)
+
+### Why should you use *this* template (instead of another)?
+
+You are welcome (and encouraged) to use other similar templates which, at the time of writing this, have significantly better documentation. However, there are several advantages to using this particular template:
+
+- ❗Support for both Jax and Torch with PyTorch-Lightning ❗
+- Easy development inside a devcontainer with VS Code
+- Tailor-made for ML researchers who run their jobs on SLURM clusters (with default configurations for the [Mila](https://docs.mila.quebec) and [DRAC](https://docs.alliancecan.ca) clusters)
+- Rich typing of all parts of the source code using Python 3.12's new type annotation syntax
+- A comprehensive suite of automated tests for new algorithms, datasets and networks
+
+This template is geared specifically for ML researchers that run their jobs on SLURM clusters.
+Particular emphasis is placed on development with a SLURM cluster, and more specifically with the Mila and DRAC clusters in mind. The target audience is (currently) limited to Mila researchers, but there's no reason why this
+
+## Main concepts
+
+### Datamodule
+
+### Network
+
+### Algorithm
diff --git a/docs/related.md b/docs/related.md
@@ -0,0 +1,21 @@
+# Related projects and resources
+
+There are other very similar projects with significantly better documentation. In all cases that involve Hydra and PyTorch-Lightning, this documentation also applies directly to this project, so in order to avoid copying their documentation, here are some links:
+
+- [lightning-hydra-template](https://github.com/ashleve/lightning-hydra-template)
+
+    - How it works: https://github.com/gorodnitskiy/yet-another-lightning-hydra-template/tree/main?tab=readme-ov-file#workflow---how-it-works
+
+- [yet-another-lightning-hydra-template](https://github.com/gorodnitskiy/yet-another-lightning-hydra-template)
+
+    - Excellent template, based on the lightning-hydra-template. Great documentation, which is referenced extensively in this project.
+        - Has a **great** Readme with lots of information
+        - Is really well organized
+        - Doesn't support Jax
+        - Doesn't have a devcontainer
+    - Great blog: https://hackernoon.com/yet-another-lightning-hydra-template-for-ml-experiments
+
+- [cookiecutter-data-science](https://github.com/drivendataorg/cookiecutter-data-science)
+
+    - Awesome library for data science.
+    - Related projects: https://github.com/drivendataorg/cookiecutter-data-science/blob/master/docs/docs/related.md#links-to-related-projects-and-references
diff --git a/docs/tests.md b/docs/tests.md
@@ -0,0 +1,10 @@
+# Tests
+
+TODOs:
+
+- [ ] Describe what is tested by the included automated tests (a bit like what is done [here](https://github.com/gorodnitskiy/yet-another-lightning-hydra-template?tab=readme-ov-file#tests))
+- [ ] Add some examples of how to run tests
+- [ ] Describe why the test files are next to the source files, why TDD is good, and why ML researchers should care more about tests.
+- [ ] Explain how the fixtures in `conftest.py` work (indirect parametrization of the command-line overrides, etc).
+- [ ] Describe the Github Actions workflows that come with the template, and how to setup a self-hosted runner for template forks.
+- [ ] Add links to relevant documentation ()
diff --git a/mkdocs.yml b/mkdocs.yml
@@ -6,13 +6,17 @@ repo_url: https://www.github.com/mila-iqia/ResearchTemplate
 theme: readthedocs
 nav:
   - Home: index.md
-  - install.md
+  - intro.md
   - examples.md
+  - install.md
+  - tests.md
   - related.md
+  - help.md
+  - contributing.md
 markdown_extensions:
   - toc:
       permalink: "#"
-      toc_depth: 2
+      toc_depth: 3
 
 # todo: take a look at https://github.com/drivendataorg/cookiecutter-data-science/blob/master/docs/mkdocs.yml
 #   - admonition