From f5c970137529393eed11452bbf51b9aaa987e6be Mon Sep 17 00:00:00 2001
From: Mirko Bronzi <m.bronzi@gmail.com>
Date: Fri, 20 Aug 2021 14:30:37 -0400
Subject: [PATCH] release 2.1 (#41)

* feat: add option to specify a temporary folder for the experiment. (#5)

* added option to rsync input and output data

* added docstring

* logging to stdout now

* fixed script for clusters - now using slurm tmpdir to write temp results

* fixing travis

* added missing docstring

* fixed tensorflow part (method signature change)

* renamed variables

* Seed for reproducibility (#6)

* added option to rsync input and output data

* added docstring

* logging to stdout now

* fixed script for clusters - now using slurm tmpdir to write temp results

* fixing travis

* added missing docstring

* fixed tensorflow part (method signature change)

* added seed for pytorch

* fixed typo

* added comment on how to use seed

* fixed flake8

* added test on reproducibility

* removed pytorch part from tensorflow

* fixed cookiecutter syntax

* added check for tensorflow

* fixed typo in test file

* added command to set the seed in tensorflow

* fixed flake8 error

* fixed typos

* removed duplicate log

* typo in docstring

* better error message in test

* added test to check repro using Orion (#8)

* added test to check repro using Orion

* more log into travis

* more info to debug travis

* running two trials for orion

* added seed to orion

* added orion test to tensorflow part

* better log messages in travis

* Add support for keras and PyTorch Lightning (#12)

* added code for keras - still need to complete all tests

* fixed flake8

* started adding PyTorch Lightning support - note that mlflow and loading/saving model still does not work

* fixed api change

* fixed pytorch early stopping

* fixed flake8

* fixed flake8 for pytorch version

* fixed keras part for flake8

* added code to resume a model - for pytorch lightning

* removed forgotten diff

* fixed start_from_scratch (not loading a model even if present) / now printing the val loss in the logs

* pytorch lightning now correctly logging under the same run

* now pytorch is correctly resuming training and continues to plot in the same mlflow run

* added github actions

* using a different ubuntu image

* printing folder - trying to fix github actions

* telling git who I am..

* removed not useful test

* fixed typo in test folder

* removed travis configuration - using github actions from now on

* correctly handling the saved models in pytorch

* now passing the full hyper-parameter object to train_impl method (for more flexibility).

* added option to ask for gpus in pytorch

* improved error message

* Fixups for the lightning_and_keras PR (#12) (#22)

* Update torch model to pl-lightning model

* Refactor train+model impl w/ optim module

* Refactor data loader w/ data module for plightning

* removing codecov from cookiecutter. (#24)

* moving to github actions (#25)

* removing coverage computation

* moving from travis to github actions.

* setting fake name/email for git.

* removed (not-correct) duplicate for github actions config file.

* fixing tests.

* refactored pytorch models. (#26)

Co-authored-by: Mirko Bronzi <m.bronzi@gmail.com>

* running CI also on develop.

Co-authored-by: Pierre-Luc St-Charles <pierreluc.stcharles@mila.quebec>

* Adding more CI backends. (#27)

* added github actions.

* moved python version to 3.9 - by default.

* added support for azure continuous integration.

* updated mlflow/orion dependencies.

* now correctly restoring models for pytorch. (#28)

* Now running test-coverage locally. (#30)

* running test coverage locally.

* fixed project name.

* correctly allowing mlflow to work in any folder. (#29)

* removed duplicate CI.

* Update cookiecutter doc url (#37)

* made the template generic by default - will add mila-specific aspects only if enabled at template instantiation time (#38)

* default branch is now main (#39)

* made the template generic by default - will add mila-specific aspects only if enabled at template instantiation time

* now using main as the default branch for github

* Fixed typo

Co-authored-by: Pierre-Luc St-Charles <pierreluc.stcharles@mila.quebec>
Co-authored-by: Mathieu Germain <mathieu.germain@gmail.com>
---
 README.md                                     |  2 +-
 cookiecutter.json                             |  1 +
 .../.github/workflows/tests.yml               |  4 +--
 {{cookiecutter.project_slug}}/README.md       | 29 ++++++++++---------
 .../examples/{slurm_cc => slurm}/config.yaml  |  0
 .../examples/{slurm_cc => slurm}/run.sh       |  0
 .../{slurm_mila => slurm}/to_submit.sh        | 13 ++++++++-
 .../examples/slurm_cc/to_submit.sh            | 16 ----------
 .../examples/slurm_cc_orion/to_submit.sh      | 23 ---------------
 .../examples/slurm_mila/config.yaml           | 14 ---------
 .../examples/slurm_mila/run.sh                |  2 --
 .../examples/slurm_mila_orion/config.yaml     | 14 ---------
 .../slurm_mila_orion/orion_config.yaml        | 16 ----------
 .../examples/slurm_mila_orion/run.sh          |  2 --
 .../config.yaml                               |  0
 .../orion_config.yaml                         |  0
 .../{slurm_cc_orion => slurm_orion}/run.sh    |  0
 .../to_submit.sh                              | 17 +++++++----
 18 files changed, 43 insertions(+), 110 deletions(-)
 rename {{cookiecutter.project_slug}}/examples/{slurm_cc => slurm}/config.yaml (100%)
 rename {{cookiecutter.project_slug}}/examples/{slurm_cc => slurm}/run.sh (100%)
 rename {{cookiecutter.project_slug}}/examples/{slurm_mila => slurm}/to_submit.sh (55%)
 delete mode 100644 {{cookiecutter.project_slug}}/examples/slurm_cc/to_submit.sh
 delete mode 100644 {{cookiecutter.project_slug}}/examples/slurm_cc_orion/to_submit.sh
 delete mode 100644 {{cookiecutter.project_slug}}/examples/slurm_mila/config.yaml
 delete mode 100644 {{cookiecutter.project_slug}}/examples/slurm_mila/run.sh
 delete mode 100644 {{cookiecutter.project_slug}}/examples/slurm_mila_orion/config.yaml
 delete mode 100644 {{cookiecutter.project_slug}}/examples/slurm_mila_orion/orion_config.yaml
 delete mode 100644 {{cookiecutter.project_slug}}/examples/slurm_mila_orion/run.sh
 rename {{cookiecutter.project_slug}}/examples/{slurm_cc_orion => slurm_orion}/config.yaml (100%)
 rename {{cookiecutter.project_slug}}/examples/{slurm_cc_orion => slurm_orion}/orion_config.yaml (100%)
 rename {{cookiecutter.project_slug}}/examples/{slurm_cc_orion => slurm_orion}/run.sh (100%)
 rename {{cookiecutter.project_slug}}/examples/{slurm_mila_orion => slurm_orion}/to_submit.sh (70%)

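Note for existing users: scripts that still reference the old example folders can follow the renames listed above. The snippet below is only a sketch of that mapping; `new_example_dir` is a hypothetical helper written for illustration and is not added to the template by this patch.

```shell
#!/bin/sh
# Hypothetical helper (not part of the template): map an example folder name
# used before this patch to the folder that replaces it after the renames.
new_example_dir() {
    case "$1" in
        slurm_cc|slurm_mila)             echo "slurm" ;;
        slurm_cc_orion|slurm_mila_orion) echo "slurm_orion" ;;
        *)                               echo "$1" ;;  # unchanged, e.g. "local"
    esac
}

# Print the new location for each folder name that this patch retires.
for old in slurm_cc slurm_mila slurm_cc_orion slurm_mila_orion; do
    printf 'examples/%s -> examples/%s\n' "$old" "$(new_example_dir "$old")"
done
```

The cluster-specific `#SBATCH` options that used to distinguish the old folders are now commented-out lines in the merged `to_submit.sh` files, so they still need to be enabled by hand after switching paths.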
diff --git a/README.md b/README.md
index 2674d52..b3b6e0f 100644
--- a/README.md
+++ b/README.md
@@ -13,7 +13,7 @@ A cookiecutter is a generic project template that will instantiate a new project
 * Flake8
 * Pytest
 
-More information on what a cookiecutter is [here.](https://cookiecutter.readthedocs.io/en/)
+More information on what a cookiecutter is [here.](https://cookiecutter.readthedocs.io)
 
 Quickstart
 ----------
diff --git a/cookiecutter.json b/cookiecutter.json
index 3e6909c..67f79ec 100644
--- a/cookiecutter.json
+++ b/cookiecutter.json
@@ -7,6 +7,7 @@
   "project_short_description": "{{ cookiecutter.project_name }} is wonderful!",
   "python_version": "3.8",
   "dl_framework": ["pytorch", "tensorflow_cpu", "tensorflow_gpu"],
+  "environment": ["generic", "mila"],
   "pypi_username": "{{ cookiecutter.github_username }}",
   "version": "0.0.1",
   "open_source_license": ["MIT license", "BSD license", "ISC license", "Apache Software License 2.0", "GNU General Public License v3", "Not open source"]
diff --git a/{{cookiecutter.project_slug}}/.github/workflows/tests.yml b/{{cookiecutter.project_slug}}/.github/workflows/tests.yml
index 7f65188..6800e41 100644
--- a/{{cookiecutter.project_slug}}/.github/workflows/tests.yml
+++ b/{{cookiecutter.project_slug}}/.github/workflows/tests.yml
@@ -4,11 +4,11 @@ on:
   # but only for the main/develop branch
   push:
     branches:
-      - master
+      - main
       - develop
   pull_request:
     branches:
-      - master
+      - main
       - develop
 jobs:
   build:
diff --git a/{{cookiecutter.project_slug}}/README.md b/{{cookiecutter.project_slug}}/README.md
index 4180275..9a594a6 100644
--- a/{{cookiecutter.project_slug}}/README.md
+++ b/{{cookiecutter.project_slug}}/README.md
@@ -1,5 +1,3 @@
-[![Build Status](https://travis-ci.com/{{ cookiecutter.github_username }}/{{ cookiecutter.project_slug }}.png?branch=master)](https://travis-ci.com/{{ cookiecutter.github_username }}/{{ cookiecutter.project_slug }})
-
 {% set is_open_source = cookiecutter.open_source_license != 'Not open source' -%}
 
 # {{ cookiecutter.project_name }}
@@ -46,9 +44,12 @@ These hooks will:
 Go on github and follow the instructions to create a new project.
 When done, do not add any file, and follow the instructions to
 link your local git to the remote project, which should look like this:
+(Note: these instructions are reproduced here for your convenience.
+We suggest also checking the GitHub project page for more up-to-date info.)
 
     git remote add origin git@github.com:{{ cookiecutter.github_username }}/{{ cookiecutter.project_slug }}.git
-    git push -u origin master
+    git branch -M main
+    git push -u origin main
 
 ### Setup Continuous Integration
 
@@ -66,7 +67,7 @@ Check the following instructions for more details.
 Github actions are already configured in `.github/workflows/tests.yml`.
 Github actions are already enabled by default when using Github, so, when
 pushing to github, they will be executed automatically for pull requests to
-`master` and to `develop`.
+`main` and to `develop`.
 
 #### Travis
 
@@ -120,12 +121,10 @@ Note you have two new folders now:
 You can run mlflow from this folder (`examples/local`) by running
 `mlflow ui`.
 
-#### Run on the Mila cluster
-(NOTE: this example also apply to Compute Canada - use the folders
-`slurm_cc` and `slurm_cc_orion` instead of `slurm_mila` and `slurm_mila_orion`.)
+#### Run on a remote cluster (with Slurm)
 
-First, bring you project on the Mila cluster (assuming you didn't create your
-project directly there). To do so, simply login on the Mila cluster and git
+First, bring your project to the cluster (assuming you didn't create your
+project directly there). To do so, simply log in to the cluster and git
 clone your project:
 
     git clone git@github.com:{{ cookiecutter.github_username }}/{{ cookiecutter.project_slug }}.git
@@ -135,12 +134,13 @@ Then activate your virtual env, and install the dependencies:
     cd {{ cookiecutter.project_slug }}
     pip install -e .
 
-To run with SLURM, just:
+To run with Slurm, just:
 
-    cd examples/slurm_mila
+    cd examples/slurm
     sh run.sh
 
 Check the log to see that you got an almost perfect loss (i.e., 0).
+{%- if cookiecutter.environment == 'mila' %}
 
 #### Measure GPU time (and others) on the Mila cluster
 
@@ -184,11 +184,12 @@ In a separate shell on your local computer, run the following command:
 where `<username>` is your user name on the Mila cluster and `<hostname>` is the name of the machine your job is currenty running on (`leto35` in our example). You can then navigate your local browser to `http://localhost:19999/` to view the ressources being used on the cluster and monitor your job. You should see something like this:
 
 ![image](https://user-images.githubusercontent.com/18450628/88088807-fe2acd80-cb58-11ea-8ab2-bd090e8a826c.png)
+{%- endif %}
 
-#### Run with Orion on the Mila cluster
+#### Run with Orion on the Slurm cluster
 
 This example will run orion for 2 trials (see the orion config file).
-To do so, go into `examples/slurm_mila_orion`.
+To do so, go into `examples/slurm_orion`.
 Here you can find the orion config file (`orion_config.yaml`), as well as the config
 file (`config.yaml`) for your project (that contains the hyper-parameters).
 
@@ -204,7 +205,7 @@ Inside these folders, you can find the models (the best one and the last one), t
 the hyper-parameters for this trial, and the log file.
 
 You can check orion status with the following commands:
-(to be run from `examples/slurm_mila_orion`)
+(to be run from `examples/slurm_orion`)
 
     export ORION_DB_ADDRESS='orion_db.pkl'
     export ORION_DB_TYPE='pickleddb'
diff --git a/{{cookiecutter.project_slug}}/examples/slurm_cc/config.yaml b/{{cookiecutter.project_slug}}/examples/slurm/config.yaml
similarity index 100%
rename from {{cookiecutter.project_slug}}/examples/slurm_cc/config.yaml
rename to {{cookiecutter.project_slug}}/examples/slurm/config.yaml
diff --git a/{{cookiecutter.project_slug}}/examples/slurm_cc/run.sh b/{{cookiecutter.project_slug}}/examples/slurm/run.sh
similarity index 100%
rename from {{cookiecutter.project_slug}}/examples/slurm_cc/run.sh
rename to {{cookiecutter.project_slug}}/examples/slurm/run.sh
diff --git a/{{cookiecutter.project_slug}}/examples/slurm_mila/to_submit.sh b/{{cookiecutter.project_slug}}/examples/slurm/to_submit.sh
similarity index 55%
rename from {{cookiecutter.project_slug}}/examples/slurm_mila/to_submit.sh
rename to {{cookiecutter.project_slug}}/examples/slurm/to_submit.sh
index fed5b77..c87cc12 100644
--- a/{{cookiecutter.project_slug}}/examples/slurm_mila/to_submit.sh
+++ b/{{cookiecutter.project_slug}}/examples/slurm/to_submit.sh
@@ -1,5 +1,16 @@
 #!/bin/bash
-#SBATCH --partition=long
+{%- if cookiecutter.environment == 'mila' %}
+## this is for Compute Canada (uncomment it if you need it):
+##SBATCH --account=rrg-bengioy-ad
+## this is instead for the Mila cluster (uncomment it if you need it):
+##SBATCH --partition=long
+# to attach a tag to your run (e.g., used to track the GPU time)
+# uncomment the following line and replace `my_tag` with the proper tag:
+##SBATCH --wckey=my_tag
+{%- endif %}
+{%- if cookiecutter.environment == 'generic' %}
+## set --account=... or --partition=... as needed.
+{%- endif %}
 #SBATCH --cpus-per-task=2
 #SBATCH --gres=gpu:1
 #SBATCH --mem=5G
diff --git a/{{cookiecutter.project_slug}}/examples/slurm_cc/to_submit.sh b/{{cookiecutter.project_slug}}/examples/slurm_cc/to_submit.sh
deleted file mode 100644
index 5e33447..0000000
--- a/{{cookiecutter.project_slug}}/examples/slurm_cc/to_submit.sh
+++ /dev/null
@@ -1,16 +0,0 @@
-#!/bin/bash
-#SBATCH --account=rrg-bengioy-ad
-#SBATCH --cpus-per-task=2
-#SBATCH --gres=gpu:1
-#SBATCH --mem=5G
-#SBATCH --time=0:05:00
-#SBATCH --job-name={{ cookiecutter.project_slug }}
-#SBATCH --output=logs/%x__%j.out
-#SBATCH --error=logs/%x__%j.err
-# remove one # if you prefer receiving emails
-##SBATCH --mail-type=all
-##SBATCH --mail-user={{ cookiecutter.email }}
-
-export MLFLOW_TRACKING_URI='mlruns'
-
-main --data ../data --output output --config config.yaml --tmp-folder ${SLURM_TMPDIR} --disable-progressbar
diff --git a/{{cookiecutter.project_slug}}/examples/slurm_cc_orion/to_submit.sh b/{{cookiecutter.project_slug}}/examples/slurm_cc_orion/to_submit.sh
deleted file mode 100644
index 209fecb..0000000
--- a/{{cookiecutter.project_slug}}/examples/slurm_cc_orion/to_submit.sh
+++ /dev/null
@@ -1,23 +0,0 @@
-#!/bin/bash
-# __TODO__ fix options if needed
-#SBATCH --job-name={{ cookiecutter.project_slug }}
-#SBATCH --account=rrg-bengioy-ad
-#SBATCH --cpus-per-task=2
-#SBATCH --gres=gpu:1
-#SBATCH --mem=5G
-#SBATCH --time=0:05:00
-#SBATCH --output=logs/%x__%j.out
-#SBATCH --error=logs/%x__%j.err
-# remove one # if you prefer receiving emails
-##SBATCH --mail-type=all
-##SBATCH --mail-user={{ cookiecutter.email }}
-
-export MLFLOW_TRACKING_URI='mlruns'
-export ORION_DB_ADDRESS='orion_db.pkl'
-export ORION_DB_TYPE='pickleddb'
-
-orion -v hunt --config orion_config.yaml \
-    main --data ../data --config config.yaml --disable-progressbar \
-    --output '{exp.working_dir}/{exp.name}_{trial.id}/' \
-    --log '{exp.working_dir}/{exp.name}_{trial.id}/exp.log' \
-    --tmp-folder ${SLURM_TMPDIR}
diff --git a/{{cookiecutter.project_slug}}/examples/slurm_mila/config.yaml b/{{cookiecutter.project_slug}}/examples/slurm_mila/config.yaml
deleted file mode 100644
index 2e58acc..0000000
--- a/{{cookiecutter.project_slug}}/examples/slurm_mila/config.yaml
+++ /dev/null
@@ -1,14 +0,0 @@
-# general
-batch_size: 32
-optimizer: adam
-loss: L1
-patience: 5
-architecture: my_model
-max_epoch: 99
-exp_name: my_exp_1
-# set to null to avoid setting a seed (can speed up GPU computation, but
-# results will not be reproducible)
-seed: 1234
-
-# architecture
-size: 10
diff --git a/{{cookiecutter.project_slug}}/examples/slurm_mila/run.sh b/{{cookiecutter.project_slug}}/examples/slurm_mila/run.sh
deleted file mode 100644
index 9370362..0000000
--- a/{{cookiecutter.project_slug}}/examples/slurm_mila/run.sh
+++ /dev/null
@@ -1,2 +0,0 @@
-mkdir -p logs
-sbatch to_submit.sh
diff --git a/{{cookiecutter.project_slug}}/examples/slurm_mila_orion/config.yaml b/{{cookiecutter.project_slug}}/examples/slurm_mila_orion/config.yaml
deleted file mode 100644
index 5c0028c..0000000
--- a/{{cookiecutter.project_slug}}/examples/slurm_mila_orion/config.yaml
+++ /dev/null
@@ -1,14 +0,0 @@
-# general
-batch_size: 32
-optimizer: adam
-loss: L1
-patience: 5
-architecture: my_model
-max_epoch: 99
-exp_name: my_exp_1
-# set to null to avoid setting a seed (can speed up GPU computation, but
-# results will not be reproducible)
-seed: 1234
-
-# architecture
-size: 'orion~uniform(1,100,discrete=True)'
diff --git a/{{cookiecutter.project_slug}}/examples/slurm_mila_orion/orion_config.yaml b/{{cookiecutter.project_slug}}/examples/slurm_mila_orion/orion_config.yaml
deleted file mode 100644
index f6bd2e1..0000000
--- a/{{cookiecutter.project_slug}}/examples/slurm_mila_orion/orion_config.yaml
+++ /dev/null
@@ -1,16 +0,0 @@
-experiment:
-  name:
-    my_exp
-  max_trials: 2
-  working_dir:
-    orion_working_dir
-  algorithms:
-    random:
-      seed: 1234
-evc:
-  non_monitored_arguments:
-    - output
-    - data
-    - tmp-folder
-  ignore_code_changes:
-    true
diff --git a/{{cookiecutter.project_slug}}/examples/slurm_mila_orion/run.sh b/{{cookiecutter.project_slug}}/examples/slurm_mila_orion/run.sh
deleted file mode 100644
index 9370362..0000000
--- a/{{cookiecutter.project_slug}}/examples/slurm_mila_orion/run.sh
+++ /dev/null
@@ -1,2 +0,0 @@
-mkdir -p logs
-sbatch to_submit.sh
diff --git a/{{cookiecutter.project_slug}}/examples/slurm_cc_orion/config.yaml b/{{cookiecutter.project_slug}}/examples/slurm_orion/config.yaml
similarity index 100%
rename from {{cookiecutter.project_slug}}/examples/slurm_cc_orion/config.yaml
rename to {{cookiecutter.project_slug}}/examples/slurm_orion/config.yaml
diff --git a/{{cookiecutter.project_slug}}/examples/slurm_cc_orion/orion_config.yaml b/{{cookiecutter.project_slug}}/examples/slurm_orion/orion_config.yaml
similarity index 100%
rename from {{cookiecutter.project_slug}}/examples/slurm_cc_orion/orion_config.yaml
rename to {{cookiecutter.project_slug}}/examples/slurm_orion/orion_config.yaml
diff --git a/{{cookiecutter.project_slug}}/examples/slurm_cc_orion/run.sh b/{{cookiecutter.project_slug}}/examples/slurm_orion/run.sh
similarity index 100%
rename from {{cookiecutter.project_slug}}/examples/slurm_cc_orion/run.sh
rename to {{cookiecutter.project_slug}}/examples/slurm_orion/run.sh
diff --git a/{{cookiecutter.project_slug}}/examples/slurm_mila_orion/to_submit.sh b/{{cookiecutter.project_slug}}/examples/slurm_orion/to_submit.sh
similarity index 70%
rename from {{cookiecutter.project_slug}}/examples/slurm_mila_orion/to_submit.sh
rename to {{cookiecutter.project_slug}}/examples/slurm_orion/to_submit.sh
index a6e669e..1143f83 100644
--- a/{{cookiecutter.project_slug}}/examples/slurm_mila_orion/to_submit.sh
+++ b/{{cookiecutter.project_slug}}/examples/slurm_orion/to_submit.sh
@@ -1,16 +1,23 @@
 #!/bin/bash
-# __TODO__ fix options if needed
 #SBATCH --job-name={{ cookiecutter.project_slug }}
-#SBATCH --partition=long
+{%- if cookiecutter.environment == 'mila' %}
+## this is for Compute Canada (uncomment it if you need it):
+##SBATCH --account=rrg-bengioy-ad
+## this is instead for the Mila cluster (uncomment it if you need it):
+##SBATCH --partition=long
+# to attach a tag to your run (e.g., used to track the GPU time)
+# uncomment the following line and replace `my_tag` with the proper tag:
+##SBATCH --wckey=my_tag
+{%- endif %}
+{%- if cookiecutter.environment == 'generic' %}
+## set --account=... or --partition=... as needed.
+{%- endif %}
 #SBATCH --cpus-per-task=2
 #SBATCH --gres=gpu:1
 #SBATCH --mem=5G
 #SBATCH --time=0:05:00
 #SBATCH --output=logs/%x__%j.out
 #SBATCH --error=logs/%x__%j.err
-# to attach a tag to your run (e.g., used to track the GPU time)
-# uncomment the following line and add replace `my_tag` with the proper tag:
-##SBATCH --wckey=my_tag
 # remove one # if you prefer receiving emails
 ##SBATCH --mail-type=all
 ##SBATCH --mail-user={{ cookiecutter.email }}