The repository has a directory for each specialization S01 - S06. Inside are the directories for the learning units. See the example below:
|-- S01
    |-- requirements.txt
    |-- SLU01 - Pandas 101
        |-- media
            |-- some-image.png
        |-- data
            |-- some-dataset.csv
        |-- Examples notebook.ipynb
        |-- Exercise notebook.ipynb
        |-- Learning notebook.ipynb
        |-- README.md
    |-- SLU02 - Subsetting Data in Pandas
        ...
    |-- SLU03 - Visualization with Pandas and Matplotlib
        ...
The learning unit directory naming follows this convention:
<specialization ID> - <specialization name>/<learning unit ID> - <learning unit name>
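As a minimal sketch of the convention above, the relative path of a learning unit can be composed from its four parts. The function name `learning_unit_dir` is hypothetical and only used for illustration; it is not part of the repository.

```python
from pathlib import PurePosixPath


def learning_unit_dir(spec_id: str, spec_name: str,
                      lu_id: str, lu_name: str) -> PurePosixPath:
    """Build '<specialization ID> - <specialization name>/<LU ID> - <LU name>'."""
    # Hypothetical helper: it only joins the parts, it does not touch the disk.
    return PurePosixPath(f"{spec_id} - {spec_name}") / f"{lu_id} - {lu_name}"


path = learning_unit_dir("S01", "Bootcamp and Binary Classification",
                         "SLU01", "Pandas 101")
print(path)  # S01 - Bootcamp and Binary Classification/SLU01 - Pandas 101
```

Note that the directory names contain spaces, which is why the shell commands in this guide wrap them in double quotes.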
The LU directory contains the README (a markdown file with the description of the unit), the Learning notebook, the Exercise notebook, and the Examples notebook. It may also contain a media directory with all the visual material, the data directory with the datasets, and a utils.py file with helper code.
The notebooks are all Jupyter notebooks, the same as you worked with during the admissions process. They contain text, interactive code, and tests for your solutions in the exercises.
The requirements.txt file in each specialization directory lists the packages to be installed in the virtual environment.
Please keep the same directory structure in your workspace repository, because the grader in the Portal depends on it.
You will need to follow this workflow whenever new learning material is released. Learning unit releases are announced in the #announcements channel on Slack. At that point the material is available in this repository. A new learning unit is usually released on Monday mornings.
To get the new material, enter your local copy of this repo and pull from the repo:
cd ~/projects/batch7-students/
git pull
Copy the learning unit folder to your local batch7-workspace:
cp -r ~/projects/batch7-students/"<specialization ID> - <specialization name>"/"<learning unit ID> - <learning unit name>" ~/projects/batch7-workspace/"<specialization ID> - <specialization name>"
For example, for S01 - Bootcamp and Binary Classification and SLU01 - Pandas 101, it would look like this:
cp -r ~/projects/batch7-students/"S01 - Bootcamp and Binary Classification"/"SLU01 - Pandas 101" ~/projects/batch7-workspace/"S01 - Bootcamp and Binary Classification"
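If you prefer to script the copy, the sketch below does roughly the same thing as the `cp -r` command using Python's standard library. The function name `copy_learning_unit` is hypothetical. Unlike `cp -r`, this sketch refuses to run when the learning unit already exists in the workspace, on the assumption that the existing copy may contain your solutions.

```python
import shutil
from pathlib import Path


def copy_learning_unit(students_repo: Path, workspace_repo: Path,
                       specialization: str, learning_unit: str) -> Path:
    """Copy one learning unit from the students repo into the workspace repo."""
    source = students_repo / specialization / learning_unit
    target = workspace_repo / specialization / learning_unit
    if not source.is_dir():
        raise FileNotFoundError(f"Learning unit not found: {source}")
    if target.exists():
        # Refuse to overwrite: the workspace copy may already hold your work.
        raise FileExistsError(f"Already in workspace: {target}")
    # copytree also creates the specialization directory if it is missing.
    shutil.copytree(source, target)
    return target


# Example call, using the paths from this guide:
# projects = Path.home() / "projects"
# copy_learning_unit(projects / "batch7-students", projects / "batch7-workspace",
#                    "S01 - Bootcamp and Binary Classification", "SLU01 - Pandas 101")
```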
You will need a new virtual environment for every specialization. You have already created one for S01 during the previous setup steps. Here we will repeat some of those steps so that you have a complete guide for when you start each specialization.
- Open the terminal and create the virtual environment for the specialization:
python3.10 -m venv ~/.virtualenvs/s01
- Activate the virtual environment of the specialization:
source ~/.virtualenvs/s01/bin/activate
- Enter the directory of the specialization and install the requirements:
cd ~/projects/batch7-workspace/"S01 - Bootcamp and Binary Classification"
pip install -r requirements.txt
You will see a lot of output in the terminal while pip installs the packages. Check this output for errors.
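To double-check that the virtual environment is really active, you can ask Python itself. The sketch below relies on the fact that, inside an environment created with `venv`, `sys.prefix` differs from `sys.base_prefix`. The function name `in_virtual_environment` is hypothetical.

```python
import sys


def in_virtual_environment() -> bool:
    """Return True when this interpreter runs inside a venv environment."""
    # In a venv, sys.prefix points at the environment directory while
    # sys.base_prefix still points at the base Python installation.
    return sys.prefix != sys.base_prefix


print("Python version:", ".".join(map(str, sys.version_info[:3])))
print("Virtual environment active:", in_virtual_environment())
print("Environment location:", sys.prefix)
```

Run it with the `python` command of the activated environment. If the environment location does not point into `~/.virtualenvs/`, the activation step probably did not take effect in this terminal.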
- Enter the learning unit directory in your workspace directory (batch7-workspace).
Note: It is VERY IMPORTANT that you ALWAYS work on the files in your batch7-workspace repository, and NEVER change the files in the batch7-students local repository! If you do change these files, you can have a merge conflict when you next pull from the GitHub repository.
cd ~/projects/batch7-workspace/"S01 - Bootcamp and Binary Classification"/"SLU01 - Pandas 101"
- Activate the correct virtual environment:
source ~/.virtualenvs/s01/bin/activate
- Run the Jupyter notebook:
jupyter notebook
If the previous command does not work, use this one:
jupyter notebook --NotebookApp.use_redirect_file=False
Jupyter prints its startup messages in the terminal. Your browser should then open with Jupyter. If this does not happen, copy the link shown in your terminal (the one that contains localhost) and paste it into your browser's address bar.
Note: Jupyter may print some scary-looking error messages in the terminal at this point. Don't worry, you can just ignore them.
You will also see a message about updating Jupyter Notebook. You can ignore it as well.
After you have studied the Learning notebook, do the exercises in the Exercise notebook. The notebook has cells where you should write your solutions, followed by cells with tests for the solutions. The tests are a series of assert statements. If all the asserts pass, that is, if you don't get an AssertionError or any other kind of error, your solution is correct.
A failing assert usually gives you a hint about the error. Other kinds of errors raised by Python produce a traceback that indicates the line where the error occurred.
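As an illustration of how such tests behave, here is a minimal sketch of a solution cell followed by a test cell. The exercise `count_missing` is hypothetical and not taken from any learning unit; real exercise notebooks come with their own tests.

```python
# Solution cell: this is where you would write your answer.
def count_missing(values):
    """Hypothetical exercise: count the None entries in a list."""
    return sum(1 for value in values if value is None)


# Test cell: an assert is silent when it passes and raises
# AssertionError (with the message after the comma) when it fails.
assert count_missing([1, None, 3, None]) == 2, "Expected two missing values"
assert count_missing([]) == 0, "An empty list has no missing values"
print("All asserts passed")
```

If the function returned a wrong count, the first assert would stop the cell with `AssertionError: Expected two missing values`, and the message is the hint.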
Once you've solved all the exercises, we recommend following this simple checklist to avoid surprises:
- Save the notebook (again)
- Run "Restart & Run All"
- At this point the notebook should have run without any failing assertions
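The same check can be approximated from the terminal. The sketch below assumes that `jupyter nbconvert` is available in the active virtual environment, and uses it to execute the whole notebook in a fresh kernel. It is not the grader's procedure. The function names and the output file name `run-all-check.ipynb` are hypothetical.

```python
import subprocess
from pathlib import Path


def run_all_command(notebook: Path,
                    output_name: str = "run-all-check.ipynb") -> list[str]:
    """Build an nbconvert command that executes every cell of the notebook."""
    return [
        "jupyter", "nbconvert",
        "--to", "notebook",        # keep the result in notebook format
        "--execute", str(notebook),
        "--output", output_name,   # write a copy, leave the original untouched
    ]


def runs_cleanly(notebook: Path) -> bool:
    """Return True when nbconvert executed the notebook without a cell error."""
    result = subprocess.run(run_all_command(notebook),
                            capture_output=True, text=True)
    if result.returncode != 0:
        print(result.stderr)  # nbconvert reports the failing cell here
    return result.returncode == 0


# Example call from inside the learning unit directory:
# runs_cleanly(Path("Exercise notebook.ipynb"))
```

The executed copy is only a by-product of the check, so delete it before you commit.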
If you want to submit your notebook before it is all the way done to check intermediate progress, feel free to do so. The grader in the portal will run all your code, even if there are errors further up in the notebook.
Now that you have worked on the exercise notebooks, you should commit the changes to your local repository and push them to your GitHub repository. You can test this workflow with the notebooks from the admission process: SLU01, SLU02, and SLU03.
Using the terminal, commit and push the changes (from the LU directory):
git add .
git commit -m 'Completed SLU01'
git push
Now go to the portal and ask it to grade your notebook.
- Go to the Portal and select the learning unit
- Select "Grade"
- You will see your grade, e.g. 20/20.
- If all the exercise asserts passed locally but the grader doesn't give you the expected output, head to troubleshooting.
As much as we try to have processes in place to prevent errors and bugs in the learning units, some make it through to you.
If the problem is not in the exercise notebook, you can just pull the new version from the students repo and replace the file. Take care not to overwrite other files.
If the correction is in the exercise notebook, you can't just replace the file, because your work is there and you would lose it!
When a new version of the exercise notebook is released, two things will happen. If you submit an old version of the notebook, it will be flagged as out of date and not graded. You will then have to merge the work you've already done into the new version of the notebook.
Our suggestion to merge the changes is:
- Rename the old version
- Copy the new exercise notebook to your workspace repo
- Open both notebooks and copy-paste your solutions to the new notebook
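The first two steps can be scripted, as in the sketch below. The function name `stage_new_exercise_notebook` and the ` (old)` suffix are hypothetical choices; any name that keeps your old notebook safe works equally well. The third step, copying your solutions across, still has to be done by hand.

```python
import shutil
from pathlib import Path


def stage_new_exercise_notebook(workspace_lu: Path, students_lu: Path,
                                name: str = "Exercise notebook.ipynb") -> Path:
    """Rename the old workspace notebook, then copy in the new version."""
    current = workspace_lu / name
    new_version = students_lu / name
    backup = workspace_lu / f"{current.stem} (old){current.suffix}"
    # Check everything before changing any file.
    if not current.is_file():
        raise FileNotFoundError(f"No notebook to rename: {current}")
    if not new_version.is_file():
        raise FileNotFoundError(f"New version not found: {new_version}")
    if backup.exists():
        raise FileExistsError(f"Backup already exists: {backup}")
    current.rename(backup)              # step 1: keep your work under a new name
    shutil.copy2(new_version, current)  # step 2: bring in the new version
    return backup
```

After running it, open both notebooks in Jupyter and copy your solutions from the renamed notebook into the new one.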
We understand it's not ideal and are working on improving this workflow using nbdime. If you are comfortable installing Python packages you can try it out, but we offer no support for this at the moment.
During the academy you will surely run into problems and have doubts about the material. We provide several channels where you can ask for help.
If you feel something is not clear enough or there is a bug in the learning material please follow these steps. Remember, there is no such thing as a dumb question, and by asking questions publicly you will help others!
Are you getting different results locally than in the Portal? If so we will first ask you to do a bit of troubleshooting:
- Ensure that you have saved the changes in the notebook
- Ensure that you have committed and pushed the changes
- Ensure that you are not using packages that are not present in the original requirements.txt file (changes to this file or to your local environment have no effect on the grader)
- In the learning unit page in the Portal you can download the exercise notebook with the results of the grader by clicking on your grade. Have a look to figure out what went wrong.
If none of these steps helped, go ahead and ask for help on Slack in the #devops channel.
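For the package check in the list above, a small sketch like the following can list the package names that a requirements.txt file declares. It assumes simple lines such as `name==version` and is not a full parser of the requirements format. The function name `requirement_names` is hypothetical, and the sample text uses made-up version numbers.

```python
import re


def requirement_names(requirements_text: str) -> set[str]:
    """Extract lower-cased package names from the text of a requirements file."""
    names = set()
    for line in requirements_text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and surrounding spaces
        if not line or line.startswith("-"):  # skip blanks and options like "-r x.txt"
            continue
        match = re.match(r"[A-Za-z0-9][A-Za-z0-9._-]*", line)
        if match:
            names.add(match.group(0).lower())
    return names


# Made-up sample content, not the real requirements of any specialization.
sample = "pandas==1.5.3\nmatplotlib>=3.6  # plotting\n\njupyter\n"
print(sorted(requirement_names(sample)))  # ['jupyter', 'matplotlib', 'pandas']
```

Keep in mind that the name you import is not always the name of the package (for example, `sklearn` is installed as `scikit-learn`), so compare the result with your imports by eye.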
Is the Portal down or behaving in some unexpected way? Please report it in the #devops channel on Slack.