From c2fa3cf7faff8d1417e0e63620a289d7db7d2126 Mon Sep 17 00:00:00 2001 From: "Das, Snehal" Date: Fri, 22 Nov 2024 03:42:30 -0800 Subject: [PATCH] Better descriptions, minor corrections Signed-off-by: Das, Snehal --- .../104_Keras_MNIST_with_CPU.ipynb | 54 +++++++++++-------- 1 file changed, 31 insertions(+), 23 deletions(-) diff --git a/openfl-tutorials/experimental/104_Keras_MNIST_with_CPU.ipynb b/openfl-tutorials/experimental/104_Keras_MNIST_with_CPU.ipynb index 7a7995bc55..ad190584af 100644 --- a/openfl-tutorials/experimental/104_Keras_MNIST_with_CPU.ipynb +++ b/openfl-tutorials/experimental/104_Keras_MNIST_with_CPU.ipynb @@ -5,14 +5,17 @@ "id": "b0d201a8", "metadata": {}, "source": [ - "# Training a CNN on CPU using the Workflow Interface and MNIST data.\n", + "# Workflow Interface 104: Working with Keras on CPU\n", + "[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/snehal-das/openfl/blob/develop/openfl-tutorials/experimental/104_Keras_MNIST_with_CPU.ipynb)\n", + "\n", + "## Training a CNN on CPU using the Workflow Interface and MNIST data.\n", "\n", "The workflow interface is a way of orchestrating a federated learning experiment with OpenFL. The fundamental idea is to allow collaborators to train the model using the training data, while the aggregator is largely responsible for aggregating the model weights returned by the collaborators.\n", "\n", "The experiment can be broken down into the following steps:\n", "1. Installing pre-requisites: This includes OpenFL, Tensorflow, Keras and NumPy for this example.\n", "2. Downloading the training and testing data.\n", - "3. Setting up the NN for training.\n", + "3. Setting up the neural network for training.\n", "4. Define the Aggregator, Collaborators.\n", "5. Defining the Workflow - This forms the crux of the example, intending to demonstrate how the training gets split between the aggregator and collaborators.\n", "6. 
Running the experiment and evaluating the model performance." @@ -23,7 +26,7 @@ "id": "1888be23", "metadata": {}, "source": [ - "#### STEP#1: Install pre-requisites" + "#### STEP#1: Install pre-requisites for the exercise, including OpenFL and Tensorflow." ] }, { @@ -37,13 +40,13 @@ "%pip install git+https://github.com/securefederatedai/openfl.git\n", "%pip install -r workflow_interface_requirements.txt\n", "\n", - "#Install Tensorflow and MNIST dataset if not installed\n", - "%pip install tensorflow==2.13\n", - "\n", "# Uncomment this if running in Google Colab and set USERNAME if running in docker container.\n", - "# !pip install -r https://raw.githubusercontent.com/intel/openfl/develop/openfl-tutorials/experimental/workflow_interface_requirements.txt\n", - "# import os\n", - "# os.environ[\"USERNAME\"] = \"colab\"" + "#%pip install -r https://raw.githubusercontent.com/intel/openfl/develop/openfl-tutorials/experimental/workflow_interface_requirements.txt\n", + "#import os\n", + "#os.environ[\"USERNAME\"] = \"colab\"\n", + "\n", + "#Install Tensorflow to access Keras\n", + "%pip install tensorflow==2.13" ] }, { @@ -51,7 +54,11 @@ "id": "5f64c9d5", "metadata": {}, "source": [ - "#### STEP#2: Download testing and training data." 
+ "#### STEP#2: Download testing and training data.\n", + "\n", + "For this example, we rely on the MNIST load_data() API which, when called, downloads a total of 70,000 images of handwritten digits - 60,000 for training and 10,000 for testing the neural network model.\n", + "\n", + "For more details on the implementation, refer to: https://github.com/keras-team/keras/blob/master/keras/src/datasets/mnist.py#L10" ] }, { @@ -63,8 +70,6 @@ "source": [ "import tensorflow as tf\n", "import tensorflow.python.keras as keras\n", - "#import matplotlib.pyplot as plt\n", - "#from keras import backend as K\n", "from keras.utils import to_categorical\n", "from keras.datasets import mnist\n", "\n", @@ -169,6 +174,13 @@ "metadata": {}, "source": [ "#### STEP#4: Initialize the Aggregator and Collaborators.\n", + "\n", + "We import `FLSpec`, `LocalRuntime`, and the `aggregator` and `collaborator` placement decorators.\n", + "\n", + "- `FLSpec` – Defines the flow specification. User-defined flows are subclasses of this.\n", + "- `Runtime` – Defines where the flow runs, infrastructure for task transitions (how information gets sent). The `LocalRuntime` runs the flow on a single node.\n", + "- `aggregator/collaborator` – Placement decorators that define where each task will be assigned.\n", + "\n", "Edit collaborator_names to add/remove collaborators." ] }, @@ -233,7 +245,9 @@ "id": "e6ba622b", "metadata": {}, "source": [ - "#### STEP#5: Define the workflow needed to train the model using the data and participants." + "#### STEP#5: Define the workflow needed to train the model using the data and participants.\n", + "\n", + "Now we come to the flow definition. The OpenFL Workflow Interface adopts the conventions set by Metaflow: every workflow begins with the `start` task and concludes with the `end` task. The aggregator begins with an optionally passed-in model and optimizer. 
The aggregator begins the flow with the `start` task, where the list of collaborators is extracted from the runtime (`self.collaborators = self.runtime.collaborators`) and is then used as the list of participants to run the task listed in `self.next`, `aggregated_model_validation`. The model, optimizer, and anything that is not explicitly excluded from the next function will be passed from the `start` function on the aggregator to the `aggregated_model_validation` task on the collaborator. Where the tasks run is determined by the placement decorator that precedes each task definition (`@aggregator` or `@collaborator`). Once the collaborators (defined in the runtime) complete the `aggregated_model_validation` task, they pass their current state onto the `train` task, from `train` to `local_model_validation`, and then finally to `join` at the aggregator. It is in `join` that an average is taken of the model weights, and the next round can begin." ] }, { @@ -315,26 +329,20 @@ " print(f\"Final Loss, Accuracy numbers: Avg. loss: {loss}, Accuracy: {accuracy_percentage:.2f}%\")" ] }, - { - "cell_type": "markdown", - "id": "f0ff1a9f", - "metadata": {}, - "source": [ - "At this point we are ready to train the model with the dataset downloaded from MNIST. " - ] - }, { "cell_type": "markdown", "id": "cb67be11", "metadata": {}, "source": [ - "#### STEP6: Call KerasMNISTWorkflow to train the model." + "#### STEP#6: Call KerasMNISTWorkflow to train the model.\n", + "\n", + "At this point we are ready to train the model with the dataset downloaded from MNIST. " ] }, { "cell_type": "code", "execution_count": null, - "id": "e6c73353", + "id": "366ee972", "metadata": {}, "outputs": [], "source": [
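The notebook's data-preparation step (STEP#2/#3 above) one-hot encodes the integer MNIST labels via Keras' `to_categorical` before training. As a rough illustration of what that encoding produces, here is a NumPy-only sketch; `one_hot` is a hypothetical helper written for this example, not the notebook's actual code:

```python
import numpy as np

def one_hot(labels, num_classes=10):
    # Build an all-zeros matrix, then set a single 1.0 per row at the
    # column given by that row's integer label -- the same layout
    # keras.utils.to_categorical returns for MNIST digit labels.
    out = np.zeros((len(labels), num_classes), dtype=np.float32)
    out[np.arange(len(labels)), labels] = 1.0
    return out

labels = np.array([0, 3, 9])   # three example digit labels
encoded = one_hot(labels)
print(encoded.shape)           # (3, 10)
```

Each row contains exactly one 1.0, which is what the categorical cross-entropy loss used with such models expects.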
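The `join` step described in the patch takes an average of the model weights returned by the collaborators. A minimal sketch of that aggregation, assuming equal weighting across collaborators (plain NumPy; `fedavg` is a hypothetical helper for illustration, not OpenFL's API):

```python
import numpy as np

def fedavg(collaborator_weights):
    # Each collaborator contributes a list of layer-weight arrays with
    # identical shapes; average each layer element-wise across them.
    return [np.mean(np.stack(layer_copies), axis=0)
            for layer_copies in zip(*collaborator_weights)]

# Two hypothetical collaborators, each holding two weight arrays.
w_a = [np.array([1.0, 2.0]), np.array([[1.0]])]
w_b = [np.array([3.0, 4.0]), np.array([[3.0]])]
averaged = fedavg([w_a, w_b])
print(averaged[0])  # [2. 3.]
```

In a real federation the average is often weighted by each collaborator's dataset size; the equal-weight version above is only meant to show the shape of the computation that lets the next round begin from a common model.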