diff --git a/docs/contents/ai/socratiq.html b/docs/contents/ai/socratiq.html index c20e2327..73b009a7 100644 --- a/docs/contents/ai/socratiq.html +++ b/docs/contents/ai/socratiq.html @@ -674,9 +674,9 @@

Quick Start Guide

Button Overview

The top nav bar provides quick access to the following features:

  1. Adjust your settings at any time.
  2. Track your progress by viewing the dashboard.
  3. Start new or save your conversations with SocratiQ.
diff --git a/docs/contents/core/frameworks/frameworks.html b/docs/contents/core/frameworks/frameworks.html index 3bfef71e..dad5cfa7 100644 --- a/docs/contents/core/frameworks/frameworks.html +++ b/docs/contents/core/frameworks/frameworks.html @@ -1610,7 +1610,7 @@

6.9 Choosing the Right Framework

Choosing the right machine learning framework for a given application requires carefully evaluating models, hardware, and software considerations. Figure 6.13 provides a comparison of different TensorFlow frameworks, which we’ll discuss in more detail:

-
+
@@ -1630,7 +1630,7 @@

6.9.2 Software

As shown in Figure 6.14, TensorFlow Lite Micro does not have OS support, while TensorFlow and TensorFlow Lite do. This design choice helps TensorFlow Lite Micro reduce memory overhead, shorten startup times, and consume less energy. Instead, TensorFlow Lite Micro can be used in conjunction with real-time operating systems (RTOS) like FreeRTOS, Zephyr, and Mbed OS.

The figure also highlights an important memory management feature: TensorFlow Lite and TensorFlow Lite Micro support model memory mapping, allowing models to be directly accessed from flash storage rather than loaded into RAM. In contrast, TensorFlow does not offer this capability.

-
+
@@ -1646,7 +1646,7 @@

6.9.3 Hardware

TensorFlow Lite and TensorFlow Lite Micro have significantly smaller base binary sizes and memory footprints than TensorFlow (see Figure 6.15). For example, a typical TensorFlow Lite Micro binary is less than 200KB, whereas TensorFlow is much larger. These smaller footprints reflect the resource-constrained embedded environments the two lightweight frameworks target. TensorFlow, by contrast, supports x86 CPUs, TPUs, and GPUs from vendors such as NVIDIA, AMD, and Intel.

-
+
@@ -1693,7 +1693,7 @@

6.10.1 Decomposition

Currently, the ML system stack consists of four abstractions as shown in Figure 6.16, namely (1) computational graphs, (2) tensor programs, (3) libraries and runtimes, and (4) hardware primitives.

-
+
diff --git a/docs/contents/core/ml_systems/images/png/hybrid.png b/docs/contents/core/ml_systems/images/png/hybrid.png new file mode 100644 index 00000000..98dc681a Binary files /dev/null and b/docs/contents/core/ml_systems/images/png/hybrid.png differ diff --git a/docs/contents/core/ml_systems/images/png/venndiagram.png b/docs/contents/core/ml_systems/images/png/venndiagram.png deleted file mode 100644 index 5754133c..00000000 Binary files a/docs/contents/core/ml_systems/images/png/venndiagram.png and /dev/null differ diff --git a/docs/contents/core/ml_systems/ml_systems.html b/docs/contents/core/ml_systems/ml_systems.html index 63c8eb57..0b69e679 100644 --- a/docs/contents/core/ml_systems/ml_systems.html +++ b/docs/contents/core/ml_systems/ml_systems.html @@ -518,7 +518,6 @@

Table of contents

@@ -653,9 +686,9 @@

    -
  • Understand the key characteristics and differences between Cloud ML, Edge ML, and TinyML systems.

  • +
  • Understand the key characteristics and differences between Cloud ML, Edge ML, Mobile ML, and TinyML systems.

  • Analyze the benefits and challenges associated with each ML paradigm.

  • -
  • Explore real-world applications and use cases for Cloud ML, Edge ML, and TinyML.

  • +
  • Explore real-world applications and use cases for Cloud ML, Edge ML, Mobile ML, and TinyML.

  • Compare the performance aspects of each ML approach, including latency, privacy, and resource utilization.

  • Examine the evolving landscape of ML systems and potential future developments.

@@ -665,7 +698,113 @@

2.1 Overview

ML is rapidly evolving, with new paradigms reshaping how models are developed, trained, and deployed. The field is experiencing significant innovation driven by advancements in hardware, software, and algorithmic techniques. These developments are enabling machine learning to be applied in diverse settings, from large-scale cloud infrastructures to edge devices and even tiny, resource-constrained environments.

Modern machine learning systems span a spectrum of deployment options, each with its own set of characteristics and use cases. At one end, we have cloud-based ML, which leverages powerful centralized computing resources for complex, data-intensive tasks. Moving along the spectrum, we encounter edge ML, which brings computation closer to the data source for reduced latency and improved privacy. At the far end, we find TinyML, which enables machine learning on extremely low-power devices with severe memory and processing constraints.

-

This chapter explores the landscape of contemporary machine learning systems, covering three key approaches: Cloud ML, Edge ML, and TinyML. Figure 2.1 illustrates the spectrum of distributed intelligence across these approaches, providing a visual comparison of their characteristics. We will examine the unique characteristics, advantages, and challenges of each approach, as depicted in the figure. Additionally, we will discuss the emerging trends and technologies that are shaping the future of machine learning deployment, considering how they might influence the balance between these three paradigms.

+

To better understand the dramatic differences between these ML deployment options, Table 2.1 provides examples of representative hardware platforms for each category. These examples illustrate the vast range of computational resources, power requirements, and cost considerations across the ML systems spectrum. As we explore each paradigm in detail, you can refer back to these concrete examples to better understand the practical implications of each approach.

+
Table 2.1: Representative hardware platforms across the ML systems spectrum, showing typical specifications and capabilities for each category.

| Category | Example Device | Processor | Memory | Storage | Power | Price Range | Example Models/Tasks |
|---|---|---|---|---|---|---|---|
| Cloud ML | NVIDIA DGX A100 | 8x NVIDIA A100 GPUs (40GB/80GB) | 1TB System RAM | 15TB NVMe SSD | 6.5kW | $200K+ | Large language models (GPT-3), real-time video processing |
| Cloud ML | Google TPU v4 Pod | 4096 TPU v4 chips | 128TB+ | Networked storage | ~MW | Pay-per-use | Training foundation models, large-scale ML research |
| Edge ML | NVIDIA Jetson AGX Orin | 12-core Arm® Cortex®-A78AE, NVIDIA Ampere GPU | 32GB LPDDR5 | 64GB eMMC | 15-60W | $899 | Computer vision, robotics, autonomous systems |
| Edge ML | Intel NUC 12 Pro | Intel Core i7-1260P, Intel Iris Xe | 32GB DDR4 | 1TB SSD | 28W | $750 | Edge AI servers, industrial automation |
| Mobile ML | iPhone 15 Pro | A17 Pro (6-core CPU, 6-core GPU) | 8GB RAM | 128GB-1TB | 3-5W | $999+ | Face ID, computational photography, voice recognition |
| TinyML | Arduino Nano 33 BLE Sense | Arm Cortex-M4 @ 64MHz | 256KB RAM | 1MB Flash | 0.02-0.04W | $35 | Gesture recognition, voice detection |
| TinyML | ESP32-CAM | Dual-core @ 240MHz | 520KB RAM | 4MB Flash | 0.05-0.25W | $10 | Image classification, motion detection |

This chapter explores the landscape of contemporary machine learning systems, covering four key approaches: Cloud ML, Edge ML, Mobile ML, and TinyML. Figure 2.1 illustrates the spectrum of distributed intelligence across these approaches, providing a visual comparison of their characteristics. We will examine the unique characteristics, advantages, and challenges of each approach, as depicted in the figure. Additionally, we will discuss the emerging trends and technologies that are shaping the future of machine learning deployment, considering how they might influence the balance between these four paradigms.

@@ -676,20 +815,20 @@

-
  • Cloud ML: Initially, ML was predominantly cloud-based. Powerful servers in data centers were used to train and run large ML models. This approach leverages vast computational resources and storage capacities, enabling the development of complex models trained on massive datasets. Cloud ML excels at tasks requiring extensive processing power and is ideal for applications where real-time responsiveness isn’t critical.

  • -
  • Edge ML: As the need for real-time, low-latency processing grew, Edge ML emerged. This paradigm brings inference capabilities closer to the data source, typically on edge devices such as smartphones, smart cameras, or IoT gateways. Edge ML reduces latency, enhances privacy by keeping data local, and can operate with intermittent cloud connectivity. It’s particularly useful for applications requiring quick responses or handling sensitive data.

  • -
  • TinyML: The latest development in this progression is TinyML, which enables ML models to run on extremely resource-constrained microcontrollers and small embedded systems. TinyML allows for on-device inference without relying on connectivity to the cloud or edge, opening up new possibilities for intelligent, battery-operated devices. This approach is crucial for applications where size, power consumption, and cost are critical factors.

  • - +

    The evolution of machine learning systems can be seen as a progression from centralized to increasingly distributed and specialized computing paradigms:

    +

    Cloud ML: Initially, ML was predominantly cloud-based. Powerful, scalable servers in data centers are used to train and run large ML models. This approach leverages vast computational resources and storage capacities, enabling the development of complex models trained on massive datasets. Cloud ML excels at tasks requiring extensive processing power, distributed training of large models, and is ideal for applications where real-time responsiveness isn’t critical. Popular platforms like AWS SageMaker, Google Cloud AI, and Azure ML offer flexible, scalable solutions for model development, training, and deployment. Cloud ML can handle models with billions of parameters, training on petabytes of data, but may incur latencies of 100-500ms for online inference due to network delays.

    +

    Edge ML: As the need for real-time, low-latency processing grew, Edge ML emerged. This paradigm brings inference capabilities closer to the data source, typically on edge devices such as industrial gateways, smart cameras, autonomous vehicles, or IoT hubs. Edge ML reduces latency (often to less than 50ms), enhances privacy by keeping data local, and can operate with intermittent cloud connectivity. It’s particularly useful for applications requiring quick responses or handling sensitive data in industrial or enterprise settings. Frameworks like NVIDIA Jetson or Google’s Edge TPU enable powerful ML capabilities on edge devices. Edge ML plays a crucial role in IoT ecosystems, enabling real-time decision making and reducing bandwidth usage by processing data locally.

    +

    Mobile ML: Building on edge computing concepts, Mobile ML focuses on leveraging the computational capabilities of smartphones and tablets. This approach enables personalized, responsive applications while reducing reliance on constant network connectivity. Mobile ML offers a balance between the power of edge computing and the ubiquity of personal devices. It utilizes on-device sensors (e.g., cameras, GPS, accelerometers) for unique ML applications. Frameworks like TensorFlow Lite and Core ML allow developers to deploy optimized models on mobile devices, with inference times often under 30ms for common tasks. Mobile ML enhances privacy by keeping personal data on the device and can operate offline, but must balance model performance with device resource constraints (typically 4-8GB RAM, 100-200GB storage).

    +

    TinyML: The latest development in this progression is TinyML, which enables ML models to run on extremely resource-constrained microcontrollers and small embedded systems. TinyML allows for on-device inference without relying on connectivity to the cloud, edge, or even the processing power of mobile devices. This approach is crucial for applications where size, power consumption, and cost are critical factors. TinyML devices typically operate with less than 1MB of RAM and flash memory, consuming only milliwatts of power, enabling battery life of months or years. Applications include wake word detection, gesture recognition, and predictive maintenance in industrial settings. Platforms like Arduino Nano 33 BLE Sense and STM32 microcontrollers, coupled with frameworks like TensorFlow Lite for Microcontrollers, enable ML on these tiny devices. However, TinyML requires significant model optimization and quantization to fit within these constraints.
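The TinyML item above notes that models must be heavily optimized and quantized to fit these constraints. Below is a minimal sketch of that step using TensorFlow's TFLite converter; the tiny Keras model and the random calibration data are toy stand-ins for a cloud-trained model and real training samples, not a specific production recipe.

```python
import numpy as np
import tensorflow as tf

# Toy stand-in for a model trained in the cloud.
model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(96, 96, 1)),
    tf.keras.layers.Conv2D(8, 3, activation="relu"),
    tf.keras.layers.GlobalAveragePooling2D(),
    tf.keras.layers.Dense(4, activation="softmax"),
])

def representative_data():
    # Calibration samples; in practice, draw from the training set.
    for _ in range(100):
        yield [np.random.rand(1, 96, 96, 1).astype(np.float32)]

converter = tf.lite.TFLiteConverter.from_keras_model(model)
converter.optimizations = [tf.lite.Optimize.DEFAULT]
converter.representative_dataset = representative_data
# Force integer-only ops so the model can run on int8-only MCUs.
converter.target_spec.supported_ops = [tf.lite.OpsSet.TFLITE_BUILTINS_INT8]
converter.inference_input_type = tf.int8
converter.inference_output_type = tf.int8

tflite_model = converter.convert()
print(f"Quantized model size: {len(tflite_model) / 1024:.1f} KB")
```

Full-integer quantization stores each weight in one byte instead of four, and the representative dataset lets the converter pick activation ranges that limit the accuracy cost; the resulting flatbuffer is what a framework like TensorFlow Lite for Microcontrollers executes on-device.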

    Each of these paradigms has its own strengths and is suited to different use cases:

    • Cloud ML remains essential for tasks requiring massive computational power or large-scale data analysis.
    • -
    • Edge ML is ideal for applications needing low-latency responses or local data processing.
    • +
    • Edge ML is ideal for applications needing low-latency responses or local data processing in industrial or enterprise environments.
    • +
    • Mobile ML is suited for personalized, responsive applications on smartphones and tablets.
    • TinyML enables AI capabilities in small, power-efficient devices, expanding the reach of ML to new domains.
    -

    The progression from Cloud to Edge to TinyML reflects a broader trend in computing towards more distributed, localized processing. This evolution is driven by the need for faster response times, improved privacy, reduced bandwidth usage, and the ability to operate in environments with limited or no connectivity.

    -

    Figure 2.2 illustrates the key differences between Cloud ML, Edge ML, and TinyML in terms of hardware, latency, connectivity, power requirements, and model complexity. As we move from Cloud to Edge to TinyML, we see a dramatic reduction in available resources, which presents significant challenges for deploying sophisticated machine learning models. This resource disparity becomes particularly apparent when attempting to deploy deep learning models on microcontrollers, the primary hardware platform for TinyML. These tiny devices have severely constrained memory and storage capacities, which are often insufficient for conventional deep learning models. We will learn to put these things into perspective in this chapter.

    +

    This progression reflects a broader trend in computing towards more distributed, localized, and specialized processing. The evolution is driven by the need for faster response times, improved privacy, reduced bandwidth usage, and the ability to operate in environments with limited or no connectivity, while also catering to the specific capabilities and constraints of different types of devices.

    +

Figure 2.2 illustrates the key differences between Cloud ML, Edge ML, Mobile ML, and TinyML in terms of hardware, latency, connectivity, power requirements, and model complexity. As we move from Cloud through Edge and Mobile to TinyML, we see a dramatic reduction in available resources, which presents significant challenges for deploying sophisticated machine learning models. This resource disparity becomes particularly apparent when attempting to deploy deep learning models on microcontrollers, the primary hardware platform for TinyML. These tiny devices have severely constrained memory and storage capacities, which are often insufficient for conventional deep learning models. This chapter will put these differences into perspective.
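A quick back-of-envelope calculation makes this disparity concrete. The parameter counts below are approximate, and weights are only part of the footprint (activations and runtime overhead add more), but the orders of magnitude are what matter:

```python
# Rough model memory footprint: parameter count x bytes per parameter.
models = {
    "ResNet-50 (~25.6M params)": 25_600_000,
    "MobileNetV2 (~3.5M params)": 3_500_000,
    "Keyword-spotting CNN (~50K params)": 50_000,
}
mcu_ram_kb = 256  # e.g., Arduino Nano 33 BLE Sense

for name, params in models.items():
    fp32_kb = params * 4 / 1024  # float32: 4 bytes per weight
    int8_kb = params / 1024      # int8-quantized: 1 byte per weight
    fits = int8_kb <= mcu_ram_kb
    print(f"{name}: {fp32_kb:,.0f} KB fp32, {int8_kb:,.0f} KB int8, "
          f"fits in {mcu_ram_kb} KB MCU RAM: {fits}")
```

Even quantized to int8, a mobile-class model like MobileNetV2 needs thousands of kilobytes, more than ten times the entire RAM of a typical TinyML board, which is why TinyML models are designed from scratch at the tens-of-kilobytes scale.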

    @@ -705,7 +844,7 @@

    2.2 Cloud ML

    -

    Cloud ML leverages powerful servers in the cloud for training and running large, complex ML models and relies on internet connectivity. Figure 2.3 provides an overview of Cloud ML’s capabilities which we will discuss in greater detail throughout this section.

    +

    Cloud Machine Learning (Cloud ML) is a subfield of machine learning that leverages the power and scalability of cloud computing infrastructure to develop, train, and deploy machine learning models. By utilizing the vast computational resources available in the cloud, Cloud ML enables the efficient handling of large-scale datasets and complex machine learning algorithms. Figure 2.3 provides an overview of Cloud ML’s capabilities which we will discuss in greater detail throughout this section.

    @@ -718,10 +857,6 @@

    2.2.1 Characteristics

    -
    -

    Definition of Cloud ML

    -

    Cloud Machine Learning (Cloud ML) is a subfield of machine learning that leverages the power and scalability of cloud computing infrastructure to develop, train, and deploy machine learning models. By utilizing the vast computational resources available in the cloud, Cloud ML enables the efficient handling of large-scale datasets and complex machine learning algorithms.

    -

    Centralized Infrastructure

    One of the key characteristics of Cloud ML is its centralized infrastructure. Figure 2.4 illustrates this concept with an example from Google’s Cloud TPU data center. Cloud service providers offer a virtual platform that consists of high-capacity servers, expansive storage solutions, and robust networking architectures, all housed in data centers distributed across the globe. As shown in the figure, these centralized facilities can be massive in scale, housing rows upon rows of specialized hardware. This centralized setup allows for the pooling and efficient management of computational resources, making it easier to scale machine learning projects as needed.

    @@ -831,10 +966,6 @@

    Security an

    2.3 Edge ML

    -
    -

    2.3.1 Characteristics

    -
    -

    Definition of Edge ML

    Edge Machine Learning (Edge ML) runs machine learning algorithms directly on endpoint devices or closer to where the data is generated rather than relying on centralized cloud servers. This approach brings computation closer to the data source, reducing the need to send large volumes of data over networks, often resulting in lower latency and improved data privacy. Figure 2.5 provides an overview of this section.

    @@ -846,7 +977,8 @@

    Definition of Edge M

    -
    +
    +

    2.3.1 Characteristics

    Decentralized Data Processing

    In Edge ML, data processing happens in a decentralized fashion, as illustrated in Figure 2.6. Instead of sending data to remote servers, the data is processed locally on devices like smartphones, tablets, or Internet of Things (IoT) devices. The figure showcases various examples of these edge devices, including wearables, industrial sensors, and smart home appliances. This local processing allows devices to make quick decisions based on the data they collect without relying heavily on a central server’s resources.

    @@ -914,12 +1046,73 @@

    Industrial IoT

    -
    -

    2.4 Tiny ML

    +
    +

    2.4 Mobile ML

    +

    Mobile Machine Learning (Mobile ML) represents a specialized branch of Edge ML that focuses on deploying and running machine learning models directly on mobile devices like smartphones and tablets. This approach leverages the computational capabilities of modern mobile processors to perform ML tasks locally, offering a balance between the power of edge computing and the ubiquity of personal devices.

    2.4.1 Characteristics

    -
    -

    Definition of TinyML

    +
    +

    On-Device Processing

    +

    Mobile ML utilizes the processing power of mobile devices’ System-on-Chip (SoC) architectures, including specialized Neural Processing Units (NPUs) and AI accelerators. This enables efficient execution of ML models directly on the device, allowing for real-time processing of data from device sensors like cameras, microphones, and motion sensors without constant cloud connectivity.
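As a concrete illustration, here is a hedged sketch of on-device inference using TensorFlow Lite's Python interpreter. The model file name is hypothetical, and on a real phone the same flow runs through TensorFlow Lite's Android/iOS bindings or Core ML rather than Python:

```python
import numpy as np
import tensorflow as tf

# Hypothetical model file; on a phone it would ship inside the app bundle.
interpreter = tf.lite.Interpreter(model_path="vision_model.tflite")
interpreter.allocate_tensors()

input_details = interpreter.get_input_details()
output_details = interpreter.get_output_details()

# Stand-in for a camera frame, matched to the model's input shape/dtype.
frame = np.zeros(input_details[0]["shape"],
                 dtype=input_details[0]["dtype"])

interpreter.set_tensor(input_details[0]["index"], frame)
interpreter.invoke()  # runs entirely on-device, no network round trip
scores = interpreter.get_tensor(output_details[0]["index"])
print("Predicted class:", int(np.argmax(scores)))
```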

    +
    +
    +

    Optimized Frameworks

    +

    Mobile ML is supported by specialized frameworks and tools designed specifically for mobile deployment, such as TensorFlow Lite for Android devices and Core ML for iOS devices. These frameworks are optimized for mobile hardware and provide efficient model compression and quantization techniques to ensure smooth performance within mobile resource constraints.

    +
    +
    +
    +

    2.4.2 Benefits

    +
    +

    Real-Time Processing

    +

    Mobile ML enables real-time processing of data directly on mobile devices, eliminating the need for constant server communication. This results in faster response times for applications requiring immediate feedback, such as real-time translation, face detection, or gesture recognition.

    +
    +
    +

    Privacy Preservation

    +

    By processing data locally on the device, Mobile ML helps maintain user privacy. Sensitive information doesn’t need to leave the device, reducing the risk of data breaches and addressing privacy concerns, particularly important for applications handling personal data.

    +
    +
    +

    Offline Functionality

    +

    Mobile ML applications can function without constant internet connectivity, making them reliable in areas with poor network coverage or when users are offline. This ensures consistent performance and user experience regardless of network conditions.

    +
    +
    +
    +

    2.4.3 Challenges

    +
    +

    Resource Constraints

    +

Although modern mobile devices are powerful, they still face resource constraints compared to cloud servers. Mobile ML must operate within limited RAM, storage, and processing power, requiring careful optimization of models and efficient resource management.

    +
    +
    +

    Battery Life Impact

    +

    ML operations can be computationally intensive, potentially impacting device battery life. Developers must balance model complexity and performance with power consumption to ensure reasonable battery life for users.

    +
    +
    +

    Model Size Limitations

    +

    Mobile devices have limited storage space, necessitating careful consideration of model size. This often requires model compression and quantization techniques, which can affect model accuracy and performance.

    +
    +
    +
    +

    2.4.4 Example Use Cases

    +
    +

    Computer Vision Applications

    +

    Mobile ML has revolutionized how we use cameras on mobile devices, enabling sophisticated computer vision applications that process visual data in real-time. Modern smartphone cameras now incorporate ML models that can detect faces, analyze scenes, and apply complex filters instantaneously. These models work directly on the camera feed to enable features like portrait mode photography, where ML algorithms separate foreground subjects from backgrounds. Document scanning applications use ML to detect paper edges, correct perspective, and enhance text readability, while augmented reality applications use ML-powered object detection to accurately place virtual objects in the real world.

    +
    +
    +

    Natural Language Processing

    +

    Natural language processing on mobile devices has transformed how we interact with our phones and communicate with others. Speech recognition models run directly on device, enabling voice assistants to respond quickly to commands even without internet connectivity. Real-time translation applications can now translate conversations and text without sending data to the cloud, preserving privacy and working reliably regardless of network conditions. Mobile keyboards have become increasingly intelligent, using ML to predict not just the next word but entire phrases based on the user’s writing style and context, while maintaining all learning and personalization locally on the device.

    +
    +
    +

    Health and Fitness Monitoring

    +

    Mobile ML has enabled smartphones and tablets to become sophisticated health monitoring devices. Through clever use of existing sensors combined with ML models, mobile devices can now track physical activity, analyze sleep patterns, and monitor vital signs. For example, cameras can measure heart rate by detecting subtle color changes in the user’s skin, while accelerometers and ML models work together to recognize specific exercises and analyze workout form. These applications process sensitive health data directly on the device, ensuring privacy while providing users with real-time feedback and personalized health insights.
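The camera-based heart-rate measurement mentioned above relies on photoplethysmography: skin color varies subtly with each pulse, so the dominant frequency of the green channel over time reveals the heart rate. Below is a sketch of the core estimation step, assuming the per-frame average green intensity has already been extracted (synthesized here for illustration):

```python
import numpy as np

fs = 30.0                      # assumed camera frame rate (Hz)
t = np.arange(0, 20, 1 / fs)   # 20 seconds of video
# Synthesized stand-in for the mean green-channel intensity per frame:
# a 72 bpm pulse (1.2 Hz) buried in sensor noise.
signal = 0.02 * np.sin(2 * np.pi * 1.2 * t) \
         + np.random.normal(0, 0.05, t.size)

# Find the dominant frequency in the plausible heart-rate band
# (0.7-3.5 Hz, i.e., roughly 42-210 bpm).
spectrum = np.abs(np.fft.rfft(signal - signal.mean()))
freqs = np.fft.rfftfreq(signal.size, d=1 / fs)
band = (freqs > 0.7) & (freqs < 3.5)
bpm = 60 * freqs[band][np.argmax(spectrum[band])]
print(f"Estimated heart rate: {bpm:.0f} bpm")
```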

    +
    +
    +

    Personalization and User Experience

    +

    Perhaps the most pervasive but least visible application of Mobile ML lies in how it personalizes and enhances the overall user experience. ML models continuously analyze how users interact with their devices to optimize everything from battery usage to interface layouts. These models learn individual usage patterns to predict which apps users are likely to open next, preload content they might want to see, and adjust system settings like screen brightness and audio levels based on environmental conditions and user preferences. This creates a deeply personalized experience that adapts to each user’s needs while maintaining privacy by keeping all learning and adaptation on the device itself.

    +

    These applications demonstrate how Mobile ML bridges the gap between cloud-based solutions and edge computing, providing efficient, privacy-conscious, and user-friendly machine learning capabilities on personal mobile devices. The continuous advancement in mobile hardware capabilities and optimization techniques continues to expand the possibilities for Mobile ML applications.

    +
    +
    +
    +
    +

    2.5 Tiny ML

    TinyML sits at the crossroads of embedded systems and machine learning, representing a burgeoning field that brings smart algorithms directly to tiny microcontrollers and sensors. These microcontrollers operate under severe resource constraints, particularly regarding memory, storage, and computational power. Figure 2.7 encapsulates the key aspects of TinyML discussed in this section.

    @@ -931,10 +1124,11 @@

    Definition of TinyML<

    -
    +
    +

    2.5.1 Characteristics

    On-Device Machine Learning

    -

    In TinyML, the focus is on on-device machine learning. This means that machine learning models are deployed and trained on the device, eliminating the need for external servers or cloud infrastructures. This allows TinyML to enable intelligent decision-making right where the data is generated, making real-time insights and actions possible, even in settings where connectivity is limited or unavailable.

    +

    In TinyML, the focus, much like in Mobile ML, is on on-device machine learning. This means that machine learning models are deployed and trained on the device, eliminating the need for external servers or cloud infrastructures. This allows TinyML to enable intelligent decision-making right where the data is generated, making real-time insights and actions possible, even in settings where connectivity is limited or unavailable.

    Low Power and Resource-Constrained Environments

    @@ -968,8 +1162,8 @@

    -

    2.4.2 Benefits

    +
    +

    2.5.2 Benefits

    Extremely Low Latency

    One of the standout benefits of TinyML is its ability to offer ultra-low latency. Since computation occurs directly on the device, the time required to send data to external servers and receive a response is eliminated. This is crucial in applications requiring immediate decision-making, enabling quick responses to changing conditions.

    @@ -983,8 +1177,8 @@

    Energy Efficiency

    TinyML operates within an energy-efficient framework, a necessity given its resource-constrained environments. By employing lean algorithms and optimized computational methods, TinyML ensures that devices can execute complex tasks without rapidly depleting battery life, making it a sustainable option for long-term deployments.

    -
    -

    2.4.3 Challenges

    +
    +

    2.5.3 Challenges

    Limited Computational Capabilities

    However, the shift to TinyML comes with its set of hurdles. The primary limitation is the devices’ constrained computational capabilities. The need to operate within such limits means that deployed models must be simplified, which could affect the accuracy and sophistication of the solutions.

    @@ -998,8 +1192,8 @@

    Model O

A central challenge in TinyML is model optimization and compression. Creating machine learning models that can operate effectively within the limited memory and computational power of microcontrollers requires innovative approaches to model design. Developers must strike a delicate balance, optimizing models to remain effective while fitting within stringent resource constraints.

    -
    -

    2.4.4 Example Use Cases

    +
    +

    2.5.4 Example Use Cases

    Wearable Devices

    In wearables, TinyML opens the door to smarter, more responsive gadgets. From fitness trackers offering real-time workout feedback to smart glasses processing visual data on the fly, TinyML transforms how we engage with wearable tech, delivering personalized experiences directly from the device.

    @@ -1019,107 +1213,208 @@

    Environmental Mon

    -
    -

    2.5 Comparison

    -

    Let’s bring together the different ML variants we’ve explored individually for a comprehensive view. Figure 2.9 illustrates the relationships and overlaps between Cloud ML, Edge ML, and TinyML using a Venn diagram. This visual representation effectively highlights the unique characteristics of each approach while also showing areas of commonality. Each ML paradigm has its own distinct features, but there are also intersections where these approaches share certain attributes or capabilities. This diagram helps us understand how these variants relate to each other in the broader landscape of machine learning implementations.

    -
    +
    +

    2.6 Hybrid ML

    +

    While we’ve examined Cloud ML, Edge ML, Mobile ML, and TinyML as distinct approaches, the reality of modern ML deployments is more nuanced. Systems architects often combine these paradigms to create solutions that leverage the strengths of each approach while mitigating their individual limitations. Understanding how these systems can work together opens up new possibilities for building more efficient and effective ML applications.

    +
    +

    2.6.1 Train-Serve Split

    +

    One of the most common hybrid patterns is the train-serve split, where model training occurs in the cloud but inference happens on edge, mobile, or tiny devices. This pattern takes advantage of the cloud’s vast computational resources for the training phase while benefiting from the low latency and privacy advantages of on-device inference. For example, smart home devices often use models trained on large datasets in the cloud but run inference locally to ensure quick response times and protect user privacy. In practice, this might involve training models on powerful systems like the NVIDIA DGX A100, leveraging its 8 A100 GPUs and terabyte-scale memory, before deploying optimized versions to edge devices like the NVIDIA Jetson AGX Orin for efficient inference. Similarly, mobile vision models for computational photography are typically trained on powerful cloud infrastructure but deployed to run efficiently on phone hardware.
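A minimal sketch of the pattern with TensorFlow and TensorFlow Lite follows; the dataset and model are toy stand-ins, and the point is the handoff: heavy training on cloud hardware, then a compact exported artifact served on-device:

```python
import numpy as np
import tensorflow as tf

# --- Cloud side: train on ample compute, export a compact artifact ---
x = np.random.rand(1000, 16).astype(np.float32)  # toy dataset stand-in
y = (x.sum(axis=1) > 8).astype(np.float32)

model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(16,)),
    tf.keras.layers.Dense(32, activation="relu"),
    tf.keras.layers.Dense(1, activation="sigmoid"),
])
model.compile(optimizer="adam", loss="binary_crossentropy")
model.fit(x, y, epochs=3, verbose=0)

tflite_bytes = tf.lite.TFLiteConverter.from_keras_model(model).convert()

# --- Device side: load the exported model, serve predictions locally ---
interpreter = tf.lite.Interpreter(model_content=tflite_bytes)
interpreter.allocate_tensors()
inp = interpreter.get_input_details()[0]
out = interpreter.get_output_details()[0]
interpreter.set_tensor(inp["index"], x[:1])
interpreter.invoke()
print("On-device prediction:", interpreter.get_tensor(out["index"]))
```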

    +
    +
    +

    2.6.2 Hierarchical Processing

    +

    Hierarchical processing creates a multi-tier system where data and intelligence flow between different levels of the ML stack. In industrial IoT applications, tiny sensors might perform basic anomaly detection, edge devices aggregate and analyze data from multiple sensors, and cloud systems handle complex analytics and model updates. For instance, we might see ESP32-CAM devices performing basic image classification at the sensor level with their minimal 520KB RAM, feeding data up to Jetson AGX Orin devices for more sophisticated computer vision tasks, and ultimately connecting to cloud infrastructure for complex analytics and model updates.

    +

    This hierarchy allows each tier to handle tasks appropriate to its capabilities—TinyML devices handle immediate, simple decisions; edge devices manage local coordination; and cloud systems tackle complex analytics and learning tasks. Smart city installations often use this pattern, with street-level sensors feeding data to neighborhood-level edge processors, which in turn connect to city-wide cloud analytics.
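Schematically, the tiers form a filtering pipeline in which each level forwards only summaries upward. The sketch below uses plain functions as stand-ins for the sensor, edge, and cloud components; all names and thresholds are hypothetical:

```python
import statistics

def sensor_tier(reading, threshold=2.5):
    """TinyML tier: cheap on-sensor anomaly check."""
    return {"value": reading, "anomaly": abs(reading) > threshold}

def edge_tier(events):
    """Edge tier: aggregate nearby sensors, escalate only summaries."""
    anomalies = [e for e in events if e["anomaly"]]
    return {"count": len(events),
            "anomaly_rate": len(anomalies) / len(events),
            "mean": statistics.mean(e["value"] for e in events)}

def cloud_tier(site_summaries):
    """Cloud tier: fleet-wide analytics over edge summaries."""
    return max(site_summaries, key=lambda s: s["anomaly_rate"])

readings = [0.3, 0.1, 3.1, 0.2, -2.9, 0.4]          # one site's sensors
site_summary = edge_tier([sensor_tier(r) for r in readings])
worst_site = cloud_tier([site_summary])             # one site shown here
print(site_summary)
print("Needs attention:", worst_site)
```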

    +
    +
    +

    2.6.3 Federated Learning

    +

    Federated learning represents a sophisticated hybrid approach where model training is distributed across many edge or mobile devices while maintaining privacy. Devices learn from local data and share model updates, rather than raw data, with cloud servers that aggregate these updates into an improved global model. This pattern is particularly powerful for applications like keyboard prediction on mobile devices or healthcare analytics, where privacy is paramount but benefits from collective learning are valuable. The cloud coordinates the learning process without directly accessing sensitive data, while devices benefit from the collective intelligence of the network.
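The heart of this approach is the server-side aggregation step popularized as federated averaging (FedAvg). A minimal numpy sketch is shown below, with locally trained client models simulated as small random perturbations of a global model; only these weights, never the raw data, reach the server:

```python
import numpy as np

def federated_average(client_weights, client_sizes):
    """Server step: average client models, weighted by local dataset size."""
    total = sum(client_sizes)
    return [sum(w[layer] * (n / total)
                for w, n in zip(client_weights, client_sizes))
            for layer in range(len(client_weights[0]))]

rng = np.random.default_rng(0)
global_model = [rng.normal(size=(4, 2)), rng.normal(size=(2,))]

# Simulated: three devices each return a locally updated copy of the
# model (random perturbations stand in for local training here).
clients = [[layer + rng.normal(scale=0.1, size=layer.shape)
            for layer in global_model] for _ in range(3)]
sizes = [120, 45, 300]  # local example counts: more data, bigger vote

global_model = federated_average(clients, sizes)
print([layer.shape for layer in global_model])
```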

    +
    +
    +

    2.6.4 Progressive Deployment

    +

    Progressive deployment strategies adapt models for different computational tiers, creating a cascade of increasingly lightweight versions. A model might start as a large, complex version in the cloud, then be progressively compressed and optimized for edge servers, mobile devices, and finally tiny sensors. Voice assistant systems often employ this pattern—full natural language processing runs in the cloud, while simplified wake-word detection runs on-device. This allows the system to balance capability and resource constraints across the ML stack.
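The voice-assistant example can be sketched as a gating cascade in which each tier runs only when the cheaper tier below it escalates. All functions here are hypothetical stand-ins for real models:

```python
def tiny_wake_word(audio_frame):
    """TinyML tier: always-on, KB-scale detector; cheap yes/no answer."""
    return audio_frame.get("energy", 0.0) > 0.8  # stand-in for a model

def on_device_asr(utterance):
    """Mobile tier: mid-size model handles common commands locally."""
    return {"text": utterance, "confidence": 0.6}

def cloud_nlu(utterance):
    """Cloud tier: full natural-language understanding."""
    return {"intent": "set_timer", "slots": {"minutes": 5}}

def assistant(audio_frame, utterance):
    if not tiny_wake_word(audio_frame):
        return None                     # most frames stop here, at mW cost
    result = on_device_asr(utterance)
    if result["confidence"] < 0.9:      # escalate only hard requests
        return cloud_nlu(utterance)
    return result

print(assistant({"energy": 0.9}, "set a timer for five minutes"))
```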

    +
    +
    +

    2.6.5 Collaborative Learning

    +

    Collaborative learning enables peer-to-peer learning between devices at the same tier, often complementing hierarchical structures. Autonomous vehicle fleets, for example, might share learning about road conditions or traffic patterns directly between vehicles while also communicating with cloud infrastructure. This horizontal collaboration allows systems to share time-sensitive information and learn from each other’s experiences without always routing through central servers.

    +

    These hybrid patterns demonstrate how modern ML systems are evolving beyond simple client-server architectures into rich, multi-tier systems that combine the strengths of different approaches. By understanding these patterns, system architects can design solutions that effectively balance competing demands for computation, latency, privacy, and power efficiency. The future of ML systems likely lies not in choosing between cloud, edge, mobile, or tiny approaches, but in creatively combining them to build more capable and efficient systems.

    +
    +
    +

    2.6.6 Real-World Integration Patterns

    +

    In practice, ML systems rarely operate in isolation. Instead, they form interconnected networks where each paradigm—Cloud, Edge, Mobile, and TinyML—plays a specific role while communicating with other parts of the system. These interactions follow distinct patterns that emerge from the inherent strengths and limitations of each approach.

    +

    Cloud systems excel at training and analytics but require significant infrastructure. Edge systems provide local processing power and reduced latency. Mobile devices offer personal computing capabilities and user interaction. TinyML enables intelligence in the smallest devices and sensors.

    +
-Figure 2.9: ML Venn diagram. Source: arXiv
+Figure 2.9: Example interaction patterns between ML paradigms, showing data flows, model deployment, and processing relationships across Cloud, Edge, Mobile, and TinyML systems.

    For a more detailed comparison of these ML variants, we can refer to Table 2.1. This table offers a comprehensive analysis of Cloud ML, Edge ML, and TinyML based on various features and aspects. By examining these different characteristics side by side, we gain a clearer perspective on the unique advantages and distinguishing factors of each approach. This detailed comparison, combined with the visual overview provided by the Venn diagram, aids in making informed decisions based on the specific needs and constraints of a given application or project.

    +

    Figure 2.9 illustrates the key interactions between these different ML paradigms. Notice how data flows upward from sensors through processing layers to cloud analytics, while model deployments flow downward from cloud training to various inference points. The interactions aren’t strictly hierarchical—mobile devices might communicate directly with both cloud services and tiny sensors, while edge systems can assist mobile devices with complex processing tasks. To understand how these interactions manifest in real applications, let’s explore several common scenarios using Figure 2.9:

• Model Deployment Scenario: A company develops a computer vision model for defect detection. After training in the cloud, optimized versions are deployed to edge servers in factories, quality control tablets on the production floor, and tiny cameras embedded in the production line. This showcases how a single ML solution can be distributed across different computational tiers for optimal performance.

• Data Flow and Analysis Scenario: In a smart agriculture system, soil sensors (TinyML) collect moisture and nutrient data, sending results to edge processors in local stations. These process the data and forward insights to the cloud for farm-wide analytics, while also sharing results with farmers’ mobile apps. This demonstrates the hierarchical flow of data from sensors to cloud analytics.

• Edge-Mobile Assistance Scenario: When a mobile app needs to perform complex image processing that exceeds the phone’s capabilities, it connects to a nearby edge server. The edge system helps process the heavier computational tasks, sending back results to enhance the mobile app’s performance. This shows how different ML tiers can cooperate to handle demanding tasks.

• TinyML-Mobile Integration Scenario: A fitness tracker uses TinyML to continuously monitor activity patterns and vital signs. It synchronizes this processed data with the user’s smartphone, which combines it with other health data before sending consolidated updates to the cloud for long-term health analysis. This illustrates the common pattern of tiny devices using mobile devices as gateways to larger networks.

• Multi-Layer Processing Scenario: In a smart retail environment, tiny sensors monitor inventory levels, sending inference results to both edge systems for immediate stock management and mobile devices for staff notifications. The edge systems process this data alongside other store metrics, while the cloud analyzes trends across all store locations. This shows how multiple ML tiers can work together in a complete solution.

    These real-world patterns demonstrate how different ML paradigms naturally complement each other in practice. While each approach has its own strengths, their true power emerges when they work together as an integrated system. By understanding these patterns, system architects can better design solutions that effectively leverage the capabilities of each ML tier while managing their respective constraints.

    +
    +
    +
    +

    2.7 Comparison

    +

Let’s bring together the different ML variants we’ve explored individually for a comprehensive view. Table 2.2 offers a detailed analysis of Cloud ML, Edge ML, Mobile ML, and TinyML across a range of features and aspects. By examining these characteristics side by side, we gain a clearer perspective on the unique advantages and distinguishing factors of each approach, which aids in making informed decisions based on the specific needs and constraints of a given application or project.

-Table 2.1: Comparison of feature aspects across Cloud ML, Edge ML, and TinyML.
+Table 2.2: Comparison of feature aspects across Cloud ML, Edge ML, Mobile ML, and TinyML.
| Aspect | Cloud ML | Edge ML | Mobile ML | TinyML |
|---|---|---|---|---|
| Processing Location | Centralized cloud servers (Data Centers) | Local edge devices (gateways, servers) | Smartphones and tablets | Ultra-low-power microcontrollers and embedded systems |
| Latency | High (100ms-1000ms+) | Moderate (10-100ms) | Low-Moderate (5-50ms) | Very Low (1-10ms) |
| Data Privacy | Basic-Moderate (Data leaves device) | High (Data stays in local network) | High (Data stays on phone) | Very High (Data never leaves sensor) |
| Computational Power | Very High (Multiple GPUs/TPUs) | High (Edge GPUs) | Moderate (Mobile NPUs/GPUs) | Very Low (MCU/tiny processors) |
| Energy Consumption | Very High (kW-MW range) | High (100s W) | Moderate (1-10W) | Very Low (mW range) |
| Scalability | Excellent (virtually unlimited) | Good (limited by edge hardware) | Moderate (per-device scaling) | Limited (fixed hardware) |
| Cost | High ($1000s+/month) | Moderate ($100s-1000s) | Low ($0-10s) | Very Low ($1-10s) |
| Connectivity Required | Constant high-bandwidth | Intermittent | Optional | None |
| Real-time Processing | Dependent on network | Good | Very Good | Excellent |
| Storage Capacity | Unlimited (petabytes+) | Large (terabytes) | Moderate (gigabytes) | Very Limited (kilobytes-megabytes) |
| Primary Use Cases | Big Data Analytics, Training, Complex AI Models | Smart Manufacturing, Video Analytics, IoT Hubs | AR/VR Apps, Mobile Gaming, Photo/Video Processing | Sensor Processing, Gesture Detection, Keyword Spotting |
| Development Complexity | High (cloud expertise needed) | Moderate-High (edge+networking) | Moderate (mobile SDKs) | High (embedded expertise) |
| Deployment Speed | Fast | Moderate | Fast | Slow |
| Hardware Requirements | Cloud infrastructure | Edge servers/gateways | Modern smartphones | MCUs/embedded systems |
| Framework Support | All ML frameworks | Most frameworks | Mobile-optimized (TFLite, CoreML) | TinyML frameworks |
| Model Size Limits | None | Several GB | 10s-100s MB | Bytes-KB range |
| Battery Impact | N/A | N/A | Moderate | Minimal |
| Offline Capability | None | Good | Excellent | Complete |
    @@ -1127,14 +1422,14 @@

    -

    2.6 Conclusion

    +
    +

    2.8 Conclusion

    In this chapter, we’ve offered a panoramic view of the evolving landscape of machine learning, covering cloud, edge, and tiny ML paradigms. Cloud-based machine learning leverages the immense computational resources of cloud platforms to enable powerful and accurate models but comes with limitations, including latency and privacy concerns. Edge ML mitigates these limitations by bringing inference directly to edge devices, offering lower latency and reduced connectivity needs. TinyML takes this further by miniaturizing ML models to run directly on highly resource-constrained devices, opening up a new category of intelligent applications.

    Each approach has its tradeoffs, including model complexity, latency, privacy, and hardware costs. Over time, we anticipate converging these embedded ML approaches, with cloud pre-training facilitating more sophisticated edge and tiny ML implementations. Advances like federated learning and on-device learning will enable embedded devices to refine their models by learning from real-world data.

    The embedded ML landscape is rapidly evolving and poised to enable intelligent applications across a broad spectrum of devices and use cases. This chapter serves as a snapshot of the current state of embedded ML. As algorithms, hardware, and connectivity continue to improve, we can expect embedded devices of all sizes to become increasingly capable, unlocking transformative new applications for artificial intelligence.

    -
    -

    2.7 Resources

    +
    +

    2.9 Resources

    Here is a curated list of resources to support students and instructors in their learning and teaching journeys. We are continuously working on expanding this collection and will be adding new exercises soon.

    diff --git a/docs/index.html b/docs/index.html index 87aac7c6..58a1bd33 100644 --- a/docs/index.html +++ b/docs/index.html @@ -7,7 +7,7 @@ - + Machine Learning Systems