Push dev branch build
Naeemkh committed Aug 15, 2024
1 parent 845c6d3 commit cacddef
Showing 4 changed files with 68 additions and 32 deletions.
8 changes: 4 additions & 4 deletions docs/contents/frameworks/frameworks.html
@@ -1647,7 +1647,7 @@ <h3 data-number="6.8.3" class="anchored" data-anchor-id="library"><span class="h
<section id="choosing-the-right-framework" class="level2" data-number="6.9">
<h2 data-number="6.9" class="anchored" data-anchor-id="choosing-the-right-framework"><span class="header-section-number">6.9</span> Choosing the Right Framework</h2>
<p>Choosing the right machine learning framework for a given application requires carefully evaluating model, hardware, and software considerations. By analyzing these three aspects, ML engineers can select the most suitable framework and customize it as needed for efficient, performant on-device ML applications. The goal is to balance model complexity, hardware limitations, and software integration to design a tailored ML pipeline for embedded and edge devices.</p>
<div id="fig-tf-comparison" class="quarto-float quarto-figure quarto-figure-center anchored" data-caption="TensorFlow Framework Comparison - General" data-align="center">
<div id="fig-tf-comparison" class="quarto-float quarto-figure quarto-figure-center anchored" data-align="center" data-caption="TensorFlow Framework Comparison - General">
<figure class="quarto-float quarto-float-fig figure">
<div aria-describedby="fig-tf-comparison-caption-0ceaefa1-69ba-4598-a22c-09a6ac19f8ca">
<img src="images/png/image4.png" style="width:100.0%" data-align="center" data-caption="TensorFlow Framework Comparison - General" class="figure-img">
@@ -1663,7 +1663,7 @@ <h3 data-number="6.9.1" class="anchored" data-anchor-id="model"><span class="hea
</section>
<section id="software" class="level3" data-number="6.9.2">
<h3 data-number="6.9.2" class="anchored" data-anchor-id="software"><span class="header-section-number">6.9.2</span> Software</h3>
<div id="fig-tf-sw-comparison" class="quarto-float quarto-figure quarto-figure-center anchored" data-caption="TensorFlow Framework Comparison - Model" data-align="center">
<div id="fig-tf-sw-comparison" class="quarto-float quarto-figure quarto-figure-center anchored" data-align="center" data-caption="TensorFlow Framework Comparison - Model">
<figure class="quarto-float quarto-float-fig figure">
<div aria-describedby="fig-tf-sw-comparison-caption-0ceaefa1-69ba-4598-a22c-09a6ac19f8ca">
<img src="images/png/image5.png" style="width:100.0%" data-align="center" data-caption="TensorFlow Framework Comparison - Model" class="figure-img">
@@ -1677,7 +1677,7 @@ <h3 data-number="6.9.2" class="anchored" data-anchor-id="software"><span class="
</section>
<section id="hardware" class="level3" data-number="6.9.3">
<h3 data-number="6.9.3" class="anchored" data-anchor-id="hardware"><span class="header-section-number">6.9.3</span> Hardware</h3>
<div id="fig-tf-hw-comparison" class="quarto-float quarto-figure quarto-figure-center anchored" data-caption="TensorFlow Framework Comparison - Hardware" data-align="center">
<div id="fig-tf-hw-comparison" class="quarto-float quarto-figure quarto-figure-center anchored" data-align="center" data-caption="TensorFlow Framework Comparison - Hardware">
<figure class="quarto-float quarto-float-fig figure">
<div aria-describedby="fig-tf-hw-comparison-caption-0ceaefa1-69ba-4598-a22c-09a6ac19f8ca">
<img src="images/png/image3.png" style="width:100.0%" data-align="center" data-caption="TensorFlow Framework Comparison - Hardware" class="figure-img">
@@ -1723,7 +1723,7 @@ <h2 data-number="6.10" class="anchored" data-anchor-id="future-trends-in-ml-fram
<section id="decomposition" class="level3" data-number="6.10.1">
<h3 data-number="6.10.1" class="anchored" data-anchor-id="decomposition"><span class="header-section-number">6.10.1</span> Decomposition</h3>
<p>Currently, the ML system stack consists of four abstractions as shown in <a href="#fig-mlsys-stack" class="quarto-xref">Figure&nbsp;<span>6.11</span></a>, namely (1) computational graphs, (2) tensor programs, (3) libraries and runtimes, and (4) hardware primitives.</p>
<div id="fig-mlsys-stack" class="quarto-float quarto-figure quarto-figure-center anchored" data-caption="Four Abstractions in Current ML System Stack" data-align="center">
<div id="fig-mlsys-stack" class="quarto-float quarto-figure quarto-figure-center anchored" data-align="center" data-caption="Four Abstractions in Current ML System Stack">
<figure class="quarto-float quarto-float-fig figure">
<div aria-describedby="fig-mlsys-stack-caption-0ceaefa1-69ba-4598-a22c-09a6ac19f8ca">
<img src="images/png/image8.png" class="img-fluid figure-img" data-align="center" data-caption="Four Abstractions in Current ML System Stack">
86 changes: 61 additions & 25 deletions docs/contents/hw_acceleration/hw_acceleration.html
@@ -1292,48 +1292,84 @@ <h5 class="anchored" data-anchor-id="power-inefficiency-under-heavy-workloads">P
</section>
<section id="comparison" class="level3" data-number="10.3.6">
<h3 data-number="10.3.6" class="anchored" data-anchor-id="comparison"><span class="header-section-number">10.3.6</span> Comparison</h3>
<table class="caption-top table">
<p><a href="#tbl-accelerator-comparison" class="quarto-xref">Table&nbsp;<span>10.2</span></a> compares the different types of hardware accelerators.</p>
<div id="tbl-accelerator-comparison" class="striped hover quarto-float quarto-figure quarto-figure-center anchored">
<figure class="quarto-float quarto-float-tbl figure">
<figcaption class="quarto-float-caption-top quarto-float-caption quarto-float-tbl" id="tbl-accelerator-comparison-caption-0ceaefa1-69ba-4598-a22c-09a6ac19f8ca">
Table&nbsp;10.2: Comparison of different hardware accelerators for AI workloads.
</figcaption>
<div aria-describedby="tbl-accelerator-comparison-caption-0ceaefa1-69ba-4598-a22c-09a6ac19f8ca">
<table class="table-striped table-hover caption-top table">
<colgroup>
<col style="width: 25%">
<col style="width: 25%">
<col style="width: 25%">
<col style="width: 25%">
<col style="width: 11%">
<col style="width: 30%">
<col style="width: 28%">
<col style="width: 29%">
</colgroup>
<thead>
<tr class="header">
<th>Accelerator</th>
<th>Description</th>
<th>Key Advantages</th>
<th>Key Disadvantages</th>
<th style="text-align: left;">Accelerator</th>
<th style="text-align: left;">Description</th>
<th style="text-align: left;">Key Advantages</th>
<th style="text-align: left;">Key Disadvantages</th>
</tr>
</thead>
<tbody>
<tr class="odd">
<td>ASICs</td>
<td>Custom ICs designed for target workloads like AI inference</td>
<td>Maximizes perf/watt. <br> Optimized for tensor ops<br> Low latency on-chip memory</td>
<td>Fixed architecture lacks flexibility<br> High NRE cost<br> Long design cycles</td>
<td style="text-align: left;">ASICs</td>
<td style="text-align: left;">Custom ICs designed for target workloads like AI inference</td>
<td style="text-align: left;"><ul>
<li>Maximizes perf/watt.</li>
<li>Optimized for tensor ops</li>
<li>Low latency on-chip memory</li>
</ul></td>
<td style="text-align: left;"><ul>
<li>Fixed architecture lacks flexibility</li>
<li>High NRE cost</li>
<li>Long design cycles</li>
</ul></td>
</tr>
<tr class="even">
<td>FPGAs</td>
<td>Reconfigurable fabric with programmable logic and routing</td>
<td>Flexible architecture<br> Low latency memory access</td>
<td>Lower perf/watt than ASICs<br> Complex programming</td>
<td style="text-align: left;">FPGAs</td>
<td style="text-align: left;">Reconfigurable fabric with programmable logic and routing</td>
<td style="text-align: left;"><ul>
<li>Flexible architecture</li>
<li>Low latency memory access</li>
</ul></td>
<td style="text-align: left;"><ul>
<li>Lower perf/watt than ASICs</li>
<li>Complex programming</li>
</ul></td>
</tr>
<tr class="odd">
<td>GPUs</td>
<td>Originally for graphics, now used for neural network acceleration</td>
<td>High throughput<br> Parallel scalability<br> Software ecosystem with CUDA</td>
<td>Not as power efficient as ASICs. <br> Require high memory bandwidth</td>
<td style="text-align: left;">GPUs</td>
<td style="text-align: left;">Originally for graphics, now used for neural network acceleration</td>
<td style="text-align: left;"><ul>
<li>High throughput</li>
<li>Parallel scalability</li>
<li>Software ecosystem with CUDA</li>
</ul></td>
<td style="text-align: left;"><ul>
<li>Not as power efficient as ASICs</li>
<li>Require high memory bandwidth</li>
</ul></td>
</tr>
<tr class="even">
<td>CPUs</td>
<td>General purpose processors</td>
<td>Programmability<br> Ubiquitous availability</td>
<td>Lower performance for AI workloads</td>
<td style="text-align: left;">CPUs</td>
<td style="text-align: left;">General purpose processors</td>
<td style="text-align: left;"><ul>
<li>Programmability</li>
<li>Ubiquitous availability</li>
</ul></td>
<td style="text-align: left;"><ul>
<li>Lower performance for AI workloads</li>
</ul></td>
</tr>
</tbody>
</table>
</div>
</figure>
</div>
<p>In general, CPUs provide a readily available baseline, GPUs deliver broadly accessible acceleration, FPGAs offer programmability, and ASICs maximize efficiency for fixed functions. The optimal choice depends on the target application’s scale, cost, flexibility, and other requirements.</p>
<p>Although first developed for data center deployment, Google has also put considerable effort into developing <a href="https://cloud.google.com/edge-tpu">Edge TPUs</a>. These Edge TPUs maintain the inspiration from systolic arrays but are tailored to the limited resources accessible at the edge.</p>
</section>
2 changes: 1 addition & 1 deletion docs/references.html
@@ -1350,7 +1350,7 @@ <h1 class="title">References</h1>
in Theoretical Computer Science</em> 9 (3-4): 211–407. <a href="https://doi.org/10.1561/0400000042">https://doi.org/10.1561/0400000042</a>.
</div>
<div id="ref-ebrahimi2014review" class="csl-entry" role="listitem">
Ebrahimi, Khosrow, Gerard F. Jones, and Amy S. Fleischer. 2014. <span>A
Ebrahimi, Khosrow, Gerard F. Jones, and Amy S. Fleischer. 2014. <span>“A
Review of Data Center Cooling Technology, Operating Conditions and the
Corresponding Low-Grade Waste Heat Recovery Opportunities.”</span>
<em>Renewable Sustainable Energy Rev.</em> 31 (March): 622–38. <a href="https://doi.org/10.1016/j.rser.2013.12.007">https://doi.org/10.1016/j.rser.2013.12.007</a>.
4 changes: 2 additions & 2 deletions docs/search.json

Large diffs are not rendered by default.
