Commit

deploy: 9cd5084
baniasbaabe committed Apr 28, 2024
1 parent 9c0bdf0 commit 427a6d6
Showing 7 changed files with 151 additions and 27 deletions.
42 changes: 42 additions & 0 deletions _sources/book/llm/Chapter.ipynb
Original file line number Diff line number Diff line change
Expand Up @@ -244,6 +244,48 @@
"\n",
"print(json.dumps(results, indent=3))"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Embed Any Type of File"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"These days, everything is about Embeddings and LLMs.\n",
"\n",
"The Python library `embed-anything` makes it easy to generate embeddings from multiple sources such as images, video, or audio.\n",
"\n",
"It's built in Rust, so it runs fast."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"!pip install embed-anything"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import embed_anything\n",
"import numpy as np\n",
"\n",
"data = embed_anything.embed_file(\"filename.pdf\", embeder=\"Bert\")\n",
"embeddings = np.array([item.embedding for item in data])\n",
"data = embed_anything.embed_directory(\"test_files\", embeder=\"Clip\")\n",
"embeddings = np.array([item.embedding for item in data])"
]
}
],
"metadata": {
Expand Down
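Note on the embedding cell above: it stops once the vectors are collected into an array. As a minimal sketch of a typical next step, the pure-Python code below ranks stand-in records by cosine similarity to a query vector. `EmbedRecord`, `cosine_similarity` and `rank_by_similarity` are hypothetical names made up for this note and are not part of `embed-anything`; the only thing assumed about the library's results is that each item exposes an `.embedding` sequence, as the cell itself relies on.

```python
from dataclasses import dataclass
from math import sqrt
from typing import List


@dataclass
class EmbedRecord:
    """Hypothetical stand-in for one result item: a label plus its vector."""
    text: str
    embedding: List[float]


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = sqrt(sum(x * x for x in a))
    norm_b = sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # convention chosen here for zero vectors
    return dot / (norm_a * norm_b)


def rank_by_similarity(query, records):
    """Return (score, text) pairs sorted from most to least similar."""
    scored = [(cosine_similarity(query, r.embedding), r.text) for r in records]
    return sorted(scored, key=lambda pair: pair[0], reverse=True)


# Toy 3-dimensional vectors; real embeddings are much longer.
records = [
    EmbedRecord("about cats", [1.0, 0.0, 0.0]),
    EmbedRecord("about dogs", [0.8, 0.6, 0.0]),
    EmbedRecord("about taxes", [0.0, 0.0, 1.0]),
]
ranking = rank_by_similarity([1.0, 0.0, 0.0], records)
for score, text in ranking:
    print(f"{score:.2f}  {text}")
```

With real results, the same ranking function could be fed `item.embedding` values directly; for larger collections a vectorized NumPy version would likely be preferable.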
12 changes: 5 additions & 7 deletions _sources/book/machinelearning/featureselection.ipynb
Expand Up @@ -247,20 +247,18 @@
]
},
{
"cell_type": "code",
"execution_count": null,
"cell_type": "markdown",
"metadata": {},
"outputs": [],
"source": [
"Do you want to do Feature Selection automatically?\n",
"\n",
"Try mrmr.\n",
"Try `mrmr`.\n",
"\n",
"mrmr (minimum-Redundancy-Maximum-Relevance) is a minimal-optimal feature selection algorithm at scale.\n",
"`mrmr` (minimum-Redundancy-Maximum-Relevance) is a minimal-optimal feature selection algorithm at scale.\n",
"\n",
"It means mrmr will find the smallest relevant subset of features your ML Model needs.\n",
"It means `mrmr` will find the smallest relevant subset of features your ML Model needs.\n",
"\n",
"mrmr supports common tools like Pandas, Polars and Spark.\n",
"`mrmr` supports common tools like Pandas, Polars and Spark.\n",
"\n",
"See below how we want to select the best K features.\n",
"\n",
Expand Down
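Note on the `mrmr` cell above: to make "minimum redundancy, maximum relevance" concrete, here is a minimal pure-Python sketch of the greedy selection idea. It is a simplification written for this note, not the `mrmr` package's implementation: both relevance and redundancy are measured with absolute Pearson correlation, and `pearson` and `greedy_mrmr` are hypothetical names.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0.0 or var_y == 0.0:
        return 0.0
    return cov / sqrt(var_x * var_y)


def greedy_mrmr(features, target, k):
    """Pick k feature names, each time maximizing relevance / redundancy.

    features: dict mapping feature name -> list of values.
    Relevance is |corr(feature, target)|; redundancy is the mean
    |corr(feature, already selected feature)|.
    """
    relevance = {name: abs(pearson(col, target)) for name, col in features.items()}
    selected = []
    remaining = list(features)
    while remaining and len(selected) < k:
        best_name, best_score = None, float("-inf")
        for name in remaining:
            if selected:
                redundancy = sum(
                    abs(pearson(features[name], features[s])) for s in selected
                ) / len(selected)
            else:
                redundancy = 1.0  # first pick is driven by relevance alone
            score = relevance[name] / max(redundancy, 1e-9)
            if score > best_score:
                best_name, best_score = name, score
        selected.append(best_name)
        remaining.remove(best_name)
    return selected


# Toy data: `a_noisy` is more relevant than `b` but largely repeats `a`.
features = {
    "a": [1.0, 2.0, 3.0, 4.0],
    "a_noisy": [1.0, 2.0, 4.0, 3.0],
    "b": [1.0, -1.0, -1.0, 1.0],
}
target = [3.0, 3.0, 5.0, 9.0]  # equals 2 * a + b

by_relevance = sorted(features, key=lambda n: abs(pearson(features[n], target)), reverse=True)
selected = greedy_mrmr(features, target, k=2)
print("ranked by relevance only:", by_relevance)
print("greedy mRMR selection:   ", selected)
```

In this toy data a plain relevance ranking puts the near-duplicate `a_noisy` second, while the greedy selection prefers `b` because it adds information that `a` does not already carry.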
41 changes: 41 additions & 0 deletions _sources/book/polars/Chapter.ipynb
Expand Up @@ -64,6 +64,47 @@
" pl.col(\"actual\").num_ext.binary_metrics_combo(pl.col(\"predicted\")).alias(\"combo\")\n",
").unnest(\"combo\")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Plugin for Fitting Linear Models"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"In Polars, you can fit linear models with the `polars-ols` extension.\n",
"\n",
"It supports ordinary, weighted, and regularized least squares (such as Lasso or Elastic Net).\n",
"\n",
"It can be 2x to 88x faster than popular libraries like sklearn or statsmodels."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"!pip install polars-ols"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import polars as pl\n",
"import polars_ols as pls  # importing registers the `least_squares` expression namespace\n",
"\n",
"lasso_expr = pl.col(\"y\").least_squares.lasso(\"x1\", \"x2\", alpha=0.0001, add_intercept=True).over(\"group\")\n",
"# `df` is assumed to be an existing DataFrame with columns y, x1, x2 and group\n",
"predictions = df.with_columns(lasso_expr.round(2).alias(\"predictions_lasso\"))"
]
}
],
"metadata": {
Expand Down
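Note on the `polars-ols` cell above: the `.over("group")` call fits one model per group and returns predictions aligned with the original rows. As a rough, dependency-free sketch of that idea, the code below fits a separate line per group and writes each prediction back to its row position. It deliberately swaps in plain ordinary least squares with a single feature instead of Lasso, and `fit_line` and `predict_per_group` are hypothetical names, not part of `polars-ols`.

```python
from collections import defaultdict


def fit_line(xs, ys):
    """Closed-form OLS for y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx if sxx else 0.0  # constant x: fall back to the mean of y
    intercept = mean_y - slope * mean_x
    return intercept, slope


def predict_per_group(rows):
    """Fit one line per group and return in-sample predictions in row order.

    rows: list of dicts with keys 'group', 'x' and 'y'.
    """
    by_group = defaultdict(list)
    for i, row in enumerate(rows):
        by_group[row["group"]].append(i)

    predictions = [0.0] * len(rows)
    for indices in by_group.values():
        xs = [rows[i]["x"] for i in indices]
        ys = [rows[i]["y"] for i in indices]
        intercept, slope = fit_line(xs, ys)
        for i in indices:
            predictions[i] = round(intercept + slope * rows[i]["x"], 2)
    return predictions


# Two interleaved groups with different underlying lines.
rows = [
    {"group": "a", "x": 1.0, "y": 2.0},
    {"group": "b", "x": 1.0, "y": 10.0},
    {"group": "a", "x": 2.0, "y": 4.0},
    {"group": "b", "x": 2.0, "y": 7.0},
    {"group": "a", "x": 3.0, "y": 6.0},
    {"group": "b", "x": 3.0, "y": 4.0},
]
predictions = predict_per_group(rows)
print(predictions)
```

A single model fitted on all six rows would blur the two trends together; grouping first is what lets each group keep its own slope and intercept.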
28 changes: 28 additions & 0 deletions book/llm/Chapter.html
Expand Up @@ -425,6 +425,7 @@ <h2> Contents </h2>
<li class="toc-h2 nav-item toc-entry"><a class="reference internal nav-link" href="#one-function-call-to-any-llm-with-litellm">6.1.2. One-Function Call to Any LLM with <code class="docutils literal notranslate"><span class="pre">litellm</span></code></a></li>
<li class="toc-h2 nav-item toc-entry"><a class="reference internal nav-link" href="#safeguard-your-llms-with-llmguard">6.1.3. Safeguard Your LLMs with <code class="docutils literal notranslate"><span class="pre">LLMGuard</span></code></a></li>
<li class="toc-h2 nav-item toc-entry"><a class="reference internal nav-link" href="#evaluate-llms-with-uptrain">6.1.4. Evaluate LLMs with <code class="docutils literal notranslate"><span class="pre">uptrain</span></code></a></li>
<li class="toc-h2 nav-item toc-entry"><a class="reference internal nav-link" href="#embed-any-type-of-file">6.1.5. Embed Any Type of File</a></li>
</ul>
</nav>
</div>
Expand Down Expand Up @@ -605,6 +606,32 @@ <h2><span class="section-number">6.1.4. </span>Evaluate LLMs with <code class="d
</div>
</div>
</section>
<section id="embed-any-type-of-file">
<h2><span class="section-number">6.1.5. </span>Embed Any Type of File<a class="headerlink" href="#embed-any-type-of-file" title="Permalink to this heading">#</a></h2>
<p>These days, everything is about Embeddings and LLMs.</p>
<p>The Python library <code class="docutils literal notranslate"><span class="pre">embed-anything</span></code> makes it easy to generate embeddings from multiple sources such as images, video, or audio.</p>
<p>It’s built in Rust, so it runs fast.</p>
<div class="cell docutils container">
<div class="cell_input docutils container">
<div class="highlight-python notranslate"><div class="highlight"><pre><span></span>!pip install embed-anything
</pre></div>
</div>
</div>
</div>
<div class="cell docutils container">
<div class="cell_input docutils container">
<div class="highlight-python notranslate"><div class="highlight"><pre><span></span><span class="kn">import</span> <span class="nn">embed_anything</span>
<span class="kn">import</span> <span class="nn">numpy</span> <span class="k">as</span> <span class="nn">np</span>

<span class="n">data</span> <span class="o">=</span> <span class="n">embed_anything</span><span class="o">.</span><span class="n">embed_file</span><span class="p">(</span><span class="s2">&quot;filename.pdf&quot;</span><span class="p">,</span> <span class="n">embeder</span><span class="o">=</span><span class="s2">&quot;Bert&quot;</span><span class="p">)</span>
<span class="n">embeddings</span> <span class="o">=</span> <span class="n">np</span><span class="o">.</span><span class="n">array</span><span class="p">([</span><span class="n">item</span><span class="o">.</span><span class="n">embedding</span> <span class="k">for</span> <span class="n">item</span> <span class="ow">in</span> <span class="n">data</span><span class="p">])</span>
<span class="n">data</span> <span class="o">=</span> <span class="n">embed_anything</span><span class="o">.</span><span class="n">embed_directory</span><span class="p">(</span><span class="s2">&quot;test_files&quot;</span><span class="p">,</span> <span class="n">embeder</span><span class="o">=</span><span class="s2">&quot;Clip&quot;</span><span class="p">)</span>
<span class="n">embeddings</span> <span class="o">=</span> <span class="n">np</span><span class="o">.</span><span class="n">array</span><span class="p">([</span><span class="n">item</span><span class="o">.</span><span class="n">embedding</span> <span class="k">for</span> <span class="n">item</span> <span class="ow">in</span> <span class="n">data</span><span class="p">])</span>
</pre></div>
</div>
</div>
</div>
</section>
</section>

<script type="text/x-thebe-config">
Expand Down Expand Up @@ -678,6 +705,7 @@ <h2><span class="section-number">6.1.4. </span>Evaluate LLMs with <code class="d
<li class="toc-h2 nav-item toc-entry"><a class="reference internal nav-link" href="#one-function-call-to-any-llm-with-litellm">6.1.2. One-Function Call to Any LLM with <code class="docutils literal notranslate"><span class="pre">litellm</span></code></a></li>
<li class="toc-h2 nav-item toc-entry"><a class="reference internal nav-link" href="#safeguard-your-llms-with-llmguard">6.1.3. Safeguard Your LLMs with <code class="docutils literal notranslate"><span class="pre">LLMGuard</span></code></a></li>
<li class="toc-h2 nav-item toc-entry"><a class="reference internal nav-link" href="#evaluate-llms-with-uptrain">6.1.4. Evaluate LLMs with <code class="docutils literal notranslate"><span class="pre">uptrain</span></code></a></li>
<li class="toc-h2 nav-item toc-entry"><a class="reference internal nav-link" href="#embed-any-type-of-file">6.1.5. Embed Any Type of File</a></li>
</ul>
</nav></div>

Expand Down
26 changes: 7 additions & 19 deletions book/machinelearning/featureselection.html
Expand Up @@ -586,25 +586,13 @@ <h2><span class="section-number">5.3.4. </span>Find the Most Predictive Variable
</section>
<section id="feature-selection-at-scale-with-mrmr">
<h2><span class="section-number">5.3.5. </span>Feature Selection at Scale with <code class="docutils literal notranslate"><span class="pre">mrmr</span></code><a class="headerlink" href="#feature-selection-at-scale-with-mrmr" title="Permalink to this heading">#</a></h2>
<div class="cell docutils container">
<div class="cell_input docutils container">
<div class="highlight-python notranslate"><div class="highlight"><pre><span></span>Do you want to do Feature Selection automatically?

Try mrmr.

mrmr (minimum-Redundancy-Maximum-Relevance) is a minimal-optimal feature selection algorithm at scale.

It means mrmr will find the smallest relevant subset of features your ML Model needs.

mrmr supports common tools like Pandas, Polars and Spark.

See below how we want to select the best K features.

The output is a ranked list of the relevant features.
</pre></div>
</div>
</div>
</div>
<p>Do you want to do Feature Selection automatically?</p>
<p>Try <code class="docutils literal notranslate"><span class="pre">mrmr</span></code>.</p>
<p><code class="docutils literal notranslate"><span class="pre">mrmr</span></code> (minimum-Redundancy-Maximum-Relevance) is a minimal-optimal feature selection algorithm at scale.</p>
<p>It means <code class="docutils literal notranslate"><span class="pre">mrmr</span></code> will find the smallest relevant subset of features your ML Model needs.</p>
<p><code class="docutils literal notranslate"><span class="pre">mrmr</span></code> supports common tools like Pandas, Polars and Spark.</p>
<p>See below how we want to select the best K features.</p>
<p>The output is a ranked list of the relevant features.</p>
<div class="cell docutils container">
<div class="cell_input docutils container">
<div class="highlight-python notranslate"><div class="highlight"><pre><span></span>!pip install mrmr_selection
Expand Down
27 changes: 27 additions & 0 deletions book/polars/Chapter.html
Expand Up @@ -422,6 +422,7 @@ <h2> Contents </h2>
<nav aria-label="Page">
<ul class="visible nav section-nav flex-column">
<li class="toc-h2 nav-item toc-entry"><a class="reference internal nav-link" href="#plugin-for-data-science-functions">9.1.1. Plugin for Data Science Functions</a></li>
<li class="toc-h2 nav-item toc-entry"><a class="reference internal nav-link" href="#plugin-for-fitting-linear-models">9.1.2. Plugin for Fitting Linear Models</a></li>
</ul>
</nav>
</div>
Expand Down Expand Up @@ -470,6 +471,31 @@ <h2><span class="section-number">9.1.1. </span>Plugin for Data Science Functions
</div>
</div>
</section>
<section id="plugin-for-fitting-linear-models">
<h2><span class="section-number">9.1.2. </span>Plugin for Fitting Linear Models<a class="headerlink" href="#plugin-for-fitting-linear-models" title="Permalink to this heading">#</a></h2>
<p>In Polars, you can fit linear models with the <code class="docutils literal notranslate"><span class="pre">polars-ols</span></code> extension.</p>
<p>It supports ordinary, weighted, and regularized least squares (such as Lasso or Elastic Net).</p>
<p>It can be 2x to 88x faster than popular libraries like sklearn or statsmodels.</p>
<div class="cell docutils container">
<div class="cell_input docutils container">
<div class="highlight-python notranslate"><div class="highlight"><pre><span></span>!pip install polars-ols
</pre></div>
</div>
</div>
</div>
<div class="cell docutils container">
<div class="cell_input docutils container">
<div class="highlight-python notranslate"><div class="highlight"><pre><span></span><span class="kn">import</span> <span class="nn">polars</span> <span class="k">as</span> <span class="nn">pl</span>
<span class="kn">import</span> <span class="nn">polars_ols</span> <span class="k">as</span> <span class="nn">pls</span>  <span class="c1"># importing registers the `least_squares` expression namespace</span>

<span class="n">lasso_expr</span> <span class="o">=</span> <span class="n">pl</span><span class="o">.</span><span class="n">col</span><span class="p">(</span><span class="s2">&quot;y&quot;</span><span class="p">)</span><span class="o">.</span><span class="n">least_squares</span><span class="o">.</span><span class="n">lasso</span><span class="p">(</span><span class="s2">&quot;x1&quot;</span><span class="p">,</span> <span class="s2">&quot;x2&quot;</span><span class="p">,</span> <span class="n">alpha</span><span class="o">=</span><span class="mf">0.0001</span><span class="p">,</span> <span class="n">add_intercept</span><span class="o">=</span><span class="kc">True</span><span class="p">)</span><span class="o">.</span><span class="n">over</span><span class="p">(</span><span class="s2">&quot;group&quot;</span><span class="p">)</span>
<span class="c1"># `df` is assumed to be an existing DataFrame with columns y, x1, x2 and group</span>
<span class="n">predictions</span> <span class="o">=</span> <span class="n">df</span><span class="o">.</span><span class="n">with_columns</span><span class="p">(</span><span class="n">lasso_expr</span><span class="o">.</span><span class="n">round</span><span class="p">(</span><span class="mi">2</span><span class="p">)</span><span class="o">.</span><span class="n">alias</span><span class="p">(</span><span class="s2">&quot;predictions_lasso&quot;</span><span class="p">))</span>
</pre></div>
</div>
</div>
</div>
</section>
</section>

<script type="text/x-thebe-config">
Expand Down Expand Up @@ -540,6 +566,7 @@ <h2><span class="section-number">9.1.1. </span>Plugin for Data Science Functions
<nav class="bd-toc-nav page-toc">
<ul class="visible nav section-nav flex-column">
<li class="toc-h2 nav-item toc-entry"><a class="reference internal nav-link" href="#plugin-for-data-science-functions">9.1.1. Plugin for Data Science Functions</a></li>
<li class="toc-h2 nav-item toc-entry"><a class="reference internal nav-link" href="#plugin-for-fitting-linear-models">9.1.2. Plugin for Fitting Linear Models</a></li>
</ul>
</nav></div>

Expand Down
2 changes: 1 addition & 1 deletion searchindex.js

Large diffs are not rendered by default.

0 comments on commit 427a6d6
