[DOCS] Consolidates ELSER deployment guide with ES deployment tutorial content (#2560) (#2561)

Co-authored-by: István Zoltán Szabó
elastic · Oct 12, 2023 · 5b4ef3e
1 parent 54c6b5b
commit 5b4ef3e
Showing 5 changed files with 76 additions and 15 deletions.
diff --git a/docs/en/stack/ml/nlp/images/ml-nlp-deploy-elser-v2-es.png b/docs/en/stack/ml/nlp/images/ml-nlp-deploy-elser-v2-es.png
diff --git a/docs/en/stack/ml/nlp/images/ml-nlp-pipeline-copy-customize.png b/docs/en/stack/ml/nlp/images/ml-nlp-pipeline-copy-customize.png
diff --git a/docs/en/stack/ml/nlp/images/ml-nlp-start-elser-v2-es.png b/docs/en/stack/ml/nlp/images/ml-nlp-start-elser-v2-es.png
diff --git a/docs/en/stack/ml/nlp/ml-nlp-elser.asciidoc b/docs/en/stack/ml/nlp/ml-nlp-elser.asciidoc
@@ -25,8 +25,9 @@ are learned to co-occur frequently within a diverse set of training data. The
 terms that the text is expanded into by the model _are not_ synonyms for the 
 search terms; they are learned associations. These expanded terms are weighted 
 as some of them are more significant than others. Then the {es} 
-{ref}/rank-features.html[sparse vector (or rank features) field type] is used to 
-store the terms and weights at index time, and to search against later.
+{ref}/sparse-vector.html[sparse vector] 
+(or {ref}/rank-features.html[rank features]) field type is used to store the 
+terms and weights at index time, and to search against later.
 
 
 [discrete]
@@ -82,7 +83,7 @@ how to reindex your data through the pipeline.
 == Download and deploy ELSER
 
 You can download and deploy ELSER either from **{ml-app}** > **Trained Models**, 
-from **{ents}** > **Indices**, or by using the Dev Console.
+from **Search** > **Indices**, or by using the Dev Console.
 
 [discrete]
 [[trained-model]]
@@ -113,22 +114,65 @@ image::images/ml-nlp-deployment-id-elser-v2.png[alt="Deploying ELSER",align="cen
 
 
 [discrete]
-[[enterprise-search]]
-=== Using the Indices page in {ents}
+[[elasticsearch]]
+=== Using the search indices UI
 
-You can also download and deploy ELSER to an {infer} pipeline directly from the
-{ents} app.
+Alternatively, you can download and deploy ELSER to an {infer} pipeline using 
+the search indices UI.
 
-1. In {kib}, navigate to **{ents}** > **Indices**.
+1. In {kib}, navigate to **Search** > **Indices**.
 2. Select the index from the list that has an {infer} pipeline in which you want 
 to use ELSER.
 3. Navigate to the **Pipelines** tab.
 4. Under **{ml-app} {infer-cap} Pipelines**, click the **Deploy** button to 
 begin downloading the ELSER model. This may take a few minutes depending on your 
-network. Once it's downloaded, click the **Start single-threaded** button to 
+network. 
++
+--
+[role="screenshot"]
+image::images/ml-nlp-deploy-elser-v2-es.png[alt="Deploying ELSER in Elasticsearch",align="center"]
+--
+5. Once the model is downloaded, click the **Start single-threaded** button to 
 start the model with basic configuration or select the **Fine-tune performance** 
 option to navigate to the **Trained Models** page where you can configure the 
 model deployment.
++
+--
+[role="screenshot"]
+image::images/ml-nlp-start-elser-v2-es.png[alt="Start ELSER in Elasticsearch",align="center"]
+--
+
+When your ELSER model is deployed and started, it is ready to be used in a 
+pipeline.
+
+
+[discrete]
+[[elasticsearch-ingest-pipeline]]
+==== Adding ELSER to an ingest pipeline
+
+To add ELSER to an ingest pipeline, you need to copy the default ingest 
+pipeline and then customize it according to your needs.
+
+1. Click **Copy and customize** under the **Unlock your custom pipelines** block 
+at the top of the page. This enables the **Add inference pipeline** button.
++
+--
+[role="screenshot"]
+image::images/ml-nlp-pipeline-copy-customize.png[alt="Copy and customize the ingest pipeline",align="center"]
+--
+2. Under **{ml-app} {infer-cap} Pipelines**, click **Add inference pipeline**.
+3. Give a name to the pipeline, select ELSER from the list of trained ML models, 
+and click **Continue**.
+4. Select the source text field, define the target field, and click **Add** then 
+**Continue**.
+5. Review the index mappings updates. Click **Back** if you want to change the 
+mappings. Click **Continue** if you are satisfied with the updated index 
+mappings.
+6. You can optionally test your pipeline. Click **Continue**.
+7. Click **Create pipeline**.
+
+Once your pipeline is created, you are ready to ingest documents and utilize 
+ELSER for text expansions in your search queries.
 
 
 [discrete]
@@ -295,6 +339,23 @@ clicking **Reload examples**.
 image::images/ml-nlp-elser-v2-test.png[alt="Testing ELSER",align="center"]
 
 
+[discrete]
+[[performance]]
+== Performance considerations
+
+* ELSER works best on small-to-medium sized fields that contain natural 
+language. For connector or web crawler use cases, this aligns best with fields 
+like _title_, _description_, _summary_, or _abstract_. As ELSER encodes the 
+first 512 tokens of a field, it may not be as good a match for `body_content` on 
+web crawler documents, or body fields resulting from extracting text from office 
+documents with connectors.
+* Larger documents take longer at ingestion time, and {infer} time per 
+document also increases with the number of fields that need to be processed.
+* The more fields your pipeline has to perform inference on, the longer it takes 
+per document to ingest.
+
+To learn more about ELSER performance, refer to the <<elser-benchmarks>>.
+
 [discrete]
 [[further-readings]]
 == Further reading
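
Note on the "Adding ELSER to an ingest pipeline" steps in the diff above: the UI flow ends with an ingest pipeline that runs the ELSER model on a source field and writes the result to a target field. The following is a minimal sketch of what an equivalent pipeline body might look like if built by hand. It is not taken from this commit. The model ID `.elser_model_2` is assumed from the "elser-v2" image names, the field names `title`, `ml`, and `tokens` are hypothetical, and the processor layout (`field_map` plus a `text_expansion` inference config) should be checked against the {infer} processor reference for your stack version before use.

```python
import json


def build_elser_pipeline(source_field, target_field="ml", model_id=".elser_model_2"):
    """Return a request body for an ingest pipeline with one inference processor.

    All defaults are illustrative assumptions, not values from the commit.
    """
    return {
        "description": "Sketch: expand one text field with ELSER",
        "processors": [
            {
                "inference": {
                    # Assumed model ID; use the ID shown on the Trained Models page.
                    "model_id": model_id,
                    # Field under which the processor stores its output.
                    "target_field": target_field,
                    # Map the document's source field to the model's input field.
                    "field_map": {source_field: "text_field"},
                    "inference_config": {
                        "text_expansion": {"results_field": "tokens"}
                    },
                }
            }
        ],
    }


if __name__ == "__main__":
    # Print the body so it can be pasted into a PUT _ingest/pipeline/<name> request.
    print(json.dumps(build_elser_pipeline("title"), indent=2))
```

Building the body as plain data keeps the sketch independent of any client library; the same dictionary could be sent from the Dev Console or any HTTP client.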

diff --git a/docs/en/stack/ml/nlp/ml-nlp-limitations.asciidoc b/docs/en/stack/ml/nlp/ml-nlp-limitations.asciidoc
@@ -11,10 +11,10 @@ the Elastic {nlp} trained models feature.
 
 [discrete]
 [[ml-nlp-elser-v1-limit-512]]
-== ELSER v1 semantic search is limited to 512 tokens per field that inference is applied to
+== ELSER semantic search is limited to 512 tokens per field that inference is applied to
 
-When you use ELSER v1 for semantic search, only the first 512 extracted tokens 
-from each field of the ingested documents that ELSER is applied to are taken 
-into account for the search process. If your data set contains long documents, 
-divide them into smaller segments before ingestion if you need the full text to 
-be searchable.
+When you use ELSER for semantic search, only the first 512 extracted tokens from 
+each field of the ingested documents that ELSER is applied to are taken into 
+account for the search process. If your data set contains long documents, divide 
+them into smaller segments before ingestion if you need the full text to be 
+searchable.
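
Note on the limitation above: the text advises dividing long documents into smaller segments before ingestion. A minimal sketch of one way to do that follows, assuming whitespace-separated words as a rough stand-in for model tokens. ELSER's own tokenizer will usually produce more tokens than there are words, so the `max_words` default here is a hypothetical, conservative value rather than a documented threshold, and `split_into_segments` is an illustrative helper, not part of any Elastic API.

```python
def split_into_segments(text, max_words=300, overlap=0):
    """Split text into consecutive windows of at most max_words words.

    overlap is the number of words shared between neighbouring segments.
    Word counts only approximate the model's token counts.
    """
    if max_words <= 0:
        raise ValueError("max_words must be positive")
    if not 0 <= overlap < max_words:
        raise ValueError("overlap must be in the range [0, max_words)")

    words = text.split()
    step = max_words - overlap
    segments = []
    for start in range(0, len(words), step):
        segments.append(" ".join(words[start:start + max_words]))
        # Stop once the current window has reached the end of the text.
        if start + max_words >= len(words):
            break
    return segments
```

Each segment could then be indexed as its own document (or nested object) so that every part of the original text falls within the first 512 tokens of some field.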