Elastic Rerank model landing page #2884

Merged
merged 12 commits into from
Dec 11, 2024

Conversation

leemthompo
Contributor

@leemthompo leemthompo commented Nov 28, 2024

  • Initial drafting happened in shadow Google Doc

URL preview 🔭 👁️

TODO

  • Air-gapped deployment instructions (possibly in follow-up PR)

@leemthompo leemthompo added backport-8.17 Automated backport with mergify backport-8.x Automated backport with mergify labels Nov 28, 2024
@leemthompo leemthompo self-assigned this Nov 28, 2024

A documentation preview will be available soon.

Request a new doc build by commenting
  • Rebuild this PR: run docs-build
  • Rebuild this PR and all Elastic docs: run docs-build rebuild

run docs-build is much faster than run docs-build rebuild. A rebuild should only be needed in rare situations.

If your PR continues to fail for an unknown reason, the doc build pipeline may be broken. Elastic employees can check the pipeline status here.

@leemthompo leemthompo changed the title [WIP] Elastic Rerank model landing page Elastic Rerank model landing page Dec 10, 2024
Contributor

@szabosteve szabosteve left a comment

Thank you for writing this up! I left a few suggestions, mostly nits to use attributes and to decrease word count. Please take or leave them.

docs/en/stack/ml/nlp/ml-nlp-elastic-rerank.asciidoc Outdated Show resolved Hide resolved
Co-authored-by: István Zoltán Szabó <[email protected]>
@leemthompo leemthompo requested a review from davidkyle December 11, 2024 08:56
@leemthompo leemthompo marked this pull request as ready for review December 11, 2024 08:57
@leemthompo leemthompo requested a review from a team as a code owner December 11, 2024 08:57

It's important to note that if you rerank to depth `n` then you will need to run `n` inferences per query. This will include the document text and will therefore be significantly more expensive than inference for query embeddings. Hardware can be scaled to run these inferences in parallel, but we would recommend shallow reranking for CPU inference: no more than top-30 results. You may find that the preview version is cost prohibitive for high query rates and low query latency requirements. We plan to address performance issues for GA.

// // Is air-gapped deployment supported?
Member

Yes, air-gapped download is supported. The same instructions as for ELSER apply; just change the model ID.

https://www.elastic.co/guide/en/machine-learning/master/ml-nlp-elser.html#air-gapped-install

Contributor Author

ah ok cool thanks

Contributor Author

@leemthompo leemthompo Dec 11, 2024

@davidkyle are those model artifact files already available?

found 'em :)

Contributor Author

Looks like we just have a cross-platform version initially?

Member

Yes, there is no platform-specific model for rerank.
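
For readers skimming this thread, the cost note in the quoted docs text (reranking to depth `n` means `n` inferences per query, with a suggested ceiling of top-30 on CPU) can be sketched as simple arithmetic. This is only an illustration: the function names and the constant name are hypothetical and are not part of any Elastic API.

```python
# Hypothetical helper names; the only facts taken from the quoted docs text are
# "depth n => n inferences per query" and the top-30 CPU recommendation.
RECOMMENDED_MAX_CPU_DEPTH = 30


def inferences_per_second(rerank_depth: int, queries_per_second: float) -> float:
    """Return the inference load implied by a rerank depth and a query rate."""
    if rerank_depth < 0 or queries_per_second < 0:
        raise ValueError("rerank_depth and queries_per_second must be non-negative")
    # Each query triggers one inference per reranked document.
    return rerank_depth * queries_per_second


def clamp_depth_for_cpu(requested_depth: int) -> int:
    """Cap the requested depth at the shallow-reranking recommendation for CPU."""
    return min(requested_depth, RECOMMENDED_MAX_CPU_DEPTH)


if __name__ == "__main__":
    depth = clamp_depth_for_cpu(100)
    print(depth, inferences_per_second(depth, 10.0))
```

Under these assumptions, a requested depth of 100 is capped at 30, and 10 queries per second then implies 300 inferences per second that the deployment has to absorb.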

@leemthompo leemthompo requested a review from davidkyle December 11, 2024 14:10
@leemthompo
Contributor Author

@davidkyle I updated with air-gapped instructions, copying the ELSER instructions but removing the trained model UI instructions and just replacing them with "create inference endpoint".

maxhniebergall
maxhniebergall previously approved these changes Dec 11, 2024
Member

@maxhniebergall maxhniebergall left a comment

LGTM

+
--
```
xpack.ml.model_repository: file://${path.home}/config/models/`
```
Member

There's an extra backtick at the end of this line.

Contributor Author

nice catch, fixing!

+
When using the {ref}/semantic-text.html[`semantic_text` field type], text is divided into chunks. By default, each chunk contains 250 words (approximately 400 tokens). Be cautious when increasing the chunk size - if the combined length of your query and chunk text exceeds 512 tokens, the model won't have access to the full content.
+
When the combined inputs exceed the 512 token limit, a balanced truncation strategy is used. If both the query and input text are longer than 255 tokens each then both are truncated, otherwise the longest is truncated.
Member

This is very clear phrasing!

Contributor Author

That's because it's 98% @tveasey and @davidkyle 😉
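
The balanced truncation rule quoted earlier in this thread can be sketched in a few lines. This is a literal reading of the docs text, not the model's actual implementation: the function name is hypothetical, the inputs are token counts rather than real tokens, and the sketch ignores however special tokens are counted against the 512 limit.

```python
from typing import Tuple

TOKEN_LIMIT = 512   # combined query + text limit from the quoted docs text
PER_SIDE_CAP = 255  # per-side length used when both inputs are long


def balanced_truncation(query_tokens: int, text_tokens: int) -> Tuple[int, int]:
    """Return the (query, text) token counts kept after truncation.

    Hypothetical sketch: special tokens are not modeled here.
    """
    if query_tokens + text_tokens <= TOKEN_LIMIT:
        # Everything fits, nothing is truncated.
        return query_tokens, text_tokens
    if query_tokens > PER_SIDE_CAP and text_tokens > PER_SIDE_CAP:
        # Both sides are long: truncate both.
        return PER_SIDE_CAP, PER_SIDE_CAP
    # Only one side is long: truncate the longer one to fit the limit.
    if query_tokens >= text_tokens:
        return TOKEN_LIMIT - text_tokens, text_tokens
    return query_tokens, TOKEN_LIMIT - query_tokens


if __name__ == "__main__":
    print(balanced_truncation(112, 400))  # default-sized chunk plus a short query
    print(balanced_truncation(300, 400))  # both sides long
    print(balanced_truncation(20, 600))   # only the text is long
```

With the default `semantic_text` chunk of roughly 400 tokens, this reading leaves about 112 tokens for the query before any truncation happens, which is why the docs text warns against increasing the chunk size.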

@leemthompo leemthompo merged commit 65aa83a into elastic:main Dec 11, 2024
3 checks passed
mergify bot pushed a commit that referenced this pull request Dec 11, 2024
mergify bot pushed a commit that referenced this pull request Dec 11, 2024
@leemthompo leemthompo deleted the elastic-rerank-landing-page branch December 11, 2024 17:19
leemthompo added a commit that referenced this pull request Dec 11, 2024
(cherry picked from commit 65aa83a)

Co-authored-by: Liam Thompson <[email protected]>
leemthompo added a commit that referenced this pull request Dec 11, 2024
(cherry picked from commit 65aa83a)

Co-authored-by: Liam Thompson <[email protected]>
Labels
backport-8.x Automated backport with mergify backport-8.17 Automated backport with mergify
5 participants