From 4b0a22465221fda1ee308b200f757ee09966ce04 Mon Sep 17 00:00:00 2001 From: Arnstein Ressem Date: Wed, 30 Oct 2024 14:33:40 +0100 Subject: [PATCH] Add alternative podman command for memory info --- billion-scale-image-search/README.md | 149 +++++++++--------- billion-scale-vector-search/README.md | 35 ++-- commerce-product-ranking/README.md | 42 ++--- custom-embeddings/README.md | 31 ++-- examples/document-processing/README.md | 3 +- examples/generic-request-processing/README.md | 3 +- .../README.md | 3 +- examples/multiple-bundles/README.md | 3 +- examples/operations/multinode-HA/README.md | 61 +++---- examples/part-purchases-demo/README.md | 2 + examples/predicate-fields/README.md | 39 ++--- .../search-as-you-type/README.md | 8 +- .../search-suggestions/README.md | 12 +- model-inference/README.md | 8 +- msmarco-ranking/README.md | 46 +++--- multi-vector-indexing/README.md | 30 ++-- multilingual-search/README.md | 22 +-- text-image-search/README.md | 14 +- transformers/README.md | 22 +-- use-case-shopping/README.md | 8 +- vector-streaming-search/README.md | 14 +- 21 files changed, 293 insertions(+), 262 deletions(-) diff --git a/billion-scale-image-search/README.md b/billion-scale-image-search/README.md index d0c12b230..93058dd2d 100644 --- a/billion-scale-image-search/README.md +++ b/billion-scale-image-search/README.md @@ -1,4 +1,3 @@ - @@ -9,9 +8,9 @@ # Billion-Scale Image Search -This sample application combines two sample applications to implement -cost-efficient large scale image search over multimodal AI powered vector representations; -[text-image-search](https://github.com/vespa-engine/sample-apps/tree/master/text-image-search) and +This sample application combines two sample applications to implement +cost-efficient large scale image search over multimodal AI powered vector representations; +[text-image-search](https://github.com/vespa-engine/sample-apps/tree/master/text-image-search) and 
[billion-scale-vector-search](https://github.com/vespa-engine/sample-apps/tree/master/billion-scale-vector-search). ## The Vector Dataset @@ -20,13 +19,13 @@ This sample app use the [LAION-5B](https://laion.ai/blog/laion-5b/) dataset, > Large image-text models like ALIGN, BASIC, Turing Bletchly, FLORENCE & GLIDE have > shown better and better performance compared to previous flagship models like CLIP and DALL-E. -> Most of them had been trained on billions of image-text pairs and unfortunately, no datasets of this size had been openly available until now. -> To address this problem we present LAION 5B, a large-scale dataset for research purposes -> consisting of 5,85B CLIP-filtered image-text pairs. 2,3B contain English language, -> 2,2B samples from 100+ other languages and 1B samples have texts that do not allow a certain language assignment (e.g. names ). +> Most of them had been trained on billions of image-text pairs and unfortunately, no datasets of this size had been openly available until now. +> To address this problem we present LAION 5B, a large-scale dataset for research purposes +> consisting of 5,85B CLIP-filtered image-text pairs. 2,3B contain English language, +> 2,2B samples from 100+ other languages and 1B samples have texts that do not allow a certain language assignment (e.g. names ). -The LAION-5B dataset was used to train the popular text-to-image -generative [StableDiffusion](https://stability.ai/blog/stable-diffusion-public-release) model. +The LAION-5B dataset was used to train the popular text-to-image +generative [StableDiffusion](https://stability.ai/blog/stable-diffusion-public-release) model. Note the following about the LAION 5B dataset @@ -38,74 +37,74 @@ The released dataset does not contain image data itself, but CLIP encoded vector representations of the images, and metadata like `url` and `caption`. 
-## Use cases +## Use cases The app can be used to implement several use cases over the LAION dataset, or adopted to your large-scale vector dataset: - Search with a free text prompt over the `caption` or `url` fields in the LAION dataset using Vespa's standard text-matching functionality. -- CLIP retrieval, using vector search, given a text prompt, search the image vector representations (CLIP ViT-L/14), for example for 'french cat'. +- CLIP retrieval, using vector search, given a text prompt, search the image vector representations (CLIP ViT-L/14), for example for 'french cat'. - Given an image vector representation, search for similar images in the dataset. This can for example -be used to take the output image of StableDiffusion to find similar images in the training dataset. +be used to take the output image of StableDiffusion to find similar images in the training dataset. All this combined using [Vespa's query language](https://docs.vespa.ai/en/query-language.html), - and also in combination with filters. + and also in combination with filters. -## Vespa Primitives Demonstrated +## Vespa Primitives Demonstrated -The sample application demonstrates many Vespa primitives: +The sample application demonstrates many Vespa primitives: -- Importing an [ONNX](https://onnx.ai/)-exported version of [CLIP ViT-L/14](https://github.com/openai/CLIP) -for [accelerated inference](https://blog.vespa.ai/stateful-model-serving-how-we-accelerate-inference-using-onnx-runtime/) -in [Vespa stateless](https://docs.vespa.ai/en/overview.html) containers. +- Importing an [ONNX](https://onnx.ai/)-exported version of [CLIP ViT-L/14](https://github.com/openai/CLIP) +for [accelerated inference](https://blog.vespa.ai/stateful-model-serving-how-we-accelerate-inference-using-onnx-runtime/) +in [Vespa stateless](https://docs.vespa.ai/en/overview.html) containers. The exported CLIP model encodes a free-text prompt to a joint image-text embedding space with 768 dimensions. 
- [HNSW](https://docs.vespa.ai/en/approximate-nn-hnsw.html) indexing of vector centroids drawn -from the dataset, and combination with classic Inverted File as described in +from the dataset, and combination with classic Inverted File as described in [Billion-scale vector search using hybrid HNSW-IF](https://blog.vespa.ai/vespa-hybrid-billion-scale-vector-search/). - Decoupling of vector storage and vector similarity computations. The stateless layer performs vector similarity computation over the full precision vectors. -By using Vespa's support for accelerated inference with [onnxruntime](https://onnxruntime.ai/), +By using Vespa's support for accelerated inference with [onnxruntime](https://onnxruntime.ai/), moving the majority of the vector compute to the stateless layer -allows for faster auto-scaling with daily query volume changes. -The full precision vectors are stored in Vespa's summary log store, using lossless compression (zstd). +allows for faster auto-scaling with daily query volume changes. +The full precision vectors are stored in Vespa's summary log store, using lossless compression (zstd). - Dimension reduction with PCA - The centroid vectors are compressed from 768 dimensions to 128 dimensions. This allows indexing 6x more -centroids on the same instance type due to the reduced memory footprint. With Vespa's support for distributed search, coupled with powerful -high memory instances, this allows Vespa to scale cost efficiently to trillion-sized vector datasets. -- The trained PCA matrix and matrix multiplication which projects the 768-dim vectors to 128-dimensions is -evaluated in Vespa using accelerated inference, both at indexing time and at query time. The PCA weights are represented also using ONNX. -- Phased ranking. -The image embedding vectors are also projected to 128 dimensions, stored using -memory mapped [paged attribute tensors](https://docs.vespa.ai/en/attributes.html#paged-attributes). 
-Full precision vectors are on stored on disk in Vespa summary store. +centroids on the same instance type due to the reduced memory footprint. With Vespa's support for distributed search, coupled with powerful +high memory instances, this allows Vespa to scale cost efficiently to trillion-sized vector datasets. +- The trained PCA matrix and matrix multiplication which projects the 768-dim vectors to 128-dimensions is +evaluated in Vespa using accelerated inference, both at indexing time and at query time. The PCA weights are represented also using ONNX. +- Phased ranking. +The image embedding vectors are also projected to 128 dimensions, stored using +memory mapped [paged attribute tensors](https://docs.vespa.ai/en/attributes.html#paged-attributes). +Full precision vectors are on stored on disk in Vespa summary store. The first-phase coarse search ranks vectors in the reduced vector space, per node, and results are merged from all nodes before -the final ranking phase in the stateless layer. +the final ranking phase in the stateless layer. The final ranking phase is implemented in the stateless container layer using [accelerated inference](https://blog.vespa.ai/stateful-model-serving-how-we-accelerate-inference-using-onnx-runtime/). - Combining approximate nearest neighbor search with [filters](https://blog.vespa.ai/constrained-approximate-nearest-neighbor-search/), filtering -can be on url, caption, image height, width, safety probability, NSFW label, and more. -- Hybrid ranking, both textual sparse matching features and the CLIP similarity, can be used when ranking images. +can be on url, caption, image height, width, safety probability, NSFW label, and more. +- Hybrid ranking, both textual sparse matching features and the CLIP similarity, can be used when ranking images. - Reduced tensor cell precision. The original LAION-5B dataset uses `float16`. 
The app uses Vespa's support for `bfloat16` tensors, saving 50% of storage compared to full `float` representation. -- Caching, both reduced vectors (128) cached by the OS buffer cache, and full version 768 dims are cached using Vespa summary cache. +- Caching, both reduced vectors (128) cached by the OS buffer cache, and full version 768 dims are cached using Vespa summary cache. - Query-time vector de-duping and diversification of the search engine result page using document to document similarity instead of query to document similarity. Also -accelerated by stateless model inference. -- Scale, from a single node deployment to multi-node deployment using managed [Vespa Cloud](https://cloud.vespa.ai/), -or self-hosted on-premise. +accelerated by stateless model inference. +- Scale, from a single node deployment to multi-node deployment using managed [Vespa Cloud](https://cloud.vespa.ai/), +or self-hosted on-premise. -## Stateless Components +## Stateless Components The app contains several [container components](https://docs.vespa.ai/en/jdisc/container-components.html): - [RankingSearcher](src/main/java/ai/vespa/examples/searcher/RankingSearcher.java) implements the last stage ranking using -full-precision vectors using an ONNX model for accelerated inference. -- [DedupingSearcher](src/main/java/ai/vespa/examples/searcher/DeDupingSearcher.java) implements run-time de-duping after Ranking, using -document to document similarity matrix, using an ONNX model for accelerated inference. +full-precision vectors using an ONNX model for accelerated inference. +- [DedupingSearcher](src/main/java/ai/vespa/examples/searcher/DeDupingSearcher.java) implements run-time de-duping after Ranking, using +document to document similarity matrix, using an ONNX model for accelerated inference. - [DimensionReducer](src/main/java/ai/vespa/examples/DimensionReducer.java) PCA dimension reducing vectors from 768-dims to 128-dims. 
- [AssignCentroidsDocProc](src/main/java/ai/vespa/examples/docproc/AssignCentroidsDocProc.java) searches the HNSW graph content cluster during ingestion to find the nearest centroids of the incoming vector. - [SPANNSearcher](src/main/java/ai/vespa/examples/searcher/SPANNSearcher.java) -## Deploying this app +## Deploying this app These reproducing steps, demonstrates the app using a smaller subset of the LAION-5B vector dataset, suitable -for playing around with the app on a laptop. +for playing around with the app on a laptop. **Requirements:** @@ -118,14 +117,16 @@ for playing around with the app on a laptop. * [Homebrew](https://brew.sh/) to install [Vespa CLI](https://docs.vespa.ai/en/vespa-cli.html), or download a vespa cli release from [GitHub releases](https://github.com/vespa-engine/vespa/releases). * Java 17 installed. -* Python3 and numpy to process the vector dataset +* Python3 and numpy to process the vector dataset * [Apache Maven](https://maven.apache.org/install.html) - this sample app uses custom Java components and Maven is used - to build the application. + to build the application. Verify Docker Memory Limits:
 $ docker info | grep "Total Memory"
+or
+$ podman info | grep "memTotal"
 
Install [Vespa CLI](https://docs.vespa.ai/en/vespa-cli.html): @@ -180,36 +181,36 @@ $ curl -L -o metadata_0000.parquet \ Install python dependencies to process the files:
-$ python3 -m pip install pandas numpy requests mmh3 pyarrow 
+$ python3 -m pip install pandas numpy requests mmh3 pyarrow
 
 Generate centroids, this process randomly selects vectors from the dataset to represent centroids. Performing an incremental clustering can improve vector search recall and allow
-indexing fewer centroids. For simplicity, this tutorial uses random sampling.
+indexing fewer centroids. For simplicity, this tutorial uses random sampling.
 $ python3 src/main/python/create-centroid-feed.py img_emb_0000.npy > centroids.jsonl
 
 Generate the image feed, this merges the embedding data with the metadata and creates a Vespa
-jsonl feed file, with one json operation per line.
+jsonl feed file, with one json operation per line.
 $ python3 src/main/python/create-joined-feed.py metadata_0000.parquet img_emb_0000.npy > feed.jsonl
 
-To process the entire dataset, we recommend starting several processes, each operating on separate split files -as the processing implementation is single-threaded. +To process the entire dataset, we recommend starting several processes, each operating on separate split files +as the processing implementation is single-threaded. -## Build and deploy Vespa app +## Build and deploy Vespa app `src/main/application/models` has three small ONNX models: - `vespa_innerproduct_ranker.onnx` for vector similarity (inner dot product) between the query and the vectors in the stateless container. - `vespa_pairwise_similarity.onnx` for matrix multiplication between the top retrieved vectors. -- `pca_transformer.onnx` for dimension reduction, projecting the 768-dim vector space to a 128-dimensional space. +- `pca_transformer.onnx` for dimension reduction, projecting the 768-dim vector space to a 128-dimensional space. These `ONNX` model files are generated by specifying the compute operation using [pytorch](https://pytorch.org/) and using `torch`'s ability to export the model to [ONNX](https://onnx.ai/) format: @@ -277,14 +278,14 @@ $ vespa document get \ The response contains all fields, including the full vector representation and the -reduced vector, plus all the metadata. Everything represented in the same -[schema](src/main/application/schemas/image.sd). +reduced vector, plus all the metadata. Everything represented in the same +[schema](src/main/application/schemas/image.sd). 
## Query the data -The following provides a few query examples, -`prompt` is a run-time query parameter which is used by the -[CLIPEmbeddingSearcher](src/main/java/ai/vespa/examples/searcher/CLIPEmbeddingSearcher.java) +The following provides a few query examples, +`prompt` is a run-time query parameter which is used by the +[CLIPEmbeddingSearcher](src/main/java/ai/vespa/examples/searcher/CLIPEmbeddingSearcher.java) which will encode the prompt text into a CLIP vector representation using the embedded CLIP model:
@@ -294,7 +295,7 @@ $ vespa query \
  'prompt=two dogs running on a sandy beach'
 
-Results are filtered by a constraint on the `nsfw` field. Note that even if the image is classified +Results are filtered by a constraint on the `nsfw` field. Note that even if the image is classified as `unlikely` the image content might still be explicit as the NSFW classifier is not 100% accurate. The returned images are ranked by CLIP similarity (The score is found in each hit's `relevance` field). @@ -318,7 +319,7 @@ $ vespa query \ 'prompt=two dogs running on a sandy beach' -Regular query, matching over the `default` fieldset, searching the `caption` and the `url` field, ranked by +Regular query, matching over the `default` fieldset, searching the `caption` and the `url` field, ranked by the `text` ranking profile:
@@ -329,32 +330,32 @@ $ vespa query \
  'ranking=text'
 
-The `text` [rank](https://docs.vespa.ai/en/ranking.html) profile uses -[nativeRank](https://docs.vespa.ai/en/nativerank.html), one of Vespa's many -text matching rank features. +The `text` [rank](https://docs.vespa.ai/en/ranking.html) profile uses +[nativeRank](https://docs.vespa.ai/en/nativerank.html), one of Vespa's many +text matching rank features. ## Non-native hyperparameters -There are several non-native query request +There are several non-native query request parameters that controls the vector search accuracy and performance tradeoffs. These -can be set with the request, e.g, `/search/&spann.clusters=12`. +can be set with the request, e.g, `/search/&spann.clusters=12`. - `spann.clusters`, default `64`, the number of centroids in the reduced vector space used to restrict the image search. -A higher number improves recall, but increases computational complexity and disk reads. +A higher number improves recall, but increases computational complexity and disk reads. - `rank-count`, default `1000`, the number of vectors that are fully re-ranked in the container using the full vector representation. -A higher number improves recall, but increases the computational complexity and network. +A higher number improves recall, but increases the computational complexity and network. - `collapse.enable`, default `true`, controls de-duping of the top ranked results using image to image similarity. - `collapse.similarity.max-hits`, default `1000`, the number of top-ranked hits to perform de-duping of. Must be less than `rank-count`. - `collapse.similarity.threshold`, default `0.95`, how similar a given image to image must be before it is considered a duplicate. -## Areas of improvement -There are several areas that could be improved. +## Areas of improvement +There are several areas that could be improved. - CLIP model. 
The exported text transformer model uses fixed sequence length (77), this wastes computations and makes -the model a lot slower than it has to be for shorter sequence lengths. A dynamic sequence length would -make encoding short queries a lot faster than the current model. -It would also be interesting to use the text encoder as a teacher and train a smaller distilled model using a different architecture (for example based on smaller MiniLM models). -- CLIP query embedding caching. The CLIP model is fixed and only uses the text input. Caching the map from text to -embedding would save resources. +the model a lot slower than it has to be for shorter sequence lengths. A dynamic sequence length would +make encoding short queries a lot faster than the current model. +It would also be interesting to use the text encoder as a teacher and train a smaller distilled model using a different architecture (for example based on smaller MiniLM models). +- CLIP query embedding caching. The CLIP model is fixed and only uses the text input. Caching the map from text to +embedding would save resources. ## Shutdown and remove the container: diff --git a/billion-scale-vector-search/README.md b/billion-scale-vector-search/README.md index c9bc82739..a638e0209 100644 --- a/billion-scale-vector-search/README.md +++ b/billion-scale-vector-search/README.md @@ -1,4 +1,3 @@ - @@ -7,21 +6,21 @@ #Vespa -# SPANN Billion Scale Vector Search +# SPANN Billion Scale Vector Search -This sample application demonstrates how to represent *SPANN* (Space Partitioned ANN) using Vespa.ai. +This sample application demonstrates how to represent *SPANN* (Space Partitioned ANN) using Vespa.ai. The *SPANN* approach for approximate nearest neighbor search is described in -[SPANN: Highly-efficient Billion-scale Approximate Nearest Neighbor Search](https://arxiv.org/abs/2111.08566). +[SPANN: Highly-efficient Billion-scale Approximate Nearest Neighbor Search](https://arxiv.org/abs/2111.08566). 
-SPANN uses a hybrid combination of graph and inverted index methods for approximate nearest neighbor search. +SPANN uses a hybrid combination of graph and inverted index methods for approximate nearest neighbor search. -This sample app demonstrates how the `SPANN` algorithm can be represented using Vespa. -See the [Billion-scale vector search using hybrid HNSW-IF](https://blog.vespa.ai/vespa-hybrid-billion-scale-vector-search/) for details on how `SPANN` -is represented with Vespa. +This sample app demonstrates how the `SPANN` algorithm can be represented using Vespa. +See the [Billion-scale vector search using hybrid HNSW-IF](https://blog.vespa.ai/vespa-hybrid-billion-scale-vector-search/) for details on how `SPANN` +is represented with Vespa. These reproducing steps, demonstrates the functionality using a smaller subset of the 1B vector dataset, suitable -for reproducing on a laptop. +for reproducing on a laptop. **Requirements:** @@ -30,17 +29,19 @@ for reproducing on a laptop. for details and troubleshooting * Alternatively, deploy using [Vespa Cloud](#deployment-note) * Operating system: Linux, macOS or Windows 10 Pro (Docker requirement) -* Architecture: x86_64 or arm64 +* Architecture: x86_64 or arm64 * [Homebrew](https://brew.sh/) to install [Vespa CLI](https://docs.vespa.ai/en/vespa-cli.html), or download a vespa cli release from [GitHub releases](https://github.com/vespa-engine/vespa/releases). * Java 17 installed. -* Python3 and numpy to process the vector dataset +* Python3 and numpy to process the vector dataset * [Apache Maven](https://maven.apache.org/install.html) - this sample app uses custom Java components and Maven is used - to build the application. + to build the application. Verify Docker Memory Limits:
 $ docker info | grep "Total Memory"
+or
+$ podman info | grep "memTotal"
 
Install [Vespa CLI](https://docs.vespa.ai/en/vespa-cli.html): @@ -78,8 +79,8 @@ $ vespa clone billion-scale-vector-search myapp && cd myapp ## Download Vector Data -This sample app uses the Microsoft SPACEV vector dataset from -https://big-ann-benchmarks.com/. +This sample app uses the Microsoft SPACEV vector dataset from +https://big-ann-benchmarks.com/. It uses the first 10M vectors of the 100M slice sample. This sample file is about 1GB (10M vectors): @@ -88,7 +89,7 @@ $ curl -L -o spacev10m_base.i8bin \ https://data.vespa-cloud.com/sample-apps-data/spacev10m_base.i8bin -Generate the feed file for the first 10M vectors from the 100M sample. +Generate the feed file for the first 10M vectors from the 100M sample. This step creates two feed files: * `graph-vectors.jsonl` @@ -103,7 +104,7 @@ $ python3 src/main/python/create-vespa-feed.py spacev10m_base.i8bin -## Build and deploy Vespa app +## Build and deploy Vespa app Build the sample app:
 $ mvn clean package -U
@@ -145,7 +146,7 @@ $ curl -L -o spacev10m_gt100.i8bin \
 
 Note, initially, the routine above used the query file from https://comp21storage.blob.core.windows.net/publiccontainer/comp21/spacev1b/query.i8bin
-but the link no longer works.
+but the link no longer works.
 Run first 1K queries and evaluate recall@10. A higher number of clusters gives higher recall:
diff --git a/commerce-product-ranking/README.md b/commerce-product-ranking/README.md
index f3d1688a1..8cb30a66b 100644
--- a/commerce-product-ranking/README.md
+++ b/commerce-product-ranking/README.md
@@ -6,14 +6,14 @@
   #Vespa
 
 
-# Vespa Product Ranking 
+# Vespa Product Ranking
 
 This sample application is used to demonstrate how to improve Product Search with Learning to Rank (LTR).
 
 Blog post series:
 
 * [Improving Product Search with Learning to Rank - part one](https://blog.vespa.ai/improving-product-search-with-ltr/)
-This post introduces the dataset used in this sample application and several baseline ranking models. 
+This post introduces the dataset used in this sample application and several baseline ranking models.
 * [Improving Product Search with Learning to Rank - part two](https://blog.vespa.ai/improving-product-search-with-ltr-part-two/)
 This post demonstrates how to train neural methods for search ranking. The neural training routine is found in this
 [notebook](https://github.com/vespa-engine/sample-apps/blob/master/commerce-product-ranking/notebooks/train_neural.ipynb)
@@ -21,9 +21,9 @@ This post demonstrates how to train neural methods for search ranking. The neura
 * [Improving Product Search with Learning to Rank - part three](https://blog.vespa.ai/improving-product-search-with-ltr-part-three/)
 This post demonstrates how to train GBDT methods for search ranking. The model uses also neural signals as features. See notebooks:
 [XGBoost](https://github.com/vespa-engine/sample-apps/blob/master/commerce-product-ranking/notebooks/Train-xgboost.ipynb)
-[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/vespa-engine/sample-apps/blob/master/commerce-product-ranking/notebooks/Train-xgboost.ipynb) and 
+[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/vespa-engine/sample-apps/blob/master/commerce-product-ranking/notebooks/Train-xgboost.ipynb) and
 [LightGBM](https://github.com/vespa-engine/sample-apps/blob/master/commerce-product-ranking/notebooks/Train-lightgbm.ipynb)
-[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/vespa-engine/sample-apps/blob/master/commerce-product-ranking/notebooks/Train-lightgbm.ipynb) 
+[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/vespa-engine/sample-apps/blob/master/commerce-product-ranking/notebooks/Train-lightgbm.ipynb)
 
 This work uses the largest product relevance dataset released by Amazon:
 
@@ -35,12 +35,12 @@ This work uses the largest product relevance dataset released by Amazon:
 > Each query-product pair is accompanied by additional information.
 > The dataset is multilingual, as it contains queries in English, Japanese, and Spanish.
 
-The dataset is found at [amazon-science/esci-data](https://github.com/amazon-science/esci-data). 
+The dataset is found at [amazon-science/esci-data](https://github.com/amazon-science/esci-data).
 The dataset is released under the [Apache 2.0 license](https://github.com/amazon-science/esci-data/blob/main/LICENSE).
 
 ## Quick start
 
-The following is a quick start recipe on how to get started with this application. 
+The following is a quick start recipe on how to get started with this application.
 
 * [Docker](https://www.docker.com/) Desktop installed and running. 6 GB available memory for Docker is recommended.
   Refer to [Docker memory](https://docs.vespa.ai/en/operations-selfhosted/docker-containers.html#memory)
@@ -48,14 +48,16 @@ The following is a quick start recipe on how to get started with this applicatio
 * Alternatively, deploy using [Vespa Cloud](#deployment-note)
 * Operating system: Linux, macOS or Windows 10 Pro (Docker requirement)
 * Architecture: x86_64 or arm64
-* [Homebrew](https://brew.sh/) to install [Vespa CLI](https://docs.vespa.ai/en/vespa-cli.html), or download 
+* [Homebrew](https://brew.sh/) to install [Vespa CLI](https://docs.vespa.ai/en/vespa-cli.html), or download
   a vespa cli release from [GitHub releases](https://github.com/vespa-engine/vespa/releases).
 * zstd: `brew install zstd`
-* Python3 with `requests` `pyarrow` and `pandas` installed 
+* Python3 with `requests` `pyarrow` and `pandas` installed
 
 Validate Docker resource settings, should be minimum 6 GB:
 
 $ docker info | grep "Total Memory"
+or
+$ podman info | grep "memTotal"
 
Install [Vespa CLI](https://docs.vespa.ai/en/vespa-cli.html): @@ -94,7 +96,7 @@ $ curl -L -o application/models/title_ranker.onnx \ See [scripts/export-bi-encoder.py](scripts/export-bi-encoder.py) and [scripts/export-cross-encoder.py](scripts/export-cross-encoder.py) for how -to export models from PyTorch to ONNX format. +to export models from PyTorch to ONNX format. Deploy the application:
@@ -113,7 +115,7 @@ It is possible to deploy this app to
 
 ## Run basic system test
 
-This step is optional, but it indexes two 
+This step is optional, but it indexes two
 documents and runs a query [test](https://docs.vespa.ai/en/reference/testing.html)
 
 
@@ -130,9 +132,9 @@ $ zstdcat sample-data/sample-products.jsonl.zstd | vespa feed -
 
-## Evaluation +## Evaluation -Evaluate the `semantic-title` rank profile using the evaluation +Evaluate the `semantic-title` rank profile using the evaluation script ([scripts/evaluate.py](scripts/evaluate.py)). Install requirements @@ -145,7 +147,7 @@ pip3 install numpy pandas pyarrow requests $ python3 scripts/evaluate.py \ --endpoint http://localhost:8080/search/ \ --example_file sample-data/test-sample.parquet \ - --ranking semantic-title + --ranking semantic-title
[evaluate.py](scripts/evaluate.py) runs all the queries in the test split using the `--ranking` `` @@ -153,7 +155,7 @@ and produces a `.run` file with the top ranked results. This file is formatted in the format that `trec_eval` expects.
-$ cat semantic-title.run 
+$ cat semantic-title.run
 
 Example ranking produced by Vespa using the `semantic-title` rank-profile for query 535:
@@ -196,10 +198,10 @@ Run evaluation :
 $ trec_eval test.qrels semantic-title.run -m 'ndcg.1=0,2=0.01,3=0.1,4=1'
-This particular product ranking for the query produces a NDCG score of 0.7046. +This particular product ranking for the query produces a NDCG score of 0.7046. Note that the `sample-data/test-sample.parquet` file only contains one query. To get the overall score, one must compute all the NDCG scores of all queries in the -test split and report the *average* NDCG score. +test split and report the *average* NDCG score. Note that the evaluation uses custom NDCG label gains: @@ -231,7 +233,7 @@ $ docker rm -f vespa -## Full evaluation +## Full evaluation Download a pre-processed feed file with all (1,215,854) products: @@ -240,21 +242,21 @@ $ curl -L -o product-search-products.jsonl.zstd \ https://data.vespa-cloud.com/sample-apps-data/product-search-products.jsonl.zstd -This step is resource intensive as the semantic embedding model encodes +This step is resource intensive as the semantic embedding model encodes the product title and description into the dense embedding vector space.
 $ zstdcat product-search-products.jsonl.zstd | vespa feed -
 
-Evaluate the `hybrid` baseline rank profile using the evaluation
+Evaluate the `hybrid` baseline rank profile using the evaluation
 script ([scripts/evaluate.py](scripts/evaluate.py)).
 $ python3 scripts/evaluate.py \
   --endpoint http://localhost:8080/search/ \
   --example_file "https://github.com/amazon-science/esci-data/blob/main/shopping_queries_dataset/shopping_queries_dataset_examples.parquet?raw=true" \
-  --ranking semantic-title 
+  --ranking semantic-title
 
For Vespa cloud deployments we need to pass certificate and the private key. diff --git a/custom-embeddings/README.md b/custom-embeddings/README.md index 5b27478c5..712e312f7 100644 --- a/custom-embeddings/README.md +++ b/custom-embeddings/README.md @@ -1,4 +1,3 @@ - @@ -9,9 +8,9 @@ # Customizing Frozen Data Embeddings in Vespa -This sample application is used to demonstrate how to adapt frozen embeddings from foundational -embedding models. -Frozen data embeddings from Foundational models are an emerging industry practice for reducing the complexity of maintaining and versioning embeddings. The frozen data embeddings are re-used for various tasks, such as classification, search, or recommendations. +This sample application is used to demonstrate how to adapt frozen embeddings from foundational +embedding models. +Frozen data embeddings from Foundational models are an emerging industry practice for reducing the complexity of maintaining and versioning embeddings. The frozen data embeddings are re-used for various tasks, such as classification, search, or recommendations. Read the [blog post](https://blog.vespa.ai/). @@ -25,12 +24,14 @@ The following is a quick start recipe on how to get started with this applicatio * Alternatively, deploy using [Vespa Cloud](#deployment-note) * Operating system: Linux, macOS or Windows 10 Pro (Docker requirement) * Architecture: x86_64 or arm64 -* [Homebrew](https://brew.sh/) to install [Vespa CLI](https://docs.vespa.ai/en/vespa-cli.html), or download +* [Homebrew](https://brew.sh/) to install [Vespa CLI](https://docs.vespa.ai/en/vespa-cli.html), or download a vespa cli release from [GitHub releases](https://github.com/vespa-engine/vespa/releases). Validate Docker resource settings, should be minimum 4 GB:
 $ docker info | grep "Total Memory"
+or
+$ podman info | grep "memTotal"
 
Install [Vespa CLI](https://docs.vespa.ai/en/vespa-cli.html): @@ -61,7 +62,7 @@ Download this sample application: $ vespa clone custom-embeddings my-app && cd my-app -Download a frozen embedding model file, see +Download a frozen embedding model file, see [text embeddings made easy](https://blog.vespa.ai/text-embedding-made-simple/) for details:
 $ mkdir -p models
@@ -71,7 +72,7 @@ $ curl -L -o models/tokenizer.json \
 $ curl -L -o models/frozen.onnx \
   https://github.com/vespa-engine/sample-apps/raw/master/simple-semantic-search/model/e5-small-v2-int8.onnx
 
-$ cp models/frozen.onnx models/tuned.onnx 
+$ cp models/frozen.onnx models/tuned.onnx
 
In this case, we re-use the frozen model as the tuned model to demonstrate functionality. @@ -95,36 +96,36 @@ vespa document ext/3.json ## Query and ranking examples -We demonstrate using `vespa cli`, use `-v` to see the curl equivalent using HTTP api. +We demonstrate using `vespa cli`, use `-v` to see the curl equivalent using HTTP api. ### Simple retrieve all documents with undefined ranking:
 vespa query 'yql=select * from doc where true' \
 'ranking=unranked'
 
-Notice the `relevance`, which is assigned by the rank-profile. +Notice the `relevance`, which is assigned by the rank-profile. -### Using the frozen query tower +### Using the frozen query tower
 vespa query 'yql=select * from doc where {targetHits:10}nearestNeighbor(embedding, q)' \
 'input.query(q)=embed(frozen, "space contains many suns")'
 
-### Using the tuned query tower +### Using the tuned query tower
 vespa query 'yql=select * from doc where {targetHits:10}nearestNeighbor(embedding, q)' \
 'input.query(q)=embed(tuned, "space contains many suns")'
 
In this case, the tuned model is equivalent to the frozen query tower that was used for document embeddings. -### Using the simple weight transformation query tower +### Using the simple weight transformation query tower
 vespa query 'yql=select * from doc where {targetHits:10}nearestNeighbor(embedding, q)' \
 'input.query(q)=embed(tuned, "space contains many suns")' \
 'ranking=simple-similarity'
 
This invokes the `simple-similarity` ranking model, which performs the query transformation -to the tuned embedding. +to the tuned embedding. ### Using the Deep Neural Network similarity
@@ -134,12 +135,12 @@ vespa query 'yql=select * from doc where {targetHits:10}nearestNeighbor(embeddin
 
Note that this just demonstrates the functionality, the custom similarity model is -initialized from random weights. +initialized from random weights. ### Dump all embeddings This is useful for training routines, getting the frozen document embeddings out of Vespa:
-vespa visit --field-set "[all]" > ../vector-data.jsonl 
+vespa visit --field-set "[all]" > ../vector-data.jsonl
 
### Get a specific document and its embedding(s): diff --git a/examples/document-processing/README.md b/examples/document-processing/README.md index 7a1950247..143c57439 100644 --- a/examples/document-processing/README.md +++ b/examples/document-processing/README.md @@ -1,4 +1,3 @@ - @@ -38,6 +37,8 @@ Refer to [Docker memory](https://docs.vespa.ai/en/operations-selfhosted/docker-c for details and troubleshooting:
 $ docker info | grep "Total Memory"
+or
+$ podman info | grep "memTotal"
 
diff --git a/examples/generic-request-processing/README.md b/examples/generic-request-processing/README.md index 79cfac9cb..da1b4995d 100644 --- a/examples/generic-request-processing/README.md +++ b/examples/generic-request-processing/README.md @@ -1,4 +1,3 @@ - @@ -24,6 +23,8 @@ Refer to [Docker memory](https://docs.vespa.ai/en/operations-selfhosted/docker-c for details and troubleshooting:
 $ docker info | grep "Total Memory"
+or
+$ podman info | grep "memTotal"
 
**Check-out, compile and run:** diff --git a/examples/http-api-using-request-handlers-and-processors/README.md b/examples/http-api-using-request-handlers-and-processors/README.md index 13d14e55f..8220092e9 100644 --- a/examples/http-api-using-request-handlers-and-processors/README.md +++ b/examples/http-api-using-request-handlers-and-processors/README.md @@ -1,4 +1,3 @@ - @@ -22,6 +21,8 @@ Refer to [Docker memory](https://docs.vespa.ai/en/operations-selfhosted/docker-c for details and troubleshooting:
 $ docker info | grep "Total Memory"
+or
+$ podman info | grep "memTotal"
 
**Check-out, compile and run:** diff --git a/examples/multiple-bundles/README.md b/examples/multiple-bundles/README.md index 4ed1793c0..c8a5a909c 100644 --- a/examples/multiple-bundles/README.md +++ b/examples/multiple-bundles/README.md @@ -1,4 +1,3 @@ - @@ -27,6 +26,8 @@ Refer to [Docker memory](https://docs.vespa.ai/en/operations-selfhosted/docker-c for details and troubleshooting:
 $ docker info | grep "Total Memory"
+or
+$ podman info | grep "memTotal"
 
**Check-out, compile and run:** diff --git a/examples/operations/multinode-HA/README.md b/examples/operations/multinode-HA/README.md index 9071b79c9..bebfebece 100644 --- a/examples/operations/multinode-HA/README.md +++ b/examples/operations/multinode-HA/README.md @@ -1,4 +1,3 @@ - @@ -55,6 +54,8 @@ This guide is tested with Docker using 12G Memory:
 $ docker info | grep "Total Memory"
+or
+$ podman info | grep "memTotal"
 
Note that this guide is configured for minimum memory use for easier testing, adding: @@ -262,13 +263,13 @@ slobrok storagenode $ docker exec -it node0 /opt/vespa/bin/vespa-model-inspect service container -container @ node4.vespanet : +container @ node4.vespanet : default/container.0 tcp/node4.vespanet:8080 (STATE EXTERNAL QUERY HTTP) tcp/node4.vespanet:19100 (EXTERNAL HTTP) tcp/node4.vespanet:19101 (MESSAGING RPC) tcp/node4.vespanet:19102 (ADMIN RPC) -container @ node5.vespanet : +container @ node5.vespanet : default/container.1 tcp/node5.vespanet:8080 (STATE EXTERNAL QUERY HTTP) tcp/node5.vespanet:19100 (EXTERNAL HTTP) @@ -454,7 +455,7 @@ Notes: * See [slobrok](https://docs.vespa.ai/en/slobrok.html) for the Vespa naming service * The [cluster controller](https://docs.vespa.ai/en/content/content-nodes.html#cluster-controller) cluster manages the system state, and is useful in debugging cluster failures. -* The [metrics proxy](https://docs.vespa.ai/en/reference/metrics.html) is used to aggregate metrics +* The [metrics proxy](https://docs.vespa.ai/en/reference/metrics.html) is used to aggregate metrics from all processes on a node, serving on _http://node:19092/metrics/v1/values_ @@ -744,7 +745,7 @@ export VESPA_CLI_DATA_PLANE_KEY_FILE=pki/client/client.key ``` Feed documents: ``` -vespa feed -t https://localhost:8443 ../../../album-recommendation/ext/documents.jsonl +vespa feed -t https://localhost:8443 ../../../album-recommendation/ext/documents.jsonl ``` Visit documents: ``` @@ -873,32 +874,32 @@ Normal deploy output in this guide, as the service nodes are not started yet: Ports mapped in this guide: ```sh $ netstat -an | egrep '1907[1,2,3]|1905[0,1,2]|19098|2009[2,3,4,5,6,7,8,9]|2010[0,1]|1910[0,1,2]|808[0,1,2,3]|1910[7,8]' | sort -tcp46 0 0 *.19050 *.* LISTEN -tcp46 0 0 *.19051 *.* LISTEN -tcp46 0 0 *.19052 *.* LISTEN -tcp46 0 0 *.19071 *.* LISTEN -tcp46 0 0 *.19072 *.* LISTEN -tcp46 0 0 *.19073 *.* LISTEN -tcp46 0 0 *.19098 *.* LISTEN -tcp46 0 0 *.19100 *.* 
LISTEN -tcp46 0 0 *.19101 *.* LISTEN -tcp46 0 0 *.19102 *.* LISTEN -tcp46 0 0 *.19107 *.* LISTEN +tcp46 0 0 *.19050 *.* LISTEN +tcp46 0 0 *.19051 *.* LISTEN +tcp46 0 0 *.19052 *.* LISTEN +tcp46 0 0 *.19071 *.* LISTEN +tcp46 0 0 *.19072 *.* LISTEN +tcp46 0 0 *.19073 *.* LISTEN +tcp46 0 0 *.19098 *.* LISTEN +tcp46 0 0 *.19100 *.* LISTEN +tcp46 0 0 *.19101 *.* LISTEN +tcp46 0 0 *.19102 *.* LISTEN +tcp46 0 0 *.19107 *.* LISTEN tcp46 0 0 *.19108 *.* LISTEN -tcp46 0 0 *.20092 *.* LISTEN -tcp46 0 0 *.20093 *.* LISTEN -tcp46 0 0 *.20094 *.* LISTEN -tcp46 0 0 *.20095 *.* LISTEN -tcp46 0 0 *.20096 *.* LISTEN -tcp46 0 0 *.20097 *.* LISTEN -tcp46 0 0 *.20098 *.* LISTEN -tcp46 0 0 *.20099 *.* LISTEN -tcp46 0 0 *.20100 *.* LISTEN -tcp46 0 0 *.20101 *.* LISTEN -tcp46 0 0 *.8080 *.* LISTEN -tcp46 0 0 *.8081 *.* LISTEN -tcp46 0 0 *.8082 *.* LISTEN -tcp46 0 0 *.8083 *.* LISTEN +tcp46 0 0 *.20092 *.* LISTEN +tcp46 0 0 *.20093 *.* LISTEN +tcp46 0 0 *.20094 *.* LISTEN +tcp46 0 0 *.20095 *.* LISTEN +tcp46 0 0 *.20096 *.* LISTEN +tcp46 0 0 *.20097 *.* LISTEN +tcp46 0 0 *.20098 *.* LISTEN +tcp46 0 0 *.20099 *.* LISTEN +tcp46 0 0 *.20100 *.* LISTEN +tcp46 0 0 *.20101 *.* LISTEN +tcp46 0 0 *.8080 *.* LISTEN +tcp46 0 0 *.8081 *.* LISTEN +tcp46 0 0 *.8082 *.* LISTEN +tcp46 0 0 *.8083 *.* LISTEN ``` ## Clean up after testing diff --git a/examples/part-purchases-demo/README.md b/examples/part-purchases-demo/README.md index d673ce88a..41360961d 100644 --- a/examples/part-purchases-demo/README.md +++ b/examples/part-purchases-demo/README.md @@ -25,6 +25,8 @@ A sample Vespa application to assist with learning how to group according to the **Validate environment, should be minimum 4G:**
 $ docker info | grep "Total Memory"
+or
+$ podman info | grep "memTotal"
 
Refer to [Docker memory](https://docs.vespa.ai/en/operations-selfhosted/docker-containers.html#memory) diff --git a/examples/predicate-fields/README.md b/examples/predicate-fields/README.md index e379fb093..5970328d3 100644 --- a/examples/predicate-fields/README.md +++ b/examples/predicate-fields/README.md @@ -1,4 +1,3 @@ - @@ -10,13 +9,13 @@ # Vespa sample application - Predicate Search -This sample application demonstrates how to use Vespa [predicate fields](https://docs.vespa.ai/en/predicate-fields.html) +This sample application demonstrates how to use Vespa [predicate fields](https://docs.vespa.ai/en/predicate-fields.html) for indexing boolean *document* constraints. A predicate is a specification of a -boolean constraint in the form of a boolean expression. Vespa's predicate fields +boolean constraint in the form of a boolean expression. Vespa's predicate fields are used to implement [targeted advertising](https://en.wikipedia.org/wiki/Targeted_advertising) systems at scale. -For example, this predicate using three target +For example, this predicate using three target properties or attributes (not to be confused with Vespa [attributes](https://docs.vespa.ai/en/attributes.html)): > gender in ['male'] and age in [30..40] and income in [200..50000] @@ -36,7 +35,7 @@ a high income (measured in thousands). Both `Bob` and `Alice` are indexed in the marketplace's index system (powered by Vespa of course) as Vespa documents. The predicate expression in the *document* determines which queries (other users) they would be retrieved for. The marketplace owner is responsible for managing available targeting properties (e.g `gender`, `age` and `income`) and -at query or recommendation time, set all known properties of the query side user. +at query or recommendation time, set all known properties of the query side user. 
We also demonstrate how the marketplace can implement query side filter over regular Vespa fields, so a user `Karen` can also specify regular query side constraints (for example, searching for users in a certain age group). @@ -68,6 +67,8 @@ Requirements: Validate Docker resource settings, should be minimum 4 GB:
 $ docker info | grep "Total Memory"
+or
+$ podman info | grep "memTotal"
 
Install [Vespa CLI](https://docs.vespa.ai/en/vespa-cli.html): @@ -142,20 +143,20 @@ $ vespa document -v user.json A user, _Ronald_, enters the marketplace home page and the marketplace knows the following properties about _Ronald_: -- gender: male +- gender: male - age: 32 - income 3000 -The marketplace uses these properties when matching against the index of users using the +The marketplace uses these properties when matching against the index of users using the [predicate](https://docs.vespa.ai/en/reference/query-language-reference.html#predicate) query operator:
 $ vespa query 'yql=select * from sources * where predicate(target, {"gender":["male"]}, {"age":32, "income": 3000})'
 
-The above request will retrieve both _Karen_ and _Alice_ as their `target` predicate matches the user properties. +The above request will retrieve both _Karen_ and _Alice_ as their `target` predicate matches the user properties. -If `Ronald`'s income estimate drops to 100K, _Alice_ will no longer match since _Alice_ +If `Ronald`'s income estimate drops to 100K, _Alice_ will no longer match since _Alice_ has specified a picky income limitation.
@@ -163,13 +164,13 @@ $ vespa query 'yql=select * from sources * where predicate(target, {"gender":["m
 
-## Matching combining predicate with regular filters +## Matching combining predicate with regular filters Another user, _Jon_, enters the marketplace's search page. The marketplace knows the following properties about _Jon_: - gender: male - age: 32 - income 100 -- hobby: sports +- hobby: sports The marketplace search page will fill in the known properties and perform a search against the index of users: @@ -177,8 +178,8 @@ The marketplace search page will fill in the known properties and perform a sear $ vespa query 'yql=select * from sources * where predicate(target, {"gender":["male"], "hobby":["sports"]}, {"age":32, "income": 100})' -The query returns both _Bob_ and _Karen_. Jon is mostly interested in men, so the marketplace can -specify a regular filter on the `gender` field using regular YQL filter syntax, adding `and gender contains "male"` +The query returns both _Bob_ and _Karen_. Jon is mostly interested in men, so the marketplace can +specify a regular filter on the `gender` field using regular YQL filter syntax, adding `and gender contains "male"` as a query constraint:
@@ -193,9 +194,9 @@ This is an example of two-sided filtering, both the search user and the indexed
 Predicate fields control matching and as we have seen from the above examples,
 can also be used with regular query filters.
 
-The combination of document side predicate and query filters determines what documents are returned, 
+The combination of document side predicate and query filters determines what documents are returned,
 but also which documents (users) are exposed to
-[Vespa's ranking framework](https://docs.vespa.ai/en/ranking.html). 
+[Vespa's ranking framework](https://docs.vespa.ai/en/ranking.html).
 
 Feed data with user profile embedding vectors, and the marketplace business user `cpc`:
 
@@ -208,7 +209,7 @@ _Ronald_, enters the marketplace home page again
 - gender: male
 - age: 32
 - income 3000
-- Interest embedding representation based on past user to user interactions, or explicit preferences. 
+- Interest embedding representation based on past user to user interactions, or explicit preferences.
 
 And the marketplace runs a recommendation query to display users for _Ronald_:
 
@@ -218,15 +219,15 @@ $ vespa query 'yql=select * from sources * where predicate(target, {"gender":["m
 
 Notice that we match both _Alice_ and _Karen_, but _Karen_ is ranked higher because _Karen_ has paid more,
 her `cpc` score is higher. Notice that the `relevance` is now non-zero, in all the previous examples, the ordering
-of the users was non-deterministic. The ranking formula is expressed in the [user](src/main/application/schemas/user.sd) 
+of the users was non-deterministic. The ranking formula is expressed in the [user](src/main/application/schemas/user.sd)
 schema `default` rank-profile
 
 If we now add personalization to the ranking mix, _Alice_ is ranked higher than _Karen_,
-as _Alice_ is closer to _Ronald_ in the interest embedding vector space. 
+as _Alice_ is closer to _Ronald_ in the interest embedding vector space.
 
 This query combines the `predicate` with the [nearestNeighbor](https://docs.vespa.ai/en/nearest-neighbor-search.html)
 query operator. The marketplace sends the interest embedding vector representation of _Ronald_ with the query
-as a query tensor. 
+as a query tensor.
 
 
 $ vespa query 'yql=select documentid from sources * where (predicate(target, {"gender":["male"]}, {"age":32, "income": 3000})) and ({targetHits:10}nearestNeighbor(profile,profile))' \
diff --git a/incremental-search/search-as-you-type/README.md b/incremental-search/search-as-you-type/README.md
index a85f61880..21105c1b8 100644
--- a/incremental-search/search-as-you-type/README.md
+++ b/incremental-search/search-as-you-type/README.md
@@ -11,14 +11,14 @@
 Uses N-grams to simulate substring search.
 
 
-## Quick Start 
+## Quick Start
 Requirements:
 * [Docker](https://www.docker.com/) Desktop installed and running. 4GB available memory for Docker is recommended.
   Refer to [Docker memory](https://docs.vespa.ai/en/operations-selfhosted/docker-containers.html#memory)
   for details and troubleshooting
 * Alternatively, deploy using [Vespa Cloud](#deployment-note)
 * Operating system: Linux, macOS or Windows 10 Pro (Docker requirement)
-* Architecture: x86_64 or arm64 
+* Architecture: x86_64 or arm64
 * [Homebrew](https://brew.sh/) to install [Vespa CLI](https://docs.vespa.ai/en/vespa-cli.html), or download
   a vespa cli release from [GitHub releases](https://github.com/vespa-engine/vespa/releases).
 * Java 17 installed.
@@ -28,6 +28,8 @@ Requirements:
 Validate environment, must be minimum 4GB:
 
 $ docker info | grep "Total Memory"
+or
+$ podman info | grep "memTotal"
 
Install [Vespa CLI](https://docs.vespa.ai/en/vespa-cli.html): @@ -61,7 +63,7 @@ $ mvn clean package -U Download feed file:
 $ curl -L -o search-as-you-type-index.jsonl \
-  https://data.vespa-cloud.com/sample-apps-data/search-as-you-type-index.jsonl 
+  https://data.vespa-cloud.com/sample-apps-data/search-as-you-type-index.jsonl
 
Verify that configuration service (deploy api) is ready: diff --git a/incremental-search/search-suggestions/README.md b/incremental-search/search-suggestions/README.md index 1d76dd81c..d16f73abf 100644 --- a/incremental-search/search-suggestions/README.md +++ b/incremental-search/search-suggestions/README.md @@ -52,8 +52,8 @@ A simplistic ranking based on term frequencies is used - a real application could implement a more sophisticated ranking for better suggestions. ### Performance considerations -For short inputs, a trick is to use range queries with -[hitLimit](https://docs.vespa.ai/en/reference/query-language-reference.html#hitlimit) on a fast-search attribute. +For short inputs, a trick is to use range queries with +[hitLimit](https://docs.vespa.ai/en/reference/query-language-reference.html#hitlimit) on a fast-search attribute. This changes the semantics of the prefix query to only match against documents in the top 1K, which is usually what one wants for short prefix lengths. * [Advanced range search with hitLimit](https://docs.vespa.ai/en/performance/practical-search-performance-guide.html#advanced-range-search-with-hitlimit) @@ -65,7 +65,7 @@ Requirements: for details and troubleshooting * Alternatively, deploy using [Vespa Cloud](#deployment-note) * Operating system: Linux, macOS or Windows 10 Pro (Docker requirement) -* Architecture: x86_64 or arm64 +* Architecture: x86_64 or arm64 * [Homebrew](https://brew.sh/) to install [Vespa CLI](https://docs.vespa.ai/en/vespa-cli.html), or download a vespa cli release from [GitHub releases](https://github.com/vespa-engine/vespa/releases). * Java 17 installed. @@ -75,6 +75,8 @@ Requirements: Validate environment, must be minimum 4GB:
 $ docker info | grep "Total Memory"
+or
+$ podman info | grep "memTotal"
 
Install [Vespa CLI](https://docs.vespa.ai/en/vespa-cli.html): @@ -168,7 +170,7 @@ $ docker rm -f vespa
-## Appendix +## Appendix ### Indexed prefix search @@ -176,7 +178,7 @@ Indexed prefix search matches documents where the prefix of the term matches the To do an indexed prefix search the query needs \[{"prefix":true}], see [example](https://docs.vespa.ai/en/reference/schema-reference#match). -It is important to note that this type of prefix search is not supported for fields set to _index_ in the schema. +It is important to note that this type of prefix search is not supported for fields set to _index_ in the schema. Therefore, all fields for prefix search has to be _attributes_. Indexed prefix search is faster than using streaming search, diff --git a/model-inference/README.md b/model-inference/README.md index 10277f534..9ba4b09c5 100644 --- a/model-inference/README.md +++ b/model-inference/README.md @@ -27,7 +27,7 @@ various ways stateless model evaluation can be used in Vespa: - In a post-processing searcher to run a model in batch with the result from the content node. -### Quick Start +### Quick Start Requirements: * [Docker](https://www.docker.com/) Desktop installed and running. 6GB available memory for Docker is recommended. @@ -35,7 +35,7 @@ Requirements: for details and troubleshooting * Alternatively, deploy using [Vespa Cloud](#deployment-note) * Operating system: Linux, macOS or Windows 10 Pro (Docker requirement) -* Architecture: x86_64 or arm64 +* Architecture: x86_64 or arm64 * Minimum 4GB memory dedicated to Docker. * [Homebrew](https://brew.sh/) to install [Vespa CLI](https://docs.vespa.ai/en/vespa-cli.html), or download a vespa cli release from [GitHub releases](https://github.com/vespa-engine/vespa/releases). @@ -46,6 +46,8 @@ Requirements: Validate environment, should be minimum 4G:
 $ docker info | grep "Total Memory"
+or
+$ podman info | grep "memTotal"
 
Install [Vespa CLI](https://docs.vespa.ai/en/vespa-cli.html): @@ -107,7 +109,7 @@ In the following examples we use vespa CLI's curl option, it manages the endpoin List the available models:
-$ vespa curl /model-evaluation/v1/ 
+$ vespa curl /model-evaluation/v1/
 
Details of the `transformer` model: diff --git a/msmarco-ranking/README.md b/msmarco-ranking/README.md index bbf87dc28..f6e730885 100644 --- a/msmarco-ranking/README.md +++ b/msmarco-ranking/README.md @@ -6,9 +6,9 @@ #Vespa -# MS Marco Passage Ranking +# MS Marco Passage Ranking -This sample application demonstrates how to efficiently represent three ways of applying Transformer-based ranking +This sample application demonstrates how to efficiently represent three ways of applying Transformer-based ranking models for text ranking in Vespa. Blog posts with more details: @@ -19,7 +19,7 @@ Blog posts with more details: - [Post four: Re-ranking using cross-encoders](https://blog.vespa.ai/pretrained-transformer-language-models-for-search-part-4/). -## Transformers for Ranking +## Transformers for Ranking ![Colbert overview](img/colbert_illustration.png) *Illustration from [ColBERT paper](https://arxiv.org/abs/2004.12832)*. @@ -28,28 +28,28 @@ This sample application demonstrates: - Simple single-stage sparse retrieval accelerated by the [WAND](https://docs.vespa.ai/en/using-wand-with-vespa.html) - dynamic pruning algorithm with [BM25](https://docs.vespa.ai/en/reference/bm25.html) ranking. + dynamic pruning algorithm with [BM25](https://docs.vespa.ai/en/reference/bm25.html) ranking. - Dense (vector) search retrieval for efficient candidate retrieval using Vespa's support for [approximate nearest neighbor search](https://docs.vespa.ai/en/approximate-nn-hnsw.html). - Illustrated in figure **a**. + Illustrated in figure **a**. - Re-ranking using the [Late contextual interaction over BERT (ColBERT)](https://arxiv.org/abs/2004.12832) model - This method is illustrated in figure **d**. + This method is illustrated in figure **d**. - Re-ranking using a *cross-encoder* with cross attention between the query and document terms. This method is illustrated in figure **c**.
- [Multiphase retrieval and ranking](https://docs.vespa.ai/en/phased-ranking.html) combining efficient retrieval (WAND or ANN) with re-ranking stages. - Using Vespa [embedder](https://docs.vespa.ai/en/embedding.html) functionality. -- Hybrid ranking functionality +- Hybrid ranking functionality -## Retrieval and Ranking -There are several ranking profiles defined in the *passage* document schema. +## Retrieval and Ranking +There are several ranking profiles defined in the *passage* document schema. See [vespa ranking documentation](https://docs.vespa.ai/en/ranking.html) for an overview of how to represent ranking in Vespa. ## Quick start -Make sure to read and agree to the terms and conditions of [MS Marco](https://microsoft.github.io/msmarco/) -before downloading the dataset. The following is a quick start recipe for getting started with a tiny slice of +Make sure to read and agree to the terms and conditions of [MS Marco](https://microsoft.github.io/msmarco/) +before downloading the dataset. The following is a quick start recipe for getting started with a tiny slice of the ms marco passage ranking dataset. Requirements: @@ -60,7 +60,7 @@ Requirements: * Alternatively, deploy using [Vespa Cloud](https://cloud.vespa.ai/) * Operating system: Linux, macOS, or Windows 10 Pro (Docker requirement) * Architecture: x86_64 or arm64 -* [Homebrew](https://brew.sh/) to install [Vespa CLI](https://docs.vespa.ai/en/vespa-cli.html), or download +* [Homebrew](https://brew.sh/) to install [Vespa CLI](https://docs.vespa.ai/en/vespa-cli.html), or download a vespa-cli release from [GitHub releases](https://github.com/vespa-engine/vespa/releases). * python (requests, tqdm, ir_datasets) @@ -68,6 +68,8 @@ Requirements: Validate Docker resource settings, which should be a minimum of 6 GB:
 $ docker info | grep "Total Memory"
+or
+$ podman info | grep "memTotal"
 
Install [Vespa CLI](https://docs.vespa.ai/en/vespa-cli.html): @@ -75,10 +77,10 @@ Install [Vespa CLI](https://docs.vespa.ai/en/vespa-cli.html): $ brew install vespa-cli
-Install python dependencies for exporting the passage dataset: +Install python dependencies for exporting the passage dataset:
-$ pip3 install ir_datasets 
+$ pip3 install ir_datasets
 
For local deployment using docker image: @@ -116,7 +118,7 @@ $ curl -L https://huggingface.co/Xenova/ms-marco-MiniLM-L-6-v2/raw/main/tokenize Deploy the application:
-$ vespa deploy --wait 300 
+$ vespa deploy --wait 300
 
## Feeding sample data @@ -126,18 +128,18 @@ Feed a small sample of data: $ vespa feed ext/docs.jsonl -## Query examples +## Query examples -For example, do a query for *what was the Manhattan Project*: +For example, do a query for *what was the Manhattan Project*: -Note that the `@query` parameter substitution syntax requires Vespa 8.299 or above. +Note that the `@query` parameter substitution syntax requires Vespa 8.299 or above.
 vespa query 'query=what was the manhattan project' \
  'yql=select * from passage where {targetHits: 100}nearestNeighbor(e5, q)'\
  'input.query(q)=embed(e5, @query)' \
  'input.query(qt)=embed(colbert, @query)' \
- 'ranking=e5-colbert' 
+ 'ranking=e5-colbert'
 
@@ -159,15 +161,15 @@ $ docker rm -f vespa
 ### Ranking Evaluation using Ms Marco Passage Ranking development queries
 
 With the [evaluate_passage_run.py](python/evaluate_passage_run.py)
-we can run retrieval and ranking using the methods demonstrated. 
+we can run retrieval and ranking using the methods demonstrated.
 
 To do so, we need to index the entire dataset as follows:
 
-ir_datasets export msmarco-passage docs --format jsonl |python3 python/to-vespa-feed.py | vespa feed - 
+ir_datasets export msmarco-passage docs --format jsonl |python3 python/to-vespa-feed.py | vespa feed -
 
Note that the ir_datasets utility will download MS Marco query evaluation data, -so the first run will take some time to complete. +so the first run will take some time to complete. **BM25(WAND) Single-phase sparse retrieval**
diff --git a/multi-vector-indexing/README.md b/multi-vector-indexing/README.md
index 8b268ee51..8e245b93e 100644
--- a/multi-vector-indexing/README.md
+++ b/multi-vector-indexing/README.md
@@ -9,7 +9,7 @@
 # Vespa Multi-Vector Indexing with HNSW
 
 This sample application is used to demonstrate multi-vector indexing with Vespa.
-Multi-vector indexing was introduced in Vespa 8.144.19. 
+Multi-vector indexing was introduced in Vespa 8.144.19.
 Read the [blog post](https://blog.vespa.ai/semantic-search-with-multi-vector-indexing/) announcing multi-vector indexing.
 
 Go to [multi-vector-indexing](https://pyvespa.readthedocs.io/en/latest/examples/multi-vector-indexing.html)
@@ -20,7 +20,7 @@ vector space.
 
 ## Quick start
 
-The following is a quick start recipe on how to get started with this application. 
+The following is a quick start recipe on how to get started with this application.
 
 * [Docker](https://www.docker.com/) Desktop installed and running. 4 GB available memory for Docker is recommended.
   Refer to [Docker memory](https://docs.vespa.ai/en/operations-selfhosted/docker-containers.html#memory)
@@ -28,12 +28,14 @@ The following is a quick start recipe on how to get started with this applicatio
 * Alternatively, deploy using [Vespa Cloud](#deployment-note)
 * Operating system: Linux, macOS or Windows 10 Pro (Docker requirement)
 * Architecture: x86_64 or arm64
-* [Homebrew](https://brew.sh/) to install [Vespa CLI](https://docs.vespa.ai/en/vespa-cli.html), or download 
+* [Homebrew](https://brew.sh/) to install [Vespa CLI](https://docs.vespa.ai/en/vespa-cli.html), or download
   a vespa cli release from [GitHub releases](https://github.com/vespa-engine/vespa/releases).
 
 Validate Docker resource settings, should be minimum 4 GB:
 
 $ docker info | grep "Total Memory"
+or
+$ podman info | grep "memTotal"
 
Install [Vespa CLI](https://docs.vespa.ai/en/vespa-cli.html): @@ -78,7 +80,7 @@ It is possible to deploy this app to Index the Wikipedia articles. This embeds all the paragraphs using the native embedding model, which is computationally expensive for CPU. For production use cases, use [Vespa Cloud with GPU](https://cloud.vespa.ai/en/reference/services#gpu) -instances and [autoscaling](https://cloud.vespa.ai/en/autoscaling) enabled. +instances and [autoscaling](https://cloud.vespa.ai/en/autoscaling) enabled.
 $ zstdcat ext/articles.jsonl.zst | vespa feed -
@@ -86,7 +88,7 @@ $ zstdcat ext/articles.jsonl.zst | vespa feed -
 
 
 ## Query and ranking examples
-We demonstrate using `vespa cli`, use `-v` to see the curl equivalent using HTTP api.  
+We demonstrate using `vespa cli`, use `-v` to see the curl equivalent using HTTP api.
 
 ### Simple retrieve all articles with undefined ranking
 
@@ -101,8 +103,8 @@ $ vespa query 'yql=select * from wiki where userQuery()' \
   'ranking=bm25'
 
-Notice the `relevance`, which is assigned by the rank-profile expression. -Also, note that the matched keywords are highlighted in the `paragraphs` field. +Notice the `relevance`, which is assigned by the rank-profile expression. +Also, note that the matched keywords are highlighted in the `paragraphs` field. ### Semantic vector search on the paragraph level
@@ -121,7 +123,7 @@ This index corresponds to the following paragraph:
 "In railway timetables 24:00 means the \"end\" of the day. For example, a train due to arrive at a station during the last minute of a day arrives at 24:00; but trains which depart during the first minute of the day go at 00:00."
 ```
 The [tensor presentation format](search/query-profiles/default.xml) is overridden in
-this sample application to shorten down the output. 
+this sample application to shorten down the output.
 
 ### Hybrid search and ranking
 Hybrid combining keyword search on the article level with vector search in the paragraph index:
@@ -134,8 +136,8 @@ $ vespa query 'yql=select * from wiki where userQuery() or ({targetHits:1}neares
   'hits=1'
 
-This case combines keyword search with vector (nearestNeighbor) search. -The `hybrid` rank-profile also calculates several additional features using +This case combines keyword search with vector (nearestNeighbor) search. +The `hybrid` rank-profile also calculates several additional features using [tensor expressions](https://docs.vespa.ai/en/tensor-user-guide.html): - `firstPhase` is the score of the first ranking phase, configured in the hybrid @@ -145,14 +147,14 @@ profile as `cos(distance(field, paragraph_embeddings))`. See the `hybrid` rank-profile in the [schema](schemas/wiki.sd) for details. The [Vespa Tensor Playground](https://docs.vespa.ai/playground/) is useful to play -with tensor expressions. +with tensor expressions. -These additional features are calculated during [second-phase](https://docs.vespa.ai/en/phased-ranking.html) -ranking to limit the number of vector computations. +These additional features are calculated during [second-phase](https://docs.vespa.ai/en/phased-ranking.html) +ranking to limit the number of vector computations. ### Hybrid search and filter -Filtering is also supported, also disable bolding. +Filtering is also supported, also disable bolding.
 $ vespa query 'yql=select * from wiki where url contains "9985" and userQuery() or ({targetHits:1}nearestNeighbor(paragraph_embeddings,q))' \
diff --git a/multilingual-search/README.md b/multilingual-search/README.md
index 8ffbc2728..9cbdc351e 100644
--- a/multilingual-search/README.md
+++ b/multilingual-search/README.md
@@ -9,13 +9,13 @@
 # Multilingual Search with multilingual embeddings
 
 This sample application demonstrates multilingual search
-using multilingual embeddings. 
- 
-Read the [blog post](https://blog.vespa.ai/simplify-search-with-multilingual-embeddings/). 
+using multilingual embeddings.
+
+Read the [blog post](https://blog.vespa.ai/simplify-search-with-multilingual-embeddings/).
 
 ## Quick start
 
-The following is a quick start recipe for getting started with this application. 
+The following is a quick start recipe for getting started with this application.
 
 * [Docker](https://www.docker.com/) Desktop installed and running. 4 GB available memory for Docker is recommended.
   Refer to [Docker memory](https://docs.vespa.ai/en/operations-selfhosted/docker-containers.html#memory)
@@ -23,12 +23,14 @@ The following is a quick start recipe for getting started with this application.
 * Alternatively, deploy using [Vespa Cloud](#deployment-note)
 * Operating system: Linux, macOS or Windows 10 Pro (Docker requirement)
 * Architecture: x86_64 or arm64
-* [Homebrew](https://brew.sh/) to install [Vespa CLI](https://docs.vespa.ai/en/vespa-cli.html), or download 
+* [Homebrew](https://brew.sh/) to install [Vespa CLI](https://docs.vespa.ai/en/vespa-cli.html), or download
   a vespa cli release from [GitHub releases](https://github.com/vespa-engine/vespa/releases).
 
 Validate Docker resource settings, should be minimum 4 GB:
 
 $ docker info | grep "Total Memory"
+or
+$ podman info | grep "memTotal"
 
 Install [Vespa CLI](https://docs.vespa.ai/en/vespa-cli.html):
@@ -59,10 +61,10 @@ Download this sample application:
 $ vespa clone multilingual-search my-app && cd my-app
-This sample app embedder configuration in [services.xml](services.xml) points to a quantized model. 
+This sample app embedder configuration in [services.xml](services.xml) points to a quantized model.
 Alternatively, [export your own model](https://docs.vespa.ai/en/onnx.html#onnx-export), see also the
-export script in [simple-semantic-search](../simple-semantic-search/README.md). 
+export script in [simple-semantic-search](../simple-semantic-search/README.md).
 Deploy the application :
@@ -74,7 +76,7 @@ It is possible to deploy this app to
 [Vespa Cloud](https://cloud.vespa.ai/en/getting-started#deploy-sample-applications).
 
 ## Evaluation
-The following reproduces the results reported on the MIRACL Swahili(sw) dataset. 
+The following reproduces the results reported on the MIRACL Swahili(sw) dataset.
 
 Install `trec_eval`:
 
@@ -93,7 +95,7 @@ The evaluation script queries Vespa (requires pandas and requests libraries):
 
 $ pip3 install pandas requests
 
- 
+
 ## E5 multilingual embedding model
 Using the multilingual embedding model
@@ -110,7 +112,7 @@ $ trec_eval -mndcg_cut.10 ext/qrels.miracl-v1.0-sw-dev.tsv semantic.run
Which should produce the following:
-ndcg_cut_10           	all 	0.6848	
+ndcg_cut_10           	all 	0.6848
 
 ## BM25
diff --git a/text-image-search/README.md b/text-image-search/README.md
index daa804c98..ebc40021d 100644
--- a/text-image-search/README.md
+++ b/text-image-search/README.md
@@ -47,14 +47,14 @@ and the Python app, is that the transformation from text to a vector representation has been moved from Python and into Vespa. This includes both tokenization and transformer model evaluation.
-## Quick start 
+## Quick start
 Requirements:
 * [Docker](https://www.docker.com/) Desktop installed and running. 6GB available memory for Docker is recommended.
   Refer to [Docker memory](https://docs.vespa.ai/en/operations-selfhosted/docker-containers.html#memory)
   for details and troubleshooting
 * Alternatively, deploy using [Vespa Cloud](#deployment-note)
 * Operating system: Linux, macOS or Windows 10 Pro (Docker requirement)
-* Architecture: x86_64 or arm64 
+* Architecture: x86_64 or arm64
 * [Homebrew](https://brew.sh/) to install [Vespa CLI](https://docs.vespa.ai/en/vespa-cli.html), or download
   a vespa cli release from [GitHub releases](https://github.com/vespa-engine/vespa/releases).
 * Java 17 installed.
@@ -63,11 +63,13 @@ Requirements:
 * python3.8+ (tested with 3.8)
 The following instructions sets up the stand-alone Vespa application using the
-[Vespa CLI](https://docs.vespa.ai/en/vespa-cli.html). 
+[Vespa CLI](https://docs.vespa.ai/en/vespa-cli.html).
 Validate environment, should be minimum 6G:
 $ docker info | grep "Total Memory"
+or
+$ podman info | grep "memTotal"
 
 Install [Vespa CLI](https://docs.vespa.ai/en/vespa-cli.html):
@@ -82,7 +84,7 @@ $ vespa config set target local
 Checkout this sample app :
-$ vespa clone text-image-search myapp && cd myapp 
+$ vespa clone text-image-search myapp && cd myapp
 
 Set up transformer model:
@@ -121,7 +123,7 @@ It is possible to deploy this app to
 Running [Vespa System Tests](https://docs.vespa.ai/en/reference/testing.html) which runs a set of basic tests to verify that the application is working as expected.
-$ vespa test src/test/application/tests/system-test/image-search-system-test.json 
+$ vespa test src/test/application/tests/system-test/image-search-system-test.json
 
 Download and extract image data:
@@ -144,7 +146,7 @@ $ python3 src/python/clip_feed.py
 Alternatively, instead of computing the embeddings, use the pre-computed embeddings:
 $ curl -L -o flickr-8k-clip-embeddings.jsonl.zst \
-    https://data.vespa-cloud.com/sample-apps-data/flickr-8k-clip-embeddings.jsonl.zst 
+    https://data.vespa-cloud.com/sample-apps-data/flickr-8k-clip-embeddings.jsonl.zst
 
diff --git a/transformers/README.md b/transformers/README.md
index bbe6e0f31..cd54a0322 100644
--- a/transformers/README.md
+++ b/transformers/README.md
@@ -9,17 +9,17 @@
 # Vespa sample application - Transformers
 
 This sample application is a small example of using Transformer-based cross-encoders for ranking
-using a small sample from the MS MARCO data set. 
+using a small sample from the MS MARCO data set.
 
 See also the more comprehensive [MS Marco Ranking sample app](../msmarco-ranking/)
-which uses multiple Transformer based models for retrieval and ranking. 
+which uses multiple Transformer based models for retrieval and ranking.
 
 This application uses [phased ranking](https://docs.vespa.ai/en/phased-ranking.html), first a set of candidate
-documents are retrieved using [WAND](https://docs.vespa.ai/en/using-wand-with-vespa.html). 
+documents are retrieved using [WAND](https://docs.vespa.ai/en/using-wand-with-vespa.html).
 
-The hits retrieved by the WAND operator are ranked using [BM25](https://docs.vespa.ai/en/reference/bm25.html). 
+The hits retrieved by the WAND operator are ranked using [BM25](https://docs.vespa.ai/en/reference/bm25.html).
 The top-k ranking documents from the first phase
-are re-ranked using a cross-encoder Transformer model. 
+are re-ranked using a cross-encoder Transformer model.
 The cross-encoder re-ranking uses [global phase](https://docs.vespa.ai/en/phased-ranking.html#global-phase), evaluated in the
 Vespa stateless container.
 
@@ -30,14 +30,16 @@ Vespa stateless container.
   for details and troubleshooting
 * Alternatively, deploy using [Vespa Cloud](#deployment-note)
 * Operating system: Linux, macOS or Windows 10 Pro (Docker requirement)
-* Architecture: x86_64 or arm64 
+* Architecture: x86_64 or arm64
 * [Homebrew](https://brew.sh/) to install [Vespa CLI](https://docs.vespa.ai/en/vespa-cli.html), or download
   a vespa cli release from [GitHub releases](https://github.com/vespa-engine/vespa/releases).
-* python3.8+ to export models from Huggingface. 
+* python3.8+ to export models from Huggingface.
 
 Validate environment, should be minimum 6G:
 
 $ docker info | grep "Total Memory"
+or
+$ podman info | grep "memTotal"
 
 Install [Vespa CLI](https://docs.vespa.ai/en/vespa-cli.html):
@@ -69,7 +71,7 @@ $ python3 -m pip install --upgrade pip
 $ python3 -m pip install torch transformers onnx onnxruntime
-For this sample application, we use a [fine-tuned MiniLM](https://huggingface.co/cross-encoder/ms-marco-MiniLM-L-6-v2) 
+For this sample application, we use a [fine-tuned MiniLM](https://huggingface.co/cross-encoder/ms-marco-MiniLM-L-6-v2)
 model with 6 layers and 22 million parameters. This step downloads the cross-encoder transformer model, converts it to an ONNX model, and saves it in the `files` directory:
@@ -96,7 +98,7 @@ Wait for the application endpoint to become available:
 $ vespa status --wait 300
-Convert from MS MARCO format to Vespa JSON feed format. 
+Convert from MS MARCO format to Vespa JSON feed format.
 To use the entire MS MARCO data set, use the download script. This step creates a `vespa.json` file in the `msmarco` directory:
@@ -130,7 +132,7 @@ $ docker rm -f vespa
 
-## Bonus 
+## Bonus
 To export other cross-encoder models, change the code in "src/python/setup-model.py". However, this sample application uses a Vespa [WordPiece embedder](https://docs.vespa.ai/en/reference/embedding-reference.html#wordpiece-embedder),
diff --git a/use-case-shopping/README.md b/use-case-shopping/README.md
index 1c1d2dd34..9b52c5ad9 100644
--- a/use-case-shopping/README.md
+++ b/use-case-shopping/README.md
@@ -24,7 +24,7 @@ Requirements:
   for details and troubleshooting
 * Alternatively, deploy using [Vespa Cloud](#deployment-note)
 * Operating system: Linux, macOS or Windows 10 Pro (Docker requirement)
-* Architecture: x86_64 or arm64 
+* Architecture: x86_64 or arm64
 * [Homebrew](https://brew.sh/) to install [Vespa CLI](https://docs.vespa.ai/en/vespa-cli.html), or download
   a vespa cli release from [GitHub releases](https://github.com/vespa-engine/vespa/releases).
 * Java 17 installed.
@@ -38,6 +38,8 @@ See also [Vespa quick start guide](https://docs.vespa.ai/en/vespa-quick-start.ht
 Validate environment, should be minimum 4 GB:
 $ docker info | grep "Total Memory"
+or
+$ podman info | grep "memTotal"
 
 Install [Vespa CLI](https://docs.vespa.ai/en/vespa-cli.html):
@@ -90,7 +92,7 @@ $ vespa test src/test/application/tests/system-test/product-search-test.json
 First, create data feed for products:
-$ curl -L -o meta_sports_20k_sample.json.zst https://data.vespa-cloud.com/sample-apps-data/meta_sports_20k_sample.json.zst 
+$ curl -L -o meta_sports_20k_sample.json.zst https://data.vespa-cloud.com/sample-apps-data/meta_sports_20k_sample.json.zst
 $ zstdcat meta_sports_20k_sample.json.zst | ./convert_meta.py > feed_items.json
 
@@ -103,7 +105,7 @@ $ zstdcat reviews_sports_24k_sample.json.zst | ./convert_reviews.py > feed_revie
 Next, data feed for query suggestions:
 $ pip3 install spacy mmh3
-$ python3 -m spacy download en_core_web_sm 
+$ python3 -m spacy download en_core_web_sm
 $ ./create_suggestions.py feed_items.json > feed_suggestions.json
 
diff --git a/vector-streaming-search/README.md b/vector-streaming-search/README.md
index 5861e4080..b14f2ae8f 100644
--- a/vector-streaming-search/README.md
+++ b/vector-streaming-search/README.md
@@ -1,4 +1,3 @@
-
@@ -19,7 +18,7 @@ The subject and content of a mail are combined and embedded into a 384-dimension
 using a [Bert embedder](https://docs.vespa.ai/en/reference/embedding-reference.html#bert-embedder).
 ## Quick start
-The following is a quick recipe for getting started with this application. 
+The following is a quick recipe for getting started with this application.
 * [Docker](https://www.docker.com/) Desktop installed and running. 4 GB available memory for Docker is recommended.
   Refer to [Docker memory](https://docs.vespa.ai/en/operations-selfhosted/docker-containers.html#memory)
@@ -27,13 +26,15 @@ The following is a quick recipe for getting started with this application.
 * Alternatively, deploy using [Vespa Cloud](#deployment-note)
 * Operating system: Linux, macOS or Windows 10 Pro (Docker requirement)
 * Architecture: x86_64 or arm64
-* [Homebrew](https://brew.sh/) to install [Vespa CLI](https://docs.vespa.ai/en/vespa-cli.html), or download 
+* [Homebrew](https://brew.sh/) to install [Vespa CLI](https://docs.vespa.ai/en/vespa-cli.html), or download
   a vespa cli release from [GitHub releases](https://github.com/vespa-engine/vespa/releases).
 Validate Docker resource settings, should be minimum 4 GB:
 $ docker info | grep "Total Memory"
+or
+$ podman info | grep "memTotal"
 
 Install [Vespa CLI](https://docs.vespa.ai/en/vespa-cli.html):
@@ -69,7 +70,7 @@ Download this sample application:
 $ vespa clone vector-streaming-search my-app && cd my-app
-Deploy the application : 
+Deploy the application :
 $ vespa deploy --wait 300
@@ -85,7 +86,7 @@ It is possible to deploy this app to
 During feeding the `subject` and `content` of a mail document are embedded using the Bert embedding model.
 This is computationally expensive for CPU.
 For production use cases, use [Vespa Cloud with GPU](https://cloud.vespa.ai/en/reference/services#gpu)
-instances and [autoscaling](https://cloud.vespa.ai/en/autoscaling) enabled. 
+instances and [autoscaling](https://cloud.vespa.ai/en/autoscaling) enabled.
 
 
 $ vespa feed ext/docs.json
@@ -93,7 +94,7 @@ $ vespa feed ext/docs.json
 
 ## Query and ranking examples
 The following uses [Vespa CLI](https://docs.vespa.ai/en/vespa-cli.html) to execute queries.
-Use `-v` to see the curl equivalent using HTTP API.  
+Use `-v` to see the curl equivalent using HTTP API.
 
 ### Exact nearest neighbor search
 
@@ -128,4 +129,3 @@ Tear down the running container:
 
 $ docker rm -f vespa
 
-
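
The memory check this patch adds to each README (`docker info | grep "Total Memory"` or `podman info | grep "memTotal"`) could be wrapped in a small helper. The following is a minimal sketch and not part of the patch: the function names `engine_memory_line` and `bytes_to_gib` are hypothetical, and it assumes that `podman info` reports `memTotal` as a byte count while `docker info` prints a human-readable `Total Memory` value.

```shell
#!/bin/sh
# Hypothetical helpers, not part of the sample apps or of this patch.

# Print the memory line from the first container engine that reports one.
# Both patterns are tried because a `docker` command may be a podman alias.
engine_memory_line() {
  for engine in docker podman; do
    if command -v "$engine" >/dev/null 2>&1; then
      "$engine" info 2>/dev/null | grep -E "Total Memory|memTotal" && return 0
    fi
  done
  echo "no memory information from docker or podman" >&2
  return 1
}

# Convert a byte count (the assumed unit of podman's memTotal) to GiB,
# rounded to two decimals, to compare against the 4 GB / 6 GB minimums.
bytes_to_gib() {
  awk -v b="$1" 'BEGIN { printf "%.2f\n", b / (1024 * 1024 * 1024) }'
}
```

For example, `bytes_to_gib 4294967296` prints `4.00`, which would meet the 4 GB minimum stated in the READMEs.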