Merge pull request #1553 from vespa-engine/aressem/add-podman-cmd
Add alternative podman command for memory info
kkraune authored Oct 30, 2024
2 parents 9eb85b4 + 4b0a224 commit 79c7f4c
Showing 21 changed files with 293 additions and 262 deletions.
149 changes: 75 additions & 74 deletions billion-scale-image-search/README.md

Large diffs are not rendered by default.

35 changes: 18 additions & 17 deletions billion-scale-vector-search/README.md
@@ -1,4 +1,3 @@

<!-- Copyright Vespa.ai. Licensed under the terms of the Apache 2.0 license. See LICENSE in the project root.-->

<picture>
@@ -7,21 +6,21 @@
<img alt="#Vespa" width="200" src="https://assets.vespa.ai/logos/Vespa-logo-dark-RGB.svg" style="margin-bottom: 25px;">
</picture>

# SPANN Billion Scale Vector Search

This sample application demonstrates how to represent *SPANN* (Space Partitioned ANN) using Vespa.ai.

The *SPANN* approach for approximate nearest neighbor search is described in
[SPANN: Highly-efficient Billion-scale Approximate Nearest Neighbor Search](https://arxiv.org/abs/2111.08566).

SPANN uses a hybrid combination of graph and inverted index methods for approximate nearest neighbor search.

This sample app demonstrates how the `SPANN` algorithm can be represented using Vespa.
See the [Billion-scale vector search using hybrid HNSW-IF](https://blog.vespa.ai/vespa-hybrid-billion-scale-vector-search/) blog post for details.
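
As a purely illustrative sketch (the actual retrieval in this app is implemented by the custom Java searchers described in the blog post, and the schema, field and tensor names below are assumptions), the building block is a `nearestNeighbor` query, which the hybrid HNSW-IF approach extends with disk-based posting lists:
<pre>
$ vespa query \
    'yql=select * from vector where {targetHits:10}nearestNeighbor(vector, q)' \
    'input.query(q)=[1, 2, 3]' \
    'hits=10'
</pre>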

The steps below demonstrate the functionality using a smaller subset of the 1B vector dataset, suitable
for reproducing on a laptop.

**Requirements:**

@@ -30,17 +29,19 @@ for reproducing on a laptop.
for details and troubleshooting
* Alternatively, deploy using [Vespa Cloud](#deployment-note)
* Operating system: Linux, macOS or Windows 10 Pro (Docker requirement)
* Architecture: x86_64 or arm64
* [Homebrew](https://brew.sh/) to install [Vespa CLI](https://docs.vespa.ai/en/vespa-cli.html), or download
a Vespa CLI release from [GitHub releases](https://github.com/vespa-engine/vespa/releases).
* <a href="https://openjdk.org/projects/jdk/17/" data-proofer-ignore>Java 17</a> installed.
* Python3 and numpy to process the vector dataset
* [Apache Maven](https://maven.apache.org/install.html) - this sample app uses custom Java components and Maven is used
to build the application.

Verify Docker Memory Limits:
<pre>
$ docker info | grep "Total Memory"
or
$ podman info | grep "memTotal"
</pre>
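
If the reported memory is too low, it can usually be raised. With Docker Desktop this is done under Settings -> Resources; with Podman, a hedged example of resizing the VM (size in MB, machine must be stopped first) is:
<pre>
$ podman machine stop
$ podman machine set --memory 8192
$ podman machine start
</pre>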

Install [Vespa CLI](https://docs.vespa.ai/en/vespa-cli.html):
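With Homebrew this is typically the following (see the linked Vespa CLI documentation for other install options):
<pre>
$ brew install vespa-cli
</pre>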
@@ -78,8 +79,8 @@ $ vespa clone billion-scale-vector-search myapp && cd myapp


## Download Vector Data
This sample app uses the Microsoft SPACEV vector dataset from
https://big-ann-benchmarks.com/.

It uses the first 10M vectors of the 100M slice sample.
This sample file is about 1GB (10M vectors):
@@ -88,7 +89,7 @@ $ curl -L -o spacev10m_base.i8bin \
https://data.vespa-cloud.com/sample-apps-data/spacev10m_base.i8bin
</pre>

Generate the feed file for the first 10M vectors from the 100M sample.
This step creates two feed files:

* `graph-vectors.jsonl`
@@ -103,7 +104,7 @@ $ python3 src/main/python/create-vespa-feed.py spacev10m_base.i8bin
</pre>


## Build and deploy Vespa app
Build the sample app:
<pre data-test="exec" data-test-expect="BUILD SUCCESS" data-test-timeout="300">
$ mvn clean package -U
@@ -145,7 +146,7 @@ $ curl -L -o spacev10m_gt100.i8bin \
</pre>

Note: initially, the routine above used the query file from https://comp21storage.blob.core.windows.net/publiccontainer/comp21/spacev1b/query.i8bin,
but the link no longer works.

Run first 1K queries and evaluate recall@10. A higher number of clusters gives higher recall:
<pre data-test="exec">
42 changes: 22 additions & 20 deletions commerce-product-ranking/README.md
@@ -6,24 +6,24 @@
<img alt="#Vespa" width="200" src="https://assets.vespa.ai/logos/Vespa-logo-dark-RGB.svg" style="margin-bottom: 25px;">
</picture>

# Vespa Product Ranking

This sample application demonstrates how to improve Product Search with Learning to Rank (LTR).

Blog post series:

* [Improving Product Search with Learning to Rank - part one](https://blog.vespa.ai/improving-product-search-with-ltr/)
This post introduces the dataset used in this sample application and several baseline ranking models.
* [Improving Product Search with Learning to Rank - part two](https://blog.vespa.ai/improving-product-search-with-ltr-part-two/)
This post demonstrates how to train neural methods for search ranking. The neural training routine is found in this
[notebook](https://github.com/vespa-engine/sample-apps/blob/master/commerce-product-ranking/notebooks/train_neural.ipynb)
[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/vespa-engine/sample-apps/blob/master/commerce-product-ranking/notebooks/train_neural.ipynb).
* [Improving Product Search with Learning to Rank - part three](https://blog.vespa.ai/improving-product-search-with-ltr-part-three/)
This post demonstrates how to train GBDT methods for search ranking. The model also uses neural signals as features. See notebooks:
[XGBoost](https://github.com/vespa-engine/sample-apps/blob/master/commerce-product-ranking/notebooks/Train-xgboost.ipynb)
[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/vespa-engine/sample-apps/blob/master/commerce-product-ranking/notebooks/Train-xgboost.ipynb) and
[LightGBM](https://github.com/vespa-engine/sample-apps/blob/master/commerce-product-ranking/notebooks/Train-lightgbm.ipynb)
[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/vespa-engine/sample-apps/blob/master/commerce-product-ranking/notebooks/Train-lightgbm.ipynb)

This work uses the largest product relevance dataset released by Amazon:

@@ -35,27 +35,29 @@ This work uses the largest product relevance dataset released by Amazon:
> Each query-product pair is accompanied by additional information.
> The dataset is multilingual, as it contains queries in English, Japanese, and Spanish.
The dataset is found at [amazon-science/esci-data](https://github.com/amazon-science/esci-data).
The dataset is released under the [Apache 2.0 license](https://github.com/amazon-science/esci-data/blob/main/LICENSE).

## Quick start

The following is a quick start recipe for getting started with this application.

* [Docker](https://www.docker.com/) Desktop installed and running. 6 GB available memory for Docker is recommended.
Refer to [Docker memory](https://docs.vespa.ai/en/operations-selfhosted/docker-containers.html#memory)
for details and troubleshooting
* Alternatively, deploy using [Vespa Cloud](#deployment-note)
* Operating system: Linux, macOS or Windows 10 Pro (Docker requirement)
* Architecture: x86_64 or arm64
* [Homebrew](https://brew.sh/) to install [Vespa CLI](https://docs.vespa.ai/en/vespa-cli.html), or download
a Vespa CLI release from [GitHub releases](https://github.com/vespa-engine/vespa/releases).
* zstd: `brew install zstd`
* Python3 with `requests`, `pyarrow` and `pandas` installed

Validate Docker resource settings; the minimum is 6 GB:
<pre>
$ docker info | grep "Total Memory"
or
$ podman info | grep "memTotal"
</pre>

Install [Vespa CLI](https://docs.vespa.ai/en/vespa-cli.html):
@@ -94,7 +96,7 @@ $ curl -L -o application/models/title_ranker.onnx \

See [scripts/export-bi-encoder.py](scripts/export-bi-encoder.py) and
[scripts/export-cross-encoder.py](scripts/export-cross-encoder.py) for how
to export models from PyTorch to ONNX format.

Deploy the application:
<pre data-test="exec" data-test-assert-contains="Success">
@@ -113,7 +115,7 @@ It is possible to deploy this app to

## Run basic system test

This step is optional, but it indexes two
documents and runs a query [test](https://docs.vespa.ai/en/reference/testing.html).

<pre data-test="exec" data-test-assert-contains="Success">
Expand All @@ -130,9 +132,9 @@ $ zstdcat sample-data/sample-products.jsonl.zstd | vespa feed -
</pre>


## Evaluation

Evaluate the `semantic-title` rank profile using the evaluation
script ([scripts/evaluate.py](scripts/evaluate.py)).

Install requirements:
@@ -145,15 +147,15 @@ pip3 install numpy pandas pyarrow requests
$ python3 scripts/evaluate.py \
--endpoint http://localhost:8080/search/ \
--example_file sample-data/test-sample.parquet \
  --ranking semantic-title
</pre>

[evaluate.py](scripts/evaluate.py) runs all the queries in the test split using the `--ranking` `<rank-profile>`
and produces a `<ranking>.run` file with the top ranked results.
This file is in the format that `trec_eval` expects.
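
For reference, a TREC run file has one whitespace-separated line per retrieved document: query id, the literal `Q0`, document id, rank, score and a run tag. The values below are illustrative only:
<pre>
535 Q0 B08PB9TTKT 1 142.37 semantic-title
535 Q0 B07XJ8C8F5 2 139.02 semantic-title
</pre>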

<pre data-test="exec" data-test-assert-contains="B08PB9TTKT">
$ cat semantic-title.run
</pre>

Example ranking produced by Vespa using the `semantic-title` rank-profile for query 535:
@@ -196,10 +198,10 @@ Run evaluation:
$ trec_eval test.qrels semantic-title.run -m 'ndcg.1=0,2=0.01,3=0.1,4=1'
</pre>

This particular product ranking for the query produces an NDCG score of 0.7046.
Note that the `sample-data/test-sample.parquet` file only contains one query.
To get the overall score, one must compute all the NDCG scores of all queries in the
test split and report the *average* NDCG score.
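
A sketch for getting both per-query and averaged numbers in one pass (assuming the qrels and run files cover all queries) is `trec_eval`'s `-q` flag, which prints each query's NDCG followed by the overall average:
<pre>
$ trec_eval -q test.qrels semantic-title.run -m 'ndcg.1=0,2=0.01,3=0.1,4=1'
</pre>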

Note that the evaluation uses custom NDCG label gains:

@@ -231,7 +233,7 @@ $ docker rm -f vespa
</pre>


## Full evaluation

Download a pre-processed feed file with all (1,215,854) products:

@@ -240,21 +242,21 @@ $ curl -L -o product-search-products.jsonl.zstd \
https://data.vespa-cloud.com/sample-apps-data/product-search-products.jsonl.zstd
</pre>

This step is resource intensive as the semantic embedding model encodes
the product title and description into the dense embedding vector space.

<pre>
$ zstdcat product-search-products.jsonl.zstd | vespa feed -
</pre>

Evaluate the `hybrid` baseline rank profile using the evaluation
script ([scripts/evaluate.py](scripts/evaluate.py)).

<pre>
$ python3 scripts/evaluate.py \
--endpoint http://localhost:8080/search/ \
--example_file "https://github.com/amazon-science/esci-data/blob/main/shopping_queries_dataset/shopping_queries_dataset_examples.parquet?raw=true" \
  --ranking semantic-title
</pre>

For Vespa Cloud deployments we need to pass the certificate and the private key.
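A hedged illustration (the file names assume the data-plane key pair generated by `vespa auth cert`, and the evaluation script may expect different option names) is to pass the pair to curl when talking to the cloud endpoint:
<pre>
$ ENDPOINT=https://your-cloud-endpoint
$ curl --cert data-plane-public-cert.pem --key data-plane-private-key.pem \
    "$ENDPOINT/search/?query=test"
</pre>
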
31 changes: 16 additions & 15 deletions custom-embeddings/README.md
@@ -1,4 +1,3 @@

<!-- Copyright Vespa.ai. Licensed under the terms of the Apache 2.0 license. See LICENSE in the project root.-->

<picture>
@@ -9,9 +8,9 @@

# Customizing Frozen Data Embeddings in Vespa

This sample application demonstrates how to adapt frozen embeddings from foundational
embedding models.
Frozen data embeddings from foundational models are an emerging industry practice for reducing the complexity of maintaining and versioning embeddings. The frozen data embeddings are re-used for various tasks, such as classification, search, or recommendations.

Read the [blog post](https://blog.vespa.ai/).

@@ -25,12 +24,12 @@ The following is a quick start recipe on how to get started with this application
* Alternatively, deploy using [Vespa Cloud](#deployment-note)
* Operating system: Linux, macOS or Windows 10 Pro (Docker requirement)
* Architecture: x86_64 or arm64
* [Homebrew](https://brew.sh/) to install [Vespa CLI](https://docs.vespa.ai/en/vespa-cli.html), or download
a Vespa CLI release from [GitHub releases](https://github.com/vespa-engine/vespa/releases).

Validate Docker resource settings; the minimum is 4 GB:
<pre>
$ docker info | grep "Total Memory"
or
$ podman info | grep "memTotal"
</pre>

Install [Vespa CLI](https://docs.vespa.ai/en/vespa-cli.html):
@@ -61,7 +62,7 @@ Download this sample application:
$ vespa clone custom-embeddings my-app && cd my-app
</pre>

Download a frozen embedding model file; see
[text embeddings made easy](https://blog.vespa.ai/text-embedding-made-simple/) for details:
<pre data-test="exec">
$ mkdir -p models
Expand All @@ -71,7 +72,7 @@ $ curl -L -o models/tokenizer.json \
$ curl -L -o models/frozen.onnx \
https://github.com/vespa-engine/sample-apps/raw/master/simple-semantic-search/model/e5-small-v2-int8.onnx

$ cp models/frozen.onnx models/tuned.onnx
</pre>

In this case, we re-use the frozen model as the tuned model to demonstrate functionality.
@@ -95,36 +96,36 @@ vespa document ext/3.json

## Query and ranking examples

We demonstrate using the Vespa CLI; use `-v` to see the curl equivalent using the HTTP API.

### Retrieve all documents with the `unranked` rank profile
<pre data-test="exec" data-test-assert-contains='"totalCount": 3'>
vespa query 'yql=select * from doc where true' \
'ranking=unranked'
</pre>
Notice the `relevance`, which is assigned by the rank-profile.

### Using the frozen query tower
<pre data-test="exec" data-test-assert-contains='"totalCount": 3'>
vespa query 'yql=select * from doc where {targetHits:10}nearestNeighbor(embedding, q)' \
'input.query(q)=embed(frozen, "space contains many suns")'
</pre>

### Using the tuned query tower
<pre data-test="exec" data-test-assert-contains='"totalCount": 3'>
vespa query 'yql=select * from doc where {targetHits:10}nearestNeighbor(embedding, q)' \
'input.query(q)=embed(tuned, "space contains many suns")'
</pre>
In this case, the tuned model is equivalent to the frozen query tower that was used for document embeddings.

### Using the simple weight transformation query tower
<pre data-test="exec" data-test-assert-contains='"totalCount": 3'>
vespa query 'yql=select * from doc where {targetHits:10}nearestNeighbor(embedding, q)' \
'input.query(q)=embed(tuned, "space contains many suns")' \
'ranking=simple-similarity'
</pre>
This invokes the `simple-similarity` ranking model, which performs the query transformation
to the tuned embedding.

### Using the Deep Neural Network similarity
<pre data-test="exec" data-test-assert-contains='"totalCount": 3'>
Expand All @@ -134,12 +135,12 @@ vespa query 'yql=select * from doc where {targetHits:10}nearestNeighbor(embeddin
</pre>

Note that this just demonstrates the functionality; the custom similarity model is
initialized from random weights.

### Dump all embeddings
This is useful for training routines, getting the frozen document embeddings out of Vespa:
<pre>
vespa visit --field-set "[all]" > ../vector-data.jsonl
</pre>
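
To turn the dump into a plain (id, embedding) file for a training routine, a small post-processing step can be added. This is a sketch that assumes `jq` is installed and that the embedding field is named `embedding`; the exact tensor JSON layout may differ:
<pre>
jq -c '{id: .id, embedding: .fields.embedding}' ../vector-data.jsonl > ../embeddings.jsonl
</pre>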

### Get a specific document and its embedding(s):
3 changes: 2 additions & 1 deletion examples/document-processing/README.md
@@ -1,4 +1,3 @@

<!-- Copyright Vespa.ai. Licensed under the terms of the Apache 2.0 license. See LICENSE in the project root. -->

<picture>
@@ -38,6 +37,8 @@ Refer to [Docker memory](https://docs.vespa.ai/en/operations-selfhosted/docker-c
for details and troubleshooting:
<pre>
$ docker info | grep "Total Memory"
or
$ podman info | grep "memTotal"
</pre>


3 changes: 2 additions & 1 deletion examples/generic-request-processing/README.md
@@ -1,4 +1,3 @@

<!-- Copyright Vespa.ai. Licensed under the terms of the Apache 2.0 license. See LICENSE in the project root. -->

<picture>
@@ -24,6 +23,8 @@ Refer to [Docker memory](https://docs.vespa.ai/en/operations-selfhosted/docker-c
for details and troubleshooting:
<pre>
$ docker info | grep "Total Memory"
or
$ podman info | grep "memTotal"
</pre>

**Check-out, compile and run:**
