OpenSearch Learning-to-Rank POC

This repo is intended for educational purposes. It contains the steps needed to get the OpenSearch Learning to Rank (LTR) plugin working locally with a Dockerized version of OpenSearch.

The best way to understand LTR is to read the official plugin documentation.

Requirements

- Docker
- opensearch-plugin CLI tool
  - brew install opensearch

1. Set up OpenSearch in Docker

NOTE: To use the OpenSearch image with a custom plugin, you must first create a Dockerfile. See:

- Working with plugins (OpenSearch docs)
- Installing (LTR official docs)

Versions used:

- OpenSearch version: 2.5
- LTR plugin v2.1.0 (compatible with OpenSearch 2.5). See the plugin release history on GitHub.

Run:

docker compose up
Then, in a new terminal, run curl http://localhost:9200 to check that the cluster is up.
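If you script this step, the readiness check can be automated rather than run by hand. The following is a minimal Python sketch, assuming the cluster answers plain HTTP on http://localhost:9200 as in the curl command above. The helpers fetch_root and wait_for_cluster are hypothetical names for illustration and are not part of this repo.

```python
import json
import time
import urllib.request


def fetch_root(url="http://localhost:9200"):
    """Return the parsed JSON body of the cluster's root endpoint."""
    with urllib.request.urlopen(url, timeout=5) as resp:
        return json.load(resp)


def wait_for_cluster(fetch=fetch_root, attempts=30, delay=2.0):
    """Call fetch() until it succeeds, sleeping between failed attempts."""
    for _ in range(attempts):
        try:
            return fetch()
        except OSError:
            # URLError and connection errors are OSError subclasses;
            # the node is probably still starting up, so wait and retry.
            time.sleep(delay)
    raise TimeoutError(f"cluster did not respond after {attempts} attempts")
```

Calling wait_for_cluster() with no arguments polls the local cluster and returns the root response once the node answers. The fetch parameter exists only so the retry loop can be exercised without a running container.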

2. Create index mapping and bulk index data

./bin/index.sh

Search for a movie to confirm it worked:

curl -X GET "http://localhost:9200/tmdb/_search?pretty=true" -H 'Content-Type: application/json' -d'
{
    "query": {
        "match": {
            "title": "First"
        }
    }
}'

3. Set up an LTR feature store

Initialize the default feature store (the index that holds feature sets and models):

curl -X PUT "http://localhost:9200/_ltr?pretty=true" -H 'Content-Type: application/json'

Create a feature set called moviefeatureset:

curl -X POST "http://localhost:9200/_ltr/_featureset/moviefeatureset?pretty=true" -H 'Content-Type: application/json' -d'
{
   "featureset": {
        "features": [
            {
                "name": "title_query",
                "params": [
                    "query_text"
                ],
                "template_language": "mustache",
                "template": {
                    "match": {
                        "title": "{{query_text}}"
                    }
                }
            },
            {
                "name": "description_query",
                "params": [
                    "query_text"
                ],
                "template_language": "mustache",
                "template": {
                    "match": {
                        "description": "{{query_text}}"
                    }
                }
            }
        ]
   }
}'

Run curl http://localhost:9200/_ltr/_featureset?pretty=true to list the registered feature sets and their features.
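Both features in the request above follow the same pattern: a mustache-templated match query on one field. As the number of features grows, it can be easier to generate the body than to write it by hand. The following is a rough Python sketch of that idea. The helpers match_feature and build_featureset are hypothetical names, and the "<field>_query" naming scheme is just the convention used in this README.

```python
import json


def match_feature(name, field, param="query_text"):
    """Build one feature: a mustache-templated match query on `field`."""
    return {
        "name": name,
        "params": [param],
        "template_language": "mustache",
        "template": {"match": {field: "{{" + param + "}}"}},
    }


def build_featureset(fields):
    """Build the JSON body for POST _ltr/_featureset/<name>."""
    features = [match_feature(f"{field}_query", field) for field in fields]
    return {"featureset": {"features": features}}


# Reproduces the moviefeatureset body used in the curl command above.
body = build_featureset(["title", "description"])
print(json.dumps(body, indent=4))
```

The printed JSON can be passed to curl with -d, or sent with any HTTP client.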

4. Run a query to get logged features

First, run a simple text query:

curl -X GET "http://localhost:9200/tmdb/_search?pretty=true" -H 'Content-Type: application/json' -d'
{
    "query": {
        "match": {
            "title": "First"
        }
    }
}'

Now run a query with an sltr filter and the ltr_log extension to get logged feature values back:

curl -X GET "http://localhost:9200/tmdb/_search?pretty=true" -H 'Content-Type: application/json' -d'
{
    "query": {
        "bool": {
            "filter": [
                {
                    "sltr": {
                        "featureset": "moviefeatureset",
                        "_name": "logged_featureset",
                        "active_features": [
                            "title_query",
                            "description_query"
                        ],
                        "params": {
                            "query_text": "First"
                        }
                    }
                }
            ]
        }
    },
    "ext": {
        "ltr_log": {
            "log_specs": {
                "name": "log_entry1",
                "named_query": "logged_featureset"
            }
        }
    }
}'
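The detail that is easiest to get wrong in the request above is that "named_query" in the log spec must match the "_name" given to the sltr query. Here is a small Python sketch that builds the same body from one shared name, so the two cannot drift apart. build_logging_query is a hypothetical helper, and its defaults simply mirror the values used in this README.

```python
def build_logging_query(featureset, query_text, active_features=None,
                        query_name="logged_featureset",
                        log_name="log_entry1"):
    """Build a search body that logs feature values via an sltr filter."""
    sltr = {
        "featureset": featureset,
        "_name": query_name,
        "params": {"query_text": query_text},
    }
    if active_features is not None:
        # Restrict logging to a subset of the feature set's features.
        sltr["active_features"] = list(active_features)
    return {
        "query": {"bool": {"filter": [{"sltr": sltr}]}},
        "ext": {
            "ltr_log": {
                # named_query must match the sltr query's _name.
                "log_specs": {"name": log_name, "named_query": query_name}
            }
        },
    }


query = build_logging_query(
    "moviefeatureset", "First",
    active_features=["title_query", "description_query"],
)
```

Serializing query with json.dumps gives a body equivalent to the one passed to curl above.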

Result:

{
  "took": 2,
  "timed_out": false,
  "_shards": {
    "total": 1,
    "successful": 1,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": {
      "value": 1,
      "relation": "eq"
    },
    "max_score": 0.0,
    "hits": [
      {
        "_index": "movies",
        "_id": "rfzam40Bbf4EFOUV1cUr",
        "_score": 0.0,
        "_source": {
          "title": "First Blood",
          "year_released": 1982
        },
        "fields": {
          "_ltrlog": [
            {
              "log_entry1": [
                {
                  "name": "title_query",
                  "value": 0.2876821
                },
                {
                  "name": "description_query"
                }
              ]
            }
          ]
        },
        "matched_queries": [
          "logged_featureset"
        ]
      }
    ]
  }
}

Things to note:

- Logged feature values are returned under "_ltrlog" in each hit's "fields".
- A feature value is logged for title_query.
- An entry is logged for description_query, but with no value, because the document was originally indexed without a description field.
- Nothing at all is logged for year_released, because it was never registered as a feature in the feature set.
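To use these logs as training data, each hit has to be turned into a feature vector, and the missing value for description_query has to be handled explicitly. The following Python sketch shows one way to do that, assuming the response shape shown in the result above. features_from_hit is a hypothetical helper, and the choice of fill value for missing features is an assumption left to the caller.

```python
def features_from_hit(hit, log_name="log_entry1", default=None):
    """Map feature name -> logged value for one search hit.

    Features logged without a "value" key get `default` instead.
    """
    for entry in hit.get("fields", {}).get("_ltrlog", []):
        if log_name in entry:
            return {f["name"]: f.get("value", default)
                    for f in entry[log_name]}
    return {}


# Trimmed copy of the hit from the result above.
hit = {
    "_source": {"title": "First Blood", "year_released": 1982},
    "fields": {
        "_ltrlog": [
            {
                "log_entry1": [
                    {"name": "title_query", "value": 0.2876821},
                    {"name": "description_query"},
                ]
            }
        ]
    },
}

print(features_from_hit(hit))
# {'title_query': 0.2876821, 'description_query': None}
```

Passing default=0.0 instead treats a missing feature as a zero score, which is a common convention when building training files, though whether it suits a given model is a judgment call.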

adamjq/opensearch-learning-to-rank-example
