This repo is intended as for educational purposes. It contains the steps needed to get a working implementation of the Opensearch Learning to Rank Plugin working locally with a dockerized version of Opensearch.
The best way to understand LTR is to read the official docs here.
- Docker
- opensearch-plugin CLI tool
brew install opensearch
NOTE: To use the OpenSearch image with a custom plugin, you must first create a Dockerfile. See
-
Opensearch version:
2.5
-
LTR Plugin
v2.1.0
(compatible with OS 2.5). See plugin release history on GitHub.
Run:
docker compose up
curl http://localhost:9200
in a new terminal to check the cluster
./bin/index.sh
Search for a movie to confirm it worked:
curl -X GET "http://localhost:9200/tmdb/_search?pretty=true" -H 'Content-Type: application/json' -d'
{
"query": {
"match": {
"title": "First"
}
}
}'
Set up default featureset index:
curl -X PUT "http://localhost:9200/_ltr?pretty=true" -H 'Content-Type: application/json'
Create a feature set called moviefeatureset
:
curl -X POST "http://localhost:9200/_ltr/_featureset/moviefeatureset?pretty=true" -H 'Content-Type: application/json' -d'
{
"featureset": {
"features": [
{
"name": "title_query",
"params": [
"query_text"
],
"template_language": "mustache",
"template": {
"match": {
"title": "{{query_text}}"
}
}
},
{
"name": "description_query",
"params": [
"query_text"
],
"template_language": "mustache",
"template": {
"match": {
"description": "{{query_text}}"
}
}
}
]
}
}'
Run curl http://localhost:9200/_ltr/_featureset?pretty=true
to see registered features in the featureset.
First, run a simple text query:
curl -X GET "http://localhost:9200/tmdb/_search?pretty=true" -H 'Content-Type: application/json' -d'
{
"query": {
"match": {
"title": "First"
}
}
}'
Now, run a query with an SLTR filter to get logged features back:
curl -X GET "http://localhost:9200/tmdb/_search?pretty=true" -H 'Content-Type: application/json' -d'
{
"query": {
"bool": {
"filter" : [
{
"sltr" : {
"featureset" : "moviefeatureset",
"_name": "logged_featureset",
"active_features" : [
"title_query",
"description_query"
],
"params": {
"query_text": "First"
}
}
}
]
}
},
"ext": {
"ltr_log": {
"log_specs": {
"name": "log_entry1",
"named_query": "logged_featureset"
}
}
}
}'
Result:
{
"took": 2,
"timed_out": false,
"_shards": {
"total": 1,
"successful": 1,
"skipped": 0,
"failed": 0
},
"hits": {
"total": {
"value": 1,
"relation": "eq"
},
"max_score": 0.0,
"hits": [
{
"_index": "movies",
"_id": "rfzam40Bbf4EFOUV1cUr",
"_score": 0.0,
"_source": {
"title": "First Blood",
"year_released": 1982
},
"fields": {
"_ltrlog": [
{
"log_entry1": [
{
"name": "title_query",
"value": 0.2876821
},
{
"name": "description_query"
}
]
}
]
},
"matched_queries": [
"logged_featureset"
]
}
]
}
}
Things to note:
- Logged feature values in
"_ltrlog"
- Feature logging score returned for
title_query
- Feature log result returned for
description_query
with no score. This is because we originally indexed a document without a description field. - Nothing at all logged for
year_released
. This is because it was never registered as a feature of interest in the featureset.