Elasticsearch / Update to 8.14.3. #8337

Merged
fxprunayre merged 2 commits into main from 44-es-8143
Oct 16, 2024

fxprunayre
Member

Fix for #8305

Checklist

  • I have read the contribution guidelines
  • Pull request provided for main branch, backports managed with label
  • Good housekeeping of code, cleaning up comments, tests, and documentation
  • Clean commit history broken into understandable chunks, avoiding big commits with hundreds of files, cautious of reformatting and whitespace changes
  • Clean commit messages, longer verbose messages are encouraged
  • API Changes are identified in commit messages
  • Testing provided for features or enhancements using automatic tests
  • User documentation provided for new features or enhancements in manual
  • Build documentation provided for development instructions in README.md files
  • Library management using pom.xml dependency management. Update build documentation with intended library use and library tutorials or documentation

Funded by Ifremer

@fxprunayre fxprunayre added this to the 4.4.6 milestone Sep 2, 2024
@fxprunayre fxprunayre requested a review from josegar74 September 2, 2024 16:37

sonarcloud bot commented Sep 2, 2024

Member

@josegar74 josegar74 left a comment

I have done the following test:

  1. Start Elasticsearch 8.4.2
  2. Start GeoNetwork
  3. Admin console > Tools > Delete index and reindex
  4. Load ISO19139 samples
  5. Go to the search page, which shows: "Query returned an error. Check the console for details."

The response error:

{
    "servlet": "spring",
    "message": "Error is: Bad Request.\nRequest:\n{"from":0,"size":30,"sort":["_score"],"query":{"function_score":{"boost":"5","functions":[{"filter":{"match":{"resourceType":"series"}},"weight":1.5},{"filter":{"exists":{"field":"parentUuid"}},"weight":0.3},{"filter":{"match":{"cl_status.key":"obsolete"}},"weight":0.2},{"filter":{"match":{"cl_status.key":"superseded"}},"weight":0.3},{"gauss":{"changeDate":{"scale":"365d","offset":"90d","decay":0.5}}}],"score_mode":"multiply","query":{"bool":{"must":[{"terms":{"isTemplate":["n"]}}],"filter":{"query_string":{"query":"*:* AND (draft:n OR draft:e)"}}}}}},"aggregations":{"resourceType":{"terms":{"field":"resourceType"},"meta":{"decorator":{"type":"icon","prefix":"fa fa-fw gn-icon-"},"field":"resourceType"}},"cl_spatialRepresentationType.key":{"terms":{"field":"cl_spatialRepresentationType.key","size":10},"meta":{"field":"cl_spatialRepresentationType.key"}},"format":{"terms":{"field":"format"},"meta":{"collapsed":true,"field":"format"}},"availableInServices":{"filters":{"filters":{"availableInViewService":{"query_string":{"query":"+linkProtocol:/OGC:WMS.*/"}},"availableInDownloadService":{"query_string":{"query":"+linkProtocol:/OGC:WFS.*/"}}}},"meta":{"decorator":{"type":"icon","prefix":"fa fa-fw 
","map":{"availableInViewService":"fa-globe","availableInDownloadService":"fa-download"}}}},"th_gemet_tree.key":{"terms":{"field":"th_gemet_tree.key","size":100,"order":{"_key":"asc"},"include":"[^^]+^?[^^]+"},"meta":{"field":"th_gemet_tree.key"}},"th_httpinspireeceuropaeumetadatacodelistPriorityDataset-PriorityDataset_tree.default":{"terms":{"field":"th_httpinspireeceuropaeumetadatacodelistPriorityDataset-PriorityDataset_tree.default","size":100,"order":{"_key":"asc"}},"meta":{"field":"th_httpinspireeceuropaeumetadatacodelistPriorityDataset-PriorityDataset_tree.default"}},"th_httpinspireeceuropaeutheme-theme_tree.key":{"terms":{"field":"th_httpinspireeceuropaeutheme-theme_tree.key","size":34},"meta":{"decorator":{"type":"icon","prefix":"fa fa-fw gn-icon iti-","expression":"http://inspire.ec.europa.eu/theme/(.*)"},"field":"th_httpinspireeceuropaeutheme-theme_tree.key"}},"tag":{"terms":{"field":"tag.langeng","include":".*","size":10},"meta":{"caseInsensitiveInclude":true,"field":"tag.langeng"}},"th_regions_tree.default":{"terms":{"field":"th_regions_tree.default","size":100,"order":{"_key":"asc"}},"meta":{"field":"th_regions_tree.default"}},"resolutionScaleDenominator":{"histogram":{"field":"resolutionScaleDenominator","interval":10000,"keyed":true,"min_doc_count":1},"meta":{"collapsed":true}},"creationYearForResource":{"histogram":{"field":"creationYearForResource","interval":5,"keyed":true,"min_doc_count":1},"meta":{"collapsed":true}},"OrgForResource":{"terms":{"field":"OrgForResourceObject.langeng","include":".*","size":20},"meta":{"caseInsensitiveInclude":true,"field":"OrgForResourceObject.langeng"}},"cl_maintenanceAndUpdateFrequency.key":{"terms":{"field":"cl_maintenanceAndUpdateFrequency.key","size":10},"meta":{"collapsed":true,"field":"cl_maintenanceAndUpdateFrequency.key"}}},"_source":{"includes":["uuid","id","groupOwner","logo","cat","inspireThemeUri","inspireTheme_syn","cl_topic","resourceType","resourceTitle*","resourceAbstract*","draft","draftId","owner",
"link","status*","rating","geom","contact*","Org*","isTemplate","valid","isHarvested","dateStamp","documentStandard","standardNameObject.default","cl_status*","mdStatus*","op*","documentStandard","groupOwner","owner","id"]},"script_fields":{"overview":{"script":{"source":"return params['_source'].overview == null ? [] : params['_source'].overview.stream().findFirst().orElse([]);"}}},"track_total_hits":true}\n.\nError:\n{"error":{"root_cause":[{"type":"illegal_argument_exception","reason":"Text fields are not optimised for operations that require per-document field data like aggregations and sorting, so these operations are disabled by default. Please use a keyword field instead. Alternatively, set fielddata=true on [cl_spatialRepresentationType.key] in order to load field data by uninverting the inverted index. Note that this can use significant memory."}],"type":"search_phase_execution_exception","reason":"all shards failed","phase":"query","grouped":true,"failed_shards":[{"shard":0,"index":"gn-records","node":"I1k01ZfGSYuYfRxVmwOglQ","reason":{"type":"illegal_argument_exception","reason":"Text fields are not optimised for operations that require per-document field data like aggregations and sorting, so these operations are disabled by default. Please use a keyword field instead. Alternatively, set fielddata=true on [cl_spatialRepresentationType.key] in order to load field data by uninverting the inverted index. Note that this can use significant memory."}}],"caused_by":{"type":"illegal_argument_exception","reason":"Text fields are not optimised for operations that require per-document field data like aggregations and sorting, so these operations are disabled by default. Please use a keyword field instead. Alternatively, set fielddata=true on [cl_spatialRepresentationType.key] in order to load field data by uninverting the inverted index. 
Note that this can use significant memory.","caused_by":{"type":"illegal_argument_exception","reason":"Text fields are not optimised for operations that require per-document field data like aggregations and sorting, so these operations are disabled by default. Please use a keyword field instead. Alternatively, set fielddata=true on [cl_spatialRepresentationType.key] in order to load field data by uninverting the inverted index. Note that this can use significant memory."}}},"status":400}.",
    "url": "/geonetwork/srv/api/search/records/_search",
    "status": "400"
}

It seems related to the facets: clicking the search button displays the results.

I tested the same steps with Elasticsearch 8.14 and they work fine.

Elasticsearch 8.4.2 was tested with security enabled, but I don't think that is the problem.

@fxprunayre
Member Author

set fielddata=true on [cl_spatialRepresentationType.key] in order to load

So it relates to the mapping. When the catalogue is empty, http://localhost:9200/gn-records/_mapping shows:

{
  "codelist": {
    "match": "[cl_*]",
    "mapping": {
      "properties": {
        "default": {
          "type": "keyword"
        },
        "link": {
          "type": "keyword"
        },
        "text": {
          "type": "text"
        },
        "key": {
          "type": "keyword"
        }
      },
      "type": "object"
    }
  }
},

When records are indexed, fields are created for each codelist:

"cl_topic": {
  "properties": {
    "default": {
      "type": "text",
      "fields": {
        "keyword": {
          "type": "keyword",
          "ignore_above": 256
        }
      }
    },
    "key": {
      "type": "text",
      "fields": {
        "keyword": {
          "type": "keyword",
          "ignore_above": 256
        }
      }
    },
    "lang": {
      "type": "text",
      "fields": {
        "keyword": {
          "type": "keyword",
          "ignore_above": 256
        }
      }
    },
    "langeng": {
      "type": "text",
      "fields": {
        "keyword": {
          "type": "keyword",
          "ignore_above": 256
        }
      }
    }
  }
},

Not sure why the dynamic template is not matched anymore. Checking it.
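One way to spot fields that missed the dynamic template is to scan the mapping for `text` types. The following is a minimal sketch, assuming the `properties` section of the `_mapping` response has already been loaded as a Python dict. `find_text_fields` is a hypothetical helper, and the sample is trimmed from the `cl_topic` mapping above.

```python
def find_text_fields(properties, prefix=""):
    """Return the dotted paths of all fields mapped with type "text"."""
    found = []
    for name, spec in properties.items():
        path = f"{prefix}{name}"
        if spec.get("type") == "text":
            found.append(path)
        # Object fields nest their children under "properties";
        # multi-fields under "fields" are deliberately not followed.
        found.extend(find_text_fields(spec.get("properties", {}), path + "."))
    return sorted(found)


# Sample trimmed from the cl_topic mapping (illustrative only).
sample_properties = {
    "cl_topic": {
        "properties": {
            "default": {
                "type": "text",
                "fields": {"keyword": {"type": "keyword", "ignore_above": 256}},
            },
            "key": {
                "type": "text",
                "fields": {"keyword": {"type": "keyword", "ignore_above": 256}},
            },
        }
    }
}

print(find_text_fields(sample_properties))
# ['cl_topic.default', 'cl_topic.key']
```

Any `cl_*` path in the result points at a field that would trigger the fielddata error when used in a terms aggregation.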

Contributor

@jahow jahow left a comment

@fxprunayre
Member Author

Not sure why the dynamic template is not matched anymore. Checking it.

Probably related to elastic/elasticsearch-java#841

"codelist": {
  "match": "[cl_*]",

instead of

"codelist": {
  "match": "cl_*",

Not sure what the best way to solve this is.
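The effect of the two patterns can be illustrated with a small matcher. This is a minimal sketch, assuming the default `simple` match pattern of dynamic templates, where `*` is the only wildcard and every other character (including `[` and `]`) is literal. `simple_match` is a hypothetical helper, not Elasticsearch code.

```python
import re


def simple_match(pattern, field_name):
    """Match field_name against a pattern where only '*' is a wildcard."""
    # Escape each literal piece, then join the pieces with ".*".
    regex = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.fullmatch(regex, field_name) is not None


print(simple_match("cl_*", "cl_topic"))    # True
print(simple_match("[cl_*]", "cl_topic"))  # False
```

Under that assumption, a template stored with `"[cl_*]"` never applies to `cl_topic`, so the field falls back to the default dynamic mapping (`text` with a `keyword` sub-field).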

@fxprunayre
Member Author

Not sure what the best way to solve this is.

Maybe in the long run, each GeoNetwork branch should stick to one Elasticsearch version branch, e.g. 4.4.x on 8.14.x (or, if we want to avoid the issue above, roll back to a version before 8.9? 8.11 was used and we did not notice that issue).

On my side, no setup requires a fixed (and "old") version of Elasticsearch. Usually the request is rather to update to the latest version, so using the same version for the Java client and the server is also fine.

@fxprunayre
Member Author

Users who would like to use an 8.4 server can always create the index with the mapping themselves:

curl -X DELETE http://localhost:9200/gn-records
curl -X PUT http://localhost:9200/gn-records -H "Content-Type:application/json"  -d @web/src/main/webapp/WEB-INF/data/config/index/records.json
{"acknowledged":true,"shards_acknowledged":true,"index":"gn-records"}

and then use the "reindex record" action, so that the 8.14 Java client does not send the index mapping to the server.
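The same two calls can be scripted. This is a minimal sketch using only the Python standard library; `build_recreate_requests` is a hypothetical helper, and the server URL, index name and mapping file are the ones used in the curl commands above.

```python
import json
import urllib.request


def build_recreate_requests(base_url, index, mapping):
    """Build the DELETE and PUT requests that recreate an index with a mapping."""
    url = f"{base_url.rstrip('/')}/{index}"
    delete = urllib.request.Request(url, method="DELETE")
    put = urllib.request.Request(
        url,
        data=json.dumps(mapping).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="PUT",
    )
    return [delete, put]


def recreate_index(base_url, index, mapping_path):
    """Send both requests; urlopen raises HTTPError on a non-2xx response,
    for example when the index to delete does not exist."""
    with open(mapping_path, encoding="utf-8") as handle:
        mapping = json.load(handle)
    for request in build_recreate_requests(base_url, index, mapping):
        with urllib.request.urlopen(request) as response:
            print(request.get_method(), response.status)
```

A call such as `recreate_index("http://localhost:9200", "gn-records", "web/src/main/webapp/WEB-INF/data/config/index/records.json")` would mirror the curl commands, assuming an unsecured local server.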

Member

@josegar74 josegar74 left a comment

I guess this issue has been present since the earlier upgrade to 8.14.0.

The problem is in creating the index, so as long as the release notes clearly say to use curl for older versions, it's fine for me.

Please update the documentation page as indicated by @jahow, and it should be fine.

@fxprunayre
Member Author

As discussed

  • the documentation now states the Java client version in use and recommends using the same Elasticsearch server version
  • for future branches, we should stick to one Elasticsearch version, e.g. 4.4.x stays on 8.14.x, to avoid this kind of issue between the Java client and the server.
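The recommendation can be expressed as a simple check. This is a minimal sketch, assuming plain `major.minor.patch` version strings; `same_release_line` is a hypothetical helper and not part of GeoNetwork.

```python
def same_release_line(client_version, server_version):
    """Return True when client and server share the same major.minor version."""
    return client_version.split(".")[:2] == server_version.split(".")[:2]


print(same_release_line("8.14.3", "8.14.0"))  # True
print(same_release_line("8.14.3", "8.4.2"))   # False
```

Comparing only the first two components treats patch releases as interchangeable, which matches the "4.4.x stays on 8.14.x" policy.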


sonarcloud bot commented Oct 15, 2024

@fxprunayre fxprunayre merged commit 7b9361a into main Oct 16, 2024
10 checks passed
@fxprunayre fxprunayre deleted the 44-es-8143 branch October 16, 2024 07:34