Compared to individual extractions, the corresponding frontend search facets and dashboard views currently contain quite a lot of spurious terms/concepts/categories.
The problem is that grobid-keyterm tends to extract a lot of keyterms (40 by default), although it normally ranks them from the most to the least important. The facets and views simply use the number of occurrences of the keyterms (or key concepts or key categories) and do not use the score associated with each keyterm to reduce its importance.
To improve these facets/views, we could either reduce the number of extracted keyterms or exploit the keyterm scores when ranking the terms in the facets/views (with an ElasticSearch script).
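To illustrate the two options, here is a minimal Python sketch, independent of the actual frontend and of ElasticSearch. It assumes each document's keyterms are available as `(term, score)` pairs; the function names `facet_by_count` and `facet_by_score`, the `top_k` parameter, and the sample data are all hypothetical and only meant to show how the ranking would change.

```python
from collections import defaultdict

def facet_by_count(docs):
    """Current behaviour: rank facet terms by number of documents they occur in."""
    counts = defaultdict(int)
    for keyterms in docs:
        for term, _score in keyterms:
            counts[term] += 1
    # Sort by descending count, then alphabetically for a stable order.
    return sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))

def facet_by_score(docs, top_k=None):
    """Proposed behaviour: rank facet terms by the sum of their keyterm scores.

    If top_k is given, only the top_k highest-scored keyterms of each
    document are considered (the "reduce the number of keyterms" option).
    """
    totals = defaultdict(float)
    for keyterms in docs:
        ranked = sorted(keyterms, key=lambda ts: -ts[1])
        if top_k is not None:
            ranked = ranked[:top_k]
        for term, score in ranked:
            totals[term] += score
    return sorted(totals.items(), key=lambda kv: (-kv[1], kv[0]))

# Hypothetical sample: "paper" appears everywhere but always with a low score.
docs = [
    [("neural network", 0.9), ("paper", 0.1)],
    [("parsing", 0.8), ("paper", 0.1)],
    [("neural network", 0.7), ("paper", 0.1)],
]

by_count = [term for term, _ in facet_by_count(docs)]
by_score = [term for term, _ in facet_by_score(docs)]
by_score_top1 = [term for term, _ in facet_by_score(docs, top_k=1)]
```

With this sample, the count-based facet puts the spurious term "paper" first, the score-weighted facet moves it to the end, and keeping only the top-scored keyterm per document removes it entirely.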