Season normalization

EgoLaparra edited this page Sep 14, 2020 · 1 revision

Eidos can identify seasonal expressions and normalize them, i.e. retrieve their starting and ending dates. To do this, SeasonFinder combines temporal and geographical information extracted by other components of the system with a database of seasons that includes their typical starting and ending months.

Let's take the following example to illustrate the process:

“... in many areas of Oromia, an early start to the 2016 lean season is expected.”

In this case, the word season triggers a possible season expression. Thus, SeasonFinder will look in the trigger's close context for the season type (lean) and the reference location (Oromia). If the season database contains information about the season type for the reference location, SeasonFinder will look for a reference year (2016) and use it to normalize the expression.
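The trigger-and-context step can be sketched as follows. This is only an illustration in Python, not the Eidos implementation: the function name `find_season_mentions`, the token-window approach, and the `window` parameter are all assumptions made for this example.

```python
def find_season_mentions(tokens, triggers, seasons_for_location, window=3):
    """Return (trigger_index, season_type) pairs found in a list of tokens.

    Hypothetical helper: `seasons_for_location` holds the season types
    known for the reference location (e.g. Oromia).
    """
    lowered = [t.lower() for t in tokens]
    mentions = []
    for i, token in enumerate(lowered):
        if token not in triggers:
            continue
        # A trigger such as "belg" may itself be a season type.
        if token in seasons_for_location:
            mentions.append((i, token))
            continue
        # Otherwise look for a known season type right before the trigger.
        context = lowered[max(0, i - window):i]
        for season_type in seasons_for_location:
            type_tokens = season_type.split()
            n = len(type_tokens)
            if n <= len(context) and context[-n:] == type_tokens:
                mentions.append((i, season_type))
                break
    return mentions


tokens = "an early start to the 2016 lean season is expected".split()
triggers = {"belg", "meher", "season"}
oromia_seasons = {"meher", "long rainy", "belg", "lean"}
print(find_season_mentions(tokens, triggers, oromia_seasons))
```

In this sketch the reference location and the reference year are assumed to come from other components, as described above; only the season type is searched for around the trigger.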

The lean season in Oromia typically starts in January and ends in April. Consequently, in this example, lean season is normalized to the time interval:

[2016-01-01T00:00, 2016-04-01T00:00)
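As a rough illustration of how such an interval can be derived from a reference year and the two months, here is a minimal Python sketch. The function name `season_interval` is hypothetical, and the handling of seasons that cross a year boundary is an assumption, since the behavior of Eidos in that case is not described here.

```python
from datetime import datetime


def season_interval(year, start_month, end_month):
    """Return the half-open interval [start, end) for a season."""
    start = datetime(year, start_month, 1)
    # Assumption: if the end month is not after the start month,
    # the season is taken to finish in the following year.
    end_year = year if end_month > start_month else year + 1
    end = datetime(end_year, end_month, 1)
    return start, end


start, end = season_interval(2016, 1, 4)
print(f"[{start.isoformat(timespec='minutes')}, {end.isoformat(timespec='minutes')})")
# prints "[2016-01-01T00:00, 2016-04-01T00:00)"
```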

Update seasons database

The information about seasons is stored in src/main/resources/org/clulab/wm/eidos/english/context/seasons-db.yml. This file can be edited to extend, adapt or improve its knowledge without requiring any change to the Eidos code.

seasons-db.yml is a YAML file that contains two data structures: a list of triggers and a nested dictionary of seasons per location. The latter has the following structure:

GeoLocation ID:
  season type:
    start: integer
    end: integer

Here, GeoLocation ID must be the identifier of the location in the GeoNames database, and start and end are the starting month (included) and ending month (excluded) of the season, respectively.

The example below contains a snippet of seasons-db.yml with a few triggers and some season types for the location Oromia:

---
triggers:
  - belg
  - meher
  - season

---
'444185': # Oromia
  meher:
    start: 6
    end: 9
  long rainy:
    start: 6
    end: 9
  belg:
    start: 3
    end: 5
  lean:
    start: 1
    end: 4
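
To make the structure concrete, the sketch below shows a lookup over plain Python dictionaries that mirror the snippet above. It assumes a YAML loader would return the two documents in roughly this shape; the names `triggers_doc`, `seasons` and `lookup_season` are hypothetical and not part of Eidos. Note that the GeoNames ID is quoted in the file, so it is a string key rather than an integer.

```python
# First YAML document: the list of triggers.
triggers_doc = {"triggers": ["belg", "meher", "season"]}

# Second YAML document: seasons per GeoNames ID.
seasons = {
    "444185": {  # Oromia
        "meher": {"start": 6, "end": 9},
        "long rainy": {"start": 6, "end": 9},
        "belg": {"start": 3, "end": 5},
        "lean": {"start": 1, "end": 4},
    }
}


def lookup_season(seasons, geonames_id, season_type):
    """Return (start, end) months for a season, or None if it is not listed."""
    entry = seasons.get(str(geonames_id), {}).get(season_type)
    if entry is None:
        return None
    start, end = entry["start"], entry["end"]
    # Both fields are month numbers, so reject anything outside 1..12.
    if not (1 <= start <= 12 and 1 <= end <= 12):
        raise ValueError(f"invalid months for {season_type!r}: {start}, {end}")
    return start, end


print(lookup_season(seasons, "444185", "lean"))    # prints "(1, 4)"
print(lookup_season(seasons, "444185", "winter"))  # prints "None"
```

A check like the month-range test can help catch mistakes when new locations or season types are added to the file.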