A parser for Wikipedia's Current events Portal that generates a knowledge graph from the extracted data. The dataset focuses on extracting positional and temporal information about the events.
Apart from Wikipedia's Current events Portal, these services are used to enrich the dataset with additional data:
While the dataset is generated, some analytics about the extracted data are tracked. If neither `-msd` nor `-med` is used, the analytics are saved for every month under `./currenteventstokg/analytics/`. To view the analytics for a specific month span X to Y, use `-s X -e Y -cca`.
All arguments are listed via:

```shell
python -m currenteventstokg -h
```
Generating a dataset from February 2021 to March 2022:

```shell
python -m currenteventstokg -s 2/2021 -e 3/2022
```

Generating a dataset for the 2nd of March 2021:

```shell
python -m currenteventstokg -s 3/2021 -e 3/2021 -msd 2 -med 2
```
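As the single-month example above suggests, the `-s`/`-e` span appears to include both endpoints. A minimal sketch of enumerating the months in such a span (the helper name and `M/YYYY` parsing are illustrative, not part of the tool):

```python
from datetime import date

def month_span(start: str, end: str):
    """Yield (month, year) pairs from start to end inclusive.

    start and end are 'M/YYYY' strings, matching the -s/-e argument format.
    """
    sm, sy = (int(x) for x in start.split("/"))
    em, ey = (int(x) for x in end.split("/"))
    cur, last = date(sy, sm, 1), date(ey, em, 1)
    while cur <= last:
        yield cur.month, cur.year
        # advance to the first day of the next month
        cur = date(cur.year + (cur.month == 12), cur.month % 12 + 1, 1)

months = list(month_span("2/2021", "3/2022"))
print(len(months))  # 14 months: February 2021 through March 2022
```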
- Clone this repo into a location of your choice.
- Navigate to the root directory of your clone.
- Create the container:

  ```shell
  docker build -t current-events-to-kg .
  ```

- Run it with your arguments, e.g.:

  ```shell
  ./run-container.sh -s 3/2021 -e 3/2021 -msd 2 -med 2
  ```
For each parsed month, a file for each graph type (base, ohg, osm and raw) is saved as `{month}_{year}_{graph type}.jsonld`, e.g. `January_2022_base.jsonld`. If you change `-msd` or `-med`, only part of each month is parsed. The output of partial month parsing is saved as `{msd}_{med}_{month}_{year}_{graph type}.jsonld`, e.g. `1_2_January_2022_base.jsonld` when you parse only the first two days.
The generated graphs are subdivided into four graph types:
- base: the main graph
- ohg: includes the one-hop subgraphs for each Wikidata entity
- osm: includes the OSM Nominatim well-known text for the outlines of locations, with their types and IDs (this graph is roughly 10x larger than base)
- raw: the raw HTML from which information was extracted, e.g. the Wikipedia infobox
Because the URIs for each entity match across all graph types, you can import the graphs modularly and they unify again.
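The unification described above can be illustrated with plain JSON-LD node objects: nodes from different graph types that share an `@id` merge into one entity. The example data below is purely hypothetical; only the merge-by-URI idea reflects the dataset's design.

```python
import json

# Hypothetical minimal excerpts: the same entity appears in the "base" and
# "osm" graphs under an identical URI (illustrative data, not real output).
base_jsonld = json.loads("""
{"@graph": [
  {"@id": "http://example.org/event/1", "label": "Example event"}
]}
""")
osm_jsonld = json.loads("""
{"@graph": [
  {"@id": "http://example.org/event/1", "wkt": "POINT(13.4 52.5)"}
]}
""")

def merge_graphs(*graphs):
    """Union JSON-LD node objects by @id, combining their properties."""
    merged = {}
    for g in graphs:
        for node in g["@graph"]:
            merged.setdefault(node["@id"], {}).update(node)
    return list(merged.values())

nodes = merge_graphs(base_jsonld, osm_jsonld)
print(len(nodes))        # 1: both graphs describe the same URI
print(sorted(nodes[0]))  # properties from both graph types are unified
```

In practice an RDF library (e.g. rdflib) would do this merge automatically when parsing several JSON-LD files into one graph, since RDF triples with the same subject URI naturally coexist.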
The generated knowledge graph has the following schema:
GNU General Public License v3.0 or later
See COPYING for the full text.