kglab

The kglab library provides a simple abstraction layer in Python for building and using knowledge graphs.

SPECIAL REQUEST: Which features would you like to see the most in an open source Python library for building and using knowledge graphs? Please add suggestions to this online survey: https://forms.gle/FMHgtmxHYWocprMn6 This will help us prioritize our roadmap for kglab.

Background

For several KG projects, we kept reusing a similar working set of libraries:

Each of these libraries provides a useful piece of the puzzle when you need to leverage knowledge representation, graph algorithms, entity linking, interactive visualization, metadata queries, axioms, etc. However, some of them are relatively low-level (e.g., rdflib) or less actively maintained (e.g., skosify), and integrating them raised challenges for which we kept having to reinvent workarounds.

There are general operations that one must perform on knowledge graphs:

building triples
quality assurance (e.g., axioms)
managing a mix of namespaces
serialization to/from multiple formats
parallel processing across a cluster
interactive visualization
queries
graph algorithms
transitivity and other forms of enriching a graph
embedding (deep learning integration)
inference (e.g., PSL, Bayesian Networks, Causal, MLN, etc.)
other ML integrations

The kglab library provides a reasonably "Pythonic" abstraction layer for these operations on KGs. The class definitions can be subclassed and extended to handle specific needs.
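To make one of the operations listed above concrete, here is a minimal sketch of transitivity as a form of graph enrichment. It uses plain Python tuples rather than kglab's own classes, and the `ex:` names and the `transitive_closure` helper are assumptions made up for illustration.

```python
from collections import defaultdict

# Triples as plain (subject, predicate, object) tuples; the names are made up.
triples = {
    ("ex:Bolognese", "ex:subClassOf", "ex:PastaSauce"),
    ("ex:PastaSauce", "ex:subClassOf", "ex:Sauce"),
    ("ex:Sauce", "ex:subClassOf", "ex:Food"),
}


def transitive_closure(triples, predicate):
    """Return the input triples plus every triple implied by treating
    `predicate` as transitive."""
    successors = defaultdict(set)
    for s, p, o in triples:
        if p == predicate:
            successors[s].add(o)

    inferred = set(triples)
    for start in list(successors):
        # Depth-first walk over everything reachable from `start`.
        stack = list(successors[start])
        seen = set()
        while stack:
            node = stack.pop()
            if node in seen:
                continue
            seen.add(node)
            inferred.add((start, predicate, node))
            stack.extend(successors.get(node, ()))
    return inferred


enriched = transitive_closure(triples, "ex:subClassOf")
for triple in sorted(enriched):
    print(triple)
```

The three input triples yield three additional inferred ones, such as ex:Bolognese being a subclass of ex:Food. A library-backed version would typically store the inferred triples back into the graph instead of a Python set.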

Meanwhile, we're also extending some of the key components with distributed versions based on ray, to make better use of horizontal scale-out and parallelization.

NB: this repo is UNDER CONSTRUCTION and will undergo much iteration prior to the "KG 101" tutorial at https://www.knowledgeconnexions.world/talks/kg-101/

See wiki for further details.

Installation

Dependencies:

To install from PyPI:

pip install kglab

If you work directly from this Git repo, be sure to install the dependencies as well:

pip install -r requirements.txt

If you would like to run the notebooks locally, install JupyterLab:

If you use conda, you can install it with:

conda install -c conda-forge jupyterlab

If you use pip, you can install it with:

pip install jupyterlab

If installing via pip install --user, you must add the user-level bin directory to your PATH environment variable in order to launch JupyterLab.

If you are using a Unix derivative (FreeBSD, GNU / Linux, OS X), you can achieve this by using the export PATH="$HOME/.local/bin:$PATH" command.
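As a quick check of the PATH setup described above, the following sketch reports whether the user-level bin directory is on PATH. It assumes a Unix-like layout where user scripts live under the user base directory in `bin`; the `is_on_path` helper is a hypothetical name, not part of pip or Jupyter.

```python
import os
import site


def is_on_path(directory, path_value):
    """Return True if `directory` is one of the entries in a PATH-style string."""
    target = os.path.normpath(os.path.expanduser(directory))
    entries = [
        os.path.normpath(os.path.expanduser(entry))
        for entry in path_value.split(os.pathsep)
        if entry
    ]
    return target in entries


# Assumption: on Unix-like systems, `pip install --user` puts scripts in <user base>/bin.
user_bin = os.path.join(site.getuserbase(), "bin")
if is_on_path(user_bin, os.environ.get("PATH", "")):
    print(f"{user_bin} is on PATH")
else:
    print(f"{user_bin} is not on PATH; add it before launching JupyterLab")
```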

Once installed, launch JupyterLab with:

jupyter-lab

Tutorial Outline

Building a graph in RDF using rdflib

ex01_0.ipynb
- examine the dataset
ex01_1.ipynb
- construct a graph from RDF triples
- using multiple namespaces
- proper handling of literals
- serialization to strings and files using Turtle and JSON-LD

Leveraging the kglab abstraction layer

ex01_2.ipynb
- construct and serialize the same graph using kglab

Interactive graph visualization with pyvis

ex01_3.ipynb
- render triples as an interactive graph

Build a medium size KG from a CSV dataset

ex01_4.ipynb
- iterate through a dataset, representing a recipe for each row
- compare relative file sizes for different serialization formats
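A minimal sketch of the same idea, using only the standard library: iterate through CSV rows, emit triples for each recipe, and compare the byte size of a few encodings. The inline dataset, the column names, the base URI, and the hand-rolled N-Triples writer are all assumptions for illustration; the notebook itself works with a real dataset and library serializers.

```python
import csv
import gzip
import io
import json

# Inline stand-in for a recipe dataset; the columns and values are made up.
CSV_TEXT = """id,name,minutes
1,pancakes,15
2,omelette,10
"""

BASE = "http://example.org/recipe/"  # hypothetical namespace
XSD_INTEGER = "http://www.w3.org/2001/XMLSchema#integer"


def rows_to_triples(csv_text):
    """Yield (subject, predicate, object) tuples, two per CSV row."""
    for row in csv.DictReader(io.StringIO(csv_text)):
        subject = BASE + row["id"]
        yield (subject, BASE + "name", row["name"])
        yield (subject, BASE + "minutes", int(row["minutes"]))


def to_ntriples(triples):
    """Simplified N-Triples writer; only escapes backslashes and quotes."""
    lines = []
    for s, p, o in triples:
        if isinstance(o, int):
            obj = f'"{o}"^^<{XSD_INTEGER}>'
        else:
            escaped = o.replace("\\", "\\\\").replace('"', '\\"')
            obj = f'"{escaped}"'
        lines.append(f"<{s}> <{p}> {obj} .\n")
    return "".join(lines)


triples = list(rows_to_triples(CSV_TEXT))
nt_text = to_ntriples(triples)
nt_bytes = nt_text.encode("utf-8")

sizes = {
    "n-triples": len(nt_bytes),
    "json": len(json.dumps(triples).encode("utf-8")),
    "n-triples+gzip": len(gzip.compress(nt_bytes)),
}
for name, size in sizes.items():
    print(f"{name}: {size} bytes")
```

On a dataset this small the numbers say little, since compression overhead dominates; the comparison becomes meaningful only at the scale of the full KG.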

Running SPARQL queries

ex01_5.ipynb
- load the medium size KG from the earlier example
- run a SPARQL query to identify recipes with special ingredients and cooking times
- use SPARQL queries and post-processing to create annotations

Graph algorithms with networkx

ex01_6.ipynb
- load the medium size KG from the earlier example
- run graph algorithms in networkx to analyze properties of the KG
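A small sketch of how triples might be handed to networkx for analysis. The triples are made up, and mapping each subject and object to a node with the predicate as an edge attribute is one possible convention, not necessarily the one the notebook uses.

```python
import networkx as nx

# Made-up (subject, predicate, object) triples.
triples = [
    ("pancakes", "ingredient", "egg"),
    ("pancakes", "ingredient", "flour"),
    ("omelette", "ingredient", "egg"),
    ("cake", "ingredient", "flour"),
    ("cake", "ingredient", "egg"),
]

graph = nx.DiGraph()
for s, p, o in triples:
    # Subjects and objects become nodes; the predicate labels the edge.
    graph.add_edge(s, o, predicate=p)

in_degree = dict(graph.in_degree())          # how many recipes use each node
centrality = nx.degree_centrality(graph)     # degree normalized by (n - 1)
components = list(nx.weakly_connected_components(graph))

most_used = max(in_degree, key=in_degree.get)
print(most_used, in_degree[most_used])
```

Note that a `DiGraph` keeps at most one edge per ordered node pair, so a graph with several predicates between the same two nodes would need `MultiDiGraph` instead.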

Statistical relational learning with pslpython

ex01_7.ipynb
- use RDF to represent the "simple acquaintance" PSL example graph
- load the graph into a KG
- visualize the KG
- run PSL to infer uncertainty in the knows relation for grounded nodes

Vector embedding with gensim

ex01_8.ipynb
- curating annotations
- analyze ingredient labels from 250K recipes
- use vector embedding to rank relatedness for labels
- add string similarity for an approximate Pareto archive
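The last two steps can be sketched with the standard library alone. The toy vectors below stand in for embeddings learned with gensim, `difflib.SequenceMatcher` stands in for whichever string similarity measure the notebook uses, and the helper names are hypothetical.

```python
import math
from difflib import SequenceMatcher

# Toy 3-d vectors standing in for learned embeddings; the values are made up.
vectors = {
    "egg": (0.9, 0.1, 0.0),
    "eggs": (0.88, 0.12, 0.02),
    "egg yolk": (0.7, 0.3, 0.1),
    "flour": (0.1, 0.9, 0.2),
}


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def string_similarity(a, b):
    """Ratio in [0, 1] based on matching character blocks."""
    return SequenceMatcher(None, a, b).ratio()


def score_labels(target, vectors):
    """Map each other label to (embedding similarity, string similarity)."""
    return {
        label: (cosine(vectors[target], vec), string_similarity(target, label))
        for label, vec in vectors.items()
        if label != target
    }


def pareto_front(scores):
    """Keep labels that no other label beats or ties on both scores
    while strictly beating on at least one."""
    front = {}
    for label, (emb, txt) in scores.items():
        dominated = any(
            other_emb >= emb and other_txt >= txt
            and (other_emb > emb or other_txt > txt)
            for other, (other_emb, other_txt) in scores.items()
            if other != label
        )
        if not dominated:
            front[label] = (emb, txt)
    return front


scores = score_labels("egg", vectors)
ranked = sorted(scores, key=lambda label: scores[label][0], reverse=True)
front = pareto_front(scores)
print(ranked)
print(front)
```

Ranking by embedding similarity alone orders the candidates; the Pareto step keeps only those that are not outperformed on both measures at once, which is one way to shortlist labels for manual curation.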

Production Use Cases

Derwen and its client projects

Kudos

Many thanks to our contributors: @jake-aft, plus general support from Derwen, Inc. and The Knowledge Graph Conference.
