updated docs
ceteri committed Nov 27, 2021
1 parent b93dbd3 commit e9f03c5
Showing 7 changed files with 53 additions and 52 deletions.
2 changes: 1 addition & 1 deletion docs/glossary.md
@@ -63,7 +63,7 @@ see: <https://derwen.ai/d/data_science>

### graph database

### graph-based data science
### graph data science


## – K –
6 changes: 3 additions & 3 deletions docs/index.md
@@ -1,4 +1,4 @@
# Graph-Based Data Science
# Graph Data Science

<img src="assets/logo.png" width="113" alt="illustration of a knowledge graph, plus laboratory glassware"/>

@@ -12,7 +12,7 @@ The main goal is to leverage idiomatic Python for common use cases in
and
[data engineering](glossary/#data-engineering)
work that require graph data, presenting
[*graph-based data science*](glossary/#graph-based-data-science)
[*graph data science*](glossary/#graph-data-science)
as an emerging practice.


@@ -86,7 +86,7 @@ etc.
### Natural Language Understanding

**Point 4:**
incorporate graph-based methods and
incorporate graph data science practices and
[semantic technologies](glossary/#semantic-technologies)
into
[`spaCy`](https://spacy.io/) pipelines, e.g., through
7 changes: 3 additions & 4 deletions docs/learn.md
@@ -1,6 +1,6 @@
# Learning

This work builds on the ["Graph-Based Data Science"](https://derwen.ai/s/kcgh)
This work builds on the ["Graph Thinking"](https://derwen.ai/s/kcgh)
talks at conferences and meetups, which experiment with the content –
collecting feedback, critiques, suggestions, etc.

@@ -11,9 +11,8 @@ To illustrate:
<img src="../assets/learning.png" width="500" />

The objective for these learning materials is to help people learn how
to leverage **kglab** effectively, gain confidence working with
graph-based data science, plus have examples to repurpose for their
own use cases.
to leverage **kglab** effectively, gain confidence working with graph
data science, plus have examples to repurpose for their own use cases.

You'll find a mix of topics throughout:
data science, business context, AI applications, data management,
4 changes: 2 additions & 2 deletions docs/overview.md
@@ -45,7 +45,7 @@ with a notebook-based tutorial at its core,
focused on a community and their business use cases.

The scope now is about
[*graph-based data science*](../glossary/#graph-based-data-science),
[*graph data science*](../glossary/#graph-data-science),
and perhaps someday this may spin-out a book or other learning materials.


@@ -62,7 +62,7 @@ Consider the fact that many dependencies have their origins in the
[Semantic Web](glossary/#semantic-web).
The ongoing work of [W3C](glossary/#w3c)
provides ontologies, standards, and other initiatives that are incredibly
valuable for graph-based.
valuable for graph data science.
That overall effort began in the 1990s, and arguably its momentum
imploded circa 2005 – despite best intentions by brilliant individuals
and quite capable organizations.
65 changes: 34 additions & 31 deletions docs/ref.md
@@ -115,7 +115,7 @@ a `dict` describing the namespaces in this RDF graph

---
#### [`describe_ns` method](#kglab.KnowledgeGraph.describe_ns)
[*\[source\]*](https://github.com/DerwenAI/kglab/blob/main/kglab/kglab.py#L231)
[*\[source\]*](https://github.com/DerwenAI/kglab/blob/main/kglab/kglab.py#L236)

```python
describe_ns()
@@ -129,7 +129,7 @@ a [`pandas.DataFrame`](https://pandas.pydata.org/pandas-docs/stable/reference/ap

---
#### [`get_context` method](#kglab.KnowledgeGraph.get_context)
[*\[source\]*](https://github.com/DerwenAI/kglab/blob/main/kglab/kglab.py#L258)
[*\[source\]*](https://github.com/DerwenAI/kglab/blob/main/kglab/kglab.py#L263)

```python
get_context()
@@ -144,7 +144,7 @@ context needed for JSON-LD serialization

---
#### [`encode_date` method](#kglab.KnowledgeGraph.encode_date)
[*\[source\]*](https://github.com/DerwenAI/kglab/blob/main/kglab/kglab.py#L277)
[*\[source\]*](https://github.com/DerwenAI/kglab/blob/main/kglab/kglab.py#L282)

```python
encode_date(dt, tzinfos)
@@ -164,7 +164,7 @@ timezones as a dict, used by

---
#### [`add` method](#kglab.KnowledgeGraph.add)
[*\[source\]*](https://github.com/DerwenAI/kglab/blob/main/kglab/kglab.py#L299)
[*\[source\]*](https://github.com/DerwenAI/kglab/blob/main/kglab/kglab.py#L304)

```python
add(s, p, o)
@@ -187,7 +187,7 @@ To prepare for upcoming **kglab** features, **this is the preferred method for a

---
#### [`remove` method](#kglab.KnowledgeGraph.remove)
[*\[source\]*](https://github.com/DerwenAI/kglab/blob/main/kglab/kglab.py#L333)
[*\[source\]*](https://github.com/DerwenAI/kglab/blob/main/kglab/kglab.py#L338)

```python
remove(s, p, o)
@@ -237,7 +237,7 @@ this `KnowledgeGraph` object – used for method chaining

---
#### [`load_rdf_text` method](#kglab.KnowledgeGraph.load_rdf_text)
[*\[source\]*](https://github.com/DerwenAI/kglab/blob/main/kglab/kglab.py#L525)
[*\[source\]*](https://github.com/DerwenAI/kglab/blob/main/kglab/kglab.py#L530)

```python
load_rdf_text(data, format="ttl", base=None, **args)
@@ -263,7 +263,7 @@ this `KnowledgeGraph` object – used for method chaining

---
#### [`save_rdf` method](#kglab.KnowledgeGraph.save_rdf)
[*\[source\]*](https://github.com/DerwenAI/kglab/blob/main/kglab/kglab.py#L568)
[*\[source\]*](https://github.com/DerwenAI/kglab/blob/main/kglab/kglab.py#L573)

```python
save_rdf(path, format="ttl", base=None, encoding="utf-8", **args)
@@ -287,7 +287,7 @@ optional text encoding value, defaults to `"utf-8"`, must be in the [Python code

---
#### [`save_rdf_text` method](#kglab.KnowledgeGraph.save_rdf_text)
[*\[source\]*](https://github.com/DerwenAI/kglab/blob/main/kglab/kglab.py#L633)
[*\[source\]*](https://github.com/DerwenAI/kglab/blob/main/kglab/kglab.py#L638)

```python
save_rdf_text(format="ttl", base=None, encoding="utf-8", **args)
@@ -333,7 +333,7 @@ this `KnowledgeGraph` object – used for method chaining

---
#### [`save_jsonld` method](#kglab.KnowledgeGraph.save_jsonld)
[*\[source\]*](https://github.com/DerwenAI/kglab/blob/main/kglab/kglab.py#L720)
[*\[source\]*](https://github.com/DerwenAI/kglab/blob/main/kglab/kglab.py#L725)

```python
save_jsonld(path, encoding="utf-8", **args)
@@ -373,7 +373,7 @@ this `KnowledgeGraph` object – used for method chaining

---
#### [`save_parquet` method](#kglab.KnowledgeGraph.save_parquet)
[*\[source\]*](https://github.com/DerwenAI/kglab/blob/main/kglab/kglab.py#L806)
[*\[source\]*](https://github.com/DerwenAI/kglab/blob/main/kglab/kglab.py#L811)

```python
save_parquet(path, compression="snappy", storage_options=None, **kwargs)
@@ -396,7 +396,7 @@ extra options parsed by [`fsspec`](https://github.com/intake/filesystem_spec) fo

---
#### [`load_csv` method](#kglab.KnowledgeGraph.load_csv)
[*\[source\]*](https://github.com/DerwenAI/kglab/blob/main/kglab/kglab.py#L851)
[*\[source\]*](https://github.com/DerwenAI/kglab/blob/main/kglab/kglab.py#L856)

```python
load_csv(url)
@@ -441,7 +441,7 @@ a list of identifiers for the top-level nodes added from the Roam Research graph

---
#### [`n3fy` method](#kglab.KnowledgeGraph.n3fy)
[*\[source\]*](https://github.com/DerwenAI/kglab/blob/main/kglab/kglab.py#L971)
[*\[source\]*](https://github.com/DerwenAI/kglab/blob/main/kglab/kglab.py#L976)

```python
n3fy(node, pythonify=True)
@@ -461,7 +461,7 @@ text (or Python objects) for the serialized node

---
#### [`n3fy_row` method](#kglab.KnowledgeGraph.n3fy_row)
[*\[source\]*](https://github.com/DerwenAI/kglab/blob/main/kglab/kglab.py#L997)
[*\[source\]*](https://github.com/DerwenAI/kglab/blob/main/kglab/kglab.py#L1002)

```python
n3fy_row(row_dict, pythonify=True)
@@ -481,7 +481,7 @@ a dictionary of serialized row bindings

---
#### [`query` method](#kglab.KnowledgeGraph.query)
[*\[source\]*](https://github.com/DerwenAI/kglab/blob/main/kglab/kglab.py#L1026)
[*\[source\]*](https://github.com/DerwenAI/kglab/blob/main/kglab/kglab.py#L1031)

```python
query(sparql, bindings=None)
@@ -501,7 +501,7 @@ initial variable bindings

---
#### [`query_as_df` method](#kglab.KnowledgeGraph.query_as_df)
[*\[source\]*](https://github.com/DerwenAI/kglab/blob/main/kglab/kglab.py#L1054)
[*\[source\]*](https://github.com/DerwenAI/kglab/blob/main/kglab/kglab.py#L1059)

```python
query_as_df(sparql, bindings=None, simplify=True, pythonify=True)
@@ -527,7 +527,7 @@ the query result set represented as a [`pandas.DataFrame`](https://pandas.pydata

---
#### [`visualize_query` method](#kglab.KnowledgeGraph.visualize_query)
[*\[source\]*](https://github.com/DerwenAI/kglab/blob/main/kglab/kglab.py#L1098)
[*\[source\]*](https://github.com/DerwenAI/kglab/blob/main/kglab/kglab.py#L1103)

```python
visualize_query(sparql, notebook=False)
@@ -547,7 +547,7 @@ PyVis network object, to be rendered

---
#### [`validate` method](#kglab.KnowledgeGraph.validate)
[*\[source\]*](https://github.com/DerwenAI/kglab/blob/main/kglab/kglab.py#L1122)
[*\[source\]*](https://github.com/DerwenAI/kglab/blob/main/kglab/kglab.py#L1127)

```python
validate(shacl_graph=None, shacl_graph_format=None, ont_graph=None, ont_graph_format=None, advanced=False, inference=None, inplace=True, abort_on_error=None, **kwargs)
@@ -584,7 +584,7 @@ a tuple of `conforms` (RDF graph passes the validation rules) + `report_graph` (

---
#### [`infer_owlrl_closure` method](#kglab.KnowledgeGraph.infer_owlrl_closure)
[*\[source\]*](https://github.com/DerwenAI/kglab/blob/main/kglab/kglab.py#L1200)
[*\[source\]*](https://github.com/DerwenAI/kglab/blob/main/kglab/kglab.py#L1205)

```python
infer_owlrl_closure()
@@ -597,7 +597,7 @@ See <https://wiki.uib.no/info216/index.php/Python_Examples#RDFS_inference_with_R

---
#### [`infer_rdfs_closure` method](#kglab.KnowledgeGraph.infer_rdfs_closure)
[*\[source\]*](https://github.com/DerwenAI/kglab/blob/main/kglab/kglab.py#L1213)
[*\[source\]*](https://github.com/DerwenAI/kglab/blob/main/kglab/kglab.py#L1218)

```python
infer_rdfs_closure()
@@ -610,7 +610,7 @@ See <https://wiki.uib.no/info216/index.php/Python_Examples#RDFS_inference_with_R

---
#### [`infer_rdfs_properties` method](#kglab.KnowledgeGraph.infer_rdfs_properties)
[*\[source\]*](https://github.com/DerwenAI/kglab/blob/main/kglab/kglab.py#L1226)
[*\[source\]*](https://github.com/DerwenAI/kglab/blob/main/kglab/kglab.py#L1231)

```python
infer_rdfs_properties()
@@ -623,7 +623,7 @@ Adapted from [`skosify`](https://github.com/NatLibFi/Skosify) which wasn't being

---
#### [`infer_rdfs_classes` method](#kglab.KnowledgeGraph.infer_rdfs_classes)
[*\[source\]*](https://github.com/DerwenAI/kglab/blob/main/kglab/kglab.py#L1254)
[*\[source\]*](https://github.com/DerwenAI/kglab/blob/main/kglab/kglab.py#L1259)

```python
infer_rdfs_classes()
@@ -636,7 +636,7 @@ Adapted from [`skosify`](https://github.com/NatLibFi/Skosify) which wasn't being

---
#### [`infer_skos_related` method](#kglab.KnowledgeGraph.infer_skos_related)
[*\[source\]*](https://github.com/DerwenAI/kglab/blob/main/kglab/kglab.py#L1287)
[*\[source\]*](https://github.com/DerwenAI/kglab/blob/main/kglab/kglab.py#L1292)

```python
infer_skos_related()
@@ -650,7 +650,7 @@ Adapted from [`skosify`](https://github.com/NatLibFi/Skosify) which wasn't being

---
#### [`infer_skos_concept` method](#kglab.KnowledgeGraph.infer_skos_concept)
[*\[source\]*](https://github.com/DerwenAI/kglab/blob/main/kglab/kglab.py#L1302)
[*\[source\]*](https://github.com/DerwenAI/kglab/blob/main/kglab/kglab.py#L1307)

```python
infer_skos_concept()
@@ -667,7 +667,7 @@ Adapted from [`skosify`](https://github.com/NatLibFi/Skosify) which wasn't being

---
#### [`infer_skos_hierarchical` method](#kglab.KnowledgeGraph.infer_skos_hierarchical)
[*\[source\]*](https://github.com/DerwenAI/kglab/blob/main/kglab/kglab.py#L1326)
[*\[source\]*](https://github.com/DerwenAI/kglab/blob/main/kglab/kglab.py#L1331)

```python
infer_skos_hierarchical(narrower=True)
@@ -684,7 +684,7 @@ if false, `skos:narrower` will be removed instead of added

---
#### [`infer_skos_transitive` method](#kglab.KnowledgeGraph.infer_skos_transitive)
[*\[source\]*](https://github.com/DerwenAI/kglab/blob/main/kglab/kglab.py#L1353)
[*\[source\]*](https://github.com/DerwenAI/kglab/blob/main/kglab/kglab.py#L1358)

```python
infer_skos_transitive(narrower=True)
@@ -705,7 +705,7 @@ also infer transitive closure for `skos:narrowerTransitive`

---
#### [`infer_skos_symmetric_mappings` method](#kglab.KnowledgeGraph.infer_skos_symmetric_mappings)
[*\[source\]*](https://github.com/DerwenAI/kglab/blob/main/kglab/kglab.py#L1382)
[*\[source\]*](https://github.com/DerwenAI/kglab/blob/main/kglab/kglab.py#L1387)

```python
infer_skos_symmetric_mappings(related=True)
@@ -722,7 +722,7 @@ infer the `skos:related` super-property for all `skos:relatedMatch` relations

---
#### [`infer_skos_hierarchical_mappings` method](#kglab.KnowledgeGraph.infer_skos_hierarchical_mappings)
[*\[source\]*](https://github.com/DerwenAI/kglab/blob/main/kglab/kglab.py#L1413)
[*\[source\]*](https://github.com/DerwenAI/kglab/blob/main/kglab/kglab.py#L1418)

```python
infer_skos_hierarchical_mappings(narrower=True)
@@ -1512,7 +1512,7 @@ PyVis graph to be rendered
## [module functions](#kglab)
---
#### [`calc_quantile_bins` function](#kglab.calc_quantile_bins)
[*\[source\]*](https://github.com/DerwenAI/kglab/blob/main/kglab/util.py#L36)
[*\[source\]*](https://github.com/DerwenAI/kglab/blob/main/kglab/util.py#L47)

```python
calc_quantile_bins(num_rows)
@@ -1535,9 +1535,12 @@ the calculated bins, as a [`numpy.ndarray`](https://numpy.org/doc/stable/referen
get_gpu_count()
```
Special handling for detecting GPU availability: an approach
recommended by the NVidia RAPIDS engineering team, since `nvml`
recommended by the NVIDIA RAPIDS engineering team, since `nvml`
bindings are difficult for Python libraries to keep updated.

This has the side-effect of importing the `cuDF` library, when
GPUs are available.

* *returns* : `int`
count of available GPUs
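
The import side effect described above can be illustrated with a minimal sketch. This is not the **kglab** implementation: the function name `count_gpus_sketch` is hypothetical, and counting devices through CuPy (which cuDF depends on) is an assumption made only for illustration.

```python
import importlib


def count_gpus_sketch() -> int:
    """Return the number of visible CUDA devices, or 0 if none can be detected."""
    try:
        # Importing cuDF is the side effect mentioned above; this raises
        # ImportError on machines where RAPIDS is not installed.
        importlib.import_module("cudf")
        # Assumption: ask the CuPy runtime bindings for the device count.
        cupy_runtime = importlib.import_module("cupy.cuda.runtime")
        return int(cupy_runtime.getDeviceCount())
    except Exception:
        # Covers a missing library as well as CUDA runtime errors.
        return 0
```

On a machine without RAPIDS the sketch falls through to the `except` branch and reports zero GPUs, so callers can branch on the count without handling import errors themselves.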

@@ -1579,7 +1582,7 @@ an [`rdflib.Graph`](https://rdflib.readthedocs.io/en/stable/apidocs/rdflib.html?

---
#### [`root_mean_square` function](#kglab.root_mean_square)
[*\[source\]*](https://github.com/DerwenAI/kglab/blob/main/kglab/util.py#L89)
[*\[source\]*](https://github.com/DerwenAI/kglab/blob/main/kglab/util.py#L100)

```python
root_mean_square(values)
@@ -1596,7 +1599,7 @@ RMS metric as a float

---
#### [`stripe_column` function](#kglab.stripe_column)
[*\[source\]*](https://github.com/DerwenAI/kglab/blob/main/kglab/util.py#L52)
[*\[source\]*](https://github.com/DerwenAI/kglab/blob/main/kglab/util.py#L63)

```python
stripe_column(values, bins, use_gpus=False)
19 changes: 9 additions & 10 deletions docs/tutorial.md
@@ -18,20 +18,19 @@ RAPIDS, Apache Parquet, fsspec, etc.) on cloud computing.

This tutorial introduces **kglab** – an open source project that
integrates RDFlib, OWL-RL, pySHACL, NetworkX, iGraph, pslpython,
node2vec, PyVis, and more – to show how to use a wide range of
graph-based approaches, blending smoothly into data science workflows,
and working efficiently with popular data engineering practices.
The material emphasizes hands-on coding examples which you can reuse;
best practices for integrating and leveraging other useful libraries;
PyVis, and more – to show how to use a wide range of graph-based
approaches, blending smoothly into data science workflows, and working
efficiently with popular data engineering practices. The material
emphasizes hands-on coding examples which you can reuse; best
practices for integrating and leveraging other useful libraries;
history and bibliography (e.g., links to primary sources); accessible,
detailed API documentation; a detailed glossary of terminology; plus
links to many helpful resources, such as online "playgrounds" –
meanwhile, keeping a practical focus on use cases.

The coding exercises in the following tutorial are based on
progressive examples based on cooking recipes, which illustrate the
use of **kglab** and related libraries in Python for *graph-based data
science*.
The coding exercises in the following tutorial are based on progressive
examples based on cooking recipes, which illustrate the use of **kglab**
and related libraries in Python for *graph data science*.


## Prerequisites
@@ -59,7 +58,7 @@ come in handy.
* Coding examples that can be used as starting points for your own KG projects
* How to blend different graph-based approaches within a data science workflow to complement each other’s strengths: for data quality checks, inference, human-in-the-loop, etc.
* Integrating with popular data science tools, such as pandas, scikit-learn, matplotlib, etc.
* Graph-based practices that fit well with Big Data tools such as Spark, Parquet, Ray, RAPIDS, and so on
* Graph data science practices that fit well with Big Data tools such as Spark, Parquet, Ray, RAPIDS, and so on


## Outline