Formatting and text/code consistency changes
spmallette committed Jan 25, 2024
1 parent b2c2aa0 commit d60b282
Showing 9 changed files with 572 additions and 471 deletions.
320 changes: 198 additions & 122 deletions book/Section-Beyond-Basic-Queries.adoc

Large diffs are not rendered by default.

43 changes: 22 additions & 21 deletions book/Section-Common-Serialization-Formats.adoc
@@ -24,11 +24,11 @@ contain all of the vertex data and the other will contain all of the edge data.
[[csvair]]
Using two CSV files to represent the air-routes data
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
If we were to store the airport data from the 'air-routes' graph in CSV format we might
do something like the example below. Note that to improve readability we have not
included every property (or indeed every airport) in this example. Notice how each
vertex has a unique ID assigned. This is important as when we define the edges we will
need the vertex IDs to build the connections.
If we were to store the airport data from the 'air-routes' graph in CSV format we
might do something like the example below. Note that to improve readability we have
not included every property (or indeed every airport) in this example. Notice how
each vertex has a unique ID assigned. This is important as when we define the edges
we will need the vertex IDs to build the connections.


----
@@ -75,20 +75,21 @@ graph. This is very similar to another common practice, namely, using a script
generate 'INSERT' statements when working with SQL databases.

We have also written Java and Groovy programs that will read the CSV file and use the
TinkerPop API or the Gremlin Server REST API to insert vertices and edges into a graph.
If you work with graph systems for a while you will probably find yourself also doing
similar things.
TinkerPop API or the Gremlin Server REST API to insert vertices and edges into a
graph. If you work with graph systems for a while you will probably find yourself
also doing similar things.
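
As a rough illustration of the script-based approach described above, the following Python sketch reads a vertex CSV and an edge CSV and emits Gremlin statements as text. The column names (`id`, `label`, `code`, `city`, `from`, `to`, `dist`), the sample rows and the helper names are all assumptions made for this example; they are not taken from the book's own files or programs.

```python
import csv
import io

# Hypothetical sample data; the column layout is an assumption for this sketch.
VERTICES_CSV = """\
id,label,code,city
1,airport,ATL,Atlanta
3,airport,AUS,Austin
"""

EDGES_CSV = """\
from,to,label,dist
1,3,route,811
3,1,route,811
"""


def quote(value):
    """Wrap a value in single quotes, escaping backslashes and quotes."""
    return "'" + value.replace("\\", "\\\\").replace("'", "\\'") + "'"


def vertex_statements(text):
    """Build one addV() statement per row of the vertex CSV."""
    statements = []
    for row in csv.DictReader(io.StringIO(text)):
        stmt = f"g.addV({quote(row['label'])}).property(T.id, {int(row['id'])})"
        for key, value in row.items():
            if key not in ("id", "label"):
                stmt += f".property({quote(key)}, {quote(value)})"
        statements.append(stmt)
    return statements


def edge_statements(text):
    """Build one addE() statement per row of the edge CSV."""
    statements = []
    for row in csv.DictReader(io.StringIO(text)):
        stmt = (
            f"g.V({int(row['from'])}).addE({quote(row['label'])})"
            f".to(__.V({int(row['to'])}))"
            f".property('dist', {int(row['dist'])})"
        )
        statements.append(stmt)
    return statements


if __name__ == "__main__":
    for line in vertex_statements(VERTICES_CSV) + edge_statements(EDGES_CSV):
        print(line)
```

The vertices are emitted before the edges because each edge statement looks up its endpoints by ID. Supplying IDs in this way only works on graph systems that honor user-supplied IDs.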

Adjacency matrix format
^^^^^^^^^^^^^^^^^^^^^^^

The examples shown above of how a CSV file can be used to store data about vertices and
edges presents a convenient way to do it. However, this is by no means the only way
you could do it. For graphs that do not contain properties you could lay the graph
out using an 'adjacency matrix' as shown below. The letters represent the vertex labels
and a 1 indicates there is an edge between them and a zero indicates no edge. This
format can be useful if your vertices and edges do not have properties and if the graph is
small but in general is not a great way to try and represent large graphs.
The examples shown above of how a CSV file can be used to store data about vertices
and edges presents a convenient way to do it. However, this is by no means the only
way you could do it. For graphs that do not contain properties you could lay the
graph out using an 'adjacency matrix' as shown below. The letters represent the
vertex labels and a 1 indicates there is an edge between them and a zero indicates no
edge. This format can be useful if your vertices and edges do not have properties and
if the graph is small but in general is not a great way to try and represent large
graphs.

----
A,B,C,D,E,F,G
@@ -119,9 +120,9 @@ G,A,B,C,E,F
----

While this is a simple example, it is possible to represent a more complex graph such
as the 'air-routes' graph in this way. We could build a more complex CSV file where the
vertex and its properties are listed first, followed by all of the other vertices it
connects to and the properties for those edges.
as the 'air-routes' graph in this way. We could build a more complex CSV file where
the vertex and its properties are listed first, followed by all of the other vertices
it connects to and the properties for those edges.
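
To make the relationship between these layouts concrete, here is a small Python sketch that converts an adjacency matrix into a list of (source, target) pairs and then into an adjacency list. The three-vertex matrix, the exact row layout (a header row of labels followed by one labelled row per vertex) and the function names are assumptions for illustration only.

```python
import csv
import io

# Hypothetical matrix: a 1 means there is an edge from the row vertex
# to the column vertex, a 0 means there is no edge.
MATRIX_CSV = """\
,A,B,C
A,0,1,1
B,1,0,0
C,0,1,0
"""


def matrix_to_edges(text):
    """Return (source, target) pairs for every cell that contains a 1."""
    rows = list(csv.reader(io.StringIO(text)))
    targets = rows[0][1:]  # column labels, skipping the empty corner cell
    edges = []
    for row in rows[1:]:
        source = row[0]
        for target, cell in zip(targets, row[1:]):
            if cell.strip() == "1":
                edges.append((source, target))
    return edges


def edges_to_adjacency_list(edges):
    """Group the targets of each source vertex, preserving input order."""
    adjacency = {}
    for source, target in edges:
        adjacency.setdefault(source, []).append(target)
    return adjacency


edges = matrix_to_edges(MATRIX_CSV)
adjacency = edges_to_adjacency_list(edges)

if __name__ == "__main__":
    for source, targets in adjacency.items():
        print(",".join([source] + targets))
```

The matrix needs one cell for every pair of vertices, whereas the adjacency list only stores the edges that actually exist, which is one reason the matrix layout scales poorly to large graphs.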

Some graph database systems actually store their graphs to disk using a variation of
this format. JanusGraph in fact uses a system a lot like this when storing vertex and
@@ -147,9 +148,9 @@ C,B
----

There are many ways you could construct an edge list. By way of another simple
example we could represent routes in the 'air-routes' graph in a format similar to that
shown below. In this case we also include the label of the edge between each of the
vertices. The vertices are represented by their ID value.
example we could represent routes in the 'air-routes' graph in a format similar to
that shown below. In this case we also include the label of the edge between each of
the vertices. The vertices are represented by their ID value.

----
[1,route,623]
26 changes: 14 additions & 12 deletions book/Section-Getting-Started.adoc
@@ -39,7 +39,8 @@ Documentation::
- http://tinkerpop.apache.org/docs/current/
- http://tinkerpop.apache.org/docs/current/reference/
Useful Recipes::
- A set of examples or "recipes" showing how to perform common graph oriented tasks using Gremlin queries.
- A set of examples or "recipes" showing how to perform common graph oriented tasks
using Gremlin queries.
- http://tinkerpop.apache.org/docs/current/recipes/

The programming interfaces allow providers of graph databases to build systems that
@@ -385,10 +386,10 @@ TinkerGraph is used to explore static (unchanging) graphs but you can also use i
from a programming language like Java and mutate its contents if you want to.
However, TinkerGraph does not support some of the more advanced features you will
find in implementations like JanusGraph such as transactions and external indexes. We
will cover these topics as part of the discussion of JanusGraph in the <<janusintro>>
section later on. One other thing worth noting in the list above is that
'UserSuppliedIds' is set to true for vertex and edge ID values. This means that if
you load a graph file, such as a GraphML format file, that specifies ID values for
will cover these topics as part of the discussion of JanusGraph in the
"<<janusintro>>" section later on. One other thing worth noting in the list above is
that 'UserSuppliedIds' is set to true for vertex and edge ID values. This means that
if you load a graph file, such as a GraphML format file, that specifies ID values for
vertices and edges then TinkerGraph will honor those IDs and use them. As we shall
see later this is not the case with some other graph database systems.

@@ -580,7 +581,8 @@ Additional observations:
----

Here are the Top 15 airports sorted by overall number of routes (in and out). In
graph terminology this is often called the degree of the vertex or just 'vertex degree'.
graph terminology this is often called the degree of the vertex or just
'vertex degree'.

----
POS ID CODE TOTAL DETAILS
@@ -690,12 +692,12 @@ Once you have the Gremlin Console up and running and have the graph loaded, if
you feel like it you can cut and paste queries from this book directly into
the console to see them run.

Once the 'air-routes' graph is loaded you can enter the following command and you will
get back information about the graph. In the case of a TinkerGraph you will get back
a useful message telling you how many vertices and edges the graph contains. Note that
the contents of this message will vary from one graph system to another and should
not be relied upon as a way to keep track of vertex and edge counts. We will look at
some other ways of counting things a bit later.
Once the 'air-routes' graph is loaded you can enter the following command and you
will get back information about the graph. In the case of a TinkerGraph you will get
back a useful message telling you how many vertices and edges the graph contains.
Note that the contents of this message will vary from one graph system to another and
should not be relied upon as a way to keep track of vertex and edge counts. We will
look at some other ways of counting things a bit later.

[source,groovy]
----
12 changes: 6 additions & 6 deletions book/Section-Introducing-Gremlin-Server.adoc
@@ -564,10 +564,10 @@ connection to the Gremlin Server. This is done by first of all creating a
to and the protocol we want to use. It is important that these settings match what
the Gremlin Server is configured to use. For this simple example we are just using
'localhost' as the host name but the name of any Gremlin Server that you have access
to can be used instead. The default Gremlin Server port of '8182' is specified and the
'GraphBinaryMessageSerializerV1' serialization format is selected. Again, this needs
to match both the protocol and the version of the protocol that your Gremlin Server
is supporting.
to can be used instead. The default Gremlin Server port of '8182' is specified and
the 'GraphBinaryMessageSerializerV1' serialization format is selected. Again, this
needs to match both the protocol and the version of the protocol that your Gremlin
Server is supporting.

[source,groovy]
----
@@ -822,8 +822,8 @@ globals << [g : graph.traversal()]
Starting the Server
^^^^^^^^^^^^^^^^^^^

As discussed in the "<<serverconfig>>" section, you can start the Gremlin Server in the
foreground or in the background. For our initial test let's just start the server
As discussed in the "<<serverconfig>>" section, you can start the Gremlin Server in
the foreground or in the background. For our initial test let's just start the server
running in the foreground.

[source,console]
38 changes: 19 additions & 19 deletions book/Section-Introduction.adoc
@@ -139,9 +139,9 @@ language via real examples featuring real-world graph data. That data along with
sample code and example applications is available for download from the GitHub
project as well as many other items. The graph, 'air-routes', is a model of
the world airline route network between 3,373 airports including 43,400 routes. The
examples presented will work unmodified with the `air-routes.graphml` file loaded into
the Gremlin console running with a TinkerGraph. How to set that environment up is
covered in the <<gremlininstall>> section below.
examples presented will work unmodified with the `air-routes.graphml` file loaded
into the Gremlin console running with a TinkerGraph. How to set that environment up
is covered in the "<<gremlininstall>>" section below.

NOTE: The examples in this book have been tested using Apache TinkerPop release
{tpvercheck}.
@@ -312,8 +312,8 @@ TinkerPop 3.4
A major update to Apache TinkerPop, version 3.4.0, was released in January 2019 and a
number of point releases followed.

NOTE: Full details of all the new features added in the TinkerPop 3.4.x releases can be
found at the following link:
NOTE: Full details of all the new features added in the TinkerPop 3.4.x releases can
be found at the following link:
https://github.com/apache/tinkerpop/blob/master/CHANGELOG.asciidoc#tinkerpop-340-avant-gremlin-construction-3-for-theremin-and-flowers

[[tp35intro]]
@@ -338,8 +338,8 @@ the Gremlin language such that dates can be added without needing programming
language specific constructs. This is useful when sending Gremlin queries as text
strings.

NOTE: Full details of all the new features added in the TinkerPop 3.5.x releases can be
found at the following link:
NOTE: Full details of all the new features added in the TinkerPop 3.5.x releases can
be found at the following link:
https://github.com/apache/tinkerpop/blob/master/CHANGELOG.asciidoc#tinkerpop-350-the-sleeping-gremlin-no-18-entracte-symphonique

[[tp36intro]]
@@ -374,8 +374,8 @@ completed as of TinkerPop 3.6.0.
- A new 'fail' step that can be used to abort a query in a controlled way.
NOTE: Full details of all the new features added in the TinkerPop 3.6.x releases can be
found at the following link:
NOTE: Full details of all the new features added in the TinkerPop 3.6.x releases can
be found at the following link:
https://github.com/apache/tinkerpop/blob/master/CHANGELOG.asciidoc#tinkerpop-360-tinkerheart

[[tp37intro]]
@@ -384,8 +384,8 @@ Introducing TinkerPop 3.7

IMPORTANT: TODO - Add details for TinkerPop 3.7

NOTE: Full details of all the new features added in the TinkerPop 3.7.x releases can be
found at the following link:
NOTE: Full details of all the new features added in the TinkerPop 3.7.x releases can
be found at the following link:
https://github.com/apache/tinkerpop/blob/master/CHANGELOG.asciidoc#tinkerpop-370-gremfir-master-of-the-pan-flute

[[whygraph]]
@@ -519,12 +519,12 @@ The words 'node' and 'vertex' are synonymous when discussing a graph. Throughout
book you may find both words used. However, as the Apache TinkerPop documentation
almost exclusively uses the word 'vertex', as much as possible when discussing
Gremlin queries and other concepts, I endeavor to stick to the word 'vertex' or the
plural form 'vertices'. As this book has evolved, I realized my use of these terms had
become inconsistent and as I continue to make updates, I plan, with a few exceptions,
such as when discussing binary trees, to standardize on 'vertex' rather than 'node'.
In that way, this book will be consistent with the official TinkerPop documentation.
Similarly, when discussing the connections between vertices I use the term 'edge' or
the plural form, 'edges'. In other books and articles you may also see terms like
'relationship' or 'arc' used. Again these terms are synonymous in the context of
graphs.
plural form 'vertices'. As this book has evolved, I realized my use of these terms
had become inconsistent and as I continue to make updates, I plan, with a few
exceptions, such as when discussing binary trees, to standardize on 'vertex' rather
than 'node'. In that way, this book will be consistent with the official TinkerPop
documentation. Similarly, when discussing the connections between vertices I use the
term 'edge' or the plural form, 'edges'. In other books and articles you may also see
terms like 'relationship' or 'arc' used. Again these terms are synonymous in the
context of graphs.
// vim: set tw=85 cc=+1 wrap spell redrawtime=20000