Deploy an analytics cluster using the Neo4j Helm chart #1244

Merged
merged 7 commits into from
Jan 2, 2024
1 change: 1 addition & 0 deletions modules/ROOT/content-nav.adoc
@@ -48,6 +48,7 @@
*** xref:kubernetes/quickstart-cluster/access-inside-k8s.adoc[]
*** xref:kubernetes/quickstart-cluster/access-outside-k8s.adoc[]
*** xref:kubernetes/quickstart-cluster/uninstall-cleanup.adoc[]
** xref:kubernetes/quickstart-analytics-cluster.adoc[]
** xref:kubernetes/persistent-volumes.adoc[]
** xref:kubernetes/configuration.adoc[]
** xref:kubernetes/security.adoc[]
53 changes: 53 additions & 0 deletions modules/ROOT/pages/kubernetes/configuration.adoc
@@ -509,6 +509,7 @@ services:
# This service is available even if the deployment is not "ready"
internals:
enabled: false

# Annotations for the internals service
annotations: { }
# n.b. there is no ports object for this service. Ports are autogenerated based on the neo4j configuration
@@ -697,6 +698,12 @@ podSpec:
# If set to false then no anti-affinity rules are applied
# If set to an object then that object is used for the Neo4j podAntiAffinity
podAntiAffinity: true
# requiredDuringSchedulingIgnoredDuringExecution:
# - labelSelector:
# matchLabels:
# app: "demo"
# helm.neo4j.com/pod_category: "neo4j-instance"
# topologyKey: kubernetes.io/hostname

#Add tolerations to the Neo4j pod
tolerations: []
@@ -837,6 +844,52 @@ logging:
# </Root>
# </Loggers>
# </Configuration>

# Define your podDisruptionBudget details here
podDisruptionBudget:
  enabled: false
  matchLabels: {}
  #    "demo": "neo4j"
  matchExpressions: []
  #    - key: "demo"
  #      operator: "Equals"
  #      value: "neo4j"
  labels: {}
  #    "name": "neo4j"
  minAvailable: ""
  maxUnavailable: ""

# Service Monitor for Prometheus
# Ensure that the Prometheus operator or the ServiceMonitor CRD is present in your cluster before using the service monitor config
serviceMonitor:
  enabled: false
  labels: {}
  #    "demo": "value"
  jobLabel: ""
  interval: ""
  port: ""
  path: ""
  namespaceSelector: {}
  #    any: false
  #    matchNames:
  #      - default
  targetLabels: []
  #    - "demo"
  #    - "value"
  selector: {}
  #    matchLabels:
  #      helm.neo4j.com/service: "admin"


# Use this section only when setting up an analytics cluster (1 primary + N secondary Neo4j instances).
# Disabled by default.
analytics:
  # Enables the internal ports and the configuration needed for the 1 primary + N secondary scenario
  enabled: false
  type:
    # The Neo4j instance type
    # Valid values are primary and secondary
    name: primary
----
+
. Pass the _neo4j-values.yaml_ file during installation.
1 change: 1 addition & 0 deletions modules/ROOT/pages/kubernetes/index.adoc
@@ -15,6 +15,7 @@ This chapter describes the following:
* xref:kubernetes/helm-charts-setup.adoc[Configure the Neo4j Helm chart repository] -- Configure the Neo4j Helm chart repository and check for the available charts.
* xref:kubernetes/quickstart-standalone/index.adoc[Quickstart: Deploy a standalone instance] -- Deploy a Neo4j standalone instance to a cloud (GKE, AWS, AKS) or a local (via Docker Desktop for macOS) Kubernetes cluster.
* xref:kubernetes/quickstart-cluster/index.adoc[Quickstart: Deploy a cluster] -- Deploy a Neo4j cluster to a cloud (GKE, AWS, AKS) Kubernetes cluster.
* xref:kubernetes/quickstart-analytics-cluster.adoc[Quickstart: Deploy a Neo4j cluster for analytic queries] -- Deploy an analytics Neo4j cluster to a local or a cloud (GKE, AWS, AKS) Kubernetes cluster.

it would be good to be specific in the title "Deploy a Neo4j Cluster for analytics (1 primary + N Secondary)" since N primary and N secondary was already possible

@renetapopova (Contributor, Author), Dec 12, 2023

I thought that we could expand this page similar to how this topic is presented in the Clustering chapter, and have two sections: an analytics cluster with fault tolerance and one without.

* xref:kubernetes/persistent-volumes.adoc[Volume mounts and persistent volumes] -- Use persistent volumes with the Neo4j Helm chart and what types Neo4j supports.
* xref:kubernetes/configuration.adoc[Customizing a Neo4j Helm chart] -- Configure a Neo4j deployment using a customized _values.yaml_ file.
* xref:kubernetes/security.adoc[Configuring SSL] -- Configure SSL for a Neo4j deployment running on Kubernetes.
1 change: 1 addition & 0 deletions modules/ROOT/pages/kubernetes/plugins.adoc
@@ -33,6 +33,7 @@ config:
dbms.security.procedures.unrestricted: "gds.*"
----

[[install-gds-ee-bloom]]
=== Install GDS Enterprise Edition (EE) and Bloom plugins

To install GDS EE and Bloom, you must provide a license for each plugin.
168 changes: 168 additions & 0 deletions modules/ROOT/pages/kubernetes/quickstart-analytics-cluster.adoc
@@ -0,0 +1,168 @@
:description: How to deploy a Neo4j cluster for analytic queries to a cloud or a local Kubernetes cluster using the Neo4j Helm chart.
[role=enterprise-edition]
[[quick-start-analytic-cluster]]
= Quickstart: Deploy a Neo4j cluster for analytic queries

_The feature is available in the Neo4j Helm chart from version 5.14._

This quickstart shows how to configure and deploy a special Neo4j cluster without fault tolerance to support analytic queries.
For information on using the Neo4j Graph Data Science (GDS) library in a cluster, see the link:https://neo4j.com/docs/graph-data-science/current/[Neo4j Graph Data Science library documentation].

The cluster is deployed to a cloud or a local Kubernetes cluster using the Neo4j Helm chart.
Because the GDS library does not support fault tolerance, you can deploy a cluster with one primary server for the transactional workload and N secondary servers for the GDS analytics.

== Prerequisites

Before you can deploy a Neo4j cluster on Kubernetes, you need to have:

* A Kubernetes cluster running and the `kubectl` command-line tool installed and configured to communicate with your cluster.
For more information, see xref:kubernetes/quickstart-cluster/prerequisites.adoc[Quickstart: Deploy a cluster -> Prerequisites].
* A valid license for Neo4j Enterprise Edition.
For more information, see xref:kubernetes/plugins.adoc#install-gds-ee-bloom[Install GDS Enterprise Edition (EE) and Bloom plugins].
* The xref:kubernetes/helm-charts-setup.adoc[latest version of the Neo4j Helm chart repository].
* (Optional) A valid license for GDS Enterprise Edition.
To install a licensed plugin, you must provide the license files in a Kubernetes secret.
For more information, see xref:kubernetes/plugins.adoc#install-gds-ee-bloom[Install GDS Enterprise Edition (EE) and Bloom plugins].
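
As a rough sketch of the optional license prerequisite, the secret could be created with `kubectl create secret generic`.
The secret name `gds-license` and the key `gds.license` match the secondary values file used later in this quickstart.
The function name and the license path are placeholders, not part of the chart.

```shell
# Hypothetical helper: wraps the kubectl call so the license path is passed explicitly.
# Assumes kubectl already targets the namespace where Neo4j will be installed.
create_gds_license_secret() {
  license_path="$1"   # local path to your GDS license file (placeholder)
  kubectl create secret generic gds-license \
    --from-file=gds.license="$license_path"
}

# Usage (commented out so that nothing runs by accident):
# create_gds_license_secret /path/to/gds.license
```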

== Create a value YAML file for each type of server

To set up a Neo4j cluster for analytic queries, you need to create a value YAML file for each type of server, primary and secondary.
For example:

[.tabbed-example]
=====
[.include-with-primary]
======

Create a value YAML file for the primary server, for example, _primary-value.yaml_:

[source, yaml]
----
neo4j:
  name: analytics-cluster
  acceptLicenseAgreement: "yes"
  edition: enterprise
  password: my-password
volumes:
  data:
    mode: defaultStorageClass

# Disable the Neo4j load balancer and enable the internal service so that the servers can access each other:
services:
  neo4j:
    enabled: false
  internals:
    enabled: true

# Enable the analytics cluster and set the type to primary:
analytics:
  enabled: true
  type:
    name: primary
----
======
[.include-with-secondary]
======
Create a value YAML file for the secondary servers, for example, _secondary-gds.yaml_.
The password must be the same as for the primary server.
If you are using GDS Enterprise Edition, you also need to create a secret with the license file and mount it as the _/licenses_ volume mount.
For more information on how to create a secret, see xref:kubernetes/plugins.adoc#install-gds-ee-bloom[Install GDS Enterprise Edition (EE) and Bloom plugins].

[source, yaml]
----
neo4j:
  name: analytics-cluster
  acceptLicenseAgreement: "yes"
  edition: enterprise
  password: my-password
volumes:
  data:
    mode: defaultStorageClass
  # Define the volume mount for the license file:
  licenses:
    disableSubPathExpr: true
    mode: volume
    volume:
      secret:
        secretName: gds-license
        items:
          - key: gds.license
            path: gds.license

# Set the environment variables to download the plugins:
env:
  NEO4J_PLUGINS: '["graph-data-science"]'

# Set the configuration for the plugins directory and the mount for the license file:
config:
  gds.enterprise.license_file: "/licenses/gds.license"
  server.directories.plugins: "plugins"

# Disable the Neo4j load balancer and enable the internal service so that the servers can access each other:
services:
  neo4j:
    enabled: false
  internals:
    enabled: true

# Enable the analytics cluster and set the type to secondary:
analytics:
  enabled: true
  type:
    name: secondary
----
======
=====

For all available options, see xref:kubernetes/configuration.adoc[Customizing a Neo4j Helm chart].
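
The primary and secondary files must agree on `neo4j.name` and on the password.
A minimal pre-install check, assuming simple files like the ones in this section, could compare those two keys.
The helper names `first_value` and `check_values_match` are hypothetical, and the matching is plain text, not real YAML parsing.

```shell
# Hypothetical helper: prints the first value found for a key such as "name" or "password".
# This is a plain text match, so it only suits simple files like the ones in this quickstart.
first_value() {
  key="$1"
  file="$2"
  sed -n "s/^[[:space:]]*${key}:[[:space:]]*//p" "$file" | head -n 1
}

# Prints "ok" if both files use the same name and password,
# otherwise prints the first mismatching key and returns 1.
check_values_match() {
  primary="$1"
  secondary="$2"
  for k in name password; do
    if [ "$(first_value "$k" "$primary")" != "$(first_value "$k" "$secondary")" ]; then
      echo "mismatch: $k"
      return 1
    fi
  done
  echo "ok"
}

# Usage (paths are placeholders):
# check_values_match /path/to/primary-value.yaml /path/to/secondary-gds.yaml
```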

== Install the servers

. Install a single Neo4j server using the _primary-value.yaml_ file, created in the previous section:
+
[source, bash]
----
helm install primary neo4j/neo4j -f /path/to/primary-value.yaml
----
. Install the first secondary server using the _secondary-gds.yaml_ file, created in the previous section:
+
[source, bash]
----
helm install gds1 neo4j/neo4j -f /path/to/secondary-gds.yaml
----
. Repeat step 2 to deploy a second secondary server.
Use a different release name, for example, _gds2_.
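
If you need more than a couple of secondaries, the repeated step can be scripted.
The following sketch assumes every secondary uses the same values file and follows the `gds1`, `gds2` naming from the steps above.
The function name `install_secondaries` is hypothetical.

```shell
# Hypothetical helper: installs COUNT secondary servers as releases gds1..gdsCOUNT,
# all from the same values file. Stops at the first failed install.
install_secondaries() {
  count="$1"
  values_file="$2"
  i=1
  while [ "$i" -le "$count" ]; do
    helm install "gds$i" neo4j/neo4j -f "$values_file" || return 1
    i=$((i + 1))
  done
}

# Usage (commented out so that nothing runs by accident):
# install_secondaries 2 /path/to/secondary-gds.yaml
```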


== Verify the installation

To verify that the cluster is deployed and running, you can install a load balancer and access Neo4j from the Neo4j Browser.

. Deploy a Neo4j load balancer to the same namespace as the Neo4j cluster:
+
[source, bash]
----
helm install lb neo4j/neo4j-load-balancer --set neo4j.name="analytics-cluster"
----
. When deployed, copy the external IP of the LoadBalancer to access Neo4j from an application outside the Kubernetes cluster.
For more information, see xref:kubernetes/accessing-neo4j.adoc#access-outside-k8s[Applications accessing Neo4j from outside Kubernetes].
. In a web browser, open the Neo4j Browser at _http://EXTERNAL_IP:7474/browser_ and log in using the password you have configured in your values YAML files.
. Run the following Cypher query to verify that the cluster is deployed and running:
+
[source, cypher]
----
SHOW SERVERS
----
. Run the following Cypher function to verify that the GDS library is installed:
+
[source, cypher]
----
RETURN gds.version()
----
. Call `gds.isLicensed()` to verify that the GDS library is licensed:
+
[source, cypher]
----
RETURN gds.isLicensed()
----
+
The returned value must be `true`.
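
The same checks can probably be run from a terminal instead of the Neo4j Browser.
This sketch rests on several assumptions: that pods are named `<release>-0` (for example, `primary-0` and `gds1-0`), that `cypher-shell` is on the PATH inside the Neo4j container, and that the user is `neo4j` with the password from the values files.
The function name `run_cypher` is hypothetical.

```shell
# Hypothetical helper: runs one Cypher statement inside a Neo4j pod.
# Assumes cypher-shell is available in the container and that the
# password is the one set in the values files (my-password in this quickstart).
run_cypher() {
  pod="$1"
  query="$2"
  kubectl exec "$pod" -- cypher-shell -u neo4j -p my-password "$query"
}

# Usage (commented out; pod names are assumed to be <release>-0):
# run_cypher primary-0 "SHOW SERVERS"
# run_cypher gds1-0 "RETURN gds.version()"
# run_cypher gds1-0 "RETURN gds.isLicensed()"
```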