Add seed restore until section and remove 5.x seed provider info #2002

Open
wants to merge 9 commits into
base: dev
from
69 changes: 26 additions & 43 deletions modules/ROOT/pages/clustering/databases.adoc
@@ -468,11 +468,11 @@ SHOW DATABASE foo;
=== Seed from URI

This method seeds all servers with an identical seed from an external source, specified by the URI.
The seed can either be a full backup, a differential backup (xref:clustering/databases.adoc#cloud-seed-provider[`CloudSeedProvider`], introduced in Neo4j 5.26), or a dump from an existing database.
The seed can either be a full backup, a differential backup, or a dump from an existing database.
The sources of seeds are called _seed providers_.

The mechanism is pluggable, allowing new sources of seeds to be supported (see link:https://www.neo4j.com/docs/java-reference/current/extending-neo4j/project-setup/#extending-neo4j-plugin-seed-provider[Java Reference -> Implement custom seed providers] for more information).
The product has built-in support for seed from a mounted file system (file), FTP server, HTTP/HTTPS server, Amazon S3, Google Cloud Storage (from Neo4j 5.25), and Azure Cloud Storage (from Neo4j 5.25).
The product has built-in support for seed from a mounted file system (file), FTP server, HTTP/HTTPS server, Amazon S3, Google Cloud Storage, and Azure Cloud Storage.

[NOTE]
====
@@ -504,7 +504,7 @@ To determine the cause of the problem, it is recommended to look at the `debug.l
[[file-seed-provider]]
==== FileSeedProvider

label:new[Introduced in 5.26], the `FileSeedProvider` supports:
The `FileSeedProvider` supports:

** `file:`

@@ -513,26 +513,25 @@ label:new[Introduced in 5.26], the `FileSeedProvider` supports:

The `URLConnectionSeedProvider` supports the following:

** `file:` label:deprecated[Deprecated in 5.26]
** `ftp:`
** `http:`
** `https:`

[[cloud-seed-provider]]
==== CloudSeedProvider

label:new[Introduced in 5.25], the `CloudSeedProvider` supports:
The `CloudSeedProvider` supports:

** `s3:`
** `gs:`
** `azb:`

Starting from Neo4j 5.26, the `CloudSeedProvider` supports using xref:backup-restore/modes.adoc#differential-backup[differential backup] files as seeds.
With the provided differential backup file, the `CloudSeedProvider` searches the directory containing differential backup files for a xref:backup-restore/online-backup.adoc#backup-chain[backup chain] ending at the specified differential backup, and then seeds using this backup chain.
The `CloudSeedProvider` supports using xref:backup-restore/modes.adoc#differential-backup[differential backups] as seeds.
With the provided differential backup, the `CloudSeedProvider` searches the directory containing differential backups for a xref:backup-restore/online-backup.adoc#backup-chain[backup chain] ending at the specified differential backup, and then seeds using this backup chain.

[.tabbed-example]
=====
[role=include-with-AWS-S3 label--new-5.25]
[role=include-with-AWS-S3]
======

include::partial$/aws-s3-overrides.adoc[]
@@ -547,7 +546,7 @@ CREATE DATABASE foo OPTIONS { existingData: 'use', seedURI: 's3:/myBucket/myBack
----

======
[role=include-with-Google-cloud-storage label--new-5.25]
[role=include-with-Google-cloud-storage]
======

include::partial$/gcs-credentials.adoc[]
@@ -559,7 +558,7 @@ include::partial$/gcs-credentials.adoc[]
CREATE DATABASE foo OPTIONS { existingData: 'use', seedURI: 'gs:/myBucket/myBackup.backup' }
----
======
[role=include-with-Azure-cloud-storage label--new-5.25]
[role=include-with-Azure-cloud-storage]
======

include::partial$/azb-credentials.adoc[]
@@ -573,43 +572,29 @@ CREATE DATABASE foo OPTIONS { existingData: 'use', seedURI: 'azb://myStorageAcco
======
=====

[[s3-seed-provider]]
Contributor

the same question. Is this option removed? In which Neo4j and Cypher versions?

Contributor Author

From Neo4j 2025.01, `s3:` is supported only with the CloudSeedProvider. This is true for both Cypher 5 and Cypher 25 running on 2025.01.

However, in theory it would still work if a user ran Cypher 5 on 2025.01 and used the S3SeedProvider, but that is not the approach we want to encourage.

Contributor

I think there is a balancing act to be done here. The S3SeedProvider still exists in order to support the Cypher 5 usage of it. So it needs to remain in the docs, but be clearly 'Cypher 5 only'.

Contributor

Suggested change to the setting description to capture this: https://github.com/neo-technology/neo4j/pull/28957

==== S3SeedProvider
The `CloudSeedProvider` supports seeding up to a specific date or transaction ID using the `seedRestoreUntil` option.

The `S3SeedProvider` supports:
==== Seed up to a specific date

** `s3:` label:deprecated[Deprecated in 5.26]
To seed up to a specific date, you need to pass the differential backup that contains the data up to that date.


[NOTE]
====
Neo4j 5 comes bundled with necessary libraries for AWS S3 connectivity.
Therefore, if you use `S3SeedProvider`, `aws cli` is not required but can be used with the `CloudSeedProvider`.
====

The `S3SeedProvider` requires additional configuration.
This is specified with the `seedConfig` option.
This option expects a comma-separated list of configurations.
Each configuration value is specified as a name followed by `=` and the value, as such:

[source, cypher, role="noplay"]
[source,shell]
----
CREATE DATABASE foo OPTIONS { existingData: 'use', seedURI: 's3:/myBucket/myBackup.backup', seedConfig: 'region=eu-west-1' }
CREATE DATABASE foo OPTIONS { existingData: 'use', seedURI: 's3:/myBucket/myBackup.backup', seedRestoreUntil: datetime("2019-06-01T18:40:32.142+0100") }
----

`S3SeedProvider` also requires passing in credentials.
These are specified with the `seedCredentials` option.
Seed credentials are securely passed from the Cypher command to each server hosting the database.
For this to work, Neo4j on each server in the cluster must be configured with identical keystores.
This is identical to the configuration required by remote aliases, see xref:database-administration/aliases/remote-database-alias-configuration.adoc#remote-alias-config-DBMS_admin-A[Configuration of DBMS with remote database alias].
If this configuration is not performed, the `seedCredentials` option fails.
This will seed the database with transactions committed before the provided timestamp.

[source, cypher, role="noplay"]
==== Seed up to a specific transaction ID

To seed up to a specific transaction ID, you need to pass the differential backup that contains the data up to that transaction ID.

[source,shell]
----
CREATE DATABASE foo OPTIONS { existingData: 'use', seedURI: 's3:/myBucket/myBackup.backup', seedConfig: 'region=eu-west-1', seedCredentials: [accessKey];[secretKey] }
CREATE DATABASE foo OPTIONS { existingData: 'use', seedURI: 's3:/myBucket/myBackup.backup', seedRestoreUntil: 123 }
----
Where `accessKey` and `secretKey` are provided by AWS.

This will seed the database with transactions up to, but not including, transaction 123.
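
Since `seedRestoreUntil` is described above as a `CloudSeedProvider` option, the same pattern would presumably carry over to the other cloud schemes that provider handles. The following is a minimal sketch under that assumption, not a verified example: the database names `bar` and `baz` are hypothetical, and the URIs are taken from the seed provider reference table in this file.

```cypher
// Assumption: seedRestoreUntil behaves the same for gs: and azb: URIs as for s3:.
// Seed from Google Cloud Storage with transactions up to, but not including, transaction 123.
CREATE DATABASE bar OPTIONS { existingData: 'use', seedURI: 'gs://mybucket/backups/backup1.backup', seedRestoreUntil: 123 };

// Seed from Azure Cloud Storage with transactions committed before the given timestamp.
CREATE DATABASE baz OPTIONS { existingData: 'use', seedURI: 'azb://mystorageaccount.blob/backupscontainer/backup1.backup', seedRestoreUntil: datetime("2019-06-01T18:40:32.142+0100") };
```

In both cases the URI would need to point at the differential backup that contains the data up to the requested transaction ID or date.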

Contributor

@NataliaIvakina, Jan 8, 2025

Suggested change
[role=label--deprecated label--available-Cypher5]
[[s3-seed-provider]]
==== S3SeedProvider
The `S3SeedProvider` supports:
** `s3:`
[NOTE]
====
Neo4j 5 comes bundled with necessary libraries for AWS S3 connectivity.
Therefore, if you use `S3SeedProvider`, `aws cli` is not required but can be used with the `CloudSeedProvider`.
====
The `S3SeedProvider` requires additional configuration.
This is specified with the `seedConfig` option.
This option expects a comma-separated list of configurations.
Each configuration value is specified as a name followed by `=` and the value, as such:
[source, cypher, role="noplay"]
----
CREATE DATABASE foo OPTIONS { existingData: 'use', seedURI: 's3:/myBucket/myBackup.backup', seedConfig: 'region=eu-west-1' }
----
`S3SeedProvider` also requires passing in credentials.
These are specified with the `seedCredentials` option.
Seed credentials are securely passed from the Cypher command to each server hosting the database.
For this to work, Neo4j on each server in the cluster must be configured with identical keystores.
This is identical to the configuration required by remote aliases, see xref:database-administration/aliases/remote-database-alias-configuration.adoc#remote-alias-config-DBMS_admin-A[Configuration of DBMS with remote database alias].
If this configuration is not performed, the `seedCredentials` option fails.
[source, cypher, role="noplay"]
----
CREATE DATABASE foo OPTIONS { existingData: 'use', seedURI: 's3:/myBucket/myBackup.backup', seedConfig: 'region=eu-west-1', seedCredentials: [accessKey];[secretKey] }
----
Where `accessKey` and `secretKey` are provided by AWS.

==== Seed provider reference

@@ -620,8 +605,7 @@ Where `accessKey` and `secretKey` are provided by AWS.
| URI example

| `file:`
| `URLConnectionSeedProvider` label:deprecated[Deprecated in 5.26], +
Contributor

removed?

Contributor Author

The URLConnectionSeedProvider remains in 2025.01 but now supports only `http:`, `ftp:`, and `https:`.

`FileSeedProvider` label:new[Introduced in 5.26]
| `FileSeedProvider`
| `file:/tmp/backup1.backup`

| `ftp:`
Expand All @@ -637,16 +621,15 @@ Where `accessKey` and `secretKey` are provided by AWS.
| `\https://myhttp.com/backups/backup1.backup`

| `s3:`
| `S3SeedProvider` label:deprecated[Deprecated in 5.26], +
`CloudSeedProvider` label:new[Introduced in 5.25]
| `CloudSeedProvider`
| `s3://mybucket/backups/backup1.backup`

| `gs:`
| `CloudSeedProvider` label:new[Introduced in 5.25]
| `CloudSeedProvider`
| `gs://mybucket/backups/backup1.backup`

| `azb:`
| `CloudSeedProvider` label:new[Introduced in 5.25]
| `CloudSeedProvider`
| `azb://mystorageaccount.blob/backupscontainer/backup1.backup`
|===
