Commit 249c3e2

Co-authored-by: Barak Amar <[email protected]>
Co-authored-by: itaiad200 <[email protected]>
Co-authored-by: itai-david <[email protected]>
Co-authored-by: eden-ohana <[email protected]>

Parent: 4f13738
Showing 8 changed files with 507 additions and 64 deletions.
@@ -28,15 +28,43 @@ This reference uses `.` to denote the nesting of values.
 * `logging.format` `(one of ["json", "text"] : "text")` - Format to output log message in
 * `logging.level` `(one of ["TRACE", "DEBUG", "INFO", "WARN", "ERROR", "NONE"] : "DEBUG")` - Logging level to output
-* `logging.audit_log_level` `(one of ["TRACE", "DEBUG", "INFO", "WARN", "ERROR", "NONE"] : "DEBUG")` - Audit logs level to output. **Please notice that in case you configure this field to be lower than the main logger level, you won't be able to get the audit logs**
+* `logging.audit_log_level` `(one of ["TRACE", "DEBUG", "INFO", "WARN", "ERROR", "NONE"] : "DEBUG")` - Audit logs level to output.
+
+  **Note:** If this level is set lower than the main logger level, audit logs will not be emitted
+  {: .note }
 * `logging.output` `(string : "-")` - A path or paths to write logs to. A `-` means the standard output, `=` means the standard error.
 * `logging.file_max_size_mb` `(int : 100)` - Output file maximum size in megabytes.
 * `logging.files_keep` `(int : 0)` - Number of log files to keep, default is all.
 * `actions.enabled` `(bool : true)` - Setting this to false will block hooks from being executed
-* `database.connection_string` `(string : "postgres://localhost:5432/postgres?sslmode=disable")` - PostgreSQL connection string to use
-* `database.max_open_connections` `(int : 25)` - Maximum number of open connections to the database
-* `database.max_idle_connections` `(int : 25)` - Sets the maximum number of connections in the idle connection pool
-* `database.connection_max_lifetime` `(duration : 5m)` - Sets the maximum amount of time a connection may be reused
+* ~~`database.connection_string` `(string : "postgres://localhost:5432/postgres?sslmode=disable")` - PostgreSQL connection string to use~~
+* ~~`database.max_open_connections` `(int : 25)` - Maximum number of open connections to the database~~
+* ~~`database.max_idle_connections` `(int : 25)` - Sets the maximum number of connections in the idle connection pool~~
+* ~~`database.connection_max_lifetime` `(duration : 5m)` - Sets the maximum amount of time a connection may be reused~~
+
+  **Note:** Deprecated - See `database` section
+  {: .note }
+* `database` - Configuration section for the lakeFS key-value store database
+  + `database.type` `(string : ["postgres"|"dynamodb"])` - lakeFS database type
+  + `database.postgres` - Configuration section when using `database.type="postgres"`
+  + `database.postgres.connection_string` `(string : "postgres://localhost:5432/postgres?sslmode=disable")` - PostgreSQL connection string to use
+  + `database.postgres.max_open_connections` `(int : 25)` - Maximum number of open connections to the database
+  + `database.postgres.max_idle_connections` `(int : 25)` - Maximum number of connections in the idle connection pool
+  + `database.postgres.connection_max_lifetime` `(duration : 5m)` - Sets the maximum amount of time a connection may be reused `(valid units: ns|us|ms|s|m|h)`
+  + `database.dynamodb` - Configuration section when using `database.type="dynamodb"`
+  + `database.dynamodb.table_name` `(string : "kvstore")` - Table used to store the data
+  + `database.dynamodb.read_capacity_units` `(int : 1000)` - Read capacity units, measured in requests per second
+  + `database.dynamodb.write_capacity_units` `(int : 1000)` - Write capacity units, measured in requests per second
+  + `database.dynamodb.scan_limit` `(int : )` - Maximum number of items per page during a scan operation
+
+    **Note:** Refer to the [AWS documentation](https://docs.aws.amazon.com/amazondynamodb/latest/developerguide/Query.html#Query.Limit) for further information
+    {: .note }
+  + `database.dynamodb.endpoint` `(string : )` - Endpoint URL for the database instance
+  + `database.dynamodb.aws_region` `(string : )` - AWS Region of the database instance
+  + `database.dynamodb.aws_access_key_id` `(string : )` - AWS access key ID
+  + `database.dynamodb.aws_secret_access_key` `(string : )` - AWS secret access key
+
+  **Note:** `endpoint`, `aws_region`, `aws_access_key_id` and `aws_secret_access_key` are not required, and are used mainly for experimental purposes when working with DynamoDB with different AWS credentials.
+  {: .note }
 * `listen_address` `(string : "0.0.0.0:8000")` - A `<host>:<port>` structured string representing the address to listen on
 * `auth.cache.enabled` `(bool : true)` - Whether to cache access credentials and user policies in-memory. Can greatly improve throughput when enabled.
 * `auth.cache.size` `(int : 1024)` - How many items to store in the auth cache. Systems with a very high user count should use a larger value at the expense of ~1kb of memory per cached user.
@@ -146,14 +174,16 @@ To set an environment variable, prepend `LAKEFS_` to its name, convert it to uppercase
 For example, `logging.format` becomes `LAKEFS_LOGGING_FORMAT`, `blockstore.s3.region` becomes `LAKEFS_BLOCKSTORE_S3_REGION`, etc.
 
-## Example: Local Development
+## Example: Local Development with PostgreSQL database
 
 ```yaml
 ---
 listen_address: "0.0.0.0:8000"
 
 database:
-  connection_string: "postgres://localhost:5432/postgres?sslmode=disable"
+  type: "postgres"
+  postgres:
+    connection_string: "postgres://localhost:5432/postgres?sslmode=disable"
 
 logging:
   format: text
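As an aside on the environment-variable mapping in the hunk above: the following is a minimal Go sketch, not part of this commit and not lakeFS code, of the documented key-to-variable conversion. The `envVarName` helper is a hypothetical name used for illustration only.

```go
package main

import (
	"fmt"
	"strings"
)

// envVarName converts a dotted configuration key to the LAKEFS_-prefixed
// environment variable described above: uppercase the key and replace
// every "." with "_".
func envVarName(key string) string {
	return "LAKEFS_" + strings.ToUpper(strings.ReplaceAll(key, ".", "_"))
}

func main() {
	fmt.Println(envVarName("logging.format"))       // LAKEFS_LOGGING_FORMAT
	fmt.Println(envVarName("blockstore.s3.region")) // LAKEFS_BLOCKSTORE_S3_REGION
}
```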
@@ -175,7 +205,7 @@ gateways:
 ```
-## Example: AWS Deployment
+## Example: AWS Deployment with DynamoDB database
 ```yaml
 ---
@@ -185,7 +215,9 @@ logging:
   output: "-"
 
 database:
-  connection_string: "postgres://user:[email protected]:5432/postgres"
+  type: "dynamodb"
+  dynamodb:
+    table_name: "kvstore"
 
 auth:
   encrypt:
@@ -213,7 +245,9 @@ logging:
   output: "-"
 
 database:
-  connection_string: "postgres://user:[email protected]:5432/postgres"
+  type: "postgres"
+  postgres:
+    connection_string: "postgres://user:[email protected]:5432/postgres"
 
 auth:
   encrypt:
@@ -236,7 +270,9 @@ logging:
   output: "-"
 
 database:
-  connection_string: "postgres://user:[email protected]:5432/postgres"
+  type: "postgres"
+  postgres:
+    connection_string: "postgres://user:[email protected]:5432/postgres"
 
 auth:
   encrypt:
@@ -263,7 +299,9 @@ logging:
   output: "-"
 
 database:
-  connection_string: "postgres://user:[email protected]:5432/postgres"
+  type: "postgres"
+  postgres:
+    connection_string: "postgres://user:[email protected]:5432/postgres"
 
 auth:
   encrypt:
@@ -0,0 +1,30 @@ (new file)
---
layout: default
title: Database Migration
description: A guide to migrating the lakeFS database.
parent: Reference
nav_order: 51
has_children: false
---

# lakeFS Database Migrate
{: .no_toc }

**Note:** Feature in development
{: .note }
The lakeFS database migration tool simplifies switching from one database implementation to another.
More information can be found [here](https://github.com/treeverse/lakeFS/issues/3899).
## lakeFS with Key Value Store
Starting at version 0.80.0, lakeFS abandoned its tight coupling to [PostgreSQL](https://en.wikipedia.org/wiki/PostgreSQL) and moved all database operations to work over a generic [Key-Value Store](https://en.wikipedia.org/wiki/Key%E2%80%93value_database).
While SQL databases, and Postgres among them, have their obvious advantages, we felt that the tight coupling to Postgres was limiting our users, and so lakeFS with a Key-Value Store was introduced.
Our KV Store implements a generic interface, with methods for `Get`, `Set`, `Compare-and-Set`, `Delete` and `Scan`. Each entry is represented by a [`partition`, `key`, `value`] triplet. All three fields are generic byte arrays, and the consuming module has full flexibility in choosing the format to use for each field.
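The following is a minimal Go sketch of such an interface, based only on the description above; the method and type names (`SetIf`, `EntriesIterator`) are assumptions for illustration, not the actual lakeFS API:

```go
package kv

import "context"

// Store is a sketch of the generic KV interface described above. Every
// entry is a [partition, key, value] triplet, and all three fields are
// plain byte slices, so each consuming module can pick its own encoding.
type Store interface {
	Get(ctx context.Context, partition, key []byte) ([]byte, error)
	Set(ctx context.Context, partition, key, value []byte) error
	// SetIf is the compare-and-set primitive: the write succeeds only if
	// the entry's current value still equals valuePredicate.
	SetIf(ctx context.Context, partition, key, value, valuePredicate []byte) error
	Delete(ctx context.Context, partition, key []byte) error
	// Scan iterates over a partition's entries, starting at key start.
	Scan(ctx context.Context, partition, start []byte) (EntriesIterator, error)
}

// EntriesIterator pages through the results of a Scan.
type EntriesIterator interface {
	Next() bool
	Entry() (key, value []byte)
	Err() error
	Close()
}
```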
Under the hood, our KV implementation relies on a backing database, which persists the data. In theory it could be any type of database; out of the box, we already provide drivers for [DynamoDB](https://en.wikipedia.org/wiki/Amazon_DynamoDB), for AWS users, and [PostgreSQL](https://en.wikipedia.org/wiki/PostgreSQL), using its relational nature to store the key-value data. More databases will be supported in the future, and lakeFS users and contributors can develop drivers for their own favorite databases. For experimentation, an in-memory KV store can be used, though it obviously lacks persistence.
In order to store ref store objects (that is, `Repositories`, `Branches`, `Commits`, `Tags`, and `Uncommitted Objects`), lakeFS implements another layer over the generic KV Store, which supports serialization and deserialization of these objects as [protobuf](https://en.wikipedia.org/wiki/Protocol_Buffers). As this layer relies only on the generic interface of the KV Store layer, it is entirely agnostic to whichever store implementation is in use, giving our users maximal flexibility.
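Below is a hypothetical sketch of that typed layer, building on the `Store` interface from the previous sketch. `SetMsg` and `GetMsg` are illustrative names rather than the actual lakeFS helpers:

```go
package kv

import (
	"context"

	"google.golang.org/protobuf/proto"
)

// SetMsg serializes a ref-store object (a Commit, Branch, Tag, etc.) to
// protobuf bytes and writes it through the generic Store interface; the
// typed layer never needs to know which backing driver is in use.
func SetMsg(ctx context.Context, s Store, partition, key []byte, msg proto.Message) error {
	value, err := proto.Marshal(msg)
	if err != nil {
		return err
	}
	return s.Set(ctx, partition, key, value)
}

// GetMsg reads the raw bytes back and deserializes them into msg.
func GetMsg(ctx context.Context, s Store, partition, key []byte, msg proto.Message) error {
	value, err := s.Get(ctx, partition, key)
	if err != nil {
		return err
	}
	return proto.Unmarshal(value, msg)
}
```

Because both helpers accept any `proto.Message`, a single code path can serve repositories, branches, commits and tags alike.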
For further reading, please refer to our [KV Design](https://github.com/treeverse/lakeFS/blob/master/design/accepted/metadata_kv/index.md).