
Commit

updated docs
JulienR1 committed Nov 22, 2023
1 parent 41bf9a4 commit 3b61515
Showing 2 changed files with 42 additions and 13 deletions.
53 changes: 41 additions & 12 deletions README.md
@@ -8,6 +8,19 @@

## Features

<details>
<summary><b><a href="https://crates.io/crates/substreams-entity-change/">Entity changes</a> support</b></summary>

The following entity change operations are supported:

- `OPERATION_CREATE`: The received entity changes are directly inserted into ClickHouse according to a provided [schema](#schema-initialization).

- `OPERATION_UPDATE`: By default, updates are treated as new rows, which makes it possible to keep a history of every transaction. If previous records must be replaced instead, specify the `ReplacingMergeTree` engine in the table schema (see this [article](https://clickhouse.com/docs/en/guides/developer/deduplication#using-replacingmergetree-for-upserts) and the sketch after this section).

- `OPERATION_DELETE`: Entity changes are never physically deleted from the database; again, this preserves a full history of every transaction. Deleted records are instead inserted into the `deleted_entity_changes` table, which can then be used to filter out deleted data when required.

</details>
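For illustration, here is a minimal sketch of the `ReplacingMergeTree` upsert pattern mentioned above. The `transfers` table and its columns are hypothetical, not part of this sink; only the engine choice matters:

```sql
-- Hypothetical user-defined table. With ReplacingMergeTree, rows sharing
-- the same sorting key (id) are collapsed during background merges,
-- keeping the row with the highest version column (block_number).
CREATE TABLE transfers (
    id           String,
    amount       UInt64,
    block_number UInt32
)
ENGINE = ReplacingMergeTree(block_number)
ORDER BY id;
```

Merges are asynchronous, so duplicate versions may remain visible until a merge runs; the linked deduplication article covers forcing deduplication at read time with `FINAL`.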

<details>
<summary><b>Serverless data sinking</b></summary>

@@ -133,7 +146,7 @@ Options:
--allow-unparsed <boolean> Enable storage in 'unparsed_json' table (default: false, env: ALLOW_UNPARSED)
--transaction-size <number> Number of insert statements in a SQLite transaction (default: 50, env: TRANSACTION_SIZE)
--resume <boolean> Save the cached data from the previous process into ClickHouse (default: true, env: RESUME)
- --buffer <string> SQLite database to use as an insertion buffer. Use ':memory:' to make it volatile. (default: buffer.sqlite, env: BUFFER)
+ --buffer <string> SQLite database to use as an insertion buffer. Use ':memory:' to make it volatile. (default: buffer.db, env: BUFFER)
-h, --help display help for command
```

@@ -171,14 +184,18 @@ The `USER_DIMENSION` is generated by the user-provided schema and is augmented by
```mermaid
erDiagram
USER_DIMENSION }|--|{ blocks : " "
- USER_DIMENSION }|--|{ module_hashes : " "
+ module_hashes }|--|{ USER_DIMENSION : " "
USER_DIMENSION }|--|{ cursors : " "
- blocks }|--|{ final_blocks : " "
+ deleted_entity_changes }|--|{ blocks : " "
+ module_hashes }|--|{ deleted_entity_changes : " "
+ deleted_entity_changes }|--|{ cursors : " "
- blocks }|--|{ unparsed_json : " "
+ unparsed_json }|--|{ blocks : " "
module_hashes }|--|{ unparsed_json : " "
- cursors }|--|{ unparsed_json : " "
+ unparsed_json }|--|{ cursors : " "
+ blocks }|--|{ final_blocks : " "
USER_DIMENSION {
user_data unknown
@@ -191,6 +208,17 @@
cursor String
}
deleted_entity_changes {
source LowCardinality(String)
id String
chain LowCardinality(String)
block_id FixedString(64)
block_number UInt32
module_hash FixedString(40)
timestamp DateTime(3_UTC)
cursor String
}
unparsed_json {
raw_data String
source LowCardinality(String)
@@ -232,13 +260,14 @@ erDiagram

**Indexes**

- | Table         | Fields                                       |
- | ------------- | -------------------------------------------- |
- | blocks        | `(block_id, block_number, chain, timestamp)` |
- | module_hashes | `module_hash`                                |
- | cursors       | `(cursor, module_hash, block_id)`            |
- | unparsed_json | `(source, chain, module_hash, block_id)`     |
- | final_blocks  | `block_id`                                   |
+ | Table                  | Fields                                                |
+ | ---------------------- | ----------------------------------------------------- |
+ | blocks                 | `(block_id, block_number, chain, timestamp)`          |
+ | deleted_entity_changes | `(source, block_id, block_number, chain, timestamp)`  |
+ | module_hashes          | `module_hash`                                         |
+ | cursors                | `(cursor, module_hash, block_id)`                     |
+ | unparsed_json          | `(source, chain, module_hash, block_id)`              |
+ | final_blocks           | `block_id`                                            |
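As noted in the feature list, deleted rows are only recorded in `deleted_entity_changes`, so consumers filter them out at query time. A minimal sketch, assuming a hypothetical user table `transfers` and assuming that `source` holds the originating table name (both are assumptions for the example, not confirmed by this repository):

```sql
-- Exclude every row whose id was recorded as deleted for this source table.
SELECT t.*
FROM transfers AS t
WHERE t.id NOT IN (
    SELECT id
    FROM deleted_entity_changes
    WHERE source = 'transfers'
);
```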

### Database initialization

2 changes: 1 addition & 1 deletion src/clickhouse/handleSinkRequest.ts
@@ -157,7 +157,7 @@ async function handleEntityChange(

// Updates are inserted as new rows in ClickHouse. This allows for the full history.
// If the user wants to override old data, they can specify it in their schema
- // by setting the timestamp in the sorting key and by using a ReplacingMergeTree.
+ // by using a ReplacingMergeTree.
case "OPERATION_UPDATE":
prometheus.entity_changes_updated.inc();
return insertEntityChange(table, values, { ...metadata, id: change.id });
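To make the comment above concrete: because every `OPERATION_UPDATE` lands as a new row, history queries and latest-state queries differ only in how they read the table. A hedged SQL sketch, reusing the hypothetical `transfers` table from earlier (not an API of this sink):

```sql
-- Full update history of one entity, oldest version first.
SELECT id, amount, block_number
FROM transfers
WHERE id = 'some-id'
ORDER BY block_number ASC;

-- Latest version only, when the table uses ReplacingMergeTree;
-- FINAL forces deduplication at read time.
SELECT id, amount, block_number
FROM transfers FINAL
WHERE id = 'some-id';
```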
