Docs 1104 Document missing optimized topology changes to tables (#6221)

* bdr.monitor_local_replslots updates

Signed-off-by: Dj Walker-Morgan <[email protected]>

* Fix logical_transaction_status

Signed-off-by: Dj Walker-Morgan <[email protected]>

* Added bdr.leader and bdr.local_leader_change to refs

Signed-off-by: Dj Walker-Morgan <[email protected]>

* Update product_docs/docs/pgd/5.6/reference/catalogs-internal.mdx

Co-authored-by: Josh Earlenbaugh <[email protected]>

---------

Signed-off-by: Dj Walker-Morgan <[email protected]>
Co-authored-by: Josh Earlenbaugh <[email protected]>
djw-m and jpe442 authored Nov 11, 2024
1 parent 1a5e372 commit ea1240e
Showing 7 changed files with 106 additions and 56 deletions.
2 changes: 1 addition & 1 deletion product_docs/docs/pgd/5.6/commit-scopes/camo.mdx
@@ -241,7 +241,7 @@ node isn't part of a CAMO pair.

To check the status of a transaction that was being committed when the node
failed, the application must use the function
[`bdr.logical_transaction_status`](/pgd/latest/reference/functions#bdrlogical_transaction_status).
[`bdr.logical_transaction_status()`](/pgd/latest/reference/functions#bdrlogical_transaction_status).

You pass this function the node_id and transaction_id of the transaction you want to check
on. With CAMO used in pair mode, you can use this function only on a node that's part of a CAMO pair. Along with Eager Replication, you can use it on all nodes.
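
For example, a minimal check might look like the following sketch. The literal arguments are placeholders: the application supplies the id of the node the transaction originated from and the transaction id it recorded before attempting the commit.

```sql
-- Illustrative only: 1234 stands for the origin node's id and 5678 for the
-- transaction id the application saved before the commit attempt.
SELECT bdr.logical_transaction_status(1234, 5678);
```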
15 changes: 9 additions & 6 deletions product_docs/docs/pgd/5.6/monitoring/sql.mdx
@@ -699,7 +699,7 @@ consumed the corresponding transactions. Consequently, it's not necessary to
monitor the status of the group slot.

The function [`bdr.monitor_local_replslots()`](/pgd/latest/reference/functions#bdrmonitor_local_replslots) provides a summary of whether all
PGD node replication slots are working as expected. For example:
PGD node replication slots are working as expected. This summary is also available on subscriber-only nodes that are operating as subscriber-only group leaders in a PGD cluster when [optimized topology](../nodes/subscriber_only/optimizing-so) is enabled. For example:

```sql
bdrdb=# SELECT * FROM bdr.monitor_local_replslots();
```

@@ -710,11 +710,14 @@

One of the following status summaries is returned:

- `UNKNOWN`: `This node is not part of any BDR group`
- `OK`: `All BDR replication slots are working correctly`
- `OK`: `This node is part of a subscriber-only group`
- `CRITICAL`: `There is at least 1 BDR replication slot which is inactive`
- `CRITICAL`: `There is at least 1 BDR replication slot which is missing`
| Status | Message |
|----------|------------------------------------------------------------|
| UNKNOWN | This node is not part of any BDR group |
| OK | All BDR replication slots are working correctly |
| OK | This node is part of a subscriber-only group |
| CRITICAL | There is at least 1 BDR replication slot which is inactive |
| CRITICAL | There is at least 1 BDR replication slot which is missing |
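
As a sketch (not part of the documented output above), a monitoring job could surface only the problem cases by filtering on the `status` field:

```sql
-- Illustrative check: returns a row only when the local summary is CRITICAL
SELECT status, message
FROM bdr.monitor_local_replslots()
WHERE status = 'CRITICAL';
```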


## Monitoring transaction COMMITs

102 changes: 57 additions & 45 deletions product_docs/docs/pgd/5.6/reference/catalogs-internal.mdx
@@ -92,6 +92,12 @@ A view of the `bdr.event_history` catalog that displays the information in a more
human-friendly format. Specifically, it displays the event types and subtypes
as textual representations rather than integers.

### `bdr.local_leader_change`

This is a local cache of the recent portion of the leader change history. It has the same fields as [`bdr.leader`](/pgd/5.6/reference/catalogs-visible#bdrleader), except that it's an ordered set keyed on (node_group_id, leader_kind, generation) rather than a map that tracks only the current version.
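
As an illustrative query (assuming the same column names as `bdr.leader`), the recorded history can be listed in order:

```sql
-- Sketch: recent leader changes per group and leader kind, oldest first
SELECT node_group_id, leader_kind, generation, leader_node_id
FROM bdr.local_leader_change
ORDER BY node_group_id, leader_kind, generation;
```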



### `bdr.node_config`

An internal catalog table with per-node configuration options.
@@ -160,12 +166,18 @@ An internal catalog table holding current routing information for a proxy.

#### `bdr.node_group_routing_info` columns

| Name | Type | Description |
|--------------------|-------|-----------------------------|
| node_group_id | oid | Node group ID |
| write_node_id | oid | Current write node |
| prev_write_node_id | oid | Previous write node |
| read_node_ids | oid[] | List of read-only nodes IDs |
| Name | Type | Description |
|----------------------|-------------|------------------------------------------------------------------------------------------|
| node_group_id | oid | Node group ID |
| write_node_id | oid | Current write node |
| prev_write_node_id | oid | Previous write node |
| read_node_ids | oid[] | List of read-only node IDs |
| record_version | bigint | Record version. Incremented by 1 on every material change to the routing record. |
| record_ts | timestamptz | Timestamp of last update to record_version. |
| write_leader_version | bigint | Write leader version. Copied from record_version every time write_node_id is changed. |
| write_leader_ts | timestamptz | Write leader timestamp. Copied from record_ts every time write_node_id is changed. |
| read_nodes_version | bigint | Read nodes version. Copied from record_version every time read_node_ids list is changed. |
| read_nodes_ts | timestamptz | Read nodes timestamp. Copied from record_ts every time read_node_ids list is changed. |
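
As a hedged example of how the version and timestamp columns relate, the following sketch reports when the write leader last changed and how many routing-record updates have happened since:

```sql
-- Sketch: write-leader change age per node group, using the columns above
SELECT node_group_id,
       write_node_id,
       write_leader_ts,
       record_version - write_leader_version AS updates_since_leader_change
FROM bdr.node_group_routing_info;
```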

### `bdr.node_group_routing_summary`

@@ -202,25 +214,25 @@ An internal catalog table holding proxy-specific configurations.

#### `bdr.proxy_config` columns

| Name | Type | Description |
|-----------------------------|----------|------------------------------------------------------------------------------|
| proxy_name | name | Name of the proxy |
| node_group_id | oid | Node group ID that this proxy uses |
| Name | Type | Description |
|-----------------------------|----------|----------------------------------------------------------------------------------|
| proxy_name | name | Name of the proxy |
| node_group_id | oid | Node group ID that this proxy uses |
| listen_port | int | Port that the proxy uses for read-write connections (setting to 0 disables port) |
| max_client_conn | int | Number of maximum read-write client connections that the proxy accepts |
| max_server_conn | int | Number of maximum read-write connections that the server accepts |
| server_conn_timeout | interval | Timeout for the read-write server connections |
| server_conn_keepalive | interval | Interval between the server connection keep-alive |
| fallback_group_timeout | interval | Timeout needed for the fallback |
| fallback_group_ids | oid[] | List of group IDs to use for the fallback |
| listen_addrs | text[] | Listen address |
| max_client_conn | int | Number of maximum read-write client connections that the proxy accepts |
| max_server_conn | int | Number of maximum read-write connections that the server accepts |
| server_conn_timeout | interval | Timeout for the read-write server connections |
| server_conn_keepalive | interval | Interval between the server connection keep-alive |
| fallback_group_timeout | interval | Timeout needed for the fallback |
| fallback_group_ids | oid[] | List of group IDs to use for the fallback |
| listen_addrs | text[] | Listen address |
| read_listen_port | int | Port that the proxy uses for read-only connections (setting to 0 disables port) |
| read_max_client_conn | int | Number of maximum read-only client connections that the proxy accepts |
| read_max_server_conn | int | Number of maximum read-only connections that the server accepts |
| read_server_conn_timeout | interval | Timeout for the server read-only connections |
| read_server_conn_keepalive | interval | Interval between the server read-only connection keep-alive |
| read_listen_addrs | text[] | Listen address for read-only connections |
| read_consensus_grace_period | interval | Duration for which proxy continues to route even upon loss of consensus |
| read_max_client_conn | int | Number of maximum read-only client connections that the proxy accepts |
| read_max_server_conn | int | Number of maximum read-only connections that the server accepts |
| read_server_conn_timeout | interval | Timeout for the server read-only connections |
| read_server_conn_keepalive | interval | Interval between the server read-only connection keep-alive |
| read_listen_addrs | text[] | Listen address for read-only connections |
| read_consensus_grace_period | interval | Duration for which proxy continues to route even upon loss of consensus |


### `bdr.proxy_config_summary`
@@ -229,27 +241,27 @@ A friendly view of per-proxy, instance-specific configuration options.

#### `bdr.proxy_config_summary` columns

| Name | Type | Description |
|---------------------------------|----------|-------------------------------------------------------------------------------|
| proxy_name | name | Name of the proxy |
| node_group_name | name | Node group name that this proxy uses |
| Name | Type | Description |
|---------------------------------|----------|-----------------------------------------------------------------------------------|
| proxy_name | name | Name of the proxy |
| node_group_name | name | Node group name that this proxy uses |
| listen_port | int | Port that the proxy uses for read-write connections (setting to -1 disables port) |
| max_client_conn | int | Number of maximum read-write client connections that the proxy accepts |
| max_server_conn | int | Number of maximum read-write connections that the server accepts |
| server_conn_timeout | interval | Timeout for the read-write server connections |
| server_conn_keepalive | interval | Interval between the server connection keep-alive |
| node_group_enable_proxy_routing | boolean | Does the group the proxy is in enable proxy routing? |
| node_group_location | name | The group's location value |
| fallback_group_timeout | interval | Timeout needed for the fallback |
| fallback_group_ids | oid[] | List of group IDs to use for the fallback |
| listen_addrs | text[] | Listen address |
| max_client_conn | int | Number of maximum read-write client connections that the proxy accepts |
| max_server_conn | int | Number of maximum read-write connections that the server accepts |
| server_conn_timeout | interval | Timeout for the read-write server connections |
| server_conn_keepalive | interval | Interval between the server connection keep-alive |
| node_group_enable_proxy_routing | boolean | Does the group the proxy is in enable proxy routing? |
| node_group_location | name | The group's location value |
| fallback_group_timeout | interval | Timeout needed for the fallback |
| fallback_group_ids | oid[] | List of group IDs to use for the fallback |
| listen_addrs | text[] | Listen address |
| read_listen_port | int | Port that the proxy uses for read-only connections (setting to -1 disables port) |
| read_max_client_conn | int | Number of maximum read-only client connections that the proxy accepts |
| read_max_server_conn | int | Number of maximum read-only connections that the server accepts |
| read_server_conn_timeout | interval | Timeout for the server read-only connections |
| read_server_conn_keepalive | interval | Interval between the server read-only connection keep-alive |
| read_listen_addrs | text[] | Listen address for read-only connections |
| read_consensus_grace_period | interval | Duration for which proxy continues to route even upon loss of consensus |
| read_max_client_conn | int | Number of maximum read-only client connections that the proxy accepts |
| read_max_server_conn | int | Number of maximum read-only connections that the server accepts |
| read_server_conn_timeout | interval | Timeout for the server read-only connections |
| read_server_conn_keepalive | interval | Interval between the server read-only connection keep-alive |
| read_listen_addrs | text[] | Listen address for read-only connections |
| read_consensus_grace_period | interval | Duration for which proxy continues to route even upon loss of consensus |
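
For example (illustrative only, using the column names listed above), a quick overview of each proxy's listeners and connection limits:

```sql
-- Sketch: per-proxy read-write and read-only listener summary
SELECT proxy_name,
       node_group_name,
       listen_port,
       read_listen_port,
       max_client_conn,
       read_max_client_conn
FROM bdr.proxy_config_summary;
```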

### `bdr.sequence_kind`

@@ -258,7 +270,7 @@ An internal state table storing the type of each non-local sequence. We recommen

#### `bdr.sequence_kind` columns

| Name | Type | Description |
| ------- | ---- | ----------------------------------------------------------- |
| seqid | oid | Internal OID of the sequence |
| Name | Type | Description |
|---------|------|-----------------------------------------------------------------------------|
| seqid | oid | Internal OID of the sequence |
| seqkind | char | Internal sequence kind (`l`=local,`t`=timeshard,`s`=snowflakeid,`g`=galloc) |
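
As an illustrative query, the kinds can be listed with readable sequence names (the cast of `seqid` to `regclass` is an assumption made here for display purposes):

```sql
-- Sketch: map each tracked sequence to its PGD sequence kind
SELECT seqid::regclass AS sequence_name,
       seqkind
FROM bdr.sequence_kind;
```
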
20 changes: 20 additions & 0 deletions product_docs/docs/pgd/5.6/reference/catalogs-visible.mdx
@@ -377,6 +377,26 @@ Uses `bdr.run_on_all_nodes` to gather PGD information from all nodes.
| postgres_version | text | PostgreSQL version on the node |
| bdr_version | text | PGD version on the node |

### `bdr.leader`

Tracks leader nodes across subgroups in the cluster, showing the status of all write leaders and, when optimized topology is enabled, subscriber-only group leaders.

#### `bdr.leader` columns

| Name | Type | Description |
| ---------------- | ---- | ------------------------------ |
| node_group_id | oid | ID of the node group |
| leader_node_id | oid | ID of the leader node |
| generation | int | Generation of the leader node. Semantics depend on `leader_kind`. |
| leader_kind | "char" | Kind of the leader node |

The `leader_kind` values can be:

| Value | Description |
|-------|----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
| W | Write leader, as used by proxy routing. In this case, the leader is maintained by the subgroup's Raft instance. <br/> `generation` corresponds to the `write_leader_version` of the respective `bdr.node_group_routing_info` record. |
| S | Subscriber-only group leader. This designated member of an SO group subscribes to upstream data nodes and is responsible for publishing upstream changes to the remaining SO group members. The leader is maintained by the top-level Raft instance.<br/>`generation` is incremented sequentially on each leader change. |
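
For example (illustrative only), the current leaders and their generation counters can be listed per group:

```sql
-- Sketch: current write leaders ('W') and subscriber-only group leaders ('S')
SELECT node_group_id, leader_kind, leader_node_id, generation
FROM bdr.leader
ORDER BY node_group_id, leader_kind;
```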

### `bdr.local_consensus_snapshot`

This catalog table contains consensus snapshots created or received by
17 changes: 14 additions & 3 deletions product_docs/docs/pgd/5.6/reference/functions.mdx
@@ -911,6 +911,8 @@ view `pg_replication_slots` (slot active or inactive) to provide a
local check considering all replication slots except the PGD group
slots.

This function also provides status information on subscriber-only nodes that are operating as subscriber-only group leaders in a PGD cluster when [optimized topology](../nodes/subscriber_only/optimizing-so) is enabled.

#### Synopsis

```sql
bdr.monitor_local_replslots()

#### Notes

This function returns a record with fields `status` and `message`,
as explained in [Monitoring replication slots](../monitoring/sql/#monitoring-replication-slots).
This function returns a record with fields `status` and `message`.

| Status | Message |
|----------|------------------------------------------------------------|
| UNKNOWN | This node is not part of any BDR group |
| OK | All BDR replication slots are working correctly |
| OK | This node is part of a subscriber-only group |
| CRITICAL | There is at least 1 BDR replication slot which is inactive |
| CRITICAL | There is at least 1 BDR replication slot which is missing |

Further explanation is available in [Monitoring replication slots](../monitoring/sql/#monitoring-replication-slots).

### `bdr.wal_sender_stats`

@@ -1078,7 +1089,7 @@ This function begins a wait for CAMO transactions to be fully resolved.
```sql
bdr.camo_transactions_resolved()
```

### bdr.logical_transaction_status
### `bdr.logical_transaction_status`

To check the status of a transaction that was being committed when the node failed, the application must use this function, passing as parameters the node id of the node the transaction originated from and the transaction id on the origin node.
