Using the last replayed LSN as the value of LSN location if the last received LSN is at the starting point of the WAL segment #227

gavinThinking · 2024-04-07T15:34:48Z

Problem Statement

Issue: #228

Root cause analysis

If a standby instance has no primary instance to synchronize with, pg_last_wal_receive_lsn() never advances beyond its initial value, which is the starting point of the last WAL segment in the local pg_wal folder.

Solution

In most cases, pg_last_wal_receive_lsn() accurately returns the last received LSN location.

We prefer pg_last_wal_receive_lsn() because a standby can lag in replaying WAL, e.g. because of its read-only activity. A standby that has received more data from the primary than the others may therefore have replayed less of it at the time of the monitor or promote action.

Therefore, we still use pg_last_wal_receive_lsn() to obtain the LSN location in most situations.
In the scenario where the entire cluster has just restarted:
If the last three bytes (six hexadecimal digits) of the last received LSN are zeros, the LSN is the starting point of the last WAL segment in the local pg_wal folder, and the received LSN is not accurate.
In this case, we query the last replayed LSN and compare it with the last received LSN. If the last replayed LSN is greater than the last received LSN, we use the last replayed LSN as the LSN location.
Note: The changes only affect scenarios where the cluster is restarting.
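
The selection rule above can be sketched as follows. This is only an illustration in Python, not the code changed in pgsqlms; the names `parse_lsn`, `choose_lsn_location` and `WAL_SEGMENT_SIZE` are hypothetical, and the sketch assumes the default 16 MB WAL segment size (which is what makes "six trailing hexadecimal zeros" equivalent to "start of a segment").

```python
# Assumption: default WAL segment size of 16 MB (2**24 bytes).
WAL_SEGMENT_SIZE = 16 * 1024 * 1024


def parse_lsn(text: str) -> int:
    """Convert an LSN such as '1/862B9CC0' into a 64-bit integer."""
    high, low = text.split("/")
    return (int(high, 16) << 32) | int(low, 16)


def choose_lsn_location(receive_lsn: str, replay_lsn: str) -> str:
    """Return the LSN to report as the LSN location (hypothetical helper).

    The received LSN is preferred. The replayed LSN is used only when the
    received LSN sits exactly on a segment boundary and the replayed LSN
    is strictly greater.
    """
    received = parse_lsn(receive_lsn)
    replayed = parse_lsn(replay_lsn)
    at_segment_start = received % WAL_SEGMENT_SIZE == 0
    if at_segment_start and replayed > received:
        return replay_lsn
    return receive_lsn
```

Comparing the parsed integers rather than the text form matters because LSN strings are not zero-padded, so a plain string comparison could order them incorrectly.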

When a standby is restarted, it must replay the transaction log to bring the database back to its correct state.
So in this scenario, the last replayed LSN is accurate.

| pg_is_in_recovery() | pg_is_wal_replay_paused() | pg_last_wal_receive_lsn() | pg_last_wal_replay_lsn() |
|---|---|---|---|
| t | f | 1/86000000 | 1/862B9CC0 |
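
As a worked check on the sample values above, the following sketch computes whether the received LSN lies on a segment boundary and how far the replayed LSN is ahead of it. The helper names are hypothetical and the 24-bit mask assumes the default 16 MB WAL segment size.

```python
def parse_lsn(text: str) -> int:
    """Convert an LSN such as '1/86000000' into a 64-bit integer."""
    high, low = text.split("/")
    return (int(high, 16) << 32) | int(low, 16)


def describe(receive_lsn: str, replay_lsn: str) -> tuple[bool, int]:
    """Return (received LSN is at a segment start, replay lead in bytes)."""
    received = parse_lsn(receive_lsn)
    replayed = parse_lsn(replay_lsn)
    # Assumption: 16 MB segments, so the low 24 bits are the in-segment offset.
    at_segment_start = (received & 0xFFFFFF) == 0
    return at_segment_start, replayed - received


print(describe("1/86000000", "1/862B9CC0"))  # (True, 2858176)
```

Here the received LSN is exactly at the start of a segment while the replayed LSN is 0x2B9CC0 bytes further on, which is the situation the change targets.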

gavinThinking added 2 commits April 7, 2024 22:16

Using the last replayed LSN as the value of LSN location if the last …

2b1ec37

…received LSN is at the starting point of the WAL segment

Update pgsqlms

686a375

gavinThinking mentioned this pull request Apr 9, 2024

When the node restarts, the pg_last_wal_replay_lsn() is used as the LSN location for election. #228

Open
