Using the last replayed LSN as the value of LSN location if the last received LSN is at the starting point of the WAL segment #227
Problem Statement
Issue: #228
Root cause analysis
If a standby instance has no primary instance to synchronize with, pg_last_wal_receive_lsn() keeps returning its initial value, which is the starting point of the last WAL segment in the local pg_wal folder.
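To illustrate what "the starting point of the last segment" looks like, here is a minimal sketch that derives the start LSN from a WAL segment file name. It assumes the default 16 MB segment size and the standard 24-hex-digit file name layout (timeline, log, segment); the function name `segment_start_lsn` is hypothetical and not part of this change.

```python
# Assumption: default wal_segment_size of 16 MB.
WAL_SEGMENT_SIZE = 16 * 1024 * 1024
SEGMENTS_PER_LOG = 0x100000000 // WAL_SEGMENT_SIZE  # 256 with 16 MB segments


def segment_start_lsn(wal_file_name: str) -> str:
    """Return the LSN (as 'X/Y' text) at which the given WAL segment begins."""
    if len(wal_file_name) != 24:
        raise ValueError("expected a 24-character WAL segment file name")
    # The first 8 hex digits are the timeline ID, which does not affect the LSN.
    log = int(wal_file_name[8:16], 16)
    seg = int(wal_file_name[16:24], 16)
    lsn = (log * SEGMENTS_PER_LOG + seg) * WAL_SEGMENT_SIZE
    # PostgreSQL prints an LSN as two hexadecimal halves separated by a slash.
    return f"{lsn >> 32:X}/{lsn & 0xFFFFFFFF:X}"
```

For example, a standby whose newest local segment is 000000010000000000000003 and that never connects to a primary would keep reporting 0/3000000 as its last received LSN.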
Solution
In most cases, pg_last_wal_receive_lsn() accurately reports the last received LSN location.
Therefore, we still use pg_last_wal_receive_lsn() to obtain the LSN location in most situations.
In the scenario where the entire cluster has just restarted:
If the last three bytes (six hexadecimal digits) of the last received LSN are zeros, the LSN is the starting point of the last WAL segment in the local pg_wal folder (this holds for the default 16 MB segment size), and the reported value is not accurate.
In this case, we query the last replayed LSN and compare it with the last received LSN. If the last replayed LSN is greater than the last received LSN, we use the last replayed LSN as the LSN location.
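The selection rule above can be sketched as follows. This is an illustration of the logic only, not the code in this pull request: the helper names are hypothetical, the inputs are LSN strings in the usual X/Y text form (for example, as returned by pg_last_wal_receive_lsn()), and a 16 MB segment size is assumed.

```python
# Assumption: default wal_segment_size of 16 MB, so a segment boundary
# has its low 24 bits (six hexadecimal digits) equal to zero.
WAL_SEGMENT_SIZE = 16 * 1024 * 1024


def parse_lsn(text: str) -> int:
    """Convert an LSN in 'X/Y' text form to a 64-bit integer."""
    high, low = text.split("/")
    return (int(high, 16) << 32) | int(low, 16)


def at_segment_start(lsn: int) -> bool:
    """True if the LSN lies exactly on a WAL segment boundary."""
    return lsn % WAL_SEGMENT_SIZE == 0


def choose_lsn_location(received: str, replayed: str) -> str:
    """Pick the LSN location to report for a standby.

    The received LSN is used unless it sits on a segment boundary and
    the replayed LSN is strictly greater, in which case the replayed
    LSN is used instead.
    """
    received_lsn = parse_lsn(received)
    replayed_lsn = parse_lsn(replayed)
    if at_segment_start(received_lsn) and replayed_lsn > received_lsn:
        return replayed
    return received
```

With a received LSN of 0/3000000 and a replayed LSN of 0/3000148, the sketch returns 0/3000148; with a received LSN of 0/3000060, it returns the received value unchanged.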
Note: This change only affects scenarios where the cluster is restarting.
When the standby is restarted, it must replay the transaction log to bring the database tables back to their correct state.
So in this scenario, the last replayed LSN is accurate.