Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Pghoard restore_command optimization #618

Merged
merged 2 commits into from
May 21, 2024

Conversation

egor-voynov-aiven
Copy link
Contributor

@egor-voynov-aiven egor-voynov-aiven commented Mar 22, 2024

Changes:

  • Added a new DownloadResultsProcessor thread, which validates downloaded files and saves them to the target: pg_wal/<wal_name>.tmp -> pg_wal/<wal_name>.pghoard.prefetch.
    Files with "prefetch" suffix can be copied to the destination without extra checks now.
  • Renaming prefetched WAL to target is carried out directly in the pghoard_postgres_command_go
  • changed items type for "pending_download_ops" dict from "dict" to "dataclass" (PendingDownloadOp)

[BF-2247]

We observed that the startup process was only using 60 to 80% of a CPU. This means that we’re not supplying WAL files fast enough or that the restore_command takes too long to complete.
The idea would be to bypass the HTTP-call needed to check if we have a WAL file present from the restore_command to the pghoard main process.

pghoard/webserver.py Fixed Show fixed Hide fixed
os.unlink(prefetch_target_path)
return e
with self.server.lock:
os.rename(src, dst)

Check failure

Code scanning / CodeQL

Uncontrolled data used in path expression High

This path depends on a
user-provided value
.
This path depends on a
user-provided value
.
os.unlink(prefetch_target_path)
return e
with self.server.lock:
os.rename(src, dst)

Check failure

Code scanning / CodeQL

Uncontrolled data used in path expression High

This path depends on a
user-provided value
.
This path depends on a
user-provided value
.
@@ -443,6 +450,8 @@
xlog_file = Path(target_path).resolve()
if data_dir not in xlog_file.parents:
raise HttpResponse(f"Invalid xlog file path {target_path}, it should be in data directory", status=400)
if not xlog_file.parent.is_dir():

Check failure

Code scanning / CodeQL

Uncontrolled data used in path expression High

This path depends on a
user-provided value
.
This path depends on a
user-provided value
.
@codecov-commenter
Copy link

codecov-commenter commented Mar 25, 2024

Codecov Report

Attention: Patch coverage is 87.68116% with 17 lines in your changes are missing coverage. Please review.

Project coverage is 90.79%. Comparing base (5505b86) to head (30eb4e5).

Additional details and impacted files

Impacted file tree graph

@@            Coverage Diff             @@
##             main     #618      +/-   ##
==========================================
- Coverage   91.01%   90.79%   -0.22%     
==========================================
  Files          31       31              
  Lines        4917     4954      +37     
==========================================
+ Hits         4475     4498      +23     
- Misses        442      456      +14     
Files Coverage Δ
pghoard/common.py 90.96% <100.00%> (ø)
pghoard/pghoard.py 85.66% <100.00%> (+0.08%) ⬆️
pghoard/postgres_command.py 91.39% <50.00%> (-1.87%) ⬇️
pghoard/webserver.py 89.42% <88.18%> (+0.01%) ⬆️

... and 3 files with indirect coverage changes

@egor-voynov-aiven egor-voynov-aiven force-pushed the egor-voynov-improve-restore-command branch 2 times, most recently from a36a724 to ebe7234 Compare March 26, 2024 08:36
pghoard/webserver.py Dismissed Show dismissed Hide dismissed
@egor-voynov-aiven egor-voynov-aiven force-pushed the egor-voynov-improve-restore-command branch 3 times, most recently from 23db920 to 92b17da Compare March 26, 2024 09:46
@egor-voynov-aiven egor-voynov-aiven marked this pull request as ready for review March 26, 2024 09:54
@egor-voynov-aiven egor-voynov-aiven force-pushed the egor-voynov-improve-restore-command branch from 92b17da to d03e403 Compare March 26, 2024 10:34
pghoard/webserver.py Outdated Show resolved Hide resolved
Copy link
Contributor

@0xlianhu 0xlianhu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the new thread shutdown needs to be fixed.

Copy link
Contributor

@0xlianhu 0xlianhu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Q: why don't we change the python version of restore_command as well? I.e., pghoard/restore.py

@egor-voynov-aiven egor-voynov-aiven force-pushed the egor-voynov-improve-restore-command branch 4 times, most recently from a05baab to 6a53f07 Compare April 4, 2024 12:24
Added a new DownloadResultsProcessor thread, which validates downloaded files and saves them to the target: '<pg_wal>/<wal_name>.tmp' -> '<pg_wal>/<wal_name>.prefetch'.
Files with "prefetch" suffix can be copied to the destination without extra checks now.
Also changed items type for "pending_download_ops" dict from "dict" to "dataclass" (PendingDownloadOp)
Fixed typing issues for webserver.py
renaming prefetched WAL in python CLI "pghoard_postgres_command"
renaming prefetched WAL in "pghoard_postgres_command_go"
@egor-voynov-aiven egor-voynov-aiven force-pushed the egor-voynov-improve-restore-command branch from 30eb4e5 to 7c466c3 Compare April 5, 2024 09:14
@egor-voynov-aiven egor-voynov-aiven dismissed 0xlianhu’s stale review April 5, 2024 12:05

I fixed requesting changes

@egor-voynov-aiven egor-voynov-aiven force-pushed the egor-voynov-improve-restore-command branch from 63d2b2f to 2f6cfc2 Compare April 12, 2024 08:06
@egor-voynov-aiven egor-voynov-aiven force-pushed the egor-voynov-improve-restore-command branch from 2f6cfc2 to c367c2c Compare April 12, 2024 08:46
@0xlianhu 0xlianhu self-requested a review April 15, 2024 08:44
Copy link
Contributor

@0xlianhu 0xlianhu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, I don't have more comment.

@0xlianhu 0xlianhu merged commit cea3d0b into main May 21, 2024
6 of 7 checks passed
@0xlianhu 0xlianhu deleted the egor-voynov-improve-restore-command branch May 21, 2024 09:03
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants