feat: new job to backup pins to s3 #1844
Open: joshghent wants to merge 69 commits into main from feat/794-copy-pins-to-eips
Changes from 64 commits
69 commits
ca658d5 feat: added rough draft of the cron job to copy pins into s3 (joshghent)
cd7a420 chore: add aws-sdk (joshghent)
79219bb feat: add cron (joshghent)
2cd96c7 feat: added downloading and uploading of file correctly (joshghent)
48cb64d chore: added required dependencies (joshghent)
55367e7 feat: add new backup_urls column to the psa_pin_requests table (joshghent)
fdb92d3 feat: finished job to backup files to s3 (joshghent)
c910b65 feat: added new test suite for pins backup (joshghent)
bff06de feat: add github action to run cron job (joshghent)
b34dd12 feat: added cron pins backup script (joshghent)
14cbd8c chore: removed todo (joshghent)
ef2600e feat: added test mocks (joshghent)
9e6f035 chore: add mocks for export car (joshghent)
ec58dd2 chore: update package-lock, fixed fetch reference (joshghent)
0f0cbe5 feat: swapped to a class tructure (joshghent)
d009be1 chore: update the job to create a new backup class instance (joshghent)
e63a4a1 feat: added more logging, updated query (joshghent)
e4e22a0 fix: updated test mocks (joshghent)
381ba65 fix: updated to make sql valid (joshghent)
9e01bcb Fix issues with updating rows (flea89)
7271673 chore: remove redundant argument (joshghent)
1010da8 chore: update variable names (joshghent)
5c0cc8f chore: update to use npm for the cron (joshghent)
4b34fdb chore: updated type definition to accept a pool not client for pg (joshghent)
4b59c7b chore: remove commented code (joshghent)
1826006 feat: added way to not re-upload files we already have (joshghent)
e01dda0 chore: updated package lock (joshghent)
3c529e5 Make linter happier (flea89)
d417922 feat: `w3 open <cid>` to open cid on w3s.link in browser (#1892) (olizilla)
7815d72 feat: put write to cluster behind a flag (#1785) (olizilla)
396222f chore: updated package lock (joshghent)
e3a13da chore: update package lock (joshghent)
6fad50f feat: pass through the query limit on workflow dispatch (joshghent)
5c7bca4 feat: updated the backup urls column to default to an empty arr (joshghent)
f07c61b chore: added dagula (joshghent)
36e7e08 feat: added default to empty array (joshghent)
2d3d56a feat: added default on the init data (joshghent)
86b3de6 feat: updated query to return correct data (joshghent)
7ceeaf9 chore: updated return types (joshghent)
7368e13 chore: update package lock (joshghent)
e127dc0 chore: remove todo comment (joshghent)
bf8aa7a Add fetch to cron pin-backup (flea89)
4abd92f Update types (flea89)
874830c Add Minio to cron (flea89)
38b96a4 Add getS3 to cron utils (flea89)
6aa93ef update tests to be more e2endy (flea89)
15fc4fd Comment out size limiting (flea89)
9d2fe99 more type fixes (flea89)
ae3b044 Use s3 v3 and updats to backup updates (flea89)
5bc3522 tidy tests up (flea89)
1777df2 Set concurrency to avoid multiple jobs running together (flea89)
dc431fc Use ipfs node id rather than cluster one (flea89)
f621c31 Create a car file for upload and test it holds the expected content (flea89)
a266c32 Add abort controller (flea89)
b7aa9db Merge branch 'main' into feat/794-copy-pins-to-eips (flea89)
c9e4dbb Align api package with main (flea89)
6e1f121 Improve logging and more tests (flea89)
d573121 Improve testing and error handling (flea89)
fd02b6a Remove only from test (flea89)
395dff7 Update logging (flea89)
c58990b Update node modules (flea89)
52984d6 Update setup node action (flea89)
694788e Fix dependecies (flea89)
9dec63f Increase cron timeout (flea89)
61990b2 Improve logging and remove stale code (flea89)
749e20d Cache instances of Dagula and use 1 libp2p (flea89)
be24923 Better instanciation of resources (flea89)
7a48df2 Default to quiet logging, but allow for more verbose when running man… (flea89)
8eae8f6 Update cron help text (flea89)
@@ -0,0 +1,45 @@
name: Cron Backup Pins

on:
  schedule:
    - cron: '*/30 * * * *'
  workflow_dispatch:
    inputs:
      limit:
        description: 'The limit of records to backup'
        default: 10000

jobs:
  update:
    name: Backup Pins
    runs-on: ubuntu-latest
    strategy:
      matrix:
        env: ['staging', 'production']
    concurrency: psa-pin-backup
    steps:
      - uses: actions/checkout@v3
        with:
          fetch-depth: 0
      - name: Checkout latest cron release tag
        run: |
          LATEST_TAG=$(git describe --tags --abbrev=0 --match='cron-*')
          git checkout $LATEST_TAG
      - uses: actions/setup-node@v3
        with:
          node-version: 16
      - uses: bahmutov/npm-install@v1
      - name: Run job
        env:
          DEBUG: '*'
          ENV: ${{ matrix.env }}
          STAGING_PG_CONNECTION: ${{ secrets.STAGING_PG_CONNECTION }}
          STAGING_RO_PG_CONNECTION: ${{ secrets.STAGING_PG_CONNECTION }} # no replica for staging
          PROD_PG_CONNECTION: ${{ secrets.PROD_PG_CONNECTION }}
          PROD_RO_PG_CONNECTION: ${{ secrets.PROD_RO_PG_CONNECTION }}
          QUERY_LIMIT: ${{ github.event.inputs.limit }}
        run: npm run start:pins:backup -w packages/cron

      - name: Heartbeat
        if: ${{ success() }}
        run: ./packages/tools/scripts/cli.js heartbeat --token ${{ secrets.OPSGENIE_KEY }} --name cron-web3storage-backup-pins
Why not just schedule it every 6h, the maximum amount of time a job can run for?
If I understand this correctly, the job has 2 goals:
For 2, I guess it's ideal to keep moving stuff as promptly as possible (30 min makes sense, maybe even less than that?), while we know the first runs of the job will be super slow (since they will have to go through all the historical data).
Isn't a solution that satisfies both worlds to keep the `schedule` as is and set `concurrency` on the job?
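
To make the suggestion concrete, here is a minimal sketch of what "keep the schedule and set concurrency" could look like. It is not the PR's actual configuration: the per-environment group name and the placeholder `echo` step are assumptions for illustration. As far as I understand GitHub Actions, jobs in the same concurrency group run one at a time, and a newly queued run replaces any run already pending in that group, so a slow first run would not cause a pile-up of 30-minute runs behind it.

```yaml
# Sketch only, under the assumptions stated above.
on:
  schedule:
    - cron: '*/30 * * * *' # keep the frequent schedule as is

jobs:
  update:
    runs-on: ubuntu-latest
    strategy:
      matrix:
        env: ['staging', 'production']
    concurrency:
      # Assumption: one group per environment, so the staging and
      # production jobs do not queue behind each other.
      group: psa-pin-backup-${{ matrix.env }}
      # Let a long-running backup finish instead of cancelling it.
      cancel-in-progress: false
    steps:
      # Placeholder step standing in for the real backup command.
      - run: echo "backup for ${{ matrix.env }}"
```

With a single shared group name, the two matrix jobs would also wait on each other, which may or may not be what is wanted here.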