Colossus: Rework sync and cleanup #5194

Open
wants to merge 21 commits into
base: master
21 commits
8a32117 Gitignore .venv for localstack purposes (Lezek123, Oct 24, 2024)
7696bb9 Merge remote-tracking branch 'upstream/master' into archive_script (Lezek123, Oct 24, 2024)
6485251 Colossus: Archive script (Lezek123, Oct 25, 2024)
e3a3074 Colossus: Add proper-lockfile to deps (Lezek123, Oct 25, 2024)
0d9ba5b Colossus: Add @types/proper-lockfile dep (Lezek123, Oct 25, 2024)
b8f5e8d Colossus: Add @types/proper-lockfile dep (Lezek123, Oct 25, 2024)
df625e7 storageCleanup test: Give nodes more time to sync (Lezek123, Oct 25, 2024)
f653de1 Merge remote-tracking branch 'origin/archive_script' into archive_script (Lezek123, Oct 25, 2024)
6e74234 Colossus: Add util:search-archives command (Lezek123, Oct 25, 2024)
abcf781 Colossus archive script: Add storage classes (Lezek123, Oct 28, 2024)
2b45e12 Colossus archive script: Support for faster compression / no compress… (Lezek123, Oct 29, 2024)
7b7bf26 Archive script: Optimizations, bug fixes, stats logging (Lezek123, Nov 1, 2024)
5238b65 Archive script: Sync all objects, ignore bucket assignments (Lezek123, Nov 6, 2024)
3cb0713 Upload timeout issues fix attempt (Lezek123, Nov 15, 2024)
c550c0e Fix failure handling and adjust timeouts (Lezek123, Nov 15, 2024)
dff1040 Better logs (Lezek123, Nov 15, 2024)
f76e84d Fix: Destroy fileStream after failed upload (Lezek123, Nov 19, 2024)
53fbddf Always destroy fileStream after successful/failed upload (Lezek123, Nov 21, 2024)
29a12fb Sync and cleanup rework (Lezek123, Nov 25, 2024)
0094a61 Update changelog (Lezek123, Nov 26, 2024)
2520e15 Colossus cleanup: Additional safety mechanism (Lezek123, Nov 26, 2024)
2 changes: 2 additions & 0 deletions .gitignore
@@ -48,3 +48,5 @@ runtime-inputs/
devops/infrastructure

joystream.tar.gz

.venv
3 changes: 3 additions & 0 deletions colossus.Dockerfile
@@ -30,7 +30,10 @@ RUN yarn workspace storage-node build
RUN yarn cache clean

FROM node:18 as final

WORKDIR /joystream
# 7zip and zstd are required by the archive script
RUN apt-get update && apt-get install -y p7zip-full zstd
COPY --from=builder /joystream /joystream
RUN yarn --frozen-lockfile --production

4 changes: 2 additions & 2 deletions setup.sh
@@ -8,7 +8,7 @@ if [[ "$OSTYPE" == "linux-gnu" ]]; then
# code build tools
sudo apt-get update -y
sudo apt-get install -y coreutils clang llvm jq curl gcc xz-utils sudo pkg-config \
- unzip libc6-dev make libssl-dev python3 cmake protobuf-compiler libprotobuf-dev
+ unzip libc6-dev make libssl-dev python3 cmake protobuf-compiler libprotobuf-dev p7zip-full

# Docker: do not replace existing installation to avoid distrupting running containers
if ! command -v docker &> /dev/null
@@ -23,7 +23,7 @@ elif [[ "$OSTYPE" == "darwin"* ]]; then
fi
# install additional packages
brew update
- brew install coreutils gnu-tar jq curl llvm gnu-sed cmake protobuf || :
+ brew install coreutils gnu-tar jq curl llvm gnu-sed cmake protobuf p7zip || :
echo "It is recommended to setup Docker desktop from: https://www.docker.com/products/docker-desktop"
echo "It is also recommended to install qemu emulators with following command:"
echo "docker run --privileged --rm tonistiigi/binfmt --install all"
11 changes: 11 additions & 0 deletions storage-node/CHANGELOG.md
@@ -1,3 +1,14 @@
### 4.3.0

- **New feature:** `archive` mode / command, which allows downloading, compressing and uploading all data objects to an external S3 bucket that can be used as a backup.
- **Optimizations:** The way data objects / data object ids are queried and processed during sync and cleanup has been optimized:
  - `DataObjectDetailsLoader` and `DataObjectIdsLoader` were implemented. They allow loading data objects / data object ids in batches using a connection query and avoid fetching redundant data from the GraphQL server.
  - Sync and cleanup services now process tasks in batches of `10_000` to avoid overflowing the memory.
  - Synchronous operations like `sort` or `filter` on larger arrays of data objects have been optimized (for example, by replacing `.filter(Array.includes(...))` with `.filter(Set.has(...))`).
- A safety mechanism was added to avoid removing "deleted" objects for which a related `DataObjectDeleted` event cannot be found in storage squid.
- Improved logging during cleanup.
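
For readers skimming the changelog, here is a minimal sketch of the two optimization patterns named above: replacing `.filter(Array.includes(...))` with `.filter(Set.has(...))`, and processing tasks in batches of `10_000`. It is an illustration only and is not taken from this PR. The names `missingIds`, `processInBatches` and `BATCH_SIZE` are hypothetical, and the actual implementation in storage-node may be structured differently.

```typescript
// Hypothetical illustration; these names are NOT from the storage-node codebase.
const BATCH_SIZE = 10_000

// Build the Set once so each membership check is constant time on average.
// The equivalent `requiredIds.filter((id) => !localIds.includes(id))` rescans
// `localIds` for every element, which is quadratic for large arrays.
function missingIds(requiredIds: string[], localIds: string[]): string[] {
  const local = new Set(localIds)
  return requiredIds.filter((id) => !local.has(id))
}

// Hand the items to `handler` one slice at a time, waiting for each slice to
// finish before starting the next, so at most `size` items are in flight.
async function processInBatches<T>(
  items: T[],
  handler: (batch: T[]) => Promise<void>,
  size: number = BATCH_SIZE
): Promise<number> {
  if (!Number.isInteger(size) || size <= 0) {
    throw new Error('size must be a positive integer')
  }
  let processed = 0
  for (let start = 0; start < items.length; start += size) {
    const batch = items.slice(start, start + size)
    await handler(batch)
    processed += batch.length
  }
  return processed
}
```

A caller could, for instance, compute `missingIds(required, local)` and pass the result to `processInBatches` with a handler that downloads one batch at a time. Whether the real sync service composes the steps this way is an assumption.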


### 4.2.0

- Fix `util:cleanup` script (call `loadDataObjectIdCache` first)
585 changes: 385 additions & 200 deletions storage-node/README.md

Large diffs are not rendered by default.

7 changes: 6 additions & 1 deletion storage-node/package.json
@@ -1,14 +1,16 @@
{
"name": "storage-node",
"description": "Joystream storage subsystem.",
- "version": "4.2.0",
+ "version": "4.3.0",
"author": "Joystream contributors",
"bin": {
"storage-node": "./bin/run"
},
"bugs": "https://github.com/Joystream/joystream/issues",
"dependencies": {
"@apollo/client": "^3.3.21",
"@aws-sdk/client-s3": "^3.675.0",
"@aws-sdk/s3-request-presigner": "^3.675.0",
"@elastic/ecs-winston-format": "^1.3.1",
"@joystream/metadata-protobuf": "^2.15.0",
"@joystream/opentelemetry": "1.0.0",
@@ -36,6 +38,7 @@
"await-lock": "^2.1.0",
"base64url": "^3.0.1",
"blake3-wasm": "^2.1.5",
"chokidar": "4.0.1",
"cors": "^2.8.5",
"cross-fetch": "^3.1.4",
"express": "4.17.1",
@@ -52,6 +55,7 @@
"node-cache": "^5.1.2",
"openapi-editor": "^0.3.0",
"promise-timeout": "^1.3.0",
"proper-lockfile": "^4.1.2",
"react": "^18.2.0",
"read-chunk": "^3.2.0",
"rimraf": "^3.0.2",
@@ -81,6 +85,7 @@
"@types/mocha": "^5",
"@types/node": "^18.6.0",
"@types/pg": "^8.6.1",
"@types/proper-lockfile": "^4.1.4",
"@types/swagger-ui-express": "^4.1.2",
"@types/ws": "^5.1.2",
"@typescript-eslint/eslint-plugin": "3.8.0",