Cert Renewal script #14667
Closed
OK, there were two problems:

1. A timeout of 5s appears to be now too short for Google Cloud Storage. I am not sure why but we timeout substantially more frequently. I have observed this myself on my laptop. Just this morning I saw it happen to Daniel.
2. When using an `aiohttp.AsyncIterablePayload`, it is *critical* to always check if the coroutine which actually writes to GCS (which is stashed in the variable `request_task`) is still alive. In the current `main`, we do not do this which causes hangs (in particular the timeout exceptions are never thrown ergo we never retry).

To understand the second problem, you must first recall how writing works in aiogoogle. There are two Tasks and an `asyncio.Queue`. The terms "writer" and "reader" are somewhat confusing, so let's use left and right. The left Task has the owning reference to both the source "file" and the destination "file". In particular, it is the *left* Task which closes both "files". Moreover, the left Task reads chunks from the source file and places those chunks on the `asyncio.Queue`. The right Task takes chunks off the queue and writes those chunks to the destination file.

This situation can go awry in two ways.

First, if the right Task encounters any kind of failure, it will stop taking chunks off of the queue. When the queue (which has a size limit of one) is full, the left Task will hang. The system is stuck. The left Task will wait forever for the right Task to empty the queue.

The second scenario is exactly the same except that the left Task is trying to add the "stop" message to the queue rather than a chunk.

In either case, it is critical that the left Task waits simultaneously on the queue operation *and* on the right Task completing. If the right Task has died, no further writes can occur and the left Task must raise an exception.

In the first scenario, we do not observe the right Task's exception because that will be done when we close the `InsertObjectStream` (which represents the destination "file").

---

I also added several types, assertions, and a few missing `async with ... as resp:` blocks.
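The "wait simultaneously on the queue operation and on the right Task" rule described in this commit message can be sketched with plain `asyncio`. This is a minimal illustration under stated assumptions, not the actual aiogoogle code: `put_or_raise`, `RightTaskDied`, and `demo` are hypothetical names, and the failing right Task is simulated with a `ConnectionError`.

```python
import asyncio
from typing import Optional, Tuple


class RightTaskDied(Exception):
    """Hypothetical error: the right Task ended before accepting the item."""


async def put_or_raise(queue: asyncio.Queue, item: object, right_task: asyncio.Task) -> None:
    """Wait on the queue put *and* on the right Task at the same time."""
    if right_task.done():
        raise RightTaskDied('right Task already finished')
    put = asyncio.ensure_future(queue.put(item))
    try:
        # Returns as soon as either the put succeeds or the right Task ends.
        await asyncio.wait({put, right_task}, return_when=asyncio.FIRST_COMPLETED)
    except BaseException:
        put.cancel()  # do not leave a dangling put if we are cancelled
        raise
    if put.done():
        put.result()  # re-raise any error from the put itself
        return
    # The right Task ended while the queue was still full: stop instead of hanging.
    put.cancel()
    raise RightTaskDied('right Task finished; no further writes can occur')


async def demo() -> Tuple[int, Optional[BaseException]]:
    queue: asyncio.Queue = asyncio.Queue(maxsize=1)  # size limit of one, as in the text

    async def right() -> None:
        await queue.get()          # takes one chunk off the queue...
        await asyncio.sleep(0.01)  # ...pretends to write it...
        raise ConnectionError('simulated server disconnect')  # ...then fails

    right_task = asyncio.ensure_future(right())
    accepted = 0
    try:
        for i in range(10):
            await put_or_raise(queue, f'chunk-{i}'.encode(), right_task)
            accepted += 1
    except RightTaskDied:
        pass
    # The right Task's own exception is observed separately, much as the text
    # describes it being observed when the destination "file" is closed.
    return accepted, right_task.exception()


if __name__ == '__main__':
    print(asyncio.run(demo()))
```

With a bare `await queue.put(...)` in place of `put_or_raise`, the loop in `demo` would block forever once the queue is full and the right Task has failed, which is the hang described above. The same helper would cover the second scenario if the "stop" message were passed as `item`.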
[copy] fix the TimeoutError and ServerDisconnected issues in copy
Merge upstream 0.2.95
Allow to select a pool for a job through a label
Use Spot VMs on GCP
Release 0.2.96
Co-authored-by: Patrick Schultz <[email protected]>
0.2.97 merge from upstream
…s file (#203) * ORGANIZATION_DOMAIN --> GITHUB_ORGANIZATION * Add github_organization to global.tfvars template * Use github_organization in main.tf for credentials paths * Update credentials paths
…ght take a while]
…s ci_storage_uri in CI Steps
Merge upstream HEAD(b7bde56, 2024-05-14) Stop writing to V2 tables
Merge upstream HEAD(e68103e, 2024-05-14) Remove V2 tables
Merge upstream HEAD(dc7fce0, 2024-05-14) Use CI's credentials for image pushing instead of gcr-push
Merge upstream HEAD(13de4e6, 2024-05-14) Add job groups [migration might take a while!]
Merge upstream HEAD(6a6c38d, 2024-05-21) Expose HAIL_CI_STORAGE_URI as ci_storage_uri in CI Steps
Merge upstream HEAD(bea04d9, 2024-05-21) [release] 0.2.130 (hail-is#14454)
* Try adding extra route * Add another route --------- Co-authored-by: Michael Franklin <[email protected]>
…l causing Error: Column 'time_completed' in where clause is ambiguous. (#340)
…ob_resources_v3. (#341)
This 22.04.4 version updates the kernel from 5.19 to 6.5, which is not supported by NVIDIA-Linux 530.30.2. Update that to the current latest, and verify that that supports L4 GPUs as used by G2 VMs.
…tials This secret was superseded in upstream PR hail-is#14031 and later deleted. That PR replaced use of /registry-push-credentials/credentials.json with $GOOGLE_APPLICATION_CREDENTIALS instead, which is presumably already activated for gcloud purposes.
Recent upstream changes include a rewrite of several of the web page templates. Hence this merge drops the functionality locally added by PRs #270, #272, and #311 to make the command <PRE> resizeable and to add job state quick links. We may re-add these improvements later by reimplementing them in the new template code.
Merge upstream 0.2.132 release (678e1f5, 2024-07-09).
…hen collecting batches and jobs. (#345) * Fix for /api/v1alpha/batches/completed, picking only ROOT_JOB_GROUP when collecting batches and jobs. * Removing DISTINCT
* Add support for public-ip-address in dataproc * Use the same code style as previous code --------- Co-authored-by: Michael Franklin <[email protected]>
Sorry, mistakenly tried to merge upstream :facepalm:
This script automates some of the steps from https://populationgenomics.readthedocs.io/en/latest/hail.html#updating-tls-https-certificates so that they can be fetched and run as a single script.