Default to ubuntu for GCP and avoid key pair checking #1641
nit: get_image_info() and s/infos/info elsewhere?
Q: how do we guarantee that the ubuntu & debian tags have the same size, DEFAULT_GCP_IMAGE_GB?
The image size can be obtained using `gcloud compute images describe projects/deeplearning-platform-release/global/images/common-cu113-v20230501-ubuntu-2004-py37`, and it seems both of them have the same size of 50GB. Added a comment for the hack.
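To compare the two image sizes programmatically rather than by eye, the describe output can be requested as JSON and the `diskSizeGb` field read from it. This is a minimal sketch, assuming `--format=json` output in which `diskSizeGb` is a string; `describe_command` and `parse_disk_size_gb` are hypothetical helper names, and the sample JSON is illustrative rather than captured output.

```python
import json
from typing import List


def describe_command(image_path: str) -> List[str]:
    """Build the gcloud invocation that prints image metadata as JSON."""
    return ['gcloud', 'compute', 'images', 'describe', image_path,
            '--format=json']


def parse_disk_size_gb(describe_json: str) -> int:
    """Extract diskSizeGb from the JSON emitted by the command above."""
    return int(json.loads(describe_json)['diskSizeGb'])


# Illustrative sample, not real gcloud output.
sample = ('{"name": "common-cu113-v20230501-ubuntu-2004-py37", '
          '"diskSizeGb": "50"}')
ubuntu_size_gb = parse_disk_size_gb(sample)
```

Running the same parse on the debian image's output and comparing the two integers would turn the "both are 50GB" observation into a check.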
I'm reading get_image_size() for the first time and this seems a bit unclear. If the user passes in a custom image that does not have diskSizeGb, returning a default 50GB seems like a guess (maybe ok), rather than what the func name suggests.
All the images should have that field, but this is just a safeguard to avoid the function raising the error, as the image size check is not critical. Added a comment.
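The safeguard described here could look roughly like the following. This is a sketch, not the PR's code: the signature of `get_image_size` and the shape of `image_info` are assumptions; only the names `get_image_size`, `diskSizeGb`, `DEFAULT_GCP_IMAGE_GB` and the 50GB value come from this thread.

```python
from typing import Any, Dict

# Size shared by the default images, per the discussion above.
DEFAULT_GCP_IMAGE_GB = 50.0


def get_image_size(image_info: Dict[str, Any]) -> float:
    """Return the image's disk size in GB, or the default if it is unknown."""
    if 'diskSizeGb' not in image_info:
        # All images should have this field; the fallback only keeps a
        # non-critical size check from raising.
        return DEFAULT_GCP_IMAGE_GB
    return float(image_info['diskSizeGb'])
```

Returning the default keeps callers simple, at the cost the reviewer points out: for a custom image without the field, the result is a guess rather than a measured size.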
remnant?
Since we are using the ubuntu image now, we use `grep ubuntu-2004` to find the correct images.
Sorry, meant that
could be removed.
Ahh, good catch! Removed. Thanks!
L340-343 is out of touch w/ L344. Remove/move?
I rephrased the comments and moved them near the definition of the image tag (the module-level variable). PTAL.
Maybe consider moving this to `sky/skylet/providers/command_runner.py` after #1910 is merged?
Can we add something like the following to document intention?
Good point! Added. Thanks!
Q: since we do a `grep ray ... || ...`, this means `ray` is not found. Why is pip3 uninstall necessary? Why add `ray-cpp`?
We use `grep {{ray_version}}` to check that the correct ray version is installed on the remote VM. It is possible that another version of ray is pre-installed on the VM. For example, the ubuntu image on GCP has `ray==2.4.0` installed, which causes a `pip` problem if we directly run `pip install -U ray[default]==2.0.1` (it leaves the ray package corrupted).
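The check-then-reinstall logic can be sketched in Python as below. The real setup runs as shell commands in the templates, so this only illustrates the decision; `installed_ray_version` and `reinstall_command` are hypothetical names, and the `pip list` sample is illustrative.

```python
from typing import Optional


def installed_ray_version(pip_list_output: str) -> Optional[str]:
    """Find the version of the `ray` package in `pip list` output, if any."""
    for line in pip_list_output.splitlines():
        parts = line.split()
        if len(parts) >= 2 and parts[0].lower() == 'ray':
            return parts[1]
    return None


def reinstall_command(pip_list_output: str,
                      ray_version: str) -> Optional[str]:
    """Return a shell command that fixes the Ray install, or None if it is fine."""
    if installed_ray_version(pip_list_output) == ray_version:
        return None
    # Uninstall first so a pre-installed version is not upgraded in place.
    return ('pip3 uninstall -y ray ray-cpp && '
            f'pip3 install -U "ray[default]=={ray_version}"')


# Illustrative sample: an image that ships with a different Ray version.
sample = 'Package Version\n------- -------\nray     2.4.0\n'
cmd = reinstall_command(sample, '2.0.1')
```

With the sample above, `cmd` is the uninstall-then-install command; when the wanted version is already present, the function returns `None` and nothing is touched.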
Nice! Can we copy this comment into the j2 files? Something like "Ensure only one Ray version (which is our ray_version) is installed, regardless of if the image comes pre-installed with another Ray version."