-
Notifications
You must be signed in to change notification settings - Fork 0
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
TSPS-324 Add download command #12
Conversation
terralab/.terralab-cli-config
Outdated
@@ -1,5 +1,6 @@ | |||
# Teaspoons service API URL | |||
TEASPOONS_API_URL=https://tsps.dsde-dev.broadinstitute.org |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
oops revert this
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🎉 with some questions/suggestions
from terralab.logic import pipelines_logic | ||
|
||
LOGGER = logging.getLogger(__name__) | ||
|
||
|
||
@click.command() | ||
@click.argument("pipeline_name") | ||
@click.command(short_help="Submit a pipeline run") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
should there be a short help for all of our commands?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
possibly, it might make things clearer for developers. the way the help text works is that the top-level help output (what you see when you run terralab -h
) comes from the docstrings, but if the docstrings are long or multiline, it gets cut off in sometimes arbitrary places with ...
at the end. so adding the short_help here lets you enforce what shows up there, and the longer docstring is used for the command-specific -h
output. so I think I'd only added it in places where the top-level help output was getting truncated weirdly, but we could just add it everywhere if we prefer. what do you think?
"""Submit a pipeline run""" | ||
"""Submit a pipeline run | ||
|
||
PIPELINE_NAME is the name of the pipeline to run""" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
curious why this was added, where is PIPELINE_NAME
coming from?
If you're going to want to add info for the parameters of a function we should follow one of the many patterns that exist
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it comes from the fact that when you run e.g. terralab submit -h
the help text begins with a usage hint that looks like Usage: terralab submit [OPTIONS] PIPELINE_NAME
, so somewhere in the click documentation there was a suggestion to explain the PIPELINE_NAME
field in this help text.
that said, we can definitely choose a more conventional pattern to follow, like perhaps
Parameters
----------
PIPELINE_NAME : str
The name of the pipeline to run
I don't have strong feelings about this, let me know what you think
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
currently the output of terralab download -h
is
✗ terralab download -h
Usage: terralab download [OPTIONS] PIPELINE_NAME JOB_ID
Download all output files from a pipeline run
PIPELINE_NAME is the name of the pipeline that was run
JOB_ID is the uuid identifier for the pipeline run
Options:
-d, --local_destination DIRECTORY
optional location to download results to.
defaults to the current directory.
-h, --help Show this message and exit.
i.e. click doesn't describe the arguments, so if we want descriptions we need to add them ourselves. I'm all for making it more consistent with the "Options" formatting, though, like:
✗ terralab download -h
Usage: terralab download [OPTIONS] PIPELINE_NAME JOB_ID
Download all output files from a pipeline run
PIPELINE_NAME the name of the pipeline that was run
JOB_ID the uuid identifier for the pipeline run
Options:
-d, --local_destination DIRECTORY
optional location to download results to.
defaults to the current directory.
-h, --help Show this message and exit.
unfortunately I don't think we'll get the indentation/spacing to match up exactly
help="optional location to download results to. defaults to the current directory.", | ||
) | ||
@handle_api_exceptions | ||
def download(pipeline_name: str, job_id: uuid.UUID, local_destination: str): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
should job_id be type str
here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
thanks, yes, good catch. are you making the other uuid->str changes in your PR or should I make them here?
@@ -75,3 +82,76 @@ def upload_file_with_signed_url(local_file_path, signed_url): | |||
except Exception as e: | |||
LOGGER.error(add_blankline_after(f"Error uploading file: {e}")) | |||
exit(1) | |||
|
|||
|
|||
class SignedUrlDownload: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
oooo nice, would be nice to have a descripton of what the class does/is used for
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
good call, adding!
## download action | ||
|
||
|
||
def get_result_and_download_pipeline_run_outputs( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
will we eventually have a status command in the CLI that will just try to grab the status of the pipeline run or will it just be this download call?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes, will add a CLI status command next
) -> Optional[str]: | ||
job_status: str = pipeline_run_response.job_report.status | ||
if job_status == "SUCCEEDED": | ||
return |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this might wanna raise
if it's meant to handle non success statuses and its getting a success status
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
cool, also renamed the function to be a bit more descriptive
terralab/utils.py
Outdated
def download_with_pbar(download: SignedUrlDownload) -> str: | ||
"""Helper function to take a SignedUrlDownload object and perform a download with a progress bar. | ||
Return the local file path of the downloaded file.""" | ||
download_block_size = 1024 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
what makes this the optimal number? (and should it be configurable?)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it was in an example I'd found, so no justification there. it looks like 8192
is the accepted number: https://stackoverflow.com/questions/48719893/why-is-the-block-size-for-python-httplibs-reads-hard-coded-as-8192-bytes
doesn't seem like it should be configurable, but let me know if you disagree?
Quality Gate passedIssues Measures |
Description
Add
terralab download
command. Checks that the requested pipeline run's status is SUCCEEDED, then downloads all the output files in parallel.How it looks mid-download (in this screenshot the first two files have already completed):
How it looks post download:
Note: created TSPS-378 to address the newline weirdness in above screenshot.
Help menus:
Jira Ticket
https://broadworkbench.atlassian.net/browse/TSPS-324