-
Notifications
You must be signed in to change notification settings - Fork 2
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Abort Download Object #80
Conversation
…nto cancel-single-download
/// Operation builders | ||
pub mod builders; | ||
/// Abstractions for responses and consuming data streams. | ||
pub mod output; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fix: Be consistent across operations. e.g. upload output module is not public but it re-exports the output type
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, updated.
/// | ||
/// Wraps potentially multiple streams of binary data into a single coherent stream. | ||
/// The data on this stream is sequenced into the correct order. | ||
#[derive(Debug)] | ||
pub struct Body { | ||
inner: UnorderedBody, | ||
pub struct Output { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
suggestion/discuss: Should this be consistent with the other operations output types (e.g. DownloadOutput
)?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, yeah, DownloadOutput makes more sense since there are different types of outputs. I have updated it.
@@ -42,35 +42,35 @@ pub struct ChunkOutput { | |||
// TODO: Do we want to expose something to yield multiple chunks in a single call, like | |||
// recv_many/collect, etc.? We can benchmark to see if we get a significant performance boost once | |||
// we have a better scheduler in place. | |||
impl Body { | |||
/// Create a new empty body | |||
impl Output { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
future: We may want to have a conversion that allows you to just access the bytes off the stream (i.e. user only cares about the contents and not any of the intermediate metadata), we can wait though as it can be added in a backwards compatible way
fn into_body(self) -> Body
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, currently, body vs. output is not very different; it's only .next()
vs. .next().data
. We can add this util in the future if we add more things to the output.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Discussed offline, reverted the name change.
} | ||
|
||
#[tokio::test(flavor = "multi_thread", worker_threads = 2)] | ||
async fn test_abort_download() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You don't have to necessarily use it here but one thing to make you aware of is you can leverage capture_test_logs
(example) which you can use to assert the logs contain certain things (e.g. to verify particular branches were hit).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good, leaving minor comments.
@@ -42,35 +42,35 @@ pub struct ChunkOutput { | |||
// TODO: Do we want to expose something to yield multiple chunks in a single call, like | |||
// recv_many/collect, etc.? We can benchmark to see if we get a significant performance boost once | |||
// we have a better scheduler in place. | |||
impl Body { | |||
/// Create a new empty body | |||
impl DownloadOutput { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Inconsistent naming: We currently have UploadOutput
UploadObjectsOutput
and DownloadObjectsOutput
, which are single structs representing the end result of an operation.
But this represents a sequence of structs.
Also, we'll eventually need something representing the end result of a download operation (where we'll put stuff like whether checksum validation happened). I would have expected that to be namedDownloadOutput
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If this continues to be a sequence of ChunkOutput
, I'm fine just continuing to call this Body
If we were going to change it to be a sequence of enums (representing, Discovery, Chunk, Final, etc), then maybe I'd name it Events
instead, but I don't think that's the plan
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, discussed offline, I have reverted the naming to body.
); | ||
} | ||
} | ||
|
||
if let Err(err) = comp_tx.send(resp).await { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is it OK that ErrorKind::OperationCancelledgets sent on
comp_tx`? That could result in MANY chunks sending their cancellation error on the chunk transmitter
Do all of these end up reported to the user via the Body/Output iterator? Or does it filter errors so the user only sees the 1st? Can we guarantee that the ACTUAL error gets reported to the user before any cancellation errors?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As soon as we encounter an error, we close the channel so user will only see the first error.
Can we guarantee that the ACTUAL error gets reported to the user before any cancellation errors?
Tokio does guarantee that "All data sent on Sender will become available on Receiver in the same order as it was sent." However, if the channel is full, all of them will be waiting at an await point. In that case (not 100% sure), the order won't be guaranteed. We will need a fair bit of complexity to work around that, i.e., maybe collect all the errors and then only send those that are not OperationCancelled
errors. Not sure if there is a simpler way to do this. @aajtodd any ideas?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we need to send operation cancelled across the channel at all?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was thinking there can be a case where only the operation cancelled error is there, but yeah, we can just not send that error at all.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A user wouldn't act on that error right? I wouldn't think so anyway, it doesn't tell them anything they don't already know if they cancelled the operation, thus it probably doesn't make sense to yield it to them.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, I have updated it to ignore OperationCancelled
errors.
Co-authored-by: Michael Graeb <[email protected]>
Issue #, if available:
#65
Description of changes:
This PR implements the drop functionality for downloads. Here are the changes:
abort
function since it closes the body. That behavior is not right becausejoin
function should wait for the download to complete instead of aborting it and then only waiting for the in-progress parts to complete. I have removedjoin
function as it didn't made much sense, customers should consume the output if they want to wait for the download to complete. If we don't close the body in thejoin
function, it may hang waiting for the body channel to clear.abort
function which cancels any in-progress work and any future work.Renamebody->output
, this makes it more generic because we are not just delivering body from it. We are delivering metadata and any error that occurs will be delivered to this channel. If there is an error, the channel will be closed and we will not get any more data from this channel. This consolidates our error reporting to a single place instead of join + body channel.By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.