HandleDeserialize Error from MiddleWare #2924

Closed
1 task
sathishdv opened this issue Dec 6, 2024 · 7 comments
Assignees
Labels
bug This issue is a bug. p2 This is a standard priority issue

Comments

@sathishdv

Describe the bug

While running a high load against DynamoDB with provisioned capacity, we started getting these errors:

"github.com/aws/smithy-go/transport/http.(*errorCloseResponseBodyMiddleware).HandleDeserialize(0x41d445?, {0x1d3e698?, 0xc000535d40?}, {{0x170bfa0?, 0xc000535830?}}, {0x1d2e760?, 0xc0054577c0?})" pod=pod_name time=2024-12-06T18:28:49.581311541Z
log="	/build/vendor/github.com/aws/aws-sdk-go-v2/aws/transport/http/response_error_middleware.go:31 +0x53 fp=0xc00288e588 sp=0xc00288e4e0 pc=0xd82093" pod=pod_name time=2024-12-06T18:28:49.581294711Z

This error causes pods to crash and restart continuously.

SDK Version: v1.32.6

Regression Issue

  • Select this option if this issue appears to be a regression.

Expected Behavior

This error should not happen. If it cannot be avoided, it should at least be returned as a typed error and be retryable.

Current Behavior

This error is not handled properly and causes crashes.

Reproduction Steps

We see this consistently under high load, especially after provisioned capacity has auto-scaled up. When a provisioned throughput exceeded error occurs, we retry until the operation succeeds.

Possible Solution

No response

Additional Information/Context

No response

SDK version used

v1.32.6

Environment details (Version of Go (go version)? OS name and version, etc.)

v1.32.6

@sathishdv sathishdv added bug This issue is a bug. needs-triage This issue or PR still needs to be triaged. labels Dec 6, 2024
@lucix-aws lucix-aws transferred this issue from aws/aws-sdk-go Dec 6, 2024
@RanVaknin
Contributor

RanVaknin commented Dec 6, 2024

Hi @sathishdv ,

My guess is that the service behaves unexpectedly under high load and causes a panic in that middleware. Perhaps a connection that is closed prematurely.

Can you please enable the SDK wire logs, and see if you can capture the logs before the panic?

	cfg, err := config.LoadDefaultConfig(context.TODO(), config.WithRegion("us-east-1"), config.WithClientLogMode(aws.LogRequest|aws.LogResponseWithBody))

Thanks,
Ran~

@RanVaknin RanVaknin self-assigned this Dec 6, 2024
@RanVaknin RanVaknin added response-requested Waiting on additional info and feedback. Will move to "closing-soon" in 7 days. p2 This is a standard priority issue and removed needs-triage This issue or PR still needs to be triaged. labels Dec 6, 2024
@sathishdv
Author

Hi @RanVaknin, thanks for the response. Yes, the connection seems to be closed prematurely. I can see the error being propagated, and it is logged as:

deserialization failed, failed to decode response body, read tcp <<ip_address>>:42570-><<ip_address>>:443: use of closed network connection

Is it OK to retry such errors? I believe the connection is being closed abruptly, so a retry should probably resolve it. The problem is that the SDK doesn't provide a nicer way to handle these kinds of errors/exceptions, so I have to use strings.Contains to check for keywords 😊

@github-actions github-actions bot removed the response-requested Waiting on additional info and feedback. Will move to "closing-soon" in 7 days. label Dec 8, 2024
@sathishdv
Author

Hi @RanVaknin, in fact, retrying the operation after the above error results in a "context cancelled" error. It looks like the client is unable to establish a connection, so the pods get restarted; the AWS config is then re-initialized and things work again.

@RanVaknin
Contributor

Hi @sathishdv ,

This is related to #2737

Currently in our backlog to investigate. We also got your internal support ticket.

I'm going to close this as duplicate and ask you to check the other ticket for further updates on the matter.

Thanks again,
Ran~

@RanVaknin RanVaknin closed this as not planned Dec 12, 2024

This issue is now closed. Comments on closed issues are hard for our team to see.
If you need more assistance, please open a new issue that references this one.

@sathishdv
Author

Hi @RanVaknin thanks for the update.

@eliebleton-manomano

I'm observing panics down-stack from HandleDeserialize when calling s3.ListObjectsV2 with about 15 concurrent calls. All is well without concurrency. This issue was closed as a duplicate of a DynamoDB-specific issue; I just wanted to chime in and point out that it could be endpoint-agnostic.


3 participants