Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix OOM lakectl FS upload bug #8349

Merged
merged 4 commits into from
Nov 7, 2024
Merged

Fix OOM lakectl FS upload bug #8349

merged 4 commits into from
Nov 7, 2024

Conversation

idanovo
Copy link
Contributor

@idanovo idanovo commented Nov 5, 2024

Closes #8088

@idanovo idanovo self-assigned this Nov 5, 2024
@idanovo idanovo added bug Something isn't working include-changelog PR description should be included in next release changelog labels Nov 5, 2024
Copy link

github-actions bot commented Nov 5, 2024

E2E Test Results - DynamoDB Local - Local Block Adapter

13 passed

Copy link

github-actions bot commented Nov 5, 2024

E2E Test Results - Quickstart

11 passed

Copy link
Contributor

@itaiad200 itaiad200 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you explain how this PR solves the linked issue?

@idanovo idanovo requested a review from arielshaqed November 6, 2024 08:17
@idanovo idanovo requested a review from Isan-Rivkin November 6, 2024 08:55
@idanovo
Copy link
Contributor Author

idanovo commented Nov 6, 2024

@itaiad200 Yes, follow me-

From the uploadObject/UploadPartObject we have this code:

	req, err := http.NewRequestWithContext(ctx, http.MethodPut, preSignURL, body)
        ...
	putResp, err := u.uploader.HTTPClient.Do(req)

the type of the body we pass to NewRequestWithContext is local.fileWrapper which implement seek but doesn't implement close
so the http.NewRequestFromContext function get to this code:

	rc, ok := body.(io.ReadCloser)
	if !ok && body != nil {
		rc = io.NopCloser(body)
	}

So we get a request from body that implements close (NopCloser) but doesn't implement seek

Now when getting to this line- putResp, err := u.uploader.HTTPClient.Do(req)
If our HTTP client is the retryable client the Do function will eventually get to this code:

	case io.ReadSeeker:
         ...
	// Read all in so we can reset
	case io.Reader:
		buf, err := io.ReadAll(body)

Now, because our body doesn't implement seek but does implement read, we will call io.ReadAll and read the entire file into the memory and crush.

Copy link
Contributor

@Isan-Rivkin Isan-Rivkin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

NICE Research!
Sometimes small solutions solve big problems :)

Copy link
Contributor

@itaiad200 itaiad200 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the clear explanation. How was this tested?

@idanovo
Copy link
Contributor Author

idanovo commented Nov 6, 2024

Thanks for the clear explanation. How was this tested?

@itaiad200 I built a docker image that runs lakectl fs upload and tries to upload a 1GB file using the --memory=512m flag.
First I saw that the bug reproduced when running it using the master branch, and then tested that my code made it work.

Copy link
Contributor

@arielshaqed arielshaqed left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wow, best!

@idanovo idanovo merged commit da13042 into master Nov 7, 2024
38 checks passed
@idanovo idanovo deleted the fix-oom-lakectl-fs-upload branch November 7, 2024 16:01
@arielshaqed
Copy link
Contributor

https://imgflip.com/i/99kzey
Corgi meme: Wow / much memory / very progress / close() / very fix

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working include-changelog PR description should be included in next release changelog
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[Bug]: lakectl fs upload Causes OOM Error During Large File Uploads
4 participants