CRT engine fails to read objects larger than the window size with localstack #508
@unredundant Thanks for the detailed report. I'm not able to recreate this locally.
val fileText = javaClass.classLoader.getResource("aws/sdk/kotlin/example/shrek.txt")?.readText() ?: error("expected shrek script")
val fileContent = fileText.encodeToByteArray()
val fileKey = "swamp"
S3Client.fromEnvironment { region = "us-east-2" }.use { s3 ->
println("original content length: ${fileContent.size}")
val putResp = s3.putObject {
bucket = "<my-bucket>"
key = fileKey
body = ByteStream.fromBytes(fileContent)
}
println(putResp)
s3.getObject(GetObjectRequest { bucket = "<my-bucket>"; key = fileKey }) { resp ->
println("content length: ${resp.body?.contentLength}")
val bs = ByteStream.fromString(resp.body!!.decodeToString())
println("reconstituted content length: ${bs.contentLength}")
}
}
All of these print out a content length of
I'd be interested to see what the implementation of
Also, a couple of friendly pointers:
|
Thanks for the quick response, really appreciate it :)
Hmmm, so here is my current quick and dirty implementation that is causing problems.
@Service
class S3FileStorageService @Autowired constructor(val client: S3Client) : IFileStorageService {
// todo encryption?
override suspend fun uploadFile(key: String, stream: InputStream): Boolean {
val body = ByteStream.fromBytes(stream.readBytes())
val result = client.putObject {
this.bucket = "testerino"
this.key = key
this.body = body
}
return true
}
override suspend fun downloadFile(key: String): ByteStream = client.getObject(GetObjectRequest.invoke {
this.bucket = "testerino"
this.key = key
}) {
it.body!!
}
}
I've tried a couple of misc modifications, like switching the return type of
What's odd is that I would assume that if my response was being closed mid-stream, I would get non-deterministic results, but I get this exact character count on every run 🤔
Yep, I pretty much only have these cuz I was going crazy trying to figure out exactly where this truncation is happening :') definitely room to streamline, but none of them should really be causing this truncation correct?
Good call! I copy pasta'd that interface from an old synchronous impl. will definitely change that |
I might try this against an actual s3 bucket, I'm wondering if there is some possible localstack weirdness going on? |
Agreed that is odd if that is what is happening, although you could always add a delay somewhere and see if your results change a bit.
override suspend fun downloadFile(key: String): ByteStream = client.getObject(GetObjectRequest.invoke {
this.bucket = "testerino"
this.key = key
}) {
it.body!!
}
Either way this is definitely an issue. The response body is only valid inside the given closure, so any attempt to read it outside the closure is invalid. You said you tried
It wouldn't hurt to rule it out. Let us know what you find, very interested to understand what is going on here. |
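To make the closure rule above concrete, here is a toy model in plain Kotlin. It is only a sketch: `ScopedBody` and `withBody` are hypothetical names invented for illustration and are not SDK types. The toy fails loudly when the body is read too late, whereas a real client may behave differently (for example, by returning partial data).

```kotlin
// Toy model only: ScopedBody and withBody are hypothetical names, not SDK types.
class ScopedBody(private val data: ByteArray) {
    private var released = false

    // Reading is only legal while the owning scope is still active.
    fun readAll(): ByteArray {
        check(!released) { "body was read after its scope ended" }
        return data.copyOf()
    }

    fun release() {
        released = true
    }
}

// Mimics the shape of a closure-scoped call: the body is released when block returns.
fun <T> withBody(data: ByteArray, block: (ScopedBody) -> T): T {
    val body = ScopedBody(data)
    try {
        return block(body)
    } finally {
        body.release()
    }
}

fun main() {
    val payload = "hello swamp".encodeToByteArray()

    // Safe: the bytes are copied out while the scope is still active.
    val bytes = withBody(payload) { it.readAll() }
    println(bytes.decodeToString()) // hello swamp

    // Unsafe: the body escapes the scope and is read afterwards.
    val leaked = withBody(payload) { it }
    println(runCatching { leaked.readAll() }.isFailure) // true
}
```

The design point is that the closure should return the consumed data (bytes, a string, a file), never the stream itself.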
So, I'm still waiting on company SRE to get me a real bucket with perms to test against, but in the meantime, I went ahead and cut out all of this code into a demo repo. I've confirmed that I get the same error as in my actual work repo. https://github.com/unredundant/kt-aws-s3-bug-demo
I wanted to try it on my personal machine, but I have an M1 Mac and I was experiencing the same error as this report #473, so I figured I'd just push it :) Would be really curious to hear if you get the same result in the repo I posted |
Ok, I haven't cracked the code yet, but a little debugging has led me down an interesting path. The content length of the file is
So I don't think the problem is that the connection is closing ahead of the read, but rather that something is causing |
Interesting. So 16384 is the default window size of the underlying
availableForRead is just the amount that is immediately available for read without suspending. I'd be interested to know what
I'll try and take a look at this as well and see if I can reproduce. |
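The distinction between "immediately available" and "total length" can be sketched with plain `java.io` streams. This is an analogy only, not the SDK's read channel: `WindowedSource` is a hypothetical stand-in that hands out at most one window per read, and the 16384 constant is taken from the window size mentioned in this comment.

```kotlin
import java.io.ByteArrayInputStream
import java.io.ByteArrayOutputStream
import java.io.InputStream

// Assumption for illustration: 16384 matches the window size mentioned in the thread.
const val WINDOW_SIZE = 16_384

// Hypothetical stand-in for a flow-controlled source: at most one window per read.
class WindowedSource(data: ByteArray) : InputStream() {
    private val inner = ByteArrayInputStream(data)

    override fun read(): Int = inner.read()

    override fun read(b: ByteArray, off: Int, len: Int): Int =
        inner.read(b, off, minOf(len, WINDOW_SIZE))

    // Only what can be read right now, not the total remaining length.
    override fun available(): Int = minOf(inner.available(), WINDOW_SIZE)
}

// Buggy pattern: treats available() as the full length.
fun readAvailableOnly(source: InputStream): ByteArray {
    val buffer = ByteArray(source.available())
    val n = source.read(buffer)
    return if (n <= 0) ByteArray(0) else buffer.copyOf(n)
}

// Correct pattern: keep reading until end of stream is signalled.
fun readUntilEof(source: InputStream): ByteArray {
    val out = ByteArrayOutputStream()
    val chunk = ByteArray(4096)
    while (true) {
        val n = source.read(chunk)
        if (n < 0) break
        out.write(chunk, 0, n)
    }
    return out.toByteArray()
}

fun main() {
    val data = ByteArray(40_000) { (it % 251).toByte() }
    println(readAvailableOnly(WindowedSource(data)).size) // 16384
    println(readUntilEof(WindowedSource(data)).size)      // 40000
}
```

A reader that stops after the first window sees exactly one window's worth of bytes on every run, which is consistent with a deterministic truncated length.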
So, it looks like
And then looking at the available segments,
Also, just another weird little thing I noticed: you guys have a check in place
private suspend fun readRemainingSuspend(buffer: SdkByteBuffer, limit: Int): ByteArray {
check(currSegment.value == null) { "current segment should be drained already" }
// ...
that references the field, which is strange because currSegment itself isn't nullable... if it were, wouldn't the definition need to be |
I'm not sure what the debugger is doing but the code itself is checking the inner
I was able to reproduce with your example repo, thanks for such a detailed and easy reproduction.
OK, starting to get a better picture of what is happening with some trace logging of the CRT engine.
It looks like what is happening is the window fills up and then the socket is closed for read. I believe we have a small bug where this error is not getting propagated correctly here. I believe this ought to be passing the cause in such that it reads
Indeed when I do pass the underlying
As to why this is happening I'm not sure yet but it may be with localstack. I'd be interested if you're able to test this on a real S3 bucket and still see the same thing. |
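Why a dropped cause looks like silent truncation can be shown with a small toy buffer. This is a sketch under stated assumptions: `ToyReadBuffer` is a hypothetical class written for this illustration, and it does not reproduce the SDK's or the CRT engine's actual code.

```kotlin
import java.io.IOException

// Hypothetical toy buffer, not the SDK's channel implementation.
class ToyReadBuffer {
    private val pending = ArrayDeque<ByteArray>()
    private var closed = false
    private var closeCause: Throwable? = null

    fun write(chunk: ByteArray) {
        check(!closed) { "buffer is closed" }
        pending.addLast(chunk)
    }

    // A null cause means a normal end of stream; a non-null cause means failure.
    fun close(cause: Throwable? = null) {
        closed = true
        closeCause = cause
    }

    // Drains buffered data, then rethrows the close cause if there was one.
    fun readRemaining(): ByteArray {
        check(closed) { "toy model only supports reading after close" }
        var result = ByteArray(0)
        while (pending.isNotEmpty()) result += pending.removeFirst()
        closeCause?.let { throw IOException("stream closed abnormally", it) }
        return result
    }
}

fun main() {
    val socketError = IOException("socket closed for read")

    // Cause dropped: the reader sees a short but apparently valid body.
    val silent = ToyReadBuffer().apply { write(ByteArray(16_384)); close() }
    println(silent.readRemaining().size) // 16384

    // Cause forwarded: the reader gets an exception instead of truncated data.
    val loud = ToyReadBuffer().apply { write(ByteArray(16_384)); close(socketError) }
    println(runCatching { loud.readRemaining() }.exceptionOrNull()?.cause === socketError) // true
}
```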
I don't see this behavior with the
diff --git a/file-storage/build.gradle.kts b/file-storage/build.gradle.kts
index 209ec10..7dafd6e 100644
--- a/file-storage/build.gradle.kts
+++ b/file-storage/build.gradle.kts
@@ -68,7 +74,9 @@ testing {
// AWS
implementation("com.amazonaws:aws-java-sdk-core:1.12.145") // Used ONLY for localstack credential provider
+ implementation("aws.smithy.kotlin:http-client-engine-ktor:0.7.7-SNAPSHOT")
+ implementation("com.squareup.okhttp3:okhttp:4.9.2")
// Mockk
implementation("io.mockk:mockk:1.12.1")
diff --git a/file-storage/src/testIntegration/kotlin/testerino/S3ClientConfig.kt b/file-storage/src/testIntegration/kotlin/testerino/S3ClientConfig.kt
index 79d3271..fb3e776 100644
--- a/file-storage/src/testIntegration/kotlin/testerino/S3ClientConfig.kt
+++ b/file-storage/src/testIntegration/kotlin/testerino/S3ClientConfig.kt
@@ -9,9 +9,14 @@ import org.springframework.context.annotation.Configuration
import org.springframework.context.annotation.Profile
import org.testcontainers.containers.localstack.LocalStackContainer
+import aws.smithy.kotlin.runtime.http.engine.ktor.KtorEngine
+
@Profile("integration-test")
@Configuration
class S3ClientConfig {
@Autowired
private lateinit var localstack: LocalStackContainer
@@ -19,6 +24,7 @@ class S3ClientConfig {
@Bean
fun s3Client(): S3Client = S3Client {
region = "us-east-1"
+ httpClientEngine = KtorEngine()
endpointResolver = AwsEndpointResolver { _, _ ->
AwsEndpoint(
localstack.getEndpointOverride(LocalStackContainer.Service.S3).toString()
The explicit dependency on
Full disclosure, the
Hopefully this can unblock you. I'll leave it open for now to see if anyone else has similar issues working with localstack. I'm still interested if you see this on a real bucket though, so please let us know if you see issues there. |
Can confirm that the workaround you posted works for me. Only caveat is I had to use
I will follow up with info on the real bucket attempt soon. Really appreciate the help! Very excited to see this SDK moving towards stable :) |
I was able to run this against a live s3 bucket without the Ktor engine override, so this does seem to be localstack specific |
Thanks for confirming. I'll open a ticket with the CRT team to see if they can track down anything but since it's specific to localstack no further action will be taken on this right now. |
Hi, I ran into the same issue on localstack. I had to switch to
My code is fairly simple:
suspend fun downloadToFile(keyName: String, bucketName: String? = null): File? {
val request = GetObjectRequest {
key = keyName
bucket = bucketName ?: defaultBucket
}
return try {
amazonS3Client.getObject(request) { resp ->
if (resp.body != null) {
val file = kotlin.io.path.createTempFile()
resp.body?.writeToFile(file)
file.toFile()
} else {
null
}
}
} catch (e: Exception) {
null
}
}
Calling this function from a Spring Boot controller results in a corrupted temp file (size was totally random). |
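One way to surface this kind of corruption early, instead of handing back a short file, is to compare the written size with the length the response reported. The helper below is a hypothetical sketch in plain Kotlin: `requireCompleteDownload` is not part of the SDK, and `expectedLength` is assumed to come from the response's reported content length.

```kotlin
import java.io.File
import java.io.IOException

// Hypothetical helper: expectedLength would come from the response's reported content length.
fun requireCompleteDownload(file: File, expectedLength: Long): File {
    val actual = file.length()
    if (actual != expectedLength) {
        throw IOException("incomplete download: expected $expectedLength bytes, got $actual")
    }
    return file
}

fun main() {
    val file = File.createTempFile("download", ".bin")
    file.writeBytes(ByteArray(16_384))

    // A truncated file is rejected rather than returned to the caller.
    println(runCatching { requireCompleteDownload(file, 40_000) }.isFailure) // true

    // A complete file passes through unchanged.
    println(requireCompleteDownload(file, 16_384).length()) // 16384

    file.delete()
}
```

Note that a blanket `catch (e: Exception) { null }` around the download would hide this exception too, so the check only helps if the failure is allowed to propagate or is logged.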
This is a very old issue that is probably not getting as much attention as it deserves. We encourage you to check if this is still an issue in the latest release and if you find that this is still a problem, please feel free to provide a comment or open a new issue. |
Describe the bug
Hey, this is strange, and is making me worry that I'm just doing something really dumb, but given that this SDK is in beta perhaps there is some genuine weirdness going on here.
I am trying to set up a simple read/write S3 service in Spring Boot using the Kotlin AWS SDK, and it seems like there is some really odd truncating going on with the ByteStream class, but only when the ByteStream comes from a downloaded object.
I have two very simple tests
where fileStorageService implements a very minimal interface
The weirdness is that in the first test,
result.contentLength!! shouldBeExactly ByteStream.fromString(result.decodeToString()).contentLength!!
fails, while in the second test, it succeeds (as expected).
In the first test, I get an error
Expected behavior
Decoding a byte stream and then encoding back to a byte stream should result in, if not completely identical streams, then at least streams with identical content lengths.
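The expectation above can be checked without S3 at all. The sketch below uses only the Kotlin standard library and assumes the bytes are valid UTF-8 (malformed input is replaced during decoding, which can change the length); `roundTripLength` is a name chosen for this illustration.

```kotlin
// Decodes bytes as UTF-8 and re-encodes them, returning the re-encoded length.
fun roundTripLength(bytes: ByteArray): Int =
    bytes.decodeToString().encodeToByteArray().size

fun main() {
    // Placeholder text; any well-formed string behaves the same way.
    val original = "swamp ".repeat(5_000).encodeToByteArray()
    println(original.size == roundTripLength(original)) // true
}
```

If this holds locally but fails for a downloaded body, the bytes were most likely lost before decoding rather than during it.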
Current behavior
It seems to work, except in the case that a ByteStream has been pulled from a getObject request.
Steps to Reproduce
Pretty simple, upload a file to S3 (in my case I'm using localstack to try all of this locally).
The file I'm using to test is the full transcript of Shrek, link here. This isn't a hard requirement for reproducing, but at the same time... it totally is.
Then just try to upload it, download it, and compare the decoded bytestream to the expected body.
Possible Solution
No real idea. I'm wondering if there is any possibility that it has something to do with using localstack, but I would really rather not provision an actual bucket just to compare against my failed local testing.
Context
I'm just a simple man, trying to write to his bucket and read it back.
AWS Kotlin SDK version used
0.11.0-beta
Platform (JVM/JS/Native)
JVM
Operating System and version
Mac OS Big Sur