[lib-storage] Upload fails if application provides checksum for >5 MB file #6742

Open
3 of 4 tasks
trivikr opened this issue Dec 17, 2024 · 2 comments
Labels: bug (This issue is a bug.), p3 (This is a minor priority issue), queued (This issue is on the AWS team's backlog)

trivikr commented Dec 17, 2024

Checkboxes for prior research

Describe the bug

Upload fails if application provides checksum for >5 MB file

Regression Issue

  • Select this option if this issue appears to be a regression.

SDK version number

@aws-sdk/[email protected], @aws-sdk/[email protected]

Which JavaScript Runtime is this issue in?

Node.js

Details of the browser/Node.js/ReactNative version

All, verified in v22.11.0

Reproduction Steps

import { createReadStream, createWriteStream } from "fs";
import { createHash } from "crypto";
import { S3 } from "@aws-sdk/client-s3";
import { Upload } from "@aws-sdk/lib-storage";

const SIZE_IN_MB = 6;
const content = "helloworld";
const Key = `${content}_${SIZE_IN_MB}MB.txt`;

const SIZE_IN_BYTES = SIZE_IN_MB * 1024 * 1024;
const repetitions = Math.floor(SIZE_IN_BYTES / content.length);

const hash = createHash("sha256");
const writeStream = createWriteStream(Key);
for (let i = 0; i < repetitions; i++) {
  writeStream.write(content);
  hash.update(content);
}
writeStream.end();
await new Promise((resolve) => writeStream.on("close", resolve));

const client = new S3();
const Bucket = "test-flexible-checksums"; // Replace with your test bucket name.
const Body = createReadStream(Key);
const ChecksumSHA256 = hash.digest("base64");

const upload = new Upload({
  client,
  params: { Bucket, Key, Body, ChecksumSHA256 },
});
await upload.done();

Observed Behavior

When SIZE_IN_MB is greater than 5, the following error is thrown:

/local/home/trivikr/workspace/test/node_modules/@smithy/smithy-client/dist-cjs/index.js:835
  const response = new exceptionCtor({
                   ^

BadDigest: The SHA256 you specified did not match the calculated checksum.
    at throwDefaultError (/local/home/trivikr/workspace/test/node_modules/@smithy/smithy-client/dist-cjs/index.js:835:20)
    at /local/home/trivikr/workspace/test/node_modules/@smithy/smithy-client/dist-cjs/index.js:844:5
    at de_CommandError (/local/home/trivikr/workspace/test/node_modules/@aws-sdk/client-s3/dist-cjs/index.js:4919:14)
    at process.processTicksAndRejections (node:internal/process/task_queues:105:5)
    at async /local/home/trivikr/workspace/test/node_modules/@smithy/middleware-serde/dist-cjs/index.js:35:20
    at async /local/home/trivikr/workspace/test/node_modules/@aws-sdk/middleware-sdk-s3/dist-cjs/index.js:485:18
    at async /local/home/trivikr/workspace/test/node_modules/@smithy/middleware-retry/dist-cjs/index.js:320:38
    at async /local/home/trivikr/workspace/test/node_modules/@aws-sdk/middleware-flexible-checksums/dist-cjs/index.js:263:18
    at async /local/home/trivikr/workspace/test/node_modules/@aws-sdk/middleware-sdk-s3/dist-cjs/index.js:110:22
    at async /local/home/trivikr/workspace/test/node_modules/@aws-sdk/middleware-sdk-s3/dist-cjs/index.js:138:14 {
  '$fault': 'client',
  '$metadata': {
    httpStatusCode: 400,
    requestId: '85PD79WA0QPKN7EW',
    extendedRequestId: '1qhp9msss1ay+fS0glD9F/68M9FOXmzOsoejN/DRw/xZ/ViZs9tu1gqpg2XdglxZgQKp2Rm6uJY=',
    cfId: undefined,
    attempts: 1,
    totalRetryDelay: 0
  },
  Code: 'BadDigest',
  RequestId: '85PD79WA0QPKN7EW',
  HostId: '1qhp9msss1ay+fS0glD9F/68M9FOXmzOsoejN/DRw/xZ/ViZs9tu1gqpg2XdglxZgQKp2Rm6uJY='
}

When SIZE_IN_MB is less than or equal to 5, then no error is thrown.
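
The 5 MB boundary is consistent with the default part size in @aws-sdk/lib-storage, assumed here to be 5 MiB: a body that fits in one part appears to be sent as a single PutObject, while anything larger becomes a multipart upload. A rough sketch of that split follows; planParts and DEFAULT_PART_SIZE are hypothetical names for illustration, not SDK APIs.

```javascript
// Assumption: lib-storage's default part size is 5 MiB.
const DEFAULT_PART_SIZE = 5 * 1024 * 1024;

// Hypothetical helper: how many parts a body of the given size would need.
function planParts(sizeInBytes, partSize = DEFAULT_PART_SIZE) {
  const partCount = Math.max(1, Math.ceil(sizeInBytes / partSize));
  return { partCount, multipart: partCount > 1 };
}

console.log(planParts(5 * 1024 * 1024)); // one part, so no multipart upload
console.log(planParts(6 * 1024 * 1024)); // two parts, so multipart upload
```

Under that assumption, the 6 MB reproduction file takes the multipart path, which is where the application-provided checksum stops matching.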

Expected Behavior

No error thrown when application provides Checksum for >5 MB file.

Possible Solution

No response

Additional Information/Context

No response

trivikr commented Dec 17, 2024

This bug can be fixed once S3 allows passing x-amz-checksum-type, as described in the blog post:

The CreateMultiPartUpload API introduces a new HTTP header, x-amz-checksum-type, which lets you specify the type of checksum to use. You can choose either a full object checksum (calculated by combining the checksums of all individual parts) or a composite checksum.
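
To make the two checksum types concrete, here is a minimal sketch of how they differ for SHA-256. It assumes the S3 documentation's description of a composite checksum as a checksum of the concatenated part checksums with a "-<part count>" suffix. The function names are hypothetical, and the part sizes simply mirror the 6 MB example.

```javascript
import { createHash } from "node:crypto";

const sha256 = (data) => createHash("sha256").update(data).digest();

// Full-object checksum: a single digest over all bytes of the object.
function fullObjectChecksum(parts) {
  return sha256(Buffer.concat(parts)).toString("base64");
}

// Composite checksum (assumed format): digest of the concatenated raw
// part digests, followed by "-" and the number of parts.
function compositeChecksum(parts) {
  const partDigests = parts.map((part) => sha256(part));
  const digest = sha256(Buffer.concat(partDigests)).toString("base64");
  return `${digest}-${parts.length}`;
}

// Two parts, sized like a 6 MiB upload with 5 MiB parts.
const parts = [
  Buffer.alloc(5 * 1024 * 1024, "a"),
  Buffer.alloc(1 * 1024 * 1024, "b"),
];
console.log("full object:", fullObjectChecksum(parts));
console.log("composite:  ", compositeChecksum(parts));
```

The full-object digest matches neither the composite value nor any individual part digest, which is presumably why a checksum computed over the whole file is rejected with BadDigest once the upload is split into parts.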

trivikr commented Dec 17, 2024

While we wait for the model to be updated to allow passing x-amz-checksum-type, there are two workarounds.

Workaround 1: Let SDK compute the checksum

For >5 MB files, pass ChecksumAlgorithm="SHA256" instead of passing the checksum value in ChecksumSHA256.
The SDK will then compute the checksum for each part.

Test code

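import { NodeHttpHandler } from "@smithy/node-http-handler";

// Request handler that logs checksum-related request and response headers.
class CustomHandler extends NodeHttpHandler {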
  constructor() {
    super();
  }

  printChecksumHeaders(prefix, headers) {
    for (const [header, value] of Object.entries(headers)) {
      if (
        header.startsWith("x-amz-checksum-") ||
        header.startsWith("x-amz-sdk-checksum-")
      ) {
        console.log(`${prefix}['${header}']: '${value}'`);
      }
    }
  }

  async handle(request, options) {
    const response = await super.handle(request, options);
    console.log();
    console.log("------------------");
    this.printChecksumHeaders("request", request.headers);
    this.printChecksumHeaders("response", response.response.headers);
    console.log("------------------");
    console.log();
    return response;
  }
}

const client = new S3({ requestHandler: new CustomHandler() });
const Bucket = "test-flexible-checksums"; // Replace with your test bucket name.
const Body = createReadStream(Key);
const ChecksumAlgorithm = "SHA256";

const upload = new Upload({
  client,
  params: { Bucket, Key, Body, ChecksumAlgorithm },
});
await upload.done();

Note that the SDK sends the 6 MB file in two parts and computes a checksum for each part:

------------------
request['x-amz-checksum-algorithm']: 'SHA256'
response['x-amz-checksum-algorithm']: 'SHA256'
response['x-amz-checksum-type']: 'COMPOSITE'
------------------


------------------
request['x-amz-sdk-checksum-algorithm']: 'SHA256'
request['x-amz-checksum-sha256']: 'jl28wIE8B/50cPRez5JaVTYNlvVk39FWKapxP8KEu88='
response['x-amz-checksum-sha256']: 'jl28wIE8B/50cPRez5JaVTYNlvVk39FWKapxP8KEu88='
------------------


------------------
request['x-amz-sdk-checksum-algorithm']: 'SHA256'
request['x-amz-checksum-sha256']: 'JU0DrgdnLiihtY/7GHhqvAmJv50Va9RhaNLVwUUu9NU='
response['x-amz-checksum-sha256']: 'JU0DrgdnLiihtY/7GHhqvAmJv50Va9RhaNLVwUUu9NU='
------------------


------------------
------------------

Workaround 2: Use PutObject from client-s3 instead of Upload from lib-storage

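import { NodeHttpHandler } from "@smithy/node-http-handler";

// Request handler that logs checksum-related request and response headers.
class CustomHandler extends NodeHttpHandler {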
  constructor() {
    super();
  }

  printChecksumHeaders(prefix, headers) {
    for (const [header, value] of Object.entries(headers)) {
      if (
        header.startsWith("x-amz-checksum-") ||
        header.startsWith("x-amz-sdk-checksum-")
      ) {
        console.log(`${prefix}['${header}']: '${value}'`);
      }
    }
  }

  async handle(request, options) {
    const response = await super.handle(request, options);
    console.log();
    console.log("------------------");
    this.printChecksumHeaders("request", request.headers);
    this.printChecksumHeaders("response", response.response.headers);
    console.log("------------------");
    console.log();
    return response;
  }
}

const client = new S3({ requestHandler: new CustomHandler() });
const Bucket = "test-flexible-checksums"; // Replace with your test bucket name.
const Body = createReadStream(Key);

await client.putObject({ Bucket, Key, Body, ChecksumSHA256 });

The PutObject call sends the provided checksum with the request:

------------------
request['x-amz-checksum-sha256']: 'ZRRKEcEAxGazUzgqh+rSEecXfI27XNZQ8Uv7aMOX64s='
response['x-amz-checksum-sha256']: 'ZRRKEcEAxGazUzgqh+rSEecXfI27XNZQ8Uv7aMOX64s='
response['x-amz-checksum-type']: 'FULL_OBJECT'
------------------

Between the two workarounds, we recommend Workaround 1 (Upload with ChecksumAlgorithm instead of an application-provided checksum value), as Upload from @aws-sdk/lib-storage is recommended for large files.
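
If an application uploads both small and large files, one possible way to apply Workaround 1 only where it is needed is to pick the checksum parameter by size. This is a sketch under the assumption that the multipart threshold equals a 5 MiB default part size. buildChecksumParams and PART_SIZE are hypothetical names, not part of the SDK.

```javascript
// Assumption: mirrors lib-storage's default part size of 5 MiB.
const PART_SIZE = 5 * 1024 * 1024;

// Hypothetical helper: choose which checksum parameter to pass to Upload.
function buildChecksumParams(sizeInBytes, checksumSHA256, partSize = PART_SIZE) {
  return sizeInBytes > partSize
    ? { ChecksumAlgorithm: "SHA256" } // multipart: let the SDK hash each part
    : { ChecksumSHA256: checksumSHA256 }; // single request: send the provided value
}

console.log(buildChecksumParams(6 * 1024 * 1024, "example-digest"));
console.log(buildChecksumParams(1024, "example-digest"));
```

The result can be spread into the Upload params, for example params: { Bucket, Key, Body, ...buildChecksumParams(size, ChecksumSHA256) }, where size is the file size obtained by the application.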
