Producer condition change + Rename Azcopy tests #127

Merged
merged 41 commits into from
Jul 30, 2024

saxenapranav
Collaborator

@saxenapranav saxenapranav commented Jul 9, 2024

Changes:

  1. PathInformation now records whether the path is implicit.
  2. If the rename source is implicit, create a directory before starting the rename.
  3. In AbstractAbfsIntegrationTest.java, added createMultipleAzCopyPaths, which can create multiple dirs and files with azcopy in parallel. It accepts the list of dirs and the list of file paths from the calling test.
  4. Fix the condition for when the producer can enumerate: the size of the queue should be less than the max consumption lag (default 5000).
  5. The createNonRecursive HDFS API on an atomic path has to take a lease on the parent dir.
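
The producer condition in item 4 can be illustrated with a minimal sketch. This is not the code from this PR: the class and method names below (`ListingQueueSketch`, `canProducerEnumerate`) are hypothetical, and only the rule itself (enumerate only while the queue size is less than the max consumption lag, default 5000) comes from the description above.

```java
import java.util.ArrayDeque;
import java.util.Queue;

/** Hypothetical stand-in for the listing queue shared by producer and consumers. */
final class ListingQueueSketch {
  /** Default taken from the PR description; the constant name is an assumption. */
  static final int DEFAULT_MAX_CONSUMPTION_LAG = 5000;

  private final Queue<String> pending = new ArrayDeque<>();
  private final int maxConsumptionLag;

  ListingQueueSketch(int maxConsumptionLag) {
    this.maxConsumptionLag = maxConsumptionLag;
  }

  /** The producer may list more paths only while the backlog is below the lag. */
  synchronized boolean canProducerEnumerate() {
    return pending.size() < maxConsumptionLag;
  }

  /** Called by the producer for each listed path. */
  synchronized void enqueue(String path) {
    pending.add(path);
  }

  /** Called by a consumer; returns null when nothing is pending. */
  synchronized String dequeue() {
    return pending.poll();
  }
}
```

With a lag of 2, the producer is allowed to enumerate on an empty queue, is blocked once two paths are pending, and is allowed again after a consumer drains one.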

dfs full run:


:::: AGGREGATED TEST RESULT ::::

============================================================
HNS-OAuth

[WARNING] Tests run: 147, Failures: 0, Errors: 0, Skipped: 4
[WARNING] Tests run: 740, Failures: 0, Errors: 0, Skipped: 153
[WARNING] Tests run: 414, Failures: 0, Errors: 0, Skipped: 48

============================================================
HNS-SharedKey

[ERROR] testUpdateDeepDirectoryStructureToRemote(org.apache.hadoop.fs.azurebfs.contract.ITestAbfsFileSystemContractDistCp) Time elapsed: 3.967 s <<< FAILURE!

[ERROR] testHttpReadTimeout(org.apache.hadoop.fs.azurebfs.ITestAzureBlobFileSystemE2E) Time elapsed: 6.34 s <<< ERROR!

[WARNING] Tests run: 147, Failures: 0, Errors: 0, Skipped: 5
[ERROR] Tests run: 740, Failures: 0, Errors: 1, Skipped: 105
[ERROR] Tests run: 414, Failures: 1, Errors: 0, Skipped: 35

============================================================
NonHNS-SharedKey

[ERROR] testHttpReadTimeout(org.apache.hadoop.fs.azurebfs.ITestAzureBlobFileSystemE2E) Time elapsed: 6.691 s <<< ERROR!

[WARNING] Tests run: 147, Failures: 0, Errors: 0, Skipped: 10
[ERROR] Tests run: 724, Failures: 0, Errors: 1, Skipped: 346
[WARNING] Tests run: 414, Failures: 0, Errors: 0, Skipped: 38

============================================================
AppendBlob-HNS-OAuth

[WARNING] Tests run: 147, Failures: 0, Errors: 0, Skipped: 4
[WARNING] Tests run: 740, Failures: 0, Errors: 0, Skipped: 157
[WARNING] Tests run: 414, Failures: 0, Errors: 0, Skipped: 72

Time taken: 26 mins 43 secs.

blob test run:

============================================================
NonHNS-SharedKey

[ERROR] testValidateSeekBounds(org.apache.hadoop.fs.azurebfs.ITestAzureBlobFileSystemRandomRead) Time elapsed: 2.515 s <<< FAILURE!

[WARNING] Tests run: 147, Failures: 0, Errors: 0, Skipped: 11
[ERROR] Tests run: 724, Failures: 1, Errors: 0, Skipped: 278
[WARNING] Tests run: 414, Failures: 0, Errors: 0, Skipped: 39

Time taken: 7 mins 23 secs.
azureuser@pranav-ind-vm:~/AbfsHadoop/hadoop-tools/hadoop-azure$ git log
commit 00cec1c (HEAD -> sp/azcopyTests, origin/sp/azcopyTests)
Author: Pranav Saxena <>
Date: Mon Jul 8 06:51:41 2024 -0700

test run improvement + refactor

@github-actions github-actions bot added the trunk label Jul 9, 2024
@saxenapranav
Collaborator Author

saxenapranav commented Jul 12, 2024

:::: AGGREGATED TEST RESULT ::::

============================================================
NonHNS-SharedKey

[WARNING] Tests run: 147, Failures: 0, Errors: 0, Skipped: 11
[WARNING] Tests run: 725, Failures: 0, Errors: 0, Skipped: 278
[WARNING] Tests run: 415, Failures: 0, Errors: 0, Skipped: 39

Time taken: 7 mins 59 secs.
azureuser@pranav-ind-vm:~/AbfsHadoop/hadoop-tools/hadoop-azure$ git log
commit f327af3 (HEAD -> sp/azcopyTests, origin/sp/azcopyTests)
Author: Pranav Saxena <>
Date: Thu Jul 11 23:42:36 2024 -0700

fixes

@saxenapranav
Collaborator Author

saxenapranav commented Jul 16, 2024

NonHNS-SharedKey

[ERROR] testUpdateDeepDirectoryStructureToRemote(org.apache.hadoop.fs.azurebfs.contract.ITestAbfsFileSystemContractDistCp) Time elapsed: 3.929 s <<< FAILURE!

[WARNING] Tests run: 148, Failures: 0, Errors: 0, Skipped: 11
[WARNING] Tests run: 725, Failures: 0, Errors: 0, Skipped: 278
[ERROR] Tests run: 415, Failures: 1, Errors: 0, Skipped: 39

Time taken: 6 mins 58 secs.
azureuser@pranav-ind-vm:~/AbfsHadoop/hadoop-tools/hadoop-azure$ git log
commit 9bd4eb5 (HEAD -> sp/azcopyTests, origin/sp/azcopyTests)
Author: Pranav Saxena
Date: Tue Jul 16 03:40:05 2024 -0700

test added

@saxenapranav saxenapranav changed the title Rename Azcopy tests Producer condition change + Rename Azcopy tests Jul 16, 2024
@@ -519,12 +527,13 @@ private PathInformation getPathInformation(Path path,

     return new PathInformation(true,
         abfsClient.checkIsDir(op.getResult()),
-        extractEtagHeader(op.getResult()));
+        extractEtagHeader(op.getResult()),
+        op.getResult() instanceof AbfsHttpOperation.AbfsHttpOperationWithFixedResultForGetFileStatus);
Collaborator

I didn't get the need for the instanceof check here.

Collaborator Author

getPathStatus on blobClient returns an instance of AbfsHttpOperationWithFixedResultForGetFileStatus for an implicit path, hence the instanceof check.
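
A minimal sketch of that idea, using stand-in types rather than the real ABFS classes: `HttpResultSketch`, `FixedResultForGetFileStatusSketch`, and `PathInformationSketch` are hypothetical names, and only the instanceof-based derivation of the implicit flag is taken from the diff above.

```java
/** Stand-in for a regular HTTP operation result. */
class HttpResultSketch {
}

/** Stand-in for the fixed-result type that, per the reply above, signals an implicit path. */
final class FixedResultForGetFileStatusSketch extends HttpResultSketch {
}

/** Hypothetical value object mirroring the four constructor arguments in the diff. */
final class PathInformationSketch {
  final boolean exists;
  final boolean isDirectory;
  final String eTag;
  final boolean isImplicit;

  PathInformationSketch(boolean exists, boolean isDirectory, String eTag,
      boolean isImplicit) {
    this.exists = exists;
    this.isDirectory = isDirectory;
    this.eTag = eTag;
    this.isImplicit = isImplicit;
  }

  /** Derives the implicit flag from the runtime type of the result. */
  static PathInformationSketch from(HttpResultSketch result, boolean isDirectory,
      String eTag) {
    return new PathInformationSketch(true, isDirectory, eTag,
        result instanceof FixedResultForGetFileStatusSketch);
  }
}
```

The trade-off of this approach is that the flag depends on which result type the client returns, so a caller never has to pass it explicitly.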

@saxenapranav
Collaborator Author

saxenapranav commented Jul 24, 2024

With blob endpoint:


:::: AGGREGATED TEST RESULT ::::

============================================================
NonHNS-SharedKey

[WARNING] Tests run: 148, Failures: 0, Errors: 0, Skipped: 11
[WARNING] Tests run: 727, Failures: 0, Errors: 0, Skipped: 278
[WARNING] Tests run: 415, Failures: 0, Errors: 0, Skipped: 39

Time taken: 8 mins 5 secs.
azureuser@pranav-ind-vm:~/AbfsHadoop/hadoop-tools/hadoop-azure$ git log
commit 4158fbf (HEAD -> sp/createNonRecursive, origin/sp/createNonRecursive)
Author: Pranav Saxena


Date: Mon Jul 22 23:31:31 2024 -0700

added test

@saxenapranav
Collaborator Author


:::: AGGREGATED TEST RESULT ::::

============================================================
HNS-OAuth

[WARNING] Tests run: 148, Failures: 0, Errors: 0, Skipped: 4
[WARNING] Tests run: 743, Failures: 0, Errors: 0, Skipped: 156
[WARNING] Tests run: 415, Failures: 0, Errors: 0, Skipped: 49

============================================================
HNS-SharedKey

[WARNING] Tests run: 148, Failures: 0, Errors: 0, Skipped: 5
[WARNING] Tests run: 743, Failures: 0, Errors: 0, Skipped: 108
[WARNING] Tests run: 415, Failures: 0, Errors: 0, Skipped: 36

============================================================
NonHNS-SharedKey

[WARNING] Tests run: 148, Failures: 0, Errors: 0, Skipped: 11
[WARNING] Tests run: 727, Failures: 0, Errors: 0, Skipped: 278
[WARNING] Tests run: 415, Failures: 0, Errors: 0, Skipped: 39

============================================================
AppendBlob-HNS-OAuth

[WARNING] Tests run: 148, Failures: 0, Errors: 0, Skipped: 4
[WARNING] Tests run: 743, Failures: 0, Errors: 0, Skipped: 160
[WARNING] Tests run: 415, Failures: 0, Errors: 0, Skipped: 73

Time taken: 28 mins 15 secs.
azureuser@pranav-ind-vm:~/AbfsHadoop/hadoop-tools/hadoop-azure$ git log
commit 4158fbf (HEAD -> sp/createNonRecursive, origin/sp/createNonRecursive)
Author: Pranav Saxena <>
Date: Mon Jul 22 23:31:31 2024 -0700

added test

Collaborator

@anujmodi2021 anujmodi2021 left a comment


Added some thoughts and comments

@@ -417,6 +423,10 @@ public class AbfsConfiguration{
FS_AZURE_BLOB_DIR_DELETE_MAX_THREAD, DefaultValue = DEFAULT_FS_AZURE_BLOB_DELETE_THREAD)
private int blobDeleteDirConsumptionParallelism;

@BooleanConfigurationValidatorAnnotation(ConfigurationKey =
FS_AZURE_LEASE_CREATE_NON_RECURSIVE, DefaultValue = DEFAULT_FS_AZURE_LEASE_CREATE_NON_RECURSIVE)
private boolean leaseOnCreateNonRecursive;
Collaborator

Nit: this is a boolean variable; better to rename it to isLeaseOnCreateNonRecursiveEnabled.

Collaborator Author

Taken.

final TracingContext tracingContext, final boolean isNamespaceEnabled)
throws AzureBlobFileSystemException {
return createPath(path, isFile, overwrite, permissions, isAppendBlob, eTag,
contextEncryptionAdapter, tracingContext, isNamespaceEnabled, false);
AbfsLease abfsLease = null;
Collaborator

I might be missing something...
Can you help me recall why we need this lease business now and not earlier??

Or was it just missed earlier?

Collaborator Author

Yes, it was missed. In the case of createNonRecursive on an atomic path, we have to take a lease on the parent directory.

@@ -603,7 +603,7 @@ public void deleteFilesystem(TracingContext tracingContext)
   public OutputStream createFile(final Path path,
       final FileSystem.Statistics statistics, final boolean overwrite,
       final FsPermission permission, final FsPermission umask,
-      TracingContext tracingContext) throws IOException {
+      final boolean isRecursiveCreate, TracingContext tracingContext) throws IOException {
Collaborator

The argument passed to the store is named isNonRecursiveCreate, but the parameter the store accepts is named isRecursiveCreate.
This seems confusing.

Collaborator

Seems buggy as well...

Collaborator Author

+1 on the confusion. The naming should consistently be about nonRecursiveCreate. Can you please explain what seems wrong here? :)

@@ -417,7 +426,7 @@ public FSDataOutputStream createNonRecursive(final Path f, final FsPermission pe
           + f.getName() + " because parent folder does not exist.");
     }

-    return create(f, permission, overwrite, bufferSize, replication, blockSize, progress);
+    return createInternal(f, permission, overwrite, blockSize, true);
Collaborator

It seems these parameters were not used. Should we still keep them to reduce unnecessary diffs?
No issues with removing them either.

Collaborator Author

Taken.

return createInternal(f, permission, overwrite, blockSize, false);
}

private FSDataOutputStream createInternal(final Path f,
Collaborator

It seems we are doing this refactoring only for the Blob endpoint; the new parameter is not used by the DFS client. Can we keep this handling only in AbfsBlobClient?

Let's discuss this offline if possible.

Collaborator Author

The new field only tells BlobClient whether createPath was invoked for the createNonRecursive HDFS API or for the create HDFS API. The orchestration that blob requires for createNonRecursive is done in BlobClient only; the field just propagates to the client which HDFS API invoked it.

…ave lease before checking parent existence so that it can be ensured that no parallel rename is taking place on the atomic dir. Added tests; removed unrequried code
@saxenapranav
Collaborator Author

blob non-hns:

:::: AGGREGATED TEST RESULT ::::

============================================================
NonHNS-SharedKey

[ERROR] testUpdateDeepDirectoryStructureToRemote(org.apache.hadoop.fs.azurebfs.contract.ITestAbfsFileSystemContractDistCp) Time elapsed: 4.058 s <<< FAILURE!

[WARNING] Tests run: 148, Failures: 0, Errors: 0, Skipped: 11
[WARNING] Tests run: 729, Failures: 0, Errors: 0, Skipped: 278
[ERROR] Tests run: 415, Failures: 1, Errors: 0, Skipped: 39

Time taken: 7 mins 37 secs.
azureuser@pranav-ind-vm:~/AbfsHadoop/hadoop-tools/hadoop-azure$ git log
commit c37f21d (HEAD -> sp/azcopyTests, origin/sp/azcopyTests)
Author: Pranav Saxena <>
Date: Thu Jul 25 22:00:50 2024 -0700

flow of statistic incremenet

@saxenapranav
Collaborator Author

on dfs endpoint:


:::: AGGREGATED TEST RESULT ::::

============================================================
HNS-OAuth

[WARNING] Tests run: 148, Failures: 0, Errors: 0, Skipped: 4
[WARNING] Tests run: 745, Failures: 0, Errors: 0, Skipped: 158
[WARNING] Tests run: 415, Failures: 0, Errors: 0, Skipped: 49

============================================================
HNS-SharedKey

[WARNING] Tests run: 148, Failures: 0, Errors: 0, Skipped: 5
[WARNING] Tests run: 745, Failures: 0, Errors: 0, Skipped: 110
[WARNING] Tests run: 415, Failures: 0, Errors: 0, Skipped: 36

============================================================
NonHNS-SharedKey

[WARNING] Tests run: 148, Failures: 0, Errors: 0, Skipped: 10
[WARNING] Tests run: 729, Failures: 0, Errors: 0, Skipped: 351
[WARNING] Tests run: 415, Failures: 0, Errors: 0, Skipped: 39

============================================================
AppendBlob-HNS-OAuth

[WARNING] Tests run: 148, Failures: 0, Errors: 0, Skipped: 4
[WARNING] Tests run: 745, Failures: 0, Errors: 0, Skipped: 162
[WARNING] Tests run: 415, Failures: 0, Errors: 0, Skipped: 73

Time taken: 26 mins 7 secs.
azureuser@pranav-ind-vm:~/AbfsHadoop/hadoop-tools/hadoop-azure$ git log
commit c37f21d (HEAD -> sp/azcopyTests, origin/sp/azcopyTests)
Author: Pranav Saxena <>
Date: Thu Jul 25 22:00:50 2024 -0700

flow of statistic incremenet

@@ -321,6 +320,7 @@ public AbfsRestOperation listPath(final String relativePath,
/**
* Get Rest Operation for API <a href = https://learn.microsoft.com/en-us/rest/api/storageservices/datalakestoragegen2/path/create></a>.
* Create a path (file or directory) in the current filesystem.
*
Collaborator

No changes in this file, additional changes can be removed

Collaborator Author

taken.

@@ -451,6 +456,37 @@ public abstract AbfsRestOperation createPath(final String path,
final ContextEncryptionAdapter contextEncryptionAdapter,
final TracingContext tracingContext, boolean isNamespaceEnabled) throws AzureBlobFileSystemException;

public AbfsRestOperation createNonRecursivePath(final String pathStr,
Collaborator

Add javadocs for this method

umask),
false,
null, null, tracingContext, isNamespaceEnabled);
return createAbfsOutputStreamInstance(statistics, tracingContext,
Collaborator

Is no check for appendBlob and contextEncryptionAdapter needed for non-recursive create?

Collaborator Author

corrected.

getPathStatus(parentPath.toString(), false, tracingContext,
contextEncryptionAdapter);
} catch (AbfsRestOperationException ex) {
if (ex.getStatusCode() == HttpURLConnection.HTTP_OK) {
Collaborator

This condition is not clear: if the response is 200, why throw an exception?

Collaborator Author

It should be 404. Corrected.
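
The corrected check can be sketched as follows, under stated assumptions: `RestExceptionSketch` and `ParentStatusProbe` are hypothetical stand-ins for AbfsRestOperationException and the client's getPathStatus call, and the error message is the one used in createNonRecursive elsewhere in this PR. Only a 404 on the parent is translated into FileNotFoundException; every other status is rethrown unchanged.

```java
import java.io.FileNotFoundException;
import java.io.IOException;
import java.net.HttpURLConnection;

/** Stand-in for AbfsRestOperationException; carries only the HTTP status. */
class RestExceptionSketch extends IOException {
  private final int statusCode;

  RestExceptionSketch(int statusCode) {
    super("HTTP " + statusCode);
    this.statusCode = statusCode;
  }

  int getStatusCode() {
    return statusCode;
  }
}

/** Stand-in for the client's getPathStatus call on the parent path. */
interface ParentStatusProbe {
  void getPathStatus(String path) throws RestExceptionSketch;
}

final class ParentCheckSketch {
  private ParentCheckSketch() {
  }

  /** Throws FileNotFoundException only when the parent lookup returns 404. */
  static void ensureParentExists(ParentStatusProbe probe, String parentPath,
      String fileName) throws IOException {
    try {
      probe.getPathStatus(parentPath);
    } catch (RestExceptionSketch ex) {
      if (ex.getStatusCode() == HttpURLConnection.HTTP_NOT_FOUND) {
        throw new FileNotFoundException("Cannot create file "
            + fileName + " because parent folder does not exist.");
      }
      // Any other failure is not a missing parent, so it is surfaced as-is.
      throw ex;
    }
  }
}
```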

}
throw ex;
} finally {
abfsCounters.incrementCounter(CALL_GET_FILE_STATUS, 1);
Collaborator

And why increment the getFileStatus counter?

Collaborator Author

GetFileStatus API is getting called, and there is a test that asserts it. Hence added here.

Collaborator

I see only a GetPathStatus API call

Collaborator Author

Yes, it's for that. In trunk, the call chain is tryGetFileStatus -> getFileStatus -> store.getFileStatus -> client.getPathStatus.

isAppendBlob, eTag, contextEncryptionAdapter, tracingContext,
isNamespaceEnabled);
} finally {
abfsCounters.incrementCounter(CALL_CREATE, 1);
Collaborator

This finally block should also release the lease.

}
throw ex;
} finally {
abfsCounters.incrementCounter(CALL_GET_FILE_STATUS, 1);
Collaborator

Here also, I don't see any call to this API.

@saxenapranav
Collaborator Author

@anmolanmol1234 @anujmodi2021, thanks for all the reviews! I have made the following change for fs.createNonRecursive.

Instead of:

    final FileStatus parentFileStatus = tryGetFileStatus(parent, tracingContext);

    if (parentFileStatus == null) {
      throw new FileNotFoundException("Cannot create file "
          + f.getName() + " because parent folder does not exist.");
    }

    return create(f, permission, overwrite, bufferSize, replication, blockSize, progress);

we now do:

    try (CreateNonRecursiveCheckActionTaker actionTaker = getAbfsStore().createNonRecursivePreCheck(
        qualifiedPath, tracingContext)) {
      return create(f, permission, overwrite, bufferSize, replication,
          blockSize, progress);
    }

Here CreateNonRecursiveCheckActionTaker is a closeable object. store.createNonRecursivePreCheck forwards to client.createNonRecursivePreCheck for the endpoint-specific check. Depending on the conditions and the endpoint, the pre-check can acquire a resource such as a lease and hold it until create is done. CreateNonRecursiveCheckActionTaker keeps track of that resource and releases it (here, the lease) when it is closed.
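
The pattern described above can be sketched in isolation. This is an illustration, not the PR's implementation: `LeaseSketch`, `PreCheckHandleSketch`, and `CreateNonRecursiveSketch` are hypothetical names, and the lease is modelled as a plain callback. The point it shows is that try-with-resources releases the lease whether create returns or throws, and that a pre-check which acquired nothing closes as a no-op.

```java
import java.util.function.Supplier;

/** Stand-in for a lease held on the parent directory. */
interface LeaseSketch {
  void release();
}

/** Hypothetical counterpart of CreateNonRecursiveCheckActionTaker. */
final class PreCheckHandleSketch implements AutoCloseable {
  /** Null when the pre-check did not need to acquire anything. */
  private final LeaseSketch lease;

  PreCheckHandleSketch(LeaseSketch lease) {
    this.lease = lease;
  }

  /** Releases the held resource, if any. */
  @Override
  public void close() {
    if (lease != null) {
      lease.release();
    }
  }
}

final class CreateNonRecursiveSketch {
  private CreateNonRecursiveSketch() {
  }

  /** Runs the create call while the pre-check handle is open. */
  static String create(LeaseSketch parentLease, Supplier<String> createCall) {
    try (PreCheckHandleSketch handle = new PreCheckHandleSketch(parentLease)) {
      return createCall.get();
    }
  }
}
```

Tying the release to close() keeps the caller free of endpoint-specific cleanup: it only opens and closes the handle, and the handle decides whether there is anything to release.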

Collaborator

@anujmodi2021 anujmodi2021 left a comment


+1
Thanks for taking comments and resolving my queries.

@saxenapranav saxenapranav merged commit d3b275b into wasbDepCodeReview Jul 30, 2024
1 check passed