Remove atomic dir action in fileStatus and listStatus from store class #123

Merged
merged 89 commits into from
Jul 1, 2024
89 commits
c036404
RenameHandler + DfsRenameHandler
saxenapranav May 6, 2024
ea2bef1
pathinformation; shorten prechecks for dfs case
saxenapranav May 7, 2024
870361b
prechecks for blob handler
saxenapranav May 7, 2024
c2bacc3
deleteHandler
saxenapranav May 7, 2024
0005ca9
new config: fs.azure.blob.implicit.check.enabled
saxenapranav May 7, 2024
c3372af
Merge branch 'implicitConfg' into renameDelete
saxenapranav May 7, 2024
75edc7b
Merge branch 'azureBlobClient' into renameDelete
saxenapranav May 7, 2024
f17b0b7
ListActionTaker
saxenapranav May 8, 2024
eb642bc
delete code done
saxenapranav May 8, 2024
798e23e
wip
saxenapranav May 8, 2024
8282cc4
copy progress
saxenapranav May 9, 2024
fe41cca
copy Blob impl
saxenapranav May 9, 2024
0ad1214
renameAtomicity
saxenapranav May 9, 2024
4d48fe8
listStatus and getPathStatus to do renameAtomic
saxenapranav May 9, 2024
0b9b032
test fixture
saxenapranav May 10, 2024
4ffef8f
Merge branch 'azureBlobClient' into renameDelete
saxenapranav May 13, 2024
d5acd2f
createPath, getBlobProperty in blob for meanwhile; to be removed later
saxenapranav May 13, 2024
710ca79
Merge branch 'azureBlobClient' into renameDelete
saxenapranav May 14, 2024
df84b40
no renameHandler abstract class; have blobRenameHandler from the blob…
saxenapranav May 14, 2024
2c67654
deleteHandler
saxenapranav May 14, 2024
e030dc7
rename and delete internal method
saxenapranav May 14, 2024
bab4bb8
remove unnecessary class
saxenapranav May 14, 2024
0fc7ca8
createCallback and readCallback in store
saxenapranav May 14, 2024
ceb2dc7
Revert "new config: fs.azure.blob.implicit.check.enabled"
saxenapranav May 14, 2024
5661eb2
important changes for renameatomicity
saxenapranav May 15, 2024
1b3ee3f
abfsbloblease in renamehandler
saxenapranav May 15, 2024
97b6b03
client forwards; tests additions;
saxenapranav May 15, 2024
ca6feac
all current delete tests: good.
saxenapranav May 15, 2024
f96afce
rename required changes to make test run
saxenapranav May 15, 2024
ce181de
call pathStatus on nonRoot path; if src path is not there, throw IOEx…
saxenapranav May 16, 2024
2fc652e
checkParentDestination metadata test refactor
saxenapranav May 16, 2024
71f8405
producer / consumer logic brought in.
saxenapranav May 16, 2024
a865e34
added new test for implicit / producer-consumer; src changes
saxenapranav May 16, 2024
32add3b
Assert that delete operation failure should stop List producer.
saxenapranav May 17, 2024
52c1822
etag check before starting rename
saxenapranav May 17, 2024
9359ae0
tests for liststatus and getPathStatus recovery
saxenapranav May 20, 2024
e8b48f1
tests wip
saxenapranav May 20, 2024
ed6ecdc
reuse same method
saxenapranav May 20, 2024
7d574a5
Merge branch 'azureBlobClient' into renameDelete
saxenapranav May 21, 2024
896204d
blob copy idempotency check in src; added tests
saxenapranav May 21, 2024
6115a08
tests complete
saxenapranav May 21, 2024
328e097
Merge branch 'azureBlobClient' into renameDelete
saxenapranav May 21, 2024
ab31f75
new delete tests + contract working fine
saxenapranav May 21, 2024
e446e75
all tests working
saxenapranav May 22, 2024
a2cc587
tracing context of final blob operation to have op count for rename/d…
saxenapranav May 22, 2024
0f6db9d
test fixes
saxenapranav May 22, 2024
d22b99c
asf license on new src files
saxenapranav May 22, 2024
27cd964
leaseTimerTask class
saxenapranav May 27, 2024
5adb6f0
pr refactors
saxenapranav May 27, 2024
4d995a3
pr refactors
saxenapranav May 27, 2024
a50dab1
test fixed
saxenapranav May 27, 2024
91fb94a
fix test
saxenapranav May 27, 2024
747a51d
remove unwanted code for test
saxenapranav May 28, 2024
20155f3
callbacks to have tracingContext
saxenapranav May 28, 2024
07dfbdb
correction
saxenapranav May 28, 2024
8756887
added javadocs
saxenapranav May 28, 2024
3ec178b
remove unwanted code
saxenapranav May 28, 2024
7fda902
preRename to happen only on directory rename
saxenapranav May 28, 2024
21d0e4f
changes for new integration
saxenapranav Jun 5, 2024
29c3c41
fix for unicode test
saxenapranav Jun 5, 2024
e134292
integ issue resolved
saxenapranav Jun 5, 2024
2e4fb2d
Merge branch 'wasbDeprecation_Dev' into renameDelete
saxenapranav Jun 10, 2024
aa9866b
assumption in testProducerStopOnRenameFailure
saxenapranav Jun 10, 2024
c7099db
queue public ops should be synced; minor refactors. (#114)
saxenapranav Jun 12, 2024
d0cb5b9
correction of test of testDeleteIdempotencyTriggerHttp404 -> not to r…
saxenapranav Jun 13, 2024
a961ab1
Merge branch 'azureBlobClient' into renameDelete
saxenapranav Jun 13, 2024
2792325
remove pathUtils
saxenapranav Jun 13, 2024
182967e
take action on renamePending json from client abstract method
saxenapranav Jun 13, 2024
a705138
getClient to get mocking correct in the test
saxenapranav Jun 13, 2024
736b32a
to return BLOB_PATH_NOT_FOUND is already added in the azureBlobClient…
saxenapranav Jun 13, 2024
652c15b
review comments
saxenapranav Jun 14, 2024
5303166
removed callbacks; test fixes;
saxenapranav Jun 17, 2024
ec869ee
added asf and javadocs on RenameAtomicityTestUtils
saxenapranav Jun 17, 2024
6f26b81
remove overwrite overload
saxenapranav Jun 17, 2024
8c5af2e
javadocs
saxenapranav Jun 18, 2024
ebde8d9
Merge branch 'azureBlobClient' into renameDelete
saxenapranav Jun 18, 2024
61da75d
Rename delete review comments. (#120)
saxenapranav Jun 21, 2024
5f92b37
Merge branch 'wasbDepCodeReview' into renameDelete
saxenapranav Jun 21, 2024
8cb4c6f
consume JsonProcessingException if json string is invalid
saxenapranav Jun 24, 2024
dd6f5dc
Merge branch 'renameDelete' of github.com:ABFSDriver/AbfsHadoop into …
saxenapranav Jun 24, 2024
5afeaef
import checks
saxenapranav Jun 24, 2024
e816e96
remove casting from ListActionTaker for client, it s going to be Abfs…
saxenapranav Jun 24, 2024
7b53575
nits refactors;
saxenapranav Jun 25, 2024
d6de0c4
Merge branch 'wasbDepCodeReview' into renameDelete
saxenapranav Jun 25, 2024
603caea
refactors
saxenapranav Jun 25, 2024
52bcd11
release the lease when the rename is not fully successful
saxenapranav Jun 25, 2024
0253dc2
Merge branch 'wasbDepCodeReview' into renameDelete
saxenapranav Jun 30, 2024
fd85f5c
Remove atomic dir action in fileStatus and listStatus from store clas…
saxenapranav Jul 1, 2024
4c9c55c
javadocs nit correction
saxenapranav Jul 1, 2024
Original file line number Diff line number Diff line change
@@ -1195,8 +1195,6 @@ public FileStatus getFileStatus(final Path path,

perfInfo.registerSuccess(true);

-getClient().takeGetPathStatusAtomicRenameKeyAction(path, tracingContext);

return new VersionedFileStatus(
transformedOwner,
transformedGroup,
@@ -1313,26 +1311,20 @@ public String listStatus(final Path path, final String startFrom,
Path entryPath = new Path(File.separator + entry.name());
entryPath = entryPath.makeQualified(this.uri, entryPath);

-final boolean actionTakenOnRenamePendingJson
-= getClient().takeListPathAtomicRenameKeyAction(entryPath,
-(int) contentLength,
-tracingContext);
-if (!actionTakenOnRenamePendingJson) {
-fileStatuses.add(
-new VersionedFileStatus(
-owner,
-group,
-fsPermission,
-hasAcl,
-contentLength,
-isDirectory,
-1,
-blockSize,
-lastModifiedMillis,
-entryPath,
-entry.eTag(),
-encryptionContext));
-}
+fileStatuses.add(
+new VersionedFileStatus(
+owner,
+group,
+fsPermission,
+hasAcl,
+contentLength,
+isDirectory,
+1,
+blockSize,
+lastModifiedMillis,
+entryPath,
+entry.eTag(),
+encryptionContext));
}

perfInfo.registerSuccess(true);
Original file line number Diff line number Diff line change
@@ -55,6 +55,7 @@
import org.apache.hadoop.fs.azurebfs.AbfsConfiguration;
import org.apache.hadoop.fs.azurebfs.AzureBlobFileSystemStore;
import org.apache.hadoop.fs.azurebfs.constants.AbfsHttpConstants;
+import org.apache.hadoop.fs.azurebfs.constants.FSOperationType;
import org.apache.hadoop.fs.azurebfs.constants.HttpHeaderConfigurations;
import org.apache.hadoop.fs.azurebfs.constants.HttpQueryParams;
import org.apache.hadoop.fs.azurebfs.contracts.exceptions.AbfsInvalidChecksumException;
@@ -602,6 +603,7 @@ public AbfsRestOperation listPath(final String relativePath, final boolean recur
requestHeaders);

op.execute(tracingContext);
+fixAtomicEntriesInListResults(op, tracingContext);
if (isEmptyListResults(op.getResult()) && is404CheckRequired) {
// If the list operation returns no paths, we need to check if the path is a file.
// If it is a file, we need to return the file in the list.
@@ -623,6 +625,34 @@ public AbfsRestOperation listPath(final String relativePath, final boolean recur
return op;
}

+private void fixAtomicEntriesInListResults(final AbfsRestOperation op,
+final TracingContext tracingContext) throws AzureBlobFileSystemException {
+/*
+* Crashed HBase log rename recovery is done by Filesystem.getFileStatus and
+* Filesystem.listStatus.
+*/
+if (tracingContext == null
+|| tracingContext.getOpType() != FSOperationType.LISTSTATUS
+|| op == null || op.getResult() == null
+|| op.getResult().getStatusCode() != HTTP_OK) {
+return;
+}
+BlobListResultSchema listResultSchema
+= (BlobListResultSchema) op.getResult().getListResultSchema();
+if (listResultSchema == null) {
+return;
+}
+List<BlobListResultEntrySchema> filteredEntries = new ArrayList<>();
+for (BlobListResultEntrySchema entry : listResultSchema.paths()) {
+if (!takeListPathAtomicRenameKeyAction(entry.path(),
+(int) (long) entry.contentLength(), tracingContext)) {
+filteredEntries.add(entry);
+}
+}
+
+listResultSchema.withPaths(filteredEntries);
+}

private boolean isEmptyListResults(AbfsHttpOperation result) {
return result != null && result.getStatusCode() == HTTP_OK &&
result.getListResultSchema() != null &&
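
Note on the hunk above: rename-pending handling now happens while the client processes the list response, and only entries on which no action was taken stay in the result. A minimal, self-contained sketch of that filter-while-listing pattern follows. `Entry`, `filterHandled`, and the `-RenamePending.json` suffix used in `main` are hypothetical stand-ins for illustration, not the driver's actual classes or constants.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;

class ListFilterSketch {
  /** Hypothetical stand-in for one list-result entry (not the driver's schema class). */
  static final class Entry {
    final String path;
    final long contentLength;

    Entry(String path, long contentLength) {
      this.path = path;
      this.contentLength = contentLength;
    }
  }

  /**
   * Keeps the entries on which no action was taken.
   * The predicate returns true when it handled the entry (for example, by
   * redoing a pending rename); such entries are dropped from the listing.
   */
  static List<Entry> filterHandled(List<Entry> entries, Predicate<Entry> actionTaken) {
    List<Entry> kept = new ArrayList<>();
    for (Entry entry : entries) {
      if (!actionTaken.test(entry)) {
        kept.add(entry);
      }
    }
    return kept;
  }

  public static void main(String[] args) {
    List<Entry> listed = new ArrayList<>();
    listed.add(new Entry("/hbase/a", 10));
    listed.add(new Entry("/hbase/b-RenamePending.json", 42)); // assumed suffix
    listed.add(new Entry("/hbase/c", 0));

    // Assumed recovery rule: treat any path with the suffix as handled.
    List<Entry> visible =
        filterHandled(listed, e -> e.path.endsWith("-RenamePending.json"));
    for (Entry e : visible) {
      System.out.println(e.path);
    }
  }
}
```

Building a new list instead of removing from the one being iterated avoids `ConcurrentModificationException` and preserves the server's ordering of the remaining entries.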
@@ -1107,7 +1137,18 @@ public AbfsRestOperation getPathStatus(final String path,
final TracingContext tracingContext,
final ContextEncryptionAdapter contextEncryptionAdapter)
throws AzureBlobFileSystemException {
-return this.getPathStatus(path, tracingContext, contextEncryptionAdapter, true);
+AbfsRestOperation op = this.getPathStatus(path, tracingContext,
+contextEncryptionAdapter, true);
+/*
+* Crashed HBase log-folder rename can be recovered by FileSystem#getFileStatus
+* and FileSystem#listStatus calls.
+*/
+if (tracingContext != null
+&& tracingContext.getOpType() == FSOperationType.GET_FILESTATUS
+&& op.getResult() != null && checkIsDir(op.getResult())) {
+takeGetPathStatusAtomicRenameKeyAction(new Path(path), tracingContext);
+}
+return op;
}

/**
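
Note on the hunk above: recovery is attempted only when the call originates from a `getFileStatus` operation and the path is a directory. The sketch below isolates that gate. `OpType` and `shouldRecoverOnGetStatus` are hypothetical names for illustration; the reading that other operation types skip recovery is taken from the condition in the diff, and the PR does not state the reason for it.

```java
class OpTypeGateSketch {
  /** Hypothetical subset of operation types; the driver's enum has its own values. */
  enum OpType { GET_FILESTATUS, LISTSTATUS, RENAME, DELETE }

  /**
   * Returns true only for a getFileStatus-originated call on a directory.
   * A null operation type is treated as "do not recover".
   */
  static boolean shouldRecoverOnGetStatus(OpType opType, boolean isDirectory) {
    return opType == OpType.GET_FILESTATUS && isDirectory;
  }

  public static void main(String[] args) {
    System.out.println(shouldRecoverOnGetStatus(OpType.GET_FILESTATUS, true));
    System.out.println(shouldRecoverOnGetStatus(OpType.RENAME, true));
  }
}
```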
@@ -1440,9 +1481,16 @@ public boolean isAtomicRenameKey(String key) {
return isKeyForDirectorySet(key, azureAtomicRenameDirSet);
}

-@Override
+/**
+* Action to be taken when atomic-key is present on a getPathStatus path.
+*
+* @param path path of the pendingJson for the atomic path.
+* @param tracingContext tracing context.
+*
+* @throws AzureBlobFileSystemException server error or the path is renamePending json file and action is taken.
+*/
public void takeGetPathStatusAtomicRenameKeyAction(final Path path,
-final TracingContext tracingContext) throws IOException {
+final TracingContext tracingContext) throws AzureBlobFileSystemException {
if (path == null || path.isRoot() || !isAtomicRenameKey(path.toUri().getPath())) {
return;
}
@@ -1494,10 +1542,19 @@ public void takeGetPathStatusAtomicRenameKeyAction(final Path path,
}
}

-@Override
+/**
+* Action to be taken when atomic-key is present on a listPath path.
+*
+* @param path path of the pendingJson for the atomic path.
+* @param renamePendingJsonLen length of the pendingJson file.
+* @param tracingContext tracing context.
+*
+* @return true if action is taken.
+* @throws AzureBlobFileSystemException server error
+*/
public boolean takeListPathAtomicRenameKeyAction(final Path path,
final int renamePendingJsonLen, final TracingContext tracingContext)
-throws IOException {
+throws AzureBlobFileSystemException {
if (path == null || path.isRoot() || !isAtomicRenameKey(
path.toUri().getPath()) || !path.toUri()
.getPath()
Original file line number Diff line number Diff line change
@@ -1222,31 +1222,6 @@ public boolean isMetricCollectionEnabled() {
return isMetricCollectionEnabled;
}

-/**
-* Action to be taken when atomic-key is present on a getPathStatus path.
-*
-* @param path path of the pendingJson for the atomic path.
-* @param tracingContext tracing context.
-*
-* @throws IOException server error or the path is renamePending json file and action is taken.
-*/
-public abstract void takeGetPathStatusAtomicRenameKeyAction(final Path path,
-final TracingContext tracingContext) throws IOException;
-
-/**
-* Action to be taken when a pendingJson is child of an atomic-key listing.
-*
-* @param path path of the pendingJson for the atomic path.
-* @param renamePendingJsonLen length of the json file
-* @param tracingContext tracing context.
-*
-* @return if path is atomicRenameJson and action is taken.
-*
-* @throws IOException server error
-*/
-public abstract boolean takeListPathAtomicRenameKeyAction(final Path path,
-final int renamePendingJsonLen, final TracingContext tracingContext) throws IOException;

class TimerTaskImpl extends TimerTask {
TimerTaskImpl() {
runningTimerTask = this;
Original file line number Diff line number Diff line change
@@ -1329,18 +1329,6 @@ public String decodeAttribute(byte[] value) throws UnsupportedEncodingException
return new String(value, XMS_PROPERTIES_ENCODING_ASCII);
}

-@Override
-public void takeGetPathStatusAtomicRenameKeyAction(final Path path,
-final TracingContext tracingContext) throws IOException {
-
-}
-
-@Override
-public boolean takeListPathAtomicRenameKeyAction(final Path path,
-final int renamePendingJsonLen, final TracingContext tracingContext) throws IOException {
-return false;
-}

private String convertXmsPropertiesToCommaSeparatedString(final Map<String,
String> properties) throws CharacterCodingException {
StringBuilder commaSeparatedProperties = new StringBuilder();
Original file line number Diff line number Diff line change
@@ -105,7 +105,7 @@ int getMaxConsumptionParallelism() {
/**
* Orchestrates the rename operation.
*/
-public AbfsClientRenameResult execute() throws IOException {
+public AbfsClientRenameResult execute() throws AzureBlobFileSystemException {
PathInformation pathInformation = new PathInformation();
boolean result = false;
if (preCheck(src, dst, pathInformation)) {
@@ -158,7 +158,7 @@ public AbfsClientRenameResult execute() throws IOException {
}
}

-private boolean finalSrcRename() throws IOException {
+private boolean finalSrcRename() throws AzureBlobFileSystemException {
tracingContext.setOperatedBlobCount(operatedBlobCount.get() + 1);
try {
return renameInternal(src, dst);
@@ -168,8 +168,7 @@ private boolean finalSrcRename() throws IOException {
}

@VisibleForTesting
-public RenameAtomicity getRenameAtomicity(final PathInformation pathInformation)
-throws IOException {
+public RenameAtomicity getRenameAtomicity(final PathInformation pathInformation) {
return new RenameAtomicity(src,
dst,
new Path(src.getParent(), src.getName() + RenameAtomicity.SUFFIX),
@@ -515,8 +514,8 @@ private PathInformation getPathInformation(Path path,
TracingContext tracingContext)
throws AzureBlobFileSystemException {
try {
-AbfsRestOperation op = abfsClient.getPathStatus(path.toString(), false,
-tracingContext, null);
+AbfsRestOperation op = abfsClient.getPathStatus(path.toString(),
+tracingContext, null, true);

return new PathInformation(true,
abfsClient.checkIsDir(op.getResult()),
Original file line number Diff line number Diff line change
@@ -33,6 +33,7 @@
import org.apache.hadoop.classification.VisibleForTesting;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.fs.azurebfs.AzureBlobFileSystem;
+import org.apache.hadoop.fs.azurebfs.contracts.exceptions.AbfsDriverException;
import org.apache.hadoop.fs.azurebfs.contracts.exceptions.AbfsRestOperationException;
import org.apache.hadoop.fs.azurebfs.contracts.exceptions.AzureBlobFileSystemException;
import org.apache.hadoop.fs.azurebfs.contracts.services.AppendRequestParameters;
@@ -122,7 +123,7 @@ public RenameAtomicity(final Path renameJsonPath,
/**
* Redo the rename operation from the JSON file.
*/
-public void redo() throws IOException {
+public void redo() throws AzureBlobFileSystemException {
byte[] buffer = readRenamePendingJson(renameJsonPath, renamePendingJsonLen);
String contents = new String(buffer, Charset.defaultCharset());
try {
@@ -203,15 +204,15 @@ void createRenamePendingJson(Path path, byte[] bytes)
* @return Length of the JSON file.
*/
@VisibleForTesting
-public int preRename() throws IOException {
+public int preRename() throws AzureBlobFileSystemException {
String makeRenamePendingFileContents = makeRenamePendingFileContents(
srcEtag);

try {
createRenamePendingJson(renameJsonPath,
makeRenamePendingFileContents.getBytes(StandardCharsets.UTF_8));
return makeRenamePendingFileContents.length();
-} catch (IOException e) {
+} catch (AzureBlobFileSystemException e) {
/*
* Scenario: file has been deleted by parallel thread before the RenameJSON
* could be written and flushed. In such case, there has to be one retry of
@@ -246,7 +247,7 @@ private boolean isPreRenameRetriableException(IOException e) {
return false;
}

-public void postRename() throws IOException {
+public void postRename() throws AzureBlobFileSystemException {
deleteRenamePendingJson();
}

@@ -272,14 +273,17 @@ private void deleteRenamePendingJson() throws AzureBlobFileSystemException {
* @return JSON string which represents the operation.
*/
private String makeRenamePendingFileContents(String eTag) throws
-JsonProcessingException {
+AzureBlobFileSystemException {

final RenamePendingJsonFormat renamePendingJsonFormat = new RenamePendingJsonFormat();
renamePendingJsonFormat.setOldFolderName(src.toUri().getPath());
renamePendingJsonFormat.setNewFolderName(dst.toUri().getPath());
renamePendingJsonFormat.setETag(eTag);

-return objectMapper.writeValueAsString(renamePendingJsonFormat);
+try {
+return objectMapper.writeValueAsString(renamePendingJsonFormat);
+} catch (JsonProcessingException e) {
+throw new AbfsDriverException(e);

Review comment (Collaborator):
nit: Is this new exception type needed?

+}
}

/**
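
Note on the hunk above: the change catches the serializer's checked exception and rethrows it as a driver exception, which lets the method declare a single exception family. A minimal sketch of that wrap-and-rethrow pattern follows. `DriverException`, `EncodingException`, and `Serializer` are hypothetical stand-ins so the example compiles without Jackson or the ABFS classes; they are not the types used in the PR.

```java
import java.io.IOException;

class ExceptionWrapSketch {
  /** Hypothetical domain exception, standing in for a driver-level exception type. */
  static class DriverException extends IOException {
    DriverException(Throwable cause) {
      super(cause);
    }
  }

  /** Hypothetical checked exception thrown by a serialization library. */
  static class EncodingException extends Exception {
    EncodingException(String message) {
      super(message);
    }
  }

  /** Hypothetical serializer interface. */
  interface Serializer {
    String write(Object value) throws EncodingException;
  }

  /**
   * Serializes the value and converts the library-specific failure into the
   * domain exception, keeping the original as the cause for diagnostics.
   */
  static String serialize(Object value, Serializer serializer) throws DriverException {
    try {
      return serializer.write(value);
    } catch (EncodingException e) {
      throw new DriverException(e);
    }
  }

  public static void main(String[] args) throws DriverException {
    System.out.println(serialize("payload", v -> "\"" + v + "\""));
  }
}
```

Callers then handle one exception hierarchy, and the original failure remains reachable through `getCause()`.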
Original file line number Diff line number Diff line change
@@ -294,6 +294,10 @@ public String getPosition() {
return position;
}

+public FSOperationType getOpType() {
+return opType;
+}

/**
* Sets the ingress handler.
*