HDFS-17496. DataNode supports more fine-grained dataset lock based on blockid. #6764

Open
wants to merge 12 commits into trunk
@@ -29,7 +29,8 @@ public interface DataNodeLockManager<T extends AutoCloseDataSetLock> {
*/
enum LockLevel {
BLOCK_POOl,
VOLUME
VOLUME,
DIR
}

/**
@@ -94,6 +94,8 @@ private String generateLockName(LockLevel level, String... resources) {
+ resources[0] + "volume lock :" + resources[1]);
}
return resources[0] + resources[1];
} else if (resources.length == 3 && level == LockLevel.DIR) {
return resources[0] + resources[1] + resources[2];
} else {
throw new IllegalArgumentException("lock level do not match resource");
}
@@ -153,7 +155,7 @@ public DataSetLockManager() {
public AutoCloseDataSetLock readLock(LockLevel level, String... resources) {
if (level == LockLevel.BLOCK_POOl) {
return getReadLock(level, resources[0]);
} else {
} else if (level == LockLevel.VOLUME){
AutoCloseDataSetLock bpLock = getReadLock(LockLevel.BLOCK_POOl, resources[0]);
AutoCloseDataSetLock volLock = getReadLock(level, resources);
volLock.setParentLock(bpLock);
@@ -162,14 +164,25 @@ public AutoCloseDataSetLock readLock(LockLevel level, String... resources) {
resources[0]);
}
return volLock;
} else {
AutoCloseDataSetLock bpLock = getReadLock(LockLevel.BLOCK_POOl, resources[0]);
AutoCloseDataSetLock volLock = getReadLock(LockLevel.VOLUME, resources[0], resources[1]);
volLock.setParentLock(bpLock);
AutoCloseDataSetLock dirLock = getReadLock(level, resources);
dirLock.setParentLock(volLock);
if (openLockTrace) {
LOG.debug("Sub lock " + resources[0] + resources[1] + resources[2] + " parent lock " +
resources[0] + resources[1]);
}
return dirLock;
}
}

@Override
public AutoCloseDataSetLock writeLock(LockLevel level, String... resources) {
if (level == LockLevel.BLOCK_POOl) {
return getWriteLock(level, resources[0]);
} else {
} else if (level == LockLevel.VOLUME) {
AutoCloseDataSetLock bpLock = getReadLock(LockLevel.BLOCK_POOl, resources[0]);
AutoCloseDataSetLock volLock = getWriteLock(level, resources);
volLock.setParentLock(bpLock);
@@ -178,6 +191,17 @@ public AutoCloseDataSetLock writeLock(LockLevel level, String... resources) {
resources[0]);
}
return volLock;
} else {
AutoCloseDataSetLock bpLock = getReadLock(LockLevel.BLOCK_POOl, resources[0]);
AutoCloseDataSetLock volLock = getReadLock(LockLevel.VOLUME, resources[0], resources[1]);
volLock.setParentLock(bpLock);
AutoCloseDataSetLock dirLock = getWriteLock(level, resources);
dirLock.setParentLock(volLock);
if (openLockTrace) {
LOG.debug("Sub lock " + resources[0] + resources[1] + resources[2] + " parent lock " +
resources[0] + resources[1]);
}
return dirLock;
}
}
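For context, a usage sketch (not part of this diff) of how a caller could take the new DIR-level locks. The class name, block pool id, storage uuid, and blockId are illustrative placeholders, and the import paths are assumed from trunk.

```java
import org.apache.hadoop.hdfs.server.common.AutoCloseDataSetLock;
import org.apache.hadoop.hdfs.server.common.DataNodeLockManager.LockLevel;
import org.apache.hadoop.hdfs.server.datanode.DataSetLockManager;
import org.apache.hadoop.hdfs.server.datanode.DatanodeUtil;

public class DirLockUsageSketch {
  public static void main(String[] args) {
    // For LockLevel.DIR the resources are {blockPoolId, storageUuid, subDirSuffix}.
    DataSetLockManager lockManager = new DataSetLockManager();
    String bpid = "BP-xxxx";        // placeholder block pool id
    String storageUuid = "DS-yyyy"; // placeholder volume storage uuid
    String subDir = DatanodeUtil.idToBlockDirSuffixName(0x41234567L);

    try (AutoCloseDataSetLock l =
             lockManager.readLock(LockLevel.DIR, bpid, storageUuid, subDir)) {
      // read the replica that lives under bpid / storageUuid / subDir
    }
    try (AutoCloseDataSetLock l =
             lockManager.writeLock(LockLevel.DIR, bpid, storageUuid, subDir)) {
      // mutate the replica; the manager first takes the parent BLOCK_POOl and
      // VOLUME read locks, as the diff above shows
    }
  }
}
```

The intent, per the PR title, is that a DIR-level write lock only serializes work within one subdir pair, so writers on different subdirs of the same volume no longer contend on the volume lock.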

@@ -224,8 +248,13 @@ public void addLock(LockLevel level, String... resources) {
String lockName = generateLockName(level, resources);
if (level == LockLevel.BLOCK_POOl) {
lockMap.addLock(lockName, new ReentrantReadWriteLock(isFair));
} else if (level == LockLevel.VOLUME) {
lockMap.addLock(resources[0], new ReentrantReadWriteLock(isFair));
lockMap.addLock(lockName, new ReentrantReadWriteLock(isFair));
} else {
lockMap.addLock(resources[0], new ReentrantReadWriteLock(isFair));
lockMap.addLock(generateLockName(LockLevel.VOLUME, resources[0], resources[1]),
new ReentrantReadWriteLock(isFair));
lockMap.addLock(lockName, new ReentrantReadWriteLock(isFair));
}
}
@@ -21,6 +21,8 @@
import java.io.FileInputStream;
import java.io.FileNotFoundException;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

import org.apache.hadoop.classification.InterfaceAudience;
import org.apache.hadoop.hdfs.protocol.Block;
@@ -127,6 +129,31 @@ public static File idToBlockDir(File root, long blockId) {
return new File(root, path);
}

/**
* For example, given a block whose blockId maps to:
* "/data1/hadoop/hdfs/datanode/current/BP-xxxx/current/finalized/subdir0/subdir0"
* we return "subdir0/subdir0".
* @param blockId blockId
* @return The two-level subdir name
*/
public static String idToBlockDirSuffixName(long blockId) {
Contributor:

What's the difference between idToBlockDir and idToBlockDirSuffixName?

Contributor Author:

The idToBlockDirSuffixName method is used to generate the subdir lock name, just as the block pool id or storage uuid are used as lock names for their levels.

int d1 = (int) ((blockId >> 16) & 0x1F);
int d2 = (int) ((blockId >> 8) & 0x1F);
return DataStorage.BLOCK_SUBDIR_PREFIX + d1 + SEP +
DataStorage.BLOCK_SUBDIR_PREFIX + d2;
}
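A quick worked example (not part of the diff; the blockId and class name are arbitrary) showing the suffix for a non-zero case:

```java
import org.apache.hadoop.hdfs.server.datanode.DatanodeUtil;

public class SuffixExample {
  public static void main(String[] args) {
    long blockId = 0x41234567L;
    // d1 = (blockId >> 16) & 0x1F = 3, d2 = (blockId >> 8) & 0x1F = 5
    String suffix = DatanodeUtil.idToBlockDirSuffixName(blockId);
    // Prints "subdir3/subdir5" on Linux; SEP is the platform file separator.
    System.out.println(suffix);
  }
}
```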

public static List<String> getAllSubDirNameForDataSetLock() {
Contributor:

I don't get what this method is meant to do, why there is a double-layer loop (including the calling side), or why it uses 0x1F as the end condition.

Contributor Author:

This method is used to generate all of the subdir lock names. We use 0x1F as the end condition because of the DataNode's directory layout: each block pool dir under a volume has two layers of subdirs, like subdir1/subdir31.

Contributor Author:

When the DataNode starts, we initialize all dataset subdir locks in advance. Every volume has its own subdir dataset locks. In the current layout of the DataNode's storage directory, there are 32 × 32 subdirs in each volume.
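A minimal sketch of that pre-initialization (not part of this diff; the class name, block pool id, and storage uuids are placeholders, and the import paths are assumed from trunk):

```java
import java.util.Arrays;
import java.util.List;

import org.apache.hadoop.hdfs.server.common.DataNodeLockManager.LockLevel;
import org.apache.hadoop.hdfs.server.datanode.DataSetLockManager;
import org.apache.hadoop.hdfs.server.datanode.DatanodeUtil;

public class SubDirLockPreInitSketch {
  public static void main(String[] args) {
    DataSetLockManager lockManager = new DataSetLockManager();
    String bpid = "BP-xxxx";                                    // placeholder
    List<String> volumes = Arrays.asList("DS-vol1", "DS-vol2"); // placeholders

    // 32 x 32 = 1024 DIR locks per volume; per the addLock diff above, each
    // call also registers the VOLUME and BLOCK_POOl parent locks.
    for (String storageUuid : volumes) {
      for (String subDir : DatanodeUtil.getAllSubDirNameForDataSetLock()) {
        lockManager.addLock(LockLevel.DIR, bpid, storageUuid, subDir);
      }
    }
  }
}
```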

Contributor:

I don't think it is a good idea to follow the current directory layout when constructing the lock based on blockId, since:
a. We would have to keep the directory layout and the block/subdir lock logic in sync, right? However, they are implemented in different places without any constraint, which will add cost to future improvements that change the directory layout.
b. I think we could decouple the block/subdir lock from the directory layout and define a separate rule for it based on the blockId alone (such as id mod 1000; splitting one volume lock into 1000 locks is enough to improve performance), which I think makes sense for block/subdir locks.
What do you think? Thanks.
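Not what this PR implements: a sketch of the decoupled rule suggested above, deriving the lock name from the blockId alone. The bucket count of 1000 is the reviewer's example, and the class and method names are hypothetical.

```java
public class BlockIdLockBucketSketch {
  private static final long NUM_LOCK_BUCKETS = 1000L; // reviewer's example value

  /** Map a blockId to a lock-name suffix that ignores the directory layout. */
  static String idToLockBucketName(long blockId) {
    long bucket = Math.floorMod(blockId, NUM_LOCK_BUCKETS);
    return "bucket" + bucket;
  }

  public static void main(String[] args) {
    System.out.println(idToLockBucketName(0x41234567L)); // prints "bucket567"
  }
}
```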

List<String> res = new ArrayList<>();
for (int d1 = 0; d1 <= 0x1F; d1++) {
for (int d2 = 0; d2 <= 0x1F; d2++) {
res.add(DataStorage.BLOCK_SUBDIR_PREFIX + d1 + SEP +
DataStorage.BLOCK_SUBDIR_PREFIX + d2);
}
}
return res;
}

/**
* @return the FileInputStream for the meta data of the given block.
* @throws FileNotFoundException