HDFS-17685: Option to explicitly choose DFS client lease renewal interval #7215
💔 -1 overall
This message was automatically generated. |
🎊 +1 overall
This message was automatically generated. |
Looks alright by me.
<value>0</value>
<description>
If set between 0 and 30000 inclusive, HDFS clients will renew leases for files they are writing at this interval.
If dfs.client.lease.renewal.interval.ms is not set and ipc.client.rpc-timeout.ms is set between 0 and 60000,
What is magic about the 60000 maximum value of ipc.client.rpc-timeout.ms? I guess that comes from existing behavior?
It comes from existing behavior, yes. The lease renewal interval isn't allowed to be longer than half of the NameNode's timeout, which is hard-coded at 60 seconds.
return min;
}

for (DFSClient c : dfsClients) {
Can you combine the two loops into one?
Yes, but I think it would make the code more complicated. The logic I want is that the lowest valid value from getLeaseRenewalIntervalMs() takes precedence over any value from getHdfsTimeout(). So, doing that in one loop would require tracking two mins.
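For readers following the thread, the two-pass precedence described in this reply could look roughly like the sketch below. It is not the code from this PR: `ClientConf`, `computeRenewalMs`, and `MAX_RENEWAL_MS` are hypothetical names, the two fields only model getLeaseRenewalIntervalMs() and getHdfsTimeout(), and treating 0 or out-of-range values as "unset" is an assumption.

```java
import java.util.List;

/** Illustrative sketch only; names and boundary handling are assumptions. */
final class TwoPassRenewalSketch {
  /** Half of the 60-second NameNode limit mentioned in the review thread. */
  static final long MAX_RENEWAL_MS = 30_000L;

  /** Hypothetical stand-in for the per-client settings. */
  static final class ClientConf {
    final long leaseRenewalIntervalMs; // models getLeaseRenewalIntervalMs()
    final long hdfsTimeoutMs;          // models getHdfsTimeout()

    ClientConf(long leaseRenewalIntervalMs, long hdfsTimeoutMs) {
      this.leaseRenewalIntervalMs = leaseRenewalIntervalMs;
      this.hdfsTimeoutMs = hdfsTimeoutMs;
    }
  }

  static long computeRenewalMs(List<ClientConf> clients) {
    long min = MAX_RENEWAL_MS;
    boolean foundExplicit = false;

    // Pass 1: the lowest valid explicit interval wins outright.
    for (ClientConf c : clients) {
      long v = c.leaseRenewalIntervalMs;
      if (v > 0 && v <= MAX_RENEWAL_MS) {
        foundExplicit = true;
        min = Math.min(min, v);
      }
    }
    if (foundExplicit) {
      return min;
    }

    // Pass 2: no explicit value, so fall back to half of the lowest timeout.
    for (ClientConf c : clients) {
      long half = c.hdfsTimeoutMs / 2;
      if (half > 0 && half < min) {
        min = half;
      }
    }
    return min;
  }
}
```

A single loop would need one running minimum for the explicit intervals and another for the timeouts, plus a final choice between them, which is the extra bookkeeping the reply refers to.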
Description of PR
Currently, DFSClients send lease renewals to the NameNode at an interval equal to half of {{ipc.client.rpc-timeout.ms}}. This logic dates back to 2009 in HDFS-278. At my company, we are interested in using short DFS client timeouts (< 10 seconds). However, we're currently hesitant to do so because that would flood the NameNode with lease renewals.
I propose a setting {{dfs.client.lease.renewal.interval.ms}} that, if nonzero, would be used as the lease renewal interval instead of {{ipc.client.rpc-timeout.ms / 2}}. This would be useful for advanced users who want to control their RPC timeout and lease renewals separately.
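To make the trade-off concrete, here is a minimal sketch of how the interval might be chosen for a single client and what that means for renewal traffic. It is an illustration under assumptions, not the patch itself: `resolveRenewalMs`, `renewalsPerMinute`, and `MAX_RENEWAL_MS` are hypothetical names, and falling back to 30000 ms when neither setting yields a usable value is assumed from the 60-second limit discussed in the review.

```java
/** Illustrative sketch only; not the code from this PR. */
final class LeaseRenewalIntervalSketch {
  /** Assumed cap: half of the 60-second NameNode limit. */
  static final long MAX_RENEWAL_MS = 30_000L;

  /**
   * Picks a renewal interval for one client.
   * explicitMs models dfs.client.lease.renewal.interval.ms (0 means unset);
   * rpcTimeoutMs models ipc.client.rpc-timeout.ms.
   */
  static long resolveRenewalMs(long explicitMs, long rpcTimeoutMs) {
    if (explicitMs > 0 && explicitMs <= MAX_RENEWAL_MS) {
      return explicitMs; // the explicit setting wins
    }
    long half = rpcTimeoutMs / 2;
    if (half > 0 && half < MAX_RENEWAL_MS) {
      return half; // existing behavior: half of the RPC timeout
    }
    return MAX_RENEWAL_MS; // neither setting is usable, so use the cap
  }

  /** Renewal RPCs that one writing client sends per minute. */
  static long renewalsPerMinute(long renewalMs) {
    return 60_000L / renewalMs;
  }
}
```

Under these assumptions, a client with a 10000 ms RPC timeout and no explicit setting renews every 5000 ms, or 12 times per minute. Setting the explicit interval to 30000 ms brings that down to 2 renewals per minute while leaving the RPC timeout at 10 seconds.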
How was this patch tested?
Unit tests are included in this PR for the new logic. I have also tested the new feature in my company's HBase infrastructure.
For code changes:
Object storage: have the integration tests been executed and the endpoint declared according to the connector-specific documentation? not applicable
If adding new dependencies to the code, are these dependencies licensed in a way that is compatible for inclusion under ASF 2.0? not applicable
If applicable, have you updated the LICENSE, LICENSE-binary, NOTICE-binary files? not applicable