-
Notifications
You must be signed in to change notification settings - Fork 8.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
YARN-11578. Cache fs supports chmod in LogAggregationFileController. #6120
Conversation
The check introduced in YARN-10901 to avoid a warn message in NN logs in certain situations (when /tmp/logs is not owned by the yarn user), but it adds 3 NameNode calls (create, setpermission, delete) during log aggregation collection, for every NM. Meaning, when a YARN job completes, at the YARN log aggregation phase this check is done for every job, from every NodeManager. In 30 minutes 4.2 % of all the NameNode calls were due to this in a cluster. "write" calls need a Namesystem writeLock as well, so the impact is bigger. Change-Id: I65468aa972860d3b62050fcb41b8b06e417ee8bb
💔 -1 overall
This message was automatically generated. |
.../java/org/apache/hadoop/yarn/logaggregation/filecontroller/LogAggregationFileController.java
Show resolved
Hide resolved
Change-Id: I0fc25b90662fbec29e8be056db55d1a1c970e4d4
🎊 +1 overall
This message was automatically generated. |
} catch (UnsupportedOperationException use) { | ||
LOG.info("Unable to set permissions for configured filesystem since" | ||
+ " it does not support this {}", remoteFS.getScheme()); | ||
fsSupportsChmod = false; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Note for reviewers. Prior to this change, fsSupportsChmod
was only set to false when the UnsupportedOperationException
was thrown, otherwise fsSupportsChmod
was not modified (by default it was set to true).
Now fsSupportsChmod
can be updated (set to true when everything succeeds) or false when anything fails. Alternatively we could set it to false only when UnsupportedOperationException
is thrown and set it to true only when everything succeeds otherwise keeping it as-is.
Or keep the original behaviour.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Based on the naming and the context I think we can go ahead with the modified behaviour: only set it to true when everything succeeds.
LOG.info("Unable to set permissions for configured filesystem since" | ||
+ " it does not support this {}", remoteFS.getScheme()); | ||
} catch (IOException e) { | ||
LOG.warn("Failed to check if FileSystem supports permissions on " |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In the log we should use {}.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the review. Fixed it.
Change-Id: I6a2852172b66b3b50f36765b0ba9dcb62fa97081
🎊 +1 overall
This message was automatically generated. |
Thanks @tomicooler for the patch, LGTM. @slfan1989 thanks for the review, do you have anything, or are you ok with this being merged? |
Additionally, @tomicooler can you please do the backports for 3.3 and 3.2 branches? |
…pache#6120) Change-Id: Ib02eb32aaca799a9a23ec7c38f0d5a0578fe5c80 # Conflicts: # hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/logaggregation/filecontroller/LogAggregationFileController.java # hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/test/java/org/apache/hadoop/yarn/logaggregation/filecontroller/TestLogAggregationFileController.java
…pache#6120) Change-Id: I9e0fb4c6635fd365e259605ee85c91d53ba7528f
There were conflicts on both branches, need to wait for Yetus. |
The check introduced in YARN-10901 to avoid a warn message in NN logs in certain situations (when /tmp/logs is not owned by the yarn user), but it adds 3 NameNode calls (create, setpermission, delete) during log aggregation collection, for every NM.
Meaning, when a YARN job completes, at the YARN log aggregation phase this check is done for every job, from every NodeManager.
In 30 minutes 4.2 % of all the NameNode calls were due to this in a cluster. "write" calls need a Namesystem writeLock as well, so the impact is bigger.
Change-Id: I65468aa972860d3b62050fcb41b8b06e417ee8bb
Description of PR
Added a static concurrent cache that maps the
<FS class type + Log Path>
to the checkresult
.Assumptions:
If these assumptions are not met, we might need to come up with a different idea.
How was this patch tested?
Updated the unit tests to verify that caching works.
For code changes:
LICENSE
,LICENSE-binary
,NOTICE-binary
files?