HADOOP-19305: Fix ProcessEnvironment ClassCastException in Shell.java #7106

Open
wants to merge 1 commit into
base: trunk

@zhangbutao zhangbutao commented Oct 10, 2024

Description of PR

See HADOOP-19305 and HIVE-28191

How was this patch tested?

Tested with the Apache Hive Tests.

For code changes:

  • Does the title of this PR start with the corresponding JIRA issue id (e.g. 'HADOOP-17799. Your PR title ...')?
  • Object storage: have the integration tests been executed and the endpoint declared according to the connector-specific documentation?
  • If adding new dependencies to the code, are these dependencies licensed in a way that is compatible for inclusion under ASF 2.0?
  • If applicable, have you updated the LICENSE, LICENSE-binary, NOTICE-binary files?

@slfan1989
Contributor

@zhangbutao I will review this PR later. I hope that after resolving this issue, we can update Hadoop 3.4.0 in Hive.

@zhangbutao
Author

@zhangbutao I will review this PR later. I hope that after resolving this issue, we can update Hadoop 3.4.0 in Hive.

Thanks @slfan1989. Please check my comment. Changing Shell.java can fix the issue, and improving Hive's UT code can also resolve it. Changing Shell.java is simpler, but I am not sure when Hadoop will release the new version. Improving Hive's UTs may need some time to explore and test the env-related code.
https://issues.apache.org/jira/browse/HADOOP-19305?focusedCommentId=17888197&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-17888197

@slfan1989
Contributor

@zhangbutao Thank you for your contribution! I will respond later today.

@steveloughran
Contributor

While this change seems valid, why aren't you just using Shell.setEnvironment(), getting at the patch either through reflection or (cleaner) by putting some accessor class inside the same package?
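
As a rough illustration of the reflection route mentioned in this comment, the sketch below calls a non-public setter on a stand-in class. `CommandRunner`, its fields, and `DEMO_KEY` are hypothetical names invented for the example; this is not Hadoop's Shell class, and it only assumes that the real setter is not public, as the suggestion implies.

```java
import java.lang.reflect.Method;
import java.util.Collections;
import java.util.Map;

class ReflectiveSetterSketch {

  /** Hypothetical stand-in for a class with a non-public environment setter. */
  static class CommandRunner {
    private Map<String, String> environment = Collections.emptyMap();

    protected void setEnvironment(Map<String, String> env) {
      this.environment = env;
    }

    Map<String, String> getEnvironment() {
      return environment;
    }
  }

  /** Looks up the setter on the target's own class and invokes it reflectively. */
  static void setEnvironmentReflectively(Object target, Map<String, String> env)
      throws ReflectiveOperationException {
    Method setter = target.getClass().getDeclaredMethod("setEnvironment", Map.class);
    setter.setAccessible(true); // lift the access check for this Method object
    setter.invoke(target, env);
  }

  public static void main(String[] args) throws ReflectiveOperationException {
    CommandRunner runner = new CommandRunner();
    setEnvironmentReflectively(runner, Collections.singletonMap("DEMO_KEY", "demo-value"));
    System.out.println(runner.getEnvironment());
  }
}
```

Note that `getDeclaredMethod` only searches the class it is called on, so for a subclass instance the lookup would have to walk up to the declaring class. An accessor class in the same package would avoid reflection entirely.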

@steveloughran
Contributor

Also, can you try doing this PR against branch-3.4.1, as this is the one we want to build an RC off this week? Get it in there and the next Hadoop release will work.

@hadoop-yetus

💔 -1 overall

| Vote | Subsystem | Runtime | Logfile | Comment |
|:----:|----------:|--------:|:-------:|:--------|
| +0 🆗 | reexec | 0m 30s | | Docker mode activated. |
| | | | | _ Prechecks _ |
| +1 💚 | dupname | 0m 0s | | No case conflicting files found. |
| +0 🆗 | codespell | 0m 1s | | codespell was not available. |
| +0 🆗 | detsecrets | 0m 1s | | detect-secrets was not available. |
| +1 💚 | @author | 0m 0s | | The patch does not contain any @author tags. |
| -1 ❌ | test4tests | 0m 0s | | The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. |
| | | | | _ trunk Compile Tests _ |
| +1 💚 | mvninstall | 48m 37s | | trunk passed |
| +1 💚 | compile | 19m 53s | | trunk passed with JDK Ubuntu-11.0.24+8-post-Ubuntu-1ubuntu320.04 |
| +1 💚 | compile | 18m 11s | | trunk passed with JDK Private Build-1.8.0_422-8u422-b05-1~20.04-b05 |
| +1 💚 | checkstyle | 1m 19s | | trunk passed |
| +1 💚 | mvnsite | 1m 46s | | trunk passed |
| +1 💚 | javadoc | 1m 18s | | trunk passed with JDK Ubuntu-11.0.24+8-post-Ubuntu-1ubuntu320.04 |
| +1 💚 | javadoc | 0m 53s | | trunk passed with JDK Private Build-1.8.0_422-8u422-b05-1~20.04-b05 |
| +1 💚 | spotbugs | 2m 41s | | trunk passed |
| +1 💚 | shadedclient | 42m 0s | | branch has no errors when building and testing our client artifacts. |
| | | | | _ Patch Compile Tests _ |
| +1 💚 | mvninstall | 0m 58s | | the patch passed |
| +1 💚 | compile | 19m 0s | | the patch passed with JDK Ubuntu-11.0.24+8-post-Ubuntu-1ubuntu320.04 |
| +1 💚 | javac | 19m 0s | | the patch passed |
| +1 💚 | compile | 18m 28s | | the patch passed with JDK Private Build-1.8.0_422-8u422-b05-1~20.04-b05 |
| +1 💚 | javac | 18m 28s | | the patch passed |
| +1 💚 | blanks | 0m 0s | | The patch has no blanks issues. |
| +1 💚 | checkstyle | 1m 17s | | the patch passed |
| +1 💚 | mvnsite | 1m 42s | | the patch passed |
| +1 💚 | javadoc | 1m 12s | | the patch passed with JDK Ubuntu-11.0.24+8-post-Ubuntu-1ubuntu320.04 |
| +1 💚 | javadoc | 0m 53s | | the patch passed with JDK Private Build-1.8.0_422-8u422-b05-1~20.04-b05 |
| +1 💚 | spotbugs | 2m 46s | | the patch passed |
| +1 💚 | shadedclient | 44m 6s | | patch has no errors when building and testing our client artifacts. |
| | | | | _ Other Tests _ |
| +1 💚 | unit | 19m 33s | | hadoop-common in the patch passed. |
| +1 💚 | asflicense | 1m 5s | | The patch does not generate ASF License warnings. |
| | | 248m 49s | | |

| Subsystem | Report/Notes |
|----------:|:-------------|
| Docker | ClientAPI=1.47 ServerAPI=1.47 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-7106/1/artifact/out/Dockerfile |
| GITHUB PR | #7106 |
| Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle codespell detsecrets |
| uname | Linux 803e0335ccc6 5.15.0-119-generic #129-Ubuntu SMP Fri Aug 2 19:25:20 UTC 2024 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | dev-support/bin/hadoop.sh |
| git revision | trunk / 3e4803c |
| Default Java | Private Build-1.8.0_422-8u422-b05-1~20.04-b05 |
| Multi-JDK versions | /usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.24+8-post-Ubuntu-1ubuntu320.04 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_422-8u422-b05-1~20.04-b05 |
| Test Results | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-7106/1/testReport/ |
| Max. process+thread count | 1273 (vs. ulimit of 5500) |
| modules | C: hadoop-common-project/hadoop-common U: hadoop-common-project/hadoop-common |
| Console output | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-7106/1/console |
| versions | git=2.25.1 maven=3.6.3 spotbugs=4.2.2 |
| Powered by | Apache Yetus 0.14.0 https://yetus.apache.org |

This message was automatically generated.

@zhangbutao
Author

While this change seems valid, why aren't you just using Shell.setEnvironment(), getting at the patch either through reflection or (cleaner) by putting some accessor class inside the same package?

@steveloughran Thanks for your suggestion. I am not sure I understand you correctly.
Hive does not use Shell directly. The error stack trace in HADOOP-19305 shows that Hive calls it indirectly through RawLocalFileSystem:
org.apache.hadoop.fs.RawLocalFileSystem$DeprecatedRawLocalFileStatus.getPermission(RawLocalFileSystem.java:952) ~[hadoop-common-3.4.0.jar:?]
So I don't think using Shell.setEnvironment() is a good fit here.

BTW, the root cause is not in Shell.java, but this change can serve as a workaround to avoid the exception in Hive. It would be good if we could merge this as an improvement. I am also trying to fix it at the root, on the Hive side. :)

@zhangbutao
Author

Also, can you try doing this PR against branch-3.4.1, as this is the one we want to build an RC off this week? Get it in there and the next Hadoop release will work.

@steveloughran #7107 is the fix for branch-3.4.1.

@steveloughran
Contributor

@zhangbutao afraid RC3 is already out, and as this is a test only failure I don't think it is a blocker. Why doesn't the setEnvironment method work for your tests?

@zhangbutao
Author

zhangbutao commented Oct 10, 2024

@zhangbutao afraid RC3 is already out, and as this is a test only failure I don't think it is a blocker.

@steveloughran No worries, I can also fix this on the Hive side. See commit apache/hive@fa7797b in apache/hive#5500. In fact, the related code snippet is useless and incorrect, but it triggered this issue, so I just removed it. I don't think it will take much time to fix the useless test code.

Why doesn't the setEnvironment method work for your tests?

Sorry, I didn't follow what you meant about setEnvironment. Do you mean a change like this:

index 91868365b13..8f2387ebaa3 100644
--- a/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/util/Shell.java
+++ b/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/util/Shell.java
@@ -977,7 +977,7 @@ private void runCommand() throws IOException {
       builder.environment().clear();
     }

-    builder.environment().putAll(this.environment);
+    setEnvironment(environment);

I tested this setEnvironment(environment) change, and it also fixes my issue.

@slfan1989
Contributor

slfan1989 commented Oct 11, 2024

@zhangbutao

Thank you for your continued attention to this issue! The change seems feasible, but we may need to find a workaround.

  1. Hadoop is currently in the process of releasing version 3.4.1 (which is very resource-intensive for RM, especially with Steve investing a lot of time). The RC3 for 3.4.1 has already been released, and if the vote passes, Steve will release hadoop-3.4.1. Therefore, if we want to merge this PR, it might have to wait until the release of 3.4.2.

  2. If we can make adjustments in Hive, could we first implement unit test modifications in Hive to address this issue? This way, we can apply hadoop-3.4.0 to the Hive branch.

@zhangbutao
Author

I have found a solution on the Hive side, so feel free to merge this for Hadoop 3.4.2. BTW, once Hadoop 3.4.1 is released, I think we can update the Hadoop version to 3.4.1 in Hive. :)

@slfan1989
Contributor

I have found a solution on the Hive side, so feel free to merge this for Hadoop 3.4.2. BTW, once Hadoop 3.4.1 is released, I think we can update the Hadoop version to 3.4.1 in Hive. :)

@zhangbutao

hadoop-3.4.1 may take some time to be released; I personally feel it could be at least 1-2 weeks, or even longer. Would it be possible for us to upgrade the supported Hadoop version in Hive to 3.4.0 first? This way, when Hadoop 3.4.1 is released, we can submit a PR for the upgrade version.

@slfan1989
Contributor

slfan1989 commented Oct 11, 2024

I have found a solution on the Hive side, so feel free to merge this for Hadoop 3.4.2. BTW, once Hadoop 3.4.1 is released, I think we can update the Hadoop version to 3.4.1 in Hive. :)

@zhangbutao I can approve this PR, but we still need Steve's consent, as he is very knowledgeable and insightful in this area.

@@ -977,7 +977,9 @@ private void runCommand() throws IOException {
       builder.environment().clear();
     }

-    builder.environment().putAll(this.environment);
+    if (!environment.isEmpty()) {
Contributor

@zhangbutao Personally, I think adding a check is reasonable, but can we modify it further based on Steve's suggestions?

cc: @steveloughran
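
The check under review can be sketched in isolation as follows. This is a minimal standalone example, not the patched Shell.java: `GuardedEnvSketch`, `applyOverrides`, and `DEMO_KEY` are hypothetical names, and the sketch only shows the shape of the guard, where the child-process environment is touched only if there are overrides to copy.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

/** Sketch of the guard discussed in this review; not Hadoop's Shell class. */
class GuardedEnvSketch {

  /**
   * Copies overrides into the child-process environment, but only when
   * there is something to copy. With an empty map, this method never
   * calls builder.environment().
   */
  static ProcessBuilder applyOverrides(ProcessBuilder builder,
                                       Map<String, String> overrides) {
    if (!overrides.isEmpty()) {
      builder.environment().putAll(overrides);
    }
    return builder;
  }

  public static void main(String[] args) {
    Map<String, String> overrides = new HashMap<>();
    overrides.put("DEMO_KEY", "demo-value"); // hypothetical variable name
    ProcessBuilder withEnv = applyOverrides(new ProcessBuilder("true"), overrides);
    System.out.println(withEnv.environment().get("DEMO_KEY"));

    // Empty overrides: the builder is returned without its environment being read.
    applyOverrides(new ProcessBuilder("true"), Collections.emptyMap());
  }
}
```

No process is started here; `ProcessBuilder` is only configured. Whether skipping the copy is enough in a given setup depends on what else reads the builder's environment, which this sketch does not model.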

@zhangbutao
Author

zhangbutao commented Oct 11, 2024

hadoop-3.4.1 may take some time to be released; I personally feel it could be at least 1-2 weeks, or even longer. Would it be possible for us to upgrade the supported Hadoop version in Hive to 3.4.0 first? This way, when Hadoop 3.4.1 is released, we can submit a PR for the upgrade version.

We may also need Apache Tez to upgrade its Hadoop version to 3.4.0 at the same time, so I am not sure we can quickly upgrade the Hadoop version in Hive.
@ayushtkn may have some thoughts on upgrading the Hadoop version on the Tez side.
See the Tez discussion about upgrading Hadoop to 3.4.0: apache/tez#342 (review)

@Hexiaoqiao
Contributor

@zhangbutao Hi, how does this differ from #7107? If they are the same, please choose one to close. Thanks.

@zhangbutao
Author

@zhangbutao Hi, how does this differ from #7107? If they are the same, please choose one to close. Thanks.

@Hexiaoqiao It is a backport to branch-3.4.1. Once 3.4.1 is released, please feel free to close it. Thanks.
