Hi, I am facing an issue with submitting a job from Java. #16
Comments
Please provide more details: your script, its log, and error messages.
Here are the logs. Log Type: directory.info
This is my project structure:
spark-application
--> scala1 class // I call the Java class from this class
--> java class // this submits another Spark application to the YARN cluster
Another spark-application
--> scala2 class
When I invoke the Java class from scala1 via spark-submit locally, spark-submit --class scala2 is triggered and works fine. When I invoke the Java class from scala1 via spark-submit on YARN, spark-submit --class scala2 is triggered and I hit the error.
This is the spark-submit command I am using:
spark-submit --class scala1 --master yarn --deploy-mode cluster --num-executors 2 --executor-cores 2 --conf spark.yarn.executor.memoryOverhead=600 --conf spark.yarn.submit.waitAppCompletion=false /home/ankushuser/kafka_retry/kafka_retry_test/sparkflightaware/target/sparkflightaware-0.0.1-SNAPSHOT-jar-with-dependencies.jar
If I run this spark-submit command locally, it invokes the Java class and the spark-submit command for the scala2 application works fine. If I run it on YARN, I hit the issue.
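One way to trigger a second spark-submit from inside a running application, which several commenters below ended up doing, is to shell out to the spark-submit binary from Java. This is a minimal sketch, not the poster's actual code; the class name and jar path are placeholders:

```java
import java.util.Arrays;
import java.util.List;

public class SubmitFromJava {

    // Build the spark-submit command line as a list of arguments so it
    // can be handed to ProcessBuilder without shell-quoting issues.
    static List<String> buildCommand(String mainClass, String appJar) {
        return Arrays.asList(
            "spark-submit",
            "--class", mainClass,
            "--master", "yarn",
            "--deploy-mode", "cluster",
            appJar);
    }

    public static void main(String[] args) throws Exception {
        List<String> cmd = buildCommand("scala2", "hdfs:///apps/another-app.jar");
        System.out.println(String.join(" ", cmd));
        // To actually launch (requires spark-submit on the PATH and a
        // configured YARN client on the machine running this code):
        // new ProcessBuilder(cmd).inheritIO().start().waitFor();
    }
}
```

Note that this only works where the spark-submit binary and a valid Hadoop/YARN client configuration are available, which is part of what goes wrong when the calling code itself runs inside a YARN container.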
Can you also please include the error messages you are getting? Thanks,
I am only getting the error message above. Thanks,
Same issue here. Everything submits without any warnings or errors, but then nothing happens, and the YARN logs report 'Error: Could not find or load main class org.apache.spark.deploy.yarn.ApplicationMaster' on two nodes in the cluster (nothing else on any other node). I have instead resorted to invoking spark-submit from my Java code, which is working.
@JackDavidson Hi, did you copy the jar files to the worker nodes? Our existing Spark application, or the command we are running, might not run on the head node itself. As a workaround, use Livy and store your jar/war in HDFS or another location; you can then use it to POST the spark-submit command.
@ankushreddy My project's jar file (there is only one, and it is a fat jar) is stored in HDFS. I have never done any manual copying of jars anywhere. This works with spark-submit, but maybe submitting from Java is different in some way? The jar I am submitting does not contain the class that Spark can't find, though; the Spark libraries under SPARK_HOME do contain that class. But I wonder: if I add a dependency on spark-yarn to my submitted fat jar, would Spark then be able to find the missing class? Livy sounds like a great alternative; I'll start looking into it.
@JackDavidson In our case we included all the dependencies along with the jar, so we didn't face any issues with missing dependencies. I would suggest you look at Livy; in most cases that should solve the problem.
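For context on the Livy suggestion: Livy exposes a REST endpoint, POST /batches, that accepts the application jar and main class as JSON, so the submitting code never needs a local spark-submit binary. A rough sketch in plain JDK code, where the Livy host, jar path, and class name are all placeholder assumptions:

```java
import java.io.OutputStream;
import java.net.HttpURLConnection;
import java.net.URL;
import java.nio.charset.StandardCharsets;

public class LivySubmit {

    // Build the JSON body for Livy's POST /batches endpoint:
    // "file" is the HDFS path to the jar, "className" is the main class.
    static String batchPayload(String hdfsJar, String mainClass) {
        return "{\"file\": \"" + hdfsJar + "\", \"className\": \"" + mainClass + "\"}";
    }

    public static void main(String[] args) throws Exception {
        String payload = batchPayload("hdfs:///apps/sparkflightaware.jar", "scala2");
        System.out.println(payload);

        // To actually submit (requires a reachable Livy server, host is a placeholder):
        // HttpURLConnection conn = (HttpURLConnection)
        //         new URL("http://livy-host:8998/batches").openConnection();
        // conn.setRequestMethod("POST");
        // conn.setRequestProperty("Content-Type", "application/json");
        // conn.setDoOutput(true);
        // try (OutputStream os = conn.getOutputStream()) {
        //     os.write(payload.getBytes(StandardCharsets.UTF_8));
        // }
        // System.out.println("Livy responded: " + conn.getResponseCode());
    }
}
```

Because the jar lives in HDFS and the submission is just an HTTP call, this sidesteps the question of which node the calling code happens to run on.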
Hi, when I invoke
https://github.com/mahmoudparsian/data-algorithms-book/blob/master/misc/how-to-submit-spark-job-to-yarn-from-java-code.md
this class from a spark-submit running locally, it is invoked and is able to submit the spark-submit to the YARN cluster.
But when I invoke the class from a spark-submit that is itself running on YARN, the application from how-to-submit-spark-job-to-yarn-from-java-code.md is accepted but never moves to the RUNNING state, and fails with the following error:
Error: Could not find or load main class org.apache.spark.deploy.yarn.ApplicationMaster
Application application_1493671618562_0072 failed 5 times due to AM Container for appattempt_1493671618562_0072_000005 exited with exitCode: 1
For more detailed output, check the application tracking page: http://headnode.internal.cloudapp.net:8088/cluster/app/application_1493671618562_0072 Then click on links to logs of each attempt.
Diagnostics: Exception from container-launch.
Container id: container_e02_1493671618562_0072_05_000001
Exit code: 1
Exception message: /mnt/resource/hadoop/yarn/local/usercache/helixuser/appcache/application_1493671618562_0072/container_e02_1493671618562_0072_05_000001/launch_container.sh: line 26: $PWD:$PWD/spark_conf:$PWD/spark.jar:$HADOOP_CONF_DIR:/usr/hdp/current/hadoop-client/:/usr/hdp/current/hadoop-client/lib/:/usr/hdp/current/hadoop-hdfs-client/:/usr/hdp/current/hadoop-hdfs-client/lib/:/usr/hdp/current/hadoop-yarn-client/:/usr/hdp/current/hadoop-yarn-client/lib/:$PWD/mr-framework/hadoop/share/hadoop/mapreduce/:$PWD/mr-framework/hadoop/share/hadoop/mapreduce/lib/:$PWD/mr-framework/hadoop/share/hadoop/common/:$PWD/mr-framework/hadoop/share/hadoop/common/lib/:$PWD/mr-framework/hadoop/share/hadoop/yarn/:$PWD/mr-framework/hadoop/share/hadoop/yarn/lib/:$PWD/mr-framework/hadoop/share/hadoop/hdfs/:$PWD/mr-framework/hadoop/share/hadoop/hdfs/lib/:$PWD/mr-framework/hadoop/share/hadoop/tools/lib/:/usr/hdp/${hdp.version}/hadoop/lib/hadoop-lzo-0.6.0.${hdp.version}.jar:/etc/hadoop/conf/secure: bad substitution
Stack trace: ExitCodeException exitCode=1: /mnt/resource/hadoop/yarn/local/usercache/helixuser/appcache/application_1493671618562_0072/container_e02_1493671618562_0072_05_000001/launch_container.sh: line 26: $PWD:$PWD/spark_conf:$PWD/spark.jar:$HADOOP_CONF_DIR:/usr/hdp/current/hadoop-client/:/usr/hdp/current/hadoop-client/lib/:/usr/hdp/current/hadoop-hdfs-client/:/usr/hdp/current/hadoop-hdfs-client/lib/:/usr/hdp/current/hadoop-yarn-client/:/usr/hdp/current/hadoop-yarn-client/lib/:$PWD/mr-framework/hadoop/share/hadoop/mapreduce/:$PWD/mr-framework/hadoop/share/hadoop/mapreduce/lib/:$PWD/mr-framework/hadoop/share/hadoop/common/:$PWD/mr-framework/hadoop/share/hadoop/common/lib/:$PWD/mr-framework/hadoop/share/hadoop/yarn/:$PWD/mr-framework/hadoop/share/hadoop/yarn/lib/:$PWD/mr-framework/hadoop/share/hadoop/hdfs/:$PWD/mr-framework/hadoop/share/hadoop/hdfs/lib/:$PWD/mr-framework/hadoop/share/hadoop/tools/lib/:/usr/hdp/${hdp.version}/hadoop/lib/hadoop-lzo-0.6.0.${hdp.version}.jar:/etc/hadoop/conf/secure: bad substitution
at org.apache.hadoop.util.Shell.runCommand(Shell.java:933)
at org.apache.hadoop.util.Shell.run(Shell.java:844)
at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:1123)
at org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.launchContainer(DefaultContainerExecutor.java:225)
at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:317)
at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:83)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
Container exited with a non-zero exit code 1
Failing this attempt. Failing the application.
Thank you for your help.
Thanks,
Ankush Reddy.
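Editor's note: the `bad substitution` on line 26 of launch_container.sh indicates that the `${hdp.version}` placeholder in the container classpath was never expanded, a known quirk on HDP clusters. A commonly suggested workaround is to pass hdp.version explicitly to the driver and the application master; this is a sketch only, and `<your-hdp-version>` is a placeholder for the cluster's actual HDP version string:

```shell
spark-submit \
  --class scala1 \
  --master yarn \
  --deploy-mode cluster \
  --conf "spark.driver.extraJavaOptions=-Dhdp.version=<your-hdp-version>" \
  --conf "spark.yarn.am.extraJavaOptions=-Dhdp.version=<your-hdp-version>" \
  /home/ankushuser/kafka_retry/kafka_retry_test/sparkflightaware/target/sparkflightaware-0.0.1-SNAPSHOT-jar-with-dependencies.jar
```

When spark-submit is run from the command line on an HDP node this substitution is normally handled for you, which would explain why the local invocation works while the nested, on-cluster invocation fails.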