
[Bug] Error in Executor setup After Permanent UDF Deletion #6812

Open
3 of 4 tasks
hzxiongyinke opened this issue Nov 18, 2024 · 3 comments
Labels
kind:bug This is a clearly a bug priority:major

Comments

@hzxiongyinke
Contributor

Code of Conduct

Search before asking

  • I have searched in the issues and found no similar issues.

Describe the bug

Hello everyone,

I've encountered an issue with Kyuubi that I'm hoping the community can help with.

I created a permanent UDF in a Kyuubi instance. Later, due to changed requirements, I deleted this UDF through another driver. However, any SQL executed afterwards by the previously started driver fails with an error indicating that the UDF's jar file cannot be found. Currently, the only workaround I have is to restart the Kyuubi engine.
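A minimal reproduction sketch of the scenario described above. All names here (the function, class, bucket, and paths) are hypothetical placeholders, not taken from the report, and whether dropping the function also removes the jar from object storage depends on the deployment:

```sql
-- Session 1 (long-running Kyuubi engine): register a permanent UDF
-- backed by a jar stored on OSS.
CREATE FUNCTION my_udf AS 'com.example.MyUDF'
  USING JAR 'oss://my-bucket/udfs/my-udf.jar';

SELECT my_udf(col) FROM t;  -- works; executors fetch the jar on demand

-- Session 2 (a different driver): drop the UDF; the backing jar is
-- also removed from the bucket.
DROP FUNCTION my_udf;

-- Back in session 1: any subsequent query now fails, because the
-- engine's driver still lists the jar as a dependency and newly
-- launched executors try to fetch it during setup.
SELECT 1;  -- java.io.FileNotFoundException: File not found ... in bucket ...
```

This matches the stack trace below: the failure occurs in `Executor.updateDependencies` / `Utils.fetchFile`, i.e. during executor-side dependency fetching rather than during UDF resolution.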

Affects Version(s)

1.8.0

Kyuubi Server Log Output

Caused by: org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in stage 279.0 failed 4 times, most recent failure: Lost task 0.3 in stage 279.0 (TID 251) (core-xxxx.cn-shanghai.emr.aliyuncs.com executor 29): java.io.FileNotFoundException:  [ErrorMessage]: File not found: .GalaxyResource/bigdata_emr_sh/xxx in bucket xxx
        at com.aliyun.jindodata.api.spec.JdoNativeResult.get(JdoNativeResult.java:54)
        at com.aliyun.jindodata.api.spec.protos.coder.JdolistDirectoryReplyDecoder.decode(JdolistDirectoryReplyDecoder.java:23)
        at com.aliyun.jindodata.api.JindoCommonApis.listDirectory(JindoCommonApis.java:112)
        at com.aliyun.jindodata.call.JindoListCall.execute(JindoListCall.java:65)
        at com.aliyun.jindodata.common.JindoHadoopSystem.listStatus(JindoHadoopSystem.java:665)
        at com.aliyun.jindodata.common.JindoHadoopSystem.listStatus(JindoHadoopSystem.java:60)
        at org.apache.spark.util.Utils$.fetchHcfsFile(Utils.scala:851)
        at org.apache.spark.util.Utils$.doFetchFile(Utils.scala:820)
        at org.apache.spark.util.Utils$.fetchFile(Utils.scala:544)
        at org.apache.spark.executor.Executor.$anonfun$updateDependencies$13(Executor.scala:1010)
        at org.apache.spark.executor.Executor.$anonfun$updateDependencies$13$adapted(Executor.scala:1002)
        at scala.collection.TraversableLike$WithFilter.$anonfun$foreach$1(TraversableLike.scala:985)
        at scala.collection.mutable.HashMap.$anonfun$foreach$1(HashMap.scala:149)
        at scala.collection.mutable.HashTable.foreachEntry(HashTable.scala:237)
        at scala.collection.mutable.HashTable.foreachEntry$(HashTable.scala:230)
        at scala.collection.mutable.HashMap.foreachEntry(HashMap.scala:44)
        at scala.collection.mutable.HashMap.foreach(HashMap.scala:149)
        at scala.collection.TraversableLike$WithFilter.foreach(TraversableLike.scala:984)
        at org.apache.spark.executor.Executor.org$apache$spark$executor$Executor$$updateDependencies(Executor.scala:1002)
        at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:506)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
        at java.lang.Thread.run(Thread.java:750)

Driver stacktrace:
        at org.apache.spark.scheduler.DAGScheduler.failJobAndIndependentStages(DAGScheduler.scala:2673)
        at org.apache.spark.scheduler.DAGScheduler.$anonfun$abortStage$2(DAGScheduler.scala:2609)
        at org.apache.spark.scheduler.DAGScheduler.$anonfun$abortStage$2$adapted(DAGScheduler.scala:2608)
        at scala.collection.mutable.ResizableArray.foreach(ResizableArray.scala:62)
        at scala.collection.mutable.ResizableArray.foreach$(ResizableArray.scala:55)
        at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:49)
        at org.apache.spark.scheduler.DAGScheduler.abortStage(DAGScheduler.scala:2608)
        at org.apache.spark.scheduler.DAGScheduler.$anonfun$handleTaskSetFailed$1(DAGScheduler.scala:1182)
        at org.apache.spark.scheduler.DAGScheduler.$anonfun$handleTaskSetFailed$1$adapted(DAGScheduler.scala:1182)
        at scala.Option.foreach(Option.scala:407)
        at org.apache.spark.scheduler.DAGScheduler.handleTaskSetFailed(DAGScheduler.scala:1182)
        at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.doOnReceive(DAGScheduler.scala:2861)
        at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:2803)
        at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:2792)
        at org.apache.spark.util.EventLoop$$anon$1.run(EventLoop.scala:49)
        at org.apache.spark.scheduler.DAGScheduler.runJob(DAGScheduler.scala:952)
        at org.apache.spark.SparkContext.runJob(SparkContext.scala:2241)
        at org.apache.spark.sql.execution.datasources.FileFormatWriter$.write(FileFormatWriter.scala:269)

Kyuubi Engine Log Output

Spark executor log error:
24/11/18 16:06:18 INFO [Executor task launch worker for task 0.0 in stage 283.0 (TID 264)] Executor: Running task 0.0 in stage 283.0 (TID 264)
24/11/18 16:06:18 INFO [Executor task launch worker for task 0.0 in stage 283.0 (TID 264)] Executor: Fetching oss://xxx/xxxx/xxx with timestamp 1731900662592
24/11/18 16:06:18 INFO [Executor task launch worker for task 0.0 in stage 283.0 (TID 264)] HadoopLoginUserInfo: TOKEN: YARN_AM_RM_TOKEN
24/11/18 16:06:18 INFO [Executor task launch worker for task 0.0 in stage 283.0 (TID 264)] HadoopLoginUserInfo: User: xxxx, authMethod: SIMPLE, ugi: xxxx (auth:SIMPLE)
24/11/18 16:06:18 INFO [Executor task launch worker for task 0.0 in stage 283.0 (TID 264)] JindoHadoopSystem: Initialized native file system: 
24/11/18 16:06:18 INFO [Executor task launch worker for task 0.0 in stage 283.0 (TID 264)] FsStats: cmd=getFileStatus, src=oss://xxxx/.xxxx/xxx/xxxx, dst=null, size=0, parameter=null, time-in-ms=77, version=6.2.0
24/11/18 16:06:18 INFO [Executor task launch worker for task 0.0 in stage 283.0 (TID 264)] FsStats: cmd=list, src=oss://xxxx/.xxxx/xxxx/xxxx, dst=null, size=0, parameter=null, time-in-ms=26, version=6.2.0
24/11/18 16:06:18 ERROR [Executor task launch worker for task 0.0 in stage 283.0 (TID 264)] Executor: Exception in task 0.0 in stage 283.0 (TID 264)
java.io.FileNotFoundException:  [ErrorMessage]: File not found: .xxxx/xxxx/xxxx in bucket xxxx
	at com.aliyun.jindodata.api.spec.JdoNativeResult.get(JdoNativeResult.java:54) ~[jindo-core-6.2.0.jar:?]
	at com.aliyun.jindodata.api.spec.protos.coder.JdolistDirectoryReplyDecoder.decode(JdolistDirectoryReplyDecoder.java:23) ~[jindo-core-6.2.0.jar:?]
	at com.aliyun.jindodata.api.JindoCommonApis.listDirectory(JindoCommonApis.java:112) ~[jindo-core-6.2.0.jar:?]
	at com.aliyun.jindodata.call.JindoListCall.execute(JindoListCall.java:65) ~[jindo-sdk-6.2.0.jar:?]
	at com.aliyun.jindodata.common.JindoHadoopSystem.listStatus(JindoHadoopSystem.java:665) ~[jindo-sdk-6.2.0.jar:?]
	at com.aliyun.jindodata.common.JindoHadoopSystem.listStatus(JindoHadoopSystem.java:60) ~[jindo-sdk-6.2.0.jar:?]
	at org.apache.spark.util.Utils$.fetchHcfsFile(Utils.scala:851) ~[spark-core_2.12-3.3.1-dw1.2.10.jar:3.3.1-dw1.2.10]
	at org.apache.spark.util.Utils$.doFetchFile(Utils.scala:820) ~[spark-core_2.12-3.3.1-dw1.2.10.jar:3.3.1-dw1.2.10]
	at org.apache.spark.util.Utils$.fetchFile(Utils.scala:544) ~[spark-core_2.12-3.3.1-dw1.2.10.jar:3.3.1-dw1.2.10]
	at org.apache.spark.executor.Executor.$anonfun$updateDependencies$13(Executor.scala:1010) ~[spark-core_2.12-3.3.1-dw1.2.10.jar:3.3.1-dw1.2.10]
	at org.apache.spark.executor.Executor.$anonfun$updateDependencies$13$adapted(Executor.scala:1002) ~[spark-core_2.12-3.3.1-dw1.2.10.jar:3.3.1-dw1.2.10]
	at scala.collection.TraversableLike$WithFilter.$anonfun$foreach$1(TraversableLike.scala:985) ~[scala-library-2.12.15.jar:?]
	at scala.collection.mutable.HashMap.$anonfun$foreach$1(HashMap.scala:149) ~[scala-library-2.12.15.jar:?]
	at scala.collection.mutable.HashTable.foreachEntry(HashTable.scala:237) ~[scala-library-2.12.15.jar:?]
	at scala.collection.mutable.HashTable.foreachEntry$(HashTable.scala:230) ~[scala-library-2.12.15.jar:?]
	at scala.collection.mutable.HashMap.foreachEntry(HashMap.scala:44) ~[scala-library-2.12.15.jar:?]
	at scala.collection.mutable.HashMap.foreach(HashMap.scala:149) ~[scala-library-2.12.15.jar:?]
	at scala.collection.TraversableLike$WithFilter.foreach(TraversableLike.scala:984) ~[scala-library-2.12.15.jar:?]
	at org.apache.spark.executor.Executor.org$apache$spark$executor$Executor$$updateDependencies(Executor.scala:1002) ~[spark-core_2.12-3.3.1-dw1.2.10.jar:3.3.1-dw1.2.10]
	at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:506) ~[spark-core_2.12-3.3.1-dw1.2.10.jar:3.3.1-dw1.2.10]
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) ~[?:1.8.0_392]
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) ~[?:1.8.0_392]
	at java.lang.Thread.run(Thread.java:750) ~[?:1.8.0_392]
24/11/18 16:06:18 INFO [dispatcher-Executor] YarnCoarseGrainedExecutorBackend: Got assigned task 265

Kyuubi Server Configurations

No response

Kyuubi Engine Configurations

No response

Additional context

No response

Are you willing to submit PR?

  • Yes. I would be willing to submit a PR with guidance from the Kyuubi community to fix.
  • No. I cannot submit a PR at this time.
hzxiongyinke added the kind:bug and priority:major labels on Nov 18, 2024

Hello @hzxiongyinke,
Thanks for finding the time to report the issue!
We really appreciate the community's efforts to improve Apache Kyuubi.

@hzxiongyinke
Contributor Author

cc @yaooqinn @pan3793

@yaooqinn
Member

The failed Spark application didn't even access the missing jar file, did it?
