Details
Type: Bug
Status: Resolved
Priority: Major
Resolution: Fixed
1.6.0, 2.0.0, 2.1.0
None
None
Description
In TaskResultGetter, if an error is thrown while deserializing the TaskEndReason, TaskScheduler never gets a chance to handle the failed task, and the task just hangs.
  def enqueueFailedTask(taskSetManager: TaskSetManager, tid: Long, taskState: TaskState,
      serializedData: ByteBuffer) {
    var reason : TaskEndReason = UnknownReason
    try {
      getTaskResultExecutor.execute(new Runnable {
        override def run(): Unit = Utils.logUncaughtExceptions {
          val loader = Utils.getContextOrSparkClassLoader
          try {
            if (serializedData != null && serializedData.limit() > 0) {
              reason = serializer.get().deserialize[TaskEndReason](
                serializedData, loader)
            }
          } catch {
            case cnd: ClassNotFoundException =>
              // Log an error but keep going here -- the task failed, so not catastrophic
              // if we can't deserialize the reason.
              logError(
                "Could not deserialize TaskEndReason: ClassNotFound with classloader " + loader)
            case ex: Exception => {}
          }
          scheduler.handleFailedTask(taskSetManager, tid, taskState, reason)
        }
      })
    } catch {
      case e: RejectedExecutionException if sparkEnv.isStopped =>
        // ignore it
    }
  }
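In my specific case, I got a NoClassDefFoundError. Because that is an Error rather than an Exception, neither catch clause handles it, scheduler.handleFailedTask is never called, and the failed task hangs forever.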
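One possible way to avoid the hang is to notify the scheduler from a finally block, so that the notification runs no matter what the deserialization throws. The following is a minimal, self-contained sketch of that idea, not necessarily the change that resolved this issue. FailedTaskSketch, EndReason, Deserialized and reportFailure are hypothetical stand-ins and are not Spark APIs.

```scala
// Hypothetical stand-ins for illustration only; none of these names come from Spark.
object FailedTaskSketch {
  sealed trait EndReason
  case object UnknownReason extends EndReason
  final case class Deserialized(message: String) extends EndReason

  // Runs `deserialize`, then always passes a reason to `handle`,
  // even if `deserialize` throws an Error that the catch does not match.
  def reportFailure(deserialize: () => EndReason)(handle: EndReason => Unit): Unit = {
    var reason: EndReason = UnknownReason
    try {
      reason = deserialize()
    } catch {
      case _: Exception => // keep UnknownReason; the task has failed anyway
    } finally {
      // Executed for Exceptions and Errors alike, so the handler is never skipped.
      handle(reason)
    }
  }

  def main(args: Array[String]): Unit = {
    var seen: Option[EndReason] = None
    try {
      // Simulate the failure described above: an Error, not an Exception.
      reportFailure(() => throw new NoClassDefFoundError("simulated"))(r => seen = Some(r))
    } catch {
      case _: NoClassDefFoundError => // the Error still propagates after the handler ran
    }
    println(seen)
  }
}
```

In this sketch the Error is still rethrown after the handler runs, so a wrapper such as Utils.logUncaughtExceptions would still see and log it. The only change in behavior is that the failure is reported before the thread dies.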
Issue Links
- relates to: HIVE-13525 HoS hangs when job is empty (Closed)
- links to