[jira] [Created] (FLINK-10500) Let ExecutionGraphDriver react to fail signal

Till Rohrmann created FLINK-10500:

             Summary: Let ExecutionGraphDriver react to fail signal
                 Key: FLINK-10500
                 URL: https://issues.apache.org/jira/browse/FLINK-10500
             Project: Flink
          Issue Type: Sub-task
          Components: Distributed Coordination
    Affects Versions: 1.7.0
            Reporter: Till Rohrmann
             Fix For: 1.7.0

To scale a job down when too few resources are available or when TaskManagers (TMs) have died, the {{ExecutionGraphDriver}} needs to be notified of failures. Depending on the failure type and the set of available resources, it can then decide whether to scale the job down or simply restart it. Within the scope of this issue, the {{ExecutionGraphDriver}} should simply call into the {{RestartStrategy}}.
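
As a rough illustration of the scope described here, the following self-contained sketch shows a driver that receives a fail signal and delegates the decision to a restart strategy. All types in it ({{SketchRestartStrategy}}, {{FixedAttemptsRestartStrategy}}, {{SketchExecutionGraphDriver}}) and their methods are hypothetical stand-ins invented for this example. They are not Flink's actual classes or signatures, and the scale-down decision is deliberately left out.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for a restart strategy; NOT Flink's real RestartStrategy interface. */
interface SketchRestartStrategy {
    /** Returns true if another restart attempt is allowed. */
    boolean canRestart();

    /** Performs the restart by running the given action. */
    void restart(Runnable restartAction);
}

/** Assumed example strategy: allows a fixed number of restarts, executed immediately. */
final class FixedAttemptsRestartStrategy implements SketchRestartStrategy {
    private final int maxAttempts;
    private int attempts;

    FixedAttemptsRestartStrategy(int maxAttempts) {
        this.maxAttempts = maxAttempts;
    }

    @Override
    public boolean canRestart() {
        return attempts < maxAttempts;
    }

    @Override
    public void restart(Runnable restartAction) {
        attempts++;
        restartAction.run();
    }
}

/** Hypothetical driver that reacts to a fail signal by calling into the restart strategy. */
final class SketchExecutionGraphDriver {
    enum State { RUNNING, FAILED }

    private final SketchRestartStrategy restartStrategy;
    private final List<String> log = new ArrayList<>();
    private State state = State.RUNNING;

    SketchExecutionGraphDriver(SketchRestartStrategy restartStrategy) {
        this.restartStrategy = restartStrategy;
    }

    /** Entry point for the fail signal; a fuller version could also decide to scale down here. */
    void onFailure(Throwable cause) {
        log.add("failure: " + cause.getMessage());
        if (restartStrategy.canRestart()) {
            // Stay in RUNNING and let the strategy trigger the restart.
            restartStrategy.restart(() -> log.add("restart"));
        } else {
            // No restart attempts left: the job fails terminally.
            state = State.FAILED;
            log.add("failed terminally");
        }
    }

    State getState() {
        return state;
    }

    List<String> getLog() {
        return log;
    }
}
```

In this sketch the driver only forwards the failure and records the outcome. Keeping the restart decision behind one interface would leave room for a later change that inspects the failure type and available resources before choosing between a restart and a scale-down.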
