[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

[jira] [Created] (FLINK-10240) Flexible scheduling strategy is needed for batch job

Zhu Zhu created FLINK-10240:

             Summary: Flexible scheduling strategy is needed for batch job
                 Key: FLINK-10240
                 URL: https://issues.apache.org/jira/browse/FLINK-10240
             Project: Flink
          Issue Type: New Feature
          Components: Distributed Coordination
            Reporter: Zhu Zhu

Currently in Flink we have 2 schedule mode:

1. EAGER mode starts all tasks at once, mainly for streaming job

2. LAZY_FROM_SOURCES mode starts a task once its input data is consumable


However, in batch job, input data ready does not always mean the task can work at once. 

One example is the hash join operation, where the operator first consumes one side(we call it build side) to setup a table, then consumes the other side(we call it probe side) to do the real join work. If the probe side is started early, it just get stuck on back pressure as the join operator will not consume data from it before the building stage is done, causing a waste of resources.

If we have the probe side task start after the build stage is done, both the build and probe side can have more computing resources as they are staggered.


That's way we think a flexible scheduling strategy is needed, allowing job owners to customize the vertex schedule order and constraints. Better resource utilization usually means better performance.

This message was sent by Atlassian JIRA