Re: [DISCUSS] Task speculative execution for Flink batch


Thanks Yangyu for the nice design doc! One thing to consider is the
granularity of speculation. Multiple tasks may exchange data in pipelined
mode. In that case, speculating on a single task may not be enough, but
you might be able to solve this by increasing the granularity of
speculation. The traditional case of a single speculative task can then be
considered a special case of this.

Xiaowei

On Sat, Nov 17, 2018 at 10:27 PM Tao Yangyu <ryantaocer@xxxxxxxxx> wrote:

> Hi all,
>
> After some refinement, the detailed design doc is here:
>
> https://docs.google.com/document/d/1X_Pfo4WcO-TEZmmVTTYNn44LQg5gnFeeaeqM7ZNLQ7M/edit?usp=sharing
>
> Your reviews and comments are much appreciated and will help a great deal
> in completing the feature.
>
> Best,
> Ryan
>
>
> On Wed, Nov 7, 2018 at 4:49 PM Tao Yangyu <ryantaocer@xxxxxxxxx> wrote:
>
> > Thanks so much for all your feedback!
> >
> > Yes, as Jin Sun mentioned above, the design currently targets batch in
> > order to explore the general framework and basic modules. The strategy
> > could also be applied to streaming with some extended code, for example
> > for result commitment.
> >
> > On Wed, Nov 7, 2018 at 8:38 AM Jin Sun <isunjin@xxxxxxxxx> wrote:
> >
> >> I think this targets batch at the very beginning; the idea should
> >> also work for both cases, with different algorithms/strategies.
> >>
> >> Ryan, since you are working on this, I will assign FLINK-10644 <
> >> https://issues.apache.org/jira/browse/FLINK-10644> to you.
> >>
> >> Jin
> >>
> >> > On Nov 6, 2018, at 4:45 AM, Till Rohrmann <trohrmann@xxxxxxxxxx> wrote:
> >> >
> >> > Thanks for starting this discussion Ryan. I'm looking forward to your
> >> > design document about this feature. Quick question: Will it be a
> >> > batch-only feature? If not, then it needs to take checkpointing into
> >> > account as well.
> >> >
> >> > Cheers,
> >> > Till
> >> >
> >> > On Tue, Nov 6, 2018 at 4:29 AM zhijiang <wangzhijiang999@xxxxxxxxxx.invalid>
> >> > wrote:
> >> >
> >> >> Thanks Yangyu for launching this discussion.
> >> >>
> >> >> I really like this proposal. In production we have frequently seen
> >> >> long-tail tasks delay the total execution time of a batch job. We
> >> >> have also had some thoughts about introducing this mechanism. Looking
> >> >> forward to your detailed design doc, then we can discuss further.
> >> >>
> >> >> Best,
> >> >> Zhijiang
> >> >> ------------------------------------------------------------------
> >> >> From: Tao Yangyu <ryantaocer@xxxxxxxxx>
> >> >> Sent: Tuesday, November 6, 2018, 11:01
> >> >> To: dev <dev@xxxxxxxxxxxxxxxx>
> >> >> Subject: [DISCUSS] Task speculative execution for Flink batch
> >> >>
> >> >> Hi everyone,
> >> >>
> >> >> We propose task speculative execution for Flink batch, as described
> >> >> below.
> >> >>
> >> >> In batch mode, a job is usually divided into multiple parallel tasks
> >> >> executed across many nodes in the cluster. It is common to encounter
> >> >> performance degradation on some nodes due to hardware problems,
> >> >> occasional I/O contention, or high CPU load. This kind of degradation
> >> >> can make the tasks running on the node very slow; these are the
> >> >> so-called long-tail tasks. Although long-tail tasks do not fail, they
> >> >> can severely affect the total job running time. The Flink task
> >> >> scheduler currently does not take this long-tail problem into account.
> >> >>
> >> >>
> >> >>
> >> >> Here we propose a speculative execution strategy to handle the
> >> >> problem. The basic idea is to run a copy of a task on another node
> >> >> when the original task is identified as a long-tail task. In more
> >> >> detail, a speculative task is triggered when the scheduler detects
> >> >> that the data processing throughput of a task is much lower than that
> >> >> of the others. The speculative task is executed in parallel with the
> >> >> original one and shares the same failure retry mechanism. Once either
> >> >> task completes, the scheduler accepts its output as the final result
> >> >> and cancels the other one. Preliminary experiments have demonstrated
> >> >> the effectiveness of this approach.
> >> >>
> >> >>
> >> >> The detailed design doc will be ready soon. Your reviews and comments
> >> >> will be much appreciated.
> >> >>
> >> >>
> >> >> Thanks!
> >> >>
> >> >> Ryan
> >> >>
> >> >>
> >>
> >>
>
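
For readers following the thread, here is a rough sketch of the mechanism
discussed above: flag a task whose throughput is much lower than that of its
peers, widen the speculation unit to the tasks it is pipelined with, and let
the first finished attempt win. This is not Flink code and is not taken from
the design doc. All names here (find_slow_tasks, speculation_group,
pick_winner), the median-based rule and the slow_ratio value are assumptions
made purely for illustration; the proposal only says "much slower".

```python
from collections import defaultdict
from statistics import median


def find_slow_tasks(throughputs, slow_ratio=0.5):
    """Return ids of tasks whose throughput is below slow_ratio * median.

    throughputs: dict of task id -> records/second (hypothetical metric).
    slow_ratio: hypothetical threshold, not a value from the proposal.
    """
    if len(throughputs) < 2:
        return []  # nothing to compare against
    cutoff = slow_ratio * median(throughputs.values())
    return sorted(t for t, rate in throughputs.items() if rate < cutoff)


def speculation_group(task, pipelined_edges):
    """Return the task plus every task connected to it by pipelined edges.

    pipelined_edges: iterable of (producer, consumer) pairs, treated as
    undirected. A task with no pipelined edges forms a group of one, which
    is the traditional single-task speculation case.
    """
    neighbours = defaultdict(set)
    for a, b in pipelined_edges:
        neighbours[a].add(b)
        neighbours[b].add(a)
    seen, stack = {task}, [task]
    while stack:
        for nxt in neighbours[stack.pop()]:
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return sorted(seen)


def pick_winner(finish_times):
    """First attempt to finish wins; all other attempts are to be cancelled.

    finish_times: dict of attempt id -> finish time, or None if still running.
    Returns (winner, attempts_to_cancel), or (None, []) if none has finished.
    """
    done = {a: t for a, t in finish_times.items() if t is not None}
    if not done:
        return None, []
    winner = min(done, key=done.get)
    return winner, sorted(a for a in finish_times if a != winner)
```

For example, with throughputs of 100, 95, 20 and 105 records/second for four
map tasks, only the third one falls below half the median and would get a
speculative copy. If that task feeds a reducer through a pipelined edge, the
reducer joins its speculation group.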