osdir.com


[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [DISCUSS] Support Interactive Programming in Flink Table API


Hi Jiangjie,

Thank you for the explanation about the name of `cache()`, I understand why
you designed this way!

Another idea is whether we can specify a lifecycle for data persistence?
For example, persist (LifeCycle.SESSION), so that the user is not worried
about data loss, and will clearly specify the time range for keeping time.
At the same time, if we want to expand, we can also share in a certain
group of session, for example: LifeCycle.SESSION_GROUP(...), I am not sure,
just an immature suggestion, for reference only!

Bests,
Jincheng

Becket Qin <becket.qin@xxxxxxxxx> 于2018年11月23日周五 下午1:33写道:

> Re: Jincheng,
>
> Thanks for the feedback. Regarding cache() v.s. persist(), personally I
> find cache() to be more accurately describing the behavior, i.e. the Table
> is cached for the session, but will be deleted after the session is closed.
> persist() seems a little misleading as people might think the table will
> still be there even after the session is gone.
>
> Great point about mixing the batch and stream processing in the same job.
> We should absolutely move towards that goal. I imagine that would be a huge
> change across the board, including sources, operators and optimizations, to
> name some. Likely we will need several separate in-depth discussions.
>
> Thanks,
>
> Jiangjie (Becket) Qin
>
> On Fri, Nov 23, 2018 at 5:14 AM Xingcan Cui <xingcanc@xxxxxxxxx> wrote:
>
> > Hi all,
> >
> > @Shaoxuan, I think the lifecycle or access domain are both orthogonal to
> > the cache problem. Essentially, this may be the first time we plan to
> > introduce another storage mechanism other than the state. Maybe it’s
> better
> > to first draw a big picture and then concentrate on a specific part?
> >
> > @Becket, yes, actually I am more concerned with the underlying service.
> > This seems to be quite a major change to the existing codebase. As you
> > claimed, the service should be extendible to support other components and
> > we’d better discussed it in another thread.
> >
> > All in all, I also eager to enjoy the more interactive Table API, in case
> > of a general and flexible enough service mechanism.
> >
> > Best,
> > Xingcan
> >
> > > On Nov 22, 2018, at 10:16 AM, Xiaowei Jiang <xiaoweij@xxxxxxxxx>
> wrote:
> > >
> > > Relying on a callback for the temp table for clean up is not very
> > reliable.
> > > There is no guarantee that it will be executed successfully. We may
> risk
> > > leaks when that happens. I think that it's safer to have an association
> > > between temp table and session id. So we can always clean up temp
> tables
> > > which are no longer associated with any active sessions.
> > >
> > > Regards,
> > > Xiaowei
> > >
> > > On Thu, Nov 22, 2018 at 12:55 PM jincheng sun <
> sunjincheng121@xxxxxxxxx>
> > > wrote:
> > >
> > >> Hi Jiangjie&Shaoxuan,
> > >>
> > >> Thanks for initiating this great proposal!
> > >>
> > >> Interactive Programming is very useful and user friendly in case of
> your
> > >> examples.
> > >> Moreover, especially when a business has to be executed in several
> > stages
> > >> with dependencies,such as the pipeline of Flink ML, in order to
> utilize
> > the
> > >> intermediate calculation results we have to submit a job by
> > env.execute().
> > >>
> > >> About the `cache()`  , I think is better to named `persist()`, And The
> > >> Flink framework determines whether we internally cache in memory or
> > persist
> > >> to the storage system,Maybe save the data into state backend
> > >> (MemoryStateBackend or RocksDBStateBackend etc.)
> > >>
> > >> BTW, from the points of my view in the future, support for streaming
> and
> > >> batch mode switching in the same job will also benefit in "Interactive
> > >> Programming",  I am looking forward to your JIRAs and FLIP!
> > >>
> > >> Best,
> > >> Jincheng
> > >>
> > >>
> > >> Becket Qin <becket.qin@xxxxxxxxx> 于2018年11月20日周二 下午9:56写道:
> > >>
> > >>> Hi all,
> > >>>
> > >>> As a few recent email threads have pointed out, it is a promising
> > >>> opportunity to enhance Flink Table API in various aspects, including
> > >>> functionality and ease of use among others. One of the scenarios
> where
> > we
> > >>> feel Flink could improve is interactive programming. To explain the
> > >> issues
> > >>> and facilitate the discussion on the solution, we put together the
> > >>> following document with our proposal.
> > >>>
> > >>>
> > >>>
> > >>
> >
> https://docs.google.com/document/d/1d4T2zTyfe7hdncEUAxrlNOYr4e5IMNEZLyqSuuswkA0/edit?usp=sharing
> > >>>
> > >>> Feedback and comments are very welcome!
> > >>>
> > >>> Thanks,
> > >>>
> > >>> Jiangjie (Becket) Qin
> > >>>
> > >>
> >
> >
>