Re: [DISCUSS] Support Interactive Programming in Flink Table API
Thanks for the feedback. Regarding cache() vs. persist(), personally I
find that cache() describes the behavior more accurately, i.e. the Table
is cached for the session, but will be deleted after the session is closed.
persist() seems a little misleading as people might think the table will
still be there even after the session is gone.
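To make the intended lifetime concrete, here is a rough sketch of the
semantics I have in mind. It is only an illustration: SessionTableCache and
its methods are hypothetical names, not part of the Flink Table API or of
the proposal, and a real implementation would track the cached data itself
rather than just table names.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

/** Hypothetical session-scoped registry of cached tables (illustration only). */
class SessionTableCache {
    // session id -> names of the tables cached within that session
    private final Map<String, Set<String>> tablesBySession = new HashMap<>();

    /** Marks a table as cached for the given session. */
    void cache(String sessionId, String tableName) {
        tablesBySession.computeIfAbsent(sessionId, k -> new HashSet<>()).add(tableName);
    }

    /** Returns true if the table is currently cached in the given session. */
    boolean isCached(String sessionId, String tableName) {
        return tablesBySession
                .getOrDefault(sessionId, Collections.emptySet())
                .contains(tableName);
    }

    /** Closing a session drops everything cached in it; returns the number of tables dropped. */
    int closeSession(String sessionId) {
        Set<String> dropped = tablesBySession.remove(sessionId);
        return dropped == null ? 0 : dropped.size();
    }
}
```

The point is that nothing survives closeSession(), which is why I think
"cache" sets the right expectation and "persist" does not.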
Great point about mixing the batch and stream processing in the same job.
We should absolutely move towards that goal. I imagine that would be a huge
change across the board, including sources, operators and optimizations, to
name some. Likely we will need several separate in-depth discussions.
Jiangjie (Becket) Qin
On Fri, Nov 23, 2018 at 5:14 AM Xingcan Cui <xingcanc@xxxxxxxxx> wrote:
> Hi all,
> @Shaoxuan, I think the lifecycle and access domain are both orthogonal to
> the cache problem. Essentially, this may be the first time we plan to
> introduce another storage mechanism other than the state. Maybe it’s better
> to first draw a big picture and then concentrate on a specific part?
> @Becket, yes, actually I am more concerned with the underlying service.
> This seems to be quite a major change to the existing codebase. As you
> claimed, the service should be extensible to support other components and
> we’d better discuss it in another thread.
> All in all, I am also eager to enjoy a more interactive Table API, provided
> there is a general and flexible enough service mechanism.
> > On Nov 22, 2018, at 10:16 AM, Xiaowei Jiang <xiaoweij@xxxxxxxxx> wrote:
> > Relying on a callback for the temp table for cleanup is not very reliable.
> > There is no guarantee that it will be executed successfully. We may risk
> > leaks when that happens. I think that it's safer to have an association
> > between temp table and session id. So we can always clean up temp tables
> > which are no longer associated with any active sessions.
> > Regards,
> > Xiaowei
> > On Thu, Nov 22, 2018 at 12:55 PM jincheng sun <sunjincheng121@xxxxxxxxx>
> > wrote:
> >> Hi Jiangjie & Shaoxuan,
> >> Thanks for initiating this great proposal!
> >> Interactive Programming is very useful and user friendly, as your
> >> examples show.
> >> Moreover, especially when a business has to be executed in several
> >> with dependencies, such as the pipeline of Flink ML, in order to utilize
> >> intermediate calculation results we have to submit a job by
> >> About `cache()`, I think it is better to name it `persist()`, and let the
> >> Flink framework determine whether we internally cache in memory or
> >> persist to the storage system. Maybe save the data into a state backend
> >> (MemoryStateBackend or RocksDBStateBackend etc.)
> >> BTW, from my point of view, future support for switching between
> >> streaming and batch mode in the same job will also benefit "Interactive
> >> Programming". I am looking forward to your JIRAs and FLIP!
> >> Best,
> >> Jincheng
> >> On Tue, Nov 20, 2018 at 9:56 PM Becket Qin <becket.qin@xxxxxxxxx> wrote:
> >>> Hi all,
> >>> As a few recent email threads have pointed out, it is a promising
> >>> opportunity to enhance Flink Table API in various aspects, including
> >>> functionality and ease of use among others. One of the scenarios where
> >>> we feel Flink could improve is interactive programming. To explain the
> >>> issues and facilitate the discussion on the solution, we put together
> >>> the following document with our proposal.
> >>> Feedback and comments are very welcome!
> >>> Thanks,
> >>> Jiangjie (Becket) Qin