Re: [Proposal] Utilities for reading, transforming and creating Streaming savepoints


+1. Being able to analyze the state is a huge operational advantage.
Thanks Gyula for the POC and I would be very interested in contributing to
the work.

--
Rong

On Tue, Aug 21, 2018 at 4:26 AM Till Rohrmann <trohrmann@xxxxxxxxxx> wrote:

> big +1 for this feature. A tool to get your state out of and into Flink
> will be tremendously helpful.
>
> On Mon, Aug 20, 2018 at 10:21 AM Aljoscha Krettek <aljoscha@xxxxxxxxxx> wrote:
>
> > +1 I'd like to have something like this in Flink a lot!
> >
> > > On 19. Aug 2018, at 11:57, Gyula Fóra <gyula.fora@xxxxxxxxx> wrote:
> > >
> > > Hi all!
> > >
> > > Thanks for the feedback and I'm happy there is some interest :)
> > > Tomorrow I will start improving the proposal based on the feedback and
> > > will get back to work.
> > >
> > > If you are interested in working together on this, please ping me and
> > > we can discuss some ideas/plans and how to share work.
> > >
> > > Cheers,
> > > Gyula
> > >
> > > Paris Carbone <parisc@xxxxxx> wrote (on Sat, Aug 18, 2018, 9:03):
> > >
> > >> +1
> > >>
> > >> Might also be a good start to implement queryable stream state with
> > >> snapshot isolation using that mechanism.
> > >>
> > >> Paris
> > >>
> > >>> On 17 Aug 2018, at 12:28, Gyula Fóra <gyula.fora@xxxxxxxxx> wrote:
> > >>>
> > >>> Hi All!
> > >>>
> > >>> I want to share with you a little project we have been working on at
> > >>> King (with some help from some dataArtisans folks). I think this would
> > >>> be a valuable addition to Flink and solve a bunch of outstanding
> > >>> production use-cases and headaches around state bootstrapping and
> > >>> state analytics.
> > >>>
> > >>> We have built a quick and dirty POC implementation on top of Flink 1.6.
> > >>> Please check the README for some nice examples to get a quick idea:
> > >>>
> > >>> https://github.com/king/bravo
> > >>>
> > >>> *Short story*
> > >>> Bravo is a convenient state reader and writer library leveraging
> > >>> Flink's batch processing capabilities. It supports processing and
> > >>> writing Flink streaming savepoints. At the moment it only supports
> > >>> processing RocksDB savepoints, but this can be extended in the future
> > >>> to other state backends and checkpoint types.
> > >>>
> > >>> Our goal is to cover a few basic features:
> > >>>
> > >>>  - Converting keyed states to Flink DataSets for processing and
> > >>>  analytics
> > >>>  - Reading/Writing non-keyed operator states
> > >>>  - Bootstrapping keyed states from Flink DataSets and creating new
> > >>>  valid savepoints
> > >>>  - Transforming existing savepoints by replacing/changing some states
> > >>>
> > >>>
> > >>> Some example use-cases:
> > >>>
> > >>>  - Point-in-time state analytics across all operators and keys
> > >>>  - Bootstrap the state of a streaming job from external resources such
> > >>>  as a database or filesystem
> > >>>  - Validate and potentially repair corrupted state of a streaming job
> > >>>  - Change max parallelism of a job
> > >>>
> > >>>
> > >>> Our main goal is to start working together with other Flink production
> > >>> users and make this something useful that can be part of Flink. So if
> > >>> you have use-cases please talk to us :)
> > >>> I have also started a Google doc which contains a little bit more info
> > >>> than the readme and could be a starting place for discussions:
> > >>>
> > >>> https://docs.google.com/document/d/103k6wPX20kMu5H3SOOXSg5PZIaYpwdhqBMr-ppkFL5E/edit?usp=sharing
> > >>>
> > >>> I know there are a bunch of rough edges and bugs (and no tests) but
> > >>> our motto is: If you are not embarrassed, you released too late :)
> > >>>
> > >>> Please let me know what you think!
> > >>>
> > >>> Cheers,
> > >>> Gyula
> > >>
> > >>
> >
> >
>
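
A note on the "Change max parallelism of a job" use-case quoted above.
Flink assigns every key to a key group derived from the key's hash and the
job's max parallelism, and keyed state in a savepoint is organized by key
group. The sketch below illustrates why changing max parallelism means the
keyed state has to be rewritten, while changing plain parallelism does not.
It is not Bravo or Flink code: the class name, the helper names and the hash
mixing step are assumptions made for the illustration (Flink uses its own
murmur-style hash). Only the overall scheme follows Flink's assignment: hash
modulo max parallelism, then key group * parallelism / max parallelism.

```java
public class KeyGroupSketch {

    // Stand-in bit mixer (assumption): any deterministic hash is enough
    // to illustrate the scheme; Flink's real hash function differs.
    static int mix(int h) {
        h ^= h >>> 16;
        h *= 0x85ebca6b;
        h ^= h >>> 13;
        h *= 0xc2b2ae35;
        h ^= h >>> 16;
        return h;
    }

    // A key's group depends only on the key and on max parallelism.
    static int keyGroupFor(Object key, int maxParallelism) {
        return Math.floorMod(mix(key.hashCode()), maxParallelism);
    }

    // Key groups are split into contiguous ranges, one per parallel subtask.
    static int operatorIndexFor(int keyGroup, int maxParallelism, int parallelism) {
        return keyGroup * parallelism / maxParallelism;
    }

    // Counts how many of the integer keys 0..numKeys-1 land in a different
    // key group when max parallelism changes from oldMax to newMax.
    static int countMovedKeys(int numKeys, int oldMax, int newMax) {
        int moved = 0;
        for (int key = 0; key < numKeys; key++) {
            if (keyGroupFor(key, oldMax) != keyGroupFor(key, newMax)) {
                moved++;
            }
        }
        return moved;
    }

    public static void main(String[] args) {
        // Rescaling from 4 to 8 subtasks keeps every key in its key group;
        // only the mapping of key groups to subtasks changes.
        int group = keyGroupFor(42, 128);
        System.out.println("key 42 -> key group " + group
                + ", subtask " + operatorIndexFor(group, 128, 4) + " of 4"
                + ", subtask " + operatorIndexFor(group, 128, 8) + " of 8");

        // Changing max parallelism moves keys between key groups, so the
        // state has to be re-keyed into a new savepoint.
        System.out.println("keys moved when max parallelism goes 128 -> 256: "
                + countMovedKeys(1000, 128, 256) + " of 1000");
    }
}
```

Because the key group is a function of max parallelism, a savepoint written
with one value cannot simply be restored with another. A tool that reads the
keyed state as a dataset and writes it back under the new setting is one way
to get around that.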