Re: [Proposal] Utilities for reading, transforming and creating Streaming savepoints


Thanks,

I guess the most helpful first step for anyone interested in contributing
is to try it on some streaming state :)

We have tested these tools at King to analyze, transform and perform some
aggregations on our user states. The major limitation is that it requires
RocksDB savepoints to work, but other than that we successfully analyzed a
few hundred gigabytes of state, including reading keyed and broadcast
states from different operators. Also, you need an existing savepoint before
you can create a new savepoint (with whatever state).

Once some people have played with it, we can probably greatly
improve the API and user experience, as it is pretty low-level at the
moment. I suggest we use the King git repo <https://github.com/king/bravo> for
now to track some features before it is in a shape that deserves a Flink
PR. We are super happy to take any improvements and code contributions from
anyone, so don't hesitate to reach out to me if you have some ideas.

Gyula


Rong Rong <walterddr@xxxxxxxxx> wrote (on Wed, Aug 22, 2018, 17:06):

> +1. Being able to analyze the state is a huge operational advantage.
> Thanks Gyula for the POC and I would be very interested in contributing to
> the work.
>
> --
> Rong
>
> On Tue, Aug 21, 2018 at 4:26 AM Till Rohrmann <trohrmann@xxxxxxxxxx>
> wrote:
>
> > big +1 for this feature. A tool to get your state out of and into Flink
> > will be tremendously helpful.
> >
> > On Mon, Aug 20, 2018 at 10:21 AM Aljoscha Krettek <aljoscha@xxxxxxxxxx>
> > wrote:
> >
> > > +1 I'd like to have something like this in Flink a lot!
> > >
> > > > On 19. Aug 2018, at 11:57, Gyula Fóra <gyula.fora@xxxxxxxxx> wrote:
> > > >
> > > > Hi all!
> > > >
> > > > Thanks for the feedback and I'm happy there is some interest :)
> > > > Tomorrow I will start improving the proposal based on the feedback and
> > > > will get back to work.
> > > >
> > > > If you are interested in working together on this, please ping me and we
> > > > can discuss some ideas/plans and how to share work.
> > > >
> > > > Cheers,
> > > > Gyula
> > > >
> > > > Paris Carbone <parisc@xxxxxx> wrote (on Sat, Aug 18, 2018, 9:03):
> > > >
> > > >> +1
> > > >>
> > > >> Might also be a good start to implement queryable stream state with
> > > >> snapshot isolation using that mechanism.
> > > >>
> > > >> Paris
> > > >>
> > > >>> On 17 Aug 2018, at 12:28, Gyula Fóra <gyula.fora@xxxxxxxxx> wrote:
> > > >>>
> > > >>> Hi All!
> > > >>>
> > > > >>> I want to share with you a little project we have been working on at
> > > > >>> King (with some help from some dataArtisans folks). I think this would be
> > > > >>> a valuable addition to Flink and solve a bunch of outstanding production
> > > > >>> use-cases and headaches around state bootstrapping and state analytics.
> > > >>>
> > > > >>> We have built a quick and dirty POC implementation on top of Flink 1.6,
> > > > >>> please check the README for some nice examples to get a quick idea:
> > > >>>
> > > >>> https://github.com/king/bravo
> > > >>>
> > > >>> *Short story*
> > > > >>> Bravo is a convenient state reader and writer library leveraging
> > > > >>> Flink's batch processing capabilities. It supports processing and
> > > > >>> writing Flink streaming savepoints. At the moment it only supports
> > > > >>> processing RocksDB savepoints, but this can be extended in the future
> > > > >>> to other state backends and checkpoint types.
> > > >>>
> > > >>> Our goal is to cover a few basic features:
> > > >>>
> > > > >>>  - Converting keyed states to Flink DataSets for processing and
> > > > >>>  analytics
> > > > >>>  - Reading/writing non-keyed operator states
> > > > >>>  - Bootstrapping keyed states from Flink DataSets and creating new valid
> > > > >>>  savepoints
> > > > >>>  - Transforming existing savepoints by replacing/changing some states
> > > >>>
> > > >>>
> > > >>> Some example use-cases:
> > > >>>
> > > > >>>  - Point-in-time state analytics across all operators and keys
> > > > >>>  - Bootstrap state of a streaming job from external resources such as
> > > > >>>  reading from database/filesystem
> > > > >>>  - Validate and potentially repair corrupted state of a streaming job
> > > > >>>  - Change max parallelism of a job
> > > >>>
> > > >>>
> > > > >>> Our main goal is to start working together with other Flink production
> > > > >>> users and make this something useful that can be part of Flink. So if you
> > > > >>> have use-cases please talk to us :)
> > > > >>> I have also started a google doc which contains a little bit more info
> > > > >>> than the readme and could be a starting place for discussions:
> > > > >>>
> > > > >>> https://docs.google.com/document/d/103k6wPX20kMu5H3SOOXSg5PZIaYpwdhqBMr-ppkFL5E/edit?usp=sharing
> > > >>>
> > > > >>> I know there are a bunch of rough edges and bugs (and no tests) but our
> > > > >>> motto is: If you are not embarrassed, you released too late :)
> > > >>>
> > > >>> Please let me know what you think!
> > > >>>
> > > >>> Cheers,
> > > >>> Gyula
> > > >>
> > > >>
> > >
> > >
> >
>