Re: [Proposal] Utilities for reading, transforming and creating Streaming savepoints


+1, I'd very much like to have something like this in Flink!

> On 19. Aug 2018, at 11:57, Gyula Fóra <gyula.fora@xxxxxxxxx> wrote:
> 
> Hi all!
> 
> Thanks for the feedback and I'm happy there is some interest :)
> Tomorrow I will start improving the proposal based on the feedback and will
> get back to work.
> 
> If you are interested in working together on this, please ping me and we can
> discuss some ideas/plans and how to share the work.
> 
> Cheers,
> Gyula
> 
> Paris Carbone <parisc@xxxxxx> wrote (on Sat, 18 Aug 2018, 9:03):
> 
>> +1
>> 
>> It might also be a good starting point for implementing queryable stream
>> state with snapshot isolation using that mechanism.
>> 
>> Paris
>> 
>>> On 17 Aug 2018, at 12:28, Gyula Fóra <gyula.fora@xxxxxxxxx> wrote:
>>> 
>>> Hi All!
>>> 
>>> I want to share with you a little project we have been working on at King
>>> (with some help from some dataArtisans folks). I think this would be a
>>> valuable addition to Flink and would address a number of outstanding
>>> production use-cases and headaches around state bootstrapping and state
>>> analytics.
>>> 
>>> We have built a quick and dirty POC implementation on top of Flink 1.6.
>>> Please check the README for some nice examples to get a quick idea:
>>> 
>>> https://github.com/king/bravo
>>> 
>>> *Short story*
>>> Bravo is a convenient state reader and writer library that leverages
>>> Flink's batch processing capabilities. It supports processing and writing
>>> Flink streaming savepoints. At the moment it only supports processing
>>> RocksDB savepoints, but this can be extended in the future to other state
>>> backends and checkpoint types.
>>> 
>>> Our goal is to cover a few basic features:
>>> 
>>>  - Converting keyed states to Flink DataSets for processing and
>>>  analytics
>>>  - Reading/writing non-keyed operator states
>>>  - Bootstrapping keyed states from Flink DataSets and creating new valid
>>>  savepoints
>>>  - Transforming existing savepoints by replacing/changing some states
>>> 
>>> 
>>> Some example use-cases:
>>> 
>>>  - Point-in-time state analytics across all operators and keys
>>>  - Bootstrap state of a streaming job from external resources such as
>>>  a database or filesystem
>>>  - Validate and potentially repair corrupted state of a streaming job
>>>  - Change max parallelism of a job
>>> 
>>> 
>>> Our main goal is to start working together with other Flink production
>>> users and make this something useful that can be part of Flink. So if you
>>> have use-cases please talk to us :)
>>> I have also started a Google doc which contains a little bit more info than
>>> the README and could be a starting place for discussions:
>>> 
>>> https://docs.google.com/document/d/103k6wPX20kMu5H3SOOXSg5PZIaYpwdhqBMr-ppkFL5E/edit?usp=sharing
>>> 
>>> I know there are a bunch of rough edges and bugs (and no tests) but our
>>> motto is: If you are not embarrassed, you released too late :)
>>> 
>>> Please let me know what you think!
>>> 
>>> Cheers,
>>> Gyula
>> 
>>