osdir.com

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: Clarification around versioning and flink's state serialization


Yes that helps a lot! 

Just to clarify:
- If using Avro types in 1.7, no explicit declaration of serializers needs to be done to have state evolution. But all other evolvable types (e.g Protobuf) still need to be registered and evolved manually?
- If specifying `disableGenericTypes` on my execution context, anything that falls back to Kryo will cause an error.

Would love to see more updated slides if you don't mind.

Thanks for taking the time,
Padarn 


On Fri, Dec 21, 2018 at 10:04 PM Tzu-Li (Gordon) Tai <tzulitai@xxxxxxxxxx> wrote:

On Fri, Dec 21, 2018, 9:55 PM Tzu-Li (Gordon) Tai <tzulitai@xxxxxxxxxx wrote:
Hi,

Yes, if Flink does not recognize your registered state type, it will by default use Kryo for the serialization.
And generally speaking, Kryo does not have good support for evolvable schemas compared to other serialization frameworks such as Avro or Protobuf.

The reason why Flink defaults to Kryo for unrecognizable types has some historical reasons due to the original use of Flink's type serialization stack being used on the batch side, but IMO the short answer is that it would make sense to have a different default serializer (perhaps Avro) for snapshotting state in streaming programs.
However, I believe this would be better suited as a separate discussion thread.

The good news is that with Flink 1.7, state schema evolution is fully supported out of the box for Avro types, such as GenericRecord or code generated SpecificRecords.
If you want to have evolvable schema for your state types, then it is recommended to use Avro as state types.
Support for evolving schema of other data types such as POJOs and Scala case classes is also on the radar for future releases.

Does this help answer your question?

By the way, the slides your are looking at I would consider quite outdated for the topic, since Flink 1.7 was released with much smoother support for state schema evolution.
An updated version of the slides is not yet publicly available, but if you want I can send you one privately.
Otherwise, the Flink docs for 1.7 would also be equally helpful.

Cheers,
Gordon


On Fri, Dec 21, 2018, 8:11 PM Padarn Wilson <padarn.wilson@xxxxxxxx wrote:
Hi all,

I am trying to understand the situation with state serialization in flink. I'm looking at a number of sources, but slide 35 from here crystalizes my confusion:


So, I understand that if 'Flink's own serialization stack' is unable to serialize a type you define, then it will fall back on Kryo generics. In this case, I believe what I'm being told is that state compatibility is difficult to ensure, and schema evolution in your jobs is not possible.

However on this slide, they say
"
       Kryo is generally not  recommended ...

       Serialization frameworks with schema evolution support is recommended: Avro,         Thrift 
"
So is this implying that Flink's non-default serialization stack does not support schema evolution? In this case is it best practice to register custom serializers whenever possible.

Thanks


Grab is hiring. Learn more at https://grab.careers

By communicating with Grab Inc and/or its subsidiaries, associate companies and jointly controlled entities (“Grab Group”), you are deemed to have consented to processing of your personal data as set out in the Privacy Notice which can be viewed at https://grab.com/privacy/

This email contains confidential information and is only for the intended recipient(s). If you are not the intended recipient(s), please do not disseminate, distribute or copy this email and notify Grab Group immediately if you have received this by mistake and delete this email from your system. Email transmission cannot be guaranteed to be secure or error-free as any information therein could be intercepted, corrupted, lost, destroyed, delayed or incomplete, or contain viruses. Grab Group do not accept liability for any errors or omissions in the contents of this email arises as a result of email transmission. All intellectual property rights in this email and attachments therein shall remain vested in Grab Group, unless otherwise provided by law.

Grab is hiring. Learn more at https://grab.careers

By communicating with Grab Inc and/or its subsidiaries, associate companies and jointly controlled entities (“Grab Group”), you are deemed to have consented to processing of your personal data as set out in the Privacy Notice which can be viewed at https://grab.com/privacy/

This email contains confidential information and is only for the intended recipient(s). If you are not the intended recipient(s), please do not disseminate, distribute or copy this email and notify Grab Group immediately if you have received this by mistake and delete this email from your system. Email transmission cannot be guaranteed to be secure or error-free as any information therein could be intercepted, corrupted, lost, destroyed, delayed or incomplete, or contain viruses. Grab Group do not accept liability for any errors or omissions in the contents of this email arises as a result of email transmission. All intellectual property rights in this email and attachments therein shall remain vested in Grab Group, unless otherwise provided by law.