[jira] [Created] (FLINK-9913) Improve output serialization only once in RecordWriter

zhijiang created FLINK-9913:

             Summary: Improve output serialization only once in RecordWriter
                 Key: FLINK-9913
                 URL: https://issues.apache.org/jira/browse/FLINK-9913
             Project: Flink
          Issue Type: Improvement
          Components: Network
    Affects Versions: 1.6.0
            Reporter: zhijiang
            Assignee: zhijiang
             Fix For: 1.6.0

Currently the {{RecordWriter}} either emits each output record to the channels chosen by the {{ChannelSelector}} or broadcasts it to all channels directly. Each channel has its own {{RecordSerializer}}, so a record is serialized as many times as there are selected channels.

Data serialization is an expensive operation, so serializing each record only once, no matter how many channels it goes to, should bring a clear benefit.

I suggest the following changes to achieve this:
 # Create only one {{RecordSerializer}} in the {{RecordWriter}}, shared by all channels.
 # Serialize each output record into an intermediate data buffer only once, regardless of how many channels are selected.
 # Copy the intermediate serialization result into the {{BufferBuilder}} of each selected channel.
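
The three steps can be illustrated with a small, self-contained Java sketch. It is not Flink code: {{SerializeOnceWriter}} and all of its methods are hypothetical names, a record is simplified to a {{String}}, the wire format is an assumed length-prefixed UTF-8 encoding, and a {{ByteArrayOutputStream}} stands in for each channel's {{BufferBuilder}}.

```java
import java.io.ByteArrayOutputStream;
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only; none of these names are Flink classes.
class SerializeOnceWriter {
    // One byte sink per channel, standing in for a per-channel BufferBuilder.
    private final List<ByteArrayOutputStream> channels = new ArrayList<>();
    // Counts how many times a record was actually serialized.
    private int serializationCount = 0;

    SerializeOnceWriter(int numChannels) {
        for (int i = 0; i < numChannels; i++) {
            channels.add(new ByteArrayOutputStream());
        }
    }

    // Steps 1 and 2: a single serialization path writes the record into an
    // intermediate buffer (assumed format: 4-byte length, then UTF-8 bytes).
    private byte[] serialize(String record) {
        serializationCount++;
        byte[] payload = record.getBytes(StandardCharsets.UTF_8);
        return ByteBuffer.allocate(4 + payload.length)
                .putInt(payload.length)
                .put(payload)
                .array();
    }

    // Step 3: copy the already-serialized bytes into every selected channel.
    void emit(String record, int[] selectedChannels) {
        byte[] bytes = serialize(record);
        for (int channel : selectedChannels) {
            channels.get(channel).write(bytes, 0, bytes.length);
        }
    }

    // Broadcasting is the same copy step applied to all channels.
    void broadcast(String record) {
        byte[] bytes = serialize(record);
        for (ByteArrayOutputStream channel : channels) {
            channel.write(bytes, 0, bytes.length);
        }
    }

    int serializationCount() {
        return serializationCount;
    }

    byte[] channelBytes(int channel) {
        return channels.get(channel).toByteArray();
    }
}
```

In this sketch the serialization cost is paid once per record and the per-channel cost is reduced to a byte copy, which is the trade-off the proposal relies on. How the copy interacts with buffer boundaries and spilling in the real {{RecordSerializer}} is not covered here.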
