OSDir


[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [DISCUSS] Flink 1.6 features


Since wishes are free:

- Standalone cluster job isolation:
https://issues.apache.org/jira/browse/FLINK-8886
- Proper sliding window joins (not overlapping hoping window joins):
https://issues.apache.org/jira/browse/FLINK-6243
- Sharing state across operators:
https://issues.apache.org/jira/browse/FLINK-6239
- Synchronizing streams: https://issues.apache.org/jira/browse/FLINK-4558

Seconded:
- Atomic cancel-with-savepoint:
https://issues.apache.org/jira/browse/FLINK-7634
- Support dynamically changing CEP patterns :
https://issues.apache.org/jira/browse/FLINK-7129


On Fri, Jun 8, 2018 at 1:31 PM, Stephan Ewen <sewen@xxxxxxxxxx> wrote:

> Hi all!
>
> Thanks for the discussion and good input. Many suggestions fit well with
> the proposal above.
>
> Please bear in mind that with a time-based release model, we would release
> whatever is mature by end of July.
> The good thing is we could schedule the next release not too far after
> that, so that the features that did not quite make it will not be delayed
> too long.
> In some sense, you could read this as as "*what to do first*" list,
> rather than "*this goes in, other things stay out"*.
>
> Some thoughts on some of the suggestions
>
> *Kubernetes integration:* An opaque integration with Kubernetes should be
> supported through the "as a library" mode. For a deeper integration, I know
> that some committers have experimented with some PoC code. I would let Till
> add some thoughts, he has worked the most on the deployment parts recently.
>
> *Per partition watermarks with idleness:* Good point, could one implement
> that on the current interface, with a periodic watermark extractor?
>
> *Atomic cancel-with-savepoint:* Agreed, this is important. Making this
> work with all sources needs a bit more work. We should have this in the
> roadmap.
>
> *Elastic Bloomfilters:* This seems like an interesting new feature - the
> above suggested feature set was more about addressing some longer standing
> issues/requests. However, nothing should prevent contributors to work on
> that.
>
> Best,
> Stephan
>
>
> On Wed, Jun 6, 2018 at 6:23 AM, Yan Zhou [FDS Science] <yzhou@xxxxxxxxxxx>
> wrote:
>
>> +1 on https://issues.apache.org/jira/browse/FLINK-5479
>> [FLINK-5479] Per-partition watermarks in ...
>> <https://issues.apache.org/jira/browse/FLINK-5479>
>> issues.apache.org
>> Reported in ML: http://apache-flink-user-maili
>> ng-list-archive.2336050.n4.nabble.com/Kafka-topic-partition-
>> skewness-causes-watermark-not-being-emitted-td11008.html It's normally
>> not a common case to have Kafka partitions not producing any data, but
>> it'll probably be good to handle this as well. I ...
>>
>> ------------------------------
>> *From:* Rico Bergmann <info@xxxxxxxxxxxxxxx>
>> *Sent:* Tuesday, June 5, 2018 9:12:00 PM
>> *To:* Hao Sun
>> *Cc:* dev@xxxxxxxxxxxxxxxx; user
>> *Subject:* Re: [DISCUSS] Flink 1.6 features
>>
>> +1 on K8s integration
>>
>>
>>
>> Am 06.06.2018 um 00:01 schrieb Hao Sun <hasun@xxxxxxxxxxx>:
>>
>> adding my vote to K8S Job mode, maybe it is this?
>> > Smoothen the integration in Container environment, like "Flink as a
>> Library", and easier integration with Kubernetes services and other proxies.
>>
>>
>>
>> On Mon, Jun 4, 2018 at 11:01 PM Ben Yan <yan.xiao.bin.mail@xxxxxxxxx>
>> wrote:
>>
>> Hi Stephan,
>>
>> Will  [ https://issues.apache.org/jira/browse/FLINK-5479 ]
>> (Per-partition watermarks in FlinkKafkaConsumer should consider idle
>> partitions) be included in 1.6? As we are seeing more users with this
>> issue on the mailing lists.
>>
>> Thanks.
>> Ben
>>
>> 2018-06-05 5:29 GMT+08:00 Che Lui Shum <shumc@xxxxxxxxxx>:
>>
>> Hi Stephan,
>>
>> Will FLINK-7129 (Support dynamically changing CEP patterns) be included
>> in 1.6? There were discussions about possibly including it in 1.6:
>> http://mail-archives.apache.org/mod_mbox/flink-user/201803.m
>> box/%3cCAMq=OU7gru2O9JtoWXn1Lc1F7NKcxAyN6A3e58kxctb4b508RQ@m
>> ail.gmail.com%3e
>>
>> Thanks,
>> Shirley Shum
>>
>> [image: Inactive hide details for Stephan Ewen ---06/04/2018 02:21:47
>> AM---Hi Flink Community! The release of Apache Flink 1.5 has happ]Stephan
>> Ewen ---06/04/2018 02:21:47 AM---Hi Flink Community! The release of Apache
>> Flink 1.5 has happened (yay!) - so it is a good time
>>
>> From: Stephan Ewen <sewen@xxxxxxxxxx>
>> To: dev@xxxxxxxxxxxxxxxx, user <user@xxxxxxxxxxxxxxxx>
>> Date: 06/04/2018 02:21 AM
>> Subject: [DISCUSS] Flink 1.6 features
>> ------------------------------
>>
>>
>>
>> Hi Flink Community!
>>
>> The release of Apache Flink 1.5 has happened (yay!) - so it is a good
>> time to start talking about what to do for release 1.6.
>>
>> *== Suggested release timeline ==*
>>
>> I would propose to release around *end of July* (that is 8-9 weeks from
>> now).
>>
>> The rational behind that: There was a lot of effort in release testing
>> automation (end-to-end tests, scripted stress tests) as part of release
>> 1.5. You may have noticed the big set of new modules under
>> "flink-end-to-end-tests" in the Flink repository. It delayed the 1.5
>> release a bit, and needs to continue as part of the coming release cycle,
>> but should help make releasing more lightweight from now on.
>>
>> (Side note: There are also some nightly stress tests that we created and
>> run at data Artisans, and where we are looking whether and in which way it
>> would make sense to contribute them to Flink.)
>>
>> *== Features and focus areas ==*
>>
>> We had a lot of big and heavy features in Flink 1.5, with FLIP-6, the new
>> network stack, recovery, SQL joins and client, ... Following something like
>> a "tick-tock-model", I would suggest to focus the next release more on
>> integrations, tooling, and reducing user friction.
>>
>> Of course, this does not mean that no other pull request gets reviewed,
>> an no other topic will be examined - it is simply meant as a help to
>> understand where to expect more activity during the next release cycle.
>> Note that these are really the coarse focus areas - don't read this as a
>> comprehensive list.
>>
>> This list is my first suggestion, based on discussions with committers,
>> users, and mailing list questions.
>>
>>   - Support Java 9 and Scala 2.12
>>
>>   - Smoothen the integration in Container environment, like "Flink as a
>> Library", and easier integration with Kubernetes services and other proxies.
>>
>>   - Polish the remaing parts of the FLIP-6 rewrite
>>
>>   - Improve state backends with asynchronous timer snapshots, efficient
>> timer deletes, state TTL, and broadcast state support in RocksDB.
>>
>>   - Extends Streaming Sinks:
>>      - Bucketing Sink should support S3 properly (compensate for eventual
>> consistency), work with Flink's shaded S3 file systems, and efficiently
>> support formats that compress/index arcoss individual rows (Parquet, ORC,
>> ...)
>>      - Support ElasticSearch's new REST API
>>
>>   - Smoothen State Evolution to support type conversion on snapshot
>> restore
>>
>>   - Enhance Stream SQL and CEP
>>      - Add support for "update by key" Table Sources
>>      - Add more table sources and sinks (Kafka, Kinesis, Files, K/V
>> stores)
>>      - Expand SQL client
>>      - Integrate CEP and SQL, through MATCH_RECOGNIZE clause
>>      - Improve CEP Performance of SharedBuffer on RocksDB
>>
>>
>>
>>
>>
>