Re: MATCH_RECOGNIZE


That sounds good!
I haven't implemented a JDBC driver yet but this sounds like a good thing
to add to improve Flink's test infrastructure.

2018-08-06 23:45 GMT+02:00 Julian Hyde <jhyde@xxxxxxxxxx>:

> If a JDBC driver is a problem, it shouldn’t be hard to mock a connection
> that can create a statement that can describe itself and execute a query.
> Quidem makes light use of JDBC.
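
One way to picture the mocking idea above is a set of dynamic proxies that serve canned rows. This is only a sketch under stated assumptions: `MockJdbc`, its methods, and the sample EMP rows are hypothetical, only a handful of JDBC methods are stubbed, and it has not been checked against what Quidem actually calls.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Proxy;
import java.sql.Connection;
import java.sql.ResultSet;
import java.sql.ResultSetMetaData;
import java.sql.SQLException;
import java.sql.Statement;

/** Hypothetical in-memory JDBC stub; not part of Flink, Calcite, or Quidem. */
final class MockJdbc {
  private MockJdbc() {}

  /** Creates a dynamic proxy for a JDBC interface backed by the given handler. */
  private static <T> T stub(Class<T> type, InvocationHandler handler) {
    return type.cast(Proxy.newProxyInstance(
        MockJdbc.class.getClassLoader(), new Class<?>[] {type}, handler));
  }

  /** Converts a 1-based JDBC column index argument to a 0-based array index. */
  private static int index(Object[] args) {
    return ((Integer) args[0]) - 1;
  }

  /** A connection whose statements return the same canned rows for every query. */
  static Connection connection(String[] columns, Object[][] rows) {
    return stub(Connection.class, (proxy, method, args) -> {
      switch (method.getName()) {
        case "createStatement": return statement(columns, rows);
        case "close": return null;
        default: throw new UnsupportedOperationException(method.getName());
      }
    });
  }

  private static Statement statement(String[] columns, Object[][] rows) {
    return stub(Statement.class, (proxy, method, args) -> {
      switch (method.getName()) {
        case "executeQuery": return resultSet(columns, rows); // SQL text is ignored
        case "close": return null;
        default: throw new UnsupportedOperationException(method.getName());
      }
    });
  }

  private static ResultSet resultSet(String[] columns, Object[][] rows) {
    int[] cursor = {-1}; // position before the first row, as JDBC requires
    ResultSetMetaData metaData = stub(ResultSetMetaData.class, (proxy, method, args) -> {
      switch (method.getName()) {
        case "getColumnCount": return columns.length;
        case "getColumnLabel": return columns[index(args)];
        default: throw new UnsupportedOperationException(method.getName());
      }
    });
    return stub(ResultSet.class, (proxy, method, args) -> {
      switch (method.getName()) {
        case "next": return ++cursor[0] < rows.length;
        case "getObject": return rows[cursor[0]][index(args)]; // int index only
        case "getMetaData": return metaData;
        case "close": return null;
        default: throw new UnsupportedOperationException(method.getName());
      }
    });
  }

  /** Runs a query against the stub and renders every row as label=value pairs. */
  static String demo() {
    Connection c = connection(
        new String[] {"EMPNO", "ENAME"},
        new Object[][] {{7369, "SMITH"}, {7499, "ALLEN"}}); // illustrative rows
    StringBuilder sb = new StringBuilder();
    try (Statement s = c.createStatement();
         ResultSet rs = s.executeQuery("select * from emp")) {
      ResultSetMetaData md = rs.getMetaData();
      while (rs.next()) {
        for (int i = 1; i <= md.getColumnCount(); i++) {
          sb.append(md.getColumnLabel(i)).append('=').append(rs.getObject(i)).append(';');
        }
      }
    } catch (SQLException e) {
      throw new IllegalStateException(e);
    }
    return sb.toString();
  }

  public static void main(String[] args) {
    System.out.println(demo());
  }
}
```

Any method the stub does not list throws UnsupportedOperationException, so a real harness would show quickly which additional calls need stubbing.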
>
> > On Aug 6, 2018, at 10:33 AM, Fabian Hueske <fhueske@xxxxxxxxx> wrote:
> >
> > OK, I see.
> > Flink doesn't have support for JDBC yet.
> > Would need to look into that.
> >
> > 2018-08-02 21:35 GMT+02:00 Julian Hyde <jhyde@xxxxxxxxxx>:
> >
> >> Quidem can run on top of any JDBC data source (you just need to invoke it
> >> with a connection factory by implementing a simple SPI). But it requires
> >> queries to terminate (i.e. it can’t handle streaming queries). So, if Flink
> >> SQL were able to run queries on an EMP table, then I think it could be
> >> tested using Quidem.
> >>
> >>> On Aug 2, 2018, at 6:27 AM, Fabian Hueske <fhueske@xxxxxxxxx> wrote:
> >>>
> >>> Hi Julian,
> >>>
> >>> It would be great to use the same test suite.
> >>>
> >>> We have quite a few tests in Flink, but they are not very well organized.
> >>> I would love to have more structure for at least some of the tests.
> >>>
> >>> I had a quick look at how Calcite runs its Quidem tests.
> >>> I'm not sure this is a format that we could easily adapt to, but maybe
> >>> it's possible to put a test data set, queries, and results in a more
> >>> portable format.
> >>>
> >>> Best, Fabian
> >>>
> >>>
> >>>
> >>>
> >>>
> >>> 2018-07-31 19:54 GMT+02:00 Julian Hyde <jhyde@xxxxxxxxxx>:
> >>>
> >>>> I’m delighted that Flink is getting full SQL support for MATCH_RECOGNIZE.
> >>>>
> >>>> Sounds like it might be challenging to share the implementation, but could
> >>>> we perhaps share the test suite? (I.e. a set of SQL queries and their
> >>>> expected results.)
> >>>>
> >>>> I added a simple test in
> >>>> https://github.com/julianhyde/calcite/commit/ee460847643ec17544f310088affd99be4028bb6
> >>>> that could be extended.
> >>>>
> >>>> Julian
> >>>>
> >>>>
> >>>>> On Jul 31, 2018, at 8:07 AM, Fabian Hueske <fhueske@xxxxxxxxx> wrote:
> >>>>>
> >>>>> Hi everyone,
> >>>>>
> >>>>> I'd like to share the plans for MATCH_RECOGNIZE support in Flink.
> >>>>>
> >>>>> Flink has had a so-called CEP library for quite some time [1]. The CEP
> >>>>> library is popular and frequently used.
> >>>>> In a nutshell, the library provides a domain-specific API to define event
> >>>>> patterns. The patterns are translated into a state machine and evaluated
> >>>>> in a streaming program.
> >>>>>
> >>>>> Even before we learned about MATCH_RECOGNIZE, Till (another Flink
> >>>>> committer) and I gave a few talks about unifying SQL and CEP [2].
> >>>>> Hence, we were quite excited when we learned about MATCH_RECOGNIZE, and
> >>>>> even more so when it was added to Calcite.
> >>>>> Shortly after that, we got a PR [3] which translated the parsed
> >>>>> MATCH_RECOGNIZE clause into patterns of our CEP library.
> >>>>> However, we never really got to the point of merging that contribution,
> >>>>> mainly because there were some inconsistencies between the semantics of
> >>>>> MATCH_RECOGNIZE and those of Flink's CEP library.
> >>>>>
> >>>>> Recently, a Flink committer picked up this feature again, validated the
> >>>>> semantics, and made a few corrections [4].
> >>>>> The CEP library is now ready to support a subset of the MATCH_RECOGNIZE
> >>>>> features.
> >>>>> Unfortunately, MATCH_RECOGNIZE support won't make it into the upcoming
> >>>>> 1.6.0 release, but the plan is to add it for the 1.7.0 release.
> >>>>>
> >>>>> Regarding the idea of sharing parts of the evaluation logic:
> >>>>> Flink has runtime support for a subset of the MATCH_RECOGNIZE clause.
> >>>>> Unfortunately, I am not familiar with the internals of Flink's CEP
> >>>>> library and don't know how portable it is.
> >>>>>
> >>>>> Best, Fabian
> >>>>>
> >>>>> [1] https://ci.apache.org/projects/flink/flink-docs-release-1.5/dev/libs/cep.html
> >>>>> [2] https://www.slideshare.net/tillrohrmann/streaming-analytics-cep-two-sides-of-the-same-coin
> >>>>> [3] https://github.com/apache/flink/pull/4502
> >>>>> [4] https://issues.apache.org/jira/browse/FLINK-9593
> >>>>>
> >>>>> 2018-07-23 21:03 GMT+02:00 Sergey Nuyanzin <snuyanzin@xxxxxxxxx>:
> >>>>>
> >>>>>> Looks exciting.
> >>>>>> If possible, I would like to take part in it; however, I'm not sure
> >>>>>> about this week (I could start in August).
> >>>>>>
> >>>>>> On Mon, Jul 23, 2018 at 9:10 PM, Michael Mior <mmior@xxxxxxxxxx> wrote:
> >>>>>>
> >>>>>>> This does sound like my idea of fun, but unfortunately I won't have
> >>>>>>> the time to contribute in the near future. I'll keep this on my radar
> >>>>>>> though. I also shared this message with all the students in our
> >>>>>>> research group and I wouldn't be surprised if there was someone
> >>>>>>> willing to jump in. Thanks for keeping this moving Julian!
> >>>>>>>
> >>>>>>> --
> >>>>>>> Michael Mior
> >>>>>>> mmior@xxxxxxxxxx
> >>>>>>> On Mon, Jul 23, 2018 at 13:54, Julian Hyde <jhyde@xxxxxxxxxx> wrote:
> >>>>>>>>
> >>>>>>>> For quite a while we have had partial support for MATCH_RECOGNIZE. We
> >>>>>>>> support it in the parser and validator, but there is no runtime
> >>>>>>>> implementation. It’s a shame, because MATCH_RECOGNIZE is an incredibly
> >>>>>>>> powerful SQL feature for both traditional SQL (it’s in Oracle 12c) and
> >>>>>>>> for continuous query (aka complex event processing - CEP).
> >>>>>>>>
> >>>>>>>> I figure it’s time to change that. My plan is to implement it
> >>>>>>>> incrementally, getting simple queries working to start with, then allow
> >>>>>>>> people to add more complex queries.
> >>>>>>>>
> >>>>>>>> In a dev branch [1], I’ve added a method Enumerables.match [2]. The idea
> >>>>>>>> is that you supply an Enumerable of input data, a finite state machine
> >>>>>>>> to figure out when a sequence of rows makes a match (represented by a
> >>>>>>>> transition function: (state, row) -> state), and a function to convert a
> >>>>>>>> matched set of rows to a set of output rows. The match method is fairly
> >>>>>>>> straightforward, and I almost have it finished.
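
The shape described above (input rows, a transition function (state, row) -> state, and an emitter over the matched rows) can be sketched roughly as follows. This is not the code in the dev branch: `MatchSketch`, the `accepting` predicate, and the toy pattern are assumptions made for illustration, and the sketch does no backtracking, so a row that breaks a partial match is dropped rather than retried as the start of a new match.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.function.BiFunction;
import java.util.function.Function;
import java.util.function.IntPredicate;

/** Hypothetical sketch of FSM-driven row matching; not Calcite's Enumerables.match. */
final class MatchSketch {
  private MatchSketch() {}

  /**
   * Scans the input once. Each row is buffered and fed to the transition
   * function. When the resulting state is accepting, the buffered rows are
   * passed to the emitter and matching restarts from the start state.
   * If the machine falls back to the start state, the partial match is dropped.
   */
  static <R, O> List<O> match(Iterable<R> input, int startState,
      BiFunction<Integer, R, Integer> transition,
      IntPredicate accepting,
      Function<List<R>, List<O>> emitter) {
    List<O> out = new ArrayList<>();
    List<R> buffer = new ArrayList<>();
    int state = startState;
    for (R row : input) {
      buffer.add(row);
      state = transition.apply(state, row);
      if (accepting.test(state)) {
        out.addAll(emitter.apply(new ArrayList<>(buffer)));
        buffer.clear();
        state = startState;
      } else if (state == startState) {
        buffer.clear();
      }
    }
    return out;
  }

  /**
   * Toy pattern "A B": A is a value below 10, B is a value above 20.
   * States: 0 = start, 1 = seen A, 2 = accept.
   */
  static List<String> demo() {
    List<Integer> rows = Arrays.asList(5, 25, 15, 3, 30);
    return MatchSketch.<Integer, String>match(rows, 0,
        (state, row) -> state == 0 ? (row < 10 ? 1 : 0) : (row > 20 ? 2 : 0),
        state -> state == 2,
        matched -> Collections.singletonList(
            matched.get(0) + "->" + matched.get(matched.size() - 1)));
  }

  public static void main(String[] args) {
    System.out.println(demo()); // prints [5->25, 3->30]
  }
}
```

Keeping the transition function and the emitter as separate arguments mirrors the split in the description: the scanning loop stays generic, while pattern-specific logic lives entirely in the two functions.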
> >>>>>>>>
> >>>>>>>> The complexity is in generating the finite state machine, emitter
> >>>>>>>> function, and so forth.
> >>>>>>>>
> >>>>>>>> Can someone help me with this task? If your idea of fun is implementing
> >>>>>>>> database algorithms, this is about as much fun as it gets. You learned
> >>>>>>>> about finite state machines in college - this is your chance to actually
> >>>>>>>> write one!
> >>>>>>>>
> >>>>>>>> This might be a good joint project with the Flink community. I know
> >>>>>>>> Flink are thinking of implementing CEP, and the algorithm we write here
> >>>>>>>> could be shared with Flink (for use via Flink SQL or via the Flink API).
> >>>>>>>>
> >>>>>>>> Julian
> >>>>>>>>
> >>>>>>>> [1] https://github.com/julianhyde/calcite/commits/1935-match-recognize
> >>>>>>>>
> >>>>>>>> [2] https://github.com/julianhyde/calcite/commit/4dfaf1bbee718aa6694a8ce67d829c32d04c7e87#diff-8a97a64204db631471c563df7551f408R73
> >>>>>>>
> >>>>>>
> >>>>>>
> >>>>>>
> >>>>>> --
> >>>>>> Best regards,
> >>>>>> Sergey
> >>>>
> >>>>
> >>
> >>
>
>