Re: MATCH_RECOGNIZE


I agree. It would be good to look over the Flink implementation and see whether we can borrow/steal. To be clear, Calcite already has support for MATCH_RECOGNIZE in the SQL parser, validator, and relational algebra. It just does not have a runtime implementation. And we’re not trying to be better than Flink, just to provide a default implementation for engines that don’t have the resources to build their own state-transition engine.

I don’t have any cycles for it right now. Let’s track progress in https://issues.apache.org/jira/browse/CALCITE-1935.

It would be useful if you logged your proposal for time-series/signal processing in a separate JIRA case, so we can track it too. Let’s continue the discussion on the other thread.

Julian


> On Oct 23, 2018, at 5:18 AM, Julian Feinauer <j.feinauer@xxxxxxxxxxxxxxxxx> wrote:
> 
> Hi Julian,
> 
> I decided to reply to this (old) email because some of the facts are noted here.
> Funnily enough, Apache Flink released their MATCH_RECOGNIZE implementation yesterday.
> 
> So I recall that you and Zhigiang He did something on this.
> I would like to have such a feature in Calcite (as stated in the other mail) and could try to go into this a bit with a colleague of mine and give a bit of support on this topic (In fact, it sounds like fun to us…).
> Perhaps there's also a chance to learn something from Flink's implementation, since you have already had some contact with them, I think?
> 
> Best
> Julian
> 
> On 2018/07/23 17:53:57, Julian Hyde <j...@xxxxxxxxxx> wrote:
> 
> For quite a while we have had partial support for MATCH_RECOGNIZE. We support it in the parser and validator, but there is no runtime implementation. It’s a shame, because MATCH_RECOGNIZE is an incredibly powerful SQL feature for both traditional SQL (it’s in Oracle 12c) and for continuous query (aka complex event processing - CEP).
> 
> I figure it’s time to change that. My plan is to implement it incrementally, getting simple queries working to start with, then allow people to add more complex queries.
> 
> In a dev branch [1], I’ve added a method Enumerables.match [2]. The idea is that you supply an Enumerable of input data, a finite state machine to figure out when a sequence of rows makes a match (represented by a transition function: (state, row) -> state), and a function to convert a matched set of rows to a set of output rows. The match method is fairly straightforward, and I almost have it finished.
> 
> The complexity is in generating the finite state machine, emitter function, and so forth.
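
The three inputs described above (input rows, a transition function (state, row) -> state, and an emitter that turns matched rows into output rows) could be wired together roughly as in the sketch below. This is only an illustration, not the Enumerables.match method from the dev branch: the class MatchSketch, the method signature, the "null means no transition" convention, and the toy pattern (one or more negative values followed by a positive one) are all assumptions made for the example. It is deterministic, does not backtrack, reports non-overlapping matches only, and ignores partitioning and ordering.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.BiFunction;
import java.util.function.Function;
import java.util.function.Predicate;

// Hypothetical sketch; names and signature are NOT Calcite's API.
final class MatchSketch {

    // Runs a deterministic state machine over the input rows.
    // transition returns null when no move is possible (assumed convention).
    static <S, R, O> List<O> match(
            Iterable<R> input,
            S start,
            BiFunction<S, R, S> transition,
            Predicate<S> accepting,
            Function<List<R>, List<O>> emitter) {
        List<O> output = new ArrayList<>();
        List<R> buffer = new ArrayList<>();   // rows of the match in progress
        S state = start;
        for (R row : input) {
            S next = transition.apply(state, row);
            if (next == null) {
                // Abandon the partial match and retry this row from the start state.
                buffer.clear();
                state = start;
                next = transition.apply(start, row);
                if (next == null) {
                    continue;                 // row cannot begin a match either
                }
            }
            buffer.add(row);
            state = next;
            if (accepting.test(state)) {
                // Complete match: convert the matched rows to output rows.
                output.addAll(emitter.apply(new ArrayList<>(buffer)));
                buffer.clear();
                state = start;
            }
        }
        return output;
    }

    // Toy pattern: one or more negative values followed by a positive value.
    // States: 0 = start, 1 = seen negatives, 2 = accepted.
    static List<String> demo() {
        BiFunction<Integer, Integer, Integer> transition = (state, row) -> {
            if (row < 0 && (state == 0 || state == 1)) {
                return 1;
            }
            if (row > 0 && state == 1) {
                return 2;
            }
            return null;
        };
        List<Integer> rows = List.of(3, -1, -2, 5, 0, -4, 7);
        return match(rows, 0, transition, s -> s == 2,
                matched -> List.of(matched.toString()));
    }

    public static void main(String[] args) {
        System.out.println(demo());           // prints [[-1, -2, 5], [-4, 7]]
    }
}
```

The hard part noted above, generating the state machine and emitter from a PATTERN/DEFINE clause, is exactly what this sketch leaves out: here both are written by hand.
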
> 
> Can someone help me with this task? If your idea of fun is implementing database algorithms, this is about as much fun as it gets. You learned about finite state machines in college - this is your chance to actually write one!
> 
> This might be a good joint project with the Flink community. I know Flink are thinking of implementing CEP, and the algorithm we write here could be shared with Flink (for use via Flink SQL or via the Flink API).
> 
> Julian
> 
> [1] https://github.com/julianhyde/calcite/commits/1935-match-recognize
> 
> [2] https://github.com/julianhyde/calcite/commit/4dfaf1bbee718aa6694a8ce67d829c32d04c7e87#diff-8a97a64204db631471c563df7551f408R73
>