[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: Perform aggregations across multiple windows

You can re-window after the first aggregation (say, into the global
window) and state will be stored with respect to this window.
On Thu, Jun 21, 2018 at 12:02 PM Harshvardhan Agrawal
<harshvardhan.agr93@xxxxxxxxx> wrote:
> Hello,
> We are currently working on implementing a data pipeline using Beam on top of Flink. We have an unbounded data source that sends us some financial positions data. For each account, we perform certain aggregations (let’s assume it’s summation for simplicity) across all products owned by the account. I’m processing the data in windows of 30 seconds.
> At any time when I am processing a window I want to be able to access the aggregation from the previous window and add it to the current aggregation. The stateful API in beam stores data by (key,window) and hence I can’t really store a global state that I could access with account being the key.
> The only way I could think was to write the data  of the previous window to some DB and then read it in my following window which I don’t think is efficient because I have to wait until the previous window is written to the DB. Do you have suggestions on how to approach this problem?
> Thank you.
> --
> Regards,
> Harshvardhan