[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: Flink keyed stream windows

Hi Garvit,

One other point - once you start making HTTP requests, you likely want to use an AsyncFunction to avoid the inefficiencies of your process spending most of its time waiting for the remote server to handle the request.

Which means emitting the results (user + active location) from the window function, and then processing them in a downstream AsyncFunction.

The other choice is to multi-thread your custom process window function, but then reliably recovering from errors becomes challenging.

— Ken

On Aug 13, 2018, at 4:26 AM, vino yang <yanghua1127@xxxxxxxxx> wrote:

Hi Garvit,

Please refer to the Flink official documentation for the window description. [1]
In this scenario, you should use Tumbling Windows. [2]
If you want to call your own API to handle the window, you can extend the process window function to achieve your needs.[3]

Thanks, vino.

Garvit Sharma <garvits45@xxxxxxxxx> 于2018年8月13日周一 下午5:53写道:
Clarification: Its 30 Seconds not 30 minutes.

On Mon, Aug 13, 2018 at 3:20 PM Garvit Sharma <garvits45@xxxxxxxxx> wrote:

I am working on a use case where I have a stream of users active locations and I want to store(by hitting an HTTP API) the latest active location for each of the unique users every 30 minutes.

Since I have a lot of unique users(rpm 1.5 million), how to use Flink's timed windows on keyed stream to solve this problem.

Please help!


Ken Krugler
+1 530-210-6378
Custom big data solutions & training
Flink, Solr, Hadoop, Cascading & Cassandra