[jira] [Created] (FLINK-11172) Remove the max retention time in StreamQueryConfig
Yangze Guo created FLINK-11172:
Summary: Remove the max retention time in StreamQueryConfig
Issue Type: Improvement
Components: Table API & SQL
Affects Versions: 1.8.0
Reporter: Yangze Guo
Assignee: Yangze Guo
[Stream Query Config|https://ci.apache.org/projects/flink/flink-docs-master/dev/table/streaming/query_configuration.html] is an important and useful feature for trading off accuracy against resource consumption when a query is executed on unbounded streaming data. This feature was first proposed in [FLINK-6491|https://issues.apache.org/jira/browse/FLINK-6491].
Initially, *QueryConfig* took two parameters, minIdleStateRetentionTime and maxIdleStateRetentionTime, so that fewer timers need to be registered when there is some freedom in when to discard state. However, this approach may cause newer data to expire earlier than older data, and thus lead to greater accuracy loss in some cases. For example, consider an unbounded keyed stream. We process key *_a_* at _*t0*_ and key _*b*_ at _*t1*_, where *_t0 < t1_*. The state of *_a_* will expire at _*t0+maxIdleStateRetentionTime*_, while the state of _*b*_ will expire at *_t1+maxIdleStateRetentionTime_*. Now another record with key *_a_* arrives at _*t2*_ (*_t1 < t2_*), but _*t2+minIdleStateRetentionTime*_ < _*t0+maxIdleStateRetentionTime*_. The state of key *_a_* will therefore still expire at _*t0+maxIdleStateRetentionTime*_, which is earlier than the state of key _*b*_. According to the [LRU|https://en.wikipedia.org/wiki/Cache_replacement_policies#Least_recently_used_(LRU)] guideline, elements that have been used most heavily in the past few instructions are most likely to be used heavily in the next few instructions too, so the state of key _*a*_ should live longer than the state of key _*b*_. The current approach goes against this idea.
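The scenario above can be traced with a small simulation. This is only a simplified model of the timer logic as described in this issue, not Flink's actual code; the function name `process_min_max`, the dictionary of cleanup times, and the concrete time values are illustrative assumptions.

```python
def process_min_max(cleanup_times, key, now, min_retention, max_retention):
    """Simplified model of the min/max scheme (an assumption, not Flink code).

    A new cleanup time is registered only when the key has none yet, or when
    the existing one would fire before now + min_retention.
    """
    current = cleanup_times.get(key)
    if current is None or now + min_retention > current:
        cleanup_times[key] = now + max_retention
    return cleanup_times[key]


# Hypothetical values, in arbitrary time units.
MIN_RETENTION = 10
MAX_RETENTION = 100
t0, t1, t2 = 0, 20, 50

cleanup_times = {}
process_min_max(cleanup_times, "a", t0, MIN_RETENTION, MAX_RETENTION)  # a expires at t0 + max
process_min_max(cleanup_times, "b", t1, MIN_RETENTION, MAX_RETENTION)  # b expires at t1 + max
# t2 + min < t0 + max, so the cleanup time of a is not updated.
process_min_max(cleanup_times, "a", t2, MIN_RETENTION, MAX_RETENTION)

# Key a was accessed most recently, yet its state expires before that of b.
print(cleanup_times["a"] < cleanup_times["b"])
```

In this model, key *_a_* keeps its original cleanup time even though it was touched after key _*b*_, which is the ordering problem described above.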
I think now is a good opportunity to remove the maxIdleStateRetentionTime argument from *StreamQueryConfig*. My reasons are below.
* [FLINK-9423|https://issues.apache.org/jira/browse/FLINK-9423] implemented efficient deletes for the heap-based timer service. We can leverage the delete operation to mitigate excessive timer registration.
* The current approach can cause newer data to expire earlier than older data, and thus lead to greater accuracy loss in some cases. Users need to fine-tune these two parameters to avoid this scenario. Directly following the idea of LRU looks like a better solution.
So, I plan to remove maxIdleStateRetentionTime and make the expiration time depend only on _*minIdleStateRetentionTime*_.
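A minimal sketch of how the proposed behavior might look, assuming every access deletes the key's previous timer and registers a new one at the current time plus the single retention value. The function name `process_min_only`, the set used to stand in for the timer service, and the time values are hypothetical; the actual change may differ.

```python
def process_min_only(cleanup_times, timers, key, now, retention):
    """Sketch of the proposed scheme (an assumption, not Flink code).

    Every access replaces the key's timer, so the cleanup time always
    follows the most recent access.
    """
    old = cleanup_times.get(key)
    if old is not None:
        timers.discard((old, key))  # delete the obsolete timer
    new = now + retention
    timers.add((new, key))
    cleanup_times[key] = new
    return new


# Hypothetical values, in arbitrary time units.
RETENTION = 100
t0, t1, t2 = 0, 20, 50

cleanup_times, timers = {}, set()
process_min_only(cleanup_times, timers, "a", t0, RETENTION)
process_min_only(cleanup_times, timers, "b", t1, RETENTION)
process_min_only(cleanup_times, timers, "a", t2, RETENTION)

# Key a was accessed most recently, so its state now outlives that of b,
# and only one timer per key remains registered.
print(cleanup_times["a"] > cleanup_times["b"], len(timers))
```

The sketch relies on timer deletion being cheap, which is the capability this issue attributes to FLINK-9423 for the heap-based timer service.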
cc to [~sunjincheng121], [~fhueske]