Idea for preventing OOM in continuous queries of Flink SQL by implementing a "spillable" StateTable
I am currently dealing with an OOM issue caused by continuous queries in Flink SQL (for example, a GROUP BY clause without a window). After looking into Flink's Runtime & Core module, I found that as time goes by, StateTables in the HeapKeyedStateBackend grows larger and larger, because there are too many potential keys to be put into the memory (and GC is taking up most of the CPU time, usually for 20 seconds and more), eventually crashing the job or even the entire JobManager.
Therefore, I am now thinking of implementing a "spillable" StateTable that could spill the keys together with values into disk (or StateBackends like RocksDB) when heap memory is going to deplete. And also when memory pressure alleviates, the entries could be put back. Therefore, contrary to using RocksDB alone, Flink's throughput would not be affected if there is no need for a "spill" (when heap memory is enough).
I am not sure if the Flink community has some plans in dealing with the OOM issue caused by large number of keys & states in the heap (except for using RocksDB alone because the throughput would be rather slow when heap memory is enough), and whether my plan has any serious flaws, making it not worth doing, like making checkpointing process harder to be consistent, or greatly reducing the throughput because of the additional lookup cost, etc.?
Hope to get any feedbacks, and thank you in advance, especially Fabian who recommended me to this mailing list : )