The pictures may have been too big and the network here is not very good, so the mail I wrote last night was not sent successfully.
Yes, I used event time.
I found that watermarks fired normally when the job started, but they stopped and no longer changed after the job had been running for some hours.
I also changed to the fs state backend, configured in flink-conf.yaml as below:
I found that many checkpoints are saved in the file.
But the watermarks still stop.
And there are two environments.
The local one runs well, and the problem occurs on site.
There are two differences:
The data amount is very small locally and very large on site.
On-site configurations:
But these are not configured locally.
So could these have any influence?
Please help me…
Are you using event time as the time semantics? If so, the problem may be related to watermarks.
Since I don't know the details of your program, it's hard to draw a conclusion. You can check whether your watermarks are firing normally.
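To make the watermark check more concrete, here is a minimal sketch in plain Java (it does not use the Flink API). It assumes a bounded-out-of-orderness rule for each input channel, which may not match the assigner used in your job; the class and method names are made up for illustration. It relies on two things Flink does: an operator with several inputs follows the minimum of its input watermarks, and an event-time window [start, end) fires once the watermark reaches end - 1.

```java
import java.util.Arrays;

// Hypothetical helper for reasoning about watermarks; not part of Flink.
class WatermarkSketch {

    // Watermark of one input channel under an assumed bounded-out-of-orderness
    // rule: the highest timestamp seen so far minus the allowed delay.
    static long channelWatermark(long[] timestampsSeen, long maxDelayMs) {
        if (timestampsSeen.length == 0) {
            // No element has arrived yet, so this channel has made no progress.
            return Long.MIN_VALUE;
        }
        long maxTimestamp = Arrays.stream(timestampsSeen).max().getAsLong();
        return maxTimestamp - maxDelayMs;
    }

    // An operator with several inputs follows the minimum of its input watermarks.
    static long operatorWatermark(long... channelWatermarks) {
        return Arrays.stream(channelWatermarks).min().orElse(Long.MIN_VALUE);
    }

    // An event-time window [start, end) can fire once the watermark reaches end - 1.
    static boolean windowCanFire(long windowEndMs, long watermark) {
        return watermark >= windowEndMs - 1;
    }

    public static void main(String[] args) {
        long busy = channelWatermark(new long[] {1_000L, 7_000L, 5_000L}, 2_000L);
        long idle = channelWatermark(new long[] {}, 2_000L);
        System.out.println("busy channel watermark: " + busy);
        System.out.println("operator watermark:     " + operatorWatermark(busy, idle));
    }
}
```

If one input channel stops receiving data, the operator watermark in this sketch stays at that channel's last value, so no later window can fire even though the job keeps running without errors. Whether that is what happens in your job can only be confirmed by looking at the actual watermarks.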
I changed the configuration as below, and it looked fine when the job started.
But no results are issued when the window ends after the job has run for about six hours, and there are no errors or exceptions.
How can I locate the problem?
Wednesday, October 10, 2018 2:44:48 PM
Re: No data issued by flink window after a few hours
Because the default state size is too small even for one hour, and the max window size is 24 hours, I used 500M.
MemoryStateBackend stateBackend = new MemoryStateBackend(MAX_STATE_SIZE);//500M
And I found that, irrespective of the configured maximal state size, the state cannot be larger than the akka frame size.
So I added a config in flink-conf.yaml:
What else do I have to pay attention to?
I saw the exception image you provided. Based on the exception message, it seems you used the default max state size (5MB).
You can specify the max state size to override the default value. Try:
MemoryStateBackend stateBackend = new MemoryStateBackend(theSizeOfBytes);
Please note that you need to reserve enough memory for Flink.
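As a rough illustration of the sizes involved, the sketch below converts megabytes to the byte count passed to the constructor and compares it with a frame size limit. It is plain Java and does not touch Flink; it assumes the constructor takes the size as an int number of bytes, and the class name, method names and the frame size value used in main are hypothetical placeholders, not values read from any configuration.

```java
// Hypothetical helper for the size arithmetic; not part of Flink.
class StateSizeSketch {

    // Convert megabytes to bytes. Throws ArithmeticException if the result
    // does not fit in an int, instead of silently overflowing.
    static int megabytesToBytes(int megabytes) {
        return Math.multiplyExact(megabytes, 1024 * 1024);
    }

    // If state is also capped by a frame size, the smaller of the two limits applies.
    static long effectiveLimitBytes(long maxStateSizeBytes, long frameSizeBytes) {
        return Math.min(maxStateSizeBytes, frameSizeBytes);
    }

    public static void main(String[] args) {
        int defaultSize = megabytesToBytes(5);    // the 5MB default mentioned above
        int requested = megabytesToBytes(500);    // the 500M used in this thread
        long assumedFrameSize = 10L * 1024 * 1024; // placeholder value, an assumption
        System.out.println("default bytes:   " + defaultSize);
        System.out.println("requested bytes: " + requested);
        System.out.println("effective limit: " + effectiveLimitBytes(requested, assumedFrameSize));
    }
}
```

The point of the second method is only that raising the max state size has no effect beyond whatever other limit sits below it, so both values need to be raised together.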
Please have a look at my last mail.
What should I do when the cached window data is too large?
Did you mean that "computer memory" refers to the MemoryStateBackend?
The Flink window mechanism is internally based on State, and this is done for fault tolerance.
If you introduce external storage, it will break its design and bring other problems.
"ram to cache the distinct data about sliding window" means I used computer memory, not a third-party db, to cache the data needed in the window.
“The data needed in the window” means, for example: if the sliding window is 1 hour and I need to count the distinct users, I need to cache the user ids for one hour.
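The kind of caching described here can be sketched in plain Java as follows. This is not the code from the job and does not use Flink; the class name and methods are invented for illustration, and it assumes events arrive in timestamp order. It keeps every event of the last window in memory, which is why the cached data grows with the data volume and the window length.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.Map;

// Hypothetical in-memory distinct-user counter over a sliding time window.
class SlidingDistinctUsers {

    private static final class Event {
        final long timestampMs;
        final String userId;

        Event(long timestampMs, String userId) {
            this.timestampMs = timestampMs;
            this.userId = userId;
        }
    }

    private final long windowMs;
    private final Deque<Event> events = new ArrayDeque<>();
    // How many cached events each user currently has inside the window.
    private final Map<String, Integer> eventsPerUser = new HashMap<>();

    SlidingDistinctUsers(long windowMs) {
        this.windowMs = windowMs;
    }

    // Add an event; timestamps are assumed to be non-decreasing.
    void add(long timestampMs, String userId) {
        evictOlderThan(timestampMs - windowMs);
        events.addLast(new Event(timestampMs, userId));
        eventsPerUser.merge(userId, 1, Integer::sum);
    }

    // Drop events at or before the cutoff and forget users with no events left.
    private void evictOlderThan(long cutoffMs) {
        while (!events.isEmpty() && events.peekFirst().timestampMs <= cutoffMs) {
            Event old = events.pollFirst();
            eventsPerUser.computeIfPresent(old.userId, (k, v) -> v == 1 ? null : v - 1);
        }
    }

    int distinctUsers() {
        return eventsPerUser.size();
    }

    int cachedEvents() {
        return events.size();
    }
}
```

With a 24 hour window and a large on-site data volume, the number of cached events in a structure like this can become very large, which is relevant to the state size questions in this thread.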
Because there are no related errors.
vino yang <yanghua1127@xxxxxxxxx>
Wednesday, October 10, 2018 10:49:43 AM
Re: No data issued by flink window after a few hours
Can you explain what "ram to cache the distinct data about sliding window" means?
The information you provided is too limited for others to help you analyze the problem and give advice.
In addition, for issues related to Flink usage, please only send mail to the user mailing list.
The dev mailing list is mainly used to discuss development related issues.
I used a Flink window, and when the job begins we can get the window results. But no results are issued after a few hours.
I found that the job is still running with no errors, and the data that does not go through a window is all issued.
By the way, I used Flink 1.3.2 and ram to cache the distinct data about sliding window.