increasing the time to detect a dead task manager usually increases the number of elements that need to be reprocessed after a failure. Once a dead task manager is identified, the entire application is rolled back to its latest successfully completed checkpoint, i.e. the last consistent state. It is therefore desirable to keep the detection time low so that the time needed to catch up stays low as well. Fault tolerance guarantees should not be affected.
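To make the trade-off concrete, here is a rough back-of-the-envelope sketch in Python. It is only a simplified model and not anything Flink computes itself: it assumes a constant input rate and assumes that everything which arrived since the last completed checkpoint, plus everything arriving while the failure goes undetected and the job restarts, has to be processed again. The function name and all numbers are hypothetical.

```python
def catch_up_backlog(input_rate_per_s, since_checkpoint_s, detection_s, restart_s=0.0):
    """Rough count of records to work through after recovery.

    Simplified model (an assumption, not Flink's accounting): records that
    arrived since the last completed checkpoint are replayed, and records
    keep arriving while the failure is undetected and the job restarts.
    """
    return input_rate_per_s * (since_checkpoint_s + detection_s + restart_s)


# Hypothetical numbers: 10,000 records/s, failure 30 s after the last checkpoint.
fast_detection = catch_up_backlog(10_000, since_checkpoint_s=30, detection_s=10)
slow_detection = catch_up_backlog(10_000, since_checkpoint_s=30, detection_s=60)

print(fast_detection, slow_detection)
```

With these made-up numbers, the slower detection more than doubles the backlog, while the state restored from the checkpoint is the same in both cases.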
I hope this helps.
On 15.05.18 at 01:42, Bajaj, Abhinav wrote: