Identifying delay between schedule & run instances

Hi Everyone,

We want to compute a metric that captures the delay, if any, between a DAG becoming active in the scheduler and server and the DAG's tasks actually being kicked off (suppose, for example, that start_date is one hour in the past and the schedule is every 10 minutes).

Currently the task_instance table has execution_date, start_date, end_date and queued_dttm. We can easily get this metric from the difference between start_date and execution_date. In the case of a backfill, however, execution_date belongs to a past schedule occurrence, so the difference between start_date and execution_date is skewed. The number is fine for measuring scheduling delay on future runs, but it is not trustworthy for backfills. Any suggestions on how to compute this metric reliably, perhaps by somehow identifying which runs are backfills? Also, the DAG table has no notion of create_date or update_date that could tell me when a DAG was originally brought into existence.
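
To make the calculation concrete, here is a minimal sketch of what we have in mind, in plain Python with no Airflow imports. It assumes the four timestamps have already been read from a task_instance row, and that a scheduled run for a given execution_date is only expected to start once its interval has closed, that is at execution_date + schedule interval. The function name `task_delays` and the returned keys are hypothetical, chosen only for illustration. The `queue_wait` value does not involve execution_date at all, so it may be the one part that stays meaningful for backfills, though it only measures the time spent waiting after the task was queued.

```python
from datetime import datetime, timedelta
from typing import Dict, Optional


def task_delays(
    execution_date: datetime,
    queued_dttm: Optional[datetime],
    start_date: datetime,
    schedule_interval: timedelta,
) -> Dict[str, Optional[timedelta]]:
    """Split the lag of one task instance into components (hypothetical helper)."""
    # Assumption: a scheduled run becomes eligible when its interval closes,
    # i.e. at execution_date + schedule_interval, not at execution_date itself.
    expected_start = execution_date + schedule_interval

    return {
        # Time from eligibility until the task was queued (skewed for backfills).
        "scheduling_delay": (queued_dttm - expected_start) if queued_dttm else None,
        # Time spent queued before starting; independent of execution_date.
        "queue_wait": (start_date - queued_dttm) if queued_dttm else None,
        # Time from eligibility until the task started (skewed for backfills).
        "total_delay": start_date - expected_start,
    }
```

For a 10-minute schedule, a row with execution_date 10:00:00, queued_dttm 10:10:05 and start_date 10:10:20 would give a scheduling delay of 5 seconds, a queue wait of 15 seconds and a total delay of 20 seconds.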

Vardan Gupta