Re: Bit confused about start_date and schedule_interval related to daily/weekly DAG


Hi Kyle,
The execution_date of a DAG run always lags by one schedule interval: one
day for your daily DAG and one week for your weekly DAG. Under the hood,
Airflow calculates the execution_date and the next execution_date of the
run, and only schedules it once the current timestamp is later than the
*next execution_date*.
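
To make the rule concrete, here is a minimal sketch in plain Python (it does not call Airflow). The helper names `next_execution_date` and `is_due`, the example timestamps, and the exact boundary comparison are assumptions made for illustration, not Airflow's actual scheduler code.

```python
from datetime import datetime, timedelta


def next_execution_date(execution_date, interval):
    """Return the end of the interval that starts at execution_date."""
    return execution_date + interval


def is_due(execution_date, interval, now):
    """Return True once the interval stamped execution_date has fully elapsed."""
    # Treating the exact boundary as "due" (>= rather than >) is an
    # assumption of this sketch.
    return now >= next_execution_date(execution_date, interval)


# Illustrative "current" timestamp; hypothetical, chosen for the example.
now = datetime(2018, 4, 18, 10, 56)
daily = timedelta(days=1)
weekly = timedelta(weeks=1)

# Daily DAG scheduled at 05:00: yesterday's run is due, today's is not.
yesterday_run = datetime(2018, 4, 17, 5, 0)
today_run = datetime(2018, 4, 18, 5, 0)
print(is_due(yesterday_run, daily, now))   # True
print(is_due(today_run, daily, now))       # False

# Weekly DAG: the run stamped a week ago is due, this week's is not.
last_week_run = datetime(2018, 4, 11, 5, 0)
this_week_run = datetime(2018, 4, 18, 5, 0)
print(is_due(last_week_run, weekly, now))  # True
print(is_due(this_week_run, weekly, now))  # False
```

Under this model, a daily DAG whose start_date is today has no completed interval yet, so its first run would not be created until the first interval ends.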

If you need a date other than `ds` or `ds_nodash`, you can explore the other
default variables here:
<https://airflow.apache.org/code.html#default-variables>.
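
If none of the default variables fits, one option is to shift the execution date forward by one schedule interval yourself. The sketch below does this in plain Python on a `ds`-style string; `shift_ds` and `to_nodash` are hypothetical helper names, and the dates are made up. I believe Airflow's macros include a comparable `ds_add(ds, days)` helper, but please check the page linked above for the version you run.

```python
from datetime import datetime, timedelta


def shift_ds(ds, days):
    """Shift a 'YYYY-MM-DD' date string by a number of days."""
    parsed = datetime.strptime(ds, "%Y-%m-%d")
    return (parsed + timedelta(days=days)).strftime("%Y-%m-%d")


def to_nodash(ds):
    """Convert 'YYYY-MM-DD' to 'YYYYMMDD', the ds_nodash layout."""
    return ds.replace("-", "")


# Daily DAG: the run stamped 2018-04-17 actually fires on 2018-04-18.
daily_ds = "2018-04-17"  # hypothetical execution date
run_day = shift_ds(daily_ds, 1)
print(run_day, to_nodash(run_day))

# Weekly DAG: add 7 days to reach the week in which the run fires.
weekly_ds = "2018-04-11"  # hypothetical execution date
print(shift_ds(weekly_ds, 7))
```

Shifting by exactly one interval only gives the "current" day or week for scheduled runs that start on time; backfilled runs would still be stamped relative to their own execution date.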

Cheers,
Kevin Y

On Wed, Apr 18, 2018 at 10:56 AM, Kyle Hamlin <hamlin.kn@xxxxxxxxx> wrote:

> I'm a bit confused with how the scheduler catches up in relation to
> start_date and schedule_interval. I have one dag that runs hourly:
>
> dag = DAG(
>     dag_id='hourly_dag',
>     start_date=days_ago(1),
>     schedule_interval='@hourly',
>     default_args=ARGS)
>
> When I start this DAG fresh it will catch up 24 hours + however many hours
> have passed in the current day all the way up to the most recent hour. This
> makes sense to me.
>
> Now if I have a daily DAG:
>
> dag = DAG(
>     dag_id='daily_dag',
>     start_date=days_ago(1),
>     schedule_interval='0 5 * * *',
>     default_args=ARGS)
>
> Starting this DAG fresh will run yesterday's execution. This is fine since
> I use the execution_date (ds_nodash) to have the task be lagged by one day.
> What I can't seem to wrap my head around is how I would get this DAG to run
> for the current day. I've tried passing in days_ago(0), but the tasks never
> seem to start.
>
> In addition to all that, I have a weekly DAG that must also use the
> execution_date, but it needs the current week's execution_date.
>
> *How do I get a DAG that is not hourly to have an execution_date of the
> current day or week?*
>