
Re: Convert Dag Run from Backfill to Scheduled?


While this may work, it's clearly not the prescribed way to do it.
Clearing should just work.

I'm trying to understand why the scheduler is not picking up the cleared
task. Clearing should remove the task instance state and set the state of
the related DAG Run to running so that the scheduler picks those up.
Perhaps there's a conflict between the backfill and scheduler-related DAG
Runs? Which DAG runs are set to running? The backfill or scheduler-related
ones?
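
One way to answer that question is to group the rows of the dag_run table
by run_id prefix and state. What follows is only a minimal sketch, not
Airflow code: it uses an in-memory SQLite table with made-up rows as a
stand-in for the metadata database, the helper name runs_by_type_and_state
is hypothetical, and it assumes the run type is whatever precedes the first
underscore in run_id (as in the run IDs discussed further down the thread).

```python
import sqlite3


def runs_by_type_and_state(conn, dag_id):
    """Count DAG runs per (run type, state) for one DAG.

    Assumption: the run type is the part of run_id before the first
    underscore, e.g. 'backfill' or 'scheduled'.
    """
    query = """
        SELECT substr(run_id, 1, instr(run_id, '_') - 1) AS run_type,
               state,
               COUNT(*) AS n
        FROM dag_run
        WHERE dag_id = ?
        GROUP BY run_type, state
        ORDER BY run_type, state
    """
    return conn.execute(query, (dag_id,)).fetchall()


# Hypothetical stand-in for the metadata database; the rows are invented.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE dag_run "
    "(dag_id TEXT, execution_date TEXT, run_id TEXT, state TEXT)"
)
conn.executemany(
    "INSERT INTO dag_run VALUES (?, ?, ?, ?)",
    [
        ("daily", "2017-09-01 00:00:00", "backfill_2017-09-01T00:00:00", "running"),
        ("daily", "2017-09-02 00:00:00", "backfill_2017-09-02T00:00:00", "success"),
        ("daily", "2018-01-11 00:00:00", "scheduled__2018-01-11T00:00:00", "running"),
    ],
)

summary = runs_by_type_and_state(conn, "daily")
for run_type, state, n in summary:
    print(run_type, state, n)
```

Against a real metadata database the same SELECT would need the dialect's
equivalent of instr/substr, and it only reads rows, so it should be safe to
run before deciding whether any UPDATE is needed.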

Originally, when I introduced DAG runs, backfill operated without any
consideration for DAG runs (DAG runs were a scheduler-specific
construct). Later on, Bolke added backfill-specific DAG runs, and I'm not
100% sure how that works.

Let's get to the bottom of this.

Max

On Fri, May 25, 2018 at 7:48 PM Ruiqin Yang <yrqls21@xxxxxxxxx> wrote:

> If you are sure the update query targets the desired rows, the behavior
> should be the same.
>
> Scott Halgrim <scott.halgrim@xxxxxxxxxx.invalid> wrote on Fri, May 25, 2018 at 4:23 PM:
>
> > So far no ill effects from:
> >
> > update dag_run
> > set run_id = concat('scheduled__', substring(run_id, 10, 19))
> > where dag_id = 'daily'
> > and execution_date > '2017-08-31' and execution_date < '2018-01-11'
> > and run_id like 'backfill_%'
> > order by execution_date;
> >
> > On May 25, 2018, 4:03 PM -0700, Scott Halgrim <scott.halgrim@xxxxxxxxxx>, wrote:
> > > Oh wow, that will work? Thanks! Is there any reason for me not to just
> > > run a mass UPDATE on those dag runs directly in the metadata database?
> > >
> > > On May 25, 2018, 4:01 PM -0700, Ruiqin Yang <yrqls21@xxxxxxxxx>, wrote:
> > > > Airflow is not going to schedule backfill DAG runs; it identifies them
> > > > by the DAG run ID (which will start with 'backfill_'). If you want the
> > > > scheduler to schedule those tasks, you can click the DAG run and edit
> > > > its name back to 'scheduled__<something>'
> > > >
> > > > Cheers,
> > > > Kevin Y
> > > >
> > > > On Fri, May 25, 2018 at 3:53 PM, Scott Halgrim <
> > > > scott.halgrim@xxxxxxxxxx.invalid> wrote:
> > > >
> > > > > I’ve got four months of dag runs that were scheduled dag runs, then I
> > > > > backfilled them. And now when I clear a task from one of those, the dag
> > > > > run goes to “running,” but none of the tasks get scheduled (unless I
> > > > > manually backfill each of them).
> > > > >
> > > > > What I really should have done here was just clear a mid-dag task as
> > > > > well as all downstream tasks for these dag runs, but, well, now I’m
> > > > > here and I’m wondering what the best way to fix this is.
> > > > >
> > > > > Thanks!
> > > > >
> > > > >
> >
>