
Re: Using large numbers of sensors, resource consumption

Thanks, Ash, Alexander, and Stefan for your replies.

I am relatively new to Airflow and not familiar with the code base. I like
the idea of having a more efficient sensor.

The async approach makes sense, but I don't know how well it would fit
within the existing architecture.

I like that Stefan's "reschedule" approach can fit the current architecture
and could be implemented sooner. From a user's point of view, my only
feedback is that the UI should not show sensors that are still running as
failed or up for retry, as that would draw attention to things that are
running as expected. I'll add this comment to the JIRA issue.
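
To make the reschedule idea concrete, here is a minimal sketch in plain
Python of how I understand it. It does not use Airflow at all:
RescheduleRequest, SensorTimeout and poke_once are hypothetical names made up
for illustration (RescheduleRequest stands in for the proposed
AirflowRescheduleException), and the real implementation may look quite
different.

```python
from datetime import datetime, timedelta


class RescheduleRequest(Exception):
    """Hypothetical stand-in for the proposed AirflowRescheduleException."""

    def __init__(self, reschedule_date):
        super().__init__(f"reschedule at {reschedule_date.isoformat()}")
        self.reschedule_date = reschedule_date


class SensorTimeout(Exception):
    """Raised when the sensor has waited longer than its timeout."""


def poke_once(condition_met, started_at, now, poke_interval, timeout):
    """Run one check, then succeed, time out, or ask to be rescheduled.

    Instead of sleeping inside a running process, the sensor exits and
    reports when it wants to be run again, which frees its worker slot.
    """
    if condition_met:
        return True
    if now - started_at >= timeout:
        raise SensorTimeout(f"no success after {timeout}")
    # Assumption: the next run is simply "now + poke_interval".
    raise RescheduleRequest(now + poke_interval)


started = datetime(2018, 7, 10, 0, 0)
try:
    poke_once(
        condition_met=False,
        started_at=started,
        now=started + timedelta(minutes=5),
        poke_interval=timedelta(minutes=15),
        timeout=timedelta(days=7),
    )
except RescheduleRequest as request:
    print(request.reschedule_date)  # 2018-07-10 00:20:00
```

The point of the sketch is that a waiting sensor holds no process between
checks; only the requested date has to be stored somewhere until then.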



On Tue, Jul 10, 2018 at 9:44 AM Stefan Seelmann <mail@xxxxxxxxxxxxxxxxxx>
wrote:

> I also have that requirement and I'm working on a proposal for
> rescheduling tasks. My current PoC can be found at [1]; it uses the
> up_for_retry state, which has some problems. I have started to make some
> changes and hope to make a first proposal this week.
>
> The basic idea is:
> * A new "reschedule" flag for sensors. If set to True, the sensor will
>   raise an AirflowRescheduleException (with the new schedule date) that
>   causes a reschedule.
> * Reschedule requests are recorded in a new `task_reschedule` table and
>   visualized in the Gantt view.
> * A new TI dependency that checks if a task is ready to be re-scheduled.
>
> Advantages:
> * This change is backward compatible. Existing sensors behave like
>   before, but it's possible to set the "reschedule" flag.
> * The timeout and poke_interval are still respected and used to
>   calculate the next schedule time.
> * Custom sensor implementations can even define the next sensible
>   schedule date.
> * This mechanism can also be used by non-sensor operators.
>
> Kind Regards,
> Stefan
>
> [1] https://github.com/seelmann/incubator-airflow/tree/reschedule-sensor-3
>
> On 07/10/2018 04:05 PM, Pedro Machado wrote:
> > I have a few DAGs that use time sensors to wait until data is ready,
> > which can be several days.
> >
> > I have one daily DAG where, for each execution date, I have to repull the
> > data for the next 7 days to capture changes (late arriving revenue data).
> > This DAG currently starts 7 TimeDeltaSensors for each execution date, with
> > delays that range from 0 to 6 days.
> >
> > I was wondering what the recommendation is for cases like this where a
> > large number of sensors is needed.
> >
> > Are there ways to reduce the footprint of these sensors so that they use
> > less CPU and memory?
> >
> > I noticed that in one of the DAGs that Germain Tanguy had in the
> > presentation he shared today, a sensor was set to time out every 30
> > seconds but had a large retry count, so instead of running constantly, it
> > runs every 15 minutes for 30 seconds and then dies.
> >
> > Are other people using this pattern? Do you have other suggestions?
> >
> > Thanks,
> >
> > Pedro
> >
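
For the short-timeout pattern in the quoted question, the arithmetic can be
sketched as below. This is plain Python, not Airflow code: retries_needed and
busy_fraction are hypothetical helpers, and the sketch assumes that "every 15
minutes" means 15 minutes from the start of one run to the start of the next.

```python
from datetime import timedelta


def retries_needed(max_wait, cycle):
    """Retries required so that repeated short runs span max_wait."""
    # Ceiling division on timedeltas: -(-a // b).
    return -(-max_wait // cycle)


def busy_fraction(run_time, cycle):
    """Share of wall-clock time in which the sensor occupies a slot."""
    return run_time / cycle


cycle = timedelta(minutes=15)   # assumed start-to-start interval
run_time = timedelta(seconds=30)  # sensor timeout per attempt

# Longest delay mentioned in the question: 6 days.
print(retries_needed(timedelta(days=6), cycle))  # 576
print(round(busy_fraction(run_time, cycle), 3))  # 0.033
```

Under these assumptions, a sensor that waits 6 days needs a retry count of
roughly 576 and holds a slot about 3% of the time, instead of all the time.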