Re: Benchmarking of Airflow Scheduler with Celery Executor
On 2018/04/13 17:00:36, Maxime Beauchemin <maximebeauchemin@xxxxxxxxx> wrote:
> If you're concerned about scheduler scalability I'd go with a bigger box.
> The scheduler uses multiprocessing so more CPU power means more throughput.
> Also you may want to provision a beefy MySQL box to make sure that doesn't
> become the bottleneck. 10k tasks heartbeating to the DB every 30 seconds is
> significant load.
> Perhaps Airbnb folks chime in about their scale and hardware setup?
> On Fri, Apr 13, 2018 at 9:14 AM, ramandumcs@xxxxxxxxx <ramandumcs@xxxxxxxxx>
> > Thanks Ry,
> > Just wondering if there is any approximate number on concurrent tasks a
> > scheduler can run on say 16 GB RAM and 8 core machine.
> > If its already been done that would be useful.
> > We did some benchmarking with local executor and observed that each
> > TaskInstance was taking ~100MB of memory so we could only run ~130
> > concurrent tasks on 16 GB RAM and 8 core machine.
> > -Raman Gupta
> > On 2018/04/12 16:32:37, Ry Walker <ry@xxxxxxxxxxxxx> wrote:
> > > Hi Raman -
> > >
> > > First, we’d be happy to help you test this out with Airflow. Or you could
> > > do it yourself by using http://open.astronomer.io/airflow/ (w/ Docker
> > > Engine + Docker Compose) to quickly spin up a test environment.
> > Everything
> > > is hooked to Prometheus/Grafana to monitor how the system reacts to your
> > > workload.
> > >
> > > -Ry
> > > CEO, Astronomer
> > >
> > > On April 12, 2018 at 12:23:46 PM, ramandumcs@xxxxxxxxx (
> > ramandumcs@xxxxxxxxx)
> > > wrote:
> > >
> > > Hi All,
> > > We have requirement to run 10k(s) of concurrent tasks. We are exploring
> > > Airflow's Celery Executor for same. Horizontally Scaling of worker nodes
> > > seem possible but it can only have one active scheduler.
> > > So will Airflow scheduler be able to handle these many concurrent tasks.
> > > Is there any benchmarking number around airflow scheduler's scalability.
> > > Thanks,
> > > Raman
> > >
> With an AWS EC2 i2.8xlarge box, we run ~14k tasks at peek. Though the scheduling delay also spikes to ~30 mins when we are at peek load. Here's some scheduler config we have:
JOB_HEARTBEAT_SEC = 60
MAX_THREADS = 64
MAX_TIS_PER_QUERY = 512
Also we have the biggest Amazon RDS mysql instance.