Airflow - High Availability and Scale Up vs Scale Out

Hi guys,


I have two specific questions for those of you running Airflow in production:


  1. How have you achieved high availability? What does the architecture look like? Do you replicate the master node as well?
  2. Scale up vs. scale out?
    1. Which approach do you prefer: one beefy Airflow VM running the worker, scheduler and webserver with the LocalExecutor, or a cluster with multiple workers using the CeleryExecutor?
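
To make the two options concrete, here is a rough sketch of what I mean, written as the environment-variable overrides Airflow reads for its config. The function name, the `scale_up`/`scale_out` labels, and every hostname, credential and database name are placeholders I made up for illustration, so treat it as a sketch rather than a recommended setup.

```python
from typing import Dict


def executor_env(mode: str) -> Dict[str, str]:
    """Return AIRFLOW__* overrides for the two setups being compared.

    'mode' and its two values are hypothetical labels for this post.
    """
    if mode == "scale_up":
        # Single VM: scheduler, webserver and task processes share one box.
        return {"AIRFLOW__CORE__EXECUTOR": "LocalExecutor"}
    if mode == "scale_out":
        # Cluster: tasks are queued on a broker and picked up by Celery workers.
        # The URLs below are placeholders, not real endpoints.
        return {
            "AIRFLOW__CORE__EXECUTOR": "CeleryExecutor",
            "AIRFLOW__CELERY__BROKER_URL": "redis://redis-host:6379/0",
            "AIRFLOW__CELERY__RESULT_BACKEND": "db+postgresql://user:pass@db-host/airflow",
        }
    raise ValueError(f"unknown mode: {mode!r}")


if __name__ == "__main__":
    # Print the overrides as shell exports for each setup.
    for mode in ("scale_up", "scale_out"):
        print(f"# {mode}")
        for key, value in executor_env(mode).items():
            print(f"export {key}={value}")
```

In both cases I am assuming the metadata database lives outside the Airflow VM(s), which is part of why I am asking how people handle HA for the rest.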


I think this thread should help others with similar questions as well.






