About DAG discovery not being synced between the scheduler and the webserver


When adding a new DAG, we sometimes see this message:

This DAG isn't available in the web server's DagBag object. It shows up in this list because the scheduler marked it as active in the metadata database.

In views.py, the webserver collects the DAGs under "DAGS_FOLDER" by instantiating a DagBag object, as below:

dagbag = models.DagBag(settings.DAGS_FOLDER)

So the webserver depends on its own timing to collect DAGs. But why not simply query the metadata DB? If a DAG is marked active in the DB, it could then be visible in the web UI right away.
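
To make the discrepancy concrete, here is a minimal sketch of the situation as I understand it. It does not use Airflow itself: an in-memory SQLite table stands in for the metadata DB, and a plain set stands in for the DAG ids held by the webserver's DagBag. The table layout (a "dag" table with "dag_id" and "is_active" columns), the DAG ids, and the helper names are assumptions for illustration only.

```python
import sqlite3


def active_dag_ids(conn):
    """Return the ids of DAGs flagged active in the (stand-in) metadata DB."""
    rows = conn.execute("SELECT dag_id FROM dag WHERE is_active = 1").fetchall()
    return {row[0] for row in rows}


def missing_from_webserver(db_ids, dagbag_ids):
    """DAGs the scheduler marked active that the webserver has not loaded yet."""
    return sorted(set(db_ids) - set(dagbag_ids))


# Stand-in for the metadata DB (assumed schema, not Airflow's real one).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE dag (dag_id TEXT PRIMARY KEY, is_active INTEGER)")
conn.executemany(
    "INSERT INTO dag VALUES (?, ?)",
    [("old_dag", 1), ("new_dag", 1), ("retired_dag", 0)],
)

# Stand-in for the DAG ids in the webserver's DagBag, which was built
# before new_dag was added to DAGS_FOLDER.
webserver_dagbag_ids = {"old_dag"}

missing = missing_from_webserver(active_dag_ids(conn), webserver_dagbag_ids)
print(missing)  # DAGs that would trigger the message quoted above
```

In this sketch only new_dag is reported: it is active in the DB but absent from the webserver's own collection, which is the case where the message appears.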

Could someone share the reasoning behind this design?