Dynamic DAG generation
I have a DAG where the input size (rows) may grow or shrink significantly.
The first step (A) determines the size of the input set and groups it into
batches of a pre-defined size.
In the second step, I want to generate one task per batch to perform an
upload to a third-party API (Google AdWords) / computation.
The final step is a sensor that waits for the batch status to become
completed, followed by a final task.
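For illustration, the batching in step A could look roughly like the sketch below. It is plain Python with no Airflow involved, and `make_batches`, `rows`, and `batch_size` are hypothetical names chosen for this example only.

```python
from typing import Iterator, List, Sequence, TypeVar

T = TypeVar("T")


def make_batches(rows: Sequence[T], batch_size: int) -> Iterator[List[T]]:
    """Yield consecutive slices of `rows`, each at most `batch_size` long."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    # Step through the input in strides of batch_size; the last batch
    # may be shorter when len(rows) is not a multiple of batch_size.
    for start in range(0, len(rows), batch_size):
        yield list(rows[start:start + batch_size])


# Hypothetical input: 7 rows grouped into batches of 3.
rows = list(range(7))
batches = list(make_batches(rows, batch_size=3))
print(batches)  # [[0, 1, 2], [3, 4, 5], [6]]
```

The number of batches, and so the number of upload tasks, is only known once the input has been read, which is why the task count changes from run to run.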
Thoughts so far:
- I don't necessarily need all tasks to execute in parallel; I just want to
be able to control how many do, through Pools
- I could potentially calculate the batch size and number of tasks required
at DAG compile time but this would make my DAG loading very slow (as I will
have lots of DAGs doing this)
- Is changing the number of tasks in a DAG dynamically going to screw up
- I found this https://stackoverflow.com/a/51977800 but it feels a bit of a
- I could trigger multiple DAG runs, but this makes it harder to visualise
and trace through the UI
Or am I approaching this problem in the wrong way?
Thanks for your help,