Re: Capturing data changes that happen after the initial data pull

Are the endpoints to fetch data for the previous week the same as the
endpoints you use to fetch daily data (with only the date filter parameters
being different)?

On Wed, Jun 6, 2018 at 1:47 PM Pedro Machado <pedro@xxxxxxxxxxxxxx> wrote:

> I am working with an API that provides daily data the day after the period
> completes. For example, 2018-06-01 data is available on 2018-06-02 at
> 12 PM.
> I have a daily DAG that pulls this data and loads it into Redshift.
> The issue is that this data provider says that the data may be revised and
> it won't be finalized until the Tuesday after the end of the week.
> For example, for the week of 2018-05-27 through 2018-06-02, the data will
> be "final" on Tuesday 2018-06-05.
> I'd like to add another DAG that takes care of repulling the data for the
> previous week every Tuesday and I am wondering about the best way to
> implement this.
> Should I just develop another DAG that pulls one week at a time using the
> appropriate dates?
> Is there a way to leverage the existing daily DAG and have another DAG
> trigger it with the appropriate execution date? If so, I suppose it would
> create new DAG runs. How will I be able to tell these new DAG runs apart
> from the daily ones if they have the same execution dates?
> Thanks,
> Pedro
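
For reference, the date arithmetic behind the weekly re-pull can be sketched
as below. This is only a minimal sketch, assuming the provider's week runs
Sunday through Saturday (as in the 2018-05-27 through 2018-06-02 example) and
that the re-pull runs after that week has ended. The function name
previous_week_dates is hypothetical and not part of Airflow or of the
existing DAG.

```python
from datetime import date, timedelta


def previous_week_dates(run_date):
    """Return the seven dates (Sunday..Saturday) of the last completed week.

    Assumption: weeks run Sunday through Saturday, matching the example
    in the thread. Hypothetical helper, not an Airflow API.
    """
    # date.weekday(): Monday == 0 ... Saturday == 5, Sunday == 6.
    # Count days back to the most recent Saturday strictly before run_date;
    # if run_date is itself a Saturday, go back a full week.
    days_since_saturday = (run_date.weekday() - 5) % 7 or 7
    week_end = run_date - timedelta(days=days_since_saturday)
    week_start = week_end - timedelta(days=6)
    return [week_start + timedelta(days=i) for i in range(7)]


if __name__ == "__main__":
    # Tuesday 2018-06-05 is the day the example week becomes "final".
    for day in previous_week_dates(date(2018, 6, 5)):
        print(day.isoformat())
```

For a run on Tuesday 2018-06-05 this yields 2018-05-27 through 2018-06-02.
If the weekly endpoints match the daily ones, each of these dates could be
passed to whatever fetch-and-load logic the daily DAG already uses, so the
weekly job only differs in which dates it requests.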