OSDir


[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: 1.10.1 Release?


1.10.1 isn't out yet, so 1.10.0 :)

I think this only affects days with a schedule interval that is one hour or more frequently, and where the dag timezone is not UTC.

For anyone stuck on this in prod I think the fix is to manually create (via backfill or trigger) a dag run after the tz change over time. This should unblock the scheduler.

Ash

On 30 October 2018 18:43:30 GMT, David Klosowski <davidk@xxxxxxxxxxxxx> wrote:
>Hi Airflow Devs:
>
>Is this timezone issue in Airflow version 1.10.0 or only in 1.10.1?
>
>Thanks.
>
>Regards,
>David
>
>On Tue, Oct 30, 2018 at 11:11 AM Bolke de Bruin <bdbruin@xxxxxxxxx>
>wrote:
>
>> we specifically remove timezone info to determine the next schedule.
>Ie.
>> cron sets exact date times so tz info should not make sense. I’m
>going to
>> have a look now.
>>
>>
>> > On 30 Oct 2018, at 19:01, Ash Berlin-Taylor <ash@xxxxxxxxxx> wrote:
>> >
>> > I've done a bit more digging - the issue is of our tz-aware
>handling
>> inside following_schedule (and previous schedule) - causing it to
>loop.
>> >
>> > This section of the croniter docs seems relevant
>> https://github.com/kiorky/croniter#about-dst
>> >
>> >    Be sure to init your croniter instance with a TZ aware datetime
>for
>> this to work !:
>> >>>> local_date = tz.localize(datetime(2017, 3, 26))
>> >>>> val = croniter('0 0 * * *', local_date).get_next(datetime)
>> >
>> > I think the problem is that we are _not_ passing a TZ aware dag in
>and
>> we should be.
>> >
>> >> On 30 Oct 2018, at 17:35, Bolke de Bruin <bdbruin@xxxxxxxxx>
>wrote:
>> >>
>> >> Oh that’s a great environment to start digging. Thanks. I’ll have
>a
>> look.
>> >>
>> >> B.
>> >>
>> >> Verstuurd vanaf mijn iPad
>> >>
>> >>> Op 30 okt. 2018 om 18:25 heeft Ash Berlin-Taylor <ash@xxxxxxxxxx>
>het
>> volgende geschreven:
>> >>>
>> >>> This line in airflow.jobs (line 874 in my checkout) is causing
>the
>> loop:
>> >>>
>> >>>          last_run = dag.get_last_dagrun(session=session)
>> >>>          if last_run and next_run_date:
>> >>>              while next_run_date <= last_run.execution_date:
>> >>>                  next_run_date =
>dag.following_schedule(next_run_date)
>> >>>
>> >>>
>> >>>
>> >>>> On 30 Oct 2018, at 17:20, Ash Berlin-Taylor <ash@xxxxxxxxxx>
>wrote:
>> >>>>
>> >>>> Hi, kaczors on gitter has produced a minmal reproduction case:
>> https://github.com/kaczors/airflow_1_10_tz_bug
>> >>>>
>> >>>> Rough repro steps: In a VM, with time syncing disabled, and
>> configured with system timezone of Europe/Zurich (or any other CEST
>one)
>> run
>> >>>>
>> >>>> - `date 10280250.00`
>> >>>> - initdb, start scheduler, webserver, enable dag etc.
>> >>>> - `date 10280259.00`
>> >>>> - wait 5-10 mins for scheduler to catch up
>> >>>> - After the on-the-hour task run the scheduler will spin up
>another
>> process to parse the dag... and it never returns.
>> >>>>
>> >>>> I've only just managed to reproduce it, so haven't dug in to why
>yet.
>> A quick hacky debug print shows something is stuck in an infinite
>loop.
>> >>>>
>> >>>> -ash
>> >>>>
>> >>>>> On 29 Oct 2018, at 17:59, Bolke de Bruin <bdbruin@xxxxxxxxx>
>wrote:
>> >>>>>
>> >>>>> Can this be confirmed? Then I can have a look at it. Preferably
>with
>> dag definition code.
>> >>>>>
>> >>>>> On the licensing requirements:
>> >>>>>
>> >>>>> 1. Indeed licensing header for markdown documents. It was
>suggested
>> to use html comments. I’m not sure how that renders with others like
>PDF
>> though.
>> >>>>> 2. The licensing notifications need to be tied to a specific
>version
>> as licenses might change with versions.
>> >>>>>
>> >>>>> Cheers
>> >>>>> Bolke
>> >>>>>
>> >>>>> Verstuurd vanaf mijn iPad
>> >>>>>
>> >>>>>> Op 29 okt. 2018 om 12:39 heeft Ash Berlin-Taylor
><ash@xxxxxxxxxx>
>> het volgende geschreven:
>> >>>>>>
>> >>>>>> I was going to make a start on the release, but two people
>have
>> reported that there might be an issue around non-UTC dags and the
>scheduler
>> changing over from Summer time.
>> >>>>>>
>> >>>>>>> 08:45 Emmanuel> Hi there, we are currently experiencing a
>very
>> strange issue : we have hourly DAGs with a start_date in a local
>timezone
>> (not UTC) and since (Sunday) the last winter time change they don’t
>run
>> anymore. Any idea ?
>> >>>>>>> 09:41 <Emmanuel> it impacted all our DAG that had a run at
>3am
>> (Europe/Paris), the exact time of winter time change :(
>> >>>>>>
>> >>>>>> I am going to take a look at this today and see if I can get
>to the
>> bottom of it.
>> >>>>>>
>> >>>>>> Bolke: are there any outstanding tasks/issues that you know of
>that
>> might slow down the vote for a 1.10.1? (i.e. did we sort of out all
>the
>> licensing issues that were asked of us? I thought I read something
>about
>> license declarations in markdown files?)
>> >>>>>>
>> >>>>>> -ash
>> >>>>>>
>> >>>>>>> On 28 Oct 2018, at 14:46, Bolke de Bruin <bdbruin@xxxxxxxxx>
>> wrote:
>> >>>>>>>
>> >>>>>>> I agree with that, but I would favor time based releases
>instead.
>> We are again at the point that a release takes so much time that the
>gap is
>> getting really big again. @ash why not start releasing now and move
>the
>> remainder to 1.10.2? I dont think there are real blockers (although
>we
>> might find them).
>> >>>>>>>
>> >>>>>>>
>> >>>>>>>> On 28 Oct 2018, at 15:35, airflowuser
><airflowuser@xxxxxxxxxxxxxx.INVALID>
>> wrote:
>> >>>>>>>>
>> >>>>>>>> I was really hoping that
>> https://github.com/apache/incubator-airflow/pull/4069 will be merged
>into
>> 1.10.1
>> >>>>>>>> Deleting dags was a highly requested feature for 1.10 - this
>can
>> fix the problem with it.
>> >>>>>>>>
>> >>>>>>>>
>> >>>>>>>> ‐‐‐‐‐‐‐ Original Message ‐‐‐‐‐‐‐
>> >>>>>>>>>> On Friday, October 26, 2018 6:12 PM, Bolke de Bruin <
>> bdbruin@xxxxxxxxx> wrote:
>> >>>>>>>>>
>> >>>>>>>>> Hey Ash,
>> >>>>>>>>>
>> >>>>>>>>> I was wondering if you are picking up the 1.10.1 release?
>Master
>> is speeding ahead and you were tracking fixes for 1.10.1 right?
>> >>>>>>>>>
>> >>>>>>>>> B.
>> >>>>>>
>> >>>>
>> >>>
>> >
>>
>>