OSDir


[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: Datetime_Plus and TimestampAdd operators


There doesn’t seem to be a clear victor:
TIMESTAMPADD is specified by ODBC, whereas interval “+”  is specified by the SQL standard. TIMESTAMPADD
TIMESTAMPADD with Calcite extensions can handle unusual intervals such as WEEK, whereas these cannot be directly expressed using interval “+”
Interval “+” can handle multi-unit intervals (e.g. INTERVAL ‘2:30’ HOUR TO MINUTE) whereas the best TIMESTAMPADD could do would be 90 minutes. 

Whichever form we adopt internally, there will be some looseness. When we convert to the other form, the expression will sometimes look different than the user typed.

There is a good reason to choose an internal representation. Then we can pool our resources writing transformation rules that optimize these kinds of expressions.

However, it would be reasonable for the JDBC adapter to convert to whichever is the preferred form for the target database.

It would be useful to have Rex utilities to convert from each form to the other form.

Julian


> On May 30, 2018, at 3:12 PM, Julian Hyde <jhyde@xxxxxxxxxx> wrote:
> 
> I have logged https://issues.apache.org/jira/browse/CALCITE-2339 <https://issues.apache.org/jira/browse/CALCITE-2339> to capture this discussion, and attached a PDF snapshot of the google doc.
> 
>> On May 30, 2018, at 1:24 PM, James Duong <jduong@xxxxxxxxxx <mailto:jduong@xxxxxxxxxx>> wrote:
>> 
>> Note that this is essentially an extension of the current way we expose
>> these two operations.
>> 
>> We expose a form that normalizes the two approaches but allows the adapter
>> layer
>> to get it into a form that it supports. Some databases support TIMESTAMPADD
>> only
>> such as SQL Server. While others support Datetime_plus only such as Oracle.
>> 
>> On Wed, May 30, 2018 at 7:32 AM, James Duong <jduong@xxxxxxxxxx <mailto:jduong@xxxxxxxxxx>> wrote:
>> 
>>> Thanks Michael.
>>> 
>>> Here is a link to a Google Doc:
>>> https://docs.google.com/document/d/1j0wa0pZ2senQhAoy_ <https://docs.google.com/document/d/1j0wa0pZ2senQhAoy_>
>>> 3W_Ev2vHqIhm3ob19BGy4nBJkI/edit?usp=sharing
>>> 
>>> On Wed, May 30, 2018 at 6:08 AM, Michael Mior <mmior@xxxxxxxxxxxx> wrote:
>>> 
>>>> James,
>>>> 
>>>> Just a note that the list doesn't support attachments. I'd suggest
>>>> creating
>>>> a Google Doc with the contents.
>>>> 
>>>> --
>>>> Michael Mior
>>>> mmior@xxxxxxxxxxxx
>>>> 
>>>> 
>>>> Le mer. 30 mai 2018 à 04:23, James Duong <jduong@xxxxxxxxxx> a écrit :
>>>> 
>>>>> I've recorded my thoughts on this in the attached document
>>>>> 
>>>>> A few notes:
>>>>> - TimestampAdd has very well-spec'd behavior as part of ODBC. For
>>>> example
>>>>> the semantics for how to handle adding a second to a date value are
>>>> clearly
>>>>> defined. I experimented with a few different databases for how this is
>>>>> handled with datetime_plus and didn't find it consistent.
>>>>> - Transforming a TimestampAdd expression to datetime_plus would require
>>>>> multiplying a 'unit' interval by the expression used for the interval
>>>>> parameter in TimestampAdd. This looks odd in generated SQL, and if you
>>>> were
>>>>> go to back to timestampadd, the original intent wouldn't be clear.
>>>>> - There are a few units in timestampadd that are not directly supported
>>>> by
>>>>> intervals (week, quarter, millennium, etc)
>>>>> 
>>>>> So I wouldn't recommend datetime_plus. I'd recommend a new structure
>>>>> similar to timestampadd, but with features to transform to
>>>> datetime_plus.
>>>>> 
>>>>> On Mon, May 28, 2018 at 6:52 PM, Julian Hyde <jhyde@xxxxxxxxxx> wrote:
>>>>> 
>>>>>> I think we should use the datetime_plus operator. It is standard, and
>>>>>> sufficiently general. Its second argument needs to be an interval
>>>>>> value, not necessarily an interval literal.
>>>>>> 
>>>>>> On Mon, May 28, 2018 at 2:35 PM, James Duong <jduong@xxxxxxxxxx>
>>>> wrote:
>>>>>>> There are essentially two ways to add an interval to a datetime vaue
>>>> in
>>>>>>> Calcite
>>>>>>> 
>>>>>>> Call the timestampadd() function:
>>>>>>> select {fn timestampadd(year, 1, hire_date)}...
>>>>>>> 
>>>>>>> Use datetime_plus interval arithmetic:
>>>>>>> select hire_date + interval '1' year
>>>>>>> 
>>>>>>> 
>>>>>>> Note that timestampadd's second argument does not need to be a
>>>> literal.
>>>>>>> Often it is a column expression. For datetime_plus literals are
>>>> usually
>>>>>>> used.
>>>>>>> 
>>>>>>> I propose we create a new SqlOperator that can canonicalize both of
>>>>>> these
>>>>>>> inputs into one node. This lets us apply any transformations on this
>>>>>>> canonical type regardless of what the original query was.
>>>>>>> 
>>>>>>> It takes in the following arguments:
>>>>>>> 1. A date/time/timestamp input
>>>>>>> 2. an interval input as an integer
>>>>>>> 3. a time unit for the input
>>>>>>> 4. a synthetic argument indicating the source form of the function
>>>> call
>>>>>>> (either datetime_plus or timestampadd).
>>>>>>> 
>>>>>>> 
>>>>>>> The idea is that this canonical form is easy to get into for both
>>>> types,
>>>>>>> and provides methods to easily convert to either type. This would
>>>> help
>>>>>> with
>>>>>>> unparsing in SqlDialects (you do not need to implement pushdown for
>>>> both
>>>>>>> types of inputs).
>>>>>> 
>>>>> 
>>>>> 
>>>> 
>>> 
>>> 
>