
Re: Implicit Casts for Arithmetic Operators

This (overflow) is an excellent point, but it also affects aggregations, which were introduced a long time ago.  They already inherit Java semantics (silent wrap-around) for all of the relevant types.  We probably want to be consistent, meaning either changing aggregations (which incurs the cost of changing an established API) or continuing with the Java semantics here.
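
To make "Java semantics" concrete, here is a minimal sketch in plain Java. It is not the actual aggregation code; the class and method names are made up for illustration. It only shows what silent wrap-around looks like when summing ints into an int accumulator:

```java
final class WrapAroundDemo {
    // Sums ints into an int accumulator using ordinary Java arithmetic.
    // On overflow the result wraps around silently; no exception is thrown.
    static int wrappingSum(int... values) {
        int sum = 0;
        for (int v : values) {
            sum += v;
        }
        return sum;
    }

    public static void main(String[] args) {
        // Integer.MAX_VALUE + 1 wraps to Integer.MIN_VALUE.
        System.out.println(wrappingSum(Integer.MAX_VALUE, 1)); // -2147483648
    }
}
```

Whatever we choose for the operators, the result type and overflow behaviour for the same inputs should presumably match this or deliberately replace it.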

This is why having these discussions explicitly in the community before a release is so critical, in my view.  It’s very easy for these semantic changes to go unnoticed on a JIRA, and then ossify.

> On 2 Oct 2018, at 15:48, Ariel Weisberg <ariel@xxxxxxxxxxx> wrote:
> Hi,
> I think we should decide based on what is least surprising, as you mention, provided that isn't overridden by some other concern.
> It seems to me the priorities are
> * Correctness
> * Performance
> * User visible complexity
> * Developer visible complexity
> Defaulting to silent implicit data loss is not ideal from a correctness standpoint.
> Doing something better like using wider types doesn't seem like a performance issue.
> From a user standpoint, doing something less lossy doesn't look more complex as long as it's consistent, documented, and doesn't change from version to version.
> There is some developer complexity, but this is a public API and we only get one shot at this. 
> I wonder how overflow is handled as well. In VoltDB I think we threw on overflow and tended to just do widening conversions to make that less common. We didn't imitate another database (as far as I know); we just went with what was least likely to silently corrupt data.
> https://github.com/VoltDB/voltdb/blob/master/src/ee/common/NValue.hpp#L2213
> https://github.com/VoltDB/voltdb/blob/master/src/ee/common/NValue.hpp#L3764
> Ariel
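The "widen first, throw if it still overflows" approach described above can be sketched in Java roughly as follows. This is only an illustration under my own assumptions, not VoltDB's implementation (which is C++); the class and method names are hypothetical. `Math.addExact` is the standard JDK method that throws `ArithmeticException` on overflow:

```java
final class CheckedAddDemo {
    // Widening: int + int is computed as long, so it cannot overflow.
    static long widenedAdd(int a, int b) {
        return (long) a + b;
    }

    // No wider primitive is available for long, so throw instead of wrapping.
    static long checkedAdd(long a, long b) {
        return Math.addExact(a, b); // throws ArithmeticException on overflow
    }

    public static void main(String[] args) {
        System.out.println(widenedAdd(Integer.MAX_VALUE, 1)); // 2147483648
        try {
            checkedAdd(Long.MAX_VALUE, 1L);
        } catch (ArithmeticException e) {
            System.out.println("overflow detected");
        }
    }
}
```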
> On Tue, Oct 2, 2018, at 7:30 AM, Benedict Elliott Smith wrote:
>> [...] introduced arithmetic operators, and alongside these 
>> came implicit casts for their operands.  There is a semantic decision to 
>> be made, and I think the project would do well to explicitly raise this 
>> kind of question for wider input before release, since the project is 
>> bound by them forever more.
>> In this case, the choice is between lossy and lossless casts for 
>> operations involving integers and floating point numbers.  In essence, 
>> should:
>> (1) float + int = float, double + bigint = double; or
>> (2) float + int = double, double + bigint = decimal; or
>> (3) float + int = decimal, double + bigint = decimal
>> Option 1 performs a lossy implicit cast from int -> float, or bigint -> 
>> double.  Simply casting between these types changes the value.  This is 
>> what MS SQL Server does.
>> Options 2 and 3 cast without loss of precision, and 3 (or thereabouts) 
>> is what PostgreSQL does.
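To illustrate the difference between the options, here is a small sketch using Java primitives as stand-ins for the CQL types (int, bigint as long, float, double, decimal as `BigDecimal`). The mapping and all names are assumptions for illustration only, not the actual operator code:

```java
import java.math.BigDecimal;

final class LossyCastDemo {
    // True if the int value is exactly representable as a float.
    static boolean intFitsFloat(int i) {
        return new BigDecimal((float) i).compareTo(BigDecimal.valueOf(i)) == 0;
    }

    // Option 1: double + bigint = double (the bigint is cast to double first).
    static double addAsDouble(double d, long l) {
        return d + (double) l;
    }

    // Options 2 and 3: double + bigint = decimal (both operands converted exactly).
    static BigDecimal addAsDecimal(double d, long l) {
        return new BigDecimal(d).add(BigDecimal.valueOf(l));
    }

    public static void main(String[] args) {
        int i = (1 << 24) + 1;  // 16777217, needs 25 significant bits
        long l = (1L << 53) + 1; // 9007199254740993, needs 54 significant bits

        System.out.println(intFitsFloat(i));       // false: float has a 24-bit significand
        System.out.println((double) i == 16777217.0); // true: every int fits in a double
        System.out.println(addAsDouble(0.5, l) == 9007199254740992.0); // true: value changed
        System.out.println(addAsDecimal(0.5, l));  // 9007199254740993.5
    }
}
```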
>> The question I’m interested in is not just which is the right decision, 
>> but how the right decision should be arrived at.  My view is that we 
>> should primarily aim for least surprise to the user, but I’m keen to 
>> hear from others.
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: dev-unsubscribe@xxxxxxxxxxxxxxxxxxxx <mailto:dev-unsubscribe@xxxxxxxxxxxxxxxxxxxx>
>> For additional commands, e-mail: dev-help@xxxxxxxxxxxxxxxxxxxx <mailto:dev-help@xxxxxxxxxxxxxxxxxxxx>