OSDir


[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: Implicit Casts for Arithmetic Operators


+1 on Postgres approach. In the last 5 years I’ve seen people move from Oracle and SQL server to some variant of Cassandra or Postgres and other new tech is also more likely to support Postgres (Cockroach..)

I don’t care either way. It really depends on what you are storing.

Rahul Singh
Chief Executive Officer
m 202.905.2818

Anant Corporation
1010 Wisconsin Ave NW, Suite 250
Washington, D.C. 20007

We build and manage digital business technology platforms.
On Oct 2, 2018, 11:11 AM -0700, Jonathan Haddad <jon@xxxxxxxxxxxxx>, wrote:
> Thanks for bringing this up, it definitely needs to be discussed.
>
> Last surprise is difficult here, since all major databases have their own
> way of doing things and people will just assume that their way is the right
> way. On that note, some people will be surprised no matter what we do.
>
> I'd rather avoid the pitfalls of returning incorrect results, so either
> option 2 or 3 sound reasonable, but leaning towards the Postgres approach
> of always returning a decimal for those cases.
>
> Jon
>
>
>
> On Tue, Oct 2, 2018 at 10:54 AM Benedict Elliott Smith <benedict@xxxxxxxxxx>
> wrote:
>
> > I agree, in broad strokes at least. Interested to hear others’ positions.
> >
> >
> >
> > > On 2 Oct 2018, at 16:44, Ariel Weisberg <ariel@xxxxxxxxxxx> wrote:
> > >
> > > Hi,
> > >
> > > I think overflow and the role of widening conversions are pretty linked
> > so I'll continue to inject that into this discussion. Also overflow is much
> > worse since most applications won't be impacted by a loss of precision when
> > an expression involves an int and float, but will care quite a bit if they
> > get some nonsense wrapped number in an integer only expression.
> > >
> > > For VoltDB in practice we didn't run into issues with applications not
> > making progress due to exceptions with real data due to the widening
> > conversions. The range of double and long are pretty big and that hides
> > wrap around/infinity.
> > >
> > > I think the proposal of having all operations return a decimal is
> > attractive in that these expressions always result in a consistent type.
> > Two pain points might be whether client languages have decimal support and
> > whether there is a performance issue? The nice thing about always returning
> > decimal is we can sidestep the issue of overflow.
> > >
> > > I would start with seeing if that's acceptable, and if it isn't then
> > look at other approaches like returning a variety of types such when doing
> > int + int return a bigint or int + float return a double.
> > >
> > > If we take an approach that allows overflow the ideal end state IMO
> > would be to get all users to run Cassandra in way that overflow results in
> > an error even in the context of aggregation. The road to get there is
> > tricky, but maybe start by having it as an opt in tunable in
> > cassandra.yaml. I don't know how/when we could ever change that as a
> > default and it's unfortunate having an option like this that 99% won't know
> > they should flip.
> > >
> > > It seems like having the default throw on overflow is not as bad as it
> > sounds if you do the widening conversions since most people won't run into
> > them. The change in the column types of results sets actually sounds worse
> > if we want to also improve aggregrations. Many applications won't notice if
> > the client library abstracts that away, but I think there are still cases
> > where people would notice the type changing.
> > >
> > > Ariel
> > >
> > > > On Tue, Oct 2, 2018, at 11:09 AM, Benedict Elliott Smith wrote:
> > > > This (overflow) is an excellent point, but this also affects
> > > > aggregations which were introduced a long time ago. They already
> > > > inherit Java semantics for all of the relevant types (silent wrap
> > > > around). We probably want to be consistent, meaning either changing
> > > > aggregations (which incurs a cost for changing API) or continuing the
> > > > java semantics here.
> > > >
> > > > This is why having these discussions explicitly in the community before
> > > > a release is so critical, in my view. It’s very easy for these
> > semantic
> > > > changes to go unnoticed on a JIRA, and then ossify.
> > > >
> > > >
> > > > > On 2 Oct 2018, at 15:48, Ariel Weisberg <ariel@xxxxxxxxxxx> wrote:
> > > > >
> > > > > Hi,
> > > > >
> > > > > I think we should decide based on what is least surprising as you
> > mention, but isn't overridden by some other concern.
> > > > >
> > > > > It seems to me the priorities are
> > > > >
> > > > > * Correctness
> > > > > * Performance
> > > > > * User visible complexity
> > > > > * Developer visible complexity
> > > > >
> > > > > Defaulting to silent implicit data loss is not ideal from a
> > correctness standpoint.
> > > > >
> > > > > Doing something better like using wider types doesn't seem like a
> > performance issue.
> > > > >
> > > > > From a user standpoint doing something less lossy doesn't look more
> > complex as long as it's consistent, and documented and doesn't change from
> > version to version.
> > > > >
> > > > > There is some developer complexity, but this is a public API and we
> > only get one shot at this.
> > > > >
> > > > > I wonder about how overflow is handled as well. In VoltDB I think we
> > threw on overflow and tended to just do widening conversions to make that
> > less common. We didn't imitate another database (as far as I know) we just
> > went with what least likely to silently corrupt data.
> > > > >
> > https://github.com/VoltDB/voltdb/blob/master/src/ee/common/NValue.hpp#L2213
> > <
> > https://github.com/VoltDB/voltdb/blob/master/src/ee/common/NValue.hpp#L2213
> > >
> > > > >
> > https://github.com/VoltDB/voltdb/blob/master/src/ee/common/NValue.hpp#L3764
> > <
> > https://github.com/VoltDB/voltdb/blob/master/src/ee/common/NValue.hpp#L3764
> > >
> > > > >
> > > > > Ariel
> > > > >
> > > > > > On Tue, Oct 2, 2018, at 7:30 AM, Benedict Elliott Smith wrote:
> > > > > > ç introduced arithmetic operators, and alongside these
> > > > > > came implicit casts for their operands. There is a semantic decision
> > to
> > > > > > be made, and I think the project would do well to explicitly raise
> > this
> > > > > > kind of question for wider input before release, since the project is
> > > > > > bound by them forever more.
> > > > > >
> > > > > > In this case, the choice is between lossy and lossless casts for
> > > > > > operations involving integers and floating point numbers. In
> > essence,
> > > > > > should:
> > > > > >
> > > > > > (1) float + int = float, double + bigint = double; or
> > > > > > (2) float + int = double, double + bigint = decimal; or
> > > > > > (3) float + int = decimal, double + bigint = decimal
> > > > > >
> > > > > > Option 1 performs a lossy implicit cast from int -> float, or bigint
> > ->
> > > > > > double. Simply casting between these types changes the value. This
> > is
> > > > > > what MS SQL Server does.
> > > > > > Options 2 and 3 cast without loss of precision, and 3 (or
> > thereabouts)
> > > > > > is what PostgreSQL does.
> > > > > >
> > > > > > The question I’m interested in is not just which is the right
> > decision,
> > > > > > but how the right decision should be arrived at. My view is that we
> > > > > > should primarily aim for least surprise to the user, but I’m keen to
> > > > > > hear from others.
> > > > > > ---------------------------------------------------------------------
> > > > > > To unsubscribe, e-mail: dev-unsubscribe@xxxxxxxxxxxxxxxxxxxx <mailto:
> > dev-unsubscribe@xxxxxxxxxxxxxxxxxxxx>
> > > > > > For additional commands, e-mail: dev-help@xxxxxxxxxxxxxxxxxxxx
> > <mailto:dev-help@xxxxxxxxxxxxxxxxxxxx>
> > > > > >
> > > > >
> > > > > ---------------------------------------------------------------------
> > > > > To unsubscribe, e-mail: dev-unsubscribe@xxxxxxxxxxxxxxxxxxxx <mailto:
> > dev-unsubscribe@xxxxxxxxxxxxxxxxxxxx>
> > > > > For additional commands, e-mail: dev-help@xxxxxxxxxxxxxxxxxxxx
> > <mailto:dev-help@xxxxxxxxxxxxxxxxxxxx>
> > >
> > > ---------------------------------------------------------------------
> > > To unsubscribe, e-mail: dev-unsubscribe@xxxxxxxxxxxxxxxxxxxx
> > > For additional commands, e-mail: dev-help@xxxxxxxxxxxxxxxxxxxx
> > >
> >
> > ---------------------------------------------------------------------
> > To unsubscribe, e-mail: dev-unsubscribe@xxxxxxxxxxxxxxxxxxxx
> > For additional commands, e-mail: dev-help@xxxxxxxxxxxxxxxxxxxx
> >
> >
>
> --
> Jon Haddad
> http://www.rustyrazorblade.com
> twitter: rustyrazorblade