OSDir


[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: UDF


I'm +1 with this solution going in 4.0.

That said, this make we realize that through this dependency we've
ended up exposing (publicly) a bit too much to UDF. Namely, all we really
need/want to expose for UDF is the "value" classes (UDTValue, TupleValue,
Duration and LocalDate) and the types (DataType and his 2 subclasses
UserType
and TupleType).

But we end up exposing quite a bit more: some abstract classes that are
public
(AbstractGettableData, AbstractSettableData, ...), but more problematic imo,
the CodecRegistry class (whose only need to be public is, as far as I can
tell, the UserType/TupeType.of() methods, which aren't useful to UDF at
all)
which 1) is overly complex for what we care about in UDF and 2) brings
transitive dependencies (TypeCodec and all his sub-classes, TypeToken).

So my suggestion is that we go with Robert's patch, but on top of that we
add detection for usage of all those classes/methods that have no business
being
exposed. For brand new UDFs, we could just reject them if they use any of
those
in 4.0. For existing UDFs, we'd log a warning at startup that say it's
deprecated and that the UDF should be replaced or things will break in the
next
major. Honestly, my expectation is that no-one uses those classes/methods
(since again, they aren't exactly useful) so I don't anticipate doing so
being
a big deal. But unless we do it, we won't be able to clean up the code
added by
this.

And there is quite a bit that would be nice to cleanup at some point here
imo.
Not that the driver code is bad in any way; just that 1) some of the
complexity
it exposes is overkill for UDF (registering TypeCodec typically), and that a
lot of the code added duplicates things we already have server side).

--
Sylvain


On Wed, Sep 12, 2018 at 11:46 AM Aleksey Yeschenko <aleksey@xxxxxxxxxx>
wrote:

> It’s my understanding that the duplicated code is being fully donated - so
> it’ll have all the usual ASF license/copyright headers when it lands in
> trunk. No special steps to take here.
>
> —
> AY
>
> On 11 September 2018 at 19:43:58, Jeremiah D Jordan (
> jeremiah.jordan@xxxxxxxxx) wrote:
>
> Be careful when pulling in source files from the DataStax Java Driver (or
> anywhere) to make sure and respect its Apache License, Version 2.0 and keep
> all Copyright's etc with said files.
>
> -Jeremiah
>
> > On Sep 11, 2018, at 12:29 PM, Jeff Jirsa <jjirsa@xxxxxxxxx> wrote:
> >
> > +1 as well.
> >
> > On Tue, Sep 11, 2018 at 10:27 AM Aleksey Yeschenko <aleksey@xxxxxxxxxx>
>
> > wrote:
> >
> >> If this is about inclusion in 4.0, then I support it.
> >>
> >> Technically this is *mostly* just a move+donation of some code from
> >> java-driver to Cassandra. Given how important this seemingly is to the
> >> board and PMC for us to not have the dependency on the driver, the
> sooner
> >> it’s gone, the better.
> >>
> >> I’d be +1 for committing to trunk.
> >>
> >> —
> >> AY
> >>
> >> On 11 September 2018 at 14:43:29, Robert Stupp (snazy@xxxxxxxx)
> wrote:
> >>
> >> The patch is technically complete - i.e. it works and does its thing.
> >>
> >> It's not strictly a bug fix but targets trunk. That's why I started
> the
> >> discussion.
> >>
> >>
> >> On 09/11/2018 02:53 PM, Jason Brown wrote:
> >>> Hi Robert,
> >>>
> >>> Thanks for taking on this work. Is this message a heads up that a
> patch
> >> is
> >>> coming/complete, or to spawn a discussion about including this in
> 4.0?
> >>>
> >>> Thanks,
> >>>
> >>> -Jason
> >>>
> >>> On Tue, Sep 11, 2018 at 2:32 AM, Robert Stupp <snazy@xxxxxxxx>
> wrote:
> >>>
> >>>> In an effort to clean up our hygiene and limit the dependencies used
> >> by
> >>>> UDFs/UDAs, I think we should refactor the UDF code parts and remove
> >> the
> >>>> dependency to the Java Driver in that area without breaking existing
> >>>> UDFs/UDAs.
> >>>>
> >>>> A working prototype is in this branch:
> https://urldefense.proofpoint.com/v2/url?u=https-3A__github.com_snazy_&d=DwIFaQ&c=adz96Xi0w1RHqtPMowiL2g&r=CNZK3RiJDLqhsZDG6FQGnXn8WyPRCQhp4x_uBICNC0g&m=Gesm79MRSHznQEKqQabvh3Ie1L3xzqlPsfLfEfadHTM&s=6lUpmmETCKbmt_zcp_DCLIxCGPjVyf7zdX0UjBVOZX4&e=
>
> >>>> cassandra/tree/feature/remove-udf-driver-dep-trunk <
> >>>>
> https://urldefense.proofpoint.com/v2/url?u=https-3A__github.com_snazy_cassandra_tree_feature_remove-2D&d=DwIFaQ&c=adz96Xi0w1RHqtPMowiL2g&r=CNZK3RiJDLqhsZDG6FQGnXn8WyPRCQhp4x_uBICNC0g&m=Gesm79MRSHznQEKqQabvh3Ie1L3xzqlPsfLfEfadHTM&s=fBx64l59d8Y9Q7m9j0nNH9VvcaHc3QfoCAx4st5UJDM&e=
>
> >>>> udf-driver-dep-trunk> . The changes are rather trivial and provide
> >> 100%
> >>>> backwards compatibility for existing UDFs.
> >>>>
> >>>> The prototype copies the necessary parts from the Java Driver into
> the
> >> C*
> >>>> source tree to org.apache.cassandra.cql3.functions.types and adopts
> >> its
> >>>> usages - i.e. UDF/UDA code plus CQLSSTableWriter +
> >> StressCQLSSTableWriter.
> >>>> The latter two classes have a reference to UDF’s UDHelper and had to
> >> be
> >>>> changed as well.
> >>>>
> >>>> Some functionality, like type parsing & handling, is duplicated in
> the
> >>>> code base with this prototype - once in the “current” source tree
> and
> >> once
> >>>> for UDFs. However, unifying the code paths is not trivial, since the
> >> UDF
> >>>> sandbox prohibits the use of internal classes (direct and likely
> >> indirect
> >>>> dependencies).
> >>>>
> >>>> Robert
> >>>>
> >>>> —
> >>>> Robert Stupp
> >>>> @snazy
> >>>>
> >>>>
> >>
> >> --
> >> Robert Stupp
> >> @snazy
> >>
> >>
> >> ---------------------------------------------------------------------
> >> To unsubscribe, e-mail: dev-unsubscribe@xxxxxxxxxxxxxxxxxxxx
> >> For additional commands, e-mail: dev-help@xxxxxxxxxxxxxxxxxxxx
> >>
> >>
>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: dev-unsubscribe@xxxxxxxxxxxxxxxxxxxx
> For additional commands, e-mail: dev-help@xxxxxxxxxxxxxxxxxxxx
>
>