Re: Evolving the client protocol


Is ScyllaDB an example of the “abuse” of the Apache license, in that it uses Cassandra but does not contribute back?
Happy to be proved wrong, as I am not a lawyer and don’t understand the various licenses.

> On Apr 23, 2018, at 16:55, Dor Laor <dor@xxxxxxxxxxxx> wrote:
> 
>> On Mon, Apr 23, 2018 at 4:13 PM, Jonathan Haddad <jon@xxxxxxxxxxxxx> wrote:
>> 
>> From where I stand it looks like you've got only two options for any
>> feature that involves updating the protocol:
>> 
>> 1. Don't build the feature
>> 2. Build it in Cassandra & ScyllaDB, update the drivers accordingly
>> 
>> I don't think you have a third option, which is to build it only in ScyllaDB,
>> because that means you have to fork *all* the drivers and make it work,
>> then maintain them.  Your business model appears to be built on not doing
>> any of the driver work yourself, and you certainly aren't giving back to
>> the open source community via a permissive license on ScyllaDB itself, so
>> I'm a bit lost here.
>> 
> 
> It's totally not about the business model.
> Scylla itself is 99% open source under the AGPL license, which prevents abuse
> and forces changes to be committed back to the project. We also have our core
> engine (Seastar) licensed as Apache, since it needs to be integrated with the
> core application.
> Recently one of our community members even created a new Seastar-based C++
> driver.
> 
> Scylla chose to be compatible with the drivers in order to leverage the
> existing infrastructure and (let's be frank) in order to allow smooth
> migration.
> We would have loved to contribute more to the drivers, but until recently we:
> 1. Were in over our heads with the server
> 2. Were happy with the existing drivers
> 3. Developed extensions (GoCQLX, our own contribution)
>
> Finally we can contribute back to the same driver project, and we want to do
> it the right way, without forking and without duplicated effort.
> 
> Many times, having a private fork is way easier than proper open source
> work, so from a pure business perspective we are not taking the shortest path.
> 
> 
>> 
>> To me it looks like you're asking a bunch of volunteers that work on
>> Cassandra to accommodate you.  What exactly do we get out of this
>> relationship?  What incentive do I or anyone else have to spend time
>> helping you instead of working on something that interests me?
>> 
> 
> Jon, this is certainly not the case.
> We genuinely wish to do true *open source* work on:
> a. Cassandra drivers
> b. Client protocol
> c. Scylla server side.
> d. Cassandra community related work: mailing list, Jira, design
> 
> But not
> e. Cassandra server side
> 
> While I wouldn't mind doing the Cassandra server work, we don't have the
> resources or the expertise. The Cassandra _developer_ community is welcome to
> decide whether we get to contribute a/b/c/d. Avi has enumerated the options of
> cooperation, passive cooperation and zero cooperation (below).
> 
> 1. The protocol change is developed using the Cassandra process in a JIRA
> ticket, culminating in a patch to doc/native_protocol*.spec when consensus
> is achieved.
> 2. The protocol change is developed outside the Cassandra process.
> 3. No cooperation.
> 
> Look, I can understand the hostility and suspicion; however, from the C*
> project's POV it makes no sense to ignore this, otherwise we'll fork the
> drivers and you won't get anything back. There is at least one other vendor
> today with their own server fork and driver fork, and it makes sense to keep
> the protocol unified in an extensible way and to discuss new features
> _together_.
> 
> 
> 
>> 
>> Jon
>> 
>> 
>> On Mon, Apr 23, 2018 at 7:59 AM Ben Bromhead <ben@xxxxxxxxxxxxxxx> wrote:
>> 
>>>>>> This doesn't work without additional changes, for RF>1. The token ring
>>>>>> could place two replicas of the same token range on the same physical
>>>>>> server, even though those are two separate cores of the same server. You
>>>>>> could add another element to the hierarchy (cluster -> datacenter -> rack
>>>>>> -> node -> core/shard), but that generates unneeded range movements when a
>>>>>> node is added.
>>>>> I have seen rack awareness used/abused to solve this.
>>>>> 
>>>> 
>>>> But then you lose real rack awareness. It's fine for a quick hack, but
>>>> not a long-term solution.
>>>> 
>>>> (it also creates a lot more tokens, something nobody needs)
>>>> 
>>> 
>>> I'm having trouble understanding how you lose "real" rack awareness, as
>>> these shards are in the same rack anyway, because the address and port are
>>> on the same server in the same rack. So it behaves as expected. Could you
>>> explain a situation where the shards on a single server would be in
>>> different racks (or fault domains)?
>>> 
>>> If you wanted to support a situation where you have a single rack per DC
>>> for simple deployments, extending NetworkTopologyStrategy to behave the way
>>> it did before https://issues.apache.org/jira/browse/CASSANDRA-7544 with
>>> respect to treating InetAddresses as servers rather than the address and
>>> port would be simple. Both this implementation in Apache Cassandra and the
>>> respective load balancing classes in the drivers are explicitly designed to
>>> be pluggable so that would be an easier integration point for you.
>>> 
>>> I'm not sure how it creates more tokens? If a server normally owns 256
>>> tokens, each shard on a different port would just advertise ownership of
>>> 256/# of cores (e.g. 4 tokens if you had 64 cores).
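The arithmetic in the paragraph above can be sketched as follows. This is only an illustration of the even split Ben describes; the function name `tokens_per_shard` is hypothetical and does not correspond to anything in Cassandra or ScyllaDB.

```python
def tokens_per_shard(tokens_per_server: int, shards: int) -> tuple:
    """Split a server's tokens evenly across its shards.

    Returns (tokens each shard advertises, tokens left over).
    """
    if shards <= 0:
        raise ValueError("shards must be positive")
    return divmod(tokens_per_server, shards)


# The example from the thread: 256 tokens on a 64-core server.
print(tokens_per_shard(256, 64))   # (4, 0)

# The 160-shard machine mentioned elsewhere in the thread does not divide evenly.
print(tokens_per_shard(256, 160))  # (1, 96)
```

When the shard count does not divide the token count, some policy for the leftover tokens would be needed; the thread does not say what that policy would be.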
>>> 
>>> 
>>>> 
>>>>> Regards,
>>>>> Ariel
>>>>> 
>>>>>> On Apr 22, 2018, at 8:26 AM, Avi Kivity <avi@xxxxxxxxxxxx> wrote:
>>>>>> 
>>>>>> 
>>>>>> 
>>>>>>> On 2018-04-19 21:15, Ben Bromhead wrote:
>>>>>>> Re #3:
>>>>>>> 
>>>>>>> Yup I was thinking each shard/port would appear as a discrete server
>>>>>>> to the client.
>>>>>> This doesn't work without additional changes, for RF>1. The token ring
>>>>>> could place two replicas of the same token range on the same physical
>>>>>> server, even though those are two separate cores of the same server. You
>>>>>> could add another element to the hierarchy (cluster -> datacenter -> rack
>>>>>> -> node -> core/shard), but that generates unneeded range movements when a
>>>>>> node is added.
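A minimal sketch of the RF>1 problem Avi describes, assuming a simplified ring walk in which replicas are the next distinct endpoints clockwise from a token. This is not Cassandra's actual replication strategy code; the addresses, ports, and the `replicas` helper are hypothetical.

```python
from typing import List, Tuple

Endpoint = Tuple[str, int]  # (host, port); one endpoint per shard


def replicas(ring: List[Tuple[int, Endpoint]], start: int, rf: int,
             by_host: bool) -> List[Endpoint]:
    """Walk the ring clockwise from index `start` and collect `rf` replicas.

    by_host=False: any two distinct (host, port) pairs count as different nodes.
    by_host=True:  each replica must live on a different physical host.
    """
    chosen: List[Endpoint] = []
    seen = set()
    for i in range(len(ring)):
        _, endpoint = ring[(start + i) % len(ring)]
        key = endpoint[0] if by_host else endpoint
        if key not in seen:
            seen.add(key)
            chosen.append(endpoint)
        if len(chosen) == rf:
            break
    return chosen


# Two physical servers, two shards each, every shard on its own port.
ring = [
    (0,   ("10.0.0.1", 9042)),
    (100, ("10.0.0.1", 9043)),
    (200, ("10.0.0.2", 9042)),
    (300, ("10.0.0.2", 9043)),
]

naive = replicas(ring, start=0, rf=2, by_host=False)
aware = replicas(ring, start=0, rf=2, by_host=True)
print(naive)  # [('10.0.0.1', 9042), ('10.0.0.1', 9043)]
print(aware)  # [('10.0.0.1', 9042), ('10.0.0.2', 9042)]
```

With `by_host=False` both replicas land on 10.0.0.1, so losing that one machine loses every copy of the range. Grouping by host avoids that, which is the extra level of hierarchy the paragraph above refers to.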
>>>>>> 
>>>>>>> If the per port suggestion is unacceptable due to hardware requirements,
>>>>>>> remembering that Cassandra is built with the concept of scaling *commodity*
>>>>>>> hardware horizontally, you'll have to spend your time and energy convincing
>>>>>>> the community to support a protocol feature it has no (current) use for or
>>>>>>> find another interim solution.
>>>>>> Those servers are commodity servers (not x86, but still commodity). In
>>>>>> any case 60+ logical cores are common now (hello AWS i3.16xlarge or even
>>>>>> i3.metal), and we can only expect logical core count to continue to
>>>>>> increase (there are 48-core ARM processors now).
>>>>>> 
>>>>>>> Another way would be to build support and consensus around a clear
>>>>>>> technical need in the Apache Cassandra project as it stands today.
>>>>>>>
>>>>>>> One way to build community support might be to contribute an Apache
>>>>>>> licensed thread per core implementation in Java that matches the protocol
>>>>>>> change and shard concept you are looking for ;P
>>>>>> I doubt I'll survive the egregious top-posting that is going on in this
>>>>>> list.
>>>>>> 
>>>>>>> 
>>>>>>>> On Thu, Apr 19, 2018 at 1:43 PM Ariel Weisberg <ariel@xxxxxxxxxxx> wrote:
>>>>>>>> 
>>>>>>>> Hi,
>>>>>>>> 
>>>>>>>> So at a technical level I don't understand this yet.
>>>>>>>> 
>>>>>>>> So you have a database consisting of single threaded shards and a socket
>>>>>>>> for accept that is generating TCP connections and in advance you don't know
>>>>>>>> which connection is going to send messages to which shard.
>>>>>>>> 
>>>>>>>> What is the mechanism by which you get the packets for a given TCP
>>>>>>>> connection delivered to a specific core? I know that a given TCP connection
>>>>>>>> will normally have all of its packets delivered to the same queue from the
>>>>>>>> NIC because the tuple of source address + port and destination address +
>>>>>>>> port is typically hashed to pick one of the queues the NIC presents. I
>>>>>>>> might have the contents of the tuple slightly wrong, but it always includes
>>>>>>>> a component you don't get to control.
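The queue selection Ariel describes can be sketched roughly as below. Real NICs commonly use a keyed Toeplitz hash for receive-side scaling (RSS); CRC32 is used here only as a stand-in, and the addresses, ports, and `pick_queue` helper are hypothetical.

```python
import zlib


def pick_queue(src_ip: str, src_port: int, dst_ip: str, dst_port: int,
               n_queues: int) -> int:
    """Map a TCP connection's 4-tuple to one of the NIC's receive queues.

    CRC32 is a stand-in for the hardware hash; only the shape of the
    computation (hash the tuple, reduce modulo queue count) is the point.
    """
    key = f"{src_ip}:{src_port}->{dst_ip}:{dst_port}".encode()
    return zlib.crc32(key) % n_queues


N_QUEUES = 8

# Every packet of one connection carries the same tuple, so it always
# lands on the same queue.
q1 = pick_queue("192.0.2.10", 50123, "10.0.0.1", 9042, N_QUEUES)
q2 = pick_queue("192.0.2.10", 50123, "10.0.0.1", 9042, N_QUEUES)
print(q1 == q2)  # True

# The source port is normally an ephemeral port picked by the client's OS,
# so the server cannot choose which queue a new connection will hash to.
queues_hit = {
    pick_queue("192.0.2.10", port, "10.0.0.1", 9042, N_QUEUES)
    for port in range(50000, 51000)
}
print(len(queues_hit) > 1)  # True
```

The second part illustrates the "component you don't get to control": connections from the same client to the same server port still spread across queues because the source port varies.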
>>>>>>>> 
>>>>>>>> Since it's hashing, how do you manipulate which queue packets for a TCP
>>>>>>>> connection go to, and how is it made worse by having an accept socket per
>>>>>>>> shard?
>>>>>>>> 
>>>>>>>> You also mention 160 ports as bad, but it doesn't sound like a big number
>>>>>>>> resource wise. Is it an operational headache?
>>>>>>>> 
>>>>>>>> RE tokens distributed amongst shards. The way that would work right now is
>>>>>>>> that each port number appears to be a discrete instance of the server. So
>>>>>>>> you could have shards be actual shards that are simply colocated on the
>>>>>>>> same box, run in the same process, and share resources. I know this pushes
>>>>>>>> more of the complexity into the server vs the driver as the server expects
>>>>>>>> all shards to share some client-visible state like system tables and certain
>>>>>>>> identifiers.
>>>>>>>> 
>>>>>>>> Ariel
>>>>>>>>> On Thu, Apr 19, 2018, at 12:59 PM, Avi Kivity wrote:
>>>>>>>>> Port-per-shard is likely the easiest option but it's too ugly to
>>>>>>>>> contemplate. We run on machines with 160 shards (IBM POWER 2s20c160t
>>>>>>>>> IIRC), it will be just horrible to have 160 open ports.
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> It also doesn't fit well with the NIC's ability to automatically
>>>>>>>>> distribute packets among cores using multiple queues, so the kernel
>>>>>>>>> would have to shuffle those packets around. Much better to have those
>>>>>>>>> packets delivered directly to the core that will service them.
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> (also, some protocol changes are needed so the driver knows how tokens
>>>>>>>>> are distributed among shards)
>>>>>>>>> 
>>>>>>>>>> On 2018-04-19 19:46, Ben Bromhead wrote:
>>>>>>>>>> WRT #3
>>>>>>>>>> To fit in the existing protocol, could you have each shard listen on a
>>>>>>>>>> different port? Drivers are likely going to support this due to
>>>>>>>>>> https://issues.apache.org/jira/browse/CASSANDRA-7544 (
>>>>>>>>>> https://issues.apache.org/jira/browse/CASSANDRA-11596). I'm not super
>>>>>>>>>> familiar with the ticket so there might be something I'm missing but it
>>>>>>>>>> sounds like a potential approach.
>>>>>>>>>> 
>>>>>>>>>> This would give you a path forward at least for the short term.
>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>>> On Thu, Apr 19, 2018 at 12:10 PM Ariel Weisberg <ariel@xxxxxxxxxxx> wrote:
>>>>>>>>>>> Hi,
>>>>>>>>>>> 
>>>>>>>>>>> I think that updating the protocol spec to Cassandra puts the onus on the
>>>>>>>>>>> party changing the protocol specification to have an implementation of the
>>>>>>>>>>> spec in Cassandra as well as the Java and Python driver (those are both
>>>>>>>>>>> used in the Cassandra repo). Until it's implemented in Cassandra we haven't
>>>>>>>>>>> fully evaluated the specification change. There is no substitute for trying
>>>>>>>>>>> to make it work.
>>>>>>>>>>> 
>>>>>>>>>>> There are also realities to consider as to what the maintainers of the
>>>>>>>>>>> drivers are willing to commit.
>>>>>>>>>>> 
>>>>>>>>>>> RE #1,
>>>>>>>>>>> 
>>>>>>>>>>> I am +1 on the fact that we shouldn't require an extra hop for range scans.
>>>>>>>>>>> In JIRA Jeremiah made the point that you can still do this from the client
>>>>>>>>>>> by breaking up the token ranges, but it's a leaky abstraction to have a
>>>>>>>>>>> paging interface that isn't a vanilla ResultSet interface. Serial vs.
>>>>>>>>>>> parallel is kind of orthogonal as the driver can do either.
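A rough sketch of what "breaking up the token ranges" from the client could look like, assuming the cluster uses Murmur3Partitioner, whose tokens span -2^63 to 2^63-1. The helper `split_token_range`, the keyspace/table `ks.events`, and the column `pk` are hypothetical; a real client would more likely split along the ring's actual token ownership rather than into equal slices.

```python
MIN_TOKEN = -(2 ** 63)      # Murmur3Partitioner lower bound (assumed partitioner)
MAX_TOKEN = 2 ** 63 - 1     # Murmur3Partitioner upper bound


def split_token_range(n: int) -> list:
    """Split the full token range into n contiguous inclusive (start, end) pairs."""
    if n <= 0:
        raise ValueError("n must be positive")
    total = MAX_TOKEN - MIN_TOKEN + 1  # 2**64 tokens in all
    bounds = [MIN_TOKEN + (total * i) // n for i in range(n + 1)]
    return [(bounds[i], bounds[i + 1] - 1) for i in range(n)]


ranges = split_token_range(4)

# One query per subrange; the client can run these serially or in parallel.
QUERY = ("SELECT * FROM ks.events "
         "WHERE token(pk) >= {start} AND token(pk) <= {end}")
for start, end in ranges:
    print(QUERY.format(start=start, end=end))
```

Each subrange query returns its own pages, so the caller has to stitch the results together. That stitching is the leaky abstraction the paragraph above objects to.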
>>>>>>>>>>> 
>>>>>>>>>>> I agree it looks like the current specification doesn't make what should
>>>>>>>>>>> be simple as simple as it could be for driver implementers.
>>>>>>>>>>> 
>>>>>>>>>>> RE #2,
>>>>>>>>>>> 
>>>>>>>>>>> +1 on this change assuming an implementation in Cassandra and the Java and
>>>>>>>>>>> Python drivers.
>>>>>>>>>>> 
>>>>>>>>>>> RE #3,
>>>>>>>>>>> 
>>>>>>>>>>> It's hard to be +1 on this because we don't benefit by boxing ourselves in
>>>>>>>>>>> by defining a spec we haven't implemented, tested, and decided we are
>>>>>>>>>>> satisfied with. Having it in ScyllaDB de-risks it to a certain extent, but
>>>>>>>>>>> what if Cassandra decides to go a different direction in some way?
>>>>>>>>>>> 
>>>>>>>>>>> I don't think there is much discussion to be had without an example of the
>>>>>>>>>>> changes to the CQL specification to look at, but even then if it looks
>>>>>>>>>>> risky I am not likely to be in favor of it.
>>>>>>>>>>> 
>>>>>>>>>>> Regards,
>>>>>>>>>>> Ariel
>>>>>>>>>>> 
>>>>>>>>>>>> On Thu, Apr 19, 2018, at 9:33 AM, glommer@xxxxxxxxxxxx wrote:
>>>>>>>>>>>> On 2018/04/19 07:19:27, kurt greaves <kurt@xxxxxxxxxxxxxxx> wrote:
>>>>>>>>>>>>>> 1. The protocol change is developed using the Cassandra process in
>>>>>>>>>>>>>>     a JIRA ticket, culminating in a patch to
>>>>>>>>>>>>>>     doc/native_protocol*.spec when consensus is achieved.
>>>>>>>>>>>>> I don't think forking would be desirable (for anyone) so this seems
>>>>>>>>>>>>> the most reasonable to me. For 1 and 2 it certainly makes sense but
>>>>>>>>>>>>> can't say I know enough about sharding to comment on 3 - seems to me
>>>>>>>>>>>>> like it could be locking in a design before anyone truly knows what
>>>>>>>>>>>>> sharding in C* looks like. But hopefully I'm wrong and there are
>>>>>>>>>>>>> devs out there that have already thought that through.
>>>>>>>>>>>> Thanks. That is our view and is great to hear.
>>>>>>>>>>>> 
>>>>>>>>>>>> About our proposal number 3: In my view, good protocol designs are
>>>>>>>>>>>> future proof and flexible. We certainly don't want to propose a design
>>>>>>>>>>>> that works just for Scylla, but one that would support reasonable
>>>>>>>>>>>> implementations regardless of what they may look like.
>>>>>>>>>>>> 
>>>>>>>>>>>>> Do we have driver authors who wish to support both projects?
>>>>>>>>>>>>> 
>>>>>>>>>>>>> Surely, but I imagine it would be a minority.
>>>>>>>>>>>>> 
>>>>>>>>>>>> 
>>>>>>>>>>>> ---------------------------------------------------------------------
>>>>>>>>>>>> To unsubscribe, e-mail: dev-unsubscribe@xxxxxxxxxxxxxxxxxxxx
>>>>>>>>>>>> For additional commands, e-mail: dev-help@xxxxxxxxxxxxxxxxxxxx
>>>>>>>>>>>> 
>>>>>>>>>>> 
>>>>>>>>>>> 
>>>>>>>>>>> --
>>>>>>>>>> Ben Bromhead
>>>>>>>>>> CTO | Instaclustr <https://www.instaclustr.com/>
>>>>>>>>>> +1 650 284 9692
>>>>>>>>>> Reliability at Scale
>>>>>>>>>> Cassandra, Spark, Elasticsearch on AWS, Azure, GCP and Softlayer
>>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>> 
>>>>>>>> 
>>>>>>>> --
>>>>>>> Ben Bromhead
>>>>>>> CTO | Instaclustr <https://www.instaclustr.com/>
>>>>>>> +1 650 284 9692
>>>>>>> Reliability at Scale
>>>>>>> Cassandra, Spark, Elasticsearch on AWS, Azure, GCP and Softlayer
>>>>>>> 
>>>>>> 
>>>>>> 
>>>>> 
>>>>> 
>>>> 
>>>> --
>>> Ben Bromhead
>>> CTO | Instaclustr <https://www.instaclustr.com/>
>>> +1 650 284 9692
>>> Reliability at Scale
>>> Cassandra, Spark, Elasticsearch on AWS, Azure, GCP and Softlayer
>>> 
>> 
