Re: Repair scheduling tools


I agree on including it in the distribution, but I think repair can live independently and be run and configured separately.

--
Rahul Singh
rahul.singh@xxxxxxxx

Anant Corporation

On Apr 3, 2018, 4:37 PM -0400, Nate McCall <zznate.m@xxxxxxxxx>, wrote:
> This document does a really good job of listing out some of the issues
> of coordinating scheduling repair. Regardless of which camp you fall
> into, it is certainly worth a read.
>
> On Wed, Apr 4, 2018 at 8:10 AM, Joseph Lynch <joe.e.lynch@xxxxxxxxx> wrote:
> > I just want to say I think it would be great for our users if we moved
> > repair scheduling into Cassandra itself. The team here at Netflix has
> > opened the ticket <https://issues.apache.org/jira/browse/CASSANDRA-14346>
> > and has written a detailed design document
> > <https://docs.google.com/document/d/1RV4rOrG1gwlD5IljmrIq_t45rz7H3xs9GbFSEyGzEtM/edit#heading=h.iasguic42ger>
> > that includes problem discussion and prior art if anyone wants to
> > contribute to that. We tried to fairly discuss existing solutions, what
> > their drawbacks are, and a proposed solution.
> >
> > If we were to put this as part of the main Cassandra daemon, I think it
> > should probably be marked experimental and of course be something that
> > users opt into (table by table or cluster by cluster) with the
> > understanding that it might not fully work out of the box the first time we
> > ship it. We have to be willing to take risks but we also have to be honest
> > with our users. It may help build confidence if a few major deployments use
> > it (such as Netflix) and we are happy of course to provide that QA as best
> > we can.
> >
> > -Joey
> >
> > On Tue, Apr 3, 2018 at 10:48 AM, Blake Eggleston <beggleston@xxxxxxxxx>
> > wrote:
> >
> > > Hi dev@,
> > >
> > > The question of the best way to schedule repairs came up on
> > > CASSANDRA-14346, and I thought it would be good to bring up the idea of an
> > > external tool on the dev list.
> > >
> > > Cassandra lacks any sort of tools for automating routine tasks that are
> > > required for running clusters, specifically repair. Regular repair is a
> > > must for most clusters, like compaction. This means that, especially as far
> > > as eventual consistency is concerned, Cassandra isn’t totally functional
> > > out of the box. Operators either need to find a 3rd party solution or
> > > implement one themselves. Adding this to Cassandra would make it easier to
> > > use.
> > >
> > > Is this something we should be doing? If so, what should it look like?
> > >
> > > Personally, I feel like this is a pretty big gap in the project and would
> > > like to see an out of process tool offered. Ideally, Cassandra would just
> > > take care of itself, but writing a distributed repair scheduler that you
> > > trust to run in production is a lot harder than writing a single process
> > > management application that can failover.
> > >
> > > Any thoughts on this?
> > >
> > > Thanks,
> > >
> > > Blake
> > >
>