Re: Repair scheduling tools


I think a takeaway here is that we can't assume operational maturity
will automatically coincide with scale. To make our core
features robust, we have to account for less-experienced users.

A lot of folks on this thread have *really* strong ops and OpsViz
stories. Let's not forget that most of our users don't.
((Un)fortunately, as a consulting firm, we tend to see the worst of
this).

On Fri, Apr 6, 2018 at 2:52 PM, Jonathan Haddad <jon@xxxxxxxxxxxxx> wrote:
> Off the top of my head I can remember clusters with 600 or 700 nodes with
> 256 tokens.
>
> Not the best situation, but it’s real. 256 has been the default for better
> or worse.
>
> On Thu, Apr 5, 2018 at 7:41 PM Joseph Lynch <joe.e.lynch@xxxxxxxxx> wrote:
>
>> >
>> > We see this in larger clusters regularly. Usually folks have just
>> > 'grown into it' because it was the default.
>> >
>>
>> I could understand a few dozen nodes with 256 vnodes, but hundreds is
>> surprising. I have a whitepaper draft lying around showing how vnodes
>> decrease availability in large clusters by orders of magnitude. I'll polish
>> it up and send it out to the list when I get a second.
>>
>> In the meantime, sorry for de-railing a conversation about repair
>> scheduling to talk about vnodes, let's chat about that in a different
>> thread :-)
>>
>> -Joey
>>
