Re: Load balancing and load determination


> Am 05.11.2018 um 16:58 schrieb William A Rowe Jr <wrowe@xxxxxxxxxxxxx>:
> 
> On Mon, Nov 5, 2018 at 7:48 AM jean-frederic clere <jfclere@xxxxxxxxx> wrote:
> On 30/10/2018 13:53, Jim Jagielski wrote:
> > As some of you know, one of my passions and area of focus is
> > on the use of Apache httpd as a reverse proxy and, as such, load
> > balancing, failover, etc are of vital interest to me.
> > 
> > One topic which I have been mulling over, off and on, has been the
> > idea of some sort of universal load number, that could be used
> > and agreed upon by web servers. Right now, the reverse proxy
> > "guesses" the load on the backend servers which is OK, and
> > works well enough, but it would be great if it actually "knew"
> > the current loads on those servers. I already have code that
> > shares basic architectural info, such as number of CPUs, available
> > memory, loadavg, etc which can help, of course, but again, all
> > this info can be used to *infer* the current status of those backend
> > servers; it doesn't really provide what the current load actually
> > *is*.
> > 
> > So I was thinking maybe some sort of small, simple and "fast"
> > benchmark which could be run by the backends as part of their
> > "status" update to the front-end reverse proxy server... something
> > that shows general capability at that point in time, like Hanoi or
> > something similar. Or maybe some hash function. Some simple code
> > that could be used to create that "universal" load number.
> > 
> > Thoughts? Ideas? Comments? Suggestions? :)
> 
> Having the back-ends provide the load they are able to handle as the
> lbfactor (via w_lf or something similar)? That requires the back-ends to
> be able to send requests to the httpd balancer-manager handler.
> 
> Not really. I'd suggest a response header, travelling with each response
> back to the balancer, which can be composed quickly enough to share
> a play-by-play snapshot of the availability of that backend. This adds
> next to no traffic and minimal cpu drain if composed cleanly. And it can
> optionally be axed by the balancer in the response to the client.
> 
> The last thing we want is the routing headaches of contacting an
> ever-changing list of one-or-many potential balancers. And we can't
> rely on a dying lbmember to "check in" that it isn't functional. Since
> the balancer must already start requests to the backend, having that
> backend supplement the responses with its health status is simple.
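
Just to make that per-response header concrete, it could be composed
roughly like the sketch below. The header name, the fields and the
worker counts are made up here, not an agreed format, and the
getloadavg()/sysconf() calls are only placeholders for whatever the
backend actually knows about itself:

/* Sketch only: compose something like
 *   "X-Backend-Load: 0.42; cpus=8; busy=17/256"
 * cheaply enough to attach it to every response. */
#include <stdio.h>
#include <stdlib.h>     /* getloadavg() on glibc/BSD */
#include <unistd.h>     /* sysconf() */

static int compose_load_header(char *buf, size_t len,
                               int busy_workers, int total_workers)
{
    double loadavg[1] = { 0.0 };
    long cpus = sysconf(_SC_NPROCESSORS_ONLN);

    if (getloadavg(loadavg, 1) < 0)
        loadavg[0] = 0.0;            /* unknown; report an optimistic 0 */

    return snprintf(buf, len, "X-Backend-Load: %.2f; cpus=%ld; busy=%d/%d",
                    loadavg[0] / (cpus > 0 ? (double)cpus : 1.0),
                    cpus, busy_workers, total_workers);
}

int main(void)
{
    char hdr[128];

    /* the busy/total worker counts stand in for whatever the backend's
     * own scoreboard would report */
    compose_load_header(hdr, sizeof(hdr), 17, 256);
    puts(hdr);
    return 0;
}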

Funnily enough, I did my master's thesis (is that the right term?) a
long, long while ago on scheduling in distributed systems. And with
"distributed", the tricky thing in general is that there is no global
knowledge of the system state.

While any load indicator reported from the backends might look very
useful, once you deal with several frontends, this degenerates quickly
(since each frontend makes its own decision without talking to the
others).

If you detect and exclude any failing backends (heartbeat), then, with
a growing number of back- and frontends, it's very hard to beat a
random job distribution.
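
As a toy illustration of why (all numbers made up, nothing
httpd-specific): when several frontends act on the same load report,
and that report is only refreshed every so often, they all pick the
same "least loaded" backend until the next report arrives. Random
choice does not have that failure mode:

#include <stdio.h>
#include <stdlib.h>

#define N_BACK          8
#define REQUESTS        100000
#define SNAPSHOT_EVERY  200     /* requests routed per load report */

/* returns the largest number of requests any single backend received
 * between two load reports */
static long worst_burst(int use_reported_load)
{
    long assigned[N_BACK] = { 0 };  /* requests actually sent so far     */
    long reported[N_BACK] = { 0 };  /* stale view every frontend acts on */
    long burst[N_BACK]    = { 0 };  /* requests since the last report    */
    long worst = 0;

    for (long i = 0; i < REQUESTS; i++) {
        if (i % SNAPSHOT_EVERY == 0) {
            for (int b = 0; b < N_BACK; b++) {
                reported[b] = assigned[b];
                burst[b] = 0;
            }
        }

        int pick = 0;
        if (use_reported_load) {
            for (int b = 1; b < N_BACK; b++)     /* "least loaded" */
                if (reported[b] < reported[pick])
                    pick = b;
        } else {
            pick = rand() % N_BACK;              /* uniform random */
        }

        assigned[pick]++;
        if (++burst[pick] > worst)
            worst = burst[pick];
    }
    return worst;
}

int main(void)
{
    srand(1);
    printf("worst burst to one backend, least-reported-load: %ld\n",
           worst_burst(1));
    printf("worst burst to one backend, random choice:       %ld\n",
           worst_burst(0));
    return 0;
}

With the least-reported-load policy, every one of the SNAPSHOT_EVERY
requests between two reports lands on the same backend; random choice
spreads them out, with no coordination between frontends needed.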

I found that, in general, pulling works slightly better than pushing. The
scenario here would be that backends ask frontends for requests to execute.
That is also very stable in case of backend failures, of course.
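
A toy, single-process model of that pull idea (again nothing
httpd-specific, all numbers made up): the frontend just keeps a queue,
and a backend takes the next request only when it is idle. A faster
backend simply pulls more often, and a backend that dies just stops
pulling, so the frontend needs no load data at all:

#include <stdio.h>

#define N_BACK   3
#define REQUESTS 1000

int main(void)
{
    int  cost[N_BACK] = { 2, 5, 3 }; /* ticks of work per request         */
    int  busy[N_BACK] = { 0, 0, 0 }; /* ticks left on the current request */
    int  done[N_BACK] = { 0, 0, 0 }; /* requests completed per backend    */
    int  next = 0;                   /* head of the frontend's queue      */
    long tick = 0;

    while (next < REQUESTS || busy[0] > 0 || busy[1] > 0 || busy[2] > 0) {
        for (int b = 0; b < N_BACK; b++) {
            /* finish the request currently being worked on, if any */
            if (busy[b] > 0 && --busy[b] == 0)
                done[b]++;

            /* pull the next request only when idle; backend 2 "dies"
             * at tick 600 and simply stops pulling */
            int dead = (b == 2 && tick >= 600);
            if (!dead && busy[b] == 0 && next < REQUESTS) {
                busy[b] = cost[b];
                next++;
            }
        }
        tick++;
    }

    for (int b = 0; b < N_BACK; b++)
        printf("backend %d handled %d requests\n", b, done[b]);
    return 0;
}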

tl;dr

If your problem scenario includes more than a single frontend, go for random.

Cheers,

Stefan