Re: The Erlang way - dynamic upgrade of a server and UBF extensions: msg#00415

lang.erlang.general

Subject: Re: The Erlang way - dynamic upgrade of a server and UBF extensions

Hi Joe,

A. great
B. don't know enough about this problem domain.
C. Hmm, I'm worried about what happens when the server experiences difficulties that degrade the quality of service to the point where it has problems even telling clients to reroute to the next ip address. Perhaps, tying into B, there could be an explicit set of configuration negotiations between client and server (perhaps relating to versions?), such that a server can tell a client what the fallback measures are for degradation of services from that server -- e.g. if latency rises above .25 sec, then here is a set of other ip addresses to try, etc. This policy could be dynamically adjusted, of course, and having it be persistent would be great so it doesn't have to be renegotiated each time (the version neg. protocol would handle that...).

How would this fit into the current client->server model? Perhaps "out of band" messages could be sent as replies to clients -- this would ensure serialization as well. (E.g. if all clients were servers and servers clients, a server could broadcast reconfiguration information, but there would be no guarantee that clients would see it before blasting through another few messages. If a server is trying to calm a traffic storm, it might be a good idea to get the immediate attention of all clients who are actually interacting with the server...)
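To make the fallback idea a bit more concrete, here is a minimal sketch in Python of a policy a server might hand to a client during negotiation. Everything in it is an assumption for illustration: the names `FallbackPolicy` and `targets_to_try`, the reading of "degraded" as observed latency above a threshold, and the example addresses. None of this is part of UBF.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class FallbackPolicy:
    """Hypothetical policy a server could send to a client once per session."""
    max_latency_s: float            # latency above this counts as degraded
    fallback_ips: Tuple[str, ...]   # addresses to try when degraded


def targets_to_try(current_ip: str,
                   observed_latency_s: float,
                   policy: FallbackPolicy) -> List[str]:
    """Return the addresses the client should try for its next message."""
    if observed_latency_s > policy.max_latency_s:
        # Service is degraded: use the list the server gave us earlier,
        # so the server does not have to be reachable to redirect us.
        return list(policy.fallback_ips)
    return [current_ip]


# Example addresses are made up for this sketch.
policy = FallbackPolicy(max_latency_s=0.25,
                        fallback_ips=("10.0.0.2", "10.0.0.3"))
healthy = targets_to_try("10.0.0.1", 0.10, policy)
degraded = targets_to_try("10.0.0.1", 0.40, policy)
```

The point of holding the policy on the client side is that it still works when the server is too overloaded to send a redirect at all.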

Another area that has concerned me is error signaling. Although this could fall into the application realm, experience tells me that at least a framework for the formatting of error messages (e.g. predefined types) would be very helpful, along with a communication mode for them (a la out of band messages, or rpc style). I don't know if this would be best as a hard part of UBF(C), or perhaps as a set of "recommended practices".

Erik.

On Tuesday, April 29, 2003, at 01:44 AM, Joe Armstrong wrote:


I want to add some things to UBF so we can make fault-tolerant
systems. This can be done with only a little change to UBF. I haven't
thought out all the details so I thought I'd try the idea out first.

All comments are welcome.

The UBF extensions provide an alternative solution to the problem
of dynamically upgrading or migrating a server or of dynamically
upgrading software.

Firstly a bit of philosophy

The Erlang way
==============

1) everything is a process
2) processes communicate by message passing
3) processes obey protocols
4) errors are handled non locally

Should I add more points?

UBF
===

UBF describes (specifies) 3) above

Extensions to UBF
=================

A) ? (an undefined type) - to denote absence of an argument

B) The protocol versioning tag

C) The next IP word.

A) Undefined Type
=================

Just use ? in UBF(A) to denote a missing value (easy)

B) + C)
=======

Client UBF packets to a server should begin

+---------------+---------------------+
| VersionNumber | ... rest of packet |
+---------------+---------------------+

And server replies should begin

+---------------+---------+---------------------+
| VersionNumber | NextIP | ... rest of packet |
+---------------+---------+---------------------+
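As a rough illustration of the two layouts above, the packets could be modelled like this in Python. This is only a sketch: the class and field names are made up, the body is left untyped, and having the reply echo the client's version number is an assumption, not something the proposal states.

```python
from dataclasses import dataclass
from typing import Any


@dataclass(frozen=True)
class ClientPacket:
    version: int   # VersionNumber the client is speaking
    body: Any      # "... rest of packet"


@dataclass(frozen=True)
class ServerReply:
    version: int   # VersionNumber of the reply
    next_ip: str   # NextIP: where the client should send its next message
    body: Any      # "... rest of packet"


def reply_to(packet: ClientPacket, next_ip: str, body: Any) -> ServerReply:
    """Build a reply for a client packet.

    Assumption: the reply carries the same version number as the request.
    """
    return ServerReply(version=packet.version, next_ip=next_ip, body=body)


request = ClientPacket(version=10, body="ping")
reply = reply_to(request, next_ip="127.45.67.223", body="pong")
```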

B) Protocol versioning
======================

The version number is an integer 1 2 3 ... etc.

This is to allow dynamic upgrade of services

I have not thought out all the details but the idea is this:

Assume that a client has versions 1,2,3,6,8,10,11 of some software
The server speaks versions 1,2,3,4,5,9,10,14

At the start of the session we enter "protocol negotiation mode"

Both sides agree that version 10 is the highest common version of the
protocol that they both understand.

A session starts at level 10.

The client S/W crashes - this is detected by the client S/W.

The next message it sends to the server is a version 8 message

The server realizes that something has gone wrong - it cannot handle
version 8 messages - so they go back into protocol negotiation mode.
They agree on version 3 and continue.

This should allow "dynamic introduction of new services" with fallback to
previous versions if things go wrong.
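The negotiation step in this example can be sketched in a few lines of Python. The function name `negotiate` is hypothetical, and the rule that the client only offers versions up to 8 after its crash is an assumption made to reproduce the fallback to version 3 described above.

```python
from typing import Iterable, Optional


def negotiate(client_versions: Iterable[int],
              server_versions: Iterable[int]) -> Optional[int]:
    """Return the highest version both sides understand, or None."""
    common = set(client_versions) & set(server_versions)
    return max(common) if common else None


client = [1, 2, 3, 6, 8, 10, 11]
server = [1, 2, 3, 4, 5, 9, 10, 14]

# Start of session: both sides settle on the highest common version.
initial = negotiate(client, server)

# Assumption: after the crash the client only offers versions up to 8,
# so the two sides renegotiate from the reduced set.
fallback = negotiate([v for v in client if v <= 8], server)
```

With the version lists from the example, `initial` is 10 and `fallback` is 3, matching the walk-through.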

C) Next IP
==========

This is to allow dynamic migration of services.

There are two basic approaches to making things reliable. I'd like
to suggest a third.

The two common methods are:

1) Fixed server IP

The server has a fixed IP. To make things fault-tolerant the fixed IP is the
IP of a switch - the switch dispatches the request to a back-end.

(This is the common cluster solution - the switch is big and
expensive and can act as a front-end to many back-end servers)

2) Two fixed IP's

This is the DNS solution. My machine has two addresses for DNS - I try the
first, if it is broken I try the second.

I propose a third method: Each packet contains a NextIP field. This is
the address where the Next message should be sent.

Example. I have a server at 127.45.67.223

a) The client connects to 127.45.67.223

b) Client and server exchange messages

The server becomes heavily loaded, or, the operator wants to take the
server out of operation.

At some point the client receives an rpc reply with a NEW IP address
(not 127.45.67.223, but something different).

The client closes the connection to 127.45.67.223 connects to the new
address and carries on.
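The client side of the NextIP scheme might look something like the following Python sketch. It does no real networking: servers are plain functions in a dictionary, and the names `make_server` and `run_client` as well as the second address 127.45.67.224 are invented for the example.

```python
from typing import Callable, Dict, List, Tuple

# Stand-in for the network: address -> handler.
# A handler takes a request and returns (next_ip, reply_body).
Handler = Callable[[str], Tuple[str, str]]


def make_server(own_ip: str, redirect_to: str = "") -> Handler:
    """Create a server that names itself as NextIP unless told to redirect."""
    def handle(request: str) -> Tuple[str, str]:
        next_ip = redirect_to or own_ip
        return next_ip, f"{own_ip} handled {request}"
    return handle


def run_client(network: Dict[str, Handler],
               start_ip: str,
               requests: List[str]) -> List[str]:
    """Send each request to the address named by the previous reply."""
    current = start_ip
    replies = []
    for request in requests:
        next_ip, reply = network[current](request)
        replies.append(reply)
        if next_ip != current:
            # Here a real client would close the connection to `current`
            # and connect to `next_ip` before carrying on.
            current = next_ip
    return replies


network = {
    "127.45.67.223": make_server("127.45.67.223", redirect_to="127.45.67.224"),
    "127.45.67.224": make_server("127.45.67.224"),
}
replies = run_client(network, "127.45.67.223", ["a", "b"])
```

The first request is served by 127.45.67.223, whose reply points at the new address, so the second request goes to 127.45.67.224.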

I think that B) + C) above would solve a lot of problems.

Comments?

/Joe




Erik Pearson
Adaptations
desk +1 510 527 5437
cell +1 510 517 3122



