Subject: Re: Lamport timestamps? - msg#00054
List: network.spread.user
On 24 Aug 2007, at 8:53 pm, John Schultz wrote:
So does choosing CAUSAL_MESS over AGREED_MESS actually gain one?
Right now, no.
Nothing for now, but future implementations may implement CAUSAL_MESS
more efficiently?
Exactly!
That's fine. I've noticed that with all sorts of performance-critical
systems, the more information about the client's intentions you can
grab the better; even if you're not using it yet, you'll probably be
able to find uses for it later, in the endless tradeoff between
semantics and speed :-)
For example, let's say I'm server A and I'm connected with two
other servers, B and C. I have a causal message from C with a
lamport timestamp on it. I need to somehow know that there are no
outstanding messages from either B or C with lesser timestamps on
them before I can deliver this message. So somehow I must collect
this knowledge (e.g. - ACK from B and index # of origination from
C) from those servers.
Ah, good point, I'm being blinkered by my own application domain; out-
of-order delivery of messages matters not to me, since I can discard
an update to a record if the update is timestamped earlier than the
last-modified timestamp in the record.
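
A minimal sketch of that discard rule, for illustration only. The names Store, Record and apply_update are hypothetical, and pairing the Lamport counter with a sender id to break ties is an assumption, not something stated in the thread.

```python
from dataclasses import dataclass
from typing import Any, Dict, Tuple

# (lamport_counter, sender_id); the sender id is an assumed tie-breaker
Timestamp = Tuple[int, str]


@dataclass
class Record:
    value: Any
    last_modified: Timestamp


class Store:
    def __init__(self) -> None:
        self.records: Dict[str, Record] = {}

    def apply_update(self, key: str, value: Any, ts: Timestamp) -> bool:
        """Apply an update unless the record already holds a newer one."""
        current = self.records.get(key)
        if current is not None and ts <= current.last_modified:
            return False  # stale or duplicate update: discard it
        self.records[key] = Record(value, ts)
        return True


store = Store()
store.apply_update("user:1", "new", (7, "B"))
store.apply_update("user:1", "old", (3, "C"))  # arrives late, is discarded
```

Because a late update is simply dropped, the order in which the two messages arrive does not change the final record.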
The vector timestamp and DAG methods reverse the issue of
discovering dependency. When a sender sends a message it knows
exactly which messages it depends upon. So it simply
attaches that knowledge to the message. Then when a receiver gets
the message it knows immediately which messages it has to deliver
prior to delivering this one. This method is superior (latency-
wise) to gathering implicit or explicit ACKs from ALL other
participating servers before delivering. It is also more precise
and introduces no unnecessary dependencies (particularly if it is
client generated).
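
To make the vector timestamp variant concrete, here is a rough sketch of a receiver that holds a message back until everything its timestamp depends on has been delivered. It is the textbook causal-broadcast rule, not Spread's implementation; CausalReceiver and its fields are hypothetical names, and the set of servers is assumed to be fixed and known.

```python
from typing import Dict, List, Tuple


class CausalReceiver:
    def __init__(self, servers: List[str]) -> None:
        # Number of messages delivered so far from each server.
        self.delivered: Dict[str, int] = {s: 0 for s in servers}
        self.pending: List[Tuple[str, Dict[str, int], str]] = []
        self.log: List[str] = []

    def _deliverable(self, sender: str, vt: Dict[str, int]) -> bool:
        for s, count in self.delivered.items():
            if s == sender:
                # Must be the next message from this sender.
                if vt[s] != count + 1:
                    return False
            elif vt[s] > count:
                # Depends on a message we have not delivered yet.
                return False
        return True

    def receive(self, sender: str, vt: Dict[str, int], payload: str) -> None:
        self.pending.append((sender, vt, payload))
        progress = True
        while progress:  # keep going: one delivery may unblock others
            progress = False
            for msg in list(self.pending):
                s, v, p = msg
                if self._deliverable(s, v):
                    self.pending.remove(msg)
                    self.delivered[s] += 1
                    self.log.append(p)
                    progress = True


a = CausalReceiver(["A", "B", "C"])
# C's message was sent after C saw B's first message, so it depends on it.
a.receive("C", {"A": 0, "B": 1, "C": 1}, "c1")  # held back
a.receive("B", {"A": 0, "B": 1, "C": 0}, "b1")  # delivered, then c1 follows
```

The receiver needs no extra round trip to B: the timestamp on C's message already says what is missing.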
Quite!
Thanks for the insights :-)
ABS
--
Alaric Snell-Pym
Work: http://www.snell-systems.co.uk/
Play: http://www.snell-pym.org.uk/alaric/
Blog: http://www.snell-pym.org.uk/?author=4
Previous Message by Date:
Re: Lamport timestamps?
On Fri, 24 Aug 2007, Alaric Snell-Pym wrote:
Interesting; I think I'm going to have to put some time aside to read
those papers on spread's implementation...
Please note that Spread is still using protocols similar to the RING
protocol of Totem. The experimental versions of Spread that did fancy
wide area stuff have never gone into production.
It sounds like my current approach (implementing lamport timestamps
on top of RELIABLE_MESS) is probably the way to go for now, then!
I'm not sure how UNRELIABLE_MESS, RELIABLE_MESS and FIFO_MESS are
implemented currently, although I believe they are NOT simply AGREED_MESS.
I do not know, for example, if the daemon waits for the token to send
these messages or not, which is not strictly needed for these weaker
types (although possibly necessary for flow control).
So does choosing CAUSAL_MESS over AGREED_MESS actually gain one?
Right now, no.
Nothing for now, but future implementations may implement CAUSAL_MESS
more efficiently?
Exactly!
Is that worth the tracking overhead, though, compared to a lamport
timestamp, which after all encodes that the state of the system
sending the message with timestamp K depends on all the messages it's
received with timestamps less than K?
I know that suggests dependencies on messages that the recipient has
never itself received, which might perhaps cause more to be replayed
than needs be, but does that outweigh the simplicity of the
algorithm? ;-)
A lamport timestamp alone is not sufficient to be able to deliver causal
messages. In order to deliver a causal message you need to know that all
causally previous messages in the system have already been delivered. In
effect, this means you need to collect an acknowledgement from EACH OF THE
OTHER PARTICIPANTS that they didn't send any messages causally prior to
that message. All the LTS tells you is that if you have two messages then
the one with the lesser LTS can't be causally dependent upon the one with
the higher LTS.
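
A small illustration of that one-way guarantee, using the standard Lamport clock rules (increment on send, take max plus one on receive). LamportClock is a hypothetical helper written for this example only.

```python
class LamportClock:
    def __init__(self) -> None:
        self.time = 0

    def send(self) -> int:
        """Tick and return the timestamp to attach to an outgoing message."""
        self.time += 1
        return self.time

    def receive(self, msg_time: int) -> int:
        """Advance past the timestamp carried by an incoming message."""
        self.time = max(self.time, msg_time) + 1
        return self.time


a, b, c = LamportClock(), LamportClock(), LamportClock()
m1 = a.send()      # A sends m1
b.receive(m1)      # B receives m1
m2 = b.send()      # B sends m2, which causally depends on m1
m3 = c.send()      # C sends m3 without having seen anything
```

Here m1 precedes m2 causally and its timestamp is smaller, as guaranteed. But m3 also has a smaller timestamp than m2 even though the two are unrelated, so comparing timestamps cannot tell a receiver which earlier messages it is actually missing.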
For example, let's say I'm server A and I'm connected with two other
servers, B and C. I have a causal message from C with a lamport timestamp
on it. I need to somehow know that there are no outstanding messages from
either B or C with lesser timestamps on them before I can deliver this
message. So somehow I must collect this knowledge (e.g. - ACK from B and
index # of origination from C) from those servers.
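
The same example as a sketch. It assumes FIFO links between servers and that every message or ACK carries the sender's current Lamport time; LamportReceiver, on_message and on_ack are hypothetical names and none of this is Spread's API.

```python
import heapq
from typing import Dict, List, Tuple


class LamportReceiver:
    def __init__(self, others: List[str]) -> None:
        # Highest Lamport time heard so far from each other server.
        self.heard: Dict[str, int] = {s: 0 for s in others}
        self.queue: List[Tuple[int, str, str]] = []  # (ts, sender, payload)
        self.log: List[str] = []

    def on_message(self, sender: str, ts: int, payload: str) -> None:
        self.heard[sender] = max(self.heard[sender], ts)
        heapq.heappush(self.queue, (ts, sender, payload))
        self._drain()

    def on_ack(self, sender: str, ts: int) -> None:
        self.heard[sender] = max(self.heard[sender], ts)
        self._drain()

    def _drain(self) -> None:
        # Deliver only once EVERY other server has been heard from at or
        # beyond the timestamp at the head of the queue.
        while self.queue and self.queue[0][0] <= min(self.heard.values()):
            _, _, payload = heapq.heappop(self.queue)
            self.log.append(payload)


a = LamportReceiver(["B", "C"])
a.on_message("C", 5, "c5")
held = list(a.log)   # nothing delivered yet: B has not been heard from
a.on_ack("B", 6)     # B confirms it is past time 5, so c5 can go
```

The delivery of c5 waits on B even if B never had anything relevant to say, which is the latency cost described above.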
The vector timestamp and DAG methods reverse the issue of discovering
dependency. When a sender sends a message it knows exactly which messages
it depends upon. So it simply attaches that knowledge to the
message. Then when a receiver gets the message it knows immediately which
messages it has to deliver prior to delivering this one. This method is
superior (latency-wise) to gathering implicit or explicit ACKs from ALL
other participating servers before delivering. It is also more precise
and introduces no unnecessary dependencies (particularly if it is client
generated).
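
A sketch of the DAG flavour of this idea, where each message carries the ids of the messages it directly depends on. The message id format ("B:1") and the DagReceiver class are assumptions made for this example.

```python
from typing import Dict, FrozenSet, Iterable, List, Set, Tuple


class DagReceiver:
    def __init__(self) -> None:
        self.delivered: Set[str] = set()
        self.waiting: Dict[str, Tuple[FrozenSet[str], str]] = {}
        self.log: List[str] = []

    def receive(self, msg_id: str, deps: Iterable[str], payload: str) -> None:
        self.waiting[msg_id] = (frozenset(deps), payload)
        progress = True
        while progress:  # a delivery may satisfy other waiting messages
            progress = False
            for mid, (d, p) in list(self.waiting.items()):
                if d <= self.delivered:  # every dependency already delivered
                    del self.waiting[mid]
                    self.delivered.add(mid)
                    self.log.append(p)
                    progress = True


r = DagReceiver()
r.receive("C:1", {"B:1"}, "reply")    # held: depends on B:1
r.receive("B:1", set(), "question")   # delivered, which releases C:1
```

Only the dependencies the sender actually declared are enforced, so unrelated traffic from other servers never delays delivery.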
Cheers!
---
John Schultz
Spread Concepts
Phn: 443 838 2200
Next Message by Date:
message reliability
hello.
i apologize if this question has been answered, but i could not find the
answer anywhere.
if i send a message to a group with 6 nodes, and for whatever reason, only 4
nodes receive it, is there a way i can find out that which two nodes did not
receive the message? and that message specifically?
i am using spread 4.0
thank you,
-=- adam grossman
Next Message by Thread:
Updated error_log_spread.pl for mod_log_spread
I found that error_log_spread.pl in the mod_log_spread distribution
(1.0.4), dated 2000, needed a little work.
When I tried it, it turned out my spread-based error_log lacked
newlines. (?)
Since I speak perl, I looked and this was a simple fix, one line.
So while I was at it, I added a small feature: -h will already include
the hostname that sent the message in the log message. I added -a to
include (instead) the _short_ hostname, e.g. www1 instead of
www1.myfullyqualifieddomainname.com.
This works *great* in production, and I have this apache ErrorLog directive:
ErrorLog "|/usr/local/bin/error_log_spread.pl -a -h -g www_error_log"
This gives me log entries like (sorry if it wraps):
www1 [Thu Aug 23 19:26:45 2007] [error] [client ###.###.###.###] File
does not exist: /var/www/_vti_bin/owssvr.dll
Attached is a diff.
--- error_log_spread.pl 2000-10-14 21:08:04.000000000 -0700
+++ /usr/local/bin/error_log_spread.pl 2007-08-23 19:27:24.000000000 -0700
@@ -18,16 +18,16 @@
$| = 1;
# Read in options
-getopts('dg:hs:', \%opts);
+getopts('adg:hs:', \%opts);
$debug = $opts{d};
&usage() unless ($group = $opts{g});
$hosttoggle = $opts{h};
-$spreaddaemon = ($opts{s}?$opts{s}:3333);
+$spreaddaemon = ($opts{'s'}?$opts{'s'}:3333);
# Set spread connection params
#$hostname = `hostname`;
-chomp ($hostname = `hostname`) ;
+chomp ($hostname = $opts{a} ? `hostname -s` : `hostname`) ;
$args{'spread_name'} = $spreaddaemon;
$args{'private_name'} = "$$-$hostname";
$args{'priority'} = 0;
@@ -35,18 +35,16 @@
# Connect to daemon
print "Trying to connect to spread...\n" if $debug;
-($mbox, $privategroup) = Spread::connect(
- \%args
- );
+($mbox, $privategroup) = Spread::connect( \%args );
print "$sperrno\n" unless (defined($mbox) && $debug);
#Logging loop
while(<STDIN>){
-chomp;
-print "MESSAGE $_\n";
+ # chomp; # spreadlogd doesn't replace this, so don't throw it away.
+ print "MESSAGE $_";
if(($ret = Spread::multicast($mbox,
RELIABLE_MESS, $group, 0,
- ($hosttoggle?$hostname." ".$_:$_)))>0) {
+ ($hosttoggle ? "$hostname $_" : $_))) > 0) {
print STDERR "Successfully multicasted $ret bytes to [$group]\n" if $debug;
} else {
print STDERR "Failed multicast $_ to $group: $sperrno\n" if $debug;
@@ -54,7 +52,7 @@
}
sub usage(){
- print STDERR "Usage: Logger -g group [-h] [-s daemon] [-d]\n";
+ print STDERR "Usage: Logger -g group [-h] [-a] [-s daemon] [-d]\n";
exit 0;
}
_______________________________________________
Spread-users mailing list
Spread-users@xxxxxxxxxxxxxxxx
http://lists.spread.org/mailman/listinfo/spread-users