Wackamole failing after cable dis-/reconnect: msg#00013 apache.mod-wackamole.general
Hello,

I have a Spread/Wackamole setup which works fine, at least in testing, as long as I fail one machine of the two-machine setup by completely rebooting it or by killing the Wackamole or Spread daemon. In those cases the other machine in that little cluster takes over with only a very short outage.

The problem I encounter occurs after physically disconnecting the network interface's cable on either machine and then reconnecting it. Step by step I do the following, assuming that initially both machines are fine and listening:

1. Disconnect the NIC cable of machine A (which is 2.4.20 kernel, German SUSE 8.2 distro).
2. Watch syslog on the other machine B (which is RH 7.1, kernel 2.4.2-2) and wait for Wackamole to complete the arp spoof.
3. Watch ping -t on a Windows box on the same network. After disconnection there is a brief outage of one to two seconds, then the other machine jumps in and ping receives good responses again.
4. Reconnect the NIC cable of machine A (where the Spread daemon and Wackamole have continued running while the cable was off).
5. Watch syslog of machine B: Wackamole brings the VIP down.
6. Watch syslog of machine A: there is no activity, apart from the notice that the cable has been reconnected and a 100Mbit link has been established.
7. Watch ping -t on the Windows box: ping receives "destination host unreachable" messages originating from the physical IP of machine B. Since machine B took down the VIP when the cable on machine A was reconnected, it should not be able to respond to pings going to the VIP, which is OK.
8. Running arp -a on the Windows box, I see that the arp cache entry for the VIP has not been updated. One explanation that occurs to me is that the arp spoof and the subsequent update of the shared arp cache seem to happen only when a VIP comes up, not when it goes down. So in my case the VIP on machine B goes down without notifying anyone. And the VIP on machine A, which stayed up throughout the physical disconnect, does not sense any change and therefore does not broadcast arp information either.
9. If I purge the VIP from the Windows box's arp cache, ping comes right back with good responses.

Well, I hope no one got bored with the lengthy explanation. I will post the important parts of my conf below.

The Wackamole conf is different from most others I have seen. I want to (have to) use only one IP address as the VIP for both machines in my little cluster, since both machines have to be exposed under that IP address, not at the same time (I know this wouldn't work) but alternately, depending on their health state or running condition. The network has no DNS available, therefore I have to go with the IP.

Spread (conf is identical on both machines A and B)

    Spread_Segment 192.168.1.255:4803 {
        "192"       192.168.1.141
        ibm-linux   192.168.1.59
    }

Wackamole (identical as well)

    # The Spread daemon we are going to connect to. It should be on the local box
    Spread = 4803
    SpreadRetryInterval = 2s
    # The group name
    Group = wack1
    # Named socket for online control
    Control = /var/run/wack.it

    # Denote the interface we prefer to have
    #prefer eth0:10.3.4.5/8
    #prefer { eth0:10.2.3.4/8 eth1:192.168.10.23/24 }
    # In most cases, I just don't care. Let wackamole decide.
    Prefer None

    # List all the virtual interfaces (ALL of them)
    VirtualInterfaces {
        # The following two lines have the same effect
        # en0:192.168.1.2/24
        { eth0:192.168.1.200/24 }
        # This is how you say 2 or more IPs are to be treated as a single
        # "set" or "virtual interface". If wackamole decides that this
        # machine will manage it, you are ensured to get ALL the ips in the
        # set.
        # { en1:10.0.0.1/8 en0:192.168.35.64/26 }
    }

    # Collect and broadcast the IPs in our ARP table every so often
    Arp-Cache = 1s

    # List who we will notify
    # Here the netblock (/24 or /28) can be deceptive. It is NOT a netmask
    # for a single IP. It is how one will describe that they want to
    # notify ALL IPs in a segment.
    Notify {
        # Let's notify our router:
        eth0:192.168.1.1/32
        # Notify our DNS servers
        # en1:10.0.0.10/32
        # en1:10.0.0.11/32
        # 10.0.0.0 -> 10.0.0.255, but only 128 notifications/sec
        # en0:10.0.0.0/24 throttle 128
        # Wackamole shares arp-cache across machines, this says to
        # notify every IP address in the aggregate shared arp-cache.
        arp-cache
    }

    balance {
        # This field is the maximum number of IP addresses that will move
        # from one wackamole to another during a round of balancing.
        AcquisitionsPerRound = 1
        # Time interval in each balancing round.
        interval = 1s
    }

    # How long it takes us to mature
    mature = 3s

-----

If anyone has got some time: can what I intend to do work at all? Any hints on how I could work around my problem? If you haven't the time, thanks for reading anyway!

--
Kind regards
Toralf Richter
triplesense GmbH
Hanauer Landstraße 186
60314 Frankfurt am Main
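
The stale arp cache entry on the Windows box described above is the kind of state a gratuitous ARP announcement is meant to refresh. The following is a minimal sketch of what such a frame looks like on the wire, not Wackamole's own implementation. The function names, the example MAC address, and the idea of running this by hand on machine A after the cable is reconnected are all assumptions for illustration. Sending requires Linux and root privileges; building the frame does not.

```python
import socket
import struct


def mac_to_bytes(mac: str) -> bytes:
    """Convert 'aa:bb:cc:dd:ee:ff' into 6 raw bytes."""
    return bytes(int(part, 16) for part in mac.split(":"))


def build_gratuitous_arp(vip: str, mac: str) -> bytes:
    """Build an Ethernet frame carrying a gratuitous ARP reply for vip.

    Sender and target protocol address are both the VIP, and the frame
    is sent to the Ethernet broadcast address so that every host on the
    segment can update its cache entry for the VIP.
    """
    src_mac = mac_to_bytes(mac)
    broadcast = b"\xff" * 6
    ip = socket.inet_aton(vip)

    # Ethernet header: destination, source, EtherType 0x0806 (ARP)
    eth_header = broadcast + src_mac + struct.pack("!H", 0x0806)

    # ARP header: Ethernet (1), IPv4 (0x0800), hlen 6, plen 4, opcode 2 (reply)
    arp = struct.pack("!HHBBH", 1, 0x0800, 6, 4, 2)
    arp += src_mac + ip      # sender hardware / protocol address
    arp += broadcast + ip    # target hardware / protocol address
    return eth_header + arp


def send_gratuitous_arp(interface: str, vip: str, mac: str) -> None:
    """Send the frame on the given interface (Linux only, needs root)."""
    with socket.socket(socket.AF_PACKET, socket.SOCK_RAW) as sock:
        sock.bind((interface, 0))
        sock.send(build_gratuitous_arp(vip, mac))
```

Under these assumptions, calling `send_gratuitous_arp("eth0", "192.168.1.200", "00:11:22:33:44:55")` on the machine that currently holds the VIP (with its real MAC address substituted for the placeholder) would have the same effect as purging the VIP entry on each client by hand, provided the clients accept gratuitous ARP updates.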