Subject: RE: Wackamole on FreeBSD 5.1 will not start - msg#00012
List: apache.mod-wackamole.general
> -----Original Message-----
> If you are compiling 5.1 from cvsup, you may not have disabled all the
> debugging? If that is the case, I think all new memory allocations
> (including free pages, etc.) get allocated as 0x5c5c5c5c or some such
> thing. That could be the issue -- and it would be an issue with
> wackamole.
>
> // Theo Schlossnagle
We did a full install of 5.1-REL from the ISO. I forgot to hang on to a copy
of the kernel config file, but I'm pretty sure I kept the debugging option
commented out. In any case, the release notes warn about debugging and
diagnostic code scattered about (though these are only supposed to impact
performance). We've moved forward with our 4.9-RC2 machines, but my
workstation is still running 5.1, and I can replicate it. I'm going to keep
poking at it, and can post traces and so forth if people are interested.
--
Jay Quinby
Web Hosting Admin.
Manheim Auctions
(678) 645-2438 / aim: ManheimJRQ
Thread at a glance:
Previous Message by Date:
RE: Wackamole on FreeBSD 5.1 will not start
-----Original Message-----
From: Theo Schlossnagle
To: wackamole-users@xxxxxxxxxxxxxxxxxx
Cc: Theo Schlossnagle
Sent: 10/19/03 8:03 PM
Subject: Re: [Wackamole-users] Wackamole on FreeBSD 5.1 will not start
My guess is that this is an uninitialized variable.
I will try to run wackamole under valgrind on Linux to track this down
(unless someone beats me to it -- any takers).
---End Original Message----
Theo -
I'm working with Matt on this project. The real head-scratching part of this
is that after rolling back the OS on our test machines to FreeBSD 4.9-RC2,
spread and wackamole perform flawlessly.
As far as I can tell, there wasn't anything radically different in the
kernel configurations, though device support for the onboard ethernet
controllers (Broadcom Gigabit adapters) seems a little sketchy under 5.1.
The chipsets for our particular cards didn't get full support until 4.9.
Under 5.1, the NICs were detected and seemed to function fine, but we saw
some strange messages in the startup logs that we don't see when booting
under 4.9-RC2.
I've dug through the release notes, early adopter guide and BSD list
archives trying to find out what might have been causing this (some subtle
change in the way sockets are handled, something security or jail-related,
etc), but have come up empty-handed. Using 5.1 is not a show-stopper for our
particular project...it just baffled the hell out of me, though my kernel
hacking abilities are admittedly pretty limited.
JQ
Next Message by Date:
Wackamole failing after cable dis-/reconnect
Hello,
I have a Spread/Wackamole setup which works at least testing-wise fine
as long as I fail one machine of the two machine setup by completely
rebooting it or by killing the Wackamole or Spread daemon.
In the above case(s) the other machine in that little cluster takes over
with only a very short outage.
The problem which I encounter is after disconnecting physically the
network interface's cable on either machine and afterwards reconnecting
that cable.
Step by step I do the following, assuming initially both machines are
fine and listening:
1. disconnect NIC cable of machine A (which is 2.4.20 kernel, German
SUSE 8.2 distro)
2. watch syslog on the other machine B (which is RH 7.1, kernel
2.4.2-2), wait for Wackamole to complete the arp spoof
3. watch ping -t on a Windows box on the same network. After
disconnection there is a brief outage of one to two seconds, then the other
machine jumps in, and ping is receiving good responses again
4. reconnect NIC cable of machine A (where Spread daemon and Wackamole
have continued running while the cable was off)
5. watch syslog of machine B, Wackamole brings the VIP down
6. watch syslog of machine A, there is no activity, apart from the
notice that the cable has been reconnected and a 100Mbit link has been
established
7. watch ping -t on Windows box, ping receives destination host
unreachable messages originating from the physical IP of machine B.
Since machine B has taken down the VIP it was listening to when the
cable on machine A was reconnected, it should not be able to respond to
ping going to the VIP, which is OK.
8. doing arp -a on the Windows box, I see that the arp cache for the VIP
has not been updated. One explanation that occurs to me is that the arp
spoof and subsequent update of the shared arp cache seem to happen only
when a VIP comes up, not when it's going down. So in my case, the VIP on
machine B goes down without notifying anyone of it. And the VIP on
machine A, which has been up right through the physical
disconnect, does not sense any changes and therefore does not broadcast
arp information either.
9. If I purge the VIP from the Windows box arp cache, the ping comes
right back with good responses.
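For readers unfamiliar with what the "arp spoof" amounts to on the wire, here is a minimal Python sketch of a gratuitous ARP request announcing that a VIP lives at a given MAC address. It is only an illustration, not Wackamole's actual code: the MAC address is made up, the function names are hypothetical, and whether a given Windows version refreshes a stale cache entry on receiving such a frame is an assumption to verify.

```python
import socket
import struct

def build_gratuitous_arp(mac: str, ip: str) -> bytes:
    """Return a raw Ethernet frame carrying a gratuitous ARP request for ip."""
    hw = bytes.fromhex(mac.replace(":", ""))
    addr = socket.inet_aton(ip)
    # Ethernet header: broadcast destination, our MAC, EtherType 0x0806 (ARP)
    eth = b"\xff" * 6 + hw + struct.pack("!H", 0x0806)
    # ARP header: Ethernet (1), IPv4 (0x0800), hlen 6, plen 4, opcode 1 (request)
    arp = struct.pack("!HHBBH", 1, 0x0800, 6, 4, 1)
    arp += hw + addr            # sender MAC and sender IP (the VIP)
    arp += b"\x00" * 6 + addr   # target MAC unset, target IP equals sender IP
    return eth + arp

def send_frame(frame: bytes, ifname: str) -> None:
    """Send the frame on ifname. Linux only (AF_PACKET) and needs root."""
    with socket.socket(socket.AF_PACKET, socket.SOCK_RAW) as s:
        s.bind((ifname, 0))
        s.send(frame)

# Hypothetical MAC; the VIP is the one from the config in this message.
frame = build_gratuitous_arp("00:11:22:33:44:55", "192.168.1.200")
```

Calling `send_frame(frame, "eth0")` on machine A after the link comes back would be one way to test whether a fresh announcement clears the stale entry on the Windows box.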
Well, I hope no one got bored with the lengthy explanation.
I will post the important parts of my conf below.
The Wackamole conf is different from most others I have seen. I want
(have to) use only one IP address as VIP for both machines in my little
cluster, since both machines have to be exposed by that IP address, not at
the same time (I know this wouldn't work) but alternately, depending
on their health state or running condition. The network has no DNS
available, therefore I have to go with the IP.
Spread (Conf is identical on both machines A and B)
Spread_Segment 192.168.1.255:4803 {
"192" 192.168.1.141
ibm-linux 192.168.1.59
}
Wackamole (identical as well)
# The Spread daemon we are going to connect to. It should be on the
# local box
Spread = 4803
SpreadRetryInterval = 2s
# The group name
Group = wack1
# Named socket for online control
Control = /var/run/wack.it
# Denote the interface we prefer to have
#prefer eth0:10.3.4.5/8
#prefer { eth0:10.2.3.4/8 eth1:192.168.10.23/24 }
# In most cases, I just don't care. Let wackamole decide.
Prefer None
# List all the virtual interfaces (ALL of them)
VirtualInterfaces {
# The following two lines have the same effect
# en0:192.168.1.2/24
{ eth0:192.168.1.200/24 }
# This is how you say 2 or more IPs are to be treated as a single
# "set" or "virtual interface". If wackamole decides that this
# machine will manage it, you are ensured to get ALL the ips in the
# set.
# { en1:10.0.0.1/8 en0:192.168.35.64/26 }
}
# Collect and broadcast the IPs in our ARP table every so often
Arp-Cache = 1s
# List who we will notify
# Here the netblock (/24 or /28) can be deceptive. It is NOT a netmask
# for a single IP. It is how one will describe that they want to
# notify ALL IPs in a segment.
Notify {
# Let's notify our router:
eth0:192.168.1.1/32
# Notify our DNS servers
# en1:10.0.0.10/32
# en1:10.0.0.11/32
# 10.0.0.0 -> 10.0.0.255, but only 128 notifications/sec
# en0:10.0.0.0/24 throttle 128
# Wackamole shares arp-cache across machines, this says to
# notify every IP address in the aggregate shared arp-cache.
arp-cache
}
balance {
# This field is the maximum number of IP addresses that will move
# from one wackamole to another during a round of balancing.
AcquisitionsPerRound = 1
# Time interval in each balancing round.
interval = 1s
}
# How long it takes us to mature
mature = 3s
-----
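The comment in the Notify section, that the netblock is not a netmask but a way to name every IP in a segment, can be illustrated with a short Python sketch. This is only my reading of the config comments, not Wackamole's parser; `notify_targets` is a hypothetical helper, and including the first and last address of the block follows the "10.0.0.0 -> 10.0.0.255" comment.

```python
import ipaddress

def notify_targets(entry):
    """Expand an 'iface:a.b.c.d/len' Notify entry into the individual IPs."""
    iface, _, block = entry.partition(":")
    net = ipaddress.ip_network(block, strict=False)
    # Iterating a network yields every address in it, ends included.
    return [str(ip) for ip in net]

router_only = notify_targets("eth0:192.168.1.1/32")   # a single address
whole_segment = notify_targets("en0:10.0.0.0/24")     # 256 addresses
```

So with `eth0:192.168.1.1/32` plus `arp-cache`, only the router and whatever hosts are in the shared arp cache get notified.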
If anyone has got some time:
Can what I intend to do work at all?
Any hints on how I could work around my problem?
If you haven't the time, thanks for reading anyway!
--
Mit freundlichen Gruessen / Kind Regards
Toralf Richter
triplesense GmbH
Hanauer Landstraße 186
60314 Frankfurt am Main