I upgraded the kernel on this box to the then-newest 2.4.18-26.8.0smp
kernel on 3/7/2003. Since then it has locked up on 3/15, 3/17, and
3/22. It stayed down from 3/22 until this morning, displaying the
following message many times: "I/O error: dev 08:21, sector 24641776",
with different sectors each time, finally ending in "0". This morning I
upgraded the kernel to the now-newest 2.4.18-27.8.0smp. I'm fairly
certain there are no disk issues. I'll continue with my testing over
the next week. I also noticed a massive load spike (>2000) on 3/22
immediately before it hung. If I don't see any load spikes and/or
lockups over the next week, then I will move on to updating the
firmware on both the system and the PERC (per Jason Andrade's
suggestions).
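As an aside, the "dev 08:21" in the I/O error can be mapped to a device
name, which may help when checking whether one particular disk or
container is involved. The sketch below assumes the 2.4 kernel prints
the device as hexadecimal major:minor, and that major 8 is the first
block of SCSI disks with 16 minors per disk. The function name
decode_scsi_dev is made up for illustration.

```python
def decode_scsi_dev(dev):
    """Map a 'MM:mm' device string to an sdXN name.

    Assumptions (not stated in the kernel message itself): both fields
    are hexadecimal, and major 8 covers sda..sdp with 16 minors each.
    """
    major_hex, minor_hex = dev.split(":")
    major = int(major_hex, 16)
    minor = int(minor_hex, 16)
    if major != 8:
        raise ValueError("only SCSI disk major 8 is handled in this sketch")
    # Minor 0 of each group of 16 is the whole disk; 1..15 are partitions.
    disk_index, partition = divmod(minor, 16)
    name = "sd" + chr(ord("a") + disk_index)
    if partition:
        name += str(partition)
    return name


print(decode_scsi_dev("08:21"))
```

Under those assumptions, 08:21 is minor 33, the first partition on the
third SCSI disk (sdc1).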
Peter Smith
Peter Smith wrote:
This is an odd issue which is why I'm notifying/contacting the list.
I have a PE2500 which, up until about 1 1/2 weeks ago, was running
RedHat v7.1 without a hitch or hiccup. Since things were going so
well, I decided it was high time to upgrade to RedHat v8.0. At the
same time, I upgraded Squid, its main application. Keep in mind this
PE2500 is an older unit, shipped on 9/5/2001, and it is using a PERC
2/Di. The reason I upgraded it is I have another, newer, PE2500 which
has been running RedHat v8.0 and my newer Squid (all same software
revs) using the same PERC 2/Di but in a newer box, shipped 3/26/2002.
The problem I am having is that the failing machine is experiencing
massive load (>1000) at certain somewhat cyclic times. I reboot this
particular machine every morning at 3:00am. I don't believe the
massive load has to do with anything other than drive access. It
seems the raid driver is sometimes taking up too much time and can
lock up the machine. Only one other time did I have a problem which
seemed unrelated to the raid driver--recently after it rebooted at
3:00am it got stuck attempting to initialize the AIC7XXX driver at
startup. I understand this is somewhat of a known issue (but for
RedHat v8?) and I'm working on getting the newest newest happiest
AIC7XXX driver installed, so this probably isn't too much of a
problem. However, I am running the RedHat '2.4.18-24.8.0smp' kernel
and am still experiencing massive load problems (which I used to not
see when running RedHat v7.1 on this box.) I'll be setting up the
newest newest kernel '2.4.18-26.8.0smp' probably tonight and will give
that a whirl. I have a feeling that unless the Aacraid driver has
been changed I'll experience the same problems. I see no massive-load
or hangs on my other machine at this time.
The only other thing is this machine is using the on-board Eepro card
and two add-on 3c905's. I've left the configuration on these fairly
generic. Plus, nothing, as far as network goes, changed in the
upgrade to RedHat v8.0.
Any ideas? Pointers? More data? I'm fairly stumped... I suppose at
the worst, I could maybe learn how to hook up a remote kernel
profiler/debugger to get some real numbers on it. When running
"iostat", it looks like this box shows much higher raid-driver service
times than all the other boxes, which leads me to believe it is a
raid-driver (aacraid) issue again.
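Short of a kernel profiler, one cheap way to get some numbers is to log
the load average with timestamps, so that spikes can be lined up against
the hangs and the iostat output. This is only a rough sketch: it assumes
the usual /proc/loadavg layout (1, 5, and 15 minute averages first), and
the function names, the 1000 threshold, and the log path in the usage
note are all made up for illustration.

```python
import time


def parse_loadavg(text):
    """Return the (1min, 5min, 15min) averages from a /proc/loadavg line."""
    fields = text.split()
    return tuple(float(f) for f in fields[:3])


def is_spike(text, threshold=1000.0):
    """True when the 1-minute load average reaches the threshold."""
    return parse_loadavg(text)[0] >= threshold


def log_load(log_path, interval=10, samples=6, source="/proc/loadavg"):
    """Append timestamped load samples to log_path, flagging spikes."""
    with open(log_path, "a") as log:
        for _ in range(samples):
            with open(source) as f:
                line = f.read().strip()
            stamp = time.strftime("%Y-%m-%d %H:%M:%S")
            flag = " SPIKE" if is_spike(line) else ""
            log.write("%s %s%s\n" % (stamp, line, flag))
            # Flush each sample so it is more likely to survive a hang.
            log.flush()
            time.sleep(interval)
```

Calling something like log_load("/var/log/loadwatch.log", interval=10,
samples=8640) would cover roughly a day; if the raid driver stalls the
disk, the last few samples may still be lost.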
Thank you in advance...
Peter Smith
_______________________________________________
Linux-PowerEdge mailing list
Linux-PowerEdge@xxxxxxxx
http://lists.us.dell.com/mailman/listinfo/linux-poweredge
Please read the FAQ at http://lists.us.dell.com/faq or search the list
archives at http://lists.us.dell.com/htdig/