[BlueOnyx:10782] Re: Blue Onyx 5106R Crash

Michael Stauber mstauber at blueonyx.it
Wed Jun 13 09:25:01 -05 2012


Hi Bill,

> 6. I log in via ssh to see if maybe it is locked up on the update or
> experiencing a DOS. When I get in and su to root I do a "top" to see
> what is happening and I get "Bus Error". Wow never seen that before.

I'm really sorry to hear that this box is giving you grief again. :-(

I have seen "Bus Error" messages before on one of my own boxes. It was
also a fairly old clunker only used for some development and testing
stuff. When Linux says "Bus Error" on something as basic as "top", then
in my experience that points somewhere in the direction of the hard
disk controller. It could be that the controller itself is busted,
could be bad cabling, oxidized contacts, a problem with the circuit
board that's screwed under the HD itself, or it could be something
wrong with a part of the motherboard that interfaces with the
controller.

For testing purposes I'd put the disks into another box. This could be
any PC - even a workstation. In my office I'd take a Windows box,
disconnect its internal HDs, hook up one of the HDs from the failed
server via USB and boot the box off the BlueOnyx CD in rescue mode.
That should allow you to check if there is still good data on the hard
disk or if the partition table is trashed as the failed box claims.
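
As a rough illustration of the partition table check, here is a minimal
Python sketch. It assumes the disk uses a classic MBR partition table
(likely for 80GB disks, but not confirmed in this thread) and that
Python is available in the rescue environment. The function names
parse_mbr and read_mbr are made up for this example. It only reads the
first sector and does not write to the disk.

```python
import struct

SECTOR_SIZE = 512  # classic MBR sector size (assumption: not a 4K-native disk)


def parse_mbr(sector):
    """Return the non-empty primary partition entries found in an MBR sector.

    Raises ValueError if the sector is too short or lacks the 0x55AA
    boot signature, which suggests a trashed or missing partition table.
    """
    if len(sector) < SECTOR_SIZE:
        raise ValueError("short read: got %d bytes" % len(sector))
    if sector[510:512] != b"\x55\xaa":
        raise ValueError("missing 0x55AA boot signature")

    partitions = []
    for index in range(4):
        # Four 16-byte entries start at offset 446.
        entry = sector[446 + 16 * index:446 + 16 * (index + 1)]
        # Layout: status (1), CHS start (3, skipped), type (1),
        # CHS end (3, skipped), LBA start (4, LE), sector count (4, LE).
        status, ptype, lba_start, num_sectors = struct.unpack("<B3xB3xII", entry)
        if ptype == 0:
            continue  # type 0 marks an unused slot
        partitions.append({
            "slot": index + 1,
            "bootable": status == 0x80,
            "type": ptype,
            "lba_start": lba_start,
            "sectors": num_sectors,
        })
    return partitions


def read_mbr(path):
    """Read the first sector of a device or image file and parse it."""
    with open(path, "rb") as handle:
        return parse_mbr(handle.read(SECTOR_SIZE))
```

Run as root in rescue mode, something like read_mbr("/dev/sdb") would
list the primary partitions (the device name is only a placeholder for
wherever the USB-attached disk shows up). Type bytes such as 0x83
(Linux) or 0x8e (Linux LVM) with plausible start sectors suggest the
table is intact. A ValueError or an I/O error on the read itself points
at the disk or the way it is attached.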

If the disks turn out OK, you could even shove them into another server
and use them there. Whether that still makes sense is another question.
You mentioned 80GB disks, so I assume they are also already pretty aged.

All in all I agree with Chris: If the hardware starts to get flaky,
then it's time to bin it, or to retire it to unimportant tasks where
potential loss of data is no longer a critical or crippling issue. In my
experience the typical hardware lasts me about four years and then the
failure rate usually goes through the roof. First the disks let go and
the number of disk-related failures skyrockets, then the box crashes
more often and in the end something lets go entirely, which prevents
startup. The longest I ever got out of a 24/7 running server was seven
years, but by then it was operating on its third set of HDs and second
set of RAM.

-- 

With best regards,

Michael Stauber


