[BlueOnyx:11110] Re: SL6.2 no boot from degraded RAID1... with fix... BTW 6.3 is OK

Richard Morgan richard at morgan-web.co.uk
Mon Aug 6 17:45:42 -05 2012


  ----- Original Message ----- 
  From: Gerald Waugh 
  To: BlueOnyx General Mailing List 
  Sent: Monday, August 06, 2012 11:22 PM
  Subject: [BlueOnyx:11108] Re: SL6.2 no boot from degraded RAID1... with fix... BTW 6.3 is OK


  On 08/06/2012 07:47 AM, Gerald Waugh wrote: 
I M P O R T A N T ! ! !
Note: copied from the Scientific Linux list, originally posted by Konstantin Olchanski <olchansk at triumf.ca>
 
"======================================
FYI, as a regression from SL6.0 and SL6.1, SL6.2 does not boot from degraded RAID1 devices.

If your "/" is on a RAID1 mirrored across 2 disks and 1 of the 2 disks dies, your system will
not boot because dracut does not activate the required md devices.

This is a very serious problem because RAID1 (mirroring) of "/" and "swap" is a popular
solution for protecting against single-disk failures. The present bug defeats this protection
and makes the situation worse because failure of either of the 2 disks makes your system
unbootable.

It is astonishing that this problem was not caught by anybody's QA, did not receive
wide publicity *and* the solution was not pushed into the current release of SL.

Bug report against dracut was filed in January:
https://bugzilla.redhat.com/show_bug.cgi?id=772926
marked as duplicate of secret bug:
https://bugzilla.redhat.com/show_bug.cgi?id=761584
solution made available in July for (as best I can tell) the 6.3 release:
http://rhn.redhat.com/errata/RHBA-2012-0839.html (dracut-004-283.el6.src.rpm)
http://rhn.redhat.com/errata/RHBA-2012-1078.html (dracut-004-284.el6_3.src.rpm)

These RPMs are available in SL6 .../6rolling/x86_64/updates/fastbugs/

I confirm that dracut-004-284.el6_3 can boot SL6.2 from degraded "/" (one disk missing).

Note that applying the fix on affected systems is not trivial:

1) rpm -vh --upgrade dracut-004-284.el6_3.noarch.rpm dracut-kernel-004-284.el6_3.noarch.rpm
2) bad dracut is still present inside the /boot/initramfs files, your system is still broken
3) dracut -v -f ### this rebuilds the initramfs for the ***presently running*** kernel, not necessarily the one used for the next reboot
4) find /boot -name 'initramfs*.img' -print -exec lsinitrd {} \; | grep dracut-0 ### report dracut version inside all /boot/initramfs files
5) dracut -v -f /boot/initramfs-2.6.32-279.1.1.el6.x86_64.img 2.6.32-279.1.1.el6.x86_64 ### rebuild initramfs for the latest update kernel
"=======================================
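
As a rough sketch of steps 4 and 5 above applied to every image rather than one, the following prints (but does not run) one rebuild command per initramfs file. It assumes the images are named initramfs-<kernel-version>.img, as in step 5; the function names kver_from_image and rebuild_cmds are made up for this illustration and are not part of dracut.

```shell
#!/bin/sh
# Hypothetical helper: derive the kernel version from an initramfs file name,
# assuming the naming scheme initramfs-<kernel-version>.img.
kver_from_image() {
    name=${1##*/}              # drop the directory part
    name=${name#initramfs-}    # drop the "initramfs-" prefix
    printf '%s\n' "${name%.img}"   # drop the ".img" suffix
}

# Hypothetical helper: print one "dracut -v -f <image> <kver>" line per image
# found in the given directory (default: /boot). Nothing is executed.
rebuild_cmds() {
    bootdir=${1:-/boot}
    for img in "$bootdir"/initramfs-*.img; do
        [ -e "$img" ] || continue      # glob matched nothing
        kver=$(kver_from_image "$img")
        printf 'dracut -v -f %s %s\n' "$img" "$kver"
    done
}
```

Running rebuild_cmds only lists the commands, so they can be reviewed first (for example, to drop images for kernels that are no longer installed) before being run as root.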

This is fixed in SL6.3.
It looks like CentOS is OK as of its 6.3 version.


And no one cares to comment?
  Doesn't it bother you that if one of the drives goes south and you reboot the server,
  it won't boot up, and you have to go to the data center and swap drives to get it to boot?
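
Since the trouble only shows up at reboot, it may help to check whether an array is already degraded before restarting an affected server. A minimal sketch, assuming the usual /proc/mdstat layout in which a missing member shows as "_" in the status brackets (for example [U_]); the function name degraded_arrays is made up for this illustration.

```shell
#!/bin/sh
# Hypothetical helper: list md arrays whose status brackets contain "_",
# i.e. arrays with at least one missing member.
# Reads /proc/mdstat by default; another file can be passed for testing.
degraded_arrays() {
    awk '/^md/ { name = $1 }                    # remember the current array name
         /\[[U_]*_[U_]*\]/ { print name }       # status such as [U_] or [_U]
    ' "${1:-/proc/mdstat}"
}
```

An empty result from degraded_arrays suggests all arrays listed in /proc/mdstat have their full set of members.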

-- 
Gerald

Hi Gerald,

I can't speak for others, but your email prompted me to do a bit of research into this. I, for one, greatly appreciated your message.

I tried putting a test server together this evening to try out both the failure and the fix, but my spare disks are at my office. I'll try tomorrow and report my findings.

Richard
