[BlueOnyx:11113] Re: SL6.2 no boot from degraded RAID1... with fix... BTW 6.3 is OK
Gerald Waugh
gwaugh at frontstreetnetworks.com
Mon Aug 6 18:49:15 -05 2012
On 08/06/2012 05:45 PM, Richard Morgan wrote:
>
> ----- Original Message -----
> *From:* Gerald Waugh <mailto:gwaugh at frontstreetnetworks.com>
> *To:* BlueOnyx General Mailing List
> <mailto:blueonyx at mail.blueonyx.it>
> *Sent:* Monday, August 06, 2012 11:22 PM
> *Subject:* [BlueOnyx:11108] Re: SL6.2 no boot from degraded
> RAID1... with fix... BTW 6.3 is OK
>
> On 08/06/2012 07:47 AM, Gerald Waugh wrote:
>> *I M P O R T A N T ! ! !
>> Note: copied from Scientific Linux list* Konstantin Olchanski <olchansk at triumf.ca>
>>
>> "======================================
>> FYI, as a regression from SL6.0 and SL6.1, SL6.2 does not boot from degraded RAID1 devices.
>>
>> If your "/" is on a RAID1 mirrored across 2 disks and *1 of the 2 disks dies,* *your system will
>> not boot* because dracut does not activate the required md devices.
>>
>> This is a very serious problem because RAID1 (mirroring) of "/" and "swap" is a popular
>> solution for protecting against single-disk failures. The present bug defeats this protection
>> and makes the situation worse because failure of either of the 2 disks makes your system
>> unbootable.
>>
>> It is astonishing that this problem was not caught by anybody's QA, did not receive
>> wide publicity **and** the solution was not pushed into the current release of SL.
>>
>> Bug report against dracut was filed in January:
>> https://bugzilla.redhat.com/show_bug.cgi?id=772926
>> marked as duplicate of secret bug:
>> https://bugzilla.redhat.com/show_bug.cgi?id=761584
>> solution made available in July for (as best I can tell) the 6.3 release:
>> http://rhn.redhat.com/errata/RHBA-2012-0839.html (dracut-004-283.el6.src.rpm)
>> http://rhn.redhat.com/errata/RHBA-2012-1078.html (dracut-004-284.el6_3.src.rpm)
>>
>> These RPMs are available in SL6 .../6rolling/x86_64/updates/fastbugs/
>>
>> I confirm that dracut-004-284.el6_3 can boot SL6.2 from degraded "/" (one disk missing).
>>
>> Note that applying the fix on affected systems is not trivial:
>>
>> 1) rpm -vh --upgrade dracut-004-284.el6_3.noarch.rpm dracut-kernel-004-284.el6_3.noarch.rpm
>> 2) bad dracut is still present inside the /boot/initramfs files, your system is still broken
>> 3) dracut -v -f ### this rebuilds the initramfs for the ***presently running*** kernel, not necessarily the one used for the next reboot
>> 4) find /boot -name 'initramfs*.img' -print -exec lsinitrd {} \; | grep dracut-0 ### report dracut version inside all /boot/initramfs files
>> 5) dracut -v -f /boot/initramfs-2.6.32-279.1.1.el6.x86_64.img 2.6.32-279.1.1.el6.x86_64 ### rebuild initramfs for the latest update kernel
>> "=======================================
>> *
>> This is fixed in SL6.3.
>> It looks like CentOS, with its 6.3 version, is OK.
>>
>> *
> And no one cares to comment?
> It doesn't bother you that if one of the drives goes south and you
> reboot the server,
> it won't boot up, and you have to go to the data center and swap drives
> to get it to boot?
>
> --
> Gerald
>
>
>
> Hi Gerald, can't speak for others but I've done a bit of research into this prompted by your email. I for one appreciated your message greatly.
>
> I've tried putting a test server together this evening to try out both the failure and the fix, but my spare disks are at my office. I'll try tomorrow and report my findings.
>
> Richard
>
I believe you will be OK.
Just be sure that the working drive is in the 'a' position (1st drive).
I believe the problem is that a server may not reboot when one of the
drives/partitions has failed
and you aren't available to move the drive to the correct position.
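
Steps 3 to 5 of the quoted fix rebuild one initramfs at a time, so an image for a kernel that is not currently running is easy to miss. As a rough sketch only, the loop below prints (and does not run) one "dracut -v -f" command per image found in /boot. It assumes images are named initramfs-<kernel-version>.img, as in step 5. The function names kernel_version_of and print_rebuild_commands, and the BOOT_DIR variable, are made up for this illustration and are not part of dracut.

```shell
#!/bin/sh
# Dry-run sketch: print a dracut rebuild command for each initramfs image.
# Nothing is rebuilt here; review the printed commands before running them.
# kernel_version_of, print_rebuild_commands and BOOT_DIR are illustrative
# names, not part of dracut.

# Derive the kernel version from an image path such as
# /boot/initramfs-2.6.32-279.1.1.el6.x86_64.img
kernel_version_of() {
    name=${1##*/}              # strip the directory part
    name=${name#initramfs-}    # strip the "initramfs-" prefix
    printf '%s\n' "${name%.img}"   # strip the ".img" suffix
}

# Print one "dracut -v -f <image> <kernel-version>" line per image.
print_rebuild_commands() {
    boot_dir=${1:-/boot}
    for img in "$boot_dir"/initramfs-*.img; do
        [ -e "$img" ] || continue    # glob matched nothing
        printf 'dracut -v -f %s %s\n' "$img" "$(kernel_version_of "$img")"
    done
}

print_rebuild_commands "${BOOT_DIR:-/boot}"
```

After running the printed commands, the check in step 4 (lsinitrd piped through grep dracut-0) should still be used to confirm which dracut version ended up inside each image.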
--
Gerald