<html>
<head>
<meta content="text/html; charset=ISO-8859-1"
http-equiv="Content-Type">
</head>
<body bgcolor="#FFFFFF" text="#000000">
On 08/06/2012 05:45 PM, Richard Morgan wrote:
<blockquote cite="mid:04B71FB9BFCA4377AA8B9A679DCB1ACE@morganweb"
type="cite">
<meta http-equiv="Context-Type" content="text/html;
charset=iso-8859-1">
<blockquote>
<div>----- Original Message ----- </div>
<div><b>From:</b> <a moz-do-not-send="true"
title="gwaugh@frontstreetnetworks.com"
href="mailto:gwaugh@frontstreetnetworks.com">Gerald Waugh</a>
</div>
<div><b>To:</b> <a moz-do-not-send="true"
title="blueonyx@mail.blueonyx.it"
href="mailto:blueonyx@mail.blueonyx.it">BlueOnyx General
Mailing List</a> </div>
<div><b>Sent:</b> Monday, August 06, 2012 11:22 PM</div>
<div><b>Subject:</b> [BlueOnyx:11108] Re: SL6.2 no boot from
degraded RAID1... with fix... BTW 6.3 is OK</div>
<div><br>
</div>
On 08/06/2012 07:47 AM, Gerald Waugh wrote:
<blockquote cite="mid:501FBCF5.5050100@frontstreetnetworks.com"
type="cite">
<pre><b>I M P O R T A N T ! ! !
Note: copied from Scientific Linux list</b> Konstantin Olchanski <a moz-do-not-send="true" href="mailto:olchansk@triumf.ca">&lt;olchansk@triumf.ca&gt;</a>
"======================================
FYI, as a regression from SL6.0 and SL6.1, SL6.2 does not boot from degraded RAID1 devices.
If your "/" is on a RAID1 mirrored across 2 disks and <b>1 of the 2 disks dies,</b> <b>your system will
not boot</b> because dracut does not activate the required md devices.
This is a very serious problem because RAID1 (mirroring) of "/" and "swap" is a popular
solution for protecting against single-disk failures. The present bug defeats this protection
and makes the situation worse because failure of either of the 2 disks makes your system
unbootable.
It is astonishing that this problem was not caught by anybody's QA, did not receive
wide publicity <b>*and*</b> the solution was not pushed into the current release of SL.
Bug report against dracut was filed in January:
<a moz-do-not-send="true" href="https://bugzilla.redhat.com/show_bug.cgi?id=772926">https://bugzilla.redhat.com/show_bug.cgi?id=772926</a>
marked as duplicate of secret bug:
<a moz-do-not-send="true" href="https://bugzilla.redhat.com/show_bug.cgi?id=761584">https://bugzilla.redhat.com/show_bug.cgi?id=761584</a>
solution made available in July for (the best I can tell) the 6.3 release:
<a moz-do-not-send="true" href="http://rhn.redhat.com/errata/RHBA-2012-0839.html">http://rhn.redhat.com/errata/RHBA-2012-0839.html</a> (dracut-004-283.el6.src.rpm)
<a moz-do-not-send="true" href="http://rhn.redhat.com/errata/RHBA-2012-1078.html">http://rhn.redhat.com/errata/RHBA-2012-1078.html</a> (dracut-004-284.el6_3.src.rpm)
These RPMs are available in SL6 .../6rolling/x86_64/updates/fastbugs/
I confirm that dracut-004-284.el6_3 can boot SL6.2 from degraded "/" (one disk missing).
Note that applying the fix on affected systems is not trivial:
1) rpm -vh --upgrade dracut-004-284.el6_3.noarch.rpm dracut-kernel-004-284.el6_3.noarch.rpm
2) bad dracut is still present inside the /boot/initramfs files, your system is still broken
3) dracut -v -f ### this rebuilds the initramfs for the ***presently running*** kernel, not necessarily the one used for the next reboot
4) find /boot -name 'initramfs*.img' -print -exec lsinitrd {} \; | grep dracut-0 ### report dracut version inside all /boot/initramfs files
5) dracut -v -f /boot/initramfs-2.6.32-279.1.1.el6.x86_64.img 2.6.32-279.1.1.el6.x86_64 ### rebuild initramfs for the latest update kernel
"=======================================
<b>
It is fixed in SL6.3.
Looks like CentOS 6.3 is OK as well.
A sketch automating the rebuild steps follows below this quote.
</b>
</pre>
</blockquote>
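For anyone following along, a minimal sketch of steps 3-5 above, assuming the fixed dracut and dracut-kernel RPMs from step 1 are already installed; the loop over /boot/vmlinuz-* is my addition, so the next reboot picks up the fix no matter which kernel grub selects:<br>
<pre># Report the dracut version inside each initramfs in /boot (step 4)
find /boot -name 'initramfs*.img' -print -exec lsinitrd {} \; | grep dracut-0

# Rebuild the initramfs for every installed kernel, not just the
# currently running one (generalizes step 5)
for k in /boot/vmlinuz-*; do
    ver=${k#/boot/vmlinuz-}
    dracut -v -f "/boot/initramfs-$ver.img" "$ver"
done

# Re-run the report to confirm each image now contains the fixed dracut
find /boot -name 'initramfs*.img' -print -exec lsinitrd {} \; | grep dracut-0</pre>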
And no one cares to comment?<br>
It doesn't bother you that if one of the drives goes south and
you reboot the server,<br>
it won't boot up, and you have to go to the data center and swap
drives to get it to boot?<br>
<pre>--
Gerald </pre>
<pre>Hi Gerald, I can't speak for others, but your email prompted me to do a bit of research into this. I for one appreciated your message greatly.</pre>
<pre>I tried putting a test server together this evening to try out both the failure and the fix, but my spare disks are at my office. I'll try again tomorrow and report my findings.</pre>
<pre>Richard</pre>
</blockquote>
</blockquote>
I believe you will be OK.<br>
Just be sure that the working drive is in the 'a' position (1st
drive).<br>
<br>
I believe the problem is that a server may not reboot when one of
the drives/partitions has failed<br>
and you aren't available to move the drive to the correct position.<br>
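If you want to confirm which drive is still healthy before touching the hardware, something like this should show it (assuming the root array is /dev/md0; adjust the device name for your layout):<br>
<pre># Kernel's view of all md arrays and the state of their members
cat /proc/mdstat

# Per-array detail; look for "degraded" in the State line and check
# which member is marked faulty or removed
mdadm --detail /dev/md0</pre>
<br>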
<pre class="moz-signature" cols="72">--
Gerald</pre>
</body>
</html>