[BlueOnyx:08266] Re: eth Interface Bug and More!

Michael Stauber mstauber at blueonyx.it
Thu Aug 25 17:42:56 -05 2011


Hi Aaron,

> I've been imaging servers by using RAID 1 to duplicate hard drives
> across machines, starting from one, and then creating another and
> another as necessary. I have four Sun Fire X2200 servers. They all have
> four ethernet ports in the back: two are controlled by an nVidia
> chipset, and two by a Broadcom chipset. They are labeled on the physical
> case as eth0, eth1, eth2 and eth3.
> 
> The way the ports get labeled in the operating system, using the same
> terms (eth0 through eth3) is pretty haphazard, meaning that the OS
> labels definitely do not correspond to the physical port labels.
> Suddenly eth3 means eth0. It's very confusing. I doubt there's much that
> can be done about this unless the hardware itself provides metadata
> about which port is which, and I don't know if it does.

Yeah, if you have multiple NICs, then things can get a bit messy. With just 
two NICs it's sometimes a coin toss which NIC becomes eth0 and which one 
becomes eth1 after each reboot. With three or four it gets outright crazy.

However: Each NIC has its own MAC address and these can be used to hard-wire 
a NIC to a specific ethernet device.

There are different ways to do this.

The typical approach is to run "/sbin/ifconfig" and to look at that output to 
find out the MAC addresses of your network cards.

Then you edit your /etc/sysconfig/network-scripts/ifcfg-eth* files and add a 
line to each:

HWADDR=00:FF:FF:99:DA:AA

Of course you'd replace 00:FF:FF:99:DA:AA with the correct MAC address of 
each NIC.

So if eth0 has the MAC 00:FF:FF:99:DA:AA, you'd put the line ...

HWADDR=00:FF:FF:99:DA:AA

... into /etc/sysconfig/network-scripts/ifcfg-eth0
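If you'd rather not pick the MAC addresses out of the ifconfig output by 
hand, here is a minimal sketch that reads them from sysfs instead. It assumes 
each interface has an "address" file under /sys/class/net; the function name 
list_macs is just made up for this example. Note that sysfs prints the MAC 
in lowercase.

```shell
# list_macs is a hypothetical helper name, not part of any distribution.
# Prints "<device> <MAC>" for every interface found under a sysfs net
# directory. The directory defaults to /sys/class/net.
list_macs() {
    net_dir="${1:-/sys/class/net}"
    for dev in "$net_dir"/*; do
        # Skip entries without a readable address file
        # (this also covers the unexpanded glob on an empty directory).
        [ -r "$dev/address" ] || continue
        printf '%s %s\n' "$(basename "$dev")" "$(cat "$dev/address")"
    done
}
```

Running "list_macs" without an argument lists the live interfaces, and each 
line tells you which MAC belongs in the HWADDR= line of which ifcfg-eth* file.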

However, EL6 (RHEL6, CentOS6, SL6) has a newer UDEV (as you already noticed). 
It tries to keep track of the MAC address <-> ethernet device mapping in a 
file called /etc/udev/rules.d/70-persistent-net.rules

That really throws things off track when you either replace the NICs, or 
when you take a disk out of one box and put it into another.

Sadly the info in /etc/udev/rules.d/70-persistent-net.rules takes precedence 
over the HWADDR lines in your ifcfg-eth* files.

So the best procedure here is: If you take a disk out of one box and put it 
into another, edit /etc/udev/rules.d/70-persistent-net.rules and throw out 
all the "SUBSYSTEM=" lines at the end. Then reboot and your network should 
come up fine.
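The cleanup step can also be scripted. This is only a sketch under the 
assumption that every rule line in the file starts with SUBSYSTEM, as the 
generated entries on EL6 do; strip_net_rules is a hypothetical name. It 
keeps a backup copy next to the original before touching anything.

```shell
# strip_net_rules is a hypothetical helper name.
# It saves a backup as <file>.bak, then rewrites the rules file without
# any line that starts with SUBSYSTEM. Comments and blank lines are kept.
strip_net_rules() {
    rules="${1:-/etc/udev/rules.d/70-persistent-net.rules}"
    cp -p "$rules" "$rules.bak" || return 1
    sed '/^SUBSYSTEM/d' "$rules.bak" > "$rules"
}
```

Run "strip_net_rules" as root and reboot afterwards. If something goes wrong, 
the previous content is still in 70-persistent-net.rules.bak.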

> It gets better. When you go to change your IP address in the BlueOnyx
> control panel, you are presented with four sets of fields corresponding
> to your four ethernet adapters. But you can't save. Why can't you save?
> Because there's a JavaScript error. Why is there a JavaScript error?
> Because on line 101 of /usr/sausalito/ui/web/base/network/ethernet.php,
> the PHP variable $interfaces is getting its data from
> $cceClient->findx('Network', array('real' => 1), array(), 'ascii',
> 'device');, and that's reporting back that there are in fact eight
> ethernet adapters. There aren't, but as soon as JavaScript hits
> $interfaces[4], it can't find the HTML inputs because they don't exist.
> (This makes it REALLY hard to get an internet connection up and running,
> by the way.)

Yes, I can imagine. I think I had already noticed this several weeks ago 
during development and forgot to note it down for fixing, as I had intended to 
tackle the /etc/udev/rules.d/70-persistent-net.rules issue from another angle.

> Another fun fact about using RAID 1 to clone a clean BlueOnyx install
> image: it kills your swap partition. 

Yeah, disk cloning sadly has some drawbacks like this and if you do it, you 
have to recreate the Swap partition. Alternatively you can possibly fix this 
by just substituting the right UUIDs? It has been a while since I did that, so 
I'm not 100% sure.

> If you ever want that back instead of a lame md127 array that comes out of 
> nowhere because the mdadm-style (versus the other style, who thought of 
> this insanity and why are there two styles?) [...]

Oooooooh, yeah. Back in the days of old the RAID info was stored in 
/etc/raidtab, now with MDADM it goes into /etc/mdadm.conf. MDADM is still 
maintained and has a few more goodies than the ancient raid-bins, so sometime 
in or around CentOS4 this started to sneak in. Sadly MDADM also keeps track of 
the UUID of disks and notes them down in /etc/mdadm.conf. So if you clone one 
RAID disk, put it into another box and want it to RAID-sync the data to the 
2nd (blank) disk, the UUID for the 2nd disk in your /etc/mdadm.conf will be 
different from the one it sees.

You can fix this manually, too. Just run "blkid" as root and it will show 
you a list of your block devices with their associated UUIDs. Just put the 
right UUIDs into your /etc/mdadm.conf, reboot and you're good to go.
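To see which UUIDs /etc/mdadm.conf currently expects, something like the 
following may help. It is a rough sketch that assumes the usual 
"ARRAY <device> ... UUID=<value>" line layout; conf_array_uuids is a 
hypothetical name.

```shell
# conf_array_uuids is a hypothetical helper name.
# For every ARRAY line in an mdadm.conf style file, print the array device
# followed by the value of its UUID= word. Keywords are matched without
# regard to case.
conf_array_uuids() {
    conf="${1:-/etc/mdadm.conf}"
    awk 'toupper($1) == "ARRAY" {
        for (i = 3; i <= NF; i++)
            if (toupper($i) ~ /^UUID=/) print $2, substr($i, 6)
    }' "$conf"
}
```

Compare that list with what "blkid" reports for the member partitions. On a 
running array, "mdadm --detail --scan" also prints ARRAY lines you can check 
against.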

To sum things up:

When you clone a 5107R RAID1 disk and put it into a fresh box alongside an 
empty disk of equal size, you need to do the following:

a) Delete all SUBSYSTEM lines from /etc/udev/rules.d/70-persistent-net.rules

b) Run "blkid" to find out the UUIDs of your disks and partitions. 

c) Edit /etc/mdadm.conf to fill in the right UUIDs for your partitions.

Or instead of (b) and (c) do this:

d) Recreate your SWAP partition:

 mdadm --manage --stop /dev/md127
 mdadm --create /dev/md2 --level=1 --raid-devices=2 /dev/sda3 /dev/sdb3
 mdadm --misc --detail /dev/md2    # tells you the UUID
 vim /etc/mdadm.conf               # paste the UUID in the proper place
 mkswap /dev/md2
 swapon /dev/md2

Finally: Reboot.

-- 
With best regards

Michael Stauber