[BlueOnyx:10580] Re: CCED is hanging on startup and is still there after a couple hours.

Greg Kuhnert gkuhnert at compassnetworks.com.au
Tue May 15 16:23:25 -05 2012


Hi Jeremy.

On 5/16/2012 1:01 AM, jtowne at turtlehut.com wrote:
>>> Hi Jeremy,
>>>
>>>> We were having some issues with adding sites to a 5106R so I tried
>>>> restarting CCED and admserv but it kept hanging on CCED.  There were
>>>> usually 2 processes started but never gave the OK/done.  So we decided
>>>> to reboot the server because someone was going to be in the area today.
>>>> Well that was 2.5 hours ago and it is still sitting at starting CCED.
>>>> All sites are down and we can't shell in.  Any ideas how to proceed?
>>> Boot off the BlueOnyx CD in rescue mode. To do so, insert the CD, reboot
>>> and at the CD installation menu type "linux rescue" instead of pressing
>>> return.
>>>
>>> It'll ask for keyboard and network settings. You don't really need to set
>>> the network, so you can skip that step. It will ask you if you want to
>>> mount the hard disks, which you confirm.
>>>
>>> Once you get the shell, type:
>>>
>>> chroot /mnt/sysimage/
>>>
>>> That will make the installed system's filesystem (mounted at
>>> /mnt/sysimage) your /, so that you can work directly with the installed OS
>>> instead of the OS provided by the rescue mode.
>>>
>>> Type this command next:
>>>
>>> /sbin/chkconfig --level 2345 cced.init off
>>>
>>> That will prevent cced.init from starting up during the next reboot.
>>>
>>> Press CTRL+D twice to exit out of the chroot and to exit the rescue mode.
>>> Reboot and while the system goes into reboot remove the CD from the drive
>>> and let the server boot up the normal way.
>>>
>>> The system will come up again, but without starting CCEd. That will prevent
>>> the GUI from working and httpd may also not work, but you can at least get
>>> back into the box using SSH for troubleshooting the real issue.
>>>
>>> Most likely the CCEd issue is a corrupted CODB database. To troubleshoot
>>> that, create the script /root/oid.sh and copy and paste the following code
>>> into it (without the ---- lines):
>>>
>>> --------------------------------------------------------------------------------------
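>>> #!/bin/bash
>>> # Print the object IDs present on disk as comma-separated ranges, then
>>> # print what codb.oids records, so the two lines can be compared.
>>> LAST=-1   # last OID seen in the current run
>>> MIN=-1    # first OID of the current run
>>>
>>> for X in `ls /usr/sausalito/codb/objects/ | sort -n`
>>> do
>>>    MYNEXT=$(( $LAST + 1 ))
>>>    if [ $MYNEXT -eq $X ]
>>>    then
>>>      # X continues the current run
>>>      LAST=$X
>>>    else
>>>      # Gap found: print the finished run (nothing to print before the
>>>      # first run has started)
>>>      if [ $LAST -ge 1 ]
>>>      then
>>>        if [ $MIN -eq $LAST ]
>>>        then
>>>          echo -n $LAST,
>>>        else
>>>          echo -n $MIN-$LAST,
>>>        fi
>>>      fi
>>>      LAST=$X
>>>      MIN=$X
>>>    fi
>>> done
>>> # Print the final run; if MYNEXT < X the last OID stands alone
>>> if [ $MYNEXT -lt $X ]
>>> then
>>>    echo -n $LAST
>>> else
>>>    echo -n $MIN-$LAST
>>> fi
>>>
>>> echo ""
>>> echo "/usr/sausalito/codb/codb.oids reports:"
>>> cat /usr/sausalito/codb/codb.oids
>>> echo ""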
>>> --------------------------------------------------------------------------------------
>>>
>>> Make the script executable ("chmod 700 /root/oid.sh") and run it. CCEd does
>>> NOT have to be running for this test.
>>>
>>> It will generate output similar to this:
>>>
>>> [root@server ~]# /root/oid.sh
>>> 1-51,56-59,65-114,127-135,149-153,160-175,181-201,204-207,219-229,231-257,260-291,328-329,337-347,363-367,377-382,392-397
>>> /usr/sausalito/codb/codb.oids reports:
>>> 1-51,56-59,65-114,127-135,149-153,160-175,181-201,204-207,219-229,231-257,260-291,328-329,337-347,363-367,377-382,392-397
>>>
>>> The lines with the numbers are what we are looking for. They MUST be
>>> identical. If they are not, then that is the problem.
>>>
>>> So if the lines are NOT identical, do this:
>>>
>>> Edit /usr/sausalito/codb/codb.oids and remove everything in it. Then copy
>>> the first numbered line into that file.
>>>
>>> Afterwards restart CCEd with this command:
>>>
>>> /etc/init.d/cced.init restart
>>>
>>> That should fix it.
>>>
>>> --
>>> With best regards
>>>
>>> Michael Stauber
>>> _______________________________________________
>>> Blueonyx mailing list
>>> Blueonyx at mail.blueonyx.it
>>> http://mail.blueonyx.it/mailman/listinfo/blueonyx
>>>
>> When I did this I got the server back up.  Then I fixed the
>> OIDS so they matched.  I restarted CCED but it ran for hours without
>> completing.  I logged in with another terminal and the OIDS don't match
>> again.  Why is this?  What am I doing wrong?
>>
>> Jeremy
>>
>   To add, I just found out that the administration interface shows a site
> that doesn't have a folder in /home anymore.  It looks like a failed site
> removal, or someone removed it incorrectly, which could have happened since
> there are multiple hands in the cookie jar.

You probably don't want to hear this... but there is a three-word 
solution you need to consider... "Restore from backup".

The CODB database is made up of three basic chunks: objects, a free-list 
index, and a class index. The free-list index has been rebuilt using the 
script above, so we know that is not the problem. The objects 
themselves generally cannot cause the type of issues you are seeing. 
That leaves the class index file. This is a Berkeley DB database that 
provides an index to the objects, keyed on the class names.
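
For reference, the free-list comparison that Michael's script leaves to 
the eye can be automated. This is only a rough sketch, not an official 
BlueOnyx tool: it assumes the two paths quoted earlier in this thread, and 
that codb.oids stores a lone OID as a bare number (the way Michael's 
script prints it). The names to_ranges, fmt_run and check_oids are made up 
for this example. It only reads; it never writes to codb.oids.

```shell
#!/bin/bash
# Sketch: compare the OID ranges found on disk with what codb.oids records.
# Default paths are the ones quoted in this thread; override via environment.
OBJDIR="${OBJDIR:-/usr/sausalito/codb/objects}"
OIDFILE="${OIDFILE:-/usr/sausalito/codb/codb.oids}"

# Format one run: "5" for a single OID, "1-3" for a range.
fmt_run() {
    if [ "$1" -eq "$2" ]; then echo "$1"; else echo "$1-$2"; fi
}

# Read sorted integers on stdin (one per line), print "1-3,5,7-8" style ranges.
to_ranges() {
    local start="" prev="" n out=""
    while read -r n; do
        if [ -z "$start" ]; then
            start=$n; prev=$n                 # first OID opens the first run
        elif [ "$n" -eq $((prev + 1)) ]; then
            prev=$n                           # run continues
        else
            out="$out$(fmt_run "$start" "$prev"),"   # gap: close the run
            start=$n; prev=$n
        fi
    done
    if [ -n "$start" ]; then
        out="$out$(fmt_run "$start" "$prev")"        # close the final run
    fi
    echo "$out"
}

# Print "match" (status 0) or the two differing lines (status 1).
check_oids() {
    local on_disk recorded
    # Only purely numeric names are treated as OIDs (an assumption).
    on_disk=$(ls "$OBJDIR" | grep -E '^[0-9]+$' | sort -n | to_ranges)
    recorded=$(tr -d '[:space:]' < "$OIDFILE")
    if [ "$on_disk" = "$recorded" ]; then
        echo "match"
    else
        echo "MISMATCH"
        echo "on disk:  $on_disk"
        echo "recorded: $recorded"
        return 1
    fi
}

# Run the check only where the CODB directory actually exists.
if [ -d "$OBJDIR" ]; then
    check_oids
fi
```

CCEd does not need to be running for this. Setting OBJDIR and OIDFILE 
lets you point the check at a copy of the database instead of the live one.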

Now comes the bad news. There is currently no tool available to rebuild 
/ recreate that index file. If it gets corrupted, your only way out is a 
restore from backup. I have only seen it cause problems once or twice, 
and it has always been a fatal situation... Restore from backup.
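
If you do get the box stable again, it is worth snapshotting the CODB 
directory before anyone edits it by hand. A minimal sketch, assuming 
/usr/sausalito/codb holds the whole database (as the paths above suggest) 
and that /root has room for the archive. backup_codb is a hypothetical 
helper name, and a copy like this is no substitute for a proper full backup.

```shell
#!/bin/bash
# Hypothetical helper: archive a directory (default: the CODB directory
# named in this thread) into a timestamped tarball and print its path.
backup_codb() {
    local src="${1:-/usr/sausalito/codb}"   # assumed location of CODB
    local dest="${2:-/root}"                # where the archive is written
    local archive
    archive="$dest/codb-$(date +%Y%m%d-%H%M%S).tar.gz"

    # -C keeps the paths inside the archive relative (e.g. "codb/...").
    tar -czf "$archive" -C "$(dirname "$src")" "$(basename "$src")" || return 1

    # Listing the archive is a cheap check that it can be read back.
    tar -tzf "$archive" > /dev/null || return 1
    echo "$archive"
}
```

Calling backup_codb with no arguments writes /root/codb-<timestamp>.tar.gz. 
It is probably best run while CCEd is stopped so the files are not 
changing underneath tar.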

Sorry to be the bearer of bad tidings.

Regards,
Greg.



