CentOS 6.2 and libvirt startup issues

After the CentOS 6.2 update I noticed libvirt was not running on some servers. Looking at the logs I found:

Starting libvirtd daemon: 10:35:16.697: 6933: info : libvirt version: 0.9.4, package: 23.el6_2.1 (CentOS BuildSystem <http://bugs.centos.org>, 2011-12-17-16:39:59, c6b4.bsys.dev.centos.org)
10:35:16.697: 6933: error : virNetServerMDNSStart:460 : internal error Failed to create mDNS client: Daemon not running

 

Further investigation showed that avahi was needed for this to work. The final fix was running:

 

yum -y install avahi
/etc/init.d/messagebus restart
/etc/init.d/avahi-daemon restart
/etc/init.d/libvirtd restart
/sbin/chkconfig messagebus on
/sbin/chkconfig avahi-daemon on

In CentOS 6.2, restarting libvirt does not restart the VMs. Once this was done, libvirt was running again.
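
To spot the same failure on other servers, you can search the captured startup output for the mDNS error quoted above. This is only a minimal sketch: the function name needs_avahi is made up for illustration, and where the startup messages end up on your system is an assumption you will need to check.

```shell
# needs_avahi LOGFILE
# Succeeds (exit 0) if LOGFILE contains the libvirtd mDNS startup error,
# which in the case described above meant avahi was missing or not running.
needs_avahi() {
    grep -q 'Failed to create mDNS client' "$1"
}
```

Usage would look like needs_avahi /path/to/startup.log && echo "install and start avahi", with the path replaced by whatever file holds the libvirtd startup messages.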

 

 

Posted in Uncategorized | 2 Comments

OpenVZ and CentOS6

I have been testing OpenVZ on CentOS 6 today and, following my normal config, I installed software RAID1. Everything went fine until a reboot into the CentOS kernel, when I got this error:

 

dracut cannot find root, “sleeping forever”

 

As an old-school sysadmin I hate change, so my first thought was: what is dracut? (Not to mention that fstab now has UUIDs instead of LABELs, which looks like Ubuntu.) It turns out dracut replaces the old initrd system. It also has a shell you can boot into (rdshell), which seems kind of cool. On to the fix I found.

I noticed that on OpenVZ dracut never assembled the RAID1 array. I ran:

dracut -f --add-drivers raid1 --mdadmconf /boot/initramfs-2.6.32-042stab024.1.img 2.6.32-042stab024.1

That was the stable kernel at the time. After a reboot, the system came back up without errors.

 

You may need to run

mdadm --examine --scan

and update your /etc/mdadm.conf file
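
The kernel version appears twice in the dracut command, once in the image path and once as the version argument, so a small helper can build the command for any version. A minimal sketch, using only the options shown above; the function name rebuild_cmd is hypothetical, and it only prints the command rather than running it.

```shell
# rebuild_cmd KERNEL_VERSION
# Prints the dracut command that rebuilds the initramfs for KERNEL_VERSION
# with the raid1 driver and the local mdadm.conf included.
rebuild_cmd() {
    kver=$1
    printf 'dracut -f --add-drivers raid1 --mdadmconf /boot/initramfs-%s.img %s\n' "$kver" "$kver"
}
```

For example, rebuild_cmd "$(uname -r)" prints the command for the running kernel. Review it, then run it by hand after /etc/mdadm.conf is up to date.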

 

Posted in Uncategorized | Leave a comment

CVE-2010-3856

A new glibc exploit has been disclosed under CVE-2010-3856. Unlike the last glibc exploit a few days ago, this one does not give direct root access, but it does let you create files and directories in root-owned paths. I expect an update from RedHat within the next 24 to 48 hours.

I released a glibc update for the last glibc exploit in a testing repo. It looks like I will be keeping the testing repo for some time. Here is how to get the latest glibc update (a copy of my previous post):

-

Run /admin/updatefromtesting to get the glibc updates for CentOS 5.

You can get this by running

/admin/upscripts

If you do not have the admin scripts run

rsync -a rsync://mirror.trouble-free.net/admin /admin

Before use, you will need to run either

ln -s /admin/testing.repo /etc/yum.repos.d/testing.repo

or

cp /admin/testing.repo /etc/yum.repos.d/testing.repo

Then run

/admin/updatefromtesting

This repo is not enabled by default, so what is really happening is that yum is being called as yum --enablerepo=tf-testing update
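
If you want to confirm that the repo really is disabled by default after linking or copying the file, you can read the enabled value out of the repo file. This is a sketch under assumptions: the contents of /admin/testing.repo are not shown in this post, so I am assuming a standard yum repo file with a [tf-testing] section and an enabled= line. The function name repo_enabled is made up.

```shell
# repo_enabled FILE REPO_ID
# Prints the value of "enabled" inside the [REPO_ID] section of a yum .repo file.
# Prints nothing if the section or the key is missing.
repo_enabled() {
    awk -F= -v id="[$2]" '
        /^\[/ { in_section = ($0 == id) }          # entering a new section
        in_section && $1 ~ /^[ \t]*enabled[ \t]*$/ {
            gsub(/[ \t]/, "", $2)                  # drop spaces around the value
            print $2
            exit
        }
    ' "$1"
}
```

With the assumed layout, repo_enabled /etc/yum.repos.d/testing.repo tf-testing should print 0, which is why the update script has to pass the enablerepo option.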

The testing repo will stay around for a bit longer. If you are a current InterServer customer please contact support.

I have tested this update on multiple i386 and x86_64 systems and have deemed it stable. However, using the testing repo is not an official update from RedHat or CentOS.

The repo, including srpm, is at http://mirror.trouble-free.net/tf/testing/5.5/

Posted in Security | Leave a comment

CVE-2010-3847

There is a new Linux root exploit through glibc, CVE-2010-3847. This exploit can be used to gain root access by a “local user”. Of course, in the web hosting industry a local user can be an exploitable script, a customer, a PHP or CGI shell, and on and on. Affected are RHEL and CentOS 5.

No glibc update has been released yet by RedHat.

I have released a new admin script and a testing repo on the InterServer yum repo. The admin script is /admin/updatefromtesting and there are glibc updates for CentOS 5.

You can get this by running

/admin/upscripts

If you do not have the admin scripts run

rsync -a rsync://mirror.trouble-free.net/admin /admin

Before use, you will need to run either

ln -s /admin/testing.repo /etc/yum.repos.d/testing.repo

or

cp /admin/testing.repo /etc/yum.repos.d/testing.repo

Then run

/admin/updatefromtesting

This repo is not enabled by default, so what is really happening is that yum is being called as yum --enablerepo=tf-testing update

Future updates will not use this repo. In fact, I do not have plans on keeping the testing repo – we will see.

I expect the glibc update from RedHat to apply over the testing repo. However, this is glibc, so use at your own risk. If you are an InterServer customer, contact support for help with this update.

I have tested the update on multiple servers and have built it for i386 and x86_64.

The repo, including srpm, is at http://mirror.trouble-free.net/tf/testing/5.5/

Posted in Security | Leave a comment

Building a cheap and easy harddrive copying system

The problem: Harddrives seem to die on Friday at 4 PM.

Needed: A simple system to have anyone be able to easily copy a harddrive, even with bad sectors.

In the past I manually copied the data with a giant rsync command after creating all the partitions. I had tried dd, but with bad sectors there could be issues. dd_rescue wasn’t bad, but I was going for something even simpler.

The solution: FreeBSD recoverdisk (it comes with the default FreeBSD 8.0 install)

Recoverdisk works on a disk from any OS, skips bad sectors, can restart an interrupted copy, and even shows the percentage completed.

I also wanted something that let the drives be swapped out easily, even while the server was live. To do this, I used eSATA and an external drive dock with a motherboard that supports AHCI. I also used FreeBSD 8.0; I will not go into the process of installing that, however.

So to build this you will need:

1) an eSATA PCI card with two ports

2) an AHCI motherboard

3) a server running FreeBSD 8.0

4) an external SATA drive dock. I am using the StarTech dual 2.5/3.5 Drive eSATA/USB 2.0 SATA HD Docking Station (Part# SATASOCK22UE)

Some things to know before you begin. You could use the dock over USB, but I found that is much slower, and smartcheck will not work in USB mode. To do any hot swapping you need an AHCI motherboard. In the setup below the disks are named ad6 and ad8; these may be different on your setup, so be sure to double check and change them as needed. Finally, test, test, test, and label everything, because otherwise you could wipe the wrong drive. On my setup, the section for the old drive is clearly labeled (and called ad6) and the section for the new drive is labeled too (called ad8). I have also found that some drives with a lot of bad sectors get to 99.9XXX done and never complete the copy; you can end it there and the new drive will boot.
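
Alongside labelling, a size comparison is a cheap sanity check, since a replacement drive that is smaller than the failing one cannot hold a full copy. A minimal sketch; the function name fits_on_target is hypothetical and it only compares two byte counts that you supply.

```shell
# fits_on_target SRC_BYTES DST_BYTES
# Succeeds if the destination drive is at least as large as the source drive.
fits_on_target() {
    [ "$2" -ge "$1" ]
}
```

On FreeBSD the byte counts can come from diskinfo, where the third field of its one-line output should be the media size in bytes (check this on your system before relying on it), for example fits_on_target "$(diskinfo ad6 | awk '{print $3}')" "$(diskinfo ad8 | awk '{print $3}')" || echo "ad8 is smaller than ad6".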

Here is the process to set this up. Everything below assumes you will put the bad drive in /dev/ad6 and the good drive in /dev/ad8. Change this as needed.

1) Install FreeBSD 8

2) Use pkg_add to install

bash
e2fsprogs
gettext
libiconv
linuxfdisk
nano
ntfsprogs
psmisc
smartmontools

That’s simply issuing pkg_add -r bash, for example.
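
Rather than typing nine pkg_add commands, the list above can be fed through a loop. This is a sketch, not something from my original setup notes: the function name install_packages and the optional installer argument are made up so the loop can be dry-run with echo first.

```shell
# Packages from the list above.
PACKAGES="bash e2fsprogs gettext libiconv linuxfdisk nano ntfsprogs psmisc smartmontools"

# install_packages [INSTALLER]
# Runs INSTALLER (default: "pkg_add -r") once per package and reports failures
# on stderr without stopping, so one missing package does not block the rest.
install_packages() {
    installer=${1:-"pkg_add -r"}
    for pkg in $PACKAGES; do
        $installer "$pkg" || echo "failed to install $pkg" >&2
    done
}
```

Running install_packages echo prints the package names without installing anything; running install_packages with no argument calls pkg_add -r for each one.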

3) Set bash as the default shell with chsh -s /usr/local/bin/bash root

4) Edit /root/.bash_profile and add

if [ -e /etc/bashrc ]; then
source /etc/bashrc
fi

5) Edit /etc/bashrc and add

CLICOLOR="YES";    export CLICOLOR
LSCOLORS="ExGxFxdxCxDxDxhbadExEx";    export LSCOLORS

COLOR1="\[\033[1;35m\]" ### Bright Purple
COLOR2="\[\033[0;34m\]" ### Dark Blue
COLOR3="\[\033[0;35m\]" ### Purple (Magenta)
COLOR4="\[\033[0m\]"    ### Blank
COLOR5="\[\033[1;37m\]" ### Bright White
COLOR6="\[\033[1;31m\]" ### Red
COLOR7="\[\033[1;33m\]" ### Bright Yellow
COLOR8="\[\033[1;32m\]" ### Light Green
COLOR9="\[\033[0;36m\]" ### Cyan (Aqua)
COLOR10="\[\033[1;34m\]" ### Blue
TTY="$(tty | sed 's#/dev/##g')"
export PS1="$COLOR2/--$COLOR10($COLOR2\u$COLOR10@$COLOR2\h$COLOR4$COLOR10)$COLOR7:$COLOR5[$COLOR4\w$COLOR5]$COLOR2-$COLOR5[$COLOR4$TTY$COLOR5]$COLOR2-$COLOR10($COLOR4\$(date +%I:%M%P)$COLOR10)\n$COLOR2\-$COLOR10>$COLOR4 "

# START bash completion -- do not remove this line
bash=${BASH_VERSION%.*}; bmajor=${bash%.*}; bminor=${bash#*.}
if [ "$PS1" ] && [ $bmajor -eq 2 ] && [ $bminor '>' 04 ] \
&& [ -f /etc/bash_completion ]; then # interactive shell
# Source completion code
. /etc/bash_completion
fi

# freebsd
if [ -f /usr/local/etc/bash_completion ]; then
. /usr/local/etc/bash_completion
fi

unset bash bmajor bminor
# END bash completion -- do not remove this line

export VISUAL=nano

# commands

alias list="atacontrol list"
alias detach1="atacontrol detach ata3"
alias detach2="atacontrol detach ata4"
alias attach1="atacontrol detach ata3 && atacontrol attach ata3"
alias attach2="atacontrol detach ata4 && atacontrol attach ata4"
alias smart1="smartctl -s on /dev/ad6 >/dev/null 2>&1 ; smartctl -d ata -a /dev/ad6 2>&1 | more"
alias smart2="smartctl -s on /dev/ad8 >/dev/null 2>&1 ; smartctl -d ata -a /dev/ad8 2>&1 | more"
alias copy="recoverdisk -w /root/disk.txt /dev/ad6 /dev/ad8"
alias restartcopy="recoverdisk -r /root/disk.txt -w /root/disk.txt /dev/ad6 /dev/ad8"
alias part1="fdisk-linux -l /dev/ad6"
alias part2="fdisk-linux -l /dev/ad8"
alias badblocks1="/usr/local/sbin/badblocks -vv /dev/ad6"
alias badblocks2="/usr/local/sbin/badblocks -vv /dev/ad8"
alias copyhelp="cat /etc/motd | more"

# END BASHRC

6) Edit /etc/motd and add

Quick commands:

HD Recovery:
copy         – copy bad drive (1) to good drive (2)
restartcopy  – restart a copy that has failed (reboot / power failure etc)

HD Test:
smart1       – smartcheck bad drive (1)
smart2       – smartcheck good drive (2)
part1        – list partitions and drive size on bad drive (1)
part2        – list partitions and drive size on good drive (2)
badblocks1   – run a bad block test on bad drive (1)
badblocks2   – run a bad block test on good drive (2)

Start / Stop
attach1      – start drive 1 (bad)
attach2      – start drive 2 (good)
detach1      – stop drive 1 (bad)
detach2      – stop drive 2 (good)

Misc
list         – list all drives

Linux Tools
e2fsck       – fsck for linux

Windows Tools
ntfsfix      – fix for dirty windows filesystem

FreeBSD Tools
fsck         – fsck for freebsd

Mount
mount -t ext2fs – ext2/3 linux

List This Help
copyhelp

That’s it. Now you are done with the setup. Hopefully that was easy for you.

Ok, so how can you use this? Log into the server when both drives are in the proper location and run

copy

Here are some real-world examples:

Copy and restartcopy

/–(root@recover):[~]-[pts/0]-(03:20P)
\-> copy
Bigsize = 1048576, medsize = 32768, minsize = 512
start    size     block-len state          done     remaining    % done
3145728 1048576  500104716288     0       3145728  500104716288   0.00063
4194304 1048576 failed (Input/output error)
174063616 1048576  499933798400     0     173015040  499934846976   0.03460^C
Saving worklist … done.
/–(root@recover):[~]-[pts/0]-(03:20P)
\-> restartcopy
Bigsize = 1048576, medsize = 32768, minsize = 512
Reading worklist … done.
start    size     block-len state          done     remaining    % done
333447168 1048576  499774414848     0     332398592  499775463424   0.06647^C
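
The choice between copy and restartcopy in the session above comes down to whether a saved worklist exists, so it can be made automatically. A minimal sketch using only the recoverdisk options from the aliases; the function name copy_cmd is hypothetical, and it prints the command instead of running it so you can check the device names first.

```shell
# copy_cmd WORKLIST SRC DST
# Prints the recoverdisk command to run: resume from WORKLIST if it exists
# and is non-empty, otherwise start a fresh copy that saves to WORKLIST.
copy_cmd() {
    worklist=$1; src=$2; dst=$3
    if [ -s "$worklist" ]; then
        echo "recoverdisk -r $worklist -w $worklist $src $dst"
    else
        echo "recoverdisk -w $worklist $src $dst"
    fi
}
```

For example, copy_cmd /root/disk.txt /dev/ad6 /dev/ad8 prints the resume form once a worklist has been saved. One thing to watch: a worklist left over from a previous pair of drives would make this resume the wrong job, so delete /root/disk.txt after each finished copy.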

Attaching a drive:

/–(root@recover):[~]-[pts/0]-(03:19P)
\-> attach1
Master:  ad6 <SAMSUNG HD502HJ/1AJ10001> SATA revision 2.x
Slave:       no device present
/–(root@recover):[~]-[pts/0]-(03:19P)
\-> attach2
Master:  ad8 <WDC WD3200AAJS-00L7A0/01.03E01> SATA revision 2.x
Slave:       no device present
/–(root@recover):[~]-[pts/0]-(03:19P)
\-> list
ATA channel 0:
Master:      no device present
Slave:       no device present
ATA channel 1:
Master:      no device present
Slave:       no device present
ATA channel 2:
Master:      no device present
Slave:       no device present
ATA channel 3:
Master:  ad6 <SAMSUNG HD502HJ/1AJ10001> SATA revision 2.x
Slave:       no device present
ATA channel 4:
Master:  ad8 <WDC WD3200AAJS-00L7A0/01.03E01> SATA revision 2.x
Slave:       no device present
ATA channel 5:
Master:      no device present
Slave:       no device present
ATA channel 6:
Master:      no device present
Slave:       no device present

Detach a drive:
/–(root@recover):[~]-[pts/0]-(03:18P)
\-> detach1
/–(root@recover):[~]-[pts/0]-(03:19P)
\-> detach2
/–(root@recover):[~]-[pts/0]-(03:19P)
\-> list
ATA channel 0:
Master:      no device present
Slave:       no device present
ATA channel 1:
Master:      no device present
Slave:       no device present
ATA channel 2:
Master:      no device present
Slave:       no device present
ATA channel 3:
Master:      no device present
Slave:       no device present
ATA channel 4:
Master:      no device present
Slave:       no device present
ATA channel 5:
Master:      no device present
Slave:       no device present
ATA channel 6:
Master:      no device present

Now it is safe to remove it.

Show partitions (it also will show the drive model and size)

\-> part1

Disk /dev/ad6: 500.1 GB, 500107862016 bytes
255 heads, 63 sectors/track, 60801 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes

Device Boot    Start       End    Blocks   Id  System
/dev/ad6s1   *         1     60800 488375968+   7  HPFS/NTFS

As you can see in the help section there are other options as well. With any command that uses 1 or 2 at the end, 1 is assumed to be the bad drive and 2 is assumed to be the good drive.

Finally, here are some pictures of my setup:

Posted in Hardware | Leave a comment