Tags: xen

Building libvirt on CentOS5 (incomplete)

Originally published at The Pædantic Programmer. Please leave any comments there.

Much thanks to Brett for the pointers on rpm-fu.

http://grantmcwilliams.com/index.php?option=com_content&view=article&id=229:

$ sudo yum install \
xen-devel \
xhtml1-dtds \
hal-devel \
libpcap-devel \
cyrus-sasl-devel \
parted-devel \
numactl-devel \
avahi-devel \
slang-devel \
libvolume_id-devel \
openldap-devel

# device-mapper-devel \
# xmlrpc-c-devel \

for pkg in \
libssh2-devel-1.2.5-1.el5.pp.x86_64.rpm \
libssh2-1.2.5-1.el5.pp.x86_64.rpm \
libssh-0.2.1-0.2.svn193.el5.pp.x86_64.rpm \
libssh-devel-0.2.1-0.2.svn193.el5.pp.x86_64.rpm
do
wget "ftp://ftp.pbone.net/mirror/ftp.pramberger.at/systems/linux/contrib/rhel5/x86_64/$pkg"
sudo rpm -i "$pkg"
done

wget http://www.fateyev.com/RPMS/RHEL5/x86_64/xmlrpc-c-devel-1.14.8-1.el5.x86_64.rpm
wget http://www.fateyev.com/RPMS/RHEL5/x86_64/xmlrpc-c-1.14.8-1.el5.x86_64.rpm
sudo rpm -i xmlrpc-c-*1.14*el5*.rpm

wget ftp://ftp.icm.edu.pl/vol/rzm1/linux-fedora-secondary/development/source/SRPMS/corosync-0.95-2.fc11.src.rpm
alien -t corosync-0.95-2.fc11.src.rpm
mv corosync-0.95.tgz /tmp
cd ~/rpm/SOURCES
tar xfz /tmp/corosync-0.95.tgz
patch -p0    # reads the diff below from stdin

--- corosync.spec.orig  2010-05-26 19:17:15.000000000 +0000
+++ corosync.spec       2010-05-26 19:21:39.000000000 +0000
@@ -39,11 +39,7 @@
 fi
 %endif

-%{_configure}  CFLAGS="$(echo '%{optflags}')" \
-               --prefix=/usr \
-               --sysconfdir=/etc \
-               --localstatedir=/var \
-               --libdir=%{_libdir}
+%{configure}   CFLAGS="$(echo '%{optflags}')"

 %build
 make %{_smp_mflags}
@@ -52,8 +48,8 @@
 rm -rf %{buildroot}

 make install DESTDIR=%{buildroot}
-install -d %{buildroot}%{_initddir}
-install -m 755 init/redhat %{buildroot}%{_initddir}/corosync
+install -d %{buildroot}%{_sysconfdir}/init.d
+install -m 755 init/redhat %{buildroot}%{_sysconfdir}/init.d/corosync

 ## tree fixup
 # drop static libs
@@ -95,7 +91,7 @@
 %{_sbindir}/corosync-fplay
 %{_sbindir}/corosync-pload
 %config(noreplace) /etc/corosync.conf
-%{_initddir}/corosync
+%{_sysconfdir}/init.d/corosync
 %dir %{_libexecdir}/lcrso
 %{_libexecdir}/lcrso/coroparse.lcrso
 %{_libexecdir}/lcrso/objdb.lcrso

rpmbuild -bb corosync.spec
sudo rpm -i ~/rpm/RPMS/x86_64/corosynclib*.rpm

wget ftp://ftp.icm.edu.pl/vol/rzm1/linux-fedora-secondary/development/source/SRPMS/openais-0.94-1.fc11.src.rpm
alien -t openais-0.94-1.fc11.src.rpm
mv openais-0.94.tgz /tmp
cd ~/rpm/SOURCES
tar xfz /tmp/openais-0.94.tgz
rpmbuild -bb openais.spec
sudo rpm -i ~/rpm/RPMS/x86_64/openaislib-*.rpm

wget ftp://ftp.icm.edu.pl/vol/rzm1/linux-fedora-secondary/development/source/SRPMS/lvm2-2.02.45-4.fc11.src.rpm
alien -t lvm2-2.02.45-4.fc11.src.rpm
mv lvm2-2.02.45.tgz /tmp
cd ~/rpm/SOURCES
tar xfz /tmp/lvm2-2.02.45.tgz
rpmbuild -bb lvm2.spec
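
The three source-package builds above repeat the same steps, so they could be folded into one helper. This is only a sketch: the function names are hypothetical, and it assumes alien names its tarball name-version.tgz and that the spec file is called name.spec, as happened in the three cases above. The corosync build still needs its spec patched between the tar and rpmbuild steps, so the wrapper only fits openais and lvm2 as written.

```shell
# Tarball name that `alien -t` produced above for a given SRPM (assumed rule),
# e.g. corosync-0.95-2.fc11.src.rpm -> corosync-0.95.tgz
srpm_tarball() {
    nvr=${1%.src.rpm}                # corosync-0.95-2.fc11
    printf '%s.tgz\n' "${nvr%-*}"    # drop the "-release.dist" part
}

# Spec file name: the package name without version or release.
srpm_spec() {
    nvr=${1%.src.rpm}
    nv=${nvr%-*}                     # corosync-0.95
    printf '%s.spec\n' "${nv%-*}"    # corosync.spec
}

# Hypothetical wrapper around the manual steps; $1 is the full SRPM URL.
# The body runs in a subshell so the cd does not leak into the caller.
build_srpm() (
    url=$1
    srpm=${url##*/}
    wget "$url" &&
    alien -t "$srpm" &&
    mv "$(srpm_tarball "$srpm")" /tmp &&
    cd ~/rpm/SOURCES &&
    tar xfz "/tmp/$(srpm_tarball "$srpm")" &&
    rpmbuild -bb "$(srpm_spec "$srpm")"
)
```

Calling build_srpm with the openais or lvm2 URL from above would then run the whole sequence and stop at the first failing step.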

John Hodgman is using software that I helped to write

A recent blog post by The Hodg Man mentions that he uses (AND ENJOYS) a product I’ve helped to build. Yay.

***

FULL DISCLOSURE NUMBER TWO

FAITHFUL READERS OF THIS IMITATION BLOG know that, having crashed my own website repeatedly while linking to it Twitterphonically, I experimented with NEW INTERNET TECHNOLOGY to try to fix this problem.

SPECIFICALLY, my host, LiquidWeb, called and offered their cloud computing solution STORM ON DEMAND. I think this means that instead of being stored on a single regular computer, my whole website is instead stored on dozens of semi-mechanical, murderous black clouds on various uncharted islands.

Cloud Computing at Work

THIS EXPERIMENT WAS SUCCESSFUL. So, in the spirit of full disclosure,

I AM USING STORM ON DEMAND FOR A REDUCED PRICE, and unless it grabs me and pulls me in to the woods to murder me, I will continue to USE AND ENJOY IT.

***

AoE root for KVM guests

Intro

So. I’m trying to get familiar with libvirt and friends. To this end, I’ve set up a Lucid virtual machine booting from PXE into an initrd environment which does a pivot_root to an AoE block device.

The #virt channel on irc.oftc.net told me that in order to have libvirt provide PXE capability, I would have to install a recent version of libvirt. I built version 0.7.5-3 from sid on my karmic laptop and it seems to be working okay.

I decided to set up the PXE root directory in /var/lib/tftproot just because that’s what the example code had in it.

Configure the Virtual Network

I had to manually configure a virtual network. Here is the XML config file:

$ sudo virsh net-dumpxml netboot
<network>
  <name>netboot</name>
  <uuid>81ff0d90-c91e-6742-64da-4a736edb9a9b</uuid>
  <forward mode='nat'/>
  <bridge name='virbr1' stp='off' delay='1' />
  <domain name='example.com'/>
  <ip address='192.168.123.1' netmask='255.255.255.0'>
    <tftp root='/var/lib/tftproot' />
    <dhcp>
      <range start='192.168.123.2' end='192.168.123.254' />
      <bootp file='pxelinux.0' />
    </dhcp>
  </ip>
</network>
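
To load a definition like this into libvirt, something along these lines should work. It is a minimal sketch, assuming the XML has been saved to a file first; the function name and the VIRSH override variable are hypothetical and not part of libvirt, while net-define, net-start and net-autostart are regular virsh subcommands.

```shell
# VIRSH is a hypothetical override so the sequence can be dry-run (VIRSH=echo).
VIRSH=${VIRSH:-sudo virsh}

define_netboot() {
    # $1: path to the network XML file, $2: network name (netboot in this post)
    $VIRSH net-define "$1" &&
    $VIRSH net-start "$2" &&
    $VIRSH net-autostart "$2"
}
```

With the XML saved as, say, netboot.xml, the call would be define_netboot netboot.xml netboot.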

Install syslinux

This, of course, depends on the pxelinux.0 file. Luckily, it is packaged in syslinux, which can be installed and copied into place like so:


$ sudo apt-get install syslinux
$ sudo mkdir /var/lib/tftproot
$ sudo cp /usr/lib/syslinux/pxelinux.0 /var/lib/tftproot

Configure PXE boot parameters

I had to create a pxelinux config file for the virtual machine (indexed by MAC address). Note that I put a console=ttyS0,115200 argument on the kernel command line so that I can attach to the serial port from the host system for copy/paste debugging. Also of importance is the root=/dev/etherd/e0.1p1 argument, which specifies the block device we’ll eventually pivot_root to.


$ sudo mkdir /var/lib/tftproot/pxelinux.cfg/
$ cat /var/lib/tftproot/pxelinux.cfg/01-52-54-00-44-34-67
DEFAULT linux
LABEL linux
SAY Now booting the kernel from PXELINUX...
KERNEL vmlinuz-lucid0
APPEND ro root=/dev/etherd/e0.1p1 console=ttyS0,115200 initrd=initrd.img-lucid0
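
The file name is not arbitrary: PXELINUX looks for a per-client config named after the ARP hardware type (01 for Ethernet) followed by the MAC address in lowercase hex with dashes. A small sketch of that mapping, where pxe_cfg_name is a hypothetical helper name:

```shell
# Print the pxelinux.cfg file name PXELINUX will look up for an Ethernet MAC.
pxe_cfg_name() {
    # $1: MAC address such as 52:54:00:44:34:67
    # Lowercase the hex digits and turn colons into dashes, then prefix "01-".
    printf '01-%s\n' "$(printf '%s' "$1" | tr 'A-F:' 'a-f-')"
}
```

For the guest defined later in this post, pxe_cfg_name 52:54:00:44:34:67 gives the 01-52-54-00-44-34-67 name used above.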

I decided to use the karmic kernel for lucid initially. I’ll eventually switch over to the lucid kernel ;)


$ sudo cp /boot/vmlinuz-2.6.31-17-generic /var/lib/tftproot/vmlinuz-lucid0

Customize initramfs-tools

I copied /etc/initramfs-tools to ~/tmp/lucid so that I didn’t mess up the system initrd scripts:


$ mkdir -p ~/tmp/lucid && cp -r /etc/initramfs-tools ~/tmp/lucid/

Since mkinitramfs doesn’t currently have a system for AoE root, I had to do a bit of fiddling. I copied the NFS root boot script and made a couple of modifications.

$ diff -u /usr/share/initramfs-tools/scripts/nfs ~/tmp/lucid/initramfs-tools/scripts/aoe
--- /usr/share/initramfs-tools/scripts/nfs	2008-06-23 23:10:21.000000000 -0700
+++ /home/cjac/tmp/lucid/initramfs-tools/scripts/aoe	2010-01-15 14:56:28.098298027 -0800
@@ -5,59 +5,25 @@
 retry_nr=0

 # parse nfs bootargs and mount nfs
-do_nfsmount()
+do_aoemount()
 {
-
 	configure_networking

-	# get nfs root from dhcp
-	if [ "x${NFSROOT}" = "xauto" ]; then
-		# check if server ip is part of dhcp root-path
-		if [ "${ROOTPATH#*:}" = "${ROOTPATH}" ]; then
-			NFSROOT=${ROOTSERVER}:${ROOTPATH}
-		else
-			NFSROOT=${ROOTPATH}
-		fi
-
-	# nfsroot=[<server-ip>:]<root-dir>[,<nfs-options>]
-	elif [ -n "${NFSROOT}" ]; then
-		# nfs options are an optional arg
-		if [ "${NFSROOT#*,}" != "${NFSROOT}" ]; then
-			NFSOPTS="-o ${NFSROOT#*,}"
-		fi
-		NFSROOT=${NFSROOT%%,*}
-		if [ "${NFSROOT#*:}" = "$NFSROOT" ]; then
-			NFSROOT=${ROOTSERVER}:${NFSROOT}
-		fi
-	fi
+        ip link set up dev eth0

-	if [ -z "${NFSOPTS}" ]; then
-		NFSOPTS="-o retrans=10"
-	fi
+        ls /dev/etherd/

-	[ "$quiet" != "y" ] && log_begin_msg "Running /scripts/nfs-premount"
-	run_scripts /scripts/nfs-premount
-	[ "$quiet" != "y" ] && log_end_msg
+        echo > /dev/etherd/discover

-	if [ ${readonly} = y ]; then
-		roflag="-o ro"
-	else
-		roflag="-o rw"
-	fi
+        ls /dev/etherd/

-	nfsmount -o nolock ${roflag} ${NFSOPTS} ${NFSROOT} ${rootmnt}
+        mount ${ROOT} ${rootmnt}
 }

-# NFS root mounting
+# AoE root mounting
 mountroot()
 {
-	[ "$quiet" != "y" ] && log_begin_msg "Running /scripts/nfs-top"
-	run_scripts /scripts/nfs-top
-	[ "$quiet" != "y" ] && log_end_msg
-
-	modprobe nfs
-	# For DHCP
-	modprobe af_packet
+	modprobe aoe

 	# Default delay is around 180s
 	# FIXME: add usplash_write info
@@ -67,17 +33,13 @@
 		delay=${ROOTDELAY}
 	fi

-	# loop until nfsmount succeds
+	# loop until aoemount succeds
 	while [ ${retry_nr} -lt ${delay} ] && [ ! -e ${rootmnt}${init} ]; do
 		[ ${retry_nr} -gt 0 ] && \
-		[ "$quiet" != "y" ] && log_begin_msg "Retrying nfs mount"
-		do_nfsmount
+		[ "$quiet" != "y" ] && log_begin_msg "Retrying AoE mount"
+		do_aoemount
 		retry_nr=$(( ${retry_nr} + 1 ))
 		[ ! -e ${rootmnt}${init} ] && /bin/sleep 1
 		[ ${retry_nr} -gt 0 ] && [ "$quiet" != "y" ] && log_end_msg
 	done
-
-	[ "$quiet" != "y" ] && log_begin_msg "Running /scripts/nfs-bottom"
-	run_scripts /scripts/nfs-bottom
-	[ "$quiet" != "y" ] && log_end_msg
 }

(below is the full file in case udiff is less convenient)

$ cat ~/tmp/lucid/initramfs-tools/scripts/aoe
# NFS filesystem mounting			-*- shell-script -*-

# FIXME This needs error checking

retry_nr=0

# parse nfs bootargs and mount nfs
do_aoemount()
{
	configure_networking

        ip link set up dev eth0

        ls /dev/etherd/

        echo > /dev/etherd/discover

        ls /dev/etherd/

        mount ${ROOT} ${rootmnt}
}

# AoE root mounting
mountroot()
{
	modprobe aoe

	# Default delay is around 180s
	# FIXME: add usplash_write info
	if [ -z "${ROOTDELAY}" ]; then
		delay=180
	else
		delay=${ROOTDELAY}
	fi

	# loop until aoemount succeds
	while [ ${retry_nr} -lt ${delay} ] && [ ! -e ${rootmnt}${init} ]; do
		[ ${retry_nr} -gt 0 ] && \
		[ "$quiet" != "y" ] && log_begin_msg "Retrying AoE mount"
		do_aoemount
		retry_nr=$(( ${retry_nr} + 1 ))
		[ ! -e ${rootmnt}${init} ] && /bin/sleep 1
		[ ${retry_nr} -gt 0 ] && [ "$quiet" != "y" ] && log_end_msg
	done
}

There was also a small modification to the initramfs.conf file:

$ diff -u /etc/initramfs-tools/initramfs.conf ~/tmp/lucid/initramfs-tools/initramfs.conf
--- /etc/initramfs-tools/initramfs.conf	2008-07-08 18:37:42.000000000 -0700
+++ /home/cjac/tmp/lucid/initramfs-tools/initramfs.conf	2010-01-15 14:33:38.088295207 -0800
@@ -47,14 +47,16 @@
 #

 #
-# BOOT: [ local | nfs ]
+# BOOT: [ local | nfs | aoe]
 #
 # local - Boot off of local media (harddrive, USB stick).
 #
 # nfs - Boot using an NFS drive as the root of the drive.
 #
+# aoe - Boot using an AoE drive as the root of the drive.
+#

-BOOT=local
+BOOT=aoe

 #
 # DEVICE: ...

I also needed to add aoe to the list of modules included in the initramfs:


$ echo aoe >> ~/tmp/lucid/initramfs-tools/modules

In order to generate the initrd.img file from this new config, I ran the following:


$ sudo mkinitramfs -d ~/tmp/lucid/initramfs-tools/ -o /var/lib/tftproot/initrd.img-lucid0
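
Before booting from it, it can be worth checking that the aoe module really ended up in the image. This sketch assumes the initrd is a gzip-compressed cpio archive (what mkinitramfs produced for me); initrd_lists_module is a hypothetical helper name.

```shell
# Succeed if the initrd image lists a kernel module with the given name.
initrd_lists_module() {
    # $1: path to the initrd image, $2: module name without .ko (e.g. aoe)
    zcat "$1" | cpio -it 2>/dev/null | grep -q "/$2\.ko"
}
```

For example: initrd_lists_module /var/lib/tftproot/initrd.img-lucid0 aoe && echo "aoe is in the initrd"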

Install OS to virtual block device

I created a lucid VM by installing from the desktop install disk. You can grab the ISO here:

http://cdimage.ubuntu.com/daily-live/current/

I’ll leave the creation of the virtual machine and installation as an exercise for the reader. I put the filesystem on an LVM volume group called vg0 in a logical volume called lucid0 (i.e., /dev/vg0/lucid0).

Create virtual machine definition with virsh

At this point, I created a new virtual machine called lucid0. Here is the XML for the domain:

$ sudo virsh dumpxml lucid0
<domain type='kvm' id='1'>
  <name>lucid0</name>
  <uuid>96fbad21-4f25-5700-ddd8-1a565c7170ee</uuid>
  <memory>524288</memory>
  <currentMemory>524288</currentMemory>
  <vcpu>1</vcpu>
  <os>
    <type arch='x86_64' machine='pc-0.11'>hvm</type>
    <boot dev='network'/>
  </os>
  <features>
    <pae/>
  </features>
  <clock offset='localtime'/>
  <on_poweroff>destroy</on_poweroff>
  <on_reboot>restart</on_reboot>
  <on_crash>restart</on_crash>
  <devices>
    <emulator>/usr/bin/kvm</emulator>
    <interface type='network'>
      <mac address='52:54:00:44:34:67'/>
      <source network='netboot'/>
      <target dev='vnet0'/>
    </interface>
    <serial type='pty'>
      <source path='/dev/pts/4'/>
      <target port='0'/>
    </serial>
    <console type='pty' tty='/dev/pts/4'>
      <source path='/dev/pts/4'/>
      <target port='0'/>
    </console>
    <input type='tablet' bus='usb'/>
    <input type='mouse' bus='ps2'/>
    <graphics type='vnc' port='5901' autoport='yes' listen='127.0.0.1' keymap='en-us'/>
    <sound model='es1370'/>
    <video>
      <model type='cirrus' vram='9216' heads='1'/>
    </video>
  </devices>
</domain>

Start AoE target

Now we’re ready to start the AoE target and launch the virtual machine. If you don’t have vblade installed, do so now:


$ sudo apt-get install vblade

Start the target up with the following command:


$ sudo vbladed 0 1 virbr1 /dev/vg0/lucid0
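
The two numbers given to vbladed are the AoE shelf and slot, and they determine the device name the guest sees: shelf 0, slot 1 shows up as /dev/etherd/e0.1, and its first partition as the e0.1p1 used in the root= argument earlier. A sketch of that mapping, with aoe_dev as a hypothetical helper name:

```shell
# Map a vblade shelf/slot pair (and optional partition number) to the device
# node the aoe driver creates on the client side.
aoe_dev() {
    # $1: shelf, $2: slot, $3: partition number (optional)
    printf '/dev/etherd/e%s.%s%s\n' "$1" "$2" "${3:+p$3}"
}
```

So aoe_dev 0 1 1 prints /dev/etherd/e0.1p1, which has to match the root= value in the pxelinux config.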

Boot the virtual machine

Now, if all goes well, you should be able to watch the virtual machine boot up and do its thing like so:


$ sudo virsh start lucid0 && sudo screen -S lucid0 `sudo virsh ttyconsole lucid0` 115200

If you get errors about /dev/etherd/e0.1p1 not existing, which might look like this:

Begin: Retrying AoE mount ...
err         discover    interfaces  revalidate  flush
err         discover    interfaces  revalidate  flush
mount: mounting /dev/etherd/e0.1p1 on /root failed: No such file or directory
Done.

then you might want to try restarting vbladed like this:


$ sudo kill -9 `ps auwx | grep vblade | grep -v grep | awk '{print $2}' ` && sudo vbladed 0 1 virbr1 /dev/vg0/lucid0

Questions? Comments?

So. Now you should have a lucid gdm in your virt-manager console. Any questions? #virt on irc.oftc.net

Also, feel free to email me.

building unmodified_drivers

This is the gist of it:

$ cd /usr/src/
# $ sudo chmod a+rwx .
$ wget ftp://ftp.suse.com/pub/projects/kernel/kotd/SLE11_BRANCH/src/kernel-source-2.6.27.39-0.0.0.25.15a4c6f.src.rpm
$ alien -tg kernel-source-2.6.27.39-0.0.0.25.15a4c6f.src.rpm
$ cd kernel-source-2.6.27.39
$ tar xfj linux-2.6.27.tar.bz2
$ for f in patches.*.tar.bz2; do
    tar xfj "$f" || break
  done
$ for p in $(./guards x86_64 < series.conf); do
    patch -d linux-2.6.27 -p1 < "$p" || break
  done
$ cd linux-2.6.27
$ fakeroot make-kpkg debian
$ fakeroot make-kpkg build
$ sudo make install modules_install
$ cd /usr/src
$ hg clone http://xenbits.xen.org/xen-unstable.hg
$ cd xen-unstable.hg/unmodified_drivers/linux-2.6
$ XEN=/usr/src/xen-unstable.hg/xen XL=/usr/src/kernel-source-2.6.27.39/linux-2.6.27 ./mkbuildtree x86_64
$ make -C /usr/src/kernel-source-2.6.27.39/linux-2.6.27 M=$PWD modules
$ sudo make -C /usr/src/kernel-source-2.6.27.39/linux-2.6.27 M=$PWD modules_install

Xen PV network driver

I haven’t used Xen HVM until recently. When I was at Amazon and
hanging with the Xen provisioning folks, I recall complaints about the
performance of network drivers on HVM instances. I’ve recently
discovered that this was due to the use of the ioemu virtual interface
(vif) system. In paravirtualized environments, Xen vif devices are
more efficient because the guest kernel can talk to the hypervisor to
schedule I/O. In hardware virtualized environments, the hypervisor
emulates the back end of a common network driver and the guest uses it
as if it is the real thing.

The Xen team (or was it Red Hat?) patched the 2.6.18 kernel such that
HVM guests got the best of both worlds. The guest ran in an HVM
environment, but also had a way of talking with the hypervisor by way
of the xen-vnif kernel driver
(kernel/drivers/xenpv_hvm/netfront/xen-vnif.ko).

I may not be looking in the right places, but I can’t seem to find
such a paravirtualized network driver for any kernels more recent than
2.6.18. This has recently become a bit of an issue, since karmic (and
probably sid) depends on udev 145, which depends on signalfd, and thus
won’t run on kernels before (something like) 2.6.25.

Unless I miss my guess (which is quite possible =]), this means one of
the following needs to happen:

  • signalfd needs to be back-ported to the 2.6.18 xen kernel
  • xen-vnif needs to be forward-ported to >= 2.6.25

Has this already happened? Am I missing something?
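
To see which side of that line a given guest kernel falls on, a version comparison along these lines could be used. It is only a sketch: kernel_at_least is a hypothetical helper, it relies on sort -V from GNU coreutils, and the 2.6.25 threshold is just my rough guess from above, not a confirmed number.

```shell
# Succeed if kernel version $1 is at least version $2.
kernel_at_least() {
    # Version-sort the two strings; if the required version sorts first
    # (or both are equal), the kernel is new enough.
    [ "$(printf '%s\n%s\n' "$2" "$1" | sort -V | head -n 1)" = "$2" ]
}
```

For example: kernel_at_least "$(uname -r)" 2.6.25 || echo "kernel probably too old for udev 145"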