Send a newer PXE kernel version to a new computer node on Rocks Cluster 6.2 without altering the frontend

The issue

You have a Rocks Cluster deployed at your university. The latest official Rocks version is 6.2, and it ships with kernel 2.6.32 out of the box. This is the same kernel that is installed on every single computer node, and it is the very same one that is sent via PXE to the computer nodes in order to boot up the installation process. After a long while performing computer simulations, you have bought new computer nodes to be added to the cluster. It comes as no surprise when, once you have started the installation process via PXE, a kernel panic arises: unsupported CPU family. Sometimes it’s not a kernel panic; there is simply no support for your newer hardware, and anaconda stops the kickstart process altogether, waiting for you to provide the installer with some additional drivers. Crap.

It’s not the end of the world

So you go through the Rocks documentation and find how to compile and install a custom GNU/Linux Kernel for your computer nodes. Nice:

http://central6.rocksclusters.org/roll-documentation/base/6.1.1/customization-kernel.html

But this does not update the kernel and initramfs sent via PXE, so the node is not going to be able to start the installation process anyway. Reading on, you find that in order to update the PXE kernel and initramfs, you have to install this new kernel on the frontend, and from there rocks-boot will take care of updating the /tftpboot/pxelinux/vmlinuz-6.1.1-x86_64 and /tftpboot/pxelinux/initrd.img-6.1.1-x86_64 files (these are the names for Rocks 6.1.1; on a 6.2 frontend the files carry 6.2 in their names, as in the commands further down). But hey, wait a minute. What if you do not want to install a new kernel on the frontend? Fret not: I came up with a safe solution.

First step: Set up the New Installation Kernel.

Clone the Rocks kernel repository from GitHub first:

git clone https://github.com/rocksclusters/kernel.git

Then, download the kernel you need, extract it, and configure it. You can copy the config file for the running kernel and then use “make olddefconfig” to speed things up. If you need some particular driver or kernel option, make sure it is set before going further (you can perform make olddefconfig first followed by make menuconfig). For illustrative purposes, we will be using kernel 3.18.68 here:

cd kernel/src/kernel.org
wget http://www.kernel.org/pub/linux/kernel/v3.0/linux-3.18.68.tar.gz
tar xvfz linux-3.18.68.tar.gz
cd linux-3.18.68
cp /boot/config-2.6.32-504.16.2.el6.x86_64 .config
make olddefconfig
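Before you move on, it can help to confirm that the drivers you care about actually ended up enabled in the resulting .config. The following is only a minimal sketch: the function name check_kernel_opts is made up for this example, and the option names in the usage line are just the network drivers mentioned later in this post, so substitute whatever your hardware needs.

```shell
#!/bin/sh
# Sketch: report whether selected options are enabled in a kernel .config.
# check_kernel_opts is a hypothetical helper, not part of Rocks or the kernel tree.
check_kernel_opts() {
    config_file=$1
    shift
    missing=0
    for opt in "$@"; do
        # An option counts as enabled if it is built in (=y) or built as a module (=m).
        if grep -Eq "^${opt}=(y|m)$" "$config_file"; then
            echo "ok       $opt"
        else
            echo "MISSING  $opt"
            missing=1
        fi
    done
    # Non-zero exit status when at least one option is not enabled.
    return "$missing"
}

# Example, run from inside the linux-3.18.68 source directory:
#   check_kernel_opts .config CONFIG_R8169 CONFIG_E1000 CONFIG_E1000E
```

If something shows up as MISSING, enable it with make menuconfig and run the check again before copying the config file away.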

Once this is done, copy this new kernel config file to the parent directory, delete the entire linux-3.18.68 source directory and change the VERSION directive in the version.mk file:

cp .config ../config-3.18.68
cd ..
rm -rf linux-3.18.68
vi version.mk …

The new version.mk file should look like this:

NAME = kernel
RELEASE = 1

VERSION = 3.18.68
PAE = 0
XEN = 0
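If you would rather script this change than open an editor, something along these lines should do. It is a sketch that assumes version.mk contains a single line starting with VERSION, as in the listing above; set_kernel_version is a name invented for this example.

```shell
#!/bin/sh
# Sketch: set the VERSION directive in version.mk without opening an editor.
# set_kernel_version is a hypothetical helper; it assumes one "VERSION = ..." line.
set_kernel_version() {
    mk_file=$1
    new_version=$2
    # Rewrite only the VERSION line; every other directive is left untouched.
    sed "s/^VERSION[[:space:]]*=.*/VERSION = ${new_version}/" "$mk_file" > "$mk_file.new" &&
        mv "$mk_file.new" "$mk_file"
}

# Example:
#   set_kernel_version version.mk 3.18.68
```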

Now the tricky part. In the past, CentOS 5 and earlier versions shipped with kudzu and mkinitrd. But those days are long gone: kudzu no longer exists and mkinitrd has been replaced with dracut. Therefore, you must change this in the kernel.spec.in file; edit it and remove these lines:

Requires: mkinitrd
Requires: kudzu

Add this line to the file:

Requires: dracut

The %post section will ensure that, during the post-installation of the new kernel image, initramfs is generated. On the computer node there is no such executable as /sbin/new-kernel-pkg, so the following command won’t be executed (this is not even mentioned in the official Rocks documentation!):

%post

[ -x /sbin/new-kernel-pkg ] && /sbin/new-kernel-pkg --package %{name} \
--mkinitrd --dracut --depmod --install %{kernelversion}

If initramfs does not get generated, the computer node will panic and won’t boot up. Remove these two lines in kernel.spec.in and write this one instead:

/sbin/dracut -f /boot/initramfs-%{kernelversion}.img %{kernelversion}
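The edits to kernel.spec.in can also be applied with sed, which is handy if you rebuild the package more than once. Take this as a rough sketch only: patch_kernel_spec is a made-up name, and the patterns assume the spec file contains the Requires lines and the two-line new-kernel-pkg command exactly as quoted in this post. Check the result by eye before building.

```shell
#!/bin/sh
# Sketch: apply the kernel.spec.in changes described above.
# patch_kernel_spec is a hypothetical helper; the patterns assume the file
# matches the lines quoted in this post (the command spans exactly two lines).
patch_kernel_spec() {
    spec=$1
    sed -e '/^Requires:[[:space:]]*kudzu/d' \
        -e 's/^Requires:[[:space:]]*mkinitrd.*/Requires: dracut/' \
        -e '/new-kernel-pkg/{' \
        -e 'N' \
        -e 's|^.*\n.*$|/sbin/dracut -f /boot/initramfs-%{kernelversion}.img %{kernelversion}|' \
        -e '}' \
        "$spec" > "$spec.new" && mv "$spec.new" "$spec"
}

# Example, run from the directory that holds kernel.spec.in:
#   cp kernel.spec.in kernel.spec.in.orig
#   patch_kernel_spec kernel.spec.in
#   diff -u kernel.spec.in.orig kernel.spec.in
```

The kudzu requirement is dropped, the mkinitrd requirement becomes a dracut one, and the new-kernel-pkg line plus its continuation line are replaced by the unconditional dracut call.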

Now you can build the rpm packages:

make rpm

Once the rpms are built, copy the resulting packages to the rocks install directory and re-generate the distro:

cp ../../RPMS/<arch>/kernel*rpm /export/rocks/install/contrib/6.2/<arch>/RPMS/
cd /export/rocks/install
rocks create distro

This first step is complete now. This kernel will be installed by Anaconda on the computer nodes from now on (on the new ones and on the old ones whenever they are re-installed).

Second step: Set up the New PXE Kernel

Extract the contents of the kernel-<VERSION> package you have just built into a new directory. You need the vmlinuz image and the new modules before constructing the new initramfs for the PXE installation:

mkdir kernel.binary
cd kernel.binary
rpm2cpio RPMS/<arch>/kernel-<VERSION>.rpm | cpio --extract --make-directories --verbose
cp boot/vmlinuz-<VERSION> /tftpboot/pxelinux/vmlinuz-<VERSION>
rsync -avtHDl lib/modules/* /lib/modules

Now you have to construct the new initramfs for this new kernel using dracut. Don’t forget to add as many network drivers as needed (for example, for old and new nodes):

depmod -ae <VERSION>
dracut --add-drivers "r8169.ko e1000.ko e1000e.ko" -f /tmp/initramfs <VERSION>
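It is worth checking that the drivers really made it into the image before you go any further. Here is a small sketch for that. It assumes the image is a gzip-compressed cpio archive (which is how it is unpacked later in this post), and initramfs_missing_drivers is a name invented for this example.

```shell
#!/bin/sh
# Sketch: list which of the given drivers are absent from an initramfs image.
# initramfs_missing_drivers is a hypothetical helper; it assumes a gzip + cpio image.
initramfs_missing_drivers() {
    image=$1
    shift
    # Table of contents of the archive, one path per line.
    listing=$(gzip -dc "$image" | cpio -t 2>/dev/null)
    status=0
    for drv in "$@"; do
        # Match the module file name exactly, so e1000 does not match e1000e.ko.
        if ! printf '%s\n' "$listing" | grep -q "/${drv}\.ko"; then
            echo "not in image: $drv"
            status=1
        fi
    done
    return "$status"
}

# Example:
#   initramfs_missing_drivers /tmp/initramfs r8169 e1000 e1000e
```

No output and a zero exit status mean every driver you listed is present.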

Now the trickiest part of all: you need to modify the provided Rocks initramfs in order to add the new modules. First, create a new directory and inflate the contents of the original initramfs:

mkdir initramo
cd initramo
cp /tftpboot/pxelinux/initrd.img-6.2-x86_64 initrd.img-6.2-x86_64.lzma
unlzma initrd.img-6.2-x86_64.lzma
cat initrd.img-6.2-x86_64|cpio -i

Remove the old modules because you do not need them; don’t forget to delete the initrd file itself too:

rm -rf lib/modules/2.6.32-504.16.2.el6.x86_64
rm -rf initrd.img-6.2-x86_64

Now inflate the contents of the new initramfs previously generated with dracut. Do this in a sibling directory, not inside initramo:

cd ..
mkdir initramn
cd initramn
gzip -dc /tmp/initramfs | cpio -i

Copy all the new modules to the original initramfs directory:

rsync -avtHDl lib/modules/<VERSION> ../initramo/lib/modules/

Get back to the original initramfs directory and generate the final initramfs file; save it to /tftpboot/pxelinux:

cd ../initramo
find . | cpio --create --format='newc' > /tmp/newinitrd
gzip /tmp/newinitrd
cp /tmp/newinitrd.gz /tftpboot/pxelinux/initramfs-<VERSION>.img

Set the following permissions to both the kernel and the initramfs image:

chmod 755 /tftpboot/pxelinux/vmlinuz-<VERSION>
chmod 755 /tftpboot/pxelinux/initramfs-<VERSION>.img
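A quick sanity check at this point can save you a failed PXE boot later. The sketch below only verifies that both files exist, are not empty and are world-readable; check_pxe_files is a name made up for this example, and passing the directory as an argument is just a convenience.

```shell
#!/bin/sh
# Sketch: verify that the PXE kernel and initramfs are in place and world-readable.
# check_pxe_files is a hypothetical helper, not a Rocks command.
check_pxe_files() {
    dir=$1
    version=$2
    for f in "$dir/vmlinuz-$version" "$dir/initramfs-$version.img"; do
        # -s: the file exists and is not empty.
        # find -perm -004: the "others" read bit is set, so any daemon user can read it.
        if [ -s "$f" ] && [ -n "$(find "$f" -prune -perm -004 2>/dev/null)" ]; then
            echo "ok: $f"
        else
            echo "problem: $f"
            return 1
        fi
    done
}

# Example, with 3.18.68 standing in for <VERSION>:
#   check_pxe_files /tftpboot/pxelinux 3.18.68
```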

Done; the second step is now complete.

Third step: send the new kernel via PXE to the nodes

Finally, in order to use this new kernel and initramfs via PXE, use the rocks command. First, add a new bootaction making sure to set the “kernel” and “ramdisk” parameters to the ones just generated:

rocks add bootaction action="install newkernel" kernel="vmlinuz-<VERSION>" ramdisk="initramfs-<VERSION>.img" args="ks ramdisk_size=150000 lang= devfs=nomount pxe kssendmac selinux=0 noipv6 ksdevice=bootif"

Next, set this new action as the install action for your particular subset of newer computer nodes:

rocks set host installaction <NODE> action="install newkernel"

Now, make sure that during the next PXE boot, the desired node or nodes will be re-installed:

rocks set host boot <NODE> action=install
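If you have more than a couple of new nodes, you can wrap the last two commands in a loop. What follows is just a sketch: retarget_nodes and run_rocks are names invented here, the node names in the usage line are placeholders, and it assumes the usual Rocks argument order (command words first, then the host, then the parameters). By default it only prints what it would run; set DRY_RUN=0 to execute the commands on the frontend.

```shell
#!/bin/sh
# Sketch: point several nodes at the new install action and flag them for re-install.
# retarget_nodes and run_rocks are hypothetical helpers; node names are placeholders.
NEW_ACTION="install newkernel"

run_rocks() {
    if [ "${DRY_RUN:-1}" = 1 ]; then
        # Dry run (the default): show the command instead of running it.
        echo "rocks $*"
    else
        rocks "$@"
    fi
}

retarget_nodes() {
    for node in "$@"; do
        run_rocks set host installaction "$node" action="$NEW_ACTION"
        run_rocks set host boot "$node" action=install
    done
}

# Example:
#   retarget_nodes compute-1-0 compute-1-1             # prints the commands
#   DRY_RUN=0 retarget_nodes compute-1-0 compute-1-1   # really runs them
```

Afterwards, rocks list bootaction and rocks list host boot let you confirm that the new action exists and that the nodes are set to install.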

Conclusions

Now you can build a new kernel, deploy it on your computer nodes and update the PXE kernel and initramfs without installing the new kernel on the frontend or dealing with the rocks-boot package. This way you can even send different kernels via PXE to different computer nodes, should you need to. Besides, even if the default PXE kernel does allow the new nodes to be installed correctly, you won’t be able to boot them with a newer kernel unless you work around the /sbin/new-kernel-pkg issue described above. For old computer nodes already installed, this should work out of the box: a new kernel is not going to break the nodes. Still, you can re-install one old node first to check that it keeps working fine with the new kernel. If the default PXE kernel does allow the old node to be installed, it’s perfectly safe to install it using this default PXE kernel and then boot it up with the new kernel package.