Send a newer PXE kernel version to a new computer node on Rocks Cluster 6.2 without altering the frontend

The issue

You have a Rocks Cluster deployed at your university. The latest official Rocks version is 6.2, which ships with kernel 2.6.32 out of the box. This is the kernel installed on every single computer node, and it is the very same one that is sent via PXE to the computer nodes to boot the installation process. After a long while performing computer simulations, you have bought new computer nodes to add to the cluster. It comes as no surprise that, once you start the installation process via PXE, a kernel panic arises: unsupported CPU family. Sometimes it is not a kernel panic; the kernel simply lacks support for your newer hardware, and anaconda stops the kickstart process altogether, waiting for you to provide the installer with additional drivers. Crap.

It’s not the end of the world

So you go through the Rocks documentation and find how to compile and install a custom GNU/Linux Kernel for your computer nodes. Nice:

http://central6.rocksclusters.org/roll-documentation/base/6.1.1/customization-kernel.html

But this does not update the kernel and initramfs sent via PXE, so the node will not be able to start the installation process anyway. Reading on, you find that in order to update the PXE kernel and initramfs, you have to install the new kernel on the frontend, and from there rocks-boot will take care of updating the /tftpboot/pxelinux/vmlinuz-6.1.1-x86_64 and /tftpboot/pxelinux/initrd.img-6.1.1-x86_64 files. But hey, wait a minute. What if you do not want to install a new kernel on the frontend? Fret not: I came up with a safe solution.

First step: Set up the New Installation Kernel.

Clone the Rocks kernel repository from GitHub first:

git clone https://github.com/rocksclusters/kernel.git

Then, download the kernel you need, extract it, and configure it. You can copy the config file for the running kernel and then use “make olddefconfig” to speed things up. If you need some particular driver or kernel option, make sure it is set before going further (you can perform make olddefconfig first followed by make menuconfig). For illustrative purposes, we will be using kernel 3.18.68 here:

cd kernel/src/kernel.org
wget http://www.kernel.org/pub/linux/kernel/v3.0/linux-3.18.68.tar.gz
tar xvfz linux-3.18.68.tar.gz
cd linux-3.18.68
cp /boot/config-2.6.32-504.16.2.el6.x86_64 .config
make olddefconfig

Once this is done, copy this new kernel config file to the parent directory, delete the entire linux-3.18.68 source directory and change the VERSION directive in the version.mk file:

cp .config ../config-3.18.68
cd ..
rm -rf linux-3.18.68
vi version.mk …

The new version.mk file should look like this:

NAME = kernel
RELEASE = 1

VERSION = 3.18.68
PAE = 0
XEN = 0
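
If you rebuild for several kernel versions, the edit to version.mk can be scripted. This is only a minimal sketch: the function name set_kernel_version is made up for this example, and it assumes GNU sed and a version.mk with a single "VERSION = ..." line like the one shown above.

```shell
# Hypothetical helper (not part of the Rocks kernel repository):
# rewrite the VERSION line of a version.mk file in place.
# Assumes GNU sed (-i) and exactly one "VERSION = ..." line.
set_kernel_version() {
    local file="$1" version="$2"
    sed -i "s/^VERSION[[:space:]]*=.*/VERSION = ${version}/" "$file"
}
```

Calling set_kernel_version version.mk 3.18.68 from kernel/src/kernel.org should leave every other line of the file untouched.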

Now the tricky part. In the past, CentOS 5 and earlier versions shipped with kudzu and mkinitrd. Those days are long gone: kudzu no longer exists and mkinitrd has been replaced with dracut. Therefore, you must change this in the kernel.spec.in file; edit it and remove these lines:

Requires: mkinitrd
Requires: kudzu

Add this line to the file:

Requires: dracut
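
The same change to the dependencies can be applied with sed instead of by hand. Take this as a sketch under assumptions: patch_spec_requires is a name invented here, GNU sed is required, and the two Requires lines are assumed to appear in kernel.spec.in exactly as quoted above (no version constraints after the package name).

```shell
# Hypothetical helper: drop the kudzu dependency and turn the
# mkinitrd dependency into a dracut one, editing the spec in place.
# Assumes the lines read exactly "Requires: kudzu" / "Requires: mkinitrd".
patch_spec_requires() {
    local spec="$1"
    sed -i \
        -e '/^Requires:[[:space:]]*kudzu[[:space:]]*$/d' \
        -e 's/^Requires:[[:space:]]*mkinitrd[[:space:]]*$/Requires: dracut/' \
        "$spec"
}
```

Work on a copy of kernel.spec.in first and diff the result before building.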

The %post section is meant to ensure that initramfs is generated during the post-installation of the new kernel image. On the computer node, however, there is no such executable as /sbin/new-kernel-pkg, so the following command won’t be executed (this is not even mentioned in the official Rocks documentation!):

%post

[ -x /sbin/new-kernel-pkg ] && /sbin/new-kernel-pkg --package %{name} \
--mkinitrd --dracut --depmod --install %{kernelversion}

If initramfs does not get generated, the computer node will panic and won’t boot up. Remove these two lines in kernel.spec.in and write this one instead:

/sbin/dracut -f /boot/initramfs-%{kernelversion}.img %{kernelversion}

Now you can build the rpm packages:

make rpm

Once the rpms are built, copy the resulting packages to the rocks install directory and re-generate the distro:

cp ../../RPMS/<arch>/kernel*rpm /export/rocks/install/contrib/6.2/<arch>/RPMS/
cd /export/rocks/install
rocks create distro

This first step is complete now. This kernel will be installed by Anaconda on the computer nodes from now on (on the new ones and on the old ones whenever they are re-installed).

Second step: Set up the New PXE Kernel

Extract the contents of the kernel-<VERSION> package built in the first step into a new directory. You need the vmlinuz image and the new modules before constructing the new initramfs for the PXE installation:

mkdir kernel.binary
cd kernel.binary
rpm2cpio RPMS/<arch>/kernel-<version>.rpm | cpio --extract --make-directories --verbose
cp boot/vmlinuz-<VERSION> /tftpboot/pxelinux/vmlinuz-<VERSION>
rsync -avtHDl lib/modules/* /lib/modules

Now you have to construct the new initramfs for this new kernel using dracut. Don’t forget to add as many network drivers as needed (for example, for old and new nodes):

depmod -ae <VERSION>
dracut --add-drivers "r8169.ko e1000.ko e1000e.ko" -f /tmp/initramfs <VERSION>

Now the trickiest part of all: you need to modify the provided Rocks initramfs in order to add the new modules. First, create a new directory and inflate the contents of the original initramfs:

mkdir initramo
cd initramo
cp /tftpboot/pxelinux/initrd.img-6.2-x86_64 initrd.img-6.2-x86_64.lzma
unlzma initrd.img-6.2-x86_64.lzma
cat initrd.img-6.2-x86_64|cpio -i
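
Note that the two images handled in this step are compressed differently: the Rocks initrd is unpacked with unlzma, while the one produced by dracut is unpacked with gzip. If you are unsure what a given image is, you can peek at its first bytes before inflating it. This is a rough sketch, not anything Rocks provides: initrd_format is a made-up name, and the LZMA check is only a heuristic, since that format has no real magic number.

```shell
# Hypothetical helper: guess how an initrd image is packed from its
# first six bytes. Uses only od and tr from coreutils.
initrd_format() {
    local magic
    # First six bytes of the file as one lowercase hex string.
    magic=$(od -An -tx1 -N6 "$1" | tr -d ' \n')
    case $magic in
        1f8b*)        echo gzip ;;
        fd377a585a00) echo xz ;;
        5d0000*)      echo lzma ;;     # heuristic: usual LZMA-alone header
        303730373031) echo cpio ;;     # ASCII "070701": uncompressed newc
        *)            echo unknown ;;
    esac
}
```

For example, initrd_format /tftpboot/pxelinux/initrd.img-6.2-x86_64 would be expected to print lzma, given that unlzma is what unpacks it here.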

Remove the old modules because you do not need them; don’t forget to delete the initrd file itself too:

rm -rf lib/modules/2.6.32-504.16.2.el6.x86_64
rm -rf initrd.img-6.2-x86_64

Now inflate the contents of the new initramfs previously generated with dracut. Create its directory next to the original one, not inside it:

cd ..
mkdir initramn
cd initramn
gzip -dc /tmp/initramfs|cpio -i

Copy all the new modules to the original initramfs directory:

rsync -avtHDl lib/modules/<VERSION> ../initramo/lib/modules/

Get back to the original initramfs directory and generate the final initramfs file; save it to /tftpboot/pxelinux:

cd ../initramo
find . | cpio --create --format='newc' > /tmp/newinitrd
gzip /tmp/newinitrd
cp /tmp/newinitrd.gz /tftpboot/pxelinux/initramfs-<VERSION>.img

Set the following permissions to both the kernel and the initramfs image:

chmod 755 /tftpboot/pxelinux/vmlinuz-<VERSION>
chmod 755 /tftpboot/pxelinux/initramfs-<VERSION>.img

Done; the second step is now complete.

Third step: send the new kernel via PXE to the nodes

Finally, in order to use this new kernel and initramfs via PXE, use the rocks command. First, add a new bootaction making sure to set the “kernel” and “ramdisk” parameters to the ones just generated:

rocks add bootaction action="install newkernel" kernel="vmlinuz-<VERSION>" ramdisk="initramfs-<VERSION>.img" args="ks ramdisk_size=150000 lang= devfs=nomount pxe kssendmac selinux=0 noipv6 ksdevice=bootif"

Next, set this new action as the install action for your particular subset of newer computer nodes:

rocks set host installaction <NODE> action="install newkernel"

Now, make sure that during the next PXE boot, the desired node or nodes will be re-installed:

rocks set host boot <NODE> action=install
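
When several new nodes need the same treatment, it is handy to generate the commands first and review them before touching the Rocks database. The following is just a sketch: print_reinstall_commands is a made-up name, the host names in the usage note are placeholders, and it assumes the "rocks set host installaction <host> action=..." argument order. It only prints text and runs nothing.

```shell
# Hypothetical helper: print (do not execute) the two rocks commands
# needed per node to reinstall it with the "install newkernel" action.
print_reinstall_commands() {
    local node
    for node in "$@"; do
        printf 'rocks set host installaction %s action="install newkernel"\n' "$node"
        printf 'rocks set host boot %s action=install\n' "$node"
    done
}
```

Running print_reinstall_commands compute-0-10 compute-0-11 prints four lines; once you are happy with them, pipe the output to sh on the frontend.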

Conclusions

Now you can build a new kernel, deploy it on your computer nodes and update the PXE kernel and initramfs without installing the new kernel on the frontend or dealing with the rocks-boot package. This way you can even send different kernels via PXE to different computer nodes, in case you need it. Besides, even if the default PXE kernel does allow the new nodes to be correctly installed, you won’t be able to boot up the new nodes with a newer kernel because of the bug with /sbin/new-kernel-pkg, described above. For old computer nodes already installed, this should work out-of-the-box: a new kernel is not going to break the nodes. Anyway, you can try an old node to see if it is still working fine after installing it with the new kernel. If the default PXE kernel does allow the old node to be installed, it’s perfectly safe to install it using this default PXE kernel and then booting it up with the new kernel package.

Installing and using RAVADAVDI on Debian Jessie

Introducing RAVADAVDI

This is an incredible effort to bring the power of VDI to you using only open source libraries and tools. It is mainly developed in Perl 5, and it relies on KVM for virtualization and on Spice for sound, I/O and graphics. Of course, there are still a lot of things to polish, such as its security: it does not implement any sort of encryption save for TLS on the ravadavdi web framework part (by means of Apache, for example). The way the user authenticates against the Spice remote server is poor: only a 4-character pseudo-random password, easily brute-forced. But this is a start, and a hell of a start indeed! I’ve been testing this whole framework myself and I have to admit that I am impressed. Thus, I was a bit disappointed to discover that it did not work out of the box on Debian systems. The installation is really easy, but only if you are deploying this solution on Ubuntu-based distros. A pity. So I decided to have a look into it and make it work on Debian Jessie.

First Issue: MySQL version < 5.6

You can follow the instructions and install the framework on your Debian Jessie box up until the “Ravada Web User” section (http://ravada.readthedocs.io/en/latest/docs/INSTALL.html). Debian Jessie ships with MySQL 5.5, and according to the RAVADAVDI documentation, MySQL 5.6 is required. In fact, apart from one column default, it does not depend on anything specific to MySQL 5.6, so you can still use this framework without the pain of updating your MySQL version. MySQL 5.5 does not accept a non-constant DEFAULT such as now() for DATETIME fields (that support arrived in MySQL 5.6.5); therefore, if you try to add a new user using the rvd_back perl script, you will get an error:

rvd_back --add-user lud.test
INFO: creating table messages
DBD::mysql::db do failed: Invalid default value for 'date_send' at /usr/share/perl5/Ravada.pm line 276.

You need to change the “date_send” field in the “messages” table from type DATETIME to TIMESTAMP. TIMESTAMP fields in MySQL 5.5 do accept such a DEFAULT value:

CREATE TABLE `messages` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`id_user` int(11) NOT NULL,
`id_request` int(11),
`subject` varchar(120) DEFAULT NULL,
`message` text,
`date_send` timestamp default now(),
`date_shown` datetime,
`date_read` datetime,
PRIMARY KEY (`id`),
KEY `id_user` (`id_user`)
);

Use your favourite ASCII editor and make this small alteration in the file /usr/share/doc/ravada/sql/mysql/messages.sql.
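
If you would rather not edit the file by hand, a one-line sed can make the same alteration. Treat it as a sketch under assumptions: patch_date_send is a name invented for this example, GNU sed is required, and the original line is assumed to declare `date_send` as a datetime column on a single line ending in a comma, which I have not checked against every Ravada release.

```shell
# Hypothetical helper: turn the `date_send` DATETIME column into a
# TIMESTAMP with a now() default, leaving every other column alone.
# Assumes GNU sed (-i, -E and the case-insensitive I flag).
patch_date_send() {
    local file="$1"
    sed -i -E \
        's/^([[:space:]]*`date_send`[[:space:]]+)datetime[^,]*/\1timestamp default now()/I' \
        "$file"
}
```

Usage would be patch_date_send /usr/share/doc/ravada/sql/mysql/messages.sql, after making a backup copy of the file.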

Second Issue: no kvm-spice binary

Reading the RAVADAVDI documentation we notice:

Debian jessie has been tried but kvm spice wasn’t available there, so it won’t work.

which is not true at all. Spice support is already available on Debian Jessie. To check it, note that /usr/bin/kvm is just a shell wrapper around QEMU, and then run the strings command on the QEMU binary:

cat /usr/bin/kvm
#! /bin/sh
exec qemu-system-x86_64 -enable-kvm "$@"

strings /usr/bin/qemu-x86_64-static |grep spice
spicevmc
spiceport
qemu_spice_create_update
qemu_spice_wakeup
qemu_spice_del_memslot
qemu_spice_add_memslot
qxl_spice_update_area_rest
qxl_spice_update_area
qxl_spice_reset_memslots
qxl_spice_reset_image_cache
qxl_spice_reset_cursor
qxl_spice_oom
qxl_spice_loadvm_commands
qxl_spice_monitors_config
qxl_spice_destroy_surfaces
spice_vmc_event
spice_vmc_register_interface
spice_vmc_read
spice_vmc_write
qemu_spice_destroy_primary_surface
qemu_spice_create_primary_surface
qxl_spice_flush_surfaces_async
qxl_spice_destroy_surface_wait
qxl_spice_destroy_surface_wait_complete
qxl_spice_destroy_surfaces_complete
spice_vmc_unregister_interface

So, Debian Jessie does have spice support on the qemu-kvm package. The problem here is that on Ubuntu systems, there’s a file /usr/bin/kvm-spice, whereas on Debian Jessie there isn’t. To fix this, you can create a symlink and be done with it:

# ln -s /usr/bin/kvm /usr/bin/kvm-spice
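
If this step ends up in a provisioning script, it is worth making it safe to run twice, because ln -s fails when the link already exists. A minimal sketch follows; ensure_symlink is a name chosen for this example only.

```shell
# Hypothetical helper: create the symlink only if nothing is already
# sitting at the link path (regular file, directory or dangling link).
ensure_symlink() {
    local target="$1" link="$2"
    if [ -e "$link" ] || [ -L "$link" ]; then
        return 0
    fi
    ln -s "$target" "$link"
}
```

As root, ensure_symlink /usr/bin/kvm /usr/bin/kvm-spice then has the same effect as the command above on the first run and does nothing afterwards.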

Third issue: “persistent update of device ‘graphics’ is not supported”

This is a reported and well-known libvirt0 issue. The workaround on Debian Jessie systems is to add the backports repository and install libvirt0 from it:

apt-get -t jessie-backports install libvirt0

After that, make sure you have the right version running on your system:

apt-cache madison libvirt0
libvirt0 | 3.0.0-4~bpo8+1 | http://ftp.es.debian.org/debian/ jessie-backports/main amd64 Packages
libvirt0 | 3.0.0-4~bpo8+1 | http://http.debian.net/debian/ jessie-backports/main amd64 Packages
libvirt0 | 3.0.0-4~bpo8+1 | http://http.debian.net/debian/ jessie-backports/main amd64 Packages
libvirt0 | 1.2.9-9+deb8u4 | http://ftp2.fr.debian.org/debian/ jessie/main amd64 Packages
libvirt0 | 1.2.9-9+deb8u3 | http://security.debian.org/ jessie/updates/main amd64 Packages
libvirt | 1.2.9-9+deb8u4 | http://ftp2.fr.debian.org/debian/ jessie/main Sources
libvirt | 1.2.9-9+deb8u3 | http://security.debian.org/ jessie/updates/main Sources

You should have version 3.0.0-4 instead of 1.2.9-9.

Fourth issue: “Unsupported machine type”

Finally, whenever starting a new VDI you will get this error:

ERROR starting domain status:’done’ ( libvirt error code: 1, message: internal error: process exited while connecting to monitor: redir -device usb-redir,chardev=charredir0,id=redir0,bus=usb.0,port=1 -device virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x6 -msg timestamp=on (process:29263): GLib-WARNING **: /build/glib2.0-ETetDu/glib2.0-2.48.0/./glib/gmem.c:483: custom memory allocation vtable not supported qemu-system-x86_64: -machine pc-i440fx-xenial,accel=kvm,usb=off,dump-guest-core=off: Unsupported machine type Use -machine help to list supported machines! )

Crystal clear: the -machine argument in the XML definition of every single machine is an Ubuntu-specific machine type. So you have to edit these files and use the right machine type for your Debian Jessie distro. For example, edit the “Debian Jessie AMD64” VM XML definition file and replace pc-i440fx-xenial with pc-i440fx-2.1:

vi /var/lib/ravada/xml/jessie-amd64.xml

<type arch='x86_64' machine='pc-i440fx-2.1'>hvm</type>

Of course, you can do as suggested in the error message and get a list of valid machines by issuing:

kvm -machine help
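
Since every XML definition carries an Ubuntu machine type, you may prefer to fix them all in one go. Here is a sketch under assumptions: retarget_machine_type is a made-up name, GNU sed is required, and it assumes all the definitions live as *.xml files in one directory and that the old machine type string contains no characters that are special to sed.

```shell
# Hypothetical helper: replace one machine type string with another
# in every *.xml file of a directory, editing the files in place.
retarget_machine_type() {
    local dir="$1" old="$2" new="$3" f
    for f in "$dir"/*.xml; do
        [ -f "$f" ] || continue    # the glob matched nothing
        sed -i "s/${old}/${new}/g" "$f"
    done
}
```

For instance, retarget_machine_type /var/lib/ravada/xml pc-i440fx-xenial pc-i440fx-2.1, ideally after copying the directory somewhere safe.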

Conclusions

It is quick and easy to make RAVADAVDI work on Debian Jessie. And it is better to use Debian than Ubuntu most of the time. So now, following these instructions, you can also benefit from this incredible work and start using an open-source VDI framework right away on your amazing Debian GNU/Linux distro!

It’s so dark, that it is not even implemented (yet) Part III

The DIME standard and its tool

After fixing the signet and genrec tools, I had a valid signet for my test domain “lud.org” and the DIME management record correctly set up in my dnssec authoritative server (see Part I and Part II of this series). I tried it using dig:

dig +dnssec _dx.lud.org. @localhost txt|grep TXT
;_dx.lud.org. IN TXT
_dx.lud.org. 120 IN TXT “ver=1 pok=E7gyvx3E6ksBVkg9CD5XBoXX18txj45iFSqtn9NLqjA dx=dmtp.lud.org tls=c2jM4G+EFZROQYNOyvwVSiQhgL5QW3UJN3CaIipR/Z4hjoSZoO72UlGXdKsAl1T1RQh+/h9rETD1+vaPbkIGCg”
_dx.lud.org. 120 IN RRSIG TXT 5 3 120 20170630110908 20170531110908 10287 lud.org. f8+HE5fw6cSsEwI1CznT2CUoJsIE57Bb/PsoLM4eNlkaIMoWtsLrh8sL EJ2GG2UtoJihLLEXLn+cmEFP7HT9971qd309et48oZCvwBfki0MG0HLy 9rEoG0XgrWODjBU5BKQcSC/dqOogiqTul55TjnCTGBNydYCklolcCQzK Wprsoa2qiBcrW8GFOMDKeXDgx6W7nZiaiYQs5n4aCAHtbXDODz/c89qi FI/5DAFvw/weVwAeRjqNBed4AsCp3UVSu+M4arqItrMagqb53G9OORH/ g4+olkIxNKw0Wqvcez6yHiZyETj0zChiM3zOwk3nrkAw+jrp6Yvztzon Xf/qjQ==

So next step would be to use the dime tool in dev/tools/dime in order to test some addresses. Of course, it would not be that easy!

dime

I built this tool and executed it in order to verify a particular user signet using the Full fingerprint of “lud.org” organizational signet. This came out:

../../dime -f 'UlVB9t7+GcttbVzhDgdHqyZWFWRZQOfFXW8HQIoZX+Rk6JkuuizQzqmyRriQ1pU667FnhSzODVT9tPugwQjvUg' "magma@lud.org"
Querying DIME management record for: lud.org
Establishing connection to DX server…
Error: could not connect to DX server.
[0]: src/providers/dime/signet-resolver/dmtp.c:178 [_sgnt_resolv_dmtp_connect()]: 4 (an unspecified error has occurred), errno = 0, aux = “could not establish DMTP connection to host: DIME management record DNSSEC signature was invalid”

So I re-ran the command using the “-v -v -v” flags to add some verbosity:

../../dime -f 'UlVB9t7+GcttbVzhDgdHqyZWFWRZQOfFXW8HQIoZX+Rk6JkuuizQzqmyRriQ1pU667FnhSzODVT9tPugwQjvUg' "magma@lud.org" -v -v -v -n

— Started parsing DIME management record…
— — VERSION: — [1]
— — PUBLIC KEY: — [E7gyvx3E6ksBVkg9CD5XBoXX18txj45iFSqtn9NLqjA]
— — DX: — [192.168.56.100]

….
— DIME management record for: — hashed —
—— version : 1
—— pok : 13b832bf1dc4ea4b0156483d083e570685d7d7cb718f8e62152aad9fd34baa30 [1]
—— tlssig : [not present]
—— policy : experimental
—— syndicates: [not present]
—— dx : 192.168.56.100 [1]
—— expiry : [not present]
—— subdomain : strict
****** This record was retrieved with an INVALID DNSSEC signature.

According to the previous output, the DIME management record was successfully read but its signature (RRSIG) was invalid.
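
One quick way to convince yourself that the record itself was parsed correctly: the hexadecimal pok in the verbose output should be nothing more than the decoded form of the base64 pok published in the TXT record. The sketch below checks that; b64_to_hex is a name invented here, and it assumes the record uses the standard base64 alphabet with the trailing "=" padding stripped.

```shell
# Hypothetical helper: decode an unpadded base64 string and print the
# resulting bytes as one lowercase hex string.
b64_to_hex() {
    local s="$1"
    # Re-add the '=' padding that the DIME record omits.
    while [ $(( ${#s} % 4 )) -ne 0 ]; do
        s="${s}="
    done
    printf '%s' "$s" | base64 -d | od -An -v -tx1 | tr -d ' \n'
    echo
}
```

Running b64_to_hex E7gyvx3E6ksBVkg9CD5XBoXX18txj45iFSqtn9NLqjA prints 13b832bf1dc4ea4b0156483d083e570685d7d7cb718f8e62152aad9fd34baa30, the same value the dime tool reports, so the parsing side was fine and only the DNSSEC validation was at fault.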

Again, going through the sources I ended up here:

int _load_dnskey_file(const char *filename) {
    // ... (parsing of the key file elided) ...

    // The ttl is unlimited and we don't want to save this entry to the cache.
    if (!(dk = _add_dnskey_entry_rsa(dname, flags, algorithm, pubkey, keytag, rdata, rdlen, 0, 0, 1))) {
        fclose(fp);
        RET_ERROR_INT(ERR_UNSPEC, "unable to import DNS root key entry");
    }

    // Any key from a local file is automatically validated.
    dk->validated = 1;
    // ...
}

In my case, the “root-anchor.key” file located in /root/.dime/ already had the “.” and “lud.org” DNSKEYs. According to the previous code, any DNSKEY in that file should be automatically validated. Clearly, this was not the case. The memcached daemon was working fine, so I thought that the issue might lie in the way the object was constructed and then added to the cache. So I modified the _add_dnskey_entry_rsa function so that the key was already marked as validated before the object was added to the cache:

dnskey->pubkey = pubkey;
dnskey->keytag = keytag;
dnskey->do_cache = do_cache;
 
//TCG:
dnskey->validated = 1;

Once this change was made, I re-built the dime tool and tried again:

++++++ This record WAS retrieved with a valid DNSSEC signature.
Establishing connection to DX server…
— Returning cached DIME record.
– Attempting DMTP connection to DIME record-supplied DX server #1 at dmtp.lud.org:26 …
—- Initialized openssl library.
—- Initialized SSL context with cipher list: ECDHE-RSA-AES256-GCM-SHA384
— Established TCP connection (IPV4) to dmtp.lud.org:26.
– Attempting validation in x509 certificate chain: localhost.localdomain (level 0); verified = no / 18
– Attempting validation in x509 certificate chain: localhost.localdomain (level 0); verified = yes / 18
— Successfully established TLS connection to dmtp.lud.org:26.

So far so good; but after passing the signature validation of the RRSIG field, the tool blocked right after establishing a valid TLS connection to the DMTP Magma server. Again, reading the sources I discovered where the new issue resided:

if (!(banner = _sgnt_resolv_read_dmtp_line(session, NULL, &bcode, 0))) {
        RET_ERROR_INT(ERR_UNSPEC, "unable to read DMTP banner");
}
while (nleft && (!(lbreak = strstr((char *)session->_inbuf, "\r\n")))) {
 ....
}

Dime was expecting the server’s banner; the freeze happened inside _sgnt_resolv_read_dmtp_line because the tool was expecting “\r\n” as the end-of-line, but the server was returning only “\n” (this is clearly demonstrated by the commands.c file in the srv/servers/dmtp directory, shown later on; keep reading). So the loop never ended, and the whole dime tool blocked. I altered the line like this:

while (nleft && (!(lbreak = strstr((char *)session->_inbuf, "\n")))) {

This time it worked:

– Continuing verification of self-signed DX TLS certificate …
– DX TLS certificate matched DIME record signature.
– DX TLS certificate verification succeeded automatically (TLS cert match + dnssec).
– DX certificate successfully verified.
– Attempting to verify fingerprint (UlVB9t7+GcttbVzhDgdHqyZWFWRZQOfFXW8HQIoZX+Rk6JkuuizQzqmyRriQ1pU667FnhSzODVT9tPugwQjvUg) for signet: magma@lud.org
Error: signet verification failed.

The connection to the DMTP server was fine this time, but the signet could not be verified. That was odd, because I had made sure that the user signet was already installed in the database. So I tried the connection to the DMTP server myself using the openssl s_client command:

openssl s_client -crlf -connect 192.168.56.100:26
CONNECTED(00000003)


220 lud.org DSMTP Magma
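
The missing carriage return that froze the dime tool can also be checked from the shell, by looking at how the first line sent by the server ends. This is only a sketch: line_ending is a made-up helper that reads one line from standard input, and how you capture the banner (for example, from a saved s_client session) is up to you.

```shell
# Hypothetical helper: read the first line from stdin and report
# whether it ends in CRLF ("crlf"), bare LF ("lf"), or whether no
# newline-terminated line arrived at all ("none").
line_ending() {
    local line cr
    cr=$(printf '\r')
    if ! IFS= read -r line; then
        echo none
        return 0
    fi
    case $line in
        *"$cr") echo crlf ;;
        *)      echo lf ;;
    esac
}
```

Fed with the banner above, a server that terminates lines the way the placeholder functions shown further down do would be expected to report lf, whereas the unmodified dime tool waits for crlf.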

Then, I tried the EHLO command and the VRFY command to no avail:

EHLO HOST=dmtp.lud.org
250 lud.org
VRFY DOMAIN=lud.org FINGERPRINT="UlVB9t7+GcttbVzhDgdHqyZWFWRZQOfFXW8HQIoZX+Rk6JkuuizQzqmyRriQ1pU667FnhSzODVT9tPugwQjvUg"
250 VRFY COMMAND COMPLETE

I tried some other commands from the specification document, to no avail either. The only ones working were RST, NOOP, and QUIT. So I went through the sources once again and found mere placeholder functions for the DMTP commands:

/**
 * @brief       Specify the destination domain for a message in response to an DMTP RCPT command.
 * @param       con             the DMTP client connection issuing the command.
 * @return      This function returns no value.
 */
void dmtp_rcpt(connection_t *con) {
        con_write_bl(con, "250 RCPT COMMAND COMPLETE\n", 26);
        return;
}
 
/**
 * @brief       Specify the origin domain for a message in response to an DMTP MAIL command.
 * @param       con             the DMTP client connection issuing the command.
 * @return      This function returns no value.
 */
void dmtp_mail(connection_t *con) {
        // Spit back the all clear.
        con_write_bl(con, "250 MAIL COMMAND COMPLETE\n", 26);
        return;
}
 
/**
 * @brief       Process an DMTP MAIL command.
 * @param       con             the DMTP client connection issuing the command.
 * @return      This function returns no value.
 */
void dmtp_data(connection_t *con) {
        con_write_bl(con, "451 DATA FAILED - INTERNAL SERVER ERROR - PLEASE TRY AGAIN LATER\n", 65);
        dmtp_requeue(con);
        return;
}

So, basically, the DMTP server is not functional at all yet. This surprised me a lot, because there’s a talk at DEF CON in 2014 where Ladar showed his MAGMA server’s capabilities (there’s even a video showing the direct communication and the sending of DMTP commands via Telnet). If you are as mystified as I am, go watch it: https://www.youtube.com/watch?v=TWzvXaxR6us (minute 40:00).

So maybe they have been tuning things a bit, or they have decided to start afresh, or whatever. I opened a new issue on GitHub, but they deleted it. So, well, as far as I’m concerned, DIME and MAGMA are far from being something to test and deploy. Hopefully, this great idea will come to fruition eventually. In the meantime, I suggest keeping an eye on this project.