remote kernel crash dump for Debian and Ubuntu

A few years ago, while I started to participate to the packaging of makedumpfile and kdump-tools for Debian and ubuntu. I am currently applying for the formal status of Debian Maintainer to continue that task.

For a while now, I have been noticing that our version of the kernel dump mechanism was lacking from a functionality that has been available on RHEL & SLES for a long time : remote kernel crash dumps. On those distribution, it is possible to define a remote server to be the receptacle of the kernel dumps of other systems. This can be useful for centralization or to capture dumps on systems with limited or no local disk space.

So I am proud to announce the first functional beta-release of kdump-tools with remote kernel crash dump functionality for Debian and Ubuntu !

For those of you eager to test or not interested in the details, you can find a packaged version of this work in a Personal Package Archive (PPA) here :

https://launchpad.net/~louis-bouchard/+archive/networked-kdump

New functionality : remote SSH and NFS

In the current version available in Debian and Ubuntu, the kernel crash dumps are stored on local filesystems. Starting with version 1.5.1, they are stored in a timestamped directory under /var/crash. The new functionality allow to either define a remote host accessible through SSH or an NFS mount point to be the receptacle for the kernel crash dumps.

A new section of the /etc/default/kdump-tools file has been added :

# ---------------------------------------------------------------------------
# Remote dump facilities:
# SSH - username and hostname of the remote server that will receive the dump
# and dmesg files.
# SSH_KEY - Full path of the ssh private key to be used to login to the remote
# server. use kdump-config propagate to send the public key to the
# remote server
# HOSTTAG - Select if hostname of IP address will be used as a prefix to the
# timestamped directory when sending files to the remote server.
# 'ip' is the default.
# NFS - Hostname and mount point of the NFS server configured to receive
# the crash dump. The syntax must be {HOSTNAME}:{MOUNTPOINT} 
# (e.g. remote:/var/crash)
#
# SSH="<user@server>"
#
# SSH_KEY="<path>"
#
# HOSTTAG="hostname|[ip]"
# 
# NFS="<nfs mount>"
#

The kdump-config command also gains a new option : propagate which is used to send a public ssh key to the remote server so passwordless ssh commands can be issued to the remote SSH host.

Those options and commands are nothing new : I simply based my work on existing functionality from RHEL & SLES. So if you are well acquainted with RHEL remote kernel crash dump mechanisms, you will not be lost on Debian and Ubuntu. So I want to thank those who built the functionality on those distributions; it was a great help in getting them ported to Debian.

Testing on Debian

First of all, you must enable the kernel crash dump mechanism at the kernel level. I will not go in details as it is slightly off topic but you should :

  1. Add crashkernel=128M to /etc/default/grub in GRUB_CMDLINE_LINUX_DEFAULT
  2. Run udpate-grub
  3. reboot

Install the beta packages

The package in the PPA can be installed on Debian with add-apt-repository. This command is in the software-properties-common package so you will have to install it first :

$ apt-get install software-properties-common
$ add-apt-repository ppa:louis-bouchard/networked-kdump

Since you are on Debian, the result of the last command will be wrong, as the serie defined in the PPA is for Utopic. Just use the following command to fix that :

$ sed -i -e 's/sid/utopic/g' /etc/apt/sources.list.d/louis-bouchard-networked-kdump-sid.list 
$ apt-get update
$ apt-get install kdump-tools makedumpfile

Configure kdump-tools for remote SSH capture

Edit the file /etc/default/kdump-tools and enable the kdump mechanism by setting USE_KDUMP to 1 . Then set the SSH variable to the remote hostname & credentials that you want to use to send the kernel crash dump. Here is an example :

USE_KDUMP=1
...
SSH="ubuntu@TrustyS-netcrash"

You will need to propagate the ssh key to the remote SSH host, so make sure that you have the password of the remote server’s user you defined (ubuntu in my case) for this command :

root@sid:~# kdump-config propagate
Need to generate a new ssh key...
The authenticity of host 'trustys-netcrash (192.168.122.70)' can't be established.
ECDSA key fingerprint is 04:eb:54:de:20:7f:e4:6a:cc:66:77:d0:7c:3b:90:7c.
Are you sure you want to continue connecting (yes/no)? yes
ubuntu@trustys-netcrash's password: 
propagated ssh key /root/.ssh/kdump_id_rsa to server ubuntu@TrustyS-netcrash

If you have an existing ssh key that you want to use, you can use the SSH_KEY option to point to your own key in /etc/default/kdump-tools :

SSH_KEY="/root/.ssh/mykey_id_rsa"

Then run the propagate command as previously :

root@sid:~/.ssh# kdump-config propagate
Using existing key /root/.ssh/mykey_id_rsa
ubuntu@trustys-netcrash's password: 
propagated ssh key /root/.ssh/mykey_id_rsa to server ubuntu@TrustyS-netcrash

It is a safe practice to verify that the remote SSH host can be accessed without password. You can use the following command to test (with your own remote server as defined in the SSH variable in /etc/default/kdump-tools) :

root@sid:~/.ssh# ssh -i /root/.ssh/mykey_id_rsa ubuntu@TrustyS-netcrash pwd
/home/ubuntu

If the passwordless connection can be achieved, then everything should be all set. You can proceed with a real crash dump test if your setup allows for it (not a production environment for instance).

Configure kdump-tools for remote NFS capture

Edit the /etc/default/kdump-tools file and set the NFS variable with the NFS mount point that will be used to transfer the crash dump :

NFS="TrustyS-netcrash:/var/crash"

The format needs to be the syntax that normally would be used to mount the NFS filesystem. You should test that your NFS filesystem is indeed accessible by mounting it manually :

root@sid:~/.ssh# mount -t nfs TrustyS-netcrash:/var/crash /mnt
root@sid:~/.ssh# df /mnt
Filesystem 1K-blocks Used Available Use% Mounted on
TrustyS-netcrash:/var/crash 6815488 1167360 5278848 19% /mnt
root@sid:~/.ssh# umount /mnt

Once you are sure that your NFS setup is correct, then you can proceed with a real crash dump test.

Testing on Ubuntu

As you would expect, setting things on Ubuntu is quite similar to Debian.

Install the beta packages

The package in the PPA can be installed on Debian with add-apt-repository. This command is in the software-properties-common package so you will have to install it first :

$ sudo add-apt-repository ppa:louis-bouchard/networked-kdump

Packages are available for Trusty and Utopic.

$ sudo apt-get update
$ sudo apt-get -y install linux-crashdump

Configure kdump-tools for remote SSH capture

Edit the file /etc/default/kdump-tools and enable the kdump mechanism by setting USE_KDUMP to 1 . Then set the SSH variable to the remote hostname & credentials that you want to use to send the kernel crash dump. Here is an example :

USE_KDUMP=1
...
SSH="ubuntu@TrustyS-netcrash"

You will need to propagate the ssh key to the remote SSH host, so make sure that you have the password of the remote server’s user you defined (ubuntu in my case) for this command :

ubuntu@TrustyS-testdump:~$ sudo kdump-config propagate
[sudo] password for ubuntu: 
Need to generate a new ssh key...
The authenticity of host 'trustys-netcrash (192.168.122.70)' can't be established.
ECDSA key fingerprint is 04:eb:54:de:20:7f:e4:6a:cc:66:77:d0:7c:3b:90:7c.
Are you sure you want to continue connecting (yes/no)? yes
ubuntu@trustys-netcrash's password: 
propagated ssh key /root/.ssh/kdump_id_rsa to server ubuntu@TrustyS-netcrash
ubuntu@TrustyS-testdump:~$
If you have an existing ssh key that you want to use, you can use the SSH_KEY option to point to your own key in /etc/default/kdump-tools :
SSH_KEY="/root/.ssh/mykey_id_rsa"

Then run the propagate command as previously :

ubuntu@TrustyS-testdump:~$ kdump-config propagate
Using existing key /root/.ssh/mykey_id_rsa
ubuntu@trustys-netcrash's password: 
propagated ssh key /root/.ssh/mykey_id_rsa to server ubuntu@TrustyS-netcrash

It is a safe practice to verify that the remote SSH host can be accessed without password. You can use the following command to test (with your own remote server as defined in the SSH variable in /etc/default/kdump-tools) :

ubuntu@TrustyS-testdump:~$sudo ssh -i /root/.ssh/mykey_id_rsa ubuntu@TrustyS-netcrash pwd
/home/ubuntu

If the passwordless connection can be achieved, then everything should be all set.

Configure kdump-tools for remote NFS capture

Edit the /etc/default/kdump-tools file and set the NFS variable with the NFS mount point that will be used to transfer the crash dump :

NFS="TrustyS-netcrash:/var/crash"

The format needs to be the syntax that normally would be used to mount the NFS filesystem. You should test that your NFS filesystem is indeed accessible by mounting it manually (you might need to install the nfs-common package) :

ubuntu@TrustyS-testdump:~$ sudo mount -t nfs TrustyS-netcrash:/var/crash /mnt 
ubuntu@TrustyS-testdump:~$ df /mnt
Filesystem 1K-blocks Used Available Use% Mounted on
TrustyS-netcrash:/var/crash 6815488 1167488 5278720 19% /mnt
ubuntu@TrustyS-testdump:~$ sudo umount /mnt

Once you are sure that your NFS setup is correct, then you can proceed with a real crash dump test.

 Miscellaneous commands and options

A few other things are under the control of the administrator

The HOSTTAG modifier

When sending the kernel crash dump, kdump-config will use the IP address of the server to as a prefix to the timestamped directory on the remote host. You can use the HOSTTAG variable to change that default. Simply define in /etc/default/kdump-tools :

HOSTTAG="hostname"

The hostname of the server will be used as a prefix instead of the IP address.

Currently, this is only implemented for the SSH method, but it will be available for NFS as well in the final version.

kdump-config show

To verify the configuration that you have defined in /etc/default/kdump-tools, you can use kdump-config’s show command to review your options.

ubuntu@TrustyS-testdump:~$ sudo kdump-config show
USE_KDUMP: 1
KDUMP_SYSCTL: kernel.panic_on_oops=1
KDUMP_COREDIR: /var/crash
crashkernel addr: 0x2d000000
SSH: ubuntu@TrustyS-netcrash
SSH_KEY: /root/.ssh/kdump_id_rsa
HOSTTAG: ip
current state: ready to kdump
kexec command:
 /sbin/kexec -p --command-line="BOOT_IMAGE=/vmlinuz-3.13.0-24-generic root=/dev/mapper/TrustyS--vg-root ro console=ttyS0,115200 irqpoll maxcpus=1 nousb" --initrd=/boot/initrd.img-3.13.0-24-generic /boot/vmlinuz-3.13.0-24-generic

If the remote crash kernel dump functionality is setup, you will see the options listed in the output of the commands.

Conclusion

As outlined at the beginning, this is the first functional beta version of the code. If you are curious, you can find the code I am working on here :

http://anonscm.debian.org/gitweb/?p=collab-maint/makedumpfile.git;a=shortlog;h=refs/heads/networked_kdump_beta1

Don’t hesitate to test & let me know if you find issues

Ce contenu a été publié dans Canonical, Technical, Ubuntu Server. Vous pouvez le mettre en favoris avec ce permalien.

Laisser un commentaire

Votre adresse e-mail ne sera pas publiée. Les champs obligatoires sont indiqués avec *