crashdc : The architecture

So you might want to know more about how crashdc is organized. Well, here are a few pointers.

crashdc is a shell script called from a kdump hook that is triggered when a vmcore dump file gets created. The file it creates is named crash-data-{date}.txt. Unfortunately (for me), this hook is different in each of the three implementations that crashdc supports:

  • RHEL5   : kdump_post
  • SLES10 : KDUMP_TRANSFER
  • SLES11 : KDUMP_POSTSCRIPT

It can also be run interactively on an existing vmcore to generate the crash-data-{date}.txt file afterward.

I will try to go into a little more detail about each of the automated mechanisms. But it is important to know that crashdc itself is identical for all three distributions. The only thing that changes is the run-crashdc-{distro}.sh shell script that is executed by each one of those mechanisms.

RHEL5’s kdump_post

This is the most straightforward. When uncommented, the kdump_post variable in /etc/sysconfig/kdump is set up to run the /var/crash/script/kdump_post.sh script. This script is in fact a symlink to /usr/bin/run-crashdc-rhel5.sh, which takes charge of finding the vmcore’s location and invoking crashdc.
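To make the "find the vmcore, then invoke crashdc" step more concrete, here is a minimal sketch. It is not the real run-crashdc-rhel5.sh: the function names are hypothetical, and it assumes that dumps land in per-crash subdirectories of a crash directory and that the collector accepts the vmcore path as its last argument.

```shell
#!/bin/sh
# Sketch only: NOT the real run-crashdc-rhel5.sh.
# Assumption: each crash gets its own subdirectory holding a file named vmcore.

# Print the path of the most recently modified vmcore under directory $1.
find_latest_vmcore() {
    crash_dir=$1
    # ls -t sorts by modification time, newest first; errors are silenced
    # so that an empty crash directory simply yields no output.
    ls -t "$crash_dir"/*/vmcore 2>/dev/null | head -n 1
}

# Locate the newest dump, then hand it to the collector command.
# $1 = crash directory, remaining arguments = collector command line.
run_collector() {
    crash_dir=$1
    shift
    vmcore=$(find_latest_vmcore "$crash_dir")
    if [ -z "$vmcore" ]; then
        echo "no vmcore found under $crash_dir" >&2
        return 1
    fi
    # The vmcore path is appended as the last argument (an assumption).
    "$@" "$vmcore"
}
```

Under those assumptions, a wrapper could end with something like `run_collector /var/crash crashdc`. Passing the collector as arguments also makes the sketch easy to try with a harmless command such as `echo`.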

SLES10’s KDUMP_TRANSFER
SLES10’s implementation is more complex. When KDUMP_TRANSFER is defined, the command it invokes becomes responsible for reading the vmcore’s data out of /proc/vmcore and storing it in a file. Otherwise, it is the save_core shell function from /etc/init.d/kdump that does that.

So /usr/bin/run-crashdc-sles10.sh has an identical copy of the save_core function (yes, I copied it) that will copy the vmcore file. The rest of the script does the same thing as for RHEL5, which is to invoke crashdc with the appropriate parameters.
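The copy step can be pictured with the sketch below. It is not the save_core function from /etc/init.d/kdump: the name save_vmcore, the time-stamped directory layout and the vmcore file name are assumptions made only for illustration.

```shell
#!/bin/sh
# Sketch only: NOT the save_core function shipped with SLES10.
# Assumption: dumps are stored as <dest_root>/<timestamp>/vmcore.

# Copy a dump source ($1, which would be /proc/vmcore in the kdump kernel)
# into a time-stamped directory under $2, then print the resulting path.
save_vmcore() {
    src=$1
    dest_root=$2
    dest_dir="$dest_root/$(date +%Y-%m-%d-%H-%M)"
    mkdir -p "$dest_dir" || return 1
    cp "$src" "$dest_dir/vmcore" || return 1
    echo "$dest_dir/vmcore"
}
```

A wrapper could then capture the printed path, for example `core=$(save_vmcore /proc/vmcore /var/log/dump)`, and pass it on to crashdc. The destination directory in that example is also just a placeholder.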

SLES11’s KDUMP_POSTSCRIPT
Once again, SLES11’s implementation is different. When invoked from within the kexec environment, the script pointed to by KDUMP_POSTSCRIPT runs in an environment where the / file system is memory resident, and the real disk-based / file system is mounted on /root. The /usr/bin/run-crashdc-sles11.sh script knows all this, so it is still able to invoke crashdc correctly.

Those three mechanisms are all kdump-specific hooks. This means that they run right after a kernel panic, when the kdump kernel is running and before a reboot has been done.
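The practical consequence is that every path on the real system has to be prefixed with the mount point of the disk-based root. A minimal sketch of that translation follows; the helper name real_path is hypothetical and this is not taken from run-crashdc-sles11.sh.

```shell
#!/bin/sh
# Sketch only: NOT taken from run-crashdc-sles11.sh.

# Map a path on the real system to where it is visible right now.
# $1 = mount point of the real root ("" when already on the regular system)
# $2 = absolute path as it appears on the regular system
real_path() {
    # ${1%/} drops one trailing slash so "/root/" and "/root" behave alike.
    printf '%s%s\n' "${1%/}" "$2"
}
```

With the real root mounted on /root, `real_path /root /usr/bin/crashdc` gives the location of crashdc as seen from the memory-resident environment, while an empty prefix leaves the path unchanged.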

Post Reboot processing
Some system administrators might not like the idea that unnecessary processing is done in that context. This is why one last script is planned: /etc/init.d/crashdc. This script can be used when the system reboots to its regular environment to generate the crash-data-{date}.txt file.

With this method, the crash-data-{date}.txt file will only get created when the system reboots on the regular kernel. The current limitation is that it will only process the latest vmcore created, in order to avoid extra processing in case of multiple kernel panics. There should, however, be an option to manually process all vmcores for which no crash-data-{date}.txt has been found.

Right now, this script only exists in my mind, so it will probably evolve quite a lot before it gets done.
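Since the script is not written yet, the following is only a rough sketch of the selection logic described above, not a preview of the final /etc/init.d/crashdc. The function names are hypothetical, and it assumes that each vmcore sits in its own subdirectory and that crash-data-{date}.txt is written next to it.

```shell
#!/bin/sh
# Sketch only: the real /etc/init.d/crashdc does not exist yet.
# Assumption: crash-data-*.txt is stored in the same directory as its vmcore.

# Succeed when the directory holding vmcore $1 has no crash-data-*.txt yet.
needs_processing() {
    dir=$(dirname "$1")
    for f in "$dir"/crash-data-*.txt; do
        # An unmatched glob stays literal, so test that the file really exists.
        [ -e "$f" ] && return 1
    done
    return 0
}

# Manual mode: list every unprocessed vmcore under $1, newest first.
list_unprocessed() {
    ls -t "$1"/*/vmcore 2>/dev/null | while IFS= read -r core; do
        if needs_processing "$core"; then
            echo "$core"
        fi
    done
}

# Default mode: print the latest vmcore, and only if it is still unprocessed.
latest_unprocessed() {
    core=$(ls -t "$1"/*/vmcore 2>/dev/null | head -n 1)
    [ -n "$core" ] && needs_processing "$core" && echo "$core"
}
```

The default mode looks at a single dump, which matches the stated limitation, while the manual mode walks all of them and skips those that already have a report.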

So I hope that this brief post is clear enough to highlight the main mechanisms used by crashdc.

Posted in crashdc

crashdc

Sorry to those of you who are used to reading me in French, but from time to time you will see notes, posts and information appear here about an open source project that I’m working on: crashdc.

crashdc is a set of scripts to be used in conjunction with Dave Anderson’s crash tool. Without crash, crashdc is useless. With it, crashdc can automate data collection when a new kernel crash dump occurs.

Dave Anderson has kindly agreed to include crashdc in the crash package, so when crashdc is ready, this is where you will be able to find it. I also created a SourceForge project for it, obviously called crashdc, where you can find standalone bits and information about crashdc itself.

The SourceForge environment is not intended to act as a parallel project, but just as a place for me to host my development, which itself will make its way into the crash RPM.

Right now, don’t look for the crashdc bits: they’re not available yet. Being employed by a major computer manufacturer, I must first clear the project internally so that I have the right to publicly distribute crashdc. This shouldn’t take too long. And since crashdc is mostly alpha code right now, you would not gain much from getting it as it is.

So bear with me a little longer, and it should become available for a first round of testing soon.

Posted in crashdc

Little update on LinuxCOE

Of course, nothing happened last month since I was on vacation. But just before leaving, I was able to get my RPM working on RHEL and SLES. More testing is required, but I’m getting there.

Now a suggestion from Bryan makes a lot of sense, but will probably delay the availability of the RPM packages. We should not make the RPM for SystemDesigner available unless RPMs are also available for the overlays. This is true, since not much can be done without an overlay.

So I now have to figure out how to generate the O/S images from a script, instead of including them right in the RPM as is done currently. This is needed so that the RPMs don’t contain binary archives and, more importantly, don’t get too large.

So I will get my good ol’ Camel Book out and get going.

Posted in LinuxCOE

Two days with BryanG in Grenoble

I’m back at my desk after two days in Grenoble with Bryan Gartner and Bruno Cornec. It is really good to be able to see them face to face. Virtual communication is great, but it always helps to share some portion of reality for a little while.

We had a lot of discussions on short-term goals for LinuxCOE, waystation setup and such. I got some test packages going while they were at work setting up a waystation on eurolinux.

Now I have working packages for :

  • SystemDesigner
  • RHEL overlay
  • Fedora Overlay
  • The Documentation

I should be able to have a few more soon, since the overlays are rather easy to create, once the first one is OK.

Posted in LinuxCOE

The docs & RHEL modules are mostly there

A little LinuxCOE update.

I got the docs module working earlier today, and it looks like I got the first overlay module almost working also : the RHEL module.

Now I need to figure out what needs to be done to get a normal setup working, document the whole thing and add it to the docs module.

Posted in LinuxCOE

Joining LinuxCOE

I recently indicated in a previous post that I was joining the LinuxCOE Open Source effort.

It has been a few weeks (well, actually almost a month now) since that post, and some work has already been done on creating an RPM package for SystemDesigner. We now have a roughly working package, but without any of the distro modules it is not of much interest. The next step is to get one of the distro modules packaged and to see if I can get a working COE environment.

Now, regarding this specific post, it is in English, as will probably be most of the LinuxCOE-related posts, mainly because I don’t want to restrict my audience to French speakers. Right now, I don’t think that LinuxCOE has much of an audience in this area.

I also decided to mix these posts in my public personal blog, because I’m lazy and don’t want to go through the trouble of writing and maintaining two blogs. It is better for me to concentrate on working on LinuxCOE rather than being a full-time blog manager.

So stay tuned for more LinuxCOE news.

Posted in LinuxCOE