First off, I am a Novell employee, but do not speak for the company at all, the following is my personal opinion, but one that is held by a number of different people at different companies...
I've been hearing from a lot of different companies and users about how the current "enterprise" Linux distros manage their kernels and the current problems they are having with them. With all products that bundle up an every-moving target of an OpenSource projects, there are trade-offs and for the past five years or so the two big Linux distros, SUSE and Red Hat have been trying to walk the line between stability and new features and doing so quite well.
But now that we have been living with these models for a few years, a number of problems have come to light, with the way things currently work. This document describes how the current kernels are managed in these two distros, the problems companies and users are having with them, and three proposed solutions.
Current Model of kernels
An "enterprise" Linux kernel as packaged by Red Hat and SUSE is made up of a kernel.org kernel taken from a specific point in time, and then it is tested, tuned, stabilized and then packaged up and supported for usually a long period of time (5-9 years, depending on the product.) After the product is first released, a series of "service packs" or "maintenance updates" (from now on called "major update") are released, about one every 8-18 months depending on the vendor and the length of time the product has been released.
During the time between these major updates, the distro works to ensure that any bugfix or security update that happens will not break any of the current kernel API or internally visible ABI. It does this so that any third-party kernel modules will not need to change or be "recertified" because of them.
Currently both distros provide a way to have third-party module creators register with them and be notified when an internal ABI change is going to happen in order for them to be able to respin their pre-built modules. They also provide ways for third-party modules to be easily created and build against these kernels so as to ease the build issues that can be caused when attempting to build and package against a distro kernel.
When a major update is released, it usually consists of the original kernel version for the product, with a wide range of new features, drivers, and other fixes backported from the currently released kernel.org kernel to the originally released kernel. This enables new hardware to be supported and new features that are requested by the distro's users and partners to be rolled into the product. This release almost always has ABI and API changes due to the new features and drivers.
At the time of this newly released update, all third party modules must be rebuilt and sometimes reworked due to the new features and backports that were applied to the kernel. Hopefully all third-party module vendors work with the distro to be aware of the proposed changes, but sometimes there is a lag before the new modules are available for the customers to use.
An example of how this works can be seen in the latest Novell SLES10 Service Pack 1 release. Originally the SLES10 kernel was based on the 2.6.16 kernel release with a number of bugfixes added to it. At the time of the Service Pack 1 release, it was still based on the 2.6.16 kernel version, but the SCSI core, libata core, and all SATA drivers were backported from the 2.6.20 kernel.org kernel release to be included in this 2.6.16 based kernel package. This changed a number of ABI issues for any external SCSI or storage driver that they would need to be aware of when producing an updated version of their driver for the Service Pack 1 release.
Problems caused by this model
Here are some of the problems that I have heard with customers and partners who have been working with this kind of release model for a number of years:
Potential solutions and pros and cons of them
So, what to do? Currently this discussion is happening in the major distros and their partners. Let me know what you think the model should be for enterprise Linux distros in the future.
Do you like one of the currently proposed models above, or do you have some other model you feel would work better?
Thanks to the many different people who read earlier versions of this document and helped to form the ideas here. This includes the whole Novell/SuSE kernel team who while did not all agree with the ideas here, helped out immensely. Also a number of people at different companies helped out, but probably do not wish to be named here, I want to thank them anyway.
posted Tue, 19 Jun 2007 in [/linux]
My Linux Stuff