Let’s Build a Kernel!

Or not.

Many who read this aren’t even going to have Linux-based computers, or the fortitude to try building their own kernel. However, if you want to read about the experiences of someone who has done it many times, read on.

WARNING: This is a technical article!

The Problem

So, this whole kernel rebuild nonsense came about when I finally tried to get the on-board ethernet working on my Thinkpad. I knew it didn’t work under Linux, but I never really had a reason to use it until this semester. The Atheros wireless driver in Linux seems to be underpowered: in Windows, I can sit in the exact same spot and get double the signal strength I get in Linux. This semester, in order to use SVN (Subversion, for team programming projects), I needed a continuous intranet connection (on campus, because that’s where the SVN servers are located). However, the weak wireless signal kept dropping out, making it impossible to stay connected reliably.

So I figured, why not use the on-board ethernet? It’s a guaranteed stable (and ULTRA fast) connection. I plug the cable in and get… nothing. The lights come on, and activity is indicated, but Linux shows that nothing is happening… in fact:

ifconfig

shows absolutely no hardwired connection, just my ath0 (wireless, ath for Atheros, obviously) and lo (the default loopback interface).
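Plain ifconfig only lists interfaces that are up, so it is worth checking everything the kernel has registered (ifconfig -a, or the entries under /sys/class/net) before concluding the card is invisible. Here is a minimal sketch of that check. The helper name wired_ifaces is my own invention, and matching on eth*/en* names is an assumption about how wired cards get named on your system:

```shell
# wired_ifaces: read interface names (one per line) on stdin and print
# only the ones that look like wired ethernet (eth0, enp2s0, ...).
# The eth*/en* naming pattern is an assumption, not a guarantee.
wired_ifaces() {
    grep -E '^(eth|en)[a-z0-9]*$' || true
}

# Usage (not run here): list every registered interface, up or down,
# and keep only the wired-looking ones.
#   ls /sys/class/net | wired_ifaces
```

If that prints nothing, the kernel never created a wired interface at all, which points at the driver rather than at network settings.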

Well, I figured it was time to start troubleshooting. Since I knew that IBM happily supports Linux, there was a good chance someone else was having the problem, or at least that there was some type of support.

So after googling

thinkpad t60 ubuntu no ethernet

t60 ubuntu no ethernet

I found out it’s a known problem… in fact, it’s been a known problem for over a year and they just haven’t really done much to fix it. The rundown of the error is this:

[ 4.158782] 0000:02:00.0: 0000:02:00.0: The NVM Checksum Is Not Valid

[ 4.169053] e1000e 0000:02:00.0: PCI INT A disabled

[ 4.169061] e1000e: probe of 0000:02:00.0 failed with error -5

This is how the error is listed in

dmesg

As you can see, it’s pointing to the kernel module e1000e (before I updated my kernel it was a VERY similar error, but on the e1000 module). Research suggests the culprit is some deeply embedded power saving feature that causes the checksum to barf. Lenovo has acknowledged that the power saving feature’s advantage is negligible and that it should be fixed. Different people have pointed to different faults: some say it happens when the adapter is put to sleep and cannot be woken back up (a super short stretch of inactivity right at boot is enough), while others say it has something to do with checksum reporting.
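dmesg output is long, so it helps to filter it down to the lines that matter for this bug. A small sketch, assuming the messages look like the ones quoted above; the function name nic_probe_errors is hypothetical:

```shell
# nic_probe_errors: filter kernel log text on stdin down to the lines
# that mention the e1000/e1000e driver or the NVM checksum complaint.
nic_probe_errors() {
    grep -E 'e1000|NVM Checksum' || true
}

# Usage (not run here):
#   dmesg | nic_probe_errors
```

The pattern deliberately matches both e1000 and e1000e, since the error moved from one module to the other when I updated kernels.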

Since a total of 0 solutions worked for me, I don’t know what’s making it freak out either. The curious thing is that most people say the ethernet will work as long as a live ethernet cable is plugged in from the beginning, preventing the adapter from ever sleeping. However, I still get the same checksum nonsense, which points to a more serious hardware-software issue.

Well, since the kernel is what controls devices and this is a kernel module, I hoped a newer kernel would include a fix. I now know I was wrong, but it was the most viable solution.

Building the new Kernel

Well, let’s start with the obvious. We are going to need a copy of the latest kernel. Kernel source is found at kernel.org. The topmost entry labeled stable is the best choice. There are several letters next to the current version; click on the F to download the latest stable kernel source.

Once that is downloaded, you need to unpack it in /usr/src. You MUST be root in order to do this and the next steps. Once you have unpacked the file, you can open up a terminal…

sudo -i

That drops you into a persistent root shell. Now cd on over to /usr/src/linux-your-kernel.number.version (in my case it’s linux-2.6.27.3). Once you are in this directory, you are going to execute the single most important part of building a kernel: making the configuration file. Go ahead and

make menuconfig

It should bring up a very ugly in-console menu with all your kernel options. Here is where Linux shines… customization. By configuring your own kernel, you can choose ONLY the elements you need. Unlike Windows or even Mac, you don’t get a generic kernel bloated with all kinds of unnecessary support for ten-year-old hardware you will NEVER use. The bloat causes your computer to load slower, have a higher fail rate (in my opinion), and have a larger kernel image (hence the slowness). Why would you build support for ten different internal wireless chipsets when you only have one? That’s a complete waste, even as modules.

An example of modularizing only what you need (or may think you need) and leaving out the junk.


There are only a few basic principles you must understand when configuring your kernel.

1) Modularization – Instead of building support for the selected item directly into the kernel, you can build it as a module. A module is loaded into the kernel ONLY when it is needed (in my experience, often by hand). This is really handy if you are unsure whether you need built-in support or whether a module would be better (because you might be building in support for hardware you don’t have). Modules are stored externally, away from the kernel image, so things get slower if all of your hardware is built as modules, and modularizing things that need to be built in may break stuff.

2) Built-in – Built-in means the driver is compiled directly into the kernel. It will always be loaded, even if the hardware is not present, and your system will produce whatever errors go with that. For example, I have some Toshiba module I didn’t take out, and it’s producing a hardware-missing error, as you would expect.

3) Excluded – This is what you ideally want for every single thing you don’t need. These items are left out entirely, neither built into the kernel nor built as modules. They can be built in or modularized later if necessary, but that involves rebuilding the kernel.
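These three states end up in the .config file that menuconfig saves: a built-in option is written as CONFIG_FOO=y, a module as CONFIG_FOO=m, and an excluded option as a comment reading "# CONFIG_FOO is not set". As a rough sketch, here is a way to tally them up and see how lean your configuration really is. The function name config_summary is my own, not part of any kernel tooling:

```shell
# config_summary: count how many options in the kernel .config given as $1
# are built in (=y), modular (=m), or excluded ("# CONFIG_FOO is not set").
# Options with string or numeric values are not counted.
config_summary() {
    awk '
        /^CONFIG_[A-Za-z0-9_]+=y$/            { ny++ }
        /^CONFIG_[A-Za-z0-9_]+=m$/            { nm++ }
        /^# CONFIG_[A-Za-z0-9_]+ is not set$/ { nn++ }
        END { printf "built-in=%d module=%d excluded=%d\n", ny, nm, nn }
    ' "$1"
}

# Usage (not run here), from inside the kernel source directory:
#   config_summary .config
```

Running it before and after a menuconfig session gives a quick feel for how much you actually trimmed.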

One major thing about kernel building is that you MUST know about the inner workings of the kernel subsystems and be very in tune with your hardware. The key is knowing exactly what you DON’T have; this way you can select only the things you need and exclude or modularize the rest.
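One way to get in tune with your hardware is to ask your current, working kernel which modules it has loaded, since those are drivers you almost certainly want to keep. This is only a sketch of that idea, not a complete inventory (anything built into the running kernel will not show up in lsmod); the helper name loaded_modules is made up for illustration:

```shell
# loaded_modules: read `lsmod` output on stdin and print just the module
# names, sorted, skipping the "Module Size Used by" header line.
loaded_modules() {
    awk 'NR > 1 { print $1 }' | sort
}

# Usage (not run here):
#   lsmod | loaded_modules
#   lspci -k    # shows which kernel driver is bound to each PCI device
```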

So after very carefully selecting everything in menuconfig, save it. Now, in the same directory run

make-kpkg clean

This will do some preliminary junk necessary to build .deb kernel packages for Ubuntu. That will take a minute or two. This next step is the last step in the process, and requires HOURS, depending on your processor and hard drive speed. Building a kernel from the ground up is extremely processor intensive, so be prepared.

fakeroot make-kpkg --initrd --append-to-version=-custom kernel_image kernel_headers

The thing to pay attention to is the final two arguments, kernel_image and kernel_headers. They are fixed target names, so don’t think about changing them; you’re simply saying you want those two elements built as packages. The headers are optional, though: you need kernel headers in order to compile things against your kernel from source (external modules, for example), and most users won’t need them.
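Since this command runs for a long time, I find it worth wrapping so a typo doesn’t cost an hour. The following is just a sketch of that: the function name build_kernel_debs and the DRY_RUN variable are my own conventions, and the only real command in it is the make-kpkg line from above:

```shell
# build_kernel_debs: run the make-kpkg build from the kernel source directory.
# $1 is the suffix for --append-to-version (defaults to "-custom").
# Set DRY_RUN=1 to print the command instead of running it.
build_kernel_debs() {
    suffix="${1:--custom}"
    # Drop kernel_headers from this line if you don't need the headers package.
    set -- fakeroot make-kpkg --initrd "--append-to-version=$suffix" kernel_image kernel_headers
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Usage (not run here):
#   DRY_RUN=1 build_kernel_debs    # just show what would run
#   build_kernel_debs              # the real, hours-long build
```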

After an hour or two (depending on what you configured), you will have two packages in the /usr/src directory. One will be the

linux-image-2.version-custom

and

linux-headers-2.version-custom

Just double-click them and install. Whenever you are asked if you want to change your menu.lst, agree to install the package maintainer’s version; it will automatically update the menu.lst appropriately, so you don’t have to mess with that. Fun fact: you are the package maintainer in this case. I’m sure there is an argument to tell it you are the author, but it’s not something to bother with.
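If you would rather stay in the terminal, the same packages can be installed with dpkg -i. Here is a minimal sketch for finding them first. The function name custom_kernel_debs is hypothetical, and the filename pattern assumes you kept the -custom suffix from the build command:

```shell
# custom_kernel_debs: print the .deb packages in directory $1
# (default /usr/src) whose names carry the "-custom" version suffix.
custom_kernel_debs() {
    dir="${1:-/usr/src}"
    for f in "$dir"/linux-image-*-custom*.deb "$dir"/linux-headers-*-custom*.deb; do
        # An unmatched glob stays literal, so only print files that exist.
        [ -e "$f" ] && echo "$f"
    done
    return 0
}

# Usage (not run here):
#   sudo dpkg -i $(custom_kernel_debs)
```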

So, if it all worked, reboot into your new working (hopefully) kernel. If something DIDN’T go right, you need to reboot into your old kernel, and rebuild the missing modules.

For instance, the first time around my new kernel didn’t have sound working, and it was also crashing hal (a necessary GNOME component).

/usr/sbin/hald --verbose

told me that I was missing inotify support in my kernel and needed to build it. I also dug deeper into my sound configuration and found out I hadn’t selected the intel-hda components, so no support for my sound hardware was built. In actuality, I ran into a wireless error first, then sound, then the missing inotify support. On top of that, I had built the 2.6.27 kernel rather than 2.6.27.3, so I needed to update immediately: 2.6.27 had a MAJOR bug that could actually fry Intel PRO/Gigabit adapters, completely destroying them.
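A check like the one below, run before kicking off the build, would have saved me a couple of rebuilds. Treat it as a sketch: the function name missing_options is invented, and CONFIG_INOTIFY and CONFIG_SND_HDA_INTEL are my best guess at the option names behind "inotify support" and "intel-hda components", so verify them in menuconfig for your kernel version:

```shell
# missing_options: given a .config path followed by option names, print each
# option that is neither built in (=y) nor built as a module (=m).
missing_options() {
    config="$1"; shift
    for opt in "$@"; do
        grep -Eq "^${opt}=(y|m)\$" "$config" || echo "$opt"
    done
    return 0
}

# Usage (not run here); the option names are assumptions:
#   missing_options /usr/src/linux-2.6.27.3/.config CONFIG_INOTIFY CONFIG_SND_HDA_INTEL
```

Anything it prints is something to go back and enable before you spend hours compiling.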

What’s fantastic is how failsafe the procedure is. Unless ugly bugs exist or you super screw up (I don’t even know if that can mess much up), you can just reboot into your working kernel and rebuild. Yes, this is REALLY time consuming and cannot be done in a hurry, but you’re the one who screwed up.

Even though I didn’t fix the problem, through this process I learned exactly what I’m going to need to do when they upgrade the prepatch to stable. So until then,

Cheers,

Michael
