I have a system that freezes during the reboot process. The computer is running FreeBSD 6.1-STABLE
and is very reliable and, well, umm, stable. This is not a random reboot problem. It is a problem that occurs
during reboot. The freeze occurs after the BIOS screen and just after the SCSI card loads its BIOS. What should
be appearing on the screen is the preload information. Usually, you see an underscore at the top left of the console.
Then it drops down one line and goes into the spinning character associated with a FreeBSD preload. That underscore
never appears. The console goes completely blank and stays that way.
Pressing ctl-alt-del does not reboot the system. Pressing the RESET button on the system seems to have no effect.
A power cycle does reboot the system. After every such power cycle, the BIOS has reported that I either need to
press F1 to modify the BIOS settings or press F2 to load defaults. I have been pressing F1, then F10 to save and reboot.
The box then properly reboots.
This problem is annoying because the system is frequently rebooted to use a new kernel. Having to trudge down into the
basement to power cycle, and then press F1, F10, and wait is tedious, especially when it should not be necessary.
Testing the reboot process
This problem has been going on, more or less, since I got this box. It is an AMD 3000+ processor on an AV8 Deluxe motherboard.
It has 1GB of RAM and an 80GB SATA drive.
One Saturday, I was especially annoyed by this problem, so I rescued the system from the basement, and set it up on the dining
room table. "How long will this be here?" I was asked. "Until it's fixed," I said, which, of course, could be forever. It is just as well
I fixed, or at least think I fixed, the problem yesterday. Tomorrow is a birthday BBQ, and we need the dining room table back.
With the system in the dining room, I can reboot it as I pass by. It's a high traffic area. I have to go past the system any time I move
about the house. This makes it ideal. I can reboot the system without having to attend to it at all times. I would press ctl-alt-del
each time I went by. Sometimes it would hang. Sometimes it would reboot. There was no pattern.
For a while, I thought that the freeze only occurred after a "shutdown -r now". I eventually proved that wrong. Then I thought it only
occurred when rebooting over ssh. Wrong. There was no pattern. At one point, I'd done over a dozen reboots without a single freeze. I thought
I had it solved. I had been reseating cards, moving them around, removing them, reseating memory, cleaning off dust, and generally getting
frustrated with the whole problem, when it went away. OK. Great. Fixed!
Then the problem returned.
Is it ACPI?
For a while, when I was chatting with people on IRC about this, we thought the problem might be ACPI.
I tried some recent ACPI patches. The patches appeared to make things worse, but probably had no effect
at all. The problem continued to occur after I backed out the patches.
I tried booting with and without ACPI. I played with BIOS settings. Nothing seemed to affect the problem.
The video card!
During my testing, I had tried removing the NIC (Intel 82559 Pro/100 Ethernet) and the SCSI card (Adaptec 2944 Ultra SCSI adapter) to see
if that affected the results. It did not. I tried moving around the VGA card. No changes. The problem still occurred on a seemingly random
basis. Yesterday, I was getting the problem for about 8 consecutive reboots.
I had a stash of about 4 video cards in the basement. I retrieved them and started further testing. I discovered that the problem did
not occur when I was using an AGP card. The problem did occur at least once with each of the two PCI video cards.
Right now, I'm using an AGP card from ATI Technologies Inc (RV100 Radeon 7000 / Radeon VE). I've done about 20 reboots in a row without
a single freeze. Hopefully the problem has gone away.
The PCI VGA cards I tried (both of which were used during at least one reboot freeze) were:
MGA 1064SG Hurricane/Cyclone 64-bit graphics chip from Matrox Electronic Systems Ltd.
It may not be the particular card in question... it may just be the use of a PCI card versus an AGP card.
Witness my testing
For what it's worth, here is the history of the reboots and shutdowns:
Some interesting stats:
There have been 168 shutdowns in May.
There have been 167 reboots.
Today (the first day after I found the AGP video solution), I have done 26 shutdowns
and 27 reboots.
That is about 6 reboots per day over the past 4 weeks.
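If you want to pull the same sort of numbers from your own box, here is a rough Python sketch. It assumes records shaped like the reboot and shutdown entries that last(1) prints; the sample text, the address in it, and the helper names count_events and per_day are all made up for illustration and are not my real log. The last line just redoes the per-day arithmetic from the figures above (167 reboots over 28 days).

```python
from collections import Counter

# Hypothetical sample in the style of last(1) output; NOT the real log.
SAMPLE = """\
reboot           ~                         Sat May 27 14:03
shutdown         ~                         Sat May 27 14:01
reboot           ~                         Sat May 27 13:40
shutdown         ~                         Fri May 26 22:15
dan              ttyp0    192.0.2.5        Fri May 26 21:00 - 22:15  (01:15)
reboot           ~                         Sun Apr 30 09:12
"""


def count_events(text, month):
    """Count reboot and shutdown records whose month field matches `month`."""
    counts = Counter()
    for line in text.splitlines():
        fields = line.split()
        # Assumed layout: name, "~", weekday, month, day, time
        if (
            len(fields) >= 6
            and fields[0] in ("reboot", "shutdown")
            and fields[3] == month
        ):
            counts[fields[0]] += 1
    return counts


def per_day(total, days):
    """Average number of events per day."""
    return total / days


if __name__ == "__main__":
    may = count_events(SAMPLE, "May")
    print("reboots:", may["reboot"], "shutdowns:", may["shutdown"])
    # Figures from the stats above: 167 reboots over 4 weeks (28 days).
    print("reboots per day:", round(per_day(167, 28), 1))
```

Ordinary login records are skipped because their first field is a user name, not reboot or shutdown. The field positions are an assumption, so check them against what your own system prints before trusting the counts.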
So what caused the problem?
I always prefer to know the exact cause of a problem. With that knowledge you can positively identify the cause and verify the fix.
You can therefore prove that the problem has been fixed. However, in this case, I'm not 100% sure I've found the cause. But I do think/hope I have.
Do you have any theories? If so please use the comment link at the bottom right of this page. Thank you.