The FreeBSD Diary


mknod - create the device, then mount 5 January 2004

My primary mail server went down on 1 January. In the process of analyzing the problem, I learned about a new tool: mknod. This article documents how I used that tool, a live filesystem CD, and a floppy disk to look at the disk of the dead box.

Happy New Years!

I first noticed a problem on New Year's Day. I couldn't ssh into the box, nor was it accepting email. Attempts to connect were met with:

$ ssh m20
Password:
Last login: Thu Jan 1 11:38:35 2004 from betty
Copyright (c) 1980, 1983, 1986, 1988, 1990, 1991, 1993, 1994
The Regents of the University of California. All rights reserved.

-bash: /etc/profile: Device not configured
Connection to m20.example.org closed.
$
SMTP was also sick:
$ telnet m20 25
Trying 10.0.0.1...
Connected to m20.example.org.
Escape character is '^]'.
220 m20.example.org ESMTP Postfix
helo bast.example.org
250 m20.example.org
mail from: dan@example.org
250 Ok
rcpt to:eric@example.com
250 Ok
data
354 End data with <CR><LF>.<CR><LF>
test msg via m20
.
451 Error: queue file write error
quit
221 Bye
Connection closed by foreign host.

Because of the holiday, I couldn't get access to the colocation facility. It wasn't until January 4th that I was able to get there.

Take a camera!

As I was driving to the colocation facility, I remembered my camera. I thought about turning around to collect it, but didn't. Bad idea. I've lost useful information because of that decision. The console contained messages which might have been useful. Next time, I hope I remember.

What I do remember is messages saying to see tuning(7). That's it. Nothing else. If I'd had a camera, I would have taken a picture and we'd both be able to learn something from it. What a silly mistake.

I hit enter once, and that started a stream of messages far too rapid to read. CONTROL-S didn't halt it, nor did SCROLL-LOCK. I tried another virtual console. I got a login prompt. But as soon as I touched a key, the tty died with the following message:

/: create / symlink failed, no inodes free

This happened with each virtual console I tried.

I went back to the main console to look closely at the scrolling messages. I could read nothing. I pressed the power switch, and that stopped the messages for a short time, before they started again. I was able to read something like this:

vm_fault_pager read error pid 1 init

So... it looks like init was having problems. This was a sick system. I rebooted the box.

The first reboot

The first reboot did nothing. It could not find the disk drive. I went into the BIOS setup and found that nothing was listed for the primary drive. Auto-detection found nothing. I had no choice but to take the system home with me.

Booting at home

At home, I wanted to examine the system before booting it up in case I lost anything by writing to the drive. I booted up from a CD I had, but couldn't mount any drives. I also had a 4.7-RELEASE CD set from FreeBSD Mall. Disk 2 contains a live filesystem, which you can boot from to obtain a working FreeBSD system with very little effort. I booted, and tried to mount my disk. dmesg(8) showed that the disk (ad0) was found. I could not mount it, though, because /dev/ad0s1e did not exist, although /dev/ad0s1 did. /dev/MAKEDEV was not present on this live filesystem.

I was talking out loud about this in an IRC channel, when Anton Berezin had this great idea:

mkdir -p /tmp/dev
cd /tmp/dev
/sbin/mknod ad0s1e c 116 0x00020000 root:operator

I tried it, but ran into a problem. This live filesystem CD did not have mknod(8).

Another great idea from Anton: no mknod, no device. copy mknod to a floppy :-)

Remember: The 4.9-RELEASE live filesystem ISO image contains mknod. I wouldn't have needed the floppy if I'd had that ISO just sitting around ready to go. I now have a CD ready to go....

Floppy basics
I went back to my documentation on floppies. I fetched a fresh floppy from a box and did this:
fdformat /dev/rfd0
disklabel -w -r /dev/rfd0 fd1440
newfs /dev/rfd0
mount /dev/fd0 /mnt
cp /sbin/mknod /mnt
umount /mnt
That gives me a floppy with mknod. From the live filesystem machine, I mounted the floppy and copied the file to /tmp for future use.
Trying mknod again
Then I tried the original command again:
/tmp/mknod ad0s1e c 116 0x00020000 root:operator
Now I had an error about no such group. There was no /etc/group file on this machine. Not to worry. You can use the numbers instead of the names.
/tmp/mknod ad0s1e c 116 0x00020000 0:0
Here 0:0 translates to root:wheel: uid 0 is root and gid 0 is wheel. Check /etc/passwd and /etc/group and you'll see why.
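The name-to-number lookup can be sketched as below, run on a working system that still has its /etc/passwd and /etc/group. The helper name numeric_id is hypothetical (it is not a FreeBSD tool); it only relies on the numeric id being the third colon-separated field in both files.

```shell
#!/bin/sh
# numeric_id NAME FILE
# Hypothetical helper: print the numeric id (third colon-separated field)
# of the entry called NAME in a passwd- or group-format file.
numeric_id() {
    awk -F: -v name="$1" '$1 == name { print $3; exit }' "$2"
}

# Look up the owner and group to pass to mknod as OWNER:GROUP.
# If a name is missing from the file, the matching value is left empty.
uid=$(numeric_id root /etc/passwd)
gid=$(numeric_id wheel /etc/group)
echo "root:wheel is ${uid}:${gid}"
```

Write the numbers down before you need them; the rescue environment may not have either file.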

This worked. I then mounted that new device:

mount -r /tmp/dev/ad0s1e /mnt
That was it. I had my drive mounted. I checked around and found nothing unusual. I then repeated the procedure for each of the other partitions on my drive:
/tmp/mknod ad0s1a c 116 0x00020000 0:0
/tmp/mknod ad0s1f c 116 0x00020000 0:0
/tmp/mknod ad0s1g c 116 0x00020000 0:0

A brief explanation:

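  • The c means a character device.
  • 116 is the major number for this type of device, as found from /dev/MAKEDEV.
  • 0x00020000 is the minor number, a bitmask that encodes which slice and partition the node refers to. The last digit changes with the partition letter. You can see that here: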
    crw-r----- 2 root operator 116, 0x00020000 Aug 15 16:44 /dev/ad0s1a
    crw-r----- 2 root operator 116, 0x00020001 Aug 15 16:44 /dev/ad0s1b
    crw-r----- 2 root operator 116, 0x00020002 Aug 15 16:45 /dev/ad0s1c
    crw-r----- 2 root operator 116, 0x00020003 Aug 15 16:45 /dev/ad0s1d
    crw-r----- 2 root operator 116, 0x00020004 Aug 15 16:45 /dev/ad0s1e
    crw-r----- 2 root operator 116, 0x00020005 Aug 15 16:45 /dev/ad0s1f
    crw-r----- 2 root operator 116, 0x00020006 Aug 15 16:45 /dev/ad0s1g
    crw-r----- 2 root operator 116, 0x00020007 Aug 15 16:45 /dev/ad0s1h
This information was obtained from a working system.... Hopefully you'll have one somewhere that you can access.
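Following the pattern in that listing, the minor number for each partition on ad0s1 can be worked out rather than typed by hand. This is only a sketch under stated assumptions: the helper minor_for is hypothetical, the base value 0x00020000 and the a-through-h ordering are taken from the listing above, and it covers ad0s1 only. It prints the mknod commands instead of running them, so you can compare them against a working system first.

```shell
#!/bin/sh
# minor_for LETTER
# Hypothetical helper: derive the minor number for a partition on ad0s1,
# assuming the pattern in the listing (a ends in 0, b in 1, ... h in 7).
minor_for() {
    case "$1" in
        a) idx=0 ;; b) idx=1 ;; c) idx=2 ;; d) idx=3 ;;
        e) idx=4 ;; f) idx=5 ;; g) idx=6 ;; h) idx=7 ;;
        *) echo "unknown partition: $1" >&2; return 1 ;;
    esac
    # 0x00020000 is the value the listing shows for ad0s1a.
    printf '0x%08x\n' $((0x00020000 + idx))
}

# Print (do not run) one mknod command per partition, for review.
for part in a e f g; do
    echo "mknod ad0s1$part c 116 $(minor_for "$part") 0:0"
done
```

Each partition gets its own minor number this way, which matters because the minor number, not the file name, decides which partition a node opens.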

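For some reason I was unable to mount more than one partition at a time. I kept getting a "device busy" message. A likely explanation, judging by the listing above: I created every node with the same minor number, 0x00020000, which is the value shown for ad0s1a, so all of the nodes may have pointed at the same partition.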

But I was able to examine the drive and find nothing obviously wrong. I then booted the system into single user mode by pressing the space bar during the boot countdown, and then issued boot -s. For a bit more information about single user mode, please read this FAQ.

When I booted into single user mode, I had to run fsck in order to clean the file systems. They were marked as dirty because of the forced reboot. They would have been marked clean if I had done a proper shutdown, which was not possible.

fsck /dev/ad0s1a
fsck /dev/ad0s1f
fsck /dev/ad0s1g
fsck /dev/ad0s1e
Kids, don't try this at home!

I don't plan to use this every day. In fact, I hope never to have to do it again. But it is nice to know how to do it when you need to. This will help.

