The FreeBSD Diary

Providing practical examples since 1998

hypermail - creating a www interface to a mailing list archive
7 November 1999
hypermail is a program that takes a file of mail messages in UNIX mailbox format and generates a set of cross-referenced HTML documents, producing an online archive of the messages.  It's ideal for providing a www interface to a mailing list archive, which is what I'm going to do with it.

The hypermail homepage is http://www.landfield.com/hypermail/ and contains a few examples of the interface.

The background
Regular readers will know that I've recently been creating a digest and archive for a majordomo mailing list.  This article documents the next step: providing an on-line archive of the mailing list with a www interface.  I liked the archive used by ipfilter and noticed that it was generated by hypermail.  Then I noticed FreeBSD had a port for it.  Great start!
The installation
The first step was to install hypermail from the ports collection.  I followed the instructions in the FreeBSD handbook for compiling ports from the internet.  If you are working from CD-ROM, you may want to see "compiling port from CDROM".  My problem was that my ports tree was out of date: it was installing version 1.something, and I noticed on the hypermail home page that the latest version was 2.something.  So I referred to my article "Updating the ports collection" to refresh the ports tree.  It had been several months since I last did this.

After the refresh, here's what I did:

cd /usr/ports/www/hypermail
make
make install  
The first run
I found /usr/ports/www/hypermail/work/hypermail-20b3/tests to be useful for testing.  I suggest you do that test right after installing.  Note that there are several tests within the file which are commented out.  You might want to invoke them.

I took /usr/ports/www/hypermail/work/hypermail-20b3/configs/hmrc.example as my starting config file.  Here are the changes I made to this file:

# diff -urN hmrc.example /usr/local/hypermail/adsl/hmrc.adsl 
--- hmrc.example        Sun Nov  7 00:15:52 1999
+++ /usr/local/hypermail/adsl/hmrc.adsl Sun Nov  7 00:15:25 1999
@@ -18,7 +18,7 @@
 # This is the default title you want to call your archives.
 # Set this to NONE to use the name of the input mailbox.
 
-hm_label = Hypermail Development List
+hm_label = ADSL Mailing list
 
 # hm_archives = [ URL | NONE ]
 #
@@ -200,7 +200,7 @@
 # The <link...> header can be disabled by default by setting
 # mailto to NONE.
 
-hm_mailto = webmaster@landfield.com
+hm_mailto = webmaster@freebsddiary.org
 
 # hm_domainaddr = [ domainname | NONE ]
 #
@@ -210,7 +210,7 @@
 # to domain-ize these addresses for delivery. In such cases, 
 # hypermail will add the DOMAINADDR to the email address.
 
-hm_domainaddr = landfield.com
+hm_domainaddr = freebsddiary.org
 
 # hm_body = [ HTML <BODY> statement | NONE ]
 #
@@ -225,7 +225,7 @@
 # used to submit a new message to the list served by the 
 # hypermail archive.
 # "NONE" means don't use it.
 
-hm_hmail = hypermail@landfield.com
+hm_hmail = adsl@freebsddiary.cx
 
 # hm_ihtmlheader = [ path to index header template file | NONE ]
 #
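
After editing a copy of the example config, it can be handy to list just the settings that were changed.  The following is a minimal sketch: the function name show_changed_settings is made up for illustration, and the setting names are the four that appear in the diff above.

```shell
#!/bin/sh
# Sketch only: print the four settings touched in the diff above.
# show_changed_settings is a hypothetical helper, not part of hypermail.

# usage: show_changed_settings CONFIG_FILE
show_changed_settings() {
    # Match only real assignments at the start of a line, so the
    # commented documentation lines ("# hm_label = ...") are skipped.
    grep -E '^hm_(label|mailto|domainaddr|hmail)[[:space:]]*=' "$1"
}
```

For example, show_changed_settings /usr/local/hypermail/adsl/hmrc.adsl should print one line for each of hm_label, hm_mailto, hm_domainaddr and hm_hmail.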

So now I was ready for my first test run.  I took one of the existing archive files from within /usr/local/majordomo/lists/adsl.archive as the input.

hypermail -p -m "/usr/local/majordomo/lists/adsl.archive.9911" \
               -c "hmrc.adsl" \
               -d "/usr/local/www/data/freebsddiary/adsl"

This directs the output to /usr/local/www/data/freebsddiary/adsl.   Well, that worked just fine.  I hope yours does too.
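
If you plan to run this more than once, the command can be wrapped with a couple of basic checks.  What follows is only a sketch under assumptions: the function name archive_mailbox and the checks themselves are illustrative, and the hypermail options are exactly the ones used in the command above.

```shell
#!/bin/sh
# Sketch only: wrap the hypermail call shown above with basic checks.
# archive_mailbox is a hypothetical helper name.

# usage: archive_mailbox MAILBOX CONFIG OUTPUT_DIR
archive_mailbox() {
    mbox=$1
    config=$2
    outdir=$3

    # Refuse to run without a non-empty mailbox file.
    if [ ! -s "$mbox" ]; then
        echo "mailbox missing or empty: $mbox" >&2
        return 1
    fi

    # Refuse to run without a readable config file.
    if [ ! -r "$config" ]; then
        echo "config file not readable: $config" >&2
        return 1
    fi

    # Make sure the output directory exists before starting.
    mkdir -p "$outdir" || return 1

    # Same options as the command above: -p, -m, -c, -d.
    hypermail -p -m "$mbox" -c "$config" -d "$outdir"
}
```

It would be called with the same three values as before: the mailbox, hmrc.adsl, and the output directory.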

Data conversion
I strongly urge you to make digests/archives of every list you create.  It doesn't take much time and it requires very few additional system resources.  I wish I'd done that when I started my ADSL mailing list.  But then, if I had, I wouldn't have written the "Creating a digest and archive for a majordomo mailing list" article.

But I didn't.  So now I'm paying the price for that omission.  Luckily, I was able to enlist the assistance and expertise of other list members.  I had saved all of the list messages within my email client (Pegasus, a great Windows client; see http://www.pmail.gen.nz/ for details).  I sorted the messages according to year/month and saved each group to a file.  Then I ran an awk script over the files to convert them to the required format.  There wasn't actually much of a change required.  Here's a before and after:

Return-Path: owner-adsl
Received: (from majordom@localhost)
        by ducky.freebsddiary.cx (8.9.3/8.9.3) id WAA14705
        for adsl-outgoing; Wed, 30 Jun 1999 22:37:54 +1200 (NZST)
Received: from metis.host4u.net (metis.host4u.net [209.150.128.22])
        by ns.freebsddiary.cx (8.9.3/8.9.3) with ESMTP id WAA14519
        for <adsl@freebsddiary.cx>; Wed, 30 Jun 1999 22:35:35 +1200 (NZST)
Received: from wocker (210-55-152-83.ipnets.xtra.co.nz 
                                                    [210.55.152.83])
        by metis.host4u.net (8.8.5/8.8.5) with SMTP id FAA17866
        for <adsl@freebsddiary.cx>; Wed, 30 Jun 1999 05:34:56 -0500
Message-Id: <199906301034.FAA17866@metis.host4u.net>
From: "Dan Langille" <dan.langille@dvl-software.com>
Organization: DVL Software Limited
To: adsl@freebsddiary.cx
Date: Wed, 30 Jun 1999 22:35:13 +1200

The above had to be converted to this:

From dan.langille@dvl-software.com  Wed Jun 30 22:35:13 1999
Received: (from majordom@localhost)
        by ducky.freebsddiary.cx (8.9.3/8.9.3) id WAA14705
        for adsl-outgoing; Wed, 30 Jun 1999 22:37:54 +1200 (NZST)
Received: from metis.host4u.net (metis.host4u.net [209.150.128.22])
        by ns.freebsddiary.cx (8.9.3/8.9.3) with ESMTP id WAA14519
        for <adsl@freebsddiary.cx>; Wed, 30 Jun 1999 22:35:35 
                                                       +1200 (NZST)
Received: from wocker (210-55-152-83.ipnets.xtra.co.nz 
                                                    [210.55.152.83])
        by metis.host4u.net (8.8.5/8.8.5) with SMTP id FAA17866
        for <adsl@freebsddiary.cx>; Wed, 30 Jun 1999 05:34:56 -0500
Message-Id: <199906301034.FAA17866@metis.host4u.net>
From: "Dan Langille" <dan.langille@dvl-software.com>
Organization: DVL Software Limited
To: adsl@freebsddiary.cx
Date: Wed, 30 Jun 1999 22:35:13 +1200

As you can see, it's the first line of the message headers which has to be changed: the Return-Path: line is replaced by a Unix From line.  Not much, but it had to be modified.  I didn't want to write the code, so I had a friend do it for me.  The code appears below.  It worked for my needs and is particular to the delimiters my mailer was using, but perhaps you can use it as a starting point for your situation.

My thanks to Don Stokes <don@daedalus.co.nz> for writing this code.

#! /usr/bin/awk -f
#
# Convert Pegasus file of saved messages to Unix mailbox format.
#
#   Don Stokes, <don@daedalus.co.nz>    7 November 1999
#
BEGIN {
   for(;;) {                  # For each message in the file:
      if((getline) <= 0) exit # Skip the Return-Path: line
      hdr = ""                # EOF now means we're done.
      h = ""
      date = ""
      from = "huh"

      #
      # Parse the header, looking for the Date: and From: lines.
      # Deal with header continuation correctly.
      #
      while((getline) > 0) {
         if(!$1) break       # Blank line indicates end of
                             # headers
         hdr = hdr $0 "\n"   # Add header to saved headers
         if(substr($0,1,1) > " ") { # If new header...
            h = tolower($1)
            if(h == "date:") date = substr($0,length($1)+2)
            if(h == "from:") from = substr($0,length($1)+2)
         } else {      # else continuation from 
                  # previous line
            if(h == "date:") date = date $0
            if(h == "from:") from = from $0
         }
      }

      #
      # Parse From: address
      # Remove whitespace and RFC 822 comments in ()s
      # If an address in <>s is found, use that, otherwise
      # what is left after removing whitespace and comments is
      # the address
      #
      gsub("[\t ]", "", from)      # Kill whitespace
      gsub("\\(.*\\)", "", from)   # Remove (comment)s
      if(i = index(from, "<")) {   # Extract <u@h> if present
         from = substr(from, i+1)
         from = substr(from, 1, index(from, ">") - 1)
      }

      #
      # Parse the date
      # If no day ("Day,") found, assume it was Monday.  
      # Deal with 2-digit years
      #
      $0 = date
      if(!gsub(",","",$1)) $0 = "Mon " $0
      if($4 < 70) $4 += 2000      # <70 = 20xx
      if($4 < 1900) $4 += 1900   # <1900 = 19xx

      #
      # Output Unix From line:
      # From user@host  Day Mon DD HH:MM:SS YYYY
      # Follow that with the saved header.  Note that header 
      # terminates with a LF, so we don't need to add another.
      #
      printf "From %s  %s %s %2d %s %s\n", 
         from, $1, $3, $2, $5, $4
      print hdr

      #
      # Read the body and put it to the file, until we hit EOF
      # or the Pegasus delimiter "-- End --".  Quote any lines
      # starting with a naked From with a ">" in the canonical
      # (broken) Unix mail way.
      #
      while((getline) > 0) {
         if($0 == "-- End --") break
         if(substr($0,1,5) == "From ") printf ">"
         print
      }
   }
}
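
A rough sanity check on a converted file, offered as a sketch rather than as part of Don's script: count the "-- End --" delimiters in the Pegasus export and the Unix From lines in the output, and compare the two.  This assumes every message in the export, including the last one, is followed by the delimiter.  The function names are hypothetical.

```shell
#!/bin/sh
# Sketch only: compare message counts before and after conversion.
# Assumes each message in the export ends with a "-- End --" line.

# grep -c prints the count but exits non-zero when the count is 0,
# so the exit status is deliberately ignored.
count_end_markers() { grep -c '^-- End --$' "$1" || true; }
count_from_lines()  { grep -c '^From ' "$1" || true; }

# usage: check_conversion PEGASUS_EXPORT CONVERTED_MAILBOX
check_conversion() {
    want=$(count_end_markers "$1")
    got=$(count_from_lines "$2")
    if [ "$want" -eq "$got" ]; then
        echo "ok: $got messages"
    else
        echo "mismatch: $want delimiters, $got From lines" >&2
        return 1
    fi
}
```

Body lines that begin with "From " do not upset the count, because the awk script quotes them with ">" on the way through.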
Hooking it all together
I ran the above code over the message files and then through hypermail.  Like this:
awk -f pegasus.conversion.awk may99.txt > may99.txt.out
hypermail -p -m "may99.txt.out" -c "hmrc.adsl" \
          -d "/usr/local/www/data/freebsddiary/adsl/199905"

Then all I had to do was create an index.html for the top directory and I was up and running.
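
With more than a few months to process, the two commands and the top-level index can be scripted.  The sketch below makes several assumptions that are not in the article: every export is named like may99.txt (three-letter month, two-digit year), each month goes into a YYYYMM directory like 199905, and the index is a bare list of links.  All function names are hypothetical.

```shell
#!/bin/sh
# Sketch only: convert each monthly export, run hypermail on it, and
# write a simple top-level index.html.  Function names are made up.

# Map a name such as "may99" to "199905".
month_dir() {
    mon=$(printf '%s' "$1" | cut -c1-3 | tr '[:upper:]' '[:lower:]')
    yy=$(printf '%s' "$1" | cut -c4-)
    # Same two-digit year rule as the awk script: below 70 means 20xx.
    case $yy in
        [0-6][0-9]) century=20 ;;
        [7-9][0-9]) century=19 ;;
        *) return 1 ;;
    esac
    case $mon in
        jan) mm=01 ;; feb) mm=02 ;; mar) mm=03 ;; apr) mm=04 ;;
        may) mm=05 ;; jun) mm=06 ;; jul) mm=07 ;; aug) mm=08 ;;
        sep) mm=09 ;; oct) mm=10 ;; nov) mm=11 ;; dec) mm=12 ;;
        *) return 1 ;;
    esac
    echo "$century$yy$mm"
}

# usage: convert_all EXPORT_DIR WEB_DIR
convert_all() {
    for f in "$1"/*.txt; do
        [ -f "$f" ] || continue
        dir=$(month_dir "$(basename "$f" .txt)") || {
            echo "skipping $f: unexpected name" >&2
            continue
        }
        awk -f pegasus.conversion.awk "$f" > "$f.out"
        hypermail -p -m "$f.out" -c "hmrc.adsl" -d "$2/$dir"
    done
}

# usage: write_index WEB_DIR
# Lists every six-digit directory; the glob sorts them in date order.
write_index() {
    {
        echo '<html><head><title>ADSL Mailing list</title></head><body>'
        echo '<h1>ADSL Mailing list</h1>'
        echo '<ul>'
        for d in "$1"/[0-9][0-9][0-9][0-9][0-9][0-9]; do
            [ -d "$d" ] || continue
            m=$(basename "$d")
            echo "<li><a href=\"$m/\">$m</a></li>"
        done
        echo '</ul></body></html>'
    } > "$1/index.html"
}
```

Running convert_all on the directory of exports and then write_index on /usr/local/www/data/freebsddiary/adsl would reproduce the manual steps above for every month at once.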

What's left to do?
See "periodic - using it to run shell scripts" for details on how to automagically update the online archives at the end of each day.

I also wrote up how I capture these messages in "redirecting majordomo mailing lists".
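
As a rough idea of what such a daily job might contain, here is a sketch.  It assumes the majordomo archive for the current month is named adsl.archive.YYMM (as in adsl.archive.9911 earlier) and, purely for illustration, writes each month into a directory named after that suffix.  The variable and function names are hypothetical, and the hypermail options are the same ones used earlier in this article.

```shell
#!/bin/sh
# Sketch only: refresh the web archive for one month.
# Paths default to the ones used in this article; names are made up.

LIST=adsl
LISTS_DIR=${LISTS_DIR:-/usr/local/majordomo/lists}
WEB_DIR=${WEB_DIR:-/usr/local/www/data/freebsddiary/adsl}
CONFIG=${CONFIG:-/usr/local/hypermail/adsl/hmrc.adsl}

# usage: update_month [YYMM]    (defaults to the current month)
update_month() {
    yymm=${1:-$(date +%y%m)}
    mbox="$LISTS_DIR/$LIST.archive.$yymm"

    # No file, or an empty one, means nothing has been posted yet.
    if [ ! -s "$mbox" ]; then
        echo "no archive for $yymm, nothing to do"
        return 0
    fi

    hypermail -p -m "$mbox" -c "$CONFIG" -d "$WEB_DIR/$yymm"
}
```

A script run from periodic would end with a plain call to update_month.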

