Objectives:
Instructions: In this assignment, you are asked to write a simple email filter in Perl.
Simple email messages have a straightforward format: a set of headers, then a blank line, and then a body. Each header generally matches /^[-a-zA-Z0-9_]+: .*$/ and may be followed by zero or more continuation lines that start with whitespace (usually a tab), i.e. /\t.*/.
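The format described above can be sketched as a small Perl routine. This is only an illustration under stated assumptions: the helper name parse_message is hypothetical, lines are assumed to end in "\n", and a line that matches neither the header pattern nor a leading-whitespace continuation is silently skipped.

```perl
use strict;
use warnings;

# Split a raw message into (array ref of header strings, body string).
# parse_message is a hypothetical helper name, not part of the assignment.
sub parse_message {
    my ($raw) = @_;

    # Headers end at the first blank line; everything after it is the body.
    my ($head, $body) = split /\n\n/, $raw, 2;
    $head = '' unless defined $head;
    $body = '' unless defined $body;

    my @headers;
    for my $line (split /\n/, $head) {
        if ($line =~ /^[-a-zA-Z0-9_]+: .*$/) {
            push @headers, $line;          # start of a new header
        }
        elsif ($line =~ /^\s/ && @headers) {
            $headers[-1] .= "\n$line";     # continuation of the previous header
        }
    }
    return (\@headers, $body);
}
```

Keeping each continuation line attached to its header means a later check can treat one header as one string.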
For instance, here's an email message:
Received: from mail.cs.fsu.edu (mail.cs.fsu.edu [128.186.120.4]) by newmail.cs.fsu.edu (Postfix) with ESMTP id 06476175D4C
Received: by mail.cs.fsu.edu (Postfix) id 95D01F2DC4; Sat, 7 Jun 2008 03:54:40 -0400 (EDT)
Delivered-To: langley
Message-ID: <484A3E21.4090704@fsu.edu>
Date: Sat, 07 Jun 2008 03:52:01 -0400
From: Tom Kitterman
To: nolenet, OTC Help Desk Staff
Subject: [Nolenet] Mailman listserv website down
X-fsucs-MailScanner-SpamCheck: not spam, SpamAssassin (cached, score=-2.599, required 5, autolearn=not spam, BAYES_00 -2.60)
X-Spam-Status: No

Hi,

There's something wrong with the mailman listserv website on lists.fsu.edu. This happened when we moved it to the new hardware. It's almost 4AM and I've run out of ideas on how to fix it at the moment so I'm going home to get some sleep and try again tomorrow. So for now that website is non-functional.

The mailman list software is processing messages so this should mostly affect list owners. Until we get it fixed list owners should open a ticket through the help desk in the normal manner for any critical issues.

Sorry for the inconvenience.

Tom K.
_______________________________________________
https://lists.fsu.edu/mailman/listinfo/nolenet
--
This message has been scanned for viruses and dangerous content by MailScanner, and is believed to be clean.
Your task is to write a filter that reads one email message from standard input. (Please do not use the diamond operator, since it also reads from any files listed on the command line, and that is not what this filter should do.)
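One way to read from standard input without the diamond operator is to call readline on an explicit filehandle. The sketch below assumes the whole message fits in memory; read_message is a hypothetical helper name.

```perl
use strict;
use warnings;

# Read an entire message from the given filehandle and return it as one string.
# read_message is a hypothetical helper name, not part of the assignment.
sub read_message {
    my ($fh) = @_;
    local $/;                  # undefine the record separator: slurp mode
    my $raw = <$fh>;           # readline on an explicit handle, not the bare <>
    return defined $raw ? $raw : '';
}

# Usage: my $raw = read_message(\*STDIN);
```

Passing the handle as an argument keeps the routine easy to exercise with an in-memory filehandle during testing.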
The filter checks whether the message has a Subject: header and, if so, whether that header contains either the marker [SPAM] or the marker {SPAM}. If any Subject: header in the message contains either marker, no further processing happens (i.e., the filter "drops the message").
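The marker test might look roughly like the following. It is a sketch under assumptions the assignment does not settle: the header name is matched case-sensitively, only the first line of a Subject: header is examined (not its continuation lines), and is_spam is a hypothetical helper that receives the header section as a single string.

```perl
use strict;
use warnings;

# Return 1 if any Subject: header line contains [SPAM] or {SPAM}, else 0.
# is_spam is a hypothetical helper name, not part of the assignment.
sub is_spam {
    my ($head) = @_;
    for my $line (split /\n/, $head) {
        next unless $line =~ /^Subject: /;           # only Subject: headers count
        return 1 if $line =~ /\[SPAM\]|\{SPAM\}/;    # brackets and braces escaped
    }
    return 0;
}
```

Restricting the match to lines that begin with Subject: avoids dropping a message merely because another header happens to contain one of the markers.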
However, if the message is not spam, your program should create a file in /tmp with a filename of the form filter-HOSTNAME-TIMESTAMP, where HOSTNAME is the output of running the program hostname and TIMESTAMP is the time in the format YYYYMMDDHHMMSS. The file should contain only the body of the message, with all headers removed (see the examples below).
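Building the filename and writing the body could be sketched as below. The assignment does not say which time zone to use, so local time is an assumption here; output_path and write_body are hypothetical helper names.

```perl
use strict;
use warnings;
use POSIX qw(strftime);

# Build /tmp/filter-HOSTNAME-TIMESTAMP from a hostname and an epoch time.
# output_path is a hypothetical helper name; local time is an assumption.
sub output_path {
    my ($host, $epoch) = @_;
    my $stamp = strftime('%Y%m%d%H%M%S', localtime($epoch));
    return "/tmp/filter-$host-$stamp";
}

# Write only the message body to the given path.
sub write_body {
    my ($path, $body) = @_;
    open(my $out, '>', $path) or die "Cannot open $path: $!";
    print {$out} $body;
    close($out) or die "Cannot close $path: $!";
}

# Usage:
#   chomp(my $host = `hostname`);    # output of the hostname program
#   write_body(output_path($host, time), $body);
```

The chomp matters because the output of hostname ends with a newline, which would otherwise end up inside the filename.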
Thus in the above example, the output file would be named /tmp/filter-sophie.cs.fsu.edu-20081016114906 and would have the contents:
Hi,

There's something wrong with the mailman listserv website on lists.fsu.edu. This happened when we moved it to the new hardware. It's almost 4AM and I've run out of ideas on how to fix it at the moment so I'm going home to get some sleep and try again tomorrow. So for now that website is non-functional.

The mailman list software is processing messages so this should mostly affect list owners. Until we get it fixed list owners should open a ticket through the help desk in the normal manner for any critical issues.

Sorry for the inconvenience.

Tom K.
_______________________________________________
https://lists.fsu.edu/mailman/listinfo/nolenet
--
This message has been scanned for viruses and dangerous content by MailScanner, and is believed to be clean.
Your Perl program should be named filter-20081016.pl.
Here are a number of example input and output files that you can use for testing:
Sample Input File | Corresponding Sample Output File |
---|---|
1 | 1 |
2 | NO OUTPUT |
3 | 3 |
Homework submission
Please email your script filter-20081016.pl as an attachment to langley@cs.fsu.edu no later than the beginning of class on Tuesday, October 21.