
» Surpass Web Hosting Forums » Discussions » Email » SpamAssassin to work with maildir

Old March 1st, 2007, 1:15 PM   #1 (permalink)
Registered User
Fresh Surpasser
 
Joined in Feb 2006
Lives in Oklahoma
Hosted on pass80
18 posts
Gave thanks: 3
Thanked 0 times
SpamAssassin to work with maildir

I had SpamAssassin working with mbox, but my account has now been changed to maildir. I sent in a ticket and got the response below. Can anyone help me set this up to work with all my email accounts now that they use maildir? The instructions in the sticky above do not work with maildir.

Thanks
Victor

Hello,

We understand that your script works only with the mbox format. An mbox is a flat file, i.e. all of the messages for a particular mail folder, such as your Inbox, are stored in a single file.
Your script has been set up to process that single file.

Maildir is now the format of choice because it is faster and more fault-tolerant. Since it solves the reliability problems that plague mbox files and MH folders, we have converted most of our servers to the maildir format. Here, each message is stored in a separate file.

So you should adjust your script to go through each file in the spam/ham folders instead of a single spam file.
Since we do not provide code support, you will need to contact a script developer to adjust it.

System Administrator
SurpassHosting.com
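
To make the difference described in the reply above concrete, here is a minimal sketch that reports how a given mail store is laid out on disk. The function name `mail_format` is hypothetical, and treating any regular file as an mbox is a simplifying assumption; a maildir is recognised by its cur/, new/ and tmp/ subdirectories.

```shell
#!/bin/sh
# Hypothetical helper (not part of any script in this thread).
# Prints "maildir" if $1 is a directory containing cur/, new/ and tmp/,
# "mbox" if $1 is a regular file (simplifying assumption), else "unknown".
mail_format() {
    if [ -d "$1/cur" ] && [ -d "$1/new" ] && [ -d "$1/tmp" ]; then
        echo "maildir"
    elif [ -f "$1" ]; then
        echo "mbox"
    else
        echo "unknown"
    fi
}
```

For example, `mail_format "$HOME/mail/.spam"` would be expected to print `maildir` on a converted account.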
__________________
vcsok.com

Server reseller pass80
ironleg is offline  
Old March 1st, 2007, 5:01 PM   #2 (permalink)
Surpass Fan
Comfy Contributor
 
jdcopelin's Avatar
 
Joined in Feb 2004
Lives in Norfolk, England
Hosted on Pass32
167 posts
Gave thanks: 23
Thanked 19 times
Hello,

This was covered in another slightly unrelated thread some time ago, but I can't find it right now. Basically you need to adjust your script a little bit to allow for the fact that the maildir format stores messages as single files instead of putting them all in one big file. I won't post the whole script here, but this should get you started...

for SPAMFOLDER in `find $HOME/mail -name .spam -print`
do
find $SPAMFOLDER/cur -type f -name \*S | sa-learn --spam --no-sync -f -
done

The first line finds every ".spam" directory under /home/username/mail.
The third line searches each ".spam/cur" directory for message files whose names end in "S" (the maildir flag for a message that has been read) and passes the list of files to the sa-learn command.
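
As a side note on the file name test: maildir keeps the flags after ":2," in the file name, in ASCII order, so a read message that is also flagged T (trashed) ends in "T" and would not match `\*S`. The sketch below shows one way to test for the S flag anywhere in the flag list. Both function names are hypothetical and this is only an illustration, not part of the script above.

```shell
#!/bin/sh
# Hypothetical helpers for inspecting maildir file names.

# Print the flag letters of a message file name (the part after ":2,").
# Prints an empty line if the name carries no flag section.
maildir_flags() {
    case "$1" in
        *:2,*) printf '%s\n' "${1##*:2,}" ;;
        *)     printf '\n' ;;
    esac
}

# Succeed if the message carries the S (seen/read) flag.
is_seen() {
    case "$(maildir_flags "$1")" in
        *S*) return 0 ;;
        *)   return 1 ;;
    esac
}
```

A loop could then call `is_seen "$MSG"` on each file in cur/ instead of relying on the name ending in "S".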

To clear the learned spam/ham afterwards, you would then have to repeat this sequence separately, removing each file found instead of running sa-learn.

EDIT - the third line needs the "-" at the end.
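
The clean-up pass described above (same search, but removing the files instead of learning from them) might look roughly like this. The function name `remove_read_spam` is hypothetical, and the sketch assumes sa-learn has already processed the messages, since it deletes them outright.

```shell
#!/bin/sh
# Hypothetical helper: delete the read messages (names ending in "S")
# from the cur/ directory of every .spam folder below the root in $1.
# Run it only after sa-learn has processed the same files.
remove_read_spam() {
    find "$1" -type d -name .spam | while IFS= read -r folder; do
        # Skip anything named .spam that is not a maildir folder.
        [ -d "$folder/cur" ] || continue
        find "$folder/cur" -type f -name '*S' -exec rm -f {} +
    done
}
```

It would be called as `remove_read_spam "$HOME/mail"`. Unread messages and messages in other folders are left alone.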

Hope this helps.
Jonathan
__________________
Server: Pass32 and dedicated server

Last edited by jdcopelin; March 1st, 2007 at 5:03 PM.. Reason: typo in the command
jdcopelin is offline  
Old March 2nd, 2007, 9:14 PM   #3 (permalink)
Does this look correct?

#!/bin/sh
echo "Learning SPAM"
for FILE in `find $HOME -name .spam -print`
do
find $SPAMFOLDER/cur -type f -name \*S | sa-learn --spam --no-sync -f -
do
echo "Processing $FILE"
sa-learn --spam --maildir $FILE
done

echo "Learning HAM"
for FILE in `find $HOME -name .ham -print`
do
find $HAMFOLDER/cur -type f -name \*S | sa-learn --spam --no-sync -f -
do
echo "Processing $FILE"
sa-learn --ham --maildir $FILE
rm $FILE
touch $FILE
done
echo "Done"
__________________
vcsok.com

Server reseller pass80
ironleg is offline  
Old March 3rd, 2007, 11:15 AM   #4 (permalink)
Hi,

I don't think your script will do what it is supposed to. With the latest version of SpamAssassin you don't have to specify the maildir format on the command line. Also, you shouldn't "touch" the message file after removing it: that recreates it as an empty file, so you will just see lots of messages with empty bodies and headers in your IMAP folders.

Here is a simpler version of one I run on my server. It learns and removes only spam marked as read (a crude form of user validation). It also moves all learned spam messages to the trash folder rather than deleting them, which is useful when testing scripts like this! Hope you find this useful.

Code:
#!/bin/sh

for SPAMFOLDER in `find /home/username/mail -name .spam -print`
        do
                echo " "
                echo -e "\tProcessing verified spam in $SPAMFOLDER"
                echo -n -e "\t\t"
                find $SPAMFOLDER/cur -type f -name \*S | sa-learn --spam --no-sync -f -
                echo -e "\t\tMoving messages marked as read..."
                cd $SPAMFOLDER
                cd ../.Trash
                echo -n -e "\t\tCurrent Directory: " ; pwd
                echo -n -e "\t\t"
                for MSG in `find $SPAMFOLDER/cur -type f -name \*S`
                        do
                                #echo -e -n "\t\t\t" ; mv -v $MSG cur/
                                echo -n "." ; mv $MSG cur/

                done
done

echo " "
echo "Learning from HAM"
for HAMFOLDER in `find /home/username/mail -name .ham -print`
        do
                echo " "
                echo -e "\tProcessing $HAMFOLDER/cur"
                echo -n -e "\t\t"
                find $HAMFOLDER/cur -type f \( -name \*, -or -name \*S \) | sa-learn --ham --no-sync -f -

                echo -e "\t\tRemoving ham messages..."
                echo -n -e "\t\t"
                for MSG in `find $HAMFOLDER/cur -type f \( -name \*, -or -name \*S \)`
                        do
                                #rm -v $MSG
                                echo -n "." ; rm $MSG
                done
done

echo " "
echo "Synchronising spam database"
#sa-learn --sync --progress
Cheers
Jonathan
__________________
Server: Pass32 and dedicated server
jdcopelin is offline  
These users thank jdcopelin for this great post!
ironleg (March 3rd, 2007), tch3 (March 4th, 2007)
Old March 4th, 2007, 8:17 PM   #5 (permalink)
Destroyer of Evil Robots
Excelling Contributor
 
tch3's Avatar
 
Joined in Oct 2003
Lives in Atlanta, GA
760 posts
Gave thanks: 17
Thanked 9 times
Thanks for that new script. I've just been going in and running the commands by hand on the cur directory and never got around to automating the process after changing from mbox to maildir.
__________________
clair
http://tch3.com
(dedicated)
tch3 is offline  
This user thanks tch3 for this great post!
jdcopelin (March 5th, 2007)
Old March 5th, 2007, 2:47 PM   #6 (permalink)
Do I need to do this for each email box, or will this cover all the email accounts under our account?

thanks in advance
__________________
vcsok.com

Server reseller pass80
ironleg is offline  
Old March 5th, 2007, 3:20 PM   #7 (permalink)
Quote:
Originally Posted by ironleg
Do I need to do this for each email box, or will this cover all the email accounts under our account?

thanks in advance
With the find-based scripts, it will cover all the email accounts, as long as the search starts in the ~/mail directory.
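
To check which folders a search from ~/mail would actually cover before running the real script, a dry-run report along these lines may help. The function name `report_pending` is hypothetical; it only counts files and never calls sa-learn or removes anything.

```shell
#!/bin/sh
# Hypothetical dry run: for every folder named $2 (for example .spam)
# below the root in $1, print the folder path and the number of read
# messages (names ending in "S") waiting in its cur/ directory.
report_pending() {
    find "$1" -type d -name "$2" | sort | while IFS= read -r folder; do
        count=$(find "$folder/cur" -type f -name '*S' 2>/dev/null | wc -l | tr -d ' ')
        echo "$folder $count"
    done
}
```

Running `report_pending "$HOME/mail" .spam` should print one line per spam folder found anywhere below ~/mail, which shows that a single search from that directory reaches every mailbox.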
__________________
clair
http://tch3.com
(dedicated)
tch3 is offline  
Old March 5th, 2007, 3:32 PM   #8 (permalink)
Hi there,
You can run this script via cron for each domain you have. It searches the /home/yourusername/mail/* tree for spam and ham folders.

There are a couple of things you might find interesting, or might want to add to the script if you feel confident enough! It only looks in the /cur directory and not /new, which means any emails delivered while you haven't logged in to your account won't be scanned by the script. If the spam messages are never marked as read, the spam folder will get bigger and bigger. To avoid taking an account over quota when lazy users don't mark them as read, you can add a third section to the script that searches for files created more than 7 days ago and removes them. That should normally give users enough time to locate false positives.
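
A rough sketch of such a third section follows. The function name `expire_old_spam` is hypothetical. It uses the file modification time (`-mtime`) as a stand-in for the creation time, on the assumption that message files are not modified after delivery, and it looks in both cur/ and new/ so that unread spam is covered as well.

```shell
#!/bin/sh
# Hypothetical third section: delete messages older than $2 days from the
# cur/ and new/ directories of every .spam folder below the root in $1,
# whether or not they were ever marked as read.
expire_old_spam() {
    find "$1" -type d -name .spam | while IFS= read -r folder; do
        for sub in cur new; do
            [ -d "$folder/$sub" ] || continue
            # -mtime +N matches files last modified more than N days ago.
            find "$folder/$sub" -type f -mtime +"$2" -exec rm -f {} +
        done
    done
}
```

For the 7-day window mentioned above, the call would be `expire_old_spam "$HOME/mail" 7`. These messages are deleted without being learned, since nobody confirmed them as spam.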

If you run this script on a shared server for email accounts with over 100 messages, I would only run it on an individual email account at a time. It can take 8 minutes to run the sa-learn process on 2000 emails. The support staff might freak out when they see that running! Once you've done that, just run it once/twice a day for the whole domain.
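
For the first run on large mailboxes, limiting the work to one mailbox at a time could be sketched as below. The names `learn_one_account` and `SA_LEARN` are hypothetical, and the example path assumes a ~/mail/domain/account layout that may differ on your server. The sa-learn options are the same ones used earlier in this thread.

```shell
#!/bin/sh
# SA_LEARN is a hypothetical override so the sketch can be dry-run with a
# stand-in command; by default it is the real sa-learn.
SA_LEARN=${SA_LEARN:-sa-learn}

# Learn read spam from a single account's maildir ($1) rather than from
# the whole mail tree. Returns 1 if the account has no .spam/cur folder.
learn_one_account() {
    spam="$1/.spam/cur"
    if [ ! -d "$spam" ]; then
        echo "no spam folder in $1" >&2
        return 1
    fi
    find "$spam" -type f -name '*S' | $SA_LEARN --spam --no-sync -f -
}
```

An example call would be `learn_one_account "$HOME/mail/example.com/alice"`, repeated for each mailbox before switching to the whole-domain run from cron.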

Regards
Jonathan


PS - I can't edit my last post, but on the last line I had the command commented out while I tested the script before posting it. I meant to remove the # after I pasted the text. You should uncomment it to allow SpamAssassin to synchronise its journal and database.
__________________
Server: Pass32 and dedicated server
jdcopelin is offline  
Old August 14th, 2007, 12:46 AM   #9 (permalink)
minor deity
Super #1
 
Bigjohn's Avatar
 
Joined in Apr 2004
Lives in Georgia
Hosted on XEON
7,395 posts
Gave thanks: 28
Thanked 94 times
Quote:
Originally Posted by jdcopelin
Hi there,
<snip>
If you run this script on a shared server for email accounts with over 100 messages, I would only run it on an individual email account at a time. It can take 8 minutes to run the sa-learn process on 2000 emails. The support staff might freak out when they see that running! Once you've done that, just run it once/twice a day for the whole domain.
So much for Maildir being more "efficient". My old script ran spam and ham on 14 accounts in one domain in 2 minutes total.
__________________
Proud to be a Surmunity Mod!
XEON PASS60 PASS61
Make a fundamental difference!
My Sites:
Curious about Brewing Beer? Join the community!
>>>>> Some Change is GOOD! Keep your paycheck! Support the Fair Tax
Get into an Art museum
Victorian London
It's your brain -ON WEB - mybrainhost.com (under development)
What SHOULD Government do? Much Less than it Does!
Bigjohn is offline  