
» Surpass Web Hosting Forums » Discussions » Email » new spamassassin script

Old July 31st, 2007, 6:18 PM   #19 (permalink)
Surpass Fan
Comfy Contributor
 
pseudoswede's Avatar
 
Joined in Jun 2003
Lives in Denver
Hosted on D9
142 posts
Gave thanks: 4
Thanked 3 times
Another thing...

How can I have this script automagically remove spam that is older than, say, 30 days?

Thanks in advance.
__________________
"In the end, everything will be fine - if it is not fine, it is not the end."
PseudoSwede
larvez.com
Dime9
Old August 1st, 2007, 5:26 PM   #20 (permalink)
Surpass Fan
Comfy Contributor
 
jdcopelin's Avatar
 
Joined in Feb 2004
Lives in Norfolk, England
Hosted on Pass32
167 posts
Gave thanks: 23
Thanked 19 times
Hello,

Thanks for trying out my script. I will try to answer your questions.

Quote:
Originally Posted by pseudoswede View Post
Okay, I'm testing this out on one of my domains. I ran the cron job just now, and I got this result... Is this running correctly?
Unfortunately, no. There are some important lines missing from the output (see one of my previous posts in this thread for an example of the intended output).

Quote:
Here are my user-configurable variables...
Code:
# 1. spamassassin spam delivery folder name
MYSPAM=SPAM

# 2. learning spam (spam that arrived in the inbox, missed by spamassassin)
MYLEARNSPAM=SPAM

# 3. ham (mail which got marked as spam when it was a genuine email)
MYHAM=HAM
You need to put a "." at the start of the name of each folder (for example, MYSPAM=.SPAM). This is just the way maildir stores the folder on disk, and the script has to search for the folder's actual name.
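As a rough illustration of the advice above, the three variables might look like this once the dots are added. The folder names are taken from the quoted config; the sanity-check loop is a hypothetical addition and not part of the original script.

```shell
#!/bin/bash
# Folder names as maildir stores them on disk: note the leading "."
MYSPAM=.SPAM
MYLEARNSPAM=.SPAM
MYHAM=.HAM

# Hypothetical sanity check (not in the original script):
# warn if any configured folder name is missing its leading dot.
for name in "$MYSPAM" "$MYLEARNSPAM" "$MYHAM"; do
    case $name in
        .*) ;;  # looks like a maildir subfolder name
        *)  echo "warning: folder name '$name' should start with a dot" >&2 ;;
    esac
done
```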

Quote:
How can I have this script automagically remove spam that is older than, say, 30 days?
The script sort of already does this. All messages that have been present in a (spam) folder since the last login are moved to the specified archive folder (SPAMDIR=$HOME/spamdb for spam, HAMDIR=$HOME/hamdb for ham) once they are more than 7 days old. Somewhere around line 85 is this line of code:
Code:
SPAM_FILES=`find $SPAMFOLDER/cur -name \*, -mtime +7 -type f`
Just modify the +7 to +30 to change it to 30 days.
If you don't want to archive the files, then change the line of code a few lines down from 85
Code:
mv -f --target-directory=$SPAMDIR $SPAM_FILES
to
Code:
rm $SPAM_FILES
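For anyone who prefers to see the whole idea in one place, here is a minimal sketch of deleting old spam directly from find, assuming GNU find and the same file name pattern as the line quoted above. The function name purge_old_spam and its arguments are made up for illustration; this is not the code in the script itself.

```shell
#!/bin/bash
# purge_old_spam FOLDER [DAYS]
# Delete message files in FOLDER that were last modified more than DAYS days ago.
# Hypothetical helper; the '*,' pattern is copied from the script's own find line.
purge_old_spam() {
    local folder=$1
    local days=${2:-30}
    # find hands the matching files straight to rm, so the list of
    # names never has to be stored in a shell variable first.
    find "$folder" -name '*,' -mtime +"$days" -type f -exec rm -f {} +
}

# Example call (commented out so nothing is deleted by accident):
# purge_old_spam "$SPAMFOLDER/cur" 30
```

Deleted mail cannot be recovered, so it may be worth running the same find command with -print in place of the -exec part first to see what would be removed.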
Feel free to ask if you have any more questions.

Cheers
Jonathan
__________________
Server: Pass32 and dedicated server
Old August 2nd, 2007, 5:38 PM   #21 (permalink)
Surpass Fan
Comfy Contributor
 
pseudoswede's Avatar
 
Thanks for the help.

It kind of worked, but I do have two error messages...

Code:
Running jicoweb.com spam assassin training script

JicoScript learnspam script  0.6.20070609
SpamAssassin version 3.2.2

Currently running as user: ****

Checking spam and ham repositories exist...
/home/****/spamdb exists!
/home/****/hamdb exists!

Learning from SPAM

        /home/****/mail/****/mailtrap/.SPAM
                learning from spam ...
                Learned tokens from 0 message(s) (2 message(s) examined)
                learning from old spam ...
                /home/****/script/learnspam: /usr/bin/sa-learn: /usr/bin/perl: bad interpreter: Argument list too long
/home/****/script/learnspam: line 94: /bin/mv: Argument list too long

        /home/****/mail/****/innebandy/.SPAM
                learning from spam ...
                Learned tokens from 351 message(s) (365 message(s) examined)
                learning from old spam ...
                Learned tokens from 14 message(s) (14 message(s) examined)

Learning from uncaught SPAM

        /home/****/mail/****/mailtrap/.SPAM
                learning from messages ...
                /home/****/script/learnspam: /usr/bin/sa-learn: /usr/bin/perl: bad interpreter: Argument list too long
/home/****/script/learnspam: line 119: /bin/mv: Argument list too long

        /home/****/mail/****/innebandy/.SPAM
                learning from messages ...
                Learned tokens from 1 message(s) (1 message(s) examined)

Learning HAM
 
Synchronising spam database
bayes: synced databases from journal in 0 seconds: 154 unique entries (154 total entries)
expired old bayes database entries in 19 seconds
157777 entries kept, 0 deleted
token frequency: 1-occurrence tokens: 71.65%
token frequency: less than 8 occurrences: 18.03%

Done
(username and domain name have been removed)

There are probably around 1000 spams that are older than 30 days in that folder.

The line(s) in question are..
Code:
mv -f --target-directory=$SPAMDIR $SPAM_FILES
Last edited by pseudoswede; August 2nd, 2007 at 5:49 PM.. Reason: my grammar ain't so good
Old August 3rd, 2007, 6:45 PM   #22 (permalink)
Surpass Fan
Comfy Contributor
 
jdcopelin's Avatar
 
Quote:
Originally Posted by pseudoswede View Post
Thanks for the help.

It kind of worked, but I do have two error messages...

Code:
learning from old spam ...
/home/****/script/learnspam: /usr/bin/sa-learn: /usr/bin/perl: bad interpreter: Argument list too long
/home/****/script/learnspam: line 94: /bin/mv: Argument list too long
There are probably around 1000 spams that are older than 30 days in that folder.

The line(s) in question are..
Code:
mv -f --target-directory=$SPAMDIR $SPAM_FILES

Oh no! That looks like a bug in my script (or a limit on how many files you can pass as parameters on the command line). Unfortunately my home PC has just suffered a hard drive failure, so I don't have access to any of my files right now to investigate and fix this. As soon as the PC is fixed, I will look into this for you.

Cheers
Jonathan
This user thanks jdcopelin for this great post!
pseudoswede (August 3rd, 2007)
Old August 3rd, 2007, 6:56 PM   #23 (permalink)
Surpass Fan
Comfy Contributor
 
pseudoswede's Avatar
 
I'm a software tester by trade, so I like to break things. Whether on purpose or accidentally.

Unfortunately, I'm not a developer, so I can't fix them.

Thanks for your help, I appreciate it.
Last edited by pseudoswede; August 3rd, 2007 at 6:57 PM..
Old August 14th, 2007, 5:29 PM   #24 (permalink)
Surpass Fan
Comfy Contributor
 
jdcopelin's Avatar
 
Quote:
Originally Posted by pseudoswede View Post
I'm a software tester by trade, so I like to break things. Whether on purpose or accidentally.

Unfortunately, I'm not a developer, so I can't fix them.

Thanks for your help, I appreciate it.
pseudoswede, I haven't forgotten about you! My PC has only just come back online after my HDD crash - my DVD drive failed as well, so I couldn't even restore a backup until I got a new drive fitted.

Anyway, I have done some more testing, and the only way I can find around the command-line argument limit will make the whole script grossly inefficient. If you know BASH scripting: I think the only way would be to call sa-learn and mv individually for each maildir message file that the "find" command lists. That will definitely slow the script down.
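One possible middle ground, sketched here under assumptions rather than taken from the script: find's "-exec ... {} +" form (and GNU mv's --target-directory, which the script already uses) lets find pack as many file names into each command as the limit allows, so sa-learn and mv run once per batch instead of once per file. The function name learn_and_archive and the optional learner override are hypothetical and only there to make the sketch testable.

```shell
#!/bin/bash
# learn_and_archive SRC_DIR DEST_DIR DAYS [LEARN_CMD...]
# Hypothetical helper: feed old messages to a learner command and then
# archive them, without ever building one huge argument list.
learn_and_archive() {
    local src=$1 dest=$2 days=$3
    shift 3
    # Optional override of the learner command; defaults to sa-learn --spam.
    local -a learn=("$@")
    if [ "${#learn[@]}" -eq 0 ]; then
        learn=(sa-learn --spam)
    fi

    # "{} +" makes find append as many file names as fit under the
    # kernel limit and run the command again for any that remain.
    find "$src" -name '*,' -mtime +"$days" -type f -exec "${learn[@]}" {} +

    # Same batching for the move into the archive folder (GNU mv).
    find "$src" -name '*,' -mtime +"$days" -type f \
        -exec mv -f --target-directory="$dest" {} +
}

# Example call (commented out; paths follow the variables quoted in this thread):
# learn_and_archive "$SPAMFOLDER/cur" "$SPAMDIR" 7
```

With around 1000 old messages this would normally mean only a handful of sa-learn and mv invocations, though the exact number depends on the system's limit and the length of the paths.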

I think I can reduce the files sent to sa-learn by using grep to filter out spam already learnt as spam (assuming false negatives and caught spam are kept in the same folder).
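A sketch of what such a grep filter could look like. It assumes the messages carry SpamAssassin's X-Spam-Status header with its usual autolearn= field, and that "autolearn=spam" is a good enough sign that a message has already been learnt; both are assumptions, not something confirmed in this thread. The function name list_unlearned is made up.

```shell
#!/bin/bash
# list_unlearned DIR
# Hypothetical helper: print the message files in DIR that do NOT contain
# the string "autolearn=spam", i.e. candidates that may still need sa-learn.
list_unlearned() {
    # grep -L lists only the files without a match; "{} +" batches the
    # file names so the argument limit is not hit here either.
    find "$1" -type f -exec grep -L 'autolearn=spam' {} +
}

# Example call (commented out):
# list_unlearned "$SPAMFOLDER/cur"
```

This matches the string anywhere in the file, including the body, so it is only a rough filter. The resulting list could then be passed on to sa-learn in batches.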

Any comments are welcome!

Cheers
Jonathan
Old August 14th, 2007, 6:01 PM   #25 (permalink)
minor deity
Super #1
 
Bigjohn's Avatar
 
Joined in Apr 2004
Lives in Georgia
Hosted on XEON
7,395 posts
Gave thanks: 28
Thanked 94 times
Does the Spam Assassin WIKI help this process at all?
http://wiki.apache.org/spamassassin/BayesInSpamAssassin

We've gone from a 10 line simple, efficient script on MBOX to a convoluted mess of code due to maildir...*sigh*
__________________
Proud to be a Surmunity Mod!
XEON PASS60 PASS61
Make a fundamental difference!
My Sites:
Curious about Brewing Beer? Join the community!
>>>>> Some Change is GOOD! Keep your paycheck! Support the Fair Tax
Get into an Art museum
Victorian London
It's your brain -ON WEB - mybrainhost.com (under development)
What SHOULD Government do? Much Less than it Does!
Old August 15th, 2007, 9:10 AM   #26 (permalink)
Surpass Fan
Comfy Contributor
 
jdcopelin's Avatar
 
Quote:
Originally Posted by Bigjohn View Post
We've gone from a 10 line simple, efficient script on MBOX to a convoluted mess of code due to maildir...*sigh*
Sorry you think my code is a "convoluted mess". My post in the thread that you and Cowboy contributed heavily to was a fix to get the script to run on a server that had switched from MBOX to MAILDIR. The script here does a lot more than that original script. Unfortunately it has fallen foul of a kernel limit on the in-memory size of the arguments passed to programs via the command line.
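For the curious, the limit can be read with getconf, and a rough check against it might look like the sketch below. The helper would_fit is hypothetical, and the comparison is only approximate because the real limit also has to cover the environment variables and some per-argument overhead.

```shell
#!/bin/bash
# Kernel limit (in bytes) on arguments plus environment for a new program.
arg_max=$(getconf ARG_MAX)
echo "ARG_MAX on this system: $arg_max bytes"

# would_fit LIST
# Hypothetical helper: succeed if LIST, passed as arguments, is smaller
# than ARG_MAX. Rough estimate only; it ignores the environment's share.
would_fit() {
    local bytes
    bytes=$(printf '%s' "$1" | wc -c)
    [ "$bytes" -lt "$arg_max" ]
}

# Example (commented out): check the file list before handing it to mv.
# would_fit "$SPAM_FILES" || echo "too many files for one command" >&2
```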

I am about to go on holiday so won't be reading the forum for about a fortnight, but when I get back I will look through older scripts I based on Cowboy's MBOX version and post it here.

Cheers
Jonathan
Old August 15th, 2007, 10:49 AM   #27 (permalink)
minor deity
Super #1
 
Bigjohn's Avatar
 
Quote:
Originally Posted by jdcopelin View Post
Sorry you think my code is a "convoluted mess". My post in the thread that you and Cowboy contributed heavily to was a fix to get the script to run on a server that had switched from MBOX to MAILDIR. The script here does a lot more than that original script. Unfortunately it has fallen foul of a kernel limit on the in-memory size of the arguments passed to programs via the command line.

I am about to go on holiday so won't be reading the forum for about a fortnight, but when I get back I will look through older scripts I based on Cowboy's MBOX version and post it here.

Cheers
Jonathan
Jonathan - "convoluted mess" was not at all meant as a disparagement; I hope you'll understand. My point was simply that something as elegant and simple as the original script cannot work under MAILDIR. That further reinforces my conclusion that the actual "benefits" of maildir are probably limited to people who want GMAIL-sized mailboxes and folders on their servers, while the limitations (speed of search, spam-assassin training, etc.) are foisted off on all.

Enjoy your holiday!