The Community Forums

Interact with an entire community of cPanel & WHM users!

cpanel user run SA-learn? (spam assassin learner?)

Discussion in 'General Discussion' started by bigjohn, Feb 2, 2004.

  1. bigjohn

    bigjohn Well-Known Member

    Joined:
    Jun 7, 2003
    Messages:
    77
    Likes Received:
    0
    Trophy Points:
    6
    Is there a way for an end user to run the SpamAssassin learner (sa-learn) on mail directed to his domain(s)?

    Please advise. It appears that cPanel installs SpamAssassin with autolearn=off. Lots of spam with gibberish text is getting through.

    John
     
  2. bigjohn

    bigjohn Well-Known Member

    Joined:
    Jun 7, 2003
    Messages:
    77
    Likes Received:
    0
    Trophy Points:
    6
    I guess this is a no?
     
  3. h2oski

    h2oski Well-Known Member

    Joined:
    Dec 12, 2001
    Messages:
    68
    Likes Received:
    0
    Trophy Points:
    6
  4. bigjohn

    bigjohn Well-Known Member

    Joined:
    Jun 7, 2003
    Messages:
    77
    Likes Received:
    0
    Trophy Points:
    6
  5. bigjohn

    bigjohn Well-Known Member

    Joined:
    Jun 7, 2003
    Messages:
    77
    Likes Received:
    0
    Trophy Points:
    6
    bump!

    is this possible? ANYONE have a script that will run SA-LEARN? I'm a php/perl/java newbie... HELP

    John
     
  6. perlchild

    perlchild Well-Known Member

    Joined:
    Sep 1, 2002
    Messages:
    279
    Likes Received:
    0
    Trophy Points:
    16
    Why do you want to run a script manually instead of letting SpamAssassin's autolearn feature (configured through a config file) do its job?

    Just take an email that's gone through your server and check the headers. Mine have autolearn=ham, which means that it autolearns (the equivalent of sa-learn) "good" email, but not spam. You might want to look at spamassassin.org for additional details, or at the spamassassin manpage.
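
As a rough illustration of the header check described above, the sketch below pulls the autolearn value out of a message that has been saved to a file. The function name autolearn_status and the idea of saving the message first are assumptions made for the example; only the X-Spam-Status header and its autolearn= field come from SpamAssassin.

```shell
# Hypothetical helper: print the autolearn= value found in the
# X-Spam-Status header of a saved message (for example "ham" or "no").
autolearn_status() {
    awk '
        /^$/ { exit }                                      # blank line: headers are over
        /^[ \t]/ { if (in_status) line = line $0; next }   # folded continuation line
        { in_status = 0 }
        /^X-Spam-Status:/ { in_status = 1; line = $0 }
        END {
            # "autolearn=" is 10 characters long; print what follows it
            if (match(line, /autolearn=[a-z]+/))
                print substr(line, RSTART + 10, RLENGTH - 10)
        }
    ' "$1"
}
```

Calling autolearn_status on a saved message prints nothing if the message carries no X-Spam-Status header at all, which would suggest it never passed through SpamAssassin.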
     
  7. bigjohn

    bigjohn Well-Known Member

    Joined:
    Jun 7, 2003
    Messages:
    77
    Likes Received:
    0
    Trophy Points:
    6
    Because the host I'm with has SA autolearn = off, and a lot of the messages that I'm getting would not be marked as spam by the autolearn rules anyway (a message must score greater than 12 to be autolearned as spam).

    So, I want to feed these messages, which score between 0.2 and 4 but ARE spam, to the Bayesian filter so that it can learn the patterns that these spammers are using to fool it.

    John
     
    #7 bigjohn, Feb 9, 2004
    Last edited: Feb 9, 2004
  8. bigjohn

    bigjohn Well-Known Member

    Joined:
    Jun 7, 2003
    Messages:
    77
    Likes Received:
    0
    Trophy Points:
    6
    example email not detected as spam

    Return-path: <xxx@dime4.dizinc.com>
    Envelope-to: xxx@xxxxx.net
    Delivery-date: Mon, 09 Feb 2004 17:52:51 -0500
    Received: from xxxxxxby xxxxx.xxxxx.com with local-bsmtp (Exim 4.24)
    id 1AqKGo-0004Pn-NJ
    for xxxx@xxxxx.net; Mon, 09 Feb 2004 17:52:51 -0500
    Received: from [24.200.127.250] (helo=modemcable250.127-200-24.mc.videotron.ca)
    by xxxxx.xxxxx.com with smtp (Exim 4.24)
    id 1AqKGn-0004PZ-Mp
    for xxxxx@xxxxx.net; Mon, 09 Feb 2004 17:52:50 -0500
    Received: from [24.200.127.250] by 60.165.224.252 with HTTP;
    Tue, 10 Feb 2004 20:59:54 +0600
    From: "Jacques Malone" <excretioneuphrates@attbi.com>
    To: xxxx@xxxxx.xxx
    Subject: Re: order # 7957
    Mime-Version: 1.0
    X-Mailer: mPOP Web-Mail 2.19
    X-Originating-IP: [60.165.224.252]
    Date: Tue, 10 Feb 2004 16:55:54 +0200
    Reply-To: "Jacques Malone" <spraindauphine@attbi.com>
    Content-Type: multipart/alternative;
    boundary="=_NextPart_000_000D_18B28UR8_6496Z4461"
    Message-Id: <BSJHCKD-0003593608140@subservient>
    X-Spam-Checker-Version: SpamAssassin 2.61 (1.212.2.1-2003-12-09-exp) on
    dime4.dizinc.com
    X-Spam-Status: No, hits=0.4 required=4.5 tests=HTML_MESSAGE,
    MIME_BOUND_NEXTPART,SUBJ_HAS_UNIQ_ID autolearn=no version=2.61
    X-Spam-Level:


    --=_NextPart_000_000D_18B28UR8_6496Z4461
    Content-Type: text/plain; charset=us-ascii
    Content-Transfer-Encoding: 8bit

    lebensraum chromate livid bassett clomp ceres stew apotheosis salle godwin
    logarithm eleanor trilobite annular mulish courage
    ontology hyperboloidal octoroon licensable bashful

    --=_NextPart_000_000D_18B28UR8_6496Z4461
    Content-Type: text/html; charset=us-ascii
    Content-Transfer-Encoding: 8bit

    <html>


    <body>

    <p>Why not purchase some G.e.n.e.r.i.c V I A G R A - Order On-line - Fast Delivery! </p>

    <p>Costs over 50% less than Viagra® </p>

    <a href="http://www.2IO.vloebdfogt.com/gp/default.asp?ID=bw">

    <p>http://www.49D.care5678.com/gp/default.asp?ID=bw</a></p>

    <p>We also have these medications in highly discounted generic form:<br>
    <br>
    Ambien, Xanax, Phentermine, Lipitor, Nexium, Paxil, and Vioxx.<br>
    </p>

    <p>Physician Consultation: FREE! <br>
    Free discreet shipping</p>

    <p>EZ online form</p>


    <p><br>
    <br>
    <a href="http://www.OSx.store456.com/er/er.asp?Folder=gp">I want to say adios</a></p>


    <br><br>

    </body>
    </html>

    brian prostate lifeguard reactant disburse affectionate caper accipiter freshen conformation japanese clifton troutman abstractor cuba aspersion buffalo pogrom torpid bullhead dress chronography hicks downriver moore hurwitz arrear demon dibble rubin position ask leaf bureau commonwealth breastwork irksome dour fierce holstein televise rubbery joyride flack trio dialogue grab patriotic athabascan rangeland muskellunge rummage console tonnage lunacy anticipate bookend bahrein immune remunerate nip elsie quadrangle analogy contradict triptych nicodemus broil aquarius abbe incubate flotation
    <br><br>
    drummond crabmeat totemic flop gauze shield crater anglican confiscate allspice bouffant excoriate stereo pandemonium cowhand emitting rapt circumspect affair mormon demarcate asphalt coastline miscible oak carib eft damascus ainu gunk codetermine committeewoman selectric prof boardinghouse bow spalding balance musicale ingrown grievance anthem stopcock hinge matriculate moment floodlight pincushion cattle caviness gaze coefficient opprobrium p polygynous comparison laura wiggly chow wheezy puerto cognate cast coffee counterbalance winters fivefold


    --=_NextPart_000_000D_18B28UR8_6496Z4461--
     
    #8 bigjohn, Feb 9, 2004
    Last edited: Feb 9, 2004
  9. vivarey

    vivarey Registered

    Joined:
    Mar 11, 2004
    Messages:
    3
    Likes Received:
    0
    Trophy Points:
    1
    It's possible to turn autolearn on without shell access. All you have to do is edit the /.spamassassin/user_prefs file and add the following 2 lines:

    use_bayes 1
    auto_learn 1

    Is that helpful?
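
For anyone who does have shell or cron access, the same edit could be scripted. This is only a sketch: the function name enable_autolearn is made up, and the default path assumes user_prefs lives in .spamassassin under the home directory. The option is spelled auto_learn as in the post above, which matches the SpamAssassin 2.6x releases discussed in this thread; as far as I know, later releases renamed it to bayes_auto_learn.

```shell
# Hypothetical helper: append the two settings from the post to a
# user_prefs file, skipping any setting that is already there.
enable_autolearn() {
    prefs=${1:-$HOME/.spamassassin/user_prefs}   # assumed default location
    mkdir -p "$(dirname "$prefs")"
    touch "$prefs"
    for setting in "use_bayes 1" "auto_learn 1"
    do
        # -x matches whole lines only, -F treats the setting as a fixed string
        grep -qxF "$setting" "$prefs" || echo "$setting" >> "$prefs"
    done
}
```

Running it twice leaves the file unchanged the second time, so it is safe to call from a setup script.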
     
  10. bigjohn

    bigjohn Well-Known Member

    Joined:
    Jun 7, 2003
    Messages:
    77
    Likes Received:
    0
    Trophy Points:
    6
    YES!

    can I train it against stored mail?
     
  11. vivarey

    vivarey Registered

    Joined:
    Mar 11, 2004
    Messages:
    3
    Likes Received:
    0
    Trophy Points:
    1
    That's what I'm still trying to figure out. It SHOULD be possible through a cron job... I just don't know how to do it yet.
     
  12. perlchild

    perlchild Well-Known Member

    Joined:
    Sep 1, 2002
    Messages:
    279
    Likes Received:
    0
    Trophy Points:
    16
    You should first research on the SpamAssassin web site how to remove the headers (there's a spamassassin command-line option to do that). Otherwise you will train on the headers SpamAssassin adds itself, and you'll end up with a feedback loop, which will lead to a lot of false negatives and false positives.
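
The command-line option mentioned above is presumably spamassassin -d (--remove-markup). As a rough stand-in that does not need SpamAssassin installed, the sketch below simply drops every X-Spam-* header field from a saved message. It is not the same thing: it does not undo rewritten subjects or report-wrapped messages, and the function name strip_spam_headers is hypothetical.

```shell
# Hypothetical helper: print a message with its X-Spam-* header fields
# (and their folded continuation lines) removed. The body is left untouched.
strip_spam_headers() {
    awk '
        in_body    { print; next }                 # past the headers: copy verbatim
        /^$/       { in_body = 1; print; next }    # blank line ends the header block
        /^[ \t]/   { if (!skip) print; next }      # continuation follows its field
        /^X-Spam-/ { skip = 1; next }              # drop the field itself
                   { skip = 0; print }             # any other header is kept
    ' "$1"
}
```

Usage would be along the lines of strip_spam_headers message.txt > clean.txt before handing clean.txt to sa-learn.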
     
  13. bigjohn

    bigjohn Well-Known Member

    Joined:
    Jun 7, 2003
    Messages:
    77
    Likes Received:
    0
    Trophy Points:
    6
    I want to train against mail that SpamAssassin MISSES.

    I have over 1000 messages that SA should be able to learn to catch - those messages that are padded with garbage text and bogus <imawebtag /> html 'tags'.

    John
     
  14. albatroz

    albatroz Well-Known Member

    Joined:
    Mar 6, 2003
    Messages:
    258
    Likes Received:
    0
    Trophy Points:
    16
    Location:
    Virtual Orbis / Peru
    cPanel Access Level:
    Root Administrator
    Any news on this?


     
  15. bigjohn

    bigjohn Well-Known Member

    Joined:
    Jun 7, 2003
    Messages:
    77
    Likes Received:
    0
    Trophy Points:
    6
    A guy I know wrote a script that you can run as a cron job.

    Each user in the domain should use HORDE to make a SPAM and HAM folder, and MOVE SPAM to the SPAM folder, while at the same time copying some 'good' email to the HAM folder.

    This is the script:
    Code:
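    #!/bin/sh
    # Feed every mbox file named SPAM or HAM under $HOME to sa-learn, then empty it.

    learn() {
        # $1 = sa-learn mode (--spam or --ham), $2 = mbox file name to look for
        find "$HOME" -type f -name "$2" -print | while IFS= read -r FILE
        do
            echo "Processing $FILE"
            # Empty the mbox only if sa-learn succeeded, so nothing is lost on
            # failure and no message is learned twice on the next run.
            if sa-learn "$1" --mbox "$FILE"
            then
                : > "$FILE"
            fi
        done
    }

    echo "Learning SPAM"
    learn --spam SPAM

    echo "Learning HAM"
    learn --ham HAM

    echo "Done"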
    
    This lives in a directory called script, one level ABOVE public_html (the user's 'root', I guess you'd call it). The script file is called learnspam.

    learnspam is called by the cron job every week.
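
To make the weekly schedule concrete, here is a small sketch that prints a matching crontab line. The function name weekly_cron_line and the choice of Sunday at 03:00 are arbitrary assumptions; the script path is whatever you used on your own account.

```shell
# Hypothetical helper: print a crontab line that runs the given script
# once a week (Sunday at 03:00; the day and time are arbitrary choices).
weekly_cron_line() {
    # crontab fields: minute hour day-of-month month day-of-week command
    printf '0 3 * * 0 %s\n' "$1"
}
```

For example, weekly_cron_line "$HOME/script/learnspam" prints a line that can be pasted into the cPanel cron job page or added with crontab -e.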

    John
     
  16. albatroz

    albatroz Well-Known Member

    Joined:
    Mar 6, 2003
    Messages:
    258
    Likes Received:
    0
    Trophy Points:
    16
    Location:
    Virtual Orbis / Peru
    cPanel Access Level:
    Root Administrator
    Does it work fine?

    Is there a way to share your SPAM autolearned database
    with your other domains?

     
  17. bigjohn

    bigjohn Well-Known Member

    Joined:
    Jun 7, 2003
    Messages:
    77
    Likes Received:
    0
    Trophy Points:
    6
    I've not tried it, but I would think that you could duplicate the Bayesian database. I'm not sure.
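
An untested sketch of what duplicating the database might look like. It assumes the default layout, where the Bayes data sits in the account's .spamassassin directory as files whose names start with bayes_; the function name copy_bayes_db is made up, and the copied files would still need to be owned by the target user.

```shell
# Hypothetical helper: copy the Bayes database files from one
# .spamassassin directory to another, leaving user_prefs alone.
copy_bayes_db() {
    src=$1
    dst=$2
    mkdir -p "$dst" || return 1
    for f in "$src"/bayes_*
    do
        [ -f "$f" ] || continue      # nothing matched, or not a regular file
        cp -p "$f" "$dst"/ || return 1
    done
}
```

It would be called with the source and destination directories, for instance the .spamassassin directories of two accounts.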

    It does work, by the way. Once it had run two or three times (it deletes the processed messages, so it never runs on the same message twice) and learned from about 150 'spams', the Bayes scorer kicked in.

    I opened my inbox this morning and I had ZERO spam mail.

    John
     
  18. albatroz

    albatroz Well-Known Member

    Joined:
    Mar 6, 2003
    Messages:
    258
    Likes Received:
    0
    Trophy Points:
    16
    Location:
    Virtual Orbis / Peru
    cPanel Access Level:
    Root Administrator
    Finally, why do you post as if it only works with Horde?
    What about SquirrelMail? It also uses IMAP, so the mailbox
    files would be the same as if I were using Horde...

     
  19. bigjohn

    bigjohn Well-Known Member

    Joined:
    Jun 7, 2003
    Messages:
    77
    Likes Received:
    0
    Trophy Points:
    6
    I've only tested it with Horde, as did the guy who wrote the script.

    Use with another program may not produce the same results.

    John
     
  20. albatroz

    albatroz Well-Known Member

    Joined:
    Mar 6, 2003
    Messages:
    258
    Likes Received:
    0
    Trophy Points:
    16
    Location:
    Virtual Orbis / Peru
    cPanel Access Level:
    Root Administrator
    Hi!
    I just tested the script from the command line and it worked fine.
    However, I have a question:

    Your script crawls through your directories looking for mboxes called SPAM, right? So is it necessary to have the script running for each domain, or can it be run from one general cron script for the whole server?


     