Operating System & Version
CentOS 7
cPanel & WHM Version
v84.0.21

cPanelLauren

Product Owner II
Staff member
This is done with sa-learn. Information on its use can be found here: sa-learn - train SpamAssassin's Bayesian classifier

There are two primary types of training you can apply: supervised and unsupervised (bayes_auto_learn).

Supervised learning
This means keeping a copy of all or most of your mail, separated into spam and ham piles, and periodically re-training using those. It produces the best results, but requires more work from you, the user.

(An easy way to do this, by the way, is to create a new folder for 'deleted' messages, and instead of deleting them from other folders, simply move them in there instead. Then keep all spam in a separate folder and never delete it. As long as you remember to move misclassified mails into the correct folder set, it is easy enough to keep up to date.)
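As a rough sketch of that periodic re-training, assuming the spam and ham piles are Maildir-style directories, something like the following could be run by hand or from a script. The two paths are hypothetical placeholders, and the SA_LEARN variable is only a convenience added here so the command can be swapped for echo during a dry run.

```shell
#!/bin/sh
# Sketch only: SPAM_DIR and HAM_DIR are hypothetical paths; point them at your own piles.
SPAM_DIR="${SPAM_DIR:-$HOME/mail/.Spam/cur}"
HAM_DIR="${HAM_DIR:-$HOME/mail/.Deleted/cur}"
SA_LEARN="${SA_LEARN:-sa-learn}"   # set SA_LEARN=echo to preview the commands

retrain() {
    # $1 = spam or ham, $2 = directory of messages to learn from
    if [ -d "$2" ]; then
        $SA_LEARN "--$1" "$2"
    else
        echo "skipping $1: $2 does not exist" >&2
    fi
}

retrain spam "$SPAM_DIR"
retrain ham "$HAM_DIR"
```

Afterwards, sa-learn --dump magic lists the database counters, including nspam and nham. With default settings, SpamAssassin does not apply Bayes scoring until it has learned at least 200 spam and 200 ham messages, so those counters are worth checking early on.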

Unsupervised learning from SpamAssassin rules
Also called 'auto-learning' in SpamAssassin. Based on statistical analysis of the SpamAssassin success rates, we can automatically train the Bayesian database with a certain degree of confidence that our training data is accurate.

It should be supplemented with some supervised training in addition, if possible.

This is the default, but can be turned off by setting the SpamAssassin configuration parameter bayes_auto_learn to 0.
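If you would rather script that change than edit the file by hand, here is a minimal sketch. The helper name disable_auto_learn is made up for this example, and it assumes you already know which SpamAssassin configuration file applies on your system.

```shell
#!/bin/sh
# Sketch only: sets "bayes_auto_learn 0" in a SpamAssassin config file.
disable_auto_learn() {
    # $1 = config file to edit (the caller chooses the path)
    conf="$1"
    if grep -q '^[[:space:]]*bayes_auto_learn[[:space:]]' "$conf" 2>/dev/null; then
        # Setting already present: rewrite it in place, keeping a .bak copy.
        sed -i.bak 's/^[[:space:]]*bayes_auto_learn[[:space:]].*/bayes_auto_learn 0/' "$conf"
    else
        # Setting absent (or file missing): append it.
        printf 'bayes_auto_learn 0\n' >> "$conf"
    fi
}
```

For example, disable_auto_learn "$HOME/.spamassassin/user_prefs" would apply it to a per-user preferences file. That path is an assumption; a site-wide local.cf may be the right place on your server.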

  • One great way to implement this (and one that's pretty user-friendly) is to set up the following:
    • In each mail account that will be participating in AutoLearn, create two folders: Spam and Ham
    • Have your users move mail that is spam or ham into these folders
    • Create two cron jobs that run sa-learn against those folders, like so:
      • sa-learn --ham <path to the Ham folder>
      • sa-learn --spam <path to the Spam folder>
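The folder-plus-cron setup above could be sketched as a small wrapper script. The mail root, the .Spam and .Ham folder names, and the script path in the crontab comment are all assumptions for illustration; check where your accounts' folders actually live before using anything like this.

```shell
#!/bin/sh
# Sketch only. MAIL_ROOT and the folder names passed in are assumptions;
# adjust them to match your server's mail layout.
MAIL_ROOT="${MAIL_ROOT:-$HOME/mail}"
SA_LEARN="${SA_LEARN:-sa-learn}"   # set SA_LEARN=echo for a dry run

learn_folders() {
    # $1 = spam or ham, $2 = folder name to look for under MAIL_ROOT
    find "$MAIL_ROOT" -type d -path "*/$2/cur" 2>/dev/null |
    while IFS= read -r dir; do
        $SA_LEARN "--$1" "$dir"
    done
}

# Run only when called with two arguments, e.g.  sa-train.sh spam .Spam
if [ "$#" -eq 2 ]; then
    learn_folders "$1" "$2"
fi

# Example crontab entries (hypothetical script path), run nightly:
# 0 3 * * *  /home/username/bin/sa-train.sh spam .Spam
# 15 3 * * * /home/username/bin/sa-train.sh ham .Ham
```

By default sa-learn updates the Bayes database of the user who runs it, so the cron jobs should run as the account whose mail is being filtered.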
The definition of ham can be found here