I've been training SpamAssassin like crazy, but it isn't doing any good. For many users, around 50% of incoming mail is still spam, and the reason appears to be that the Bayes scores are too low.
Here's an example. I copied a spam message that the mail server had already processed to /root/tmp/spam-example. If I cat that file and then run the same file through spamassassin manually, I see very different Bayes results:
The message as processed normally:
Code:
cat /root/tmp/spam-example
X-Spam-Status: No, score=1.4
X-Spam-Score: 14
X-Spam-Bar: +
 pts rule name              description
---- ---------------------- --------------------------------------------------
 1.7 URIBL_BLACK            Contains an URL listed in the URIBL blacklist
                            [URIs: fcagahaujeqaraf.tk]
-0.0 T_RP_MATCHES_RCVD      Envelope sender domain matches handover relay
                            domain
 1.6 URIBL_WS_SURBL         Contains an URL listed in the WS SURBL blocklist
                            [URIs: fcagahaujeqaraf.tk]
-1.9 BAYES_00               BODY: Bayes spam probability is 0 to 1%
                            [score: 0.0000]
 0.0 HTML_MESSAGE           BODY: HTML included in message
X-Spam-Flag: NO
Here's the same message run through spamassassin from the command line in debug mode:
Code:
spamassassin -t -D < /root/tmp/spam-example
Content analysis details: (6.8 points, 2.0 required)
 pts rule name              description
---- ---------------------- --------------------------------------------------
 1.7 URIBL_BLACK            Contains an URL listed in the URIBL blacklist
                            [URIs: fcagahaujeqaraf.tk]
 1.6 URIBL_WS_SURBL         Contains an URL listed in the WS SURBL blocklist
                            [URIs: fcagahaujeqaraf.tk]
 3.5 BAYES_99               BODY: Bayes spam probability is 99 to 100%
                            [score: 1.0000]
 0.0 SINGLE_HEADER_3K       A single header contains 3K-4K characters
-0.0 T_RP_MATCHES_RCVD      Envelope sender domain matches handover relay
                            domain
 0.0 HTML_MESSAGE           BODY: HTML included in message
So when I run it manually, the Bayes probability is very high (BAYES_99, +3.5), but when the same message is processed normally by exim, it hits BAYES_00 and gets a negative Bayes score (-1.9)!
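In case it helps anyone reproduce the comparison, here is a rough sketch of how the Bayes hit can be pulled out of each report so the two runs can be compared side by side. The function name bayes_hit is something I made up for illustration; it is not part of SpamAssassin. It only assumes the report contains a line whose second field is a BAYES_NN rule name, as in the output above.

```shell
# bayes_hit: hypothetical helper, not part of SpamAssassin.
# Reads a SpamAssassin report (or a message with the report in its headers)
# on stdin and prints the points and name of the first BAYES_NN rule found,
# for example "3.5 BAYES_99". Prints nothing if no Bayes rule is present.
bayes_hit() {
    awk '$2 ~ /^BAYES_[0-9]+$/ { print $1, $2; exit }'
}

# Usage (assumes spamassassin is installed and the example file exists):
#   bayes_hit < /root/tmp/spam-example                      # as stored by the mail server
#   spamassassin -t < /root/tmp/spam-example | bayes_hit    # manual run
```

This only matches the table-style report lines, so it will not find anything if the stored message only carries a compact tests= list in its headers.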
Ideas? Thanks.