New high rate of false positives in MailScanner?

dory36

Over the past week or two I have been getting a lot of false alarms on mail that had been handled fine for many months. By "a lot" I mean maybe a dozen a day, versus about 1 a week until recently.

Bayes seems to be to blame for many, but not all.

I have seen before, but can't find now, info on rebuilding the bayes database. Any pointers, or other suggestions?
 

graham_w

I too noticed an increase, and those emails had all hit BAYES_99, which scored them at 5 and marked them as spam. I went looking for a way to rebuild the Bayes database but couldn't find one, so I disabled Bayes checking until I had some spare time to look into it further. That stopped the false positives, and it seems to be catching all the spam without it, so I may leave it disabled. From what I've read, Bayes adds load to the server when checking.
 

chirpy

It's due to new rules included in the base set for SpamAssassin v3.2.0. Left running, they can disrupt the Bayesian database, and you see an increase in false positives. You can:

1. Do as graham_w suggests and disable bayes, though this can be a very useful resource

2. Increase the default low-scoring spam value, though that could let through more false negatives

3. Hunt down the rules causing the problems from the false-positive email headers and adjust their scores in a custom SA ruleset in /etc/mail/spamassassin/
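
For option 3, a minimal sketch of what such a custom ruleset might look like. The file name zz_local_scores.cf and the score values are only illustrative assumptions, and FROM_LOCAL_NOVOWEL is just an example rule name; pick the rules and values from your own false-positive headers. The script writes to a scratch directory unless SA_DIR is set, so it can be tried without touching the live config.

```shell
#!/bin/sh
# Target directory for custom rules. Defaults to a scratch directory;
# set SA_DIR=/etc/mail/spamassassin to write the real file.
SA_DIR="${SA_DIR:-$(mktemp -d)}"
CUSTOM_CF="$SA_DIR/zz_local_scores.cf"   # hypothetical file name

# "score RULE_NAME value" overrides the stock score for that rule.
cat > "$CUSTOM_CF" <<'EOF'
# Illustrative values only: soften rules seen in false-positive headers
score BAYES_99 3.5
score FROM_LOCAL_NOVOWEL 1.0
EOF

echo "wrote $CUSTOM_CF"
# Once the file is in place for real, "spamassassin --lint" checks the syntax.
```

MailScanner will most likely need a restart or reload before it picks up the changed scores.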
 

verdon

Here's one that's typical of some of the false positives I've been getting.

cached not
score=9.592
8 required
5.00 BAYES_99 Bayesian spam probability is 99 to 100%
3.20 FROM_LOCAL_NOVOWEL From: localpart has series of non-vowel letters
0.00 HTML_MESSAGE HTML included in message
1.40 MIME_QP_LONG_LINE Quoted-printable line longer than 76 chars
-0.00 SPF_PASS SPF: sender matches SPF record

... hard to know what to adjust. Other than BAYES_99, the bulk of the score comes from the From address, which is not that odd an address, [email protected] ... the message itself is pretty normal stuff
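
One way to see what is worth adjusting is to redo the arithmetic with one rule left out at a time. The sketch below uses only the rule lines from the report above; the helper name total_without is made up for illustration. The total comes out at 9.60 rather than the reported 9.592, presumably because the per-rule scores in the report are rounded.

```shell
#!/bin/sh
# Rule lines copied from the report above: score, rule name, description.
report='5.00 BAYES_99 Bayesian spam probability is 99 to 100%
3.20 FROM_LOCAL_NOVOWEL From: localpart has series of non-vowel letters
0.00 HTML_MESSAGE HTML included in message
1.40 MIME_QP_LONG_LINE Quoted-printable line longer than 76 chars
-0.00 SPF_PASS SPF: sender matches SPF record'

# total_without RULE: sum every score except the named rule.
# An empty name matches nothing, so all rules are summed.
total_without() {
    printf '%s\n' "$report" |
        awk -v skip="$1" '$2 != skip { sum += $1 } END { printf "%.2f\n", sum }'
}

echo "all rules:                  $(total_without '')"
echo "without BAYES_99:           $(total_without BAYES_99)"
echo "without FROM_LOCAL_NOVOWEL: $(total_without FROM_LOCAL_NOVOWEL)"
```

This prints 9.60, 4.60 and 6.40. With 8 required, taking either BAYES_99 or FROM_LOCAL_NOVOWEL out of the picture would have been enough to let this message through.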
 

sct1061

I would just add the following to what chirpy has suggested:

4. If it seems that your bayes database has been "poisoned" by the increase in false positives, remove the bayesian database and start over. Depending on your email traffic, it may take a few days before bayes has enough spam and non-spam tokens to start scoring email again (I think it needs 200 of each before it starts working). To wipe out the bayesian db, do the following:

rm -Rvf /var/spool/mqueue/.spamassassin
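
If you would rather keep a copy in case you want to roll back, a cautious variant is to rename the directory instead of deleting it. This is only a sketch: the function name backup_and_reset is made up, and it assumes the Bayes files live where the command above says they do. It is probably wise to stop MailScanner first so nothing is writing to the database while it is moved.

```shell
#!/bin/sh
# backup_and_reset DIR: rename DIR to DIR.bak-<timestamp> instead of deleting it.
# Prints the backup path on success, returns non-zero if DIR does not exist.
backup_and_reset() {
    dir=$1
    if [ ! -d "$dir" ]; then
        echo "no such directory: $dir" >&2
        return 1
    fi
    backup="$dir.bak-$(date +%Y%m%d%H%M%S)"
    mv "$dir" "$backup" && echo "$backup"
}

# Example call, commented out so that running the sketch changes nothing:
# backup_and_reset /var/spool/mqueue/.spamassassin
```

Afterwards, "sa-learn --dump magic" should show how many spam and ham messages the new database has learned so far (the nspam and nham lines), provided it is run as the user that owns the database.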