cPanel's Exim not RFC compliant? (550 errors are ignored)

sehh

Well-Known Member
Feb 11, 2006
579
5
168
Europe
I found the following message while searching the net, and it seems to be true. Can anyone comment on this, please?

I've noticed there are actually quite a lot of domains violating
RFCs and ignoring 550 errors. These are Exim servers (all versions)
configured by cPanel, which many large hosting companies seem to be
using.

They apparently have the auto_thaw parameter configured, which
unfreezes messages stuck in the queue. Messages are frozen when they
are found to be undeliverable, usually after 4 days or after they
get a 550 error. The problem is that these stupid servers will accept
bogus messages with viruses, fail to deliver them (550 Virus
found), then try to bounce them back to the bogus sender (you or me or
some other third party). Using auto_thaw, they will retry sending the
messages, usually every 2 hours, about 27 times over the course of 2
days, despite 550 User Unknown or 550 Virus found responses. Each
time, the message will be refrozen, only to be auto_thaw'ed again by
their stupid configuration.

By ignoring 550 errors, they are violating RFCs.

Because this usually happens without human involvement and is usually
virus related, either no one has ever noticed, or someone may wrongly think
it is an infected machine repeatedly sending viruses when it is actually
only a single message that a poorly configured server keeps retrying to
deliver. Going through my logs after I did notice, I quickly and
easily found dozens of example systems. I want to blacklist them all,
but I probably won't be able to do anything more than complain
about it.
 

sehh

I found what the problem is: apparently the default cPanel Exim configuration has very lax rules about delivery failures.

The default is to re-send emails over and over again for a week; as a result, the Exim queue becomes huge on a busy server.

After careful tests, I came up with these:

auto_thaw = 1d
ignore_bounce_errors_after = 12h
timeout_frozen_after = 2d

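This means:
retry frozen messages (other than bounces) 1 day after they were frozen
time out and delete frozen messages that have been in the queue for more than 2 days
frozen bounce messages are dealt with earlier: after 12 hours they get one more delivery attempt and are discarded if that fails.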
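Since the discussion hinges on which of these intervals expires first, it can help to put them on one scale. Below is a minimal sketch that converts the three values to seconds. The to_seconds helper is hypothetical (not part of Exim) and only handles a single number with a single unit; Exim itself also accepts combined forms such as 1h30m.

```shell
#!/bin/sh
# Convert a single-unit Exim time value (e.g. 12h, 2d) to seconds.
# Assumption: exactly one number followed by one unit letter.
to_seconds() {
    n=${1%[smhdw]}      # numeric part, with the unit letter stripped
    u=${1#"$n"}         # the unit letter that was stripped
    case $u in
        s) echo "$n" ;;
        m) echo $((n * 60)) ;;
        h) echo $((n * 3600)) ;;
        d) echo $((n * 86400)) ;;
        w) echo $((n * 604800)) ;;
        *) return 1 ;;  # no unit or unknown unit: not handled here
    esac
}

# Print each setting from the post above next to its value in seconds.
for opt in auto_thaw=1d ignore_bounce_errors_after=12h timeout_frozen_after=2d; do
    printf '%-28s %s\n' "${opt%%=*}" "$(to_seconds "${opt#*=}")"
done
```

With these values the 12-hour bounce limit comes first, then the 1-day thaw, then the 2-day timeout.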
 

cPanelNick

Administrator
Staff member
Mar 9, 2015
3,482
35
208
cPanel Access Level
DataCenter Provider
You should probably set auto_thaw to something larger than both of the other values.
 

sehh

But if I do that, then "timeout_frozen_after" will take precedence and delete frozen emails before "auto_thaw" has a chance to resend them, right?
 

cPanelNick

By making auto_thaw lower, you are creating the same problem you are trying to avoid. From the Exim documentation:

auto_thaw Use: main Type: time Default: 0s

If this option is set to a time greater than zero, a queue runner will try a new delivery attempt on any frozen message, other than a bounce message, if this much time has passed since it was frozen. This may result in the message being re-frozen if nothing has changed since the last attempt. It is a way of saying “keep on trying, even though there are big problems”.

Note: This is an old option, which predates timeout_frozen_after and ignore_bounce_errors_after. It is retained for compatibility, but it is not thought to be very useful any more, and its use should probably be avoided.
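To see which values a server is actually running with, Exim can print individual main options with -bP. A minimal sketch follows; the EXIM variable and the function name are hypothetical conveniences, and the binary path is the one used in the commands later in this thread, so adjust it if yours differs.

```shell
#!/bin/sh
# Path to the Exim binary; overridable, since it may differ per system.
EXIM="${EXIM:-/usr/sbin/exim}"

# Print the effective values of the three options discussed in this thread.
show_freeze_settings() {
    "$EXIM" -bP auto_thaw ignore_bounce_errors_after timeout_frozen_after
}

# Usage (on a machine with Exim installed):
#   show_freeze_settings
```

This reads the configuration Exim would really use, which is worth checking after editing the config file in case the change did not take effect.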
 

sehh

I see what you mean. Unfortunately, I see no results either way.

As per your suggestion, I changed my configuration to:

Code:
ignore_bounce_errors_after = 12h
timeout_frozen_after = 2d
auto_thaw = 3d
Unfortunately, my email queue is still huge and growing (685 emails; it was around 300 before).

Many of them are 3 or 4 days old, which means that "timeout_frozen_after" isn't doing anything.
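One thing worth checking here: timeout_frozen_after only applies to frozen messages, so old messages that are still being retried normally are not touched by it. A rough sketch for splitting the old part of the queue into frozen and non-frozen, using exiqgrep's -z (frozen only), -x (non-frozen only), -o (older than N seconds) and -c (count) options; the function name and the EXIQGREP variable are hypothetical.

```shell
#!/bin/sh
# exiqgrep ships with Exim; overridable in case it is not on PATH.
EXIQGREP="${EXIQGREP:-exiqgrep}"

# Count old messages, split by frozen state.
# $1 = minimum age in seconds (defaults to 172800, i.e. 2 days)
queue_breakdown() {
    age=${1:-172800}
    echo "frozen, older than ${age}s:"
    "$EXIQGREP" -z -o "$age" -c
    echo "not frozen, older than ${age}s:"
    "$EXIQGREP" -x -o "$age" -c
}

# Usage (on a machine with Exim installed):
#   queue_breakdown 172800
```

If most of the old messages turn out not to be frozen, the retry rules rather than the frozen-message options are what keep them in the queue.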
 

sehh

Since nothing I tried would delete the old and frozen messages from the queue, I found this little command, which I run as a cron job and which does it all:

Code:
exiqgrep -i -o 259200 | xargs /usr/sbin/exim -Mrm
It deletes all messages, frozen or not, that have been in the queue for more than 3 days (259200 seconds).
 

gundamz

Well-Known Member
Mar 27, 2002
245
0
316
I got:

[email protected] [~]# exiqgrep -i -o 259200 | xargs /usr/sbin/exim -Mrm
Line mismatch: 692d 1EFkjY-0006KK-GY <>
exim: no message ids given after -Mrm option
 

sehh

Indeed, you'll get "no message ids" when there are no emails to delete.

You may ignore that error, or suppress it completely by adding " >/dev/null 2>/dev/null" at the end of the command.
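An alternative to discarding all output is to keep exim from running at all when there is nothing to delete. A minimal sketch, assuming an xargs that supports -r (GNU xargs does, where it means "do not run the command on empty input"); the function name and the two overridable variables are hypothetical.

```shell
#!/bin/sh
# Overridable so the function can be tried against stubs.
EXIQGREP="${EXIQGREP:-exiqgrep}"
EXIM="${EXIM:-/usr/sbin/exim}"

# Remove every queued message older than the given age.
# $1 = minimum age in seconds
purge_older_than() {
    # -i prints only message IDs; xargs -r skips exim when the list is empty,
    # which avoids the "no message ids given after -Mrm option" error.
    "$EXIQGREP" -i -o "$1" | xargs -r "$EXIM" -Mrm
}

# Usage (on a machine with Exim installed):
#   purge_older_than 259200   # 3 days
```

This way, any other error messages still show up instead of being thrown away with the harmless one.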
 

gundamz

Okay.

So "exiqgrep -i -o 259200 | xargs /usr/sbin/exim -Mrm > /dev/null 2>&1" will get the server to delete the old mail?
 

sehh

That's correct, I'm using that in a cron job.