lfd on server.hostname: Email queue size alert

Kurieuo

Well-Known Member
Dec 13, 2002
106
0
166
Australia
Since around the time I upgraded to cPanel 11.36, I've been receiving the following email notifications from LFD:

Time: Tue Apr 23 01:26:05 2013 +1000
Unable to obtain exim queue length within 30 seconds - Timed out

However, the email queues are pretty much empty at these times, and Relayers doesn't show any abnormality.

The alerts correlate with when cpup and cpbackup run, which can be resource intensive. But the server load otherwise stays consistently below 0.5.

Running sar shows some high %iowait values around the times of the alerts.
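For anyone wanting to line up the %iowait spikes against the alert times, here is a rough Python sketch that picks the high-%iowait rows out of saved `sar -u` text. The function name, the 20% threshold, and the sample figures are all made up for illustration, and it assumes the usual column layout with a `%iowait` header.

```python
def high_iowait_samples(sar_text, threshold=20.0):
    """Return (timestamp, iowait) pairs whose %iowait exceeds threshold."""
    col = None       # index of the %iowait column, learned from the header
    stamp_width = 1  # number of leading fields that make up the timestamp
    hits = []
    for line in sar_text.splitlines():
        fields = line.split()
        if "%iowait" in fields:
            # Header row: remember where the columns sit.
            col = fields.index("%iowait")
            if "CPU" in fields:
                stamp_width = fields.index("CPU")
            continue
        # Skip anything before the header, short lines, and the summary row.
        if col is None or len(fields) <= col or line.startswith("Average"):
            continue
        try:
            value = float(fields[col])
        except ValueError:
            continue  # not a data row
        if value > threshold:
            hits.append((" ".join(fields[:stamp_width]), value))
    return hits


# Invented sample data, NOT taken from the server in this thread.
SAMPLE = """\
01:10:01 AM     CPU     %user     %nice   %system   %iowait    %steal     %idle
01:20:01 AM     all      2.10      0.00      0.80      1.50      0.00     95.60
01:30:01 AM     all      3.40      0.00      1.90     41.20      0.00     53.50
Average:        all      2.75      0.00      1.35     21.35      0.00     74.55
"""

print(high_iowait_samples(SAMPLE))
```

The column is looked up from the header rather than hard-coded, since the number of timestamp fields differs between 12-hour and 24-hour output.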

I'm running CloudLinux and have a RAID1 setup. This server is getting on, but it otherwise performs very smoothly and runs like a dream. So this is bugging me.

I've now started monitoring the server load more closely to see what is happening and how high it peaks at those times.

Is anyone else experiencing this? Or, does anyone have any ideas why this has started happening now?

Thanks!
 

arunsv84

Well-Known Member
Oct 20, 2008
372
1
68
127.0.0.1
cPanel Access Level
Root Administrator
Have you tried checking the exim queue manually from the shell? Is it taking that long? Also, restart the exim server and see the results.
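One way to check this is to time the queue count yourself with the same 30-second limit the alert mentions. The sketch below is only an illustration: `time_command` is a hypothetical helper, and it assumes `exim -bpc` (which prints the number of messages in the queue) is a fair stand-in for whatever LFD runs internally.

```python
import subprocess
import time


def time_command(cmd, timeout=30):
    """Run cmd; return (elapsed_seconds, stdout), or (None, None) on timeout."""
    start = time.monotonic()
    try:
        result = subprocess.run(
            cmd, capture_output=True, text=True, timeout=timeout
        )
    except subprocess.TimeoutExpired:
        # The command did not finish within the limit.
        return None, None
    return time.monotonic() - start, result.stdout.strip()


# On the server itself this could be called as:
#   elapsed, count = time_command(["exim", "-bpc"])
```

If the call returns `(None, None)` while the queue is nearly empty, that would suggest the delay lies in the system (for example disk I/O) rather than in the size of the queue.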

Thanks!
 

Kurieuo

Thanks Arunsv, but that's not it.

After submitting a cPanel ticket, it was pointed out that %iowait was high at certain intervals (via "sar" at the shell). The server CPU definitely isn't being overloaded during the periods when the LFD alerts are received.

This roughly correlated with the times I received the messages, but not always. Often it would be when cpup and cpbackup were scheduled to run, but sometimes seemingly at random outside those windows.
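To make that kind of comparison less of an eyeball exercise, the "Time:" line from each LFD email can be parsed and checked against the scheduled windows. This is just a sketch in Python: the helper names are hypothetical, the 01:00 to 02:00 window is an assumed example rather than this server's actual cron schedule, and windows that cross midnight are not handled.

```python
from datetime import datetime, time

# Matches the timestamp style in the alert, e.g. "Tue Apr 23 01:26:05 2013 +1000".
LFD_FORMAT = "%a %b %d %H:%M:%S %Y %z"


def parse_alert_time(line):
    """Parse a 'Time: ...' line from an LFD alert email."""
    return datetime.strptime(line.split(":", 1)[1].strip(), LFD_FORMAT)


def in_window(stamp, start, end):
    """True if stamp's local clock time lies within [start, end]."""
    return start <= stamp.time() <= end


stamp = parse_alert_time("Time: Tue Apr 23 01:26:05 2013 +1000")
# Assumed window for the scheduled jobs; adjust to the real cron times.
print(in_window(stamp, time(1, 0), time(2, 0)))
```

Running this over every saved alert gives a simple count of how many fall inside the scheduled windows and how many do not.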

However, the issue actually started happening in September last year; I found a message dating all the way back to then. So I'll need to do a bit more investigation as to what changed then.

Otherwise, the issue doesn't really seem to be causing any problems beyond an I/O bottleneck generally during certain cronned events.
 